Improvements in the processor performance of desktop PCs and handheld devices have made it practical to embed chunks of digital information in media streams (an audio file, a TV, radio, or Internet broadcast, or an authorized digital content distribution network). The physical principles and mathematics behind such techniques were developed long ago; it is technical progress that has made audio steganography a reality. What was considered a ‘spy’ or secret-lab technology ten years ago is now available to the public as a turnkey-quality SDK or a ready-made app. This will create more demand and new fields of application.
The Solution for Media: Audio Interaction with the Audience
DataArt was asked to build a second-screen system that could tag the audio part of a TV stream (broadcast from a studio to a TV transmitter) with time markers. A special application on the client device then picks these markers up and syncs them with the channel ID and time offset of the video source. The system had to work regardless of which part of the program was being watched on the client device and at what time.
The method DataArt chose is based on mixing an audio signal with a delayed copy of itself (an artificial echo). This is a proven way to make the encoded information survive digital compression and over-the-air transmission (“robust encoding”), because the echo undergoes the same changes as the original signal. The decoder then detects the changes that the applied echo induces in the spectrum of the audio signal. The series of spectrum changes over time carries the encoded digital information.
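The echo idea described above can be illustrated with a minimal sketch. It is not DataArt's implementation: the function names, the frame length, the two delays (one per bit value) and the echo strength are assumptions chosen for illustration, and the echo is far louder than a real system would use. The decoder relies on a standard property of the real cepstrum (the inverse transform of the log magnitude spectrum), which shows a peak at the echo delay.

```python
import numpy as np

# All parameters below are illustrative assumptions, not values from the article.
FRAME_LEN = 4096   # samples carrying one bit
DELAY_0 = 100      # echo delay (samples) that encodes a 0
DELAY_1 = 150      # echo delay (samples) that encodes a 1
ALPHA = 0.5        # echo amplitude relative to the original signal


def embed_echo(signal, bits):
    """Add a delayed copy of each frame; the delay encodes one bit."""
    out = np.asarray(signal, dtype=float).copy()
    for i, bit in enumerate(bits):
        start = i * FRAME_LEN
        frame = np.asarray(signal[start:start + FRAME_LEN], dtype=float)
        delay = DELAY_1 if bit else DELAY_0
        echo = np.zeros_like(frame)
        echo[delay:] = ALPHA * frame[:-delay]   # frame shifted by `delay` samples
        out[start:start + FRAME_LEN] += echo
    return out


def extract_echo(signal, n_bits):
    """Recover bits by comparing the cepstrum at the two candidate delays."""
    bits = []
    for i in range(n_bits):
        frame = signal[i * FRAME_LEN:(i + 1) * FRAME_LEN]
        magnitude = np.abs(np.fft.fft(frame)) + 1e-12   # avoid log(0)
        cepstrum = np.real(np.fft.ifft(np.log(magnitude)))
        bits.append(1 if cepstrum[DELAY_1] > cepstrum[DELAY_0] else 0)
    return bits


# Usage: hide eight bits in a synthetic noise carrier and read them back.
rng = np.random.default_rng(0)
carrier = rng.standard_normal(8 * FRAME_LEN)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = extract_echo(embed_echo(carrier, payload), len(payload))
```

In this sketch one bit occupies one frame, so a longer frame gives a more reliable cepstral peak at the cost of a lower bit rate. A production decoder would also have to find the frame boundaries in a live microphone signal, which is omitted here.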
One application is an interactive quiz, where questions pop up in sync with specific content being transmitted via a TV channel. Here, the application “listens” passively to ambient audio until a special digital code is detected. The app then compares the detected TV stream and time marker with those registered for popping up a question. If there is a match, the app displays the question on the screen, regardless of when the content is being watched. Generally speaking, the ability to digitally recognize a particular audio stream and the time marker within it enables a synchronous action to be performed once this audio is detected. This can also be used for captioning, and it will help media corporations interact directly with viewers, generate new advertising options and revenue streams, and produce more specific content.
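The matching step in the quiz scenario can be sketched as a simple lookup. The `Cue` record, the `find_cue` helper, the channel names and the two-second tolerance are all hypothetical, introduced only to illustrate the idea of comparing a decoded channel ID and time offset against a list of registered questions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cue:
    """Hypothetical record: a question tied to a position in a program."""
    channel_id: str
    offset_s: float   # position within the program, in seconds
    question: str


def find_cue(cues: List[Cue], channel_id: str, offset_s: float,
             tolerance_s: float = 2.0) -> Optional[Cue]:
    """Return the first cue on this channel within tolerance_s of the decoded offset."""
    for cue in cues:
        if cue.channel_id == channel_id and abs(cue.offset_s - offset_s) <= tolerance_s:
            return cue
    return None


# Usage: the decoder reports channel "news-1" at 121.2 s into the program.
cues = [
    Cue("news-1", 120.0, "Which city hosted the summit?"),
    Cue("sports-2", 45.0, "Who scored the first goal?"),
]
match = find_cue(cues, "news-1", 121.2)
```

Because the lookup depends only on the decoded channel and offset, it behaves the same for a live broadcast and for a recording watched later.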
Audio steganography also falls under the broader term Digital Watermarking, which is widely used in counter-piracy systems that patch an AV signal on its way to a cinema. This makes it possible to determine not only whether a screen copy was illegally recorded, but also when and where it was recorded, making it easier to investigate piracy incidents, which remain a widespread issue for the media industry.
The requirements for these applications differ substantially: for anti-piracy coding, for example, the hidden markers should be extremely resistant to attempts to remove them, whereas for content-synchronization applications a higher bit rate takes precedence.
Moreover, DataArt’s Computer Vision team is currently working on a number of R&D and production applications with digital signal processing algorithms. The roster of developed applications includes a mobile-based face identification system, a client-server solution for food recognition, and an all-custom augmented reality engine solution.