DSPRelated.com

A Sound Pattern Detection Within A Continuous Audio Stream

Started by Ptomaine January 22, 2006
>Well, all audio watermarking is based on steganography -- the hiding
>of information in another signal. There are a number of approaches,
>and they are usually parameterizable in terms of the amount of data
>that needs to be carried, the robustness against signal processing,
>and the audibility. The latter is strongly affected by the overall
>quality and nature of the "cover" signal.

I'm afraid, David, that audio watermarking is not the right way. The problem is that I cannot control the embedding of audio watermarks, either into the radio audio stream or into the training samples. Moreover, no one would do that just for the sake of my program's features.

The common use of my program's audio event detection feature is monitoring. Every advertisement audio block is surrounded by an intro (pilot or header) and a coda (tail). At the specified time the program starts listening to the incoming audio stream; it must recognize the header (pilot) and then record the whole advertisement block until the coda (tail). The problem is that the training samples of the header and the tail might be recorded and supplied from any source (CD or off the air). I cannot control the quality of that sound and cannot make people embed watermarks. I have to work with the supplied material as it is.
Ptomaine wrote:
> I'm afraid, David, that audio watermarking is not the right way.
> The problem is that I cannot control the process of embedding audio
> watermarks both into the radio audio stream and into the training
> samples.
In that case, you're going to have to rely on some form of audio "fingerprinting". This requires that you figure out what the invariant features of your samples are versus what features vary depending on the delivery path. Then, do a pattern-matching search (a kind of cross-correlation) on the invariant features.

Perceptual coding algorithms can be helpful, because they tend to throw away irrelevant features and preserve important ones. This helps to reduce the amount of data you need to examine.

It may also help to couple the searches for "intro" and "coda". Rather than searching for them independently, look for them in pairs at the expected spacing(s) in time.

A big circular buffer, into which you record continuously, can help with the implementation. Then you just copy the segments you want to keep out to permanent storage once you've identified them.

-- Dave Tweed
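
[Editor's sketch] The fingerprint search described above might look roughly like the following. This is only an illustration under stated assumptions, not Dave's method: it assumes the "invariant feature" is the per-frame log-energy envelope (which a constant gain change merely shifts by a constant), and it matches by normalized cross-correlation. The function names, the frame size of 256 samples and the 0.9 threshold are all hypothetical choices. A real detector would probably use band-wise spectral features and a hop much smaller than the frame, since the pattern will not start on a frame boundary.

```python
import math
import random

FRAME = 256  # samples per analysis frame (assumed value)


def log_energy_envelope(samples, frame=FRAME):
    """Return one log-energy value per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame + 1, frame):
        energy = sum(s * s for s in samples[start:start + frame])
        feats.append(math.log10(energy + 1e-12))  # epsilon avoids log(0)
    return feats


def unit_normalize(vec):
    """Remove the mean and scale to unit length.

    A constant gain on the audio adds a constant to every log-energy
    value, so removing the mean cancels it.
    """
    mean = sum(vec) / len(vec)
    centered = [v - mean for v in vec]
    norm = math.sqrt(sum(c * c for c in centered)) or 1.0
    return [c / norm for c in centered]


def find_pattern(stream_feats, pattern_feats, threshold=0.9):
    """Slide the pattern over the stream features.

    Returns (best frame index, best score), with the index set to None
    when the best normalized correlation is below the threshold.
    """
    pattern = unit_normalize(pattern_feats)
    n = len(pattern)
    best_pos, best_score = None, -1.0
    for pos in range(len(stream_feats) - n + 1):
        window = unit_normalize(stream_feats[pos:pos + n])
        score = sum(p * w for p, w in zip(pattern, window))
        if score > best_score:
            best_pos, best_score = pos, score
    if best_score < threshold:
        return None, best_score
    return best_pos, best_score


# Synthetic demo: a "jingle" with a distinctive loudness contour,
# replayed at half gain between two stretches of low-level noise.
rng = random.Random(0)
levels = [1.0, 0.2, 0.8, 0.1, 0.6, 0.3, 0.9, 0.05]
jingle = [lvl * math.sin(2 * math.pi * 440 * (i * FRAME + k) / 8000)
          for i, lvl in enumerate(levels) for k in range(FRAME)]


def noise(n):
    return [rng.uniform(-0.01, 0.01) for _ in range(n)]


stream = noise(20 * FRAME) + [0.5 * s for s in jingle] + noise(20 * FRAME)

pos, score = find_pattern(log_energy_envelope(stream),
                          log_energy_envelope(jingle))
print("match at frame", pos, "score", round(score, 3))
```

Here the training sample and the broadcast copy differ only in level, and the match is still found because the level difference drops out in the normalization. The same `find_pattern` call could be run for the coda, accepting an intro/coda pair only when the two hits fall at a plausible spacing.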
David Tweed wrote:
> Mark wrote:
> > David Tweed wrote:
> > > Not silly at all. Can you modify the audio stream coming from the
> > > radio station? Back when I was working in audio watermarking, we
> > > did some experiments related to getting a toy to react to sounds
> > > coming over a TV set's speaker. We embedded inaudible triggers
> > > into the cartoon's audio track, and when the watermark decoder in
> > > the toy recognized the trigger via its built-in microphone, it
> > > would toggle an output to make the toy do something.
> >
> > I am having a hard time thinking of a signal that can pass through
> > a TV audio channel (50 Hz to 15 kHz) and TV speaker (even less than
> > that) that could trigger a toy and still be "inaudible".
> >
> > Can you be more specific?
>
> Well, all audio watermarking is based on steganography -- the hiding
> of information in another signal. There are a number of approaches,
> and they are usually parameterizable in terms of the amount of data
> that needs to be carried, the robustness against signal processing,
> and the audibility. The latter is strongly affected by the overall
> quality and nature of the "cover" signal.
>
> Our algorithm could be "dialed in" for a wide range of applications.
> At one end of the spectrum, we could hide quite a bit of data in CD
> quality music with no audibility issues except for some very highly
> trained listeners who knew exactly what they were looking for. For
> TV audio applications, the data requirements were much less, the
> robustness needed to be higher, and audibility was less of a concern.
> The low quality of audio in the soundtrack to begin with could hide
> a lot. You couldn't hear our signal in a TV speaker, but you could
> definitely hear that particular watermark in the CD quality test
> environment with a better cover signal.
>
> It was my job to build a very low-cost decoder that could be built
> into a toy. I ended up with a cheap electret microphone, about four
> stages of opamps for gain and filtering, and an 8-bit microcontroller
> (6502 derivative) to run the algorithm. We could play the audio
> through a cheap PC speaker at one end of a conference room table,
> and our box at the other end would light up a sequence of LEDs to
> show that it had seen the triggers, even while people in the room
> were talking. It was a pretty impressive demo, but I don't know
> whether it ever made its way into any real products. This was about
> six years ago.
>
> -- Dave Tweed
OK, so your watermark signal is "hidden" or masked under a normal audio program, not just played out during silence.

As a crude analogy, your tree is hidden in the forest, not in an open field.

Does it survive being fed through perceptual encoding, MP3 for example?

Thanks,
Mark
Mark wrote:
> OK, so your watermark signal is "hidden" or masked under a normal
> audio program, not just played out during silence....
>
> as a crude analogy, your tree is hidden in the forest, not in an open
> field....
Exactly. But when was the last time you ever heard more than a millisecond of silence in a TV program for kids? :-)
> does it survive being fed through perceptual encoding i.e. MP3 for
> example?

I don't remember for sure what the results were, but yes, we were doing a lot of work in that area. We could definitely survive one pass through MP3, and we were trying to address the tandem coding issues related to watermarking material that was *already* MP3. If you weren't careful, you'd lose too much audio quality independent of the watermark.

-- Dave Tweed
David Tweed wrote:
> It was a pretty impressive demo, but I don't know
> whether it ever made its way into any real products. This was about
> six years ago.

Maybe it made its way into these:
http://www.cnn.com/TECH/ptech/9903/26/teletubbies.idg/
The time frame's about right.

--
Jim Thomas
Principal Applications Engineer
Bittware, Inc
jthomas@bittware.com
http://www.bittware.com
(603) 226-0404 x536
The sooner you get behind, the more time you'll have to catch up
Jim Thomas wrote:
> David Tweed wrote:
> > It was a pretty impressive demo, but I don't know
> > whether it ever made its way into any real products.
> > This was about six years ago.
>
> Maybe it made its way into these:
> http://www.cnn.com/TECH/ptech/9903/26/teletubbies.idg/
> The time frame's about right.

No, it explicitly says that those use RF. I remember reading about that at the time, and we were sort-of competing against that technology. We had the advantage of not requiring any additional hardware on the TV set itself.

-- Dave Tweed
David Tweed wrote:
> Jim Thomas wrote:
>
>>David Tweed wrote:
>>
>>>It was a pretty impressive demo, but I don't know
>>>whether it ever made its way into any real products.
>>>This was about six years ago.
>>
>>Maybe it made its way into these:
>>http://www.cnn.com/TECH/ptech/9903/26/teletubbies.idg/
>>The time frame's about right.
>
>
> No, it explicitly says that those use RF. I remember reading
> about that at the time, and we were sort-of competing against
> that technology. We had the advantage of not requiring any
> additional hardware on the TV set itself.

Actimates came with a decoder that connected to the audio (I think) output of the TV. I assume it sent commands to the toy via RF. Maybe they were using some sort of audio watermarking, and maybe it was not as robust as the one you worked on (thus requiring a clean audio path between the TV and the decoder).

I bought one for the purpose of maintaining marital bliss, but it nearly killed me. Teletubbies + Microsoft = (Gag Reflex)^2

--
Jim Thomas
Principal Applications Engineer
Bittware, Inc
jthomas@bittware.com
http://www.bittware.com
(603) 226-0404 x536
The sooner you get behind, the more time you'll have to catch up