comp.dsp | A Sound Pattern Detection Within A Continuous Audio Stream | page 2

Reply by Ptomaine ● January 24, 2006

>Well, all audio watermarking is based on steganography -- the hiding
>of information in another signal. There are a number of approaches,
>and they are usually parameterizable in terms of the amount of data
>that needs to be carried, the robustness against signal processing,
>and the audibility. The latter is strongly affected by the overall
>quality and nature of the "cover" signal.

I'm afraid, David, that audio watermarking is not the right way. The
problem is that I cannot control the embedding of audio watermarks
in either the radio audio stream or the training samples. Moreover,
no one would do that just for the sake of my program's features.

The common use of my program's audio event detection feature is
monitoring. Every advertisement block is surrounded by an intro
(pilot or header) and a coda (tail). At a specified time the program
starts listening to the incoming audio stream; it must recognize the
header (pilot) and record the whole advertisement block until the
coda (tail). The problem is that the training samples of the header and
the tail might be recorded and supplied from any source (CD or off the
air). I cannot control the quality of that sound and cannot make people
embed watermarks. I have to work with the supplied material as is.
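
The capture logic described above (wait for the header, record until the coda) can be sketched as a two-state machine. This is only an illustration: `capture_blocks`, `is_header`, and `is_coda` are hypothetical names, and the two detector callbacks stand in for whatever recognizer ends up being used.

```python
from enum import Enum, auto


class State(Enum):
    WAITING = auto()    # listening for the header (pilot)
    RECORDING = auto()  # inside an advertisement block


def capture_blocks(frames, is_header, is_coda):
    """Return the captured blocks, each a list of frames.

    frames    -- iterable of audio frames (any objects)
    is_header -- hypothetical detector: frame -> bool
    is_coda   -- hypothetical detector: frame -> bool
    A block that never reaches its coda is discarded.
    """
    state = State.WAITING
    current = []
    blocks = []
    for frame in frames:
        if state is State.WAITING:
            if is_header(frame):
                # Header seen: start a new block, header included.
                state = State.RECORDING
                current = [frame]
        else:
            current.append(frame)
            if is_coda(frame):
                # Coda seen: close the block and go back to listening.
                blocks.append(current)
                current = []
                state = State.WAITING
    return blocks


# Toy usage: single characters stand in for audio frames.
frames = ["x", "H", "a", "b", "C", "y", "H", "c", "C"]
blocks = capture_blocks(frames, lambda f: f == "H", lambda f: f == "C")
```

In a real program each detector would look at a window of audio rather than a single frame, but the state transitions stay the same.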

Reply by David Tweed ● January 24, 2006

Ptomaine wrote:
> I'm afraid, David, that audio watermarking is not the right way.
> The problem is that I cannot control the process of embedding audio
> watermarks both into the radio audio stream and into the training
> samples.

In that case, you're going to have to rely on some form of audio
"fingerprinting". This requires that you figure out what the invariant
features of your samples are versus what features vary depending on
the delivery path. Then, do a pattern-matching search (a kind of
cross-correlation) on the invariant features.
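
As a rough sketch of that idea, here is one possible invariant feature and a matching search. The feature (whether frame energy rises from one frame to the next) is an assumption chosen for illustration because it ignores overall gain; it is not necessarily the feature one would pick for real broadcast audio. The search counts agreeing bits at each offset, a binary stand-in for cross-correlation. All function names are hypothetical, and the sketch assumes the stream and the template are framed on the same boundaries.

```python
def frame_energies(samples, frame_len):
    """Sum of squares over consecutive non-overlapping frames."""
    n = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n)
    ]


def fingerprint(samples, frame_len):
    """One bit per frame boundary: 1 if energy rises, else 0."""
    e = frame_energies(samples, frame_len)
    return [1 if b > a else 0 for a, b in zip(e, e[1:])]


def best_match(stream_fp, template_fp):
    """Slide the template over the stream fingerprint.

    Returns (offset, score), where score is the fraction of agreeing
    bits at the best offset, or (-1, 0.0) if nothing scores above zero.
    """
    best = (-1, 0.0)
    m = len(template_fp)
    for off in range(len(stream_fp) - m + 1):
        window = stream_fp[off:off + m]
        agree = sum(1 for a, b in zip(window, template_fp) if a == b)
        score = agree / m
        if score > best[1]:
            best = (off, score)
    return best


def tone_frames(amplitudes, frame_len):
    """Toy signal: each frame is a constant at the given amplitude."""
    return [a for a in amplitudes for _ in range(frame_len)]


FRAME = 4
template = tone_frames([1, 2, 3, 1, 4, 2, 5, 6, 3], FRAME)
# The same pattern at half the level, embedded three frames into a stream.
stream = tone_frames(
    [2, 2, 2, 0.5, 1, 1.5, 0.5, 2, 1, 2.5, 3, 1.5, 1, 1, 1], FRAME
)
offset, score = best_match(fingerprint(stream, FRAME),
                           fingerprint(template, FRAME))
```

In this toy case the half-level copy is found at frame offset 3 with every bit agreeing, because the feature depends only on whether energy rises, not on its absolute value.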

Perceptual coding algorithms can be helpful, because they tend to
throw away irrelevant features and preserve important ones. This
helps to reduce the amount of data you need to examine.

It may also help to couple the searches for "intro" and "coda".
Rather than searching for them independently, look for them in
pairs at the expected spacing(s) in time. A big circular buffer,
into which you record continuously, can help with the implementation.
Then you just copy the segments you want to keep out to permanent
storage once you've identified them.
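
A minimal sketch of both suggestions, under stated assumptions: `RingBuffer` and `best_pair` are hypothetical names, the per-position scores are assumed to come from some separate intro/coda detector, and positions are counted in whatever unit that detector uses.

```python
class RingBuffer:
    """Fixed-capacity buffer that keeps only the most recent samples."""

    def __init__(self, capacity):
        self.data = [0] * capacity
        self.capacity = capacity
        self.total = 0  # number of samples ever written

    def write(self, samples):
        for s in samples:
            self.data[self.total % self.capacity] = s
            self.total += 1

    def read(self, start, end):
        """Copy absolute positions [start, end) if still buffered."""
        oldest = max(0, self.total - self.capacity)
        if start < oldest or end > self.total or start > end:
            raise ValueError("span is no longer (or not yet) buffered")
        return [self.data[i % self.capacity] for i in range(start, end)]


def best_pair(intro_scores, coda_scores, spacings, threshold):
    """Pick the intro/coda pair with the highest combined score.

    intro_scores, coda_scores -- per-position scores from some detector
    spacings  -- allowed values of coda_pos - intro_pos
    threshold -- minimum combined score to accept a pair
    Returns (intro_pos, coda_pos, combined_score) or None.
    """
    best = None
    for i, si in enumerate(intro_scores):
        for d in spacings:
            j = i + d
            if j < len(coda_scores):
                total = si + coda_scores[j]
                if total >= threshold and (best is None or total > best[2]):
                    best = (i, j, total)
    return best


# Toy usage: the strongest coda score alone is at position 7, but the
# pair at the expected spacing of 4 is intro 1 with coda 5.
intro = [0.1, 0.9, 0.2, 0.6, 0.1, 0.1, 0.1, 0.1]
coda = [0.1, 0.1, 0.1, 0.2, 0.1, 0.8, 0.1, 0.95]
pair = best_pair(intro, coda, spacings=[4], threshold=1.5)

rb = RingBuffer(8)
rb.write(range(20))
segment = rb.read(14, 18)  # still inside the last 8 samples
```

The buffer has to be large enough to hold the longest expected block plus the detection latency, otherwise the intro will already have been overwritten by the time the coda is confirmed.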

-- Dave Tweed

Reply by Mark ● January 25, 2006

David Tweed wrote:
> Mark wrote:
> > David Tweed wrote:
> > > Not silly at all. Can you modify the audio stream coming from the
> > > radio station? Back when I was working in audio watermarking, we
> > > did some experiments related to getting a toy to react to sounds
> > > coming over a TV set's speaker. We embedded inaudible triggers
> > > into the cartoon's audio track, and when the watermark decoder in
> > > the toy recognized the trigger via its built-in microphone, it
> > > would toggle an output to make the toy do something.
> >
> > I am having a hard time thinking of a signal that can pass through
> > a TV audio channel (50 Hz to 15 kHz) and TV speaker (even less than
> > that) that could trigger a toy and still be "inaudible".
> >
> > Can you be more specific?
>
> Well, all audio watermarking is based on steganography -- the hiding
> of information in another signal. There are a number of approaches,
> and they are usually parameterizable in terms of the amount of data
> that needs to be carried, the robustness against signal processing,
> and the audibility. The latter is strongly affected by the overall
> quality and nature of the "cover" signal.
>
> Our algorithm could be "dialed in" for a wide range of applications.
> At one end of the spectrum, we could hide quite a bit of data in CD
> quality music with no audibility issues except for some very highly
> trained listeners who knew exactly what they were looking for. For
> TV audio applications, the data requirements were much less, the
> robustness needed to be higher, and audibility was less of a concern.
> The low quality of audio in the soundtrack to begin with could hide
> a lot. You couldn't hear our signal in a TV speaker, but you could
> definitely hear that particular watermark in the CD quality test
> environment with a better cover signal.
>
> It was my job to build a very low-cost decoder that could be built
> into a toy. I ended up with a cheap electret microphone, about four
> stages of opamps for gain and filtering, and an 8-bit microcontroller
> (6502 derivative) to run the algorithm. We could play the audio
> through a cheap PC speaker at one end of a conference room table,
> and our box at the other end would light up a sequence of LEDs to
> show that it had seen the triggers, even while people in the room
> were talking. It was a pretty impressive demo, but I don't know
> whether it ever made its way into any real products. This was about
> six years ago.
>
> -- Dave Tweed

OK, so your watermark signal is "hidden" or masked under a normal
audio program, not just played out during silence....

As a crude analogy, your tree is hidden in the forest, not in an open
field....

Does it survive being fed through perceptual encoding, e.g. MP3?

thanks

Mark

Reply by David Tweed ● January 25, 2006

Mark wrote:
> OK, so your watermark signal is "hidden" or masked under a normal
> audio program, not just played out during silence....
>
> As a crude analogy, your tree is hidden in the forest, not in an open
> field....

Exactly. But when was the last time you ever heard more than a
millisecond of silence in a TV program for kids? :-)

> Does it survive being fed through perceptual encoding, e.g. MP3?

I don't remember for sure what the results were, but yes, we were
doing a lot of work in that area. We could definitely survive one
pass through MP3, and we were trying to address the tandem coding
issues related to watermarking material that was *already* MP3. If
you weren't careful, you'd lose too much audio quality independent
of the watermark.

-- Dave Tweed

Reply by Jim Thomas ● January 25, 2006

David Tweed wrote:
> It was a pretty impressive demo, but I don't know
> whether it ever made its way into any real products. This was about
> six years ago.

Maybe it made its way into these:

http://www.cnn.com/TECH/ptech/9903/26/teletubbies.idg/

The time frame's about right.

-- 
Jim Thomas            Principal Applications Engineer  Bittware, Inc
jthomas@bittware.com  http://www.bittware.com    (603) 226-0404 x536
The sooner you get behind, the more time you'll have to catch up

Reply by David Tweed ● January 25, 2006

Jim Thomas wrote:
> David Tweed wrote:
> > It was a pretty impressive demo, but I don't know
> > whether it ever made its way into any real products.
> > This was about six years ago.
> 
> Maybe it made its way into these:
> http://www.cnn.com/TECH/ptech/9903/26/teletubbies.idg/
> The time frame's about right.

No, it explicitly says that those use RF. I remember reading
about that at the time, and we were sort-of competing against
that technology. We had the advantage of not requiring any
additional hardware on the TV set itself.

-- Dave Tweed

Reply by Jim Thomas ● January 25, 2006

David Tweed wrote:
> Jim Thomas wrote:
> 
>>David Tweed wrote:
>>
>>>It was a pretty impressive demo, but I don't know
>>>whether it ever made its way into any real products.
>>>This was about six years ago.
>>
>>Maybe it made its way into these:
>>http://www.cnn.com/TECH/ptech/9903/26/teletubbies.idg/
>>The time frame's about right.
> 
> 
> No, it explicitly says that those use RF. I remember reading
> about that at the time, and we were sort-of competing against
> that technology. We had the advantage of not requiring any
> additional hardware on the TV set itself.

Actimates came with a decoder that connected to the audio (I think) 
output of the TV.  I assume it sent commands to the toy via RF.

Maybe they were using some sort of audio watermarking, and maybe it was 
not as robust as the one you worked on (thus requiring a clean audio 
path between the TV and the decoder).

I bought one for the purpose of maintaining marital bliss, but it nearly 
killed me.  Teletubbies + Microsoft = (Gag Reflex)^2

-- 
Jim Thomas            Principal Applications Engineer  Bittware, Inc
jthomas@bittware.com  http://www.bittware.com    (603) 226-0404 x536
The sooner you get behind, the more time you'll have to catch up
