DSPRelated.com
Forums

embedding messages (ID numbers) in audio

Started by Natalie December 15, 2008
Hi everyone,
I want to embed ID numbers in audio so that I can recover them on a
separate device.  Some of my constraints:
-The audio will be transmitted from a standard TV/DVD set-up (so DVD
encoding, TV speakers).
-The audio will be received and processed on a cell phone (so cell
phone microphone, processing capabilities (but can be a high-end cell
phone))
-There will be significant background noise picked up by the
microphone
-The ID numbers will occur about every 30 seconds, and each needs to
be about 10-12 bits of data.
-It would be vastly preferable if the encoding didn't alter the
original audio track significantly; that is, if people listening to
the audio couldn't really tell that there was data embedded in the
audio

I've looked at the literature on audio watermarking, but this seems to
differ from what I want to do in a couple of significant ways:
-I'm not worried about a malicious attacker, so detection and spoofing
aren't concerns
-I'm broadcasting over a very noisy channel

I'd appreciate any info that people could point me toward -
applications, literature, websites, whatever.
Thanks!!!
I doubt that you'll find literature on your specific application.

You'll have to study up on cell-phone encoders -- I don't know what gets used now, but you want to make sure that whatever you use doesn't get compressed right out of the data stream; that may be a challenge in itself.

--
Tim Wescott
Control systems and communications consulting
http://www.wescottdesign.com

Need to learn how to apply control theory in your embedded system?
"Applied Control Theory for Embedded Systems" by Tim Wescott
Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html
You might look at the methods used to transmit CTCSS subtones. A sequence of subtones could encode any desired information. The subtones are normally used as a static channel identifier and are assigned to a band that is propagated through the system but removed from the audio at the receiver, as in:
http://www.ofcom.org.uk/static/archive/ra/publication/mpt/mpt_pdf/mpt1306.pdf

Dale B. Dalrymple
http://dbdimages.com
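To make the "sequence of subtones" idea concrete, here is a minimal sketch, not anything taken from the linked specification. It assumes four low-frequency tones (each carrying 2 bits), an 8 kHz sample rate, 0.25 s per tone, and Gaussian noise as a stand-in for everything else the microphone picks up. All of those numbers, and the names `encode`, `decode` and `goertzel_power`, are illustrative assumptions. Detection uses the Goertzel recurrence, a common way to measure power at a handful of known frequencies.

```python
import math
import random

FS = 8000                            # assumed sample rate, Hz
SYMBOL_LEN = 2000                    # assumed 0.25 s per tone
TONES = [67.0, 88.5, 114.8, 151.4]   # assumed tone set; each tone carries 2 bits

def goertzel_power(samples, freq):
    """Power near `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / FS)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def encode(id_number, n_symbols, rng):
    """Send the ID two bits at a time, most significant pair first."""
    audio = []
    for k in range(n_symbols - 1, -1, -1):
        freq = TONES[(id_number >> (2 * k)) & 3]
        audio += [0.3 * math.sin(2 * math.pi * freq * n / FS) + rng.gauss(0.0, 0.5)
                  for n in range(SYMBOL_LEN)]
    return audio

def decode(audio):
    """Pick the strongest tone in each symbol period and rebuild the ID."""
    value = 0
    for start in range(0, len(audio), SYMBOL_LEN):
        block = audio[start:start + SYMBOL_LEN]
        powers = [goertzel_power(block, f) for f in TONES]
        value = (value << 2) | powers.index(max(powers))
    return value

rng = random.Random(3)               # fixed seed so the run is repeatable
audio = encode(2741, 6, rng)         # 6 tones x 2 bits = a 12-bit ID
decoded = decode(audio)
```

The sketch assumes the decoder already knows where each tone starts, and it says nothing about whether a TV speaker or a phone microphone passes tones this low. Both would need to be checked on real hardware.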
I failed to mention that if the compression is _not_ as fancy as I think it is, then you probably want some spread spectrum with a high spreading rate, one that splats the signal across 300 Hz to 3 kHz. That'll keep it down in the mud as far as audibility is concerned, but let it be recovered at the other end.

This would work great on a regular telephone line, but I wouldn't promise that it'll work on any old cell phone.

--
Tim Wescott
Control systems and communications consulting
http://www.wescottdesign.com
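A rough sketch of the spread-spectrum idea, under loud assumptions: the host audio is modelled as unit-variance Gaussian noise, the watermark sits 20 dB below it, each bit is spread over 4000 pseudo-random chips, and the receiver is perfectly synchronised. The sketch does not band-limit the chips to 300 Hz to 3 kHz, which a real design along the lines described above would need to do. The names `pn_sequence`, `embed` and `recover` are made up for illustration.

```python
import random

CHIPS_PER_BIT = 4000    # assumed spreading factor
WATERMARK_GAIN = 0.1    # assumed: watermark 20 dB below the host audio

def pn_sequence(length, seed=1234):
    """Pseudo-random +/-1 chips shared by sender and receiver."""
    rng = random.Random(seed)
    return [1.0 if rng.random() < 0.5 else -1.0 for _ in range(length)]

def embed(host, bits, pn):
    """Add each bit, spread over CHIPS_PER_BIT chips, on top of the host."""
    out = list(host)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        base = i * CHIPS_PER_BIT
        for k in range(CHIPS_PER_BIT):
            out[base + k] += WATERMARK_GAIN * sign * pn[base + k]
    return out

def recover(received, n_bits, pn):
    """Correlate each bit period against the known chips; the sign is the bit."""
    bits = []
    for i in range(n_bits):
        base = i * CHIPS_PER_BIT
        corr = sum(received[base + k] * pn[base + k]
                   for k in range(CHIPS_PER_BIT))
        bits.append(1 if corr > 0 else 0)
    return bits

id_bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]   # a 12-bit ID
n = len(id_bits) * CHIPS_PER_BIT
pn = pn_sequence(n)
noise = random.Random(99)                        # stand-in for programme audio
host = [noise.gauss(0.0, 1.0) for _ in range(n)]
received = embed(host, id_bits, pn)
recovered = recover(received, len(id_bits), pn)
```

The correlation adds the watermark coherently (gain times chip count) while the host adds only as the square root of the chip count, which is why a signal well below the host level can still be read. Speaker, room, and microphone distortion are not modelled here.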

Key question: can you modify the vocoder on the receive side? If you can, then embedding data and making it inaudible to the listener is not a problem. If you can't, then your options are very limited.
> I've looked at the literature on audio watermarking,
You've looked in the right direction. The "controllable echo" is the most robust scheme.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
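Assuming "controllable echo" refers to the echo-hiding family of watermarking schemes, the idea can be sketched as follows: each bit adds a faint echo of the host at one of two delays, and the receiver checks which delay shows the stronger autocorrelation. The segment length, delays, echo gain, and the use of Gaussian noise as the host are all illustrative assumptions. Real programme audio has its own autocorrelation structure, and practical detectors often work on the cepstrum instead.

```python
import random

SEG = 4000      # assumed samples per embedded bit
DELAY0 = 50     # assumed echo delay (samples) meaning "0"
DELAY1 = 80     # assumed echo delay (samples) meaning "1"
ALPHA = 0.3     # assumed echo gain

def embed_echo(host, bits):
    """Add a faint echo to each segment; the delay chosen encodes the bit."""
    out = list(host)
    for i, bit in enumerate(bits):
        d = DELAY1 if bit else DELAY0
        start = i * SEG
        for n in range(start + d, start + SEG):
            out[n] += ALPHA * host[n - d]
    return out

def autocorr(seg, lag):
    """Unnormalised autocorrelation of one segment at a single lag."""
    return sum(seg[n] * seg[n - lag] for n in range(lag, len(seg)))

def detect_echo(signal, n_bits):
    """Decide each bit by comparing autocorrelation at the two delays."""
    bits = []
    for i in range(n_bits):
        seg = signal[i * SEG:(i + 1) * SEG]
        bits.append(1 if autocorr(seg, DELAY1) > autocorr(seg, DELAY0) else 0)
    return bits

id_bits = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1]   # a 12-bit ID
rng = random.Random(5)                           # stand-in for programme audio
host = [rng.gauss(0.0, 1.0) for _ in range(len(id_bits) * SEG)]
marked = embed_echo(host, id_bits)
detected = detect_echo(marked, len(id_bits))
```

Room reverberation between the TV and the phone adds echoes of its own, so the delays and gain would have to be chosen and tested against the actual playback environment.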
Hi, Natalie. Sounds like an interesting problem. <grin!>

On Mon, 15 Dec 2008 15:25:20 -0800 (PST), Natalie <natlinnell@gmail.com> wrote:
> Hi everyone, > I want to embed ID numbers in audio so that I can recover them on a > separate device. Some of my constraints: > -The audio will be transmitted from a standard TV/DVD set-up (so DVD > encoding, TV speakers). > -The audio will be received and processed on a cell phone (so cell > phone microphone, processing capabilities (but can be a high-end cell > phone)) > -There will be significant background noise picked up by the > microphone
Just to clarify: in your application,

  - The original_sound+ID signal will issue from the speakers of an arbitrary consumer-level "off-the-rack" TV set, stereo, or similar device, although it might be part of a cable TV channel airing, radio broadcast, or a CD or DVD.

  - It will propagate through normal atmosphere for distances of (say) 1-20 meters, and is subject to surrounding noise sources such as conversations from nearby cubicles, A/C, Muzak&friends, toilets flushing in the next room, and so forth. <grin!>

  - It will be heard by anyone in the surrounding area.

  - It will be picked up by a cellphone (again, an arbitrary consumer telephone).

One question: will the ID be processed by a cellphone app, and, say, display a message on the cellphone screen?

Or will the cellphone be used as a dial-in microphone, that is, will it send the audio signal to some fixed remote site where your application will process it?
> -The ID numbers will occur about every 30 seconds, and each needs to > be about 10-12 bits of data. > -It would be vastly preferable if the encoding didn't alter the > original audio track significantly; that is, if people listening to > the audio couldn't really tell that there was data embedded in the > audio
For example, suppose you were designing a system that would (say) allow an advertising agency representative to walk into a TV store with a cellphone and find out which of their ads were showing at the moment. Mixing in (say) high-volume DTMF (touch-tone) digits every 30 seconds would accomplish your _ID_ purpose, but it might be considered somewhat... distracting. <grin!>
> I've looked at the literature on audio watermarking, but this seems to > differ from what I want to do in a couple of significant ways: > -I'm not worried about a malicious attacker, so detection and spoofing > aren't concerns > -I'm broadcasting over a very noisy channel > > I'd appreciate any info that people could point me toward - > applications, literature, websites, whatever.
I don't know if it will work over a cellphone link, and the data rate may not meet your specifications, but the NIST WWV time broadcasts contain digital data mixed in with audio by using a 100Hz subcarrier. You can obtain a description of the format used from NIST Special Publication 250-67 "NIST Time and Frequency Radio Stations" by searching for "250-67" at:

    National Institute of Standards and Technology
    http://www.nist.gov/

Whatever method you finally choose, given the possible variations in ambient noise I think you ought to assume that the delivered signal would be extremely error-prone (in multiple ways), and intermittently delivered (cell signal loss, PA loudspeaker, etc.) and plan accordingly.

I'd also advise that, after you have a final approach and have tested the error rate, you make your (proposed?) customer aware of the system's limitations ("I know you'd prefer up-to-the-second reports, but cell towers _have_ been known to fail occasionally..."). Your goal in doing this would not be to "talk down" your work <grin!> but to explain the constraints it has to live with -- mostly not of your making. Customers don't generally like surprises.

Hope this helps. At the very least, your correcting any of my mistaken assumptions will help clarify the situation to the other readers.

Frank McKenney
--
    Over the ages, the condition of the arts has been seen as a part
    -- a striking and important part -- of the exercise of critical
    imagination, of the human mind, in their broader compass.
                  -- Robert Conquest, "The Dragons of Expectation"
--
Frank McKenney, McKenney Associates
Richmond, Virginia / (804) 320-4887
Munged E-mail: frank uscore mckenney ayut mined spring dawt cahm (y'all)
Ah! So sorry not to be completely clear. You were correct on all your clarifying points. As to your question: I'd like to do the processing _on_ the cellphone. Thanks for the pointers, I'm reading through the docs. This isn't my specialty, so it's kinda slow going. :)
And do you mean that you get a new ID number every 30 seconds, or that the same number is repeated over and over every 30 seconds, so you have many chances to figure it out?

Either way, you have a very difficult problem if it has to be inaudible to the average person.

Mark
On Tue, 16 Dec 2008 11:23:58 -0800 (PST), Natalie <natlinnell@gmail.com> wrote:
> Ah! So sorry not to be completely clear. You were correct on all > your clarifying points.
S'OK. It's difficult to absorb many details in one pass. For example, it wasn't until this re-(re-*)reading of your original post that I finally read the phrase "about 10-12 bits of data" correctly: I had been reading it as "bytes". <grin!>

Which means that the NIST WWVx encoding of one pulse per second would give you 30 bits every 30 seconds, enough for some redundancy/error correction. You will also need some sort of "preamble" or distinct synchronization pattern so the software can tell when the ID bitstream begins.
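One way the 30 one-second slots might be spent, purely as an illustration: a 7-bit Barker code as the preamble, the 12-bit ID, and an 8-bit CRC for error detection, which uses 7 + 12 + 8 = 27 slots and leaves 3 idle. The frame layout, the CRC polynomial, and the names `build_frame` and `find_id` are assumptions made for this sketch, not part of any WWV format.

```python
PREAMBLE = [1, 1, 1, 0, 0, 1, 0]   # 7-bit Barker code, assumed sync pattern
ID_BITS = 12
CRC_BITS = 8

def crc8(bits, poly=0x07):
    """Bitwise CRC-8 (polynomial x^8 + x^2 + x + 1) over a list of bits."""
    reg = 0
    for b in bits:
        top = ((reg >> 7) & 1) ^ b
        reg = (reg << 1) & 0xFF
        if top:
            reg ^= poly
    return [(reg >> i) & 1 for i in range(CRC_BITS - 1, -1, -1)]

def build_frame(id_number):
    """Preamble, then the ID most significant bit first, then its CRC."""
    id_bits = [(id_number >> i) & 1 for i in range(ID_BITS - 1, -1, -1)]
    return PREAMBLE + id_bits + crc8(id_bits)

def find_id(stream):
    """Slide along the bitstream; accept the first preamble whose CRC checks."""
    frame_len = len(PREAMBLE) + ID_BITS + CRC_BITS
    for start in range(len(stream) - frame_len + 1):
        if stream[start:start + len(PREAMBLE)] != PREAMBLE:
            continue
        body = stream[start + len(PREAMBLE):start + frame_len]
        id_bits, check = body[:ID_BITS], body[ID_BITS:]
        if crc8(id_bits) == check:
            return int("".join(map(str, id_bits)), 2)
    return None

frame = build_frame(2741)
stream = [0] * 5 + frame + [0] * 4   # frame embedded in otherwise idle bits
decoded = find_id(stream)
```

A CRC only detects errors. If the same ID repeats, frames that fail the check can simply be dropped and the next repetition used, which is one cheap form of the redundancy mentioned above.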
> ... As to your question: I'd like to do the > processing _on_ the cellphone.
Ah. That's good: you don't need to worry about any mashing of the sound signal within the cellphone network.

But it could make the software development and distribution a bit trickier if your application needs to be able to be executed on AnyOldCellphone(tm), including the SuperSecret Model X With "Revolutionary DSP-Based Investment Analysis", which will be announced next Friday. ("But wait! There's more!" ... <grin!>)
> ... Thanks for the pointers, I'm reading > through the docs. This isn't my specialty, so it's kinda slow > going. :)
If it helps, the WWV stuff is encoded as narrow (170ms=0) and wide (470ms=1) bursts of 100Hz tone. From your point of view, I think, what's important is "my software can tell the difference between a 1, a 0, and nothing" and "most people don't notice it".

Frank McKenney
--
    Physics is mathematical not because we know so much about the
    physical world, but because we know so little; it is only its
    mathematical properties that we can discover.
                                        -- Bertrand Russell
--
Frank McKenney, McKenney Associates
Richmond, Virginia / (804) 320-4887
Munged E-mail: frank uscore mckenney ayut mined spring dawt cahm (y'all)
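The "1, 0, or nothing" decision can be sketched as a pulse-width measurement. This is a toy under stated assumptions: an 8 kHz sample rate, one symbol per one-second slot with the burst starting at the slot boundary, only light background noise, and energy thresholds picked by hand. The function names are hypothetical. With programme audio mixed in, a narrow filter around 100 Hz would be needed before measuring the burst length.

```python
import math
import random

FS = 8000                 # assumed sample rate, Hz
TONE_HZ = 100
BLOCK = FS // TONE_HZ     # 80 samples = one tone cycle = 10 ms
SLOT = FS                 # one-second slot per symbol

def make_slot(symbol, rng):
    """One second of audio: a 170 ms (0) or 470 ms (1) burst, or no tone."""
    width_ms = {None: 0, 0: 170, 1: 470}[symbol]
    n_tone = FS * width_ms // 1000
    return [(math.sin(2 * math.pi * TONE_HZ * n / FS) if n < n_tone else 0.0)
            + rng.gauss(0.0, 0.05) for n in range(SLOT)]

def decode_slot(samples):
    """Count 10 ms blocks that contain tone energy, then classify the width."""
    active = 0
    for start in range(0, len(samples), BLOCK):
        energy = sum(s * s for s in samples[start:start + BLOCK])
        if energy > 10.0:         # a full-scale tone cycle gives about 40
            active += 1
    width_ms = active * 10
    if width_ms < 80:
        return None               # nothing sent in this slot
    return 0 if width_ms < 320 else 1

rng = random.Random(7)
symbols = [1, 0, None, 1, 0]
audio = [s for sym in symbols for s in make_slot(sym, rng)]
decoded = [decode_slot(audio[i:i + SLOT]) for i in range(0, len(audio), SLOT)]
```

The 320 ms decision point is simply halfway between the two nominal widths, and the 80 ms floor separates a short burst from no tone at all.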
On Dec 17, 9:02&#4294967295;am, Frnak McKenney
<fr...@far.from.the.madding.crowd.com> wrote:
> On Tue, 16 Dec 2008 11:23:58 -0800 (PST), Natalie <natlinn...@gmail.com> wrote: > > On Dec 16, 6:09 am, Frnak McKenney > ><fr...@far.from.the.madding.crowd.com> wrote: > >> Hi, Natalie. Sounds like an interesting problem. <grin!> > > >> On Mon, 15 Dec 2008 15:25:20 -0800 (PST), Natalie <natlinn...@gmail.com> wrote: > >> > Hi everyone, > >> > I want to embed ID numbers in audio so that I can recover them on a > >> > separate device. &#4294967295;Some of my constraints: > >> > -The audio will be transmitted from a standard TV/DVD set-up (so DVD > >> > encoding, TV speakers). > >> > -The audio will be received and processed on a cell phone (so cell > >> > phone microphone, processing capabilities (but can be a high-end cell > >> > phone)) > >> > -There will be significant background noise picked up by the > >> > microphone > > >> Just to clarify: in your application, > > >> &#4294967295; - The original_sound+ID signal will issue from the speakers of an > >> &#4294967295; &#4294967295; arbitrary consumer-level "off-the-rack" TV set, stereo, or > >> &#4294967295; &#4294967295; similar device, although it might be part of a cable TV channel > >> &#4294967295; &#4294967295; airing, radio broadcast, or a CD or DVD. > > >> &#4294967295; - It will propagate through normal atmosphere for distances of > >> &#4294967295; &#4294967295; (say) 1-20 meters, and is subject to surrounding noise sources > >> &#4294967295; &#4294967295; such as conversations from nearby cubicles, A/C, Muzak&friends, > >> &#4294967295; &#4294967295; toilets flushing in the next room, and so forth. &#4294967295;<grin!> > > >> &#4294967295; - It will be heard by anyone in the surrounding area. > > >> &#4294967295; - It will be picked up by a cellphone (again, an arbitrary > >> &#4294967295; &#4294967295; consumer telephone). > > >> One question: &#4294967295;will the ID be processed by a cellphone app, and, > >> say, display a message on the cellphone screen? 
> > >> Or will the cellphone be used as a dial-in microphone, that is, will > >> it send the audio signal to some fixed remote site where your > >> application will process it? > > >> > -The ID numbers will occur about every 30 seconds, and each needs to > >> > be about 10-12 bits of data. > >> > -It would be vastly preferable if the encoding didn't alter the > >> > original audio track significantly; that is, if people listening to > >> > the audio couldn't really tell that there was data embedded in the > >> > audio > > >> For example, if you were designing a system that would (say) allow > >> an advertising agency representative to walk into a TV store with a > >> cellphone and find out which of their ads were showing at the > >> moment. &#4294967295;Mixing in (say) high-volume DTMF (touch-tone) digits every > >> 30 seconds would accomplish your _ID_ purpose, but it might be > >> considered somewhat... &#4294967295;distracting. &#4294967295;<grin!> > > >> > I've looked at the literature on audio watermarking, but this seems to > >> > differ from what I want to do in a couple of significant ways: > >> > -I'm not worried about a malicious attacker, so detection and spoofing > >> > aren't concerns > >> > -I'm broadcasting over a very noisy channel > > >> > I'd appreciate any info that people could point me toward - > >> > applications, literature, websites, whatever. > > >> I don't know if it will work over a cellphone link, and the data > >> rate may not meet your specifications, but the NIST WWV time > >> broadcasts contain digital data mixed in with audio by using a 100Hz > >> subcarrier. 
> >> You can obtain a description of the format used from
> >> NIST Special Publication 250-67 "NIST Time and Frequency Radio
> >> Stations" by searching for "250-67" at:
> >
> >>    National Institute of Standards and Technology
> >>    http://www.nist.gov/
> >
> >> Whatever method you finally choose, given the possible variations in
> >> ambient noise I think you ought to assume that the delivered signal
> >> would be extremely error-prone (in multiple ways), and
> >> intermittently delivered (cell signal loss, PA loudspeaker, etc.)
> >> and plan accordingly.
>
> --snip--
>
> > Ah! So sorry not to be completely clear. You were correct on all
> > your clarifying points.
>
> S'OK. It's difficult to absorb many details in one pass. For
> example, it wasn't until this re-(re-*)reading of your original post
> that I finally read the phrase "about 10-12 bits of data" correctly:
> I had been reading it as "bytes". <grin!>
>
> Which means that the NIST WWVx encoding of one pulse per second
> would give you 30 bits in each 30-second interval, enough for some
> redundancy/error correction. You will also need some sort of
> "preamble" or distinct synchronization pattern so the software can
> tell when the ID bitstream begins.
>
> > ... As to your question: I'd like to do the
> > processing _on_ the cellphone.
>
> Ah. That's good: you don't need to worry about any mashing of the
> sound signal within the cellphone network.
How do you figure? Are you counting on compression algorithms to reliably reproduce what you cannot hear above the regular audio?

Dirk
>
> But it could make the software development and distribution a bit
> trickier if your application needs to be able to be executed on
> AnyOldCellphone(tm), including the SuperSecret Model X With
> "Revolutionary DSP-Based Investment Analysis", which will be
> announced next Friday. ("But wait! There's more! ... <grin!>)
>
> > ... Thanks for the pointers, I'm reading
> > through the docs. This isn't my specialty, so it's kinda slow
> > going. :)
>
> If it helps, the WWV stuff is encoded as narrow (170ms=0) and wide
> (470ms=1) bursts of 100Hz tone. From your point of view, I think,
> what's important is "my software can tell the difference between a
> 1, a 0, and nothing" and "most people don't notice it".
>
> Frank McKenney
> --
>    Physics is mathematical not because we know so much about the
>    physical world, but because we know so little; it is only its
>    mathematical properties that we can discover.
>                                   -- Bertrand Russell
> --
> Frank McKenney, McKenney Associates
> Richmond, Virginia / (804) 320-4887
> Munged E-mail: frank uscore mckenney ayut mined spring dawt cahm (y'all)
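
For anyone following along, the burst scheme quoted above (narrow 170 ms = 0, wide 470 ms = 1, bursts of 100 Hz tone, plus a preamble so the software can tell where the ID starts) can be sketched roughly as below. This is only an illustration under assumptions that are not in the thread: an 8 kHz capture rate, 10 ms analysis frames, a Goertzel filter as the tone detector, a fixed power threshold, +/-50 ms tolerance windows, and a made-up 6-bit PREAMBLE followed by a 12-bit ID. Whether a 100 Hz tone survives TV speakers and a phone microphone at all is a separate question that this sketch does not address.

```python
import math

SAMPLE_RATE = 8000   # Hz; assumed capture rate, not from the thread
TONE_HZ = 100.0      # subcarrier frequency mentioned in the post
FRAME_MS = 10        # assumed frame length: one full cycle of 100 Hz

PREAMBLE = [1, 1, 1, 0, 1, 0]  # hypothetical sync pattern
ID_BITS = 12                   # upper end of the "10-12 bits" requirement


def goertzel_power(samples, sample_rate, freq):
    """Return the squared magnitude of `samples` at `freq` (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def burst_lengths_ms(samples, threshold):
    """Yield the duration (ms) of each run of frames whose tone power exceeds `threshold`."""
    frame = SAMPLE_RATE * FRAME_MS // 1000
    run = 0
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        if goertzel_power(chunk, SAMPLE_RATE, TONE_HZ) > threshold:
            run += 1
        elif run:
            yield run * FRAME_MS
            run = 0
    if run:
        yield run * FRAME_MS


def classify(duration_ms):
    """Map a burst duration to 0 (about 170 ms), 1 (about 470 ms), or None (neither)."""
    if 120 <= duration_ms <= 220:
        return 0
    if 420 <= duration_ms <= 520:
        return 1
    return None


def extract_id(bits):
    """Return the integer ID that follows the first clean PREAMBLE in `bits`, else None."""
    n = len(PREAMBLE)
    for i in range(len(bits) - n - ID_BITS + 1):
        if bits[i:i + n] == PREAMBLE:
            payload = bits[i + n:i + n + ID_BITS]
            if None not in payload:  # skip frames with unreadable bits
                return int("".join(map(str, payload)), 2)
    return None
```

In use, the microphone samples would go through burst_lengths_ms, each duration through classify, and the resulting list of bits (one per second) through extract_id. With these assumed sizes the preamble plus ID occupy 18 of the 30 one-second slots, which leaves 12 for whatever redundancy or error correction is chosen.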