comp.dsp | comparing audio signals to determine delay

Hello people,

I'm kinda new to all this, but I think I'm in the right place for the issue 
I'm trying to solve:

For an assignment in my studies, I have to implement a simulation 
of a large sensor network consisting of nodes equipped with microphones. The 
idea is to let the nodes record the audio from their surroundings, probably 
transform (simplify) it in some way, and communicate this to the others. Thus, 
a node receives information from its neighbours and compares this with its 
own information (for the purpose of the simulation, there will only be one 
main sound source). If it finds that the neighbouring node is "hearing the 
same audio", it calculates the delay between its own signal and the 
neighbour's signal to establish their distance. After some time, hopefully, 
the nodes will be able to establish their location with respect to their 
neighbours.

The issue I'm currently dealing with is how to compare the signals. 
I've googled around and found some material about FFTs, but is this suitable 
in a node with somewhat limited memory and/or CPU capabilities? Other ideas were 
using some sort of peak detection, or dividing the audio into parts of, say, 
20 ms and calculating averages for those parts. Any ideas are welcome!

Thanks for your time,

With kind regards,

Freek Uijtdewilligen
University of Twente, the Netherlands

Reply by robert bristow-johnson ●March 19, 2007

On Mar 19, 8:54 am, "F.B. Uijtdewilligen"
<f.b.uijtdewilli...@student.utwente.nl> wrote:

>
> The issue I'm currently dealing with is which way to compare the signals?
> I've googled around, found some about FFTs, but is this suitable in a node
> with somewhat limited memory and/or CPU-capabilities? Other ideas were
> using some sort of peak-detection or dividing the audio in parts of say
> 20ms, calculating averages for those parts.. Any ideas are welcome!

look into cross-correlation.

suppose y(t) has some delayed component of x(t) in it in addition to
some other signal, v(t), completely unrelated to x(t):

   y(t) = A*x(t-T) + v(t)

then the crosscorrelation:

   R_xy(tau) = integral{ x(t)*y(t+tau)*w(t) dt}

will be maximum when tau = T.   w(t) is just a nice window function to
reduce edge effects and make your integral finite in domain.
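
A minimal discrete-time sketch of the idea above, assuming a rectangular window, a white-noise test signal, and that y can only lag x (so only non-negative lags are searched). The names estimate_delay, N, T and A and all the numbers are illustrative choices, not anything from a real node:

```python
import random

def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes sum_n x[n] * y[n + lag]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Only sum over indices where both x[n] and y[n + lag] exist.
        n_max = min(len(x), len(y) - lag)
        score = sum(x[n] * y[n + lag] for n in range(n_max))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

random.seed(0)                       # fixed seed so the run is repeatable
N, T, A = 2000, 37, 0.6              # hypothetical length, delay (samples), gain
x = [random.gauss(0.0, 1.0) for _ in range(N)]
v = [random.gauss(0.0, 0.5) for _ in range(N)]    # unrelated signal v
# y[n] = A * x[n - T] + v[n], with x taken as zero before n = 0
y = [(A * x[n - T] if n >= T else 0.0) + v[n] for n in range(N)]

lag = estimate_delay(x, y, max_lag=100)
print("estimated delay in samples:", lag)
```

Dividing the lag by the sample rate gives the delay in seconds. The direct sum costs about N multiplies per lag, which is why the range of lags searched matters on a small CPU.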

r b-j

Reply by Freelance Embedded Systems Engineer ●March 19, 2007

You haven't stated what communication bandwidth is available or whether the sensor nodes are battery powered.  It's not likely that you can compare the full audio time signals (20 Hz-20 kHz) unless you have plenty of bandwidth and grid-powered sensors.  Even then you'll need GPS-level timing in order to synchronize the signals (as in acoustic arrays and beamforming).

FFT is doable, but depends on the frequency resolution desired and the required bandwidth.  For example, if this is a security application and you are trying to detect vehicles, then signal detection from 500-1000 Hz with a resolution of 50 Hz might be acceptable for tracking the vehicle with the sensor network.  
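
As a rough worked example of that trade-off, the arithmetic below sizes the FFT for the 500-1000 Hz band at 50 Hz resolution. The 8 kHz sample rate is an assumption made only for illustration; the thread does not say what the nodes use.

```python
fs = 8000.0          # hypothetical sample rate in Hz (not from the thread)
resolution = 50.0    # desired bin spacing in Hz
f_lo, f_hi = 500.0, 1000.0

n_fft = int(round(fs / resolution))        # samples per block
block_ms = 1000.0 * n_fft / fs             # block duration in milliseconds
k_lo = int(round(f_lo / resolution))       # first bin of interest
k_hi = int(round(f_hi / resolution))       # last bin of interest
n_bins = k_hi - k_lo + 1                   # bins to keep per block
values_per_second = n_bins * fs / n_fft    # numbers to transmit per second

print(n_fft, block_ms, n_bins, values_per_second)
# prints: 160 20.0 11 550.0
```

Under these assumptions a node would send 550 magnitudes per second instead of 8000 raw samples, at the cost of a 20 ms time resolution.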

So it depends on the details of application or assumptions that you want to make.

Reply by F.B. Uijtdewilligen ●March 19, 2007

"Freelance Embedded Systems Engineer" <g9u5dd43@yahoo.com> wrote in 
message news:45fec2fb$0$1413$4c368faf@roadrunner.com...
> You haven't stated what communication bandwidth is available or if the 
> sensor nodes are battery powered.  It's not likely that you can compare 
> the time signals of audio signal (20-20kHz), unless you have bandwidth and 
> grid-powered sensors.  Even then you'll need GPS level timing in order to 
> synchronize the signals, (example acoustic arrays and beamforming).
>
> FFT is doable, but depends on the frequency resolution desired and the 
> required bandwidth.  For example, if this is a security application and 
> you are trying to detect vehicles, then signal detection from 500-1000 Hz 
> with a resolution of 50 Hz might be acceptable for tracking the vehicle 
> with the sensor network.
> So it depends on the details of application or assumptions that you want 
> to make.
>

There are some prototype sensor nodes available, which are battery-powered 
and equipped with mics, and I know they already do some filtering of the 
audio signal they capture. I've mailed to ask for more details about 
the nodes, especially their bandwidth capabilities. Would the timing be a 
problem, assuming the nodes all have their clocks synchronized before 
deployment? I don't suppose they would run out of sync once they are properly 
synchronized...

In this first stage, I'm trying to simulate the nodes finding the 
distance to their neighbours; locating or tracking a sound source is a 
later stage. Some of the logic will probably be the same, yet 
it is still a different thing altogether and therefore not part of my 
research.

But thanks for the advice, I'll post when I have more information 
about the nodes.

Reply by Jerry Avins ●March 19, 2007

F.B. Uijtdewilligen wrote:

   ...

> There are some prototype sensor nodes available, which are battery-powered 
> and equipped with mics, and I know they already do some filtering of the 
> audio signal they capture. I've mailed to ask for more details about 
> the nodes, especially their bandwidth capabilities. Would the timing be a 
> problem, assuming the nodes all have their clocks synchronized before 
> deployment? I don't suppose they would run out of sync once they are properly 
> synchronized...

Many general-purpose oscillator crystals have frequency tolerances of 
tens of parts per million, although crystals can be had that are better. How long 
do you need to run before resynching? How much drift can you tolerate? 
What procedure will you use to synch them all?
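
To put rough numbers on those questions, here is a small worst-case sketch. It assumes two clocks drifting in opposite directions at their full tolerance and sound travelling at about 343 m/s; the two tolerance values and the one-hour interval are placeholders only, not measurements of any real node.

```python
SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C

def worst_case_error(tolerance, seconds_since_sync):
    """Worst-case clock offset between two nodes and the resulting range error.

    Assumes the two clocks drift in opposite directions at the full tolerance.
    """
    offset_s = 2.0 * tolerance * seconds_since_sync
    return offset_s, offset_s * SPEED_OF_SOUND

for tol in (1e-7, 2e-5):                     # hypothetical fractional tolerances
    offset, err = worst_case_error(tol, 3600.0)   # one hour after synching
    print(f"tolerance {tol:g}: {offset * 1e3:.3f} ms offset, {err:.3f} m error")
```

The range error grows linearly with time since the last synch, so the tolerable drift sets how often the nodes must resynchronize.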

Jerry
-- 
Engineering is the art of making what you want from things you can get.

Reply by naebad ●March 19, 2007

On Mar 20, 4:24 am, "robert bristow-johnson"
<r...@audioimagination.com> wrote:
> On Mar 19, 8:54 am, "F.B. Uijtdewilligen"
>
> <f.b.uijtdewilli...@student.utwente.nl> wrote:
>
> > The issue I'm currently dealing with is which way to compare the signals?
> > I've googled around, found some about FFTs, but is this suitable in a node
> > with somewhat limited memory and/or CPU-capabilities? Other ideas were
> > using some sort of peak-detection or dividing the audio in parts of say
> > 20ms, calculating averages for those parts.. Any ideas are welcome!
>
> look into cross-correlation.
>
> suppose y(t) has some delayed component of x(t) in it in addition to
> some other signal, v(t), completely unrelated to x(t):
>
>    y(t) = A*x(t-T) + v(t)
>
> then the crosscorrelation:
>
>    R_xy(tau) = integral{ x(t)*y(t+tau)*w(t) dt}
>
> will be maximum when tau = T.   w(t) is just a nice window function to
> reduce edge effects and make your integral finite in domain.
>
> r b-j

Cross correlation does not work well when estimating delays (unless
the signals are white). You need generalized cross correlation.

N.

Reply by robert bristow-johnson ●March 19, 2007

On Mar 19, 4:32 pm, "naebad" <minnae...@yahoo.co.uk> wrote:
>
> Cross correlation does not work well when estimating delays (unless
> the signals are white). You need generalized cross correlation.

the signals need not be white, but what they need to be is broadbanded
and *not* periodic, and then cross correlation works okay.  given
those conditions, the tau that maximizes R_xy(tau) in

   R_xy(tau) = integral{ x(t)*y(t+tau)*w(t) dt}

will be the same tau that you get from minimizing this difference
function:

   min    integral{ (x(t) - B*y(t+tau))^2 * w(t) dt}
   B,tau

the value of B will be about 1/A and tau will be about T if y(t) is
expressed as

   y(t) = A*x(t-T) + v(t)

and v(t) is completely uncorrelated to x(t).  if x(t) is periodic or
quasi-periodic, then the problem is that there are several values of T
(and therefore several values of tau) where the above is equally true
so your measured delay will be ambiguous.  but it doesn't have to be
white, just nonperiodic and reasonably broadbanded.
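
The ambiguity for periodic signals can be seen in a small sketch. It compares the correlation at the true delay T with the correlation one period later, at T + P, for a pure tone and for white noise. All lengths, the delay, and the period are hypothetical values chosen for the demonstration.

```python
import math
import random

def xcorr(x, y, lag, m):
    """Sum of x[n] * y[n + lag] over a fixed window of m samples."""
    return sum(x[n] * y[n + lag] for n in range(m))

def delayed(x, t):
    """y[n] = x[n - t], zero before the signal starts."""
    return [x[n - t] if n >= t else 0.0 for n in range(len(x))]

M, T, P = 1000, 20, 50          # window length, true delay, tone period (samples)
N = M + 200                     # extra samples so every lag stays in range

tone = [math.sin(2 * math.pi * n / P) for n in range(N)]
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(N)]

ratios = {}
for name, x in (("tone", tone), ("noise", noise)):
    y = delayed(x, T)
    r_true = xcorr(x, y, T, M)          # correlation at the true delay
    r_alias = xcorr(x, y, T + P, M)     # correlation one full period later
    ratios[name] = r_alias / r_true
    print(name, round(ratios[name], 3))
```

For the tone the two peaks are equally high (ratio 1), so the delay cannot be told apart from T + P; for the noise the second value is a small fraction of the true peak.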

dunno what "generalized" cross correlation is.

r b-j

Reply by Andreas Huennebeck ●March 20, 2007

robert bristow-johnson wrote:

> On Mar 19, 4:32 pm, "naebad" <minnae...@yahoo.co.uk> wrote:
>>
>> Cross correlation does not work well when estimating delays (unless
>> the signals are white). You need generalized cross correlation.
> 
> the signals need not be white, but what they need to be is broadbanded
> and *not* periodic, and then cross correlation works okay. 

Since audio is often periodic I would reduce the AC signal to a series of
RMS or peak values with much reduced time resolution (depending
on the required precision of the location). This also reduces the workload
of the CPU doing the cross correlation dramatically.
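
A minimal sketch of this envelope approach, assuming an 8 kHz sample rate, 20 ms RMS blocks, and a test tone whose loudness changes at random from block to block. These values, and the helper names rms_envelope and best_lag, are illustrative assumptions:

```python
import math
import random

FS = 8000          # hypothetical sample rate in Hz
BLOCK = 160        # 20 ms blocks at 8 kHz
DELAY = 800        # hypothetical true delay in samples (100 ms, 5 blocks)

def rms_envelope(x, block):
    """One RMS value per non-overlapping block of `block` samples."""
    return [math.sqrt(sum(s * s for s in x[i:i + block]) / block)
            for i in range(0, len(x) - block + 1, block)]

def best_lag(a, b, max_lag):
    """Lag (in blocks) maximizing the mean-removed correlation of a[n], b[n + lag]."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    m = len(a) - max_lag           # same number of terms for every lag
    def score(lag):
        return sum((a[n] - mean_a) * (b[n + lag] - mean_b) for n in range(m))
    return max(range(max_lag + 1), key=score)

random.seed(2)
# A 440 Hz tone whose loudness changes at random every 20 ms: periodic at the
# sample level, but with a non-periodic envelope.
gains = [random.uniform(0.1, 1.0) for _ in range(100)]
x = [gains[n // BLOCK] * math.sin(2 * math.pi * 440 * n / FS)
     for n in range(100 * BLOCK)]
y = [x[n - DELAY] if n >= DELAY else 0.0 for n in range(len(x))]

env_x, env_y = rms_envelope(x, BLOCK), rms_envelope(y, BLOCK)
lag_blocks = best_lag(env_x, env_y, max_lag=20)
delay_ms = 1000.0 * lag_blocks * BLOCK / FS
print("estimated delay:", delay_ms, "ms")
```

The correlation now runs over 100 envelope values instead of 16000 samples, but the delay is only resolved to one block: 20 ms, or nearly 7 m at about 343 m/s. The means are removed first because RMS values are never negative.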

bye
Andreas
-- 
Andreas Hünnebeck | email: acmh@gmx.de
----- private --- | www  : http://www.huennebeck-online.de
Fax/answering machine: 0721/151-284301
GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc
PGP-Key: http://www.huennebeck-online.de/public_keys/pgp_andreas.asc

Reply by ●March 20, 2007

On Mar 20, 9:52 am, "robert bristow-johnson"
<r...@audioimagination.com> wrote:
> On Mar 19, 4:32 pm, "naebad" <minnae...@yahoo.co.uk> wrote:
>
>
>
> > Cross correlation does not work well when estimating delays (unless
> > the signals are white). You need generalized cross correlation.
>
> the signals need not be white, but what they need to be is broadbanded
> and *not* periodic, and then cross correlation works okay.
>
>    ...
>
> dunno what "generalized" cross correlation is.
>
> r b-j

No, they need to be white for a good estimate of delay. There is not
space to discuss it all here. You need to look at Knapp and Carter's
paper.

C.H. Knapp, G.C. Carter, The generalized correlation method for estimation
of time delay, IEEE Trans. ASSP. 24 (4) (1976) 320-326. ...
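
For readers who have not met the term, here is a hedged sketch of one common form of generalized cross correlation, the PHAT (phase transform) weighting: the cross-spectrum is normalized to unit magnitude before transforming back, which whitens the signals. This is an illustration under assumed test conditions (a low-pass filtered noise source, a small amount of added noise, made-up lengths and delay), not a reproduction of the paper's method.

```python
import cmath
import random

def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. Inverse is unscaled."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def gcc_phat_delay(x, y, max_lag):
    """Estimate how many samples y lags x, using PHAT weighting."""
    n = len(x)
    X, Y = fft(x), fft(y)
    G = []
    for a, b in zip(X, Y):
        c = a.conjugate() * b                     # cross-spectrum conj(X) * Y
        G.append(c / abs(c) if abs(c) > 1e-12 else 0j)   # keep the phase only
    r = [v.real / n for v in fft(G, inverse=True)]       # weighted correlation
    return max(range(max_lag + 1), key=lambda lag: r[lag])

random.seed(3)
N, L, T = 256, 512, 12      # signal length, FFT length, true delay (hypothetical)
x = [0.0] * L               # zero-padded so the circular shift never wraps
prev = 0.0
for n in range(N):
    prev = 0.9 * prev + random.gauss(0.0, 1.0)    # one-pole low-pass: not white
    x[n] = prev
y = [(0.5 * x[n - T] if n >= T else 0.0) + random.gauss(0.0, 0.05)
     for n in range(L)]

estimated = gcc_phat_delay(x, y, max_lag=64)
print("estimated delay in samples:", estimated)
```

The weighting discards magnitude information, so frequency bins that contain mostly noise count as much as strong ones; how well it works depends on the signal-to-noise ratio across the band.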

F.

Reply by Jerry Avins ●March 20, 2007

minfitlike@yahoo.co.uk wrote:

   ...

> No, they need to be white for a good estimate of delay. There is not
> space to discuss it all here. You need to look at Knapp and Carter's
> paper.
> 
> Knapp, G.C. Carter, The generalized correlation method for estimation
> of time delay, IEEE Trans. ASSP. 24 (4) (1976) 320-326. ...


You must be looking at one specific approach and assuming that the 
conditions to make it work apply generally. Suppose the signal consisted 
of a clean narrow spike. That would make timing the differential delay 
easy, and there's nothing noisy about it, let alone white.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
