DSPRelated.com Forums

Sound Localization using Sound Amplitude

Started by jigajigajoo July 8, 2011
I'm trying to figure out a way, given an array of 4-6 omnidirectional
microphones in ideal conditions, to use the amplitude of a signal to
localize the source of that sound.

I'm targeting this application to a microcontroller (an Arduino), which has
a max sample rate of maybe 2 kHz (per mic), so trying to do TDOA analysis
on the signal is hard. Not only that, but any computationally difficult DSP
is definitely out of the picture.

Assuming for a second that TDOA is out of the picture, could I use the
difference in amplitude from a single sound source (say, a 220 Hz sine wave
played from a speaker) to find an approximate direction to the source?
Let's say that accuracy can be very approximate, and that we just want to
find the angle of the speaker +/- 5 degrees from the center of the mic
array.

I've set up some MEMS mics, wired them up, and I'm getting analog reads
from an ADC for all six mics, so I'm working with the raw sample data. The
data is good; I'm just having trouble with the analysis.

The microphone array is currently in a line, each mic ~64mm from the next
one.

Also, why has no one ever tried using amplitude analysis instead of
TDOA/FDOA for approximate sound localisation? It sounds like an obvious way
to do things...

Thanks!
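For what it's worth, here is a minimal sketch of what amplitude-only direction finding could look like, assuming an idealized free-field 1/r pressure falloff, perfectly calibrated mics, and no reverberation. All names and numbers in it are made up for illustration; it is not anyone's tested method.

```python
# Sketch of amplitude-ratio bearing estimation for a 6-mic, 64 mm line array,
# under the (strong) assumption of free-field 1/r falloff and no reverb.
import math

MIC_X = [i * 0.064 for i in range(6)]           # line array on the x-axis, metres

def amplitudes(src_x, src_y, a0=1.0):
    """Free-field 1/r amplitudes at each mic for a source at (src_x, src_y)."""
    return [a0 / math.hypot(src_x - mx, src_y) for mx in MIC_X]

def estimate_angle(meas, radius=2.0):
    """Grid-search the bearing whose predicted amplitude *ratios* best match."""
    cx = sum(MIC_X) / len(MIC_X)                # array center
    best_err, best_deg = float("inf"), None
    for deg in range(0, 181):
        sx = cx + radius * math.cos(math.radians(deg))
        sy = radius * math.sin(math.radians(deg))
        pred = amplitudes(sx, sy)
        # compare shapes only: normalize both by their first element,
        # so the unknown source level a0 cancels out
        err = sum((m / meas[0] - p / pred[0]) ** 2
                  for m, p in zip(meas, pred))
        if err < best_err:
            best_err, best_deg = err, deg
    return best_deg

# noise-free synthetic source at 60 degrees, 2 m from the array center
meas = amplitudes(sum(MIC_X) / 6 + 2.0 * math.cos(math.radians(60)),
                  2.0 * math.sin(math.radians(60)))
print(estimate_angle(meas))   # 60
```

Even noise-free, the amplitude ratios across a 320 mm aperture change very slowly with bearing for a source a couple of metres away, which hints at why noise and room reflections swamp this approach in practice and why TDOA is usually preferred.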


On 7/8/2011 5:10 AM, jigajigajoo wrote:
> I'm trying to figure out a way, given an array of 4-6 omnidirectional
> microphones in ideal conditions, to use the amplitude of a signal to
> localize the location of that sound. [...]
>
> Also, why has no-one ever tried using amplitude analysis versus using
> tdoa/fdoa for approximate sound localisation? It sounds like an obvious way
> to do things...
er ... how do you know no one has done it? Clearly fdoa doesn't apply here.
How is "using amplitude analysis" different than simply a *particular*
method for tdoa? Isn't tdoa the most direct underlying physical approach to
meeting this objective, independent of the algorithm used?

If I got the numbers right: 64 mm at 343,200 mm/sec (speed of sound in air)
is 64/343,200 = 186 microseconds. Times 5 to get from one end of the array
to the other (the very best case) is around 1 millisecond. Your sample
interval is 0.5 milliseconds and you need to resolve much better than 1
millisecond.

Your apparent worst case is 5 degrees off broadside. The array spans
5 x 64 mm = 320 mm from one end to the other. The difference in travel
distance between the two ends at an angle of incidence of 5 degrees off
normal/broadside is 28 mm, which equates to 28/343,200 = 80 microseconds,
which is one period at 12 kHz - which implies sampling at something like
25 kHz *at least*!

Fred
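Fred's arithmetic above can be spot-checked in a few lines (a quick sketch using the same speed of sound and array geometry; the 2 kHz rate is the OP's figure):

```python
# Spot-check of the numbers above: adjacent-mic delay, end-to-end delay,
# and the worst-case (5 degrees off broadside) path difference.
import math

c = 343_200.0      # speed of sound in air, mm/s
spacing = 64.0     # mic spacing, mm
n_gaps = 5         # 6 mics in a line -> 5 gaps
fs = 2_000.0       # OP's per-mic sample rate, Hz

t_adjacent = spacing / c                    # ~186 microseconds
array_len = n_gaps * spacing                # 320 mm end to end
t_end_to_end = array_len / c                # best case (endfire), ~0.93 ms

d_5deg = array_len * math.sin(math.radians(5.0))   # ~28 mm path difference
t_5deg = d_5deg / c                                # ~81 microseconds

print(f"adjacent-mic delay : {t_adjacent * 1e6:.0f} us")
print(f"end-to-end delay   : {t_end_to_end * 1e3:.2f} ms")
print(f"5-deg path diff    : {d_5deg:.1f} mm -> {t_5deg * 1e6:.0f} us")
print(f"sample interval    : {1e3 / fs:.1f} ms")
```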
On Jul 9, 12:10 am, "jigajigajoo" <jigajigajoo@n_o_s_p_a_m.gmail.com>
wrote:
> Assuming for a second TDOA is out of the picture, could I use the
> difference in amplitude from a single sound source (say a 220-Hz sine wave
> outputted out of a speaker) to find an (approximate) direction of the
> sound source. [...]
What about reverberation? Rooms can have non-minimum-phase
characteristics, so that a reflection can have higher amplitude than the
direct sound!

Hardy
On Jul 8, 8:10 am, "jigajigajoo" <jigajigajoo@n_o_s_p_a_m.gmail.com>
wrote:
> I'm trying to figure out a way, given an array of 4-6 omnidirectional
> microphones in ideal conditions, to use the amplitude of a signal to
> localize the location of that sound. [...]
if you include delay differences along with amplitude differences, you can
arrange 4 microphones at the vertices of a tetrahedron and process the 4
signals in such a way as to give you a direction vector in 3 dimensions. 3
microphones in a triangle can give you 2-dimensional localization, and 2
microphones can get you 1-dimensional localization.

sorta like a Condorcet election (something i've been thinking about since
2009, because of the Instant Runoff Voting battle we had here in
Burlington), you would have 6 different pairs of signals from these 4
signals. then you would cross-correlate each pair and search for the lag
with maximum amplitude in each of the 6 cross-correlations, which would
give you the path length difference for each pair. from those 6 path
length differences, you can come up with a least-squares estimate of
azimuth and zenith angles relative to the given geometry of the 4
microphones.

r b-j
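The pairwise step r b-j describes can be sketched as follows (the noise source and the 3-sample delay are made up for illustration; at the OP's 2 kHz rate one lag step is a coarse 0.5 ms):

```python
# Cross-correlate two mic signals and take the lag of the correlation peak
# as the arrival-time difference for that pair.
import numpy as np

rng = np.random.default_rng(0)
n, true_delay = 1024, 3               # delay in samples: mic_b hears it later
src = rng.standard_normal(n)

mic_a = src
mic_b = np.concatenate([np.zeros(true_delay), src[:-true_delay]])

# full cross-correlation; the lag axis runs from -(n-1) to (n-1)
xcorr = np.correlate(mic_b, mic_a, mode="full")
lags = np.arange(-n + 1, n)
est_delay = lags[np.argmax(xcorr)]
print(est_delay)   # 3
```

Repeating this for all 6 pairs of a 4-mic tetrahedron yields the 6 path-length differences for the least-squares direction fit described above.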
On 7/8/2011 7:58 PM, robert bristow-johnson wrote:
> if you include delay differences, along with amplitude differences, you
> can arrange 4 microphones at the vertices of a tetrahedron and you could
> process the 4 signals in such a way to give you a direction vector in 3
> dimensions. [...] from those 6 path length differences, you can come up
> with a least-squares estimate of azimuth and zenith angles relative to the
> given geometry of the 4 microphones.
r b-j,

Not with his sample rate you won't!

Fred
On Jul 9, 12:02 am, Fred Marshall <fmarshallxremove_th...@acm.org>
wrote:
> r b-j,
>
> Not with his sample rate you won't!
oh, 2 kHz. that's not even good enough for amateur radio SSB.

dunno if you can do anything with that sample rate.

r b-j

"robert bristow-johnson"  wrote in message 
news:5e44484e-d6f0-45d7-84ad-408ee1926a53@u28g2000yqf.googlegroups.com...

On Jul 9, 12:02 am, Fred Marshall <fmarshallxremove_th...@acm.org>
wrote:
> Not with his sample rate you won't!
> oh, 2 kHz. that's not even good enough for amateur radio SSB.
>
> dunno if you can do anything with that sample rate.
>
> r b-j

If you have enough processing power (the OP was talking about a fairly
small processor, IIRC), you can do an FFT on the signals from each
microphone, and if you sample the signals simultaneously (or with a known
offset between channels) you can get phase resolution between the channels
much better than the sample rate. Of course, this assumes that the signal
is relatively stationary for at least a short while.

In my former company, we were able to track vehicles to an RMS bearing
accuracy of about 3 degrees using an array of 5 microphones in a box about
half the size of a shoebox.

Best wishes,
--Phil Martel
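One way to sketch the idea Phil describes, assuming the OP's 220 Hz tone and simultaneous sampling (the 0.1-sample delay is a made-up illustration value): with a stationary tone, the phase of the cross-spectrum at the tone's bin resolves delays far finer than one 0.5 ms sample period.

```python
# Sub-sample delay estimate from cross-spectrum phase at the tone bin.
import numpy as np

fs = 2_000.0                       # OP's sample rate, Hz
f0 = 220.0                         # OP's test tone, Hz
n = 1000                           # 0.5 s of data -> 220 Hz lands exactly on bin 110
t = np.arange(n) / fs

true_delay = 0.1 / fs              # a tenth of a sample, i.e. 50 microseconds
mic_a = np.sin(2 * np.pi * f0 * t)
mic_b = np.sin(2 * np.pi * f0 * (t - true_delay))   # same tone, delayed

win = np.hanning(n)
A = np.fft.rfft(mic_a * win)
B = np.fft.rfft(mic_b * win)
k = int(round(f0 * n / fs))        # bin index of the tone (110)

# a delay of tau gives B = A * exp(-i*2*pi*f0*tau) at the tone bin,
# so the angle of A * conj(B) is +2*pi*f0*tau
dphi = np.angle(A[k] * np.conj(B[k]))
est_delay = dphi / (2 * np.pi * f0)
print(f"estimated {est_delay * 1e6:.1f} us vs true {true_delay * 1e6:.1f} us")
```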
On 7/10/2011 12:19 PM, Phil Martel wrote:
> If you have enough processing power [...] you can do an FFT on the signals
> from each microphone, and if you sample the signals simultaneously (or
> with a known offset between channels) you can get phase resolution between
> the channels much better than the sample rate. [...]
>
> In my former company, we were able to track vehicles to an RMS bearing
> accuracy of about 3 degrees using an array of 5 microphones in a box about
> half the size of a shoebox.
Phil,

Are you suggesting that there was something off in the numbers I wrote out
earlier? If so, I'd be interested in knowing. I wasn't so much thinking
about phase per se as about time differences. How does one measure a time
difference without samples to resolve it? I guess you could assume very
high SNR and a single/known frequency or waveform, and then infer the
temporal difference from amplitude differences, but I didn't consider
that. It still requires 1 sample per period per sensor, and the sensors
not separated by more than 1 wavelength.

Fred
On Jul 10, 3:42 pm, Fred Marshall <fmarshallxremove_th...@acm.org>
wrote:
> Are you suggesting that there was something off in the numbers I wrote
> out earlier? If so, I'd be interested in knowing. [...] How does one
> measure a time difference without samples to resolve? [...]
Hello Fred,

You can take 2 different sets of data for the same signal, where the only
essential difference is a time delay and a frequency offset, and then do a
frequency-domain approach to the TDOA. For example, find the DFT of each
signal. Then conjugate one of them and form a sample-by-sample inner
product between the two. Then find the argument of each resulting sample.
When plotted (argument vs. frequency bin), you will get a straight line
whose slope is the time offset and whose intercept is the frequency
offset. Notice how the TDOA and the frequency offset get separated out in
the analysis - I found this quite useful in RF applications since I then
didn't have to worry about carrier and Doppler frequency offsets. You can
even weight the samples by their spectral power before doing the
regression fit.

This is basically doing TDOA cross-correlation via FFT, except for the
conversion back to the time domain and the peak search.

Clay
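Clay's recipe can be sketched like so (the 7-sample circular delay and the noise source are illustrative only; with no frequency offset the fitted line passes near zero and its slope alone gives the delay):

```python
# Frequency-domain TDOA: the phase of X * conj(Y) vs bin index is a straight
# line whose slope encodes the delay.
import numpy as np

rng = np.random.default_rng(1)
n, true_delay = 4096, 7                  # delay in samples
src = rng.standard_normal(n)

x = src
y = np.roll(src, true_delay)             # circular delay keeps the DFT model exact

X = np.fft.rfft(x)
Y = np.fft.rfft(y)
cross = X * np.conj(Y)                   # bin-by-bin inner product

k = np.arange(len(cross))
phase = np.unwrap(np.angle(cross))       # straight line: 2*pi*k*delay/n
w = np.abs(cross)                        # weight bins by spectral power
slope, intercept = np.polyfit(k, phase, 1, w=w)
est_delay = slope * n / (2 * np.pi)      # slope -> time offset, in samples
print(round(est_delay), round(float(intercept), 3))
```

With a real frequency offset between channels (Clay's RF case), the fitted intercept would pick it up while the slope still isolates the delay.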
On 7/11/2011 7:15 AM, Clay wrote:

> When plotted (argument vs frequency bin) you will get a straight line
> whose slope is the time offset and intercept is the frequency offset.
> [...]
Ah! OK. Thanks Clay.

Fred