DSPRelated.com
Forums

Estimating time offset between two audio signals.

Started by Mauritz Jameson April 18, 2013
I'm looking for some recommendations on real-time algorithms which are
able to estimate the time offset (delay) between two signals. One
signal is the source signal (speaker signal). The other signal is a
filtered version of the speaker signal (echo, microphone signal).
Delay might be as large as 500ms.

Running NLMS on downsampled signals to estimate the time offset works
great if the delay is "stationary", but in this case the time offset
might change suddenly over time. For example, it might be 250ms for
some time and then suddenly (within 10-20 milliseconds) "jump" to
300ms and shortly after jump back to 250ms. This is, of course, a
symptom of another problem (with the audio stream), but right now I
have to come up with a way of dealing with this problem.
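
For reference, the idea is roughly this (a simplified sketch, not my
actual code; the function name, step size and filter length are just
illustrative):

import numpy as np

def nlms_delay_estimate(x, d, num_taps, mu=0.5, eps=1e-6):
    # x: reference (speaker) signal, d: microphone signal, same length.
    # num_taps must cover the largest expected delay (in samples).
    w = np.zeros(num_taps)              # adaptive filter coefficients
    x_buf = np.zeros(num_taps)          # most recent reference samples
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]                 # x_buf[k] == x[n - k]
        y = np.dot(w, x_buf)            # filter output (echo estimate)
        e = d[n] - y                    # error signal
        norm = np.dot(x_buf, x_buf) + eps
        w += (mu / norm) * e * x_buf    # NLMS coefficient update
    return int(np.argmax(np.abs(w)))    # dominant tap index ~ delay in samples

With signals downsampled to 8kHz, num_taps = 4000 covers a 500ms delay,
and the returned index divided by the sample rate gives the delay in
seconds. The "jump" problem above shows up as the old dominant tap
decaying only slowly while a new tap builds up at the new delay.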

Another issue is the convergence time. If the delay is 300ms and the
delay changes, it takes 300ms before the NLMS algorithm adapts to the
new delay. But if the delay changes again during those 300ms, the
algorithm has a hard time tracking it; it doesn't "see" the delay
change. How do you deal with that?

I'm not sure if there are any fast and robust real-time algorithms for
this type of problem, but if there are, I'm sure folks here on comp.dsp
will enlighten me.


On Apr 19, 2:46 pm, Mauritz Jameson <mjames2...@gmail.com> wrote:
> I'm looking for some recommendations on real-time algorithms which are
> able to estimate the time offset (delay) between two signals. [...]
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1164314&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1164314
@HardySpicer

Will that work for situations where the delay "jumps"?

Are you measuring the delay of the signal as it traverses acoustic space, i.e., the time it takes for the signal to get from the speaker to the microphone? Or is it some kind of loop latency measurement from the microphone to the speaker and back to the microphone? I recall you posting about this before; it's great that you were able to get the LMS implementation working, if only in some capacity. I still can't get my head around what exactly you are trying to measure.
@dszabo :

It's not that complicated :)

I have an audio stream coming in from the network (RTP). I'm pushing
the digital audio to an audio driver so that the audio is played out
through some loudspeakers. I'm also pulling audio from my microphone
via the audio driver interface. At some point, an "echo" of the audio
which was played out through the loudspeakers will show up in the
digital microphone data.

Let's say that the speaker audio at time t = 0 contains the utterance
'A' and at time t = 256ms, I see the echo of that utterance in the
microphone signal. Then the delay is 256ms.

So I'm looking for a real-time algorithm which:

- can estimate that delay
- can adapt quickly to sudden changes (±50ms) in the delay
- is robust in the sense that it also works if the audio signals are a
bit noisy (light office noise)
- is suitable for audio block processing (10ms blocks)
Right on. So you are measuring the time it takes to output the data, play it through the loudspeaker, let the sound travel from the loudspeaker to the microphone, and read the data back in. I think I get it now. Is the microphone signal being fed back to the loudspeaker, which would create multiple echoes?

I believe I had suggested this previously, but you might look at some papers on this site: http://miracle.otago.ac.nz/tartini/papers.html Tartini is used for pitch detection and is based on a combination of autocorrelation and difference algorithms. It was designed to run in real time to provide feedback to musicians. I bet you could easily adapt it, or some of the concepts described in these papers, to what you are trying to do. While some of it should be fairly obvious, it does talk a bit about optimising the algorithms for performance, which would at least be worth a read.

Something to think about: implement some kind of peak detection. Then set up a detection algorithm to find transient moments in the audio. When a transient is detected, perform a correlation of the input data with the output over a window containing the transient, say 100ms, and over a period you believe to contain the echo, say 250ms to 750ms. Rather than a brute-force correlation, you could use an FFT-based algorithm, or the SNAC algorithm described in the papers mentioned earlier. To further optimize things, you can narrow the period once you find a lock, so that successive measurements take less time. This will make a loss of detection respond faster, and the period can then be opened up again to relock.
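
Something like the following sketch (untested; the energy-ratio
detector, window length and search range below are just the
illustrative numbers from above, and it uses a plain FFT correlation
rather than SNAC):

import numpy as np

def detect_transient(x, fs, block_ms=10, ratio=4.0):
    # Return the first sample index where a block's energy jumps by
    # 'ratio' over the previous block, or None if no transient is found.
    blk = int(fs * block_ms / 1000)
    prev = None
    for start in range(0, len(x) - blk, blk):
        e = np.sum(x[start:start + blk] ** 2)
        if prev is not None and prev > 0 and e / prev >= ratio:
            return start
        prev = e
    return None

def delay_by_xcorr(spk, mic, fs, t0, win_ms=100, min_lag_ms=250, max_lag_ms=750):
    # Correlate a window of the speaker signal starting at sample t0
    # against the microphone signal over lags in [min_lag, max_lag];
    # return the best lag in milliseconds.  (Assumes the mic buffer
    # extends at least max_lag + win samples past t0.)
    win = int(fs * win_ms / 1000)
    min_lag = int(fs * min_lag_ms / 1000)
    max_lag = int(fs * max_lag_ms / 1000)
    ref = spk[t0:t0 + win]
    seg = mic[t0 + min_lag:t0 + max_lag + win]
    n = len(seg) + len(ref) - 1
    # FFT-based cross-correlation, same result as np.correlate(seg, ref, 'valid')
    corr = np.fft.irfft(np.fft.rfft(seg, n) * np.conj(np.fft.rfft(ref, n)), n)
    corr = corr[:len(seg) - len(ref) + 1]
    best = int(np.argmax(corr))          # offset within the search range
    return 1000.0 * (min_lag + best) / fs

Usage would be along the lines of: t0 = detect_transient(spk, fs); if t0
is not None, delay_ms = delay_by_xcorr(spk, mic, fs, t0). Once you have
a stable estimate you can narrow min_lag_ms/max_lag_ms around it so each
re-estimate is cheaper, and open them up again if the lock is lost.
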
I should probably point out that your capacity to calculate a delay is
dependent on the presence of transient sound.  For example, if you have a
sine wave going through, the best you can do is measure the phase
difference between the input and output, but there would be an ambiguity in
the number of whole cycles that have passed.  This example can be
extrapolated to any periodic signal.

Suppose your delay is 200ms, and you have a signal that repeats every
150ms.  You would start the signal every n*150ms, and receive it every 200
+ m*150ms.  At 300 ms, you will have just sent out a signal, and at 350ms
you will receive it, which would imply a 50ms delay.

What all this means is that trying to calculate a delay of >100ms during a
tonal aspect of a sound is a fool's errand, because the sound is likely (for
some sounds) to have a period of less than 100 ms.  Your best bet is to
wait for a transient that you can look for.
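
To make the ambiguity concrete, here is a tiny numerical sketch using
the numbers from the example above (illustrative only):

import numpy as np

fs = 8000                                      # assumed sample rate, Hz
period = int(0.150 * fs)                       # 150ms period, as above
true_delay = int(0.200 * fs)                   # 200ms true delay

t = np.arange(fs) / fs                         # 1 second of reference signal
ref = np.sin(2 * np.pi * t / 0.150)            # periodic reference, 150ms period
mic = np.concatenate([np.zeros(true_delay), ref])   # echo: ref delayed by 200ms

a, win = 4000, period                          # mic segment well after the onset
seg = mic[a:a + win]
def score(d):                                  # how well ref shifted by d explains seg
    return float(np.dot(seg, ref[a - d:a - d + win]))

print(score(true_delay))                       # strong match at the true 200ms
print(score(true_delay - period))              # equally strong match at 50ms

Both printed scores come out essentially identical, so nothing in the
data lets a correlator prefer 200ms over 50ms until something
non-periodic, like a transient, comes along.
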
On 4/19/2013 11:13 AM, Mauritz Jameson wrote:

> I have an audio stream coming in from the network (RTP). [...] At some
> point, an "echo" of the audio which was played out through the
> loudspeakers will show up in the digital microphone data.
Now you can see why it is difficult to do EC at the far end. No wonder all systems do EC at the near end or over synchronous transport. Why can't you do it that way?
> So I'm looking for a real-time algorithm which: [...]
It depends. Before jumping into the hell of difficulties, fix your system first.

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com
On 4/19/2013 3:32 PM, dszabo wrote:
> I should probably point out that your capacity to calculate a delay is
> dependent on the presence of transient sound. [...]
Q: Why is it impossible to have sex in Red Square in Moscow?
A: Because every bystander idiot would be trying to give his invaluable advice.

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com
I love this guy! Can we hang out some time? Grab a drink and talk about the finer points of Kalman filters?