Reply by Mauritz Jameson April 22, 2013
Vlad,

The problem is the audio driver. It doesn't always accept speaker data. So if I have speaker data that I want to send to the audio driver, I can either wait until the audio driver accepts the data or throw it away. On top of that, the audio driver doesn't always return mic data. So what do I do then? Wait until it becomes available, or substitute the missing data with silence packets?
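One common way to handle both cases is a small bounded FIFO on the speaker side (so you neither block nor throw data away immediately) and silence substitution on the mic side. A rough Python sketch; driver_write() and driver_read() are hypothetical stand-ins for whatever the real audio API exposes, and the frame size / sample rate are assumptions:

# Rough sketch only. driver_write()/driver_read() are hypothetical stand-ins
# for the real audio API; frame size and sample rate are assumptions.
from collections import deque

import numpy as np

FRAME = 160                      # 10 ms frames at 16 kHz (assumed)
spk_fifo = deque(maxlen=50)      # bounded backlog; oldest frames get dropped

def push_speaker(frame, driver_write):
    """Queue a speaker frame, then flush as much as the driver will take."""
    spk_fifo.append(frame)
    while spk_fifo and driver_write(spk_fifo[0]):   # assume False means "busy"
        spk_fifo.popleft()

def pull_mic(driver_read):
    """Return a mic frame, substituting silence when the driver has none."""
    frame = driver_read(FRAME)                      # assume None means "no data"
    if frame is None:
        frame = np.zeros(FRAME, dtype=np.int16)     # silence packet
    return frame

The bounded FIFO turns "wait or throw away" into "buffer a little, then drop the oldest", which at least keeps the alignment error between what was sent and what was actually played within a known bound.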

Reply by Vladimir Vassilevsky April 22, 2013
On 4/22/2013 8:24 AM, Mauritz Jameson wrote:
> Vlad,
>
> You wrote:
>
> "Estimate the rate of upcoming/outgoing data. Resample the data so
> everything would work as if it is on the same sample clock."
>
> I think that's something you would do if you have sample rate drift?
> Meaning: You get more or less far-end data per second than near-end
> data, right? This is not the problem in this case. The problem is the
> audio subsystem. The delay on the transmission path between the
> digital speaker buffer (which stores incoming audio from RTP) and the
> digital microphone buffer (which stores audio delivered to the
> application by the audio driver) varies too much (sudden jumps by
> more than 30ms). By transmission path I mean:
>
> digital spk buffer -> audio driver (spk) -> acoustic path -> audio
> driver (mic) -> digital mic buffer
>
> The delay on the acoustic path is naturally constant.
If there is no sample clock slip, then the issue could be fixed by sufficient buffering. What is the problem?

VLV
Reply by Al Clark April 22, 2013
robert bristow-johnson <rbj@audioimagination.com> wrote in
news:a5e2fb34-f356-46cb-835c-a6703bfe61f5@a14g2000vbm.googlegroups.com:

> On Apr 19, 6:12 pm, "dszabo" <62466@dsprelated> wrote:
>> >On 4/19/2013 3:32 PM, dszabo wrote:
>> >> I should probably point out that your capacity to
>> >> calculate a delay is dependent on the presence of
>>
>> >Q: Why it is impossible to have sex in Red Square in
>> >Moscow ? A: Because every bystander idiot would be trying
>> >to give his invaluable advice.
>>
>> >Vladimir Vassilevsky
>> >DSP and Mixed Signal Designs
>> >www.abvolt.com
>>
>> I love this guy! Can we hang out some time? Grab a drink
>> and talk about the finer points of Kalman filters?
>
> might have to wait until the next comp.dsp conference. i
> missed the first two, but will endeavour to make it to the
> next one, whenever it is.
>
> r b-j
Vlad hosted one and I hosted the other. I guess you're up, Robert. Have you set a date?

Al
Reply by Mauritz Jameson April 22, 2013
Vlad,

I guess you are suggesting that I measure how many speaker samples I send to the audio driver versus how many microphone samples I receive from the audio driver per time unit? If I receive 'M' samples per second and I send 'N' samples per second, then I resample the speaker buffer from 'N' Hz to Fs_common and the mic buffer from 'M' Hz to Fs_common. Data from the resampled buffers is used as input to the AEC. The output of the AEC is resampled back from Fs_common to 'M' Hz.

Am I understanding you correctly?
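If that reading is right, a minimal sketch of the pipeline would look like the following (Python, assuming scipy is available; aec_process() and to_rate() are hypothetical placeholders, and block-wise processing is an assumption):

# Minimal sketch of the resample-to-a-common-rate idea described above.
# aec_process() is hypothetical; n_hz/m_hz are the measured spk/mic rates.
from fractions import Fraction

from scipy.signal import resample_poly

def to_rate(x, rate_in, rate_out):
    """Rational resampling from rate_in to rate_out."""
    r = Fraction(int(round(rate_out)), int(round(rate_in))).limit_denominator(1000)
    return resample_poly(x, r.numerator, r.denominator)

def aec_block(spk_block, mic_block, n_hz, m_hz, fs_common, aec_process):
    spk_c = to_rate(spk_block, n_hz, fs_common)   # speaker: N Hz -> Fs_common
    mic_c = to_rate(mic_block, m_hz, fs_common)   # mic:     M Hz -> Fs_common
    out_c = aec_process(spk_c, mic_c)             # run the AEC at Fs_common
    return to_rate(out_c, fs_common, m_hz)        # send path back to M Hz

In practice the measured N and M drift over time, so the ratio has to be re-estimated continuously; a fixed block-wise resample_poly is only the crudest approximation of a tracking resampler.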
Reply by Mauritz Jameson April 22, 2013
Vlad,

You wrote:

"Estimate the rate of upcomming/outgoing data. Resample the data so 
everything would work as if it is on the same sample clock."

I think that's something you would do if you have sample rate drift? Meaning: You get more or less far-end data per second than near-end data, right? This is not the problem in this case. The problem is the audio subsystem. The delay on the transmission path between the digital speaker buffer (which stores incoming audio from RTP) and the digital microphone buffer (which stores audio delivered to the application by the audio driver) varies too much (sudden jumps by more than 30ms). By transmission path I mean:

digital spk buffer -> audio driver (spk)  -> acoustic path -> audio driver (mic) -> digital mic buffer

The delay on the acoustic path is naturally constant.
Reply by Mauritz Jameson April 22, 2013
I assume you're using the same type of algorithm for the delay estimation?

If so, try this experiment where there's no near-end speech and let me know how that works out:

1) Generate a digital speaker signal which lasts 60 seconds
2) Let the delay toggle between 250ms and 300ms every 7 seconds. So like this:

time = 0s to 7s : Delay = 250ms
time = 7s to 14s : Delay = 300ms
time = 14s to 21s : Delay = 250ms
time = 21s to 28s : Delay = 300ms

..etc..etc

3) Generate a digital microphone signal which meets the requirements in [2]. So like this:

time = 0s to 7s : Time offset between mic and spk signal is 250ms
time = 7s to 14s : Time offset between mic and spk signal is 300ms
time = 14s to 21s : Time offset between mic and spk signal is 250ms
time = 21s to 28s : Time offset between mic and spk signal is 300ms

4) Let your delay estimator process the mic and spk signal. 

Tell me if your delay estimator was able to accurately track the delay. I'm not talking about TDOA. The delay on the acoustic path is constant (why wouldn't it be?) since the mic and spk stay in fixed positions. The delay jumps are happening because the audio subsystem (audio driver, etc.) is not working optimally.
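For reference, the test pair from steps 1-3 could be generated along these lines (Python sketch; white noise standing in for the speaker signal and the 16 kHz rate are assumptions, not part of the recipe):

# Builds a 60 s speaker signal and a mic signal whose offset toggles
# between 250 ms and 300 ms every 7 seconds, per the steps above.
import numpy as np

fs = 16000                                   # assumed sample rate
n = 60 * fs
spk = 0.1 * np.random.randn(n)               # step 1: broadband speaker signal

mic = np.zeros(n)
block = 7 * fs                               # 7-second segments
for i, start in enumerate(range(0, n, block)):
    d = int((0.250 if i % 2 == 0 else 0.300) * fs)   # steps 2/3: 250/300 ms
    idx = np.arange(start, min(start + block, n))
    src = idx - d
    ok = src >= 0
    mic[idx[ok]] = spk[src[ok]]              # mic = spk delayed by d samples

# step 4: run the delay estimator on (spk, mic) and check it tracks the jumps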



Reply by April 21, 2013
On Friday, April 19, 2013 11:04:05 PM UTC+12, Mauritz Jameson wrote:
> @HardySpicer
>
> Will that work for situations where the delay "jumps"?
Well, it worked for me. It depends what you mean by jumping, but it does track a varying delay no problem, i.e. as I walk about a room it will get the TDOA between two mics. The biggest problem is reverberation.
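A frame-by-frame cross-correlator with PHAT weighting is one standard way to get that kind of tracking TDOA estimate; the sketch below is generic GCC-PHAT, not necessarily the algorithm being used here, and the function name and max_delay_s limit are my choices:

# Generic GCC-PHAT delay estimate for one frame: returns the delay of `sig`
# relative to `ref` in seconds. Run it on short frames to track a varying delay.
import numpy as np

def gcc_phat(ref, sig, fs, max_delay_s=0.5):
    n = 1 << int(np.ceil(np.log2(len(ref) + len(sig))))   # zero-padded FFT size
    R = np.conj(np.fft.rfft(ref, n)) * np.fft.rfft(sig, n)
    R /= np.abs(R) + 1e-12                                 # PHAT: keep phase only
    cc = np.fft.irfft(R, n)
    max_lag = int(max_delay_s * fs)
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1])) # lags -max..+max
    return (np.argmax(np.abs(cc)) - max_lag) / fs

Reverberation smears the correlation peak across many lags, which is presumably why it is called out above as the biggest problem.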
Reply by Vladimir Vassilevsky April 20, 2013
On 4/19/2013 8:21 PM, Mauritz Jameson wrote:
> Vlad,
>
> Can you elaborate on your comment:
>
> "Now you can see why it is difficult to do EC at far end. No wonder all
> systems work EC at near end or over synchronous transport"
>
> Your far-end is my near-end and vice versa. So I'm not sure I understand
> what you mean by "difficult to do EC at far end"?
Near end is whatever is local to speaker and mike. Far end is on the other side of the communication link (wrt this speaker and mike).
> And what do you mean by "synchronous transport" ?
All parts of the system sitting on the same clock. No cycle slips.
> The AEC processes the microphone signal which during far-end talk is
> only composed of background noise and a capture of the acoustic echo
> from the loudspeaker. From the near-end speaker's point of view, the AEC
> is done on the near-end. From the far-end speaker's point of view, that AEC
> is done at the far-end.
You can close AEC loop at near end, which is typical. Or you can try to close AEC loop at the far end; that is more difficult.
> Which intelligent algorithm would you suggest for synchronization?
Estimate the rate of upcoming/outgoing data. Resample the data so everything would work as if it is on the same sample clock.

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com
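One reading of that suggestion, sketched in Python: keep a smoothed estimate of how many samples per second each stream actually delivers, then feed the measured rates into a rational resampler (e.g. scipy's resample_poly). The class name, the smoothing constant, and the use of a wall-clock timer are assumptions, not anything Vlad specified:

# Sketch of the rate-estimation half of the suggestion above.
import time

class RateEstimator:
    """Tracks how many samples per second a stream actually delivers."""
    def __init__(self, alpha=0.05):
        self.alpha = alpha           # smoothing factor for the running estimate
        self.rate = None
        self.t_last = None

    def update(self, n_samples):
        now = time.monotonic()
        if self.t_last is not None and now > self.t_last:
            inst = n_samples / (now - self.t_last)    # instantaneous rate
            self.rate = inst if self.rate is None else (
                (1 - self.alpha) * self.rate + self.alpha * inst)
        self.t_last = now
        return self.rate             # None until enough calls have been made

# Usage: call mic_rate.update(len(block)) on every mic block and
# spk_rate.update(len(block)) on every speaker block, then resample both
# streams to a common nominal rate using the measured values.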
Reply by Les Cargill April 20, 2013
Mauritz Jameson wrote:
> Vlad,
>
> Can you elaborate on your comment:
>
> "Now you can see why it is difficult to do EC at far end. No wonder all
> systems work EC at near end or over synchronous transport"
>
> Your far-end is my near-end and vice versa. So I'm not sure I understand
> what you mean by "difficult to do EC at far end"?
Far end is Tx; near end is Rx from the frame of reference of the listener.
> And what do you mean by "synchronous transport"?
Synchronous means not-asynchronous - any buffering is purely deterministic and ideally constant delay. A T1 line is synchronous - the clock runs from end to end. The McDysan-Spohn book is worth ten times what it costs (and weighs) if you have an interest in that sort of thing.
> The AEC processes the microphone signal which during far-end talk is
> only composed of background noise and a capture of the acoustic echo
> from the loudspeaker. From the near-end speaker's point of view, the AEC
> is done on the near-end. From the far-end speaker's point of view, that AEC
> is done at the far-end.
>
> Which intelligent algorithm would you suggest for synchronization?
RTP (and by extension VoIP) uses jitter buffers. They are quite strange.

<snip>

--
Les Cargill
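For readers who haven't met one: at its core a jitter buffer just trades a bit of extra latency for reordering and smoothing. A deliberately tiny fixed-latency sketch (the class name and depth are illustrative; real RTP jitter buffers are adaptive and also handle loss, sequence wrap-around and clock skew):

# Toy fixed-latency jitter buffer: hold `depth` packets, hand them out in
# sequence-number order. Real implementations are adaptive and far messier.
import heapq

class JitterBuffer:
    def __init__(self, depth=5):
        self.depth = depth           # packets of latency used to absorb jitter
        self.heap = []               # (sequence_number, payload) pairs

    def put(self, seq, payload):
        heapq.heappush(self.heap, (seq, payload))

    def get(self):
        """Pop the oldest packet once primed; None means 'play silence'."""
        if len(self.heap) < self.depth:
            return None
        return heapq.heappop(self.heap)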
Reply by robert bristow-johnson April 20, 2013
On Apr 19, 6:12 pm, "dszabo" <62466@dsprelated> wrote:
> >On 4/19/2013 3:32 PM, dszabo wrote:
> >> I should probably point out that your capacity to calculate a delay is
> >> dependent on the presence of transient sound. For example, if you have a
> >> sine wave going through, the best you can do is measure the phase
> >> difference between the input and output, but there would be an ambiguity
> >> in the number of whole cycles that have passed. This example can be
> >> extrapolated to any periodic signal.
> >>
> >> Suppose your delay is 200ms, and you have a signal that repeats every
> >> 150ms. You would start the signal every n*150ms, and receive it every
> >> 200 + m*150ms. At 300 ms, you will have just sent out a signal, and at
> >> 350ms you will receive it, which would imply a 50ms delay.
> >>
> >> What all this means is that trying to calculate a delay of >100ms during
> >> a tonal aspect of a sound is a fool's errand because the sound is likely
> >> (for some sounds) to have a period of less than 100 ms. Your best bet is
> >> to wait for a transient that you can look for.
> >
> >Q: Why it is impossible to have sex in Red Square in Moscow ?
> >A: Because every bystander idiot would be trying to give his invaluable advice.
> >
> >Vladimir Vassilevsky
> >DSP and Mixed Signal Designs
> >www.abvolt.com
>
> I love this guy! Can we hang out some time? Grab a drink and talk about
> the finer points of Kalman filters?
might have to wait until the next comp.dsp conference. i missed the first two, but will endeavour to make it to the next one, whenever it is.

r b-j
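dszabo's periodicity point above is easy to demonstrate numerically: with a purely tonal signal, cross-correlation only pins the delay down modulo the period, so a 200 ms delay of a 150 ms-period tone shows up as 50 ms. A small sketch (the sample rate and the choice of a plain sine are mine):

# Demonstrates the delay ambiguity for periodic signals described above.
import numpy as np

fs = 1000                                   # assumed sample rate (Hz)
t = np.arange(2 * fs) / fs                  # 2 seconds of signal
period_s, true_delay_s = 0.150, 0.200
f0 = 1.0 / period_s
x = np.sin(2 * np.pi * f0 * t)              # "speaker" tone, 150 ms period
y = np.sin(2 * np.pi * f0 * (t - true_delay_s))   # same tone delayed 200 ms

cc = np.correlate(y, x, mode="full")
lag_s = (np.argmax(cc) - (len(x) - 1)) / fs
print(lag_s)   # ~0.05 s: 200 ms mod 150 ms, not the true 200 ms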