
Echo Cancellation on PC platform

Started by qfu72 August 5, 2005
We want to implement an acoustic echo canceller (AEC) on a Windows PC platform
for chat tools like MSN/Skype.

A serious problem we found on this kind of platform is that the time delay
(sound card playback + speaker + air + microphone + sound card recording)
between the reference signal and the echo signal is not consistent from
session to session.

1) How can we estimate the time delay?
2) Why is the AEC so sensitive to the time delay setting?




		
This message was sent using the Comp.DSP web interface on
www.DSPRelated.com
"qfu72" <qiangfu72@gmail.com> wrote in message 
news:wbydnZ2dnZ1aPCjVnZ2dnXPWbt-dnZ2dRVn-y52dnZ0@giganews.com...
> we want to implement an acoustic echo canceller on a windows PC platform
> for chatting tools like MSN/skype.
>
> A serious problem we found in this kind of platform is the time delay
> (sound card play+speaker+air+mic.+ sound card record) between Ref. signal
> and Echo signal is not consistent in every session 1) how could we estimate
> the time delay? 2) why the AEC is so sensitive to time delay setting?

If you add a loopback connection (cable) between the sound card output and
input, does that delay change session-to-session? I would expect that this
could vary dramatically depending on audio hardware, drivers, buffering
settings, OS versions, etc., but would think it would be relatively constant
on the same PC. Of course, the acoustic delay ("air") is going to vary with
the speaker/microphone distance, based on the speed of sound.

As to your second question, an adaptive FIR filter with a fixed filter length
needs all of the "echo" to "fit" within the filter's length. If there is a
long constant delay that is not accounted for, the echo tail may not fit
within the filter's taps.

Perhaps you need to measure/calibrate the delay to get a rough idea at the
beginning of a call/session, then make sure the adaptive filter starts working
at that point (i.e. delay the reference signal by about the same amount). Then
you can let the adaptive algorithm work from there. A good AEC algorithm
should be able to easily cope with some variability as long as it is
well-centered on the echo to cancel.

FYI, another issue you may face is that sample synchronization may not be
perfect between the ADC and DAC on cheap sound cards. Do your initial
development with the best quality sound hardware you can find.
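
The point about the echo having to fit inside the filter can be illustrated with a small simulation. This is only a sketch in Python with NumPy, not code from anyone in this thread: the function names, the 4-tap toy echo path, the 300-sample delay, the 64-tap filter and the NLMS step size are all assumptions picked for illustration.

```python
import numpy as np

def nlms_echo_cancel(far, mic, taps=64, bulk_delay=0, mu=0.5, eps=1e-6):
    """Cancel the echo of `far` in `mic` with an NLMS adaptive FIR filter.

    `bulk_delay` delays the reference so that the filter taps cover the echo.
    Returns the residual (error) signal.
    """
    # Delay the reference by prepending zeros (stands in for a calibrated delay).
    ref = np.concatenate([np.zeros(bulk_delay), far])[:len(far)]
    w = np.zeros(taps)      # adaptive FIR coefficients
    buf = np.zeros(taps)    # most recent reference samples, newest first
    err = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        err[n] = mic[n] - w @ buf                      # mic minus echo estimate
        w += mu * err[n] * buf / (buf @ buf + eps)     # normalized LMS update
    return err

def erle_db(mic, err, tail=2000):
    """Echo return loss enhancement over the last `tail` samples, in dB."""
    return 10 * np.log10(np.mean(mic[-tail:] ** 2) / (np.mean(err[-tail:] ** 2) + 1e-20))

# Toy scenario: white-noise far end, short echo path, 300 samples of system delay.
rng = np.random.default_rng(0)
far = rng.standard_normal(8000)
room = np.array([0.6, 0.3, -0.2, 0.1])
echo = np.convolve(far, room)[:len(far)]
mic = np.concatenate([np.zeros(300), echo])[:len(far)]

erle_uncompensated = erle_db(mic, nlms_echo_cancel(far, mic, bulk_delay=0))
erle_compensated = erle_db(mic, nlms_echo_cancel(far, mic, bulk_delay=290))
```

With no bulk delay the echo sits at taps 300 to 303, outside the 64-tap window, so the filter has nothing it can model. Delaying the reference by 290 samples puts the same echo at taps 10 to 13 and the filter can converge.
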
You will have two problems:

1) The delay is not constant; it changes from time to time. When you reboot
   the PC there will be a different delay.
2) Some sound cards can only play audio at 48 kHz. When such a sound card has
   to play at 8000 Hz and record at 8000 Hz, its DSP uses a
   downsampling/upsampling algorithm. Therefore you will see that you
   actually play the audio at 8000 Hz and record it at 8005 Hz or 8010 Hz
   or... who knows. So you will have to detect that and convert your near_end
   or far_end samples accordingly.
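
To get a feel for why a few Hz of mismatch matters, here is the arithmetic as a small Python sketch. The helper names and the 512-tap filter length are assumptions for illustration; the 8000/8005 Hz figures are the ones from item 2.

```python
def drift_samples(play_hz, record_hz, seconds):
    """Relative drift, in samples, accumulated after `seconds` of audio."""
    return (record_hz - play_hz) * seconds

def seconds_until_out_of_filter(play_hz, record_hz, taps):
    """Time until the drift alone is as long as the adaptive filter."""
    return taps / abs(record_hz - play_hz)

# Playing at 8000 Hz while recording at 8005 Hz:
drift_per_minute = drift_samples(8000, 8005, 60)
time_to_escape = seconds_until_out_of_filter(8000, 8005, 512)
```

In this example the echo moves by 5 samples every second, so an uncorrected mismatch walks the echo across a 512-tap filter in under two minutes.
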

You should try to do all that online using long buffers of your near_end and
far_end.
It is not simple to make an echo canceller for a PC, but it is possible once
you figure out all the problems.

Yehuda 


Were you able to do that correctly (or do you just know about the problems)?

I heard there are some patents on how to do that correctly.


Yehuda Mittelman wrote:
> You will have two problems:
>
> 1) The delay is not constant and it changes sometimes. When you reboot the
>    PC there will be a different delay.
> 2) Some sound cards can only play the audio at 48 kHz. When such a sound
>    card has to play at 8000 Hz and record at 8000 Hz, its DSP uses a
>    downsampling/upsampling algorithm. Therefore you will see that you
>    actually play the audio at 8000 Hz and record it at 8005 Hz or 8010 Hz
>    or... who knows. So you will have to detect that and convert your
>    near_end or far_end samples accordingly.
>
> You should try to do all that online using long buffers of your near_end
> and far_end.
> It is not simple to make an echo canceller for a PC, but it is possible
> once you figure out all the problems.
>
> Yehuda

The problem may be even more complicated when you use arbitrary combinations
of audio devices, for example a camcorder or other device connected to the PC
via USB or FireWire for recording, and a speaker connected via the sound card...

qfu72 wrote:
> we want to implement an acoustic echo canceller on a windows PC platform
> for chatting tools like MSN/skype.
>
> A serious problem we found in this kind of platform is the time delay
> (sound card play+speaker+air+mic.+ sound card record) between Ref. signal
> and Echo signal is not consistent in every session 1) how could we estimate
> the time delay? 2) why the AEC is so sensitive to time delay setting?

Yehuda Mittelman wrote:
> You will have two problems:
>
> 1) The delay is not constant and it changes sometimes. When you reboot the
>    PC there will be a different delay.

That's the simple one!

> 2) Some sound cards can only play the audio at 48 kHz. When such a sound
>    card has to play at 8000 Hz and record at 8000 Hz, its DSP uses a
>    downsampling/upsampling algorithm. Therefore you will see that you
>    actually play the audio at 8000 Hz and record it at 8005 Hz or 8010 Hz
>    or... who knows. So you will have to detect that and convert your
>    near_end or far_end samples accordingly.

Can't it be 8017.35125874 Hz? Then your sample rate conversion is unlikely to
be 100% accurate, so there will probably be drift between your signals, and
the AEC filter will drift with it. Even if you track the drift, performance
still degrades because of the (perhaps small, but still present) mismatch in
the sampling rates. A good AEC needs a more professional solution to this
problem!

> You should try to do all that online using long buffers of your near_end
> and far_end.
> It is not simple to make an echo canceller for a PC, but it is possible
> once you figure out all the problems.

We already figured out the problems... It may be possible once you figure
out the SOLUTIONS to the problems.

Hello!

> As to your second question, an adaptive FIR filter with a fixed filter length
> needs all of the "echo" to "fit" within the filter's length. If there is a

Sorry if it sounds lame, but what are adaptive FIR filters? Where can I learn
more about them?

thanks and regards
--Himanshu

A little more patience and persistence works. I found this:
http://cnx.rice.edu/content/m10481/latest/
which at least clears up what an adaptive filter is. Any more references
would still be helpful.

thanks and regards
--Himanshu

I used a calibration procedure since the project was not final, but it is
possible to do it online without calibration.

Here is what I did.

For problem 1:

When the program started, I played a recorded beep sound from a wave file and
recorded the near_end and far_end signals. Then I calculated the long
cross-correlation between the recordings to estimate the delay (I did that in
the frequency domain to be more efficient).
If you want to do it online, you will have to calculate the long
cross-correlation of the near_end and far_end all the time.
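
A frequency-domain cross-correlation of the kind described above could look roughly like the following NumPy sketch. The function name `estimate_delay` and the test signal are assumptions; a noise burst is used here in place of the beep because a pure tone gives ambiguous correlation peaks.

```python
import numpy as np

def estimate_delay(far, near):
    """Return the lag (in samples) at which `near` best matches `far`.

    Uses FFT-based cross-correlation. A positive result means `near`
    lags behind `far`.
    """
    n = len(far) + len(near) - 1
    nfft = 1 << (n - 1).bit_length()      # zero-pad so the correlation is not circular
    F = np.fft.rfft(far, nfft)
    N = np.fft.rfft(near, nfft)
    xcorr = np.fft.irfft(N * np.conj(F), nfft)
    lag = int(np.argmax(np.abs(xcorr)))
    # Indices 0..len(near)-1 hold non-negative lags; the tail holds negative lags.
    if lag >= len(near):
        lag -= nfft
    return lag

# Simulated calibration: the recording is an attenuated, noisy copy delayed by 437 samples.
rng = np.random.default_rng(1)
far_end = rng.standard_normal(4000)
near_end = 0.5 * np.concatenate([np.zeros(437), far_end])[:4000]
near_end = near_end + 0.05 * rng.standard_normal(4000)
delay = estimate_delay(far_end, near_end)
```

For online use the same function could be run on a sliding window of the most recent near_end and far_end samples.
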
Another solution is to use multiple LMS filters, each one with a different
delay. This will be more efficient if you work with subbands (using a DFT
filterbank), since you will be able to run these multiple LMS filters in only
one subband.
You can also combine these methods to get a fine delay estimate.
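
The multiple-filter idea can be sketched as a bank of short NLMS filters, each fed a differently delayed reference, where the delay with the smallest residual wins. This sketch runs full-band rather than in one subband of a DFT filterbank, and the names, the 32-tap length and the candidate spacing are assumptions for illustration.

```python
import numpy as np

def nlms_residual_power(ref, mic, taps=32, mu=0.5, eps=1e-6):
    """Run NLMS and return residual energy over the second half of the signal."""
    w = np.zeros(taps)
    buf = np.zeros(taps)
    power = 0.0
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        e = mic[n] - w @ buf
        w += mu * e * buf / (buf @ buf + eps)
        if n >= len(mic) // 2:          # skip the initial convergence period
            power += e * e
    return power

def pick_delay(far, mic, candidate_delays, taps=32):
    """Try each bulk delay and return the one with the least residual echo."""
    scores = {}
    for d in candidate_delays:
        ref = np.concatenate([np.zeros(d), far])[:len(far)]
        scores[d] = nlms_residual_power(ref, mic, taps)
    return min(scores, key=scores.get), scores

# Toy scenario: the echo starts 210 samples after the far-end signal.
rng = np.random.default_rng(3)
far_end = rng.standard_normal(4000)
echo = np.convolve(far_end, [0.6, 0.3, -0.2, 0.1])[:4000]
mic = np.concatenate([np.zeros(210), echo])[:4000]
best_delay, scores = pick_delay(far_end, mic, range(0, 257, 32))
```

Spacing the candidates by the filter length means each filter covers its own window of delays, so exactly one of them can see the whole echo.
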


For problem 2:

I used a calibration step when the program was installed. I played a WAV file
of music and recorded the near_end and far_end. Then I searched for the right
sample rate conversion with a grid search.
The sample rate conversion worked like this:
suppose your far_end played the sound at 8001 Hz and your near_end recorded
it at 8000 Hz. This means that you drop one sample from your far_end signal
about every 8000 samples. To be more accurate, you can drop half a sample
every 4000 samples, or a quarter of a sample every 2000 samples, and so on.
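
Taken to its limit, dropping ever smaller fractions of a sample amounts to resampling on a slightly different time grid. The sketch below does that with linear interpolation, which is a simple stand-in and not necessarily the method used above; the function name `convert_rate` and the 440 Hz test tone are assumptions.

```python
import numpy as np

def convert_rate(x, rate_in, rate_out):
    """Resample `x` from rate_in to rate_out using linear interpolation."""
    n_out = int(len(x) * rate_out / rate_in)
    t_out = np.arange(n_out) / rate_out      # output sample times in seconds
    t_in = np.arange(len(x)) / rate_in       # input sample times in seconds
    return np.interp(t_out, t_in, x)

# One second of a 440 Hz tone that was really played at 8001 Hz.
t_play = np.arange(8001) / 8001.0
far_end = np.sin(2 * np.pi * 440.0 * t_play)
corrected = convert_rate(far_end, 8001, 8000)   # one sample shorter, on the 8000 Hz grid
```

Linear interpolation attenuates high frequencies somewhat, so a production resampler would more likely use a polyphase or windowed-sinc filter.
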

After guessing the sample rate conversion, I calculated the coherence (MATLAB
function: COHERE) of the near_end and the corrected far_end.
I started by guessing that the far_end sample rate is 8010 Hz, converted the
sample rate, and calculated the coherence. I did the same procedure for
8020 Hz, etc.

So I got a table:

frequency (Hz) | coherence
8010           | 0.5
8020           | 1
8030           | 0.5
...

In this case the right sample rate is 8020 Hz.

The correct sample rate was the one that gave the maximal coherence. This
sample rate does not change over time, so you only have to do it once.

You can search with a coarse resolution of 50 Hz and, when you find the right
frequency conversion, switch to a fine resolution of 1 Hz.
There might still be a drift, but it will be very slow, and since you
calibrate the delay online, that will compensate for the drift.
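
The coarse-then-fine grid search could be sketched as follows. This is an illustration under assumptions, not the original calibration code: the coherence is a hand-rolled Welch-style estimate in NumPy standing in for MATLAB's COHERE, the resampler is plain linear interpolation, white noise stands in for the music file, and the "true" 8020 Hz rate, the 10 Hz coarse step and all function names are made up for the example.

```python
import numpy as np

def convert_rate(x, rate_in, rate_out):
    """Linear-interpolation resampler (a simple stand-in for a real converter)."""
    n_out = int(len(x) * rate_out / rate_in)
    return np.interp(np.arange(n_out) / rate_out, np.arange(len(x)) / rate_in, x)

def mean_coherence(x, y, nseg=256):
    """Mean magnitude-squared coherence (Hann windows, 50% overlap)."""
    n = min(len(x), len(y))
    win = np.hanning(nseg)
    pxx, pyy, pxy = 0.0, 0.0, 0.0j
    for start in range(0, n - nseg + 1, nseg // 2):
        X = np.fft.rfft(win * x[start:start + nseg])
        Y = np.fft.rfft(win * y[start:start + nseg])
        pxx = pxx + np.abs(X) ** 2
        pyy = pyy + np.abs(Y) ** 2
        pxy = pxy + X * np.conj(Y)
    return float(np.mean(np.abs(pxy) ** 2 / (pxx * pyy + 1e-20)))

def find_play_rate(far, near, candidates, record_hz=8000):
    """Return the candidate playback rate that maximizes coherence, plus all scores."""
    scores = {hz: mean_coherence(convert_rate(far, hz, record_hz), near)
              for hz in candidates}
    return max(scores, key=scores.get), scores

# Simulated calibration: the card really plays at 8020 Hz and records at 8000 Hz.
rng = np.random.default_rng(2)
far_end = rng.standard_normal(32000)
clean = convert_rate(far_end, 8020, 8000)
near_end = clean + 0.05 * rng.standard_normal(len(clean))

coarse, coarse_scores = find_play_rate(far_end, near_end, range(8000, 8050, 10))
fine, fine_scores = find_play_rate(far_end, near_end, range(coarse - 5, coarse + 6))
```

A wrong candidate leaves a delay that grows over the recording, which scrambles the cross-spectrum phase from segment to segment and pulls the averaged coherence down; the longer the calibration recording, the sharper the peak.
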

I hope this helps

Yehuda

"Yehuda Mittelman" <liayehud@netvision.net.il> wrote in message 
news:dd05t8$cml$1@news2.netvision.net.il...
> You will have two problems:
>
> 2) Some sound cards can only play the audio at 48 kHz. When such a sound
>    card has to play at 8000 Hz and record at 8000 Hz, its DSP uses a
>    downsampling/upsampling algorithm. Therefore you will see that you
>    actually play the audio at 8000 Hz and record it at 8005 Hz or 8010 Hz
>    or... who knows. So you will have to detect that and convert your
>    near_end or far_end samples accordingly.

I've heard this before, but never really understood why this is the case. Can you explain? Is the problem that the OS isn't locked to the sample rate on the audio card?