Reply by NS May 2, 2005
Your best bet for a great Acoustic Echo Canceler is probably at:
http://www.Compandent.com/products_echokiller.htm
http://www.Compandent.com/EchoKillerFactSheet.pdf


johan kleuskens wrote:
> [original post snipped; quoted in full in Steve Underwood's reply
> further down the thread]
Reply by QN April 13, 2005
I work indirectly in this area so my understanding is far from
complete, but I'm wondering if what you're seeing is in fact 'normal'.

Sure, adaptation should be frozen during double talk; that's to prevent
the EC from converging on the wrong signal.  The idea is that you stop
adaptation in the double-talk case not because it makes things better,
but so it doesn't make things worse.

From what I understand, that's not what you have (based on your 'no near
end speech' comment), so you shouldn't be freezing the taps.

When you do your test in the real room, you assume that the echo paths
will remain constant throughout the test. That would be the only way the
EC could maintain its performance after you freeze the taps. I doubt that
this is the case.

Go back and check the weights on the tests that you did: are they
identical for every run? Make another test; if you think your echo
path(s) are constant, then your EC should adapt to the exact same thing
every time, no matter where in the test signal you start....

Again, I don't do this stuff for a living, I just know people who do, so
take my ramblings for what they are worth.... ;)
Reply by johan kleuskens April 12, 2005
Hi Jerry,

Thank you for your input.

You're correct about the difference in sample frequency of the microphone
and speaker when you look at it in an analogue way. But if you put the
played and the recorded wave files together in one stereo file and compare
them on a sample-by-sample basis, you see a slight drift of the phase
difference between the two when the playback and recording sample
frequencies are different. This occurs, for example, when you play a file
on soundcard A and record it on soundcard B. If you play and record on the
same soundcard, the phase difference is constant.

With kind regards,

Johan Kleuskens

"Jerry Avins" <jya@ieee.org> wrote in message 
news:GdadnfDmEY0J2cbfRVn-vw@rcn.net...
> johan kleuskens wrote:
>> Hi Steve,
>>
>> [description of the 1000 Hz phase test snipped; quoted in full below]
>
> What the microphone picks up will bear a constant phase relationship to
> what the speaker puts out even if the sample rates are very different,
> so long as the Nyquist criterion is met for both devices. I hope sample
> rate isn't the identity you thought to prove.
>
>> Your idea of sample jitter is interesting. I will give that a thought,
>> but I have no idea how to solve this problem if jitter is the cause of
>> all this. The recording device is an ordinary soundcard, and it is not
>> possible to adjust jitter behaviour on such a device.
>
> It seems unlikely to me that sample jitter could be the cause of
> progressive deterioration.
>
> Jerry
> --
> Engineering is the art of making what you want from things you can get.
Reply by Chris Hornbeck April 12, 2005
On Mon, 11 Apr 2005 18:58:25 +0200, "jkle" <johanenbernie@hotmail.com>
wrote:

> Could the non-linearity of the speaker and/or microphone cause this?
> They are common PC accessories.

Bingo?

Chris Hornbeck
6x9=42
Reply by Jerry Avins April 12, 2005
johan kleuskens wrote:
> Hi Steve,
>
> The room is a closed office with no curtains or moving things in it, and
> "as soon as" means immediately. For a long time we thought it was caused
> by a difference in sample frequency between the speaker and microphone
> parts of the PC soundcard we use. However, we did a test and concluded
> that the microphone and speaker sample frequencies are synchronous. We
> tested as follows: we played a 1000 Hz sound file through our speakers,
> and recorded it at the same time via our microphone. The phase of the
> sine in the recorded file was compared with the sine in the 1000 Hz
> speaker file. The phase difference should be constant when speaker and
> microphone are synchronous, and non-constant if they are not. The phase
> difference was constant, and therefore the speaker and microphone are
> synchronous.
What the microphone picks up will bear a constant phase relationship to what the speaker puts out even if the sample rates are very different, so long as the Nyquist criterion is met for both devices. I hope sample rate isn't the identity you thought to prove.
> Your idea of sample jitter is interesting. I will give that a thought,
> but I have no idea how to solve this problem if jitter is the cause of
> all this. The recording device is an ordinary soundcard, and it is not
> possible to adjust jitter behaviour on such a device.
It seems unlikely to me that sample jitter could be the cause of
progressive deterioration.

Jerry

--
Engineering is the art of making what you want from things you can get.
Reply by Jon Harris April 11, 2005
You might try switching to a high-end sound-card/recording system to see if that
makes a difference.  Maybe one with digital I/O on the card and external
ADC/DACs.  Also experiment with better speakers/microphones to see if that helps
at all.
-- 
Jon Harris
SPAM blocked e-mail address in use.  Replace the ANIMAL with 7 to reply.

"johan kleuskens" <johanenbernie@hotmail.com> wrote in message
news:425ad215$0$141$e4fe514c@news.xs4all.nl...
> [johan's message and the quoted Steve Underwood reply snipped; both
> appear in full in the next posts down the thread]
Reply by johan kleuskens April 11, 2005
Hi Steve,

The room is a closed office with no curtains or moving things in it, and
"as soon as" means immediately. For a long time we thought it was caused
by a difference in sample frequency between the speaker and microphone
parts of the PC soundcard we use. However, we did a test and concluded
that the microphone and speaker sample frequencies are synchronous. We
tested as follows: we played a 1000 Hz sound file through our speakers,
and recorded it at the same time via our microphone. The phase of the
sine in the recorded file was compared with the sine in the 1000 Hz
speaker file. The phase difference should be constant when speaker and
microphone are synchronous, and non-constant if they are not. The phase
difference was constant, and therefore the speaker and microphone are
synchronous.

Your idea of sample jitter is interesting. I will give that a thought,
but I have no idea how to solve this problem if jitter is the cause of
all this. The recording device is an ordinary soundcard, and it is not
possible to adjust jitter behaviour on such a device.

With kind regards,

Johan Kleuskens
The Netherlands

"Steve Underwood" <steveu@dis.org> schreef in bericht 
news:d3ebdv$r20$1@nnews.pacific.net.hk...
> Hi Johan,
>
> What does "as soon as" really mean? The very second, or just fairly
> quickly after? Is it a step change, or a progressive degradation? When
> you switch off the adaptation, what physical action do you perform? Are
> you just flicking a finger to press a key, or shuffling a bunch of
> people around the room? Is the window open? Do you have curtains
> flapping in the breeze?
>
> If the scale of the degradation relates in some way to the scale of
> movement within the room, that is expected. If it is a step change,
> unprovoked by physical movement, you probably have a system problem.
> Software bugs are, of course, a possibility. So is some kind of sampling
> jitter. If you have quick adaptation, it is surprising how well things
> will work with unstable signal timing. Without the adaptation they
> suddenly fall apart. When the adaptation is at work, is it settling to a
> fairly steady state, or in a constant state of flux? If you are using a
> sufficiently whitened training signal it should settle to a pretty
> stable state, until someone moves.
>
> For reference, with good quality converters, echo cancellation should do
> better than 40 dB. You can get around 30 dB even down a phone line with
> A-law/u-law distortion.
>
> Regards,
> Steve
>
>> [original post and Matlab code snipped; quoted in full in Steve
>> Underwood's reply below]
Reply by Steve Underwood April 11, 2005
Hi Johan,

What does "as soon as" really mean? The very second, or just fairly 
quickly after? It is a step change, or a progressive degradation? When 
you switch off the adaption, what phyisical action do you perform? Are 
you just flicking a finger to press a key, or shuffling a bunch of 
people around the room? Is the window open? Do you have curtains 
flapping in the breeze?

If the scale of the degradation relates in some way to the scale of
movement within the room, that is expected. If it is a step change,
unprovoked by physical movement, you probably have a system problem.
Software bugs are, of course, a possibility. So is some kind of sampling
jitter. If you have quick adaptation, it is surprising how well things
will work with unstable signal timing. Without the adaptation they
suddenly fall apart. When the adaptation is at work, is it settling to a
fairly steady state, or in a constant state of flux? If you are using a
sufficiently whitened training signal it should settle to a pretty
stable state, until someone moves.

For reference, with good quality converters, echo cancellation should do
better than 40 dB. You can get around 30 dB even down a phone line with
A-law/u-law distortion.

Regards,
Steve


johan kleuskens wrote:
> Hi,
>
> We are currently working on an acoustic echo canceller based on the
> well-known NLMS principle. This echo canceller works fine as long as we
> feed it an echo signal that is generated by an audio processing program.
> When using this ideal echo signal, freezing the FIR coefficients works
> like it should: the echo is still cancelled because the FIR taps contain
> a representation of the impulse response of the (virtual) room.
>
> Things are different in real life: when working with a real room, the
> echo is cancelled as long as the taps are not frozen. Echo attenuation
> (ERLE?) is as much as 40 dB. However, as soon as the taps are frozen,
> the echo attenuation is reduced to 10-15 dB, even with no near-end
> speech!
>
> This raises some questions:
>
> - Maybe our code is wrong. We've tested the algorithm in C++ and Matlab,
> and both behave the same. Below, the Matlab code is included as a
> reference, so if anyone sees a bug, please let me know. In this piece of
> code you can see that we stop adapting the weights halfway through the
> microphone and speaker files.
>
> - If this bad behaviour is due to the non-linear impulse response of the
> room, and is therefore inherent to AEC, why is everybody talking about
> freezing the taps when double talk is active?
>
> With kind regards,
>
> Johan Kleusens
>
> % read the speaker file
> [x,fs] = wavread('c:\testspeaker.wav');      % read speaker file
>
> % read the microphone file
> d = wavread('c:\testmic.wav');               % read microphone file
>
> L = 1500;                                    % number of taps
>
> wn = zeros(L,1);                             % weight values
> xn = zeros(L,1);                             % storage for input data
> n = length(x);                               % number of samples in the wave file
> wavout = zeros(n,2);                         % storage for wave output
>
> % read the sound data one sample at a time, and process each sample
> for i = 1:n
>     xn(2:L) = xn(1:L-1);                     % shift data
>     xn(1) = x(i);                            % get one new sample
>     yn = wn' * xn;                           % estimated echo signal
>     en = d(i) - yn;                          % error signal
>     wavout(i,1) = en;                        % store error in output array
>     p = xn' * xn;                            % power of the input
>     if (i/n) < 0.5
>         wn = wn + 0.5/(p + 0.001) * xn .* en;   % NLMS weight update
>         wavout(i,2) = 1;
>     else
>         wavout(i,2) = 0;
>     end
> end
> wavwrite(wavout, fs, 'c:\fdaf.wav');         % write result to output file
Reply by jkle April 11, 2005
The result changes "immediately" (within a few hundred samples). The room
is a test room; nobody is present in it. I think there is no change of
delay in the code, as you can see in the Matlab code I sent in the
original posting. Could the non-linearity of the speaker and/or
microphone cause this? They are common PC accessories.

"Tim Wescott" <tim@wescottnospamdesign.com> schreef in bericht 
news:115l3rgsp3lbv49@corp.supernews.com...
> johan kleuskens wrote:
>> [original post and code snipped; quoted in full in Steve Underwood's
>> reply above]
>
> As Mr. Mughal suggested, your room characteristics may be constantly
> changing. You can find this out by monitoring how the echo parameters
> change with time -- if they don't settle for the real room, then that's
> probably it.
>
> You should also verify that the delay in your code doesn't change when
> you freeze the acquisition. Sudden changes of delay in the code may not
> be modeled with your "virtual" room, yet would certainly come into play
> with the real room.
>
> Does the effect actually change _immediately_ on freezing the
> acquisition, or does it take a little bit of time? If it's absolutely
> immediate, that would point toward your delay changing (depending on
> your definition of "immediate", and how quickly you acquire echo
> information).
>
> --
>
> Tim Wescott
> Wescott Design Services
> http://www.wescottdesign.com
Reply by robert bristow-johnson April 11, 2005
in article d3e1b4$7gu$1@newsg2.svr.pol.co.uk, Bobby Mughal at
bmughal@dspcreations.com wrote on 04/11/2005 10:26:

> In the simulation you are assuming that the (virtual) room
> characteristics do not change; in real life, fluctuations/changes in
> your "room" might cause the problem you are describing.
heck, they change as you walk from the door to the chair behind your desk.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."