Reply by maury October 4, 2011
On Oct 3, 10:40 pm, "steveu" <steveu@n_o_s_p_a_m.coppice.org> wrote:
> >On Oct 3, 7:03 pm, "steveu" <steveu@n_o_s_p_a_m.coppice.org> wrote:
> >> The right way is to record something realistic, since that's what you
> >> need to cancel. Most real signals, like voice, are suitably wideband,
> >> so they avoid the narrowband pitfall.
> >
> >Because echo cancellers usually converge fast, voice appears as a
> >non-stationary signal to the echo canceller. Voice changes frequency
> >enough that the echo canceller converges to the response of the unknown
> >for the frequencies present, then must reconverge when those frequencies
> >change. If you look at the estimate of the echo (y_hat) with voice,
> >you will see continual instances of the canceller reconverging to
> >produce the estimate.
>
> True, but voice is the most benign of the common real world training
> signals. With certain types of music it is not uncommon to find echo
> cancelers drifting off from a generic solution to one which only cancels
> the music. However, if your canceler drifts off from the generic solution
> on voice you are in serious trouble.
>
> Steve
Well, that may be a different problem. LMS-based adaptive filters estimate the Wiener optimum filter, y = Rxx^-1 * rxd, where Rxx is the autocorrelation matrix of the input and rxd is the cross-correlation between the input and the desired signal. If the input is a sinusoid, Rxx is singular for any filter longer than two taps, so there is NO unique optimum solution. In fact, there are infinitely many solutions. Adaptive filters can, and often do, jump from one solution to another. This appears as an inability of the adaptive filter to converge. The design of the adaptive filter must take this into account (look at the tests for low-speed modems in ITU-T G.168, Network Echo Cancellers).

Since music can often be sinusoidal, this may be the problem with the drift you see with music inputs. With sinusoids, you must look at the adaptive filter using transfer function analysis. For reference, look at P. M. Clarkson and P. R. White, "Simplified analysis of the LMS adaptive filter using a transfer function approximation", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 987-993, 1987.

One *brute-force* solution that has been successful, if the unknown impulse response remains constant, is to converge using wide-band noise (over the full frequency range of interest), and then freeze further adaptation. The impulse response is acquired for all frequencies, and the adaptive filter then couldn't care less what signal you apply.

Maurice Givens
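A quick numerical check of the singular-Rxx point, offered as a sketch and not as anything posted in the thread. It builds a sample autocorrelation matrix for a 16-tap filter from white noise and from a single sinusoid, then compares ranks. The sizes, the tone frequency, the rank tolerance and the helper name autocorr_matrix are all assumptions made for illustration.

```python
import numpy as np

def autocorr_matrix(x, n_taps):
    """Sample autocorrelation matrix Rxx (n_taps x n_taps) of signal x."""
    # Each row of X is one length-n_taps window of the input.
    X = np.lib.stride_tricks.sliding_window_view(x, n_taps)
    return X.T @ X / X.shape[0]

rng = np.random.default_rng(0)
n_samples, n_taps = 4000, 16            # illustrative sizes (assumed)

noise = rng.standard_normal(n_samples)
tone = np.sin(2 * np.pi * 0.05 * np.arange(n_samples))

# Explicit tolerance so rounding noise is not counted as rank.
rank_noise = np.linalg.matrix_rank(autocorr_matrix(noise, n_taps), tol=1e-8)
rank_tone = np.linalg.matrix_rank(autocorr_matrix(tone, n_taps), tol=1e-8)

print("rank of Rxx, white noise:", rank_noise)
print("rank of Rxx, sinusoid:   ", rank_tone)
```

With the tone, Rxx has rank 2 no matter how many taps the filter has, so Rxx^-1 does not exist and the remaining degrees of freedom in the filter are unconstrained. That is where the infinitely many solutions come from.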
Reply by steveu October 4, 2011
>On Oct 3, 7:03 pm, "steveu" <steveu@n_o_s_p_a_m.coppice.org> wrote:
>> The right way is to record something realistic, since that's what you
>> need to cancel. Most real signals, like voice, are suitably wideband, so
>> they avoid the narrowband pitfall.
>
>Because echo cancellers usually converge fast, voice appears as a
>non-stationary signal to the echo canceller. Voice changes frequency enough
>that the echo canceller converges to the response of the unknown for
>the frequencies present, then must reconverge when those frequencies
>change. If you look at the estimate of the echo (y_hat) with voice,
>you will see continual instances of the canceller reconverging to
>produce the estimate.

True, but voice is the most benign of the common real-world training signals. With certain types of music it is not uncommon to find echo cancelers drifting off from a generic solution to one which only cancels the music. However, if your canceler drifts off from the generic solution on voice you are in serious trouble.

Steve
Reply by maury October 3, 2011
On Oct 3, 7:03 pm, "steveu" <steveu@n_o_s_p_a_m.coppice.org> wrote:
> The right way is to record something realistic, since that's what you need
> to cancel. Most real signals, like voice, are suitably wideband, so they
> avoid the narrowband pitfall.

Because echo cancellers usually converge fast, voice appears as a non-stationary signal to the echo canceller. Voice changes frequency enough that the echo canceller converges to the response of the unknown for the frequencies present, then must reconverge when those frequencies change. If you look at the estimate of the echo (y_hat) with voice, you will see continual instances of the canceller reconverging to produce the estimate.

Maurice Givens
Reply by steveu October 3, 2011
>> Define loud. If loud means into clipping, or loud enough the mic or speaker
>> distort, you are into non-linearities which will mess up the cancellation
>> quite badly. Stay linear and things should be fine during double talk. An
>
>x[k] : far-end speech (speaker signal)
>e[k] : echo signal from loudspeaker
>z[k] : near-end speech
>y[k] : microphone signal
>
>y[k] = e[k] + z[k]
>
>By loud I mean that the variance of e[k] is much larger than the
>variance of z[k].
>
>Which properties must x[k] and y[k] possess? You already mentioned
>clipping and distortion...What else?
>
>If you're going to record some test signals to test an AEC with, what
>is the right way to do it; and what is the wrong way?

The wrong way is to record something narrow band. The right way is to record something realistic, since that's what you need to cancel. Most real signals, like voice, are suitably wideband, so they avoid the narrowband pitfall.

>> EC would be rather a waste of time if it could handle double talk well.
>
>Are you saying that an AEC can't handle double-talk well? I have seen
>examples of an AEC completely cancelling out the echo during double-talk.

Sorry, that was a typo. I missed the "n't" from couldn't.

Steve
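To make the narrowband pitfall concrete, here is a minimal NLMS sketch. It is an illustration under stated assumptions, not the canceller anyone in the thread is running. The echo path h is a made-up 32-tap decaying random response, there is no near-end talker or noise, and nlms_identify, mu and eps are hypothetical names and values. The same filter is trained once on white noise and once on a single tone, and two numbers are compared: how much echo is left (residual) and how far the weights are from the true response (misalignment).

```python
import numpy as np

def nlms_identify(x, h, mu=0.5, eps=1e-8):
    """Run NLMS against a known echo path h; return (misalignment, residual)."""
    n_taps = len(h)
    # Each row is one full delay line, newest sample first. Adaptation only
    # starts once the delay line is full, so no start-up transient (which is
    # wideband) leaks into the tone experiment.
    X = np.lib.stride_tricks.sliding_window_view(x, n_taps)[:, ::-1]
    d = X @ h                               # noise-free echo, no near-end talker
    w = np.zeros(n_taps)
    err = np.empty(len(d))
    for k, u in enumerate(X):
        err[k] = d[k] - w @ u               # residual echo at step k
        w += mu * err[k] * u / (u @ u + eps)
    misalignment = np.linalg.norm(w - h) / np.linalg.norm(h)
    residual = np.mean(err[-500:] ** 2) / np.mean(d[-500:] ** 2)
    return misalignment, residual

rng = np.random.default_rng(1)
h = rng.standard_normal(32) * np.exp(-0.1 * np.arange(32))  # hypothetical echo path
n = 6000

mis_noise, res_noise = nlms_identify(rng.standard_normal(n), h)
mis_tone, res_tone = nlms_identify(np.sin(2 * np.pi * 0.05 * np.arange(n)), h)

print(f"noise: misalignment {mis_noise:.2e}, residual {res_noise:.2e}")
print(f"tone:  misalignment {mis_tone:.2e}, residual {res_tone:.2e}")
```

In this sketch the tone is cancelled almost perfectly, yet the weights stay far from the true response: the filter has found a solution that only cancels that one frequency. A recording like that tells you very little about how the canceller behaves on anything else.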
Reply by maury October 3, 2011
On Oct 3, 4:34 pm, John McDermick <johnthedsp...@gmail.com> wrote:
> > Which properties must x[k] and y[k] possess? You already mentioned
> > clipping and distortion...What else?
>
> Correction:
> Which properties must x[k] and y[k] possess? You already mentioned
> that signals with clipping/distortion don't work well with an AEC.

The first question to answer, John, is what algorithm are you using: LMS, NLMS, RLS, or a variation thereof? If it is of the LMS variety (of which there are many), then look at the assumptions of the algorithm, e.g., stationary, non-correlated, etc. Those are the properties you must consider.

Another very important property is the model. If you are using the y = Ax model, then the unknown must be linear. How about the length of your adaptive filter? Is it long enough to encompass the length of the unknown impulse response? Are you keeping the impulse response constant (stationarity)? If you are using a *live* room, then every time someone moves, the impulse response changes. Is the convergence speed of your algorithm fast enough to keep up with the changes?

For a basic look at echo cancellers (network type) and things to consider, see if you can get a copy of a technical report on echo cancellers produced by the T1 committee in the 1993 - 1997 time frame. I can't remember the exact date.

Maurice Givens
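On the filter-length question, a rough way to see what a too-short filter costs, as a sketch only. It assumes white-noise input, for which Rxx is a scaled identity and the best n_taps-long filter is simply the first n_taps coefficients of the true response; whatever lies beyond that is echo the filter cannot reach. The result is expressed as ERLE (echo return loss enhancement, echo power before over echo power after, in dB). The exponentially decaying 256-tap response and the name erle_bound_db are hypothetical.

```python
import numpy as np

def erle_bound_db(h, n_taps):
    """Best-case ERLE (dB) for white-noise input when the adaptive filter
    has only n_taps coefficients and the true impulse response is h."""
    total = np.sum(h ** 2)
    tail = np.sum(h[n_taps:] ** 2)       # echo energy beyond the filter's reach
    return 10.0 * np.log10(total / tail)

h = np.exp(-np.arange(256) / 40.0)       # hypothetical decaying response (assumed)

for n_taps in (32, 64, 128):
    print(n_taps, "taps:", round(erle_bound_db(h, n_taps), 1), "dB")
```

No amount of adaptation gets past this bound; only a longer filter does. For coloured inputs such as speech the numbers will differ, but the uncancelled tail remains.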
Reply by John McDermick October 3, 2011
> Which properties must x[k] and y[k] possess? You already mentioned
> clipping and distortion...What else?

Correction:
Which properties must x[k] and y[k] possess? You already mentioned that signals with clipping/distortion don't work well with an AEC.
Reply by John McDermick October 3, 2011
> Define loud. If loud means into clipping, or loud enough the mic or speaker
> distort, you are into non-linearities which will mess up the cancellation
> quite badly. Stay linear and things should be fine during double talk. An

x[k] : far-end speech (speaker signal)
e[k] : echo signal from loudspeaker
z[k] : near-end speech
y[k] : microphone signal

y[k] = e[k] + z[k]

By loud I mean that the variance of e[k] is much larger than the variance of z[k].

Which properties must x[k] and y[k] possess? You already mentioned clipping and distortion...What else?

If you're going to record some test signals to test an AEC with, what is the right way to do it; and what is the wrong way?

> EC would be rather a waste of time if it could handle double talk well.
Are you saying that an AEC can't handle double-talk well? I have seen examples of an AEC completely cancelling out the echo during double-talk.
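For test recordings, the definition of "loud" above (variance of e[k] relative to variance of z[k]) can be pinned to a number in dB. The sketch below is one possible way to do that, assuming the echo and the near-end speech are available as separate arrays. It uses Gaussian noise as a stand-in for both, and the helper names are hypothetical.

```python
import numpy as np

def echo_to_near_end_db(e, z):
    """Ratio of echo variance to near-end variance, in dB."""
    return 10.0 * np.log10(np.var(e) / np.var(z))

def mix_at_ratio(e, z, ratio_db):
    """Scale e so var(e)/var(z) equals ratio_db, then form y[k] = e[k] + z[k]."""
    gain = 10.0 ** ((ratio_db - echo_to_near_end_db(e, z)) / 20.0)
    e_scaled = gain * e
    return e_scaled + z, e_scaled

rng = np.random.default_rng(2)
e = rng.standard_normal(8000)            # stand-in for the echo signal e[k]
z = rng.standard_normal(8000)            # stand-in for near-end speech z[k]

y, e_loud = mix_at_ratio(e, z, ratio_db=20.0)
ratio_db = echo_to_near_end_db(e_loud, z)
print("echo-to-near-end ratio:", round(ratio_db, 2), "dB")
```

Sweeping ratio_db over a range of values gives a set of microphone signals y[k] for checking at what ratio the near-end speech starts to suffer.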
Reply by steveu October 3, 2011
>Hello,
>
>I have a general question about acoustic echo cancellers.
>
>If you look at the power of the echo component in a microphone signal,
>is there a point (common for most AECs) where the AEC starts degrading
>the near-end speech component instead?
>
>From my limited experience with AECs it seems that they work great
>within a narrow range. If the speaker signal becomes too loud compared
>to the loudness of the near-end speech, then the AEC output will sound
>horrible during double-talk.

Define loud. If loud means into clipping, or loud enough the mic or speaker distort, you are into non-linearities which will mess up the cancellation quite badly. Stay linear and things should be fine during double talk. An EC would be rather a waste of time if it could handle double talk well.

Steve
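The effect of clipping can be seen without any adaptive algorithm at all, by asking how well the best possible linear FIR filter can fit the echo. The following is a sketch under assumptions: white Gaussian far-end signal, a made-up 32-tap echo path, a loudspeaker modelled as a hard clipper at +/-1, and a least-squares fit standing in for a fully converged canceller. The name erle_db_linear_fit is hypothetical.

```python
import numpy as np

def erle_db_linear_fit(x, d, n_taps):
    """ERLE (dB) of the best least-squares linear FIR fit from x to d."""
    # Rows are full delay lines, newest sample first.
    X = np.lib.stride_tricks.sliding_window_view(x, n_taps)[:, ::-1]
    d = d[n_taps - 1:]                        # align d with the rows of X
    w, *_ = np.linalg.lstsq(X, d, rcond=None)
    residual = d - X @ w
    return 10.0 * np.log10(np.mean(d ** 2) / np.mean(residual ** 2))

rng = np.random.default_rng(3)
n_taps = 32
h = rng.standard_normal(n_taps) * np.exp(-0.1 * np.arange(n_taps))  # hypothetical echo path
x = rng.standard_normal(6000)                 # far-end signal, unit variance

d_linear = np.convolve(x, h)[:len(x)]                        # loudspeaker stays linear
d_clipped = np.convolve(np.clip(x, -1.0, 1.0), h)[:len(x)]   # loudspeaker clips at +/-1

erle_linear = erle_db_linear_fit(x, d_linear, n_taps)
erle_clipped = erle_db_linear_fit(x, d_clipped, n_taps)
print("linear path ERLE: ", round(erle_linear, 1), "dB")
print("clipped path ERLE:", round(erle_clipped, 1), "dB")
```

In the linear case the fit is limited only by numerical precision. With clipping, the distortion products are not a linear function of x[k], so even a perfectly converged linear canceller leaves them in the output.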
Reply by John McDermick October 3, 2011
Hello,

I have a general question about acoustic echo cancellers.

If you look at the power of the echo component in a microphone signal,
is there a point (common for most AECs) where the AEC starts degrading
the near-end speech component instead?

From my limited experience with AECs it seems that they work great
within a narrow range. If the speaker signal becomes too loud compared
to the loudness of the near-end speech, then the AEC output will sound
horrible during double-talk.