
signal enhancement by adaptive filtering

Started by Arran September 5, 2012
Hi

I'm developing an application to reduce alarm noise from speech transmissions. Many posts have come up on the forum before for topics like this, so apologies if some of these questions have been covered before, but I haven't managed to find discussions about all these issues.

I'm considering using adaptive filtering to do this, in particular a solution based on the standard 'signal enhancement' approach:

                        +
s+n1 -----------------( + )---------- e
                        |-    |
              /         |     |
  n2 ----- Filter ------+     |
            /                 |
            |                 |
            +-----------------+

Where s is speech, n1 the noise added to the speech, n2 a reference measurement of the noise, and e the 'error' signal fed back to the adaptive filter. A voice microphone picks up the speech s plus a version of the noise n1 which has travelled through an 'acoustic system', which is time-variant depending on the position of the user and other factors. The 'initial' delay through this system (time from alarm to user) could be on the order of 10 ms to 500 ms.
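For readers who want to experiment with this structure, here is a minimal LMS sketch of the block diagram above. Everything in it is a toy assumption: the 'acoustic system' is an arbitrary 3-tap FIR, the 'speech' is a sine wave, and the step size is untuned.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20000
s  = 0.5 * np.sin(2 * np.pi * 0.01 * np.arange(N))   # stand-in for speech
n2 = rng.standard_normal(N)                          # noise reference
h  = np.array([0.0, 0.6, -0.3])                      # assumed acoustic path
n1 = np.convolve(n2, h)[:N]                          # noise at the voice mic
d  = s + n1                                          # primary mic signal

# LMS adaptive filter: w models the path h, so e should converge toward s.
taps, mu = 8, 0.002
w = np.zeros(taps)
e = np.zeros(N)
for k in range(taps, N):
    x = n2[k - taps + 1:k + 1][::-1]   # most recent reference samples
    y = w @ x                          # filter's estimate of n1[k]
    e[k] = d[k] - y                    # error = enhanced signal estimate
    w += 2 * mu * e[k] * x             # LMS weight update

# After convergence, the residual (e - s) should be well below the noise power.
print("residual-to-noise power ratio:",
      np.mean((e[-5000:] - s[-5000:])**2) / np.mean(n1**2))
```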

In the application, it's possible to get the noise signal n2 from the alarm signal generator output directly. Alternatively, a mic positioned near the alarm transducer could be used. The alarm noise is sweeping tones of fixed patterns, but arranged in random order, which may have many harmonics and can occur at any time. I don't have a measurement yet for noise levels in dB at the voice mic, or a measurement relative to the speech; however, it's expected to be loud enough for the user to have to raise their voice to talk. Eventually some real measurements of the input signals will be taken.

I'm trying to understand what the 'real world' difficulties are in implementing this adaptive filter approach. I'm wondering if anyone can share some thoughts or experience on how successful it is likely to be?

From simple experiments in simulation, it seems like LMS-based algorithms can converge to give a good mean-square estimate of a really simple system regardless of the non-stationary statistics of the noise n2. I'm wondering whether there is a strong theoretical view of this, but in simple terms it seems that the changing statistics of the noise cannot affect the error, assuming n1 and n2 are correlated and their changing statistics are 'cancelled out' when calculating the error e = s + n1 - n2? I'm wondering whether this will translate to a real system. The speech is also of course non-stationary, and this seems to have more effect on the estimate as its statistics change and get fed back in the error signal. However, the filter converges well when the user is not speaking.

Will the form of the noise signal affect the ability of the filter to 'model' the acoustic system? It seems possible to analyse how well a given algorithm will respond, in terms of the convergence rate to the minimum error, from the input signal correlation matrix. I have to admit my knowledge here is limited; I'm wondering, before reading more on the theory, whether this is likely to be feasible or useful, in particular for the sweeping tone noise?

Will the ability of the filter to estimate n1 in a real system depend on factors such as non-linearities in the electro-acoustic system the noise travels through? If the alarm transducer is non-linear, will the filter response likely be better with n2 measuring the output of the transducer rather than direct from the signal generator?

The acoustic system will have walls that act as barriers to the noise, and provide reflections, reverberation, etc. I'm wondering whether it will be impractical, in terms of CPU load, to use an FIR filter to estimate the noise, as the range of delays in the impulse response could be large (up to 500 ms). Or whether the difficulty will be more due to the system being non-stationary, and whether the adaptive filter algorithms will be able to track the system given the input signals. In this application, could the sweeping noise input or the time-variant electro-acoustic system, or both, determine which algorithm will work best (LMS, RLS, etc.)?

I'm also wondering if there are alternative approaches to adaptive time-domain filtering that I should be considering, as they could produce better noise reduction for this type of application?

Many thanks
Arran
>> 'real world' difficulties
a very basic issue: I'd keep an eye on nonlinear distortion of the microphone, amplifier, DAC, etc.
What you're showing is a standard noise cancellation approach .. as 
rather distinct from a "signal enhancement" approach (like a line 
enhancer which doesn't use a noise reference as such).  But, OK.

Consider this:

In order for a filtered version of n2 to cancel n1, the noise has to be 
a periodic waveform so that what components are being cancelled are 
sinusoids.  I have seen a *great* demonstration of this where the 
"noise" was the output of a periodically and rapidly swept sinusoid. 
The improvement was dramatic.  (well, it *was* a demo!)

The principle is that the adapted filter becomes a "comb" that provides 
amplitude and phase controlled versions of the sinusoids in n2 that 
match those in n1 while shutting off any random noise frequencies in n2.

I'm assuming this is your case.
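A toy reconstruction of the kind of demo described here can be run in a few lines. Everything below is assumed for illustration: the 'noise' is a repeating linear chirp, and the path from n2 to n1 is just a 3-sample delay with a gain of 0.8.

```python
import numpy as np

# 'Noise' = repeating linear chirp sweeping 0.01..0.10 cycles/sample.
N = 40000
t = np.arange(N)
f = 0.01 + 0.09 * ((t % 4000) / 4000)
n2 = np.sin(2 * np.pi * np.cumsum(f))
n1 = np.zeros(N)
n1[3:] = 0.8 * n2[:-3]          # assumed path: 3-sample delay, gain 0.8

# LMS with no speech present: e is the residual (uncancelled) noise.
taps, mu = 6, 0.02
w = np.zeros(taps)
e = np.zeros(N)
for k in range(taps, N):
    x = n2[k - taps + 1:k + 1][::-1]
    e[k] = n1[k] - w @ x
    w += 2 * mu * e[k] * x

# The residual power should end up far below the original noise power.
print("residual-to-noise power ratio:",
      np.mean(e[-5000:]**2) / np.mean(n1**2))
```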

But, just to be a bit more complete:

If the noise is broadband random noise then, in the general case, the 
adaptive filter has to turn off completely.  That's because a random 
noise added to an uncorrelated random noise yields more noise and never 
less noise.

If n1 and n2 are random but identical, less a time shift and a scale 
factor, then all the adaptive filter would have to do is pick an 
appropriate delay and weight .. in principle.  Probably noise cancelling 
headphones would be an example of this.

In what appears to be your case:
The filter can do a good job of cancelling the "noise" as long as the 
dynamics aren't too great.  Consider this:
- if the time delay between n1 and n2 changes then the filter has to 
adapt to the change.  In general these filters don't adapt all that fast 
at least in my experience.  The faster the better in this regard.
- if the filter changes because of changes in the desired signal then I 
would say "there's something wrong" but I'm not sure in any particular 
case just what.  My best guess would be that it's adapting too fast 
rather than tracking the average noise output.

As in many cases of this sort there is a likely trade between the speed 
of adaptation, convergence and sensitivity to the "signal".
- you want it to converge and, I believe, convergence and speed of 
convergence run counter to one another.
- you want it to converge rapidly enough to track changes in the 
relationship between n1 and n2.
- you don't want it to change so rapidly that changes in the signal 
affect the adaptation all that much.
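These bullet points can be put in numbers with a toy LMS experiment (all signals and both step sizes below are invented for illustration): the larger the step size, the more the 'speech' drives the adaptation and the noisier the recovered signal.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 40000
s  = 0.5 * np.sin(2 * np.pi * 0.013 * np.arange(N))  # stand-in 'speech'
n2 = rng.standard_normal(N)                          # noise reference
n1 = np.convolve(n2, [0.6, -0.3])[:N]                # assumed noise path
d  = s + n1                                          # primary signal

def residual_distortion(mu, taps=4):
    """Run LMS and return the post-convergence power of (e - s),
    i.e. how badly the recovered 'speech' is distorted."""
    w = np.zeros(taps)
    err = np.zeros(N)
    for k in range(taps, N):
        x = n2[k - taps + 1:k + 1][::-1]
        e = d[k] - w @ x
        err[k] = e
        w += 2 * mu * e * x
    return np.mean((err[-10000:] - s[-10000:])**2)

slow = residual_distortion(mu=0.001)   # slow adaptation, low misadjustment
fast = residual_distortion(mu=0.05)    # fast adaptation, signal leaks in
print("slow:", slow, " fast:", fast)   # 'fast' should be noticeably worse
```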

Fred
On Wednesday, September 5, 2012 11:35:14 AM UTC-5, Fred Marshall wrote:
> In order for a filtered version of n2 to cancel n1, the noise has to be a periodic waveform so that what components are being cancelled are sinusoids. [...]

Fred,

The noise does not need to be periodic. It just needs to be correlated with the noise in n1 (and independent of the desired signal). n1 can be a filtered version of n2, and the noise will be reduced as long as the adaptive filter is long enough to represent the impulse response of the noise filter.

maurice
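The filter-length condition is easy to demonstrate. In this sketch (a hypothetical 5-tap noise path, no speech present), an adaptive filter shorter than the path leaves a large residual, while one that spans it cancels almost completely.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50000
n2 = rng.standard_normal(N)
h = np.array([0.5, 0.0, 0.0, 0.0, -0.4])   # hypothetical 5-tap noise path
n1 = np.convolve(n2, h)[:N]

def lms_residual(taps, mu=0.002):
    """Run LMS with no speech present; return steady-state residual power."""
    w = np.zeros(taps)
    e = np.zeros(N)
    for k in range(taps, N):
        x = n2[k - taps + 1:k + 1][::-1]
        e[k] = n1[k] - w @ x
        w += 2 * mu * e[k] * x
    return np.mean(e[-10000:]**2)

short = lms_residual(3)   # too short to reach the path tap at lag 4
full  = lms_residual(8)   # spans the whole 5-tap path
print("short filter:", short, " full-length filter:", full)
```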
On Wednesday, September 5, 2012 11:17:44 PM UTC+12, Arran wrote:
> I'm also wondering if there are alternative approaches to adaptive time-domain filtering that I should be considering, as they could produce better noise reduction for this type of application?

So many I cannot list them all. Spectral subtraction maybe, blind source separation, de-correlation, etc.
On Wednesday, September 5, 2012 2:57:46 PM UTC+1, mnentwig wrote:
> a very basic issue: I'd keep an eye on nonlinear distortion of microphone, amplifier, DAC etc.

Right, this is what I wondered. It seems like it will be worth spending some time comparing the direct signal generator output with the output of a reference microphone near the horn, in case the horn and amp are particularly non-linear (assuming the reference microphone is reasonably linear). These would be the two choices for n2, so I'd want to pick the one which represents n1 best.

Thanks,
Arran
> The principle is that the adapted filter becomes a "comb" that provides
> amplitude and phase controlled versions of the sinusoids in n2 that
> match those in n1 while shutting off any random noise frequencies in n2.

The noise is simple swept tones, with a number of harmonics. The tones can occur in a random sequence, but the noise is not random in the sense of the output of a stochastic process, e.g. white noise.

> - if the time delay between n1 and n2 changes then the filter has to
> adapt to the change.  In general these filters don't adapt all that fast
> at least in my experience.

I'm expecting the time delay and acoustic system to vary fairly quickly and often, so getting the adaptation rate and choice of algorithm right sounds like it could be a challenge (although available MIPS might help decide the algorithm).

> - you don't want it to change so rapidly that changes in the signal
> affect the adaptation all that much.

Right, if I increase the step-size parameter for the LMS case, I see the speech signal starting to affect the adaptation, which I think agrees with what you are saying.

Thanks,
Arran
> The noise does not need to be periodic. It just needs to be correlated with the noise in n1 (and independent of the desired signal). n1 can be a filtered version of n2, and the noise will be reduced as long as the adaptive filter is long enough to represent the impulse response of the noise filter.

I have a budget of about 20 MIPS. So even using LMS, as a rough estimate, I could probably have a filter with an impulse response of around 40 ms maximum. I expect the real acoustic systems could easily have longer impulse responses.

Is this likely to be one of the main limiting factors with this approach? I'm hoping that using an initial fixed delay to match roughly the length of the delay through the acoustic system will help (to 'locate' the filter near the start of the acoustic system impulse response). Perhaps in practice this might be a bit simplistic? I don't know how much such a delay might vary, or even if it is present, until I have some real measurements.

Thanks,
Arran
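The 40 ms estimate can be sanity-checked with back-of-envelope arithmetic. Note the 16 kHz sample rate and the figure of 2 operations per tap per sample for LMS (one MAC to filter, one for the weight update) are assumptions, not stated in the thread.

```python
# Back-of-envelope check of the "20 MIPS -> ~40 ms" estimate.
fs = 16_000          # assumed sample rate, Hz
mips_budget = 20e6   # operations per second available
ops_per_tap = 2      # assumed: one MAC for filtering + one for LMS update

taps = mips_budget / (ops_per_tap * fs)   # affordable filter length
span_ms = 1000 * taps / fs                # impulse-response span covered
print(f"{taps:.0f} taps covering about {span_ms:.1f} ms")
```

Under these assumptions the budget buys roughly 625 taps, or about 39 ms of impulse response, which matches Arran's figure.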
On 9/5/2012 2:14 PM, maury wrote:
> Fred,
> The noise does not need to be periodic. It just needs to be correlated with the noise in n1 (and independent of the desired signal). n1 can be a filtered version of n2, and the noise will be reduced as long as the adaptive filter is long enough to represent the impulse response of the noise filter.
>
> maurice

Maurice,

Ah. I guess I can see how that would work. So what I said:

"If n1 and n2 are random but identical, less a time shift and a scale factor, then all the adaptive filter would have to do is pick an appropriate delay and weight .. in principle."

is a simpler example of the same thing, and maybe better said as:

"If n1 and n2 are random but correlated, less amplitude and phase (or delay?) details, then the adaptive filter would have to adjust to an appropriate amplitude and phase response for cancellation."

I'm not sure what "correlated" means when there are phase differences, but simple delay (linear phase) makes sense here to me.

Thanks for commenting.

Fred
On 9/6/2012 6:12 AM, Arran wrote:
> I'm expecting the time delay and acoustic system to vary fairly quickly and often, so getting the adaptation rate and choice of algorithm right sounds like it could be a challenge (although available MIPS might help decide the algorithm).

Well, maybe MIPS will help in some system context here, but that's beyond my comprehension of your situation. The tradeoff that I mentioned is about *physics* and not MIPS. It doesn't matter how fast you can compute things if they aren't "ready" to be computed. That is, the system time constants need to be what they need to be.

Example: You have all the MIPS you need. You set the adaptation too fast. The adaptation becomes sensitive to the signal components.

Compared to: You set the adaptation so that the noise is minimized independent of the signal. After all, there is no signal in the reference, in principle. So all the adjustment has to work with is the reference. In that sense the filter becomes a model of the path between n2 and n1.

Now, if the path is changing rapidly in the sense of the filter adaptation, then that's where the trade comes in. If the path varies so fast that you make the adaptation fast, and then the signal gets into the adaptation, then you're on the hairy edge.

I should stop now as these dynamics are a bit out of my depth.

Fred