
Did you know - LMS

Started by HardySpicer July 4, 2012
That the LMS derivation and equation is only optimal for Gaussian
driving signals. If the driving noise through an unknown system has
some other form of distribution (say Laplace as is the case with
speech), then the best estimator of that FIR system is the simpler
sign() LMS. I found this quite a pleasant surprise.

Hardy
On 7/4/12 2:15 AM, HardySpicer wrote:
> That the LMS derivation and equation is only optimal for Gaussian
> driving signals. If the driving noise through an unknown system has
> some other form of distribution (say Laplace as is the case with
> speech), then the best estimator of that FIR system is the simpler
> sign() LMS. I found this quite a pleasant surprise.
what's the "sign() LMS"?

i never really had the adaptive LMS in class, but the derivation, as i
have seen it, doesn't make an assumption regarding the p.d.f. of the
driving signals.

--
r b-j                     rbj@audioimagination.com

"Imagination is more important than knowledge."
On Jul 5, 2:04 am, robert bristow-johnson <r...@audioimagination.com>
wrote:
> On 7/4/12 2:15 AM, HardySpicer wrote:
>
> > That the LMS derivation and equation is only optimal for Gaussian
> > driving signals. If the driving noise through an unknown system has
> > some other form of distribution (say Laplace as is the case with
> > speech), then the best estimator of that FIR system is the simpler
> > sign() LMS. I found this quite a pleasant surprise.
>
> what's the "sign() LMS"?
>
> i never really had the adaptive LMS in class, but the derivation, as i
> have seen it, doesn't make an assumption regarding the p.d.f. of the
> driving signals.
>
> --
> r b-j                     r...@audioimagination.com
>
> "Imagination is more important than knowledge."
Same as ordinary LMS except you use sign(e(t)) instead of e(t)
(essentially). Well, maximum likelihood does use the PDF of the noise,
and for Gaussian stats you get the same as ordinary least-squares.
However, with a different PDF the formula changes, of course.

Hardy
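[For readers who haven't met the sign-error variant: a minimal sketch of
the two update rules being compared. This is an illustration added for
clarity, not code from the thread; the function names and the step size
mu are made up.]

import numpy as np

def lms_update(w, x, d, mu=0.01):
    # standard LMS: w <- w + mu * e * x, with error e = d - w'x
    e = d - np.dot(w, x)
    return w + mu * e * x

def sign_lms_update(w, x, d, mu=0.01):
    # sign-error LMS: identical, except the update is driven by sign(e)
    e = d - np.dot(w, x)
    return w + mu * np.sign(e) * x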
On 7/4/12 2:49 PM, HardySpicer wrote:
> On Jul 5, 2:04 am, robert bristow-johnson<r...@audioimagination.com>
> wrote:
>> On 7/4/12 2:15 AM, HardySpicer wrote:
>>
>>> That the LMS derivation and equation is only optimal for Gaussian
>>> driving signals. If the driving noise through an unknown system has
>>> some other form of distribution (say Laplace as is the case with
>>> speech), then the best estimator of that FIR system is the simpler
>>> sign() LMS. I found this quite a pleasant surprise.
>>
>> what's the "sign() LMS"?
>>
>> i never really had the adaptive LMS in class, but the derivation, as i
>> have seen it, doesn't make an assumption regarding the p.d.f. of the
>> driving signals.
>
> Same as ordinary LMS except you use sign(e(t)) instead of e(t)
> (essentially).
so how does that help?
> Well, maximum likelihood does use the PDF of the noise and for
> Gaussian stats you get the same as ordinary least-squares.
> However, with a different PDF the formula changes of course.
i don't get it.  you can derive the basic LMS filter by minimizing an
error metric (involving the mean-square) with no assumption at all about
p.d.f.  there *are* assumptions about correlation and whiteness, i think.
but none, as far as i understand the simple derivation, for p.d.f.

--
r b-j                     rbj@audioimagination.com

"Imagination is more important than knowledge."
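[To spell out the textbook derivation being referred to (standard
notation, not from the thread): with error e(n) = d(n) - w(n)'x(n), LMS
descends the instantaneous estimate of the mean-square cost
J = E[e^2(n)]. The gradient of e^2(n) with respect to w is -2 e(n) x(n),
which gives the update w(n+1) = w(n) + mu e(n) x(n). No probability
density enters that step; the density only appears if the update is
instead derived from maximum likelihood, as discussed in the following
posts.]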
On Wednesday, July 4, 2012 1:15:04 AM UTC-5, HardySpicer wrote:
> That the LMS derivation and equation is only optimal for Gaussian
> driving signals. If the driving noise through an unknown system has
> some other form of distribution (say Laplace as is the case with
> speech), then the best estimator of that FIR system is the simpler
> sign() LMS. I found this quite a pleasant surprise.
>
> Hardy
On what do you base this? It is my understanding that the only constraint
on the LMS algorithm is that the input vector be independent of the weight
vector (this in itself has huge implications). As far as optimality is
concerned, the LMS only provides optimality in the sense that the filter
will estimate the Wiener optimal solution if the input is stationary (in
addition to the independence requirement).

LMS requires neither gaussian nor white inputs.
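[For concreteness, the standard result behind this (not specific to the
thread): for a stationary input with autocorrelation matrix
R = E[x(n) x(n)'] and cross-correlation vector p = E[d(n) x(n)], the
Wiener solution that LMS converges toward in the mean is w_o = R^{-1} p.
Only second-order statistics appear, so no Gaussian assumption is needed
for that convergence.]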
> LMS requires neither gaussian nor white inputs.
Of course you can apply LMS to non-gaussian inputs. I believe this is done
in many applications. At least I can mention a few where I know for sure
that LMS is used for inputs that are non-gaussian.

However, maybe one can take the viewpoint that implicitly the noise is
assumed gaussian because only second-order information is used. If you
know that the noise is not gaussian, then there should be a possibility of
deriving a better-performing 'LMS' filter by exploiting the information
that is conveyed in the higher-order moments. It will likely introduce
some non-linear operation into the update.

You might want to do a search on alpha-stable distributions combined with
LMS filters or adaptive filtering.

just my €0.02

Cheers
I almost ought to label this "OT", only because it's so far removed
from earlier comments.  But here goes:

My simple-minded understanding of an LMS filter, just picking one 
variation, goes like this:

Let's assume an adaptive line enhancer (ALE).
The "model", if you will, is that there are some sinusoids "S" mixed with
noise "N".  I don't know if the type of noise matters at all.  I suspect 
not because:

Here is the block diagram in sort of "analog" terms:

Input
goes into
Long delay line which goes to the Summing Junction
and goes into
Adaptive Filter which also goes to the Summing Junction

The adaptation objective is to minimize the output of the summing junction.
And, the implementation uses the output of the Adaptive Filter as the 
system output (not the Summing Junction).
The job of the delay line is to decorrelate the noises arriving at the 
summing junction.
So, for that to happen I think all one needs to say about the noise is
that it's random.
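[A rough sketch of that structure in code, added for illustration; the
delay, filter length, and step size are made-up values, and I've put the
delay in the path feeding the adaptive filter, which is one common
arrangement.]

import numpy as np

def ale(x, delay=32, taps=64, mu=0.005):
    w = np.zeros(taps)
    y = np.zeros(len(x))    # Adaptive Filter output = the system output
    e = np.zeros(len(x))    # Summing Junction output = what adaptation minimizes
    for n in range(delay + taps, len(x)):
        # delayed samples feeding the Adaptive Filter
        xd = x[n - delay - taps + 1 : n - delay + 1][::-1]
        y[n] = np.dot(w, xd)
        e[n] = x[n] - y[n]   # Summing Junction: input minus filter output
        w += mu * e[n] * xd  # LMS adaptation driven by the junction output
    return y, e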

Take this case:
Apply *only* noise to the input.
The Adaptive filter "turns off" at all frequencies because random noise1 
plus uncorrelated random noise2 is always larger than noise1.
How could the nature of the noise matter then?

OK.  Now put the sinusoids back in.
The Adaptive filter still "turns off" at all frequencies *except* where
the sinusoids lie (where their amplitude and phase are adjusted) - in 
order to cancel them out at the summing junction.  Anything else would 
not minimize the summing junction output.
That particular part of the adapted filter should not be much affected 
by the nature of the noise - the pdf or whiteness or whatever.

If the noise is narrowband noise centered where the sinusoids lie then 
the "noise" starts looking sinusoidal and we have a good chore for a 
thesis topic.  Otherwise I don't see how it will matter much ... if it does.

Fred


On Jul 6, 2:44 am, maury <maury...@core.com> wrote:
> On Wednesday, July 4, 2012 1:15:04 AM UTC-5, HardySpicer wrote:
> > That the LMS derivation and equation is only optimal for Gaussian
> > driving signals. If the driving noise through an unknown system has
> > some other form of distribution (say Laplace as is the case with
> > speech), then the best estimator of that FIR system is the simpler
> > sign() LMS. I found this quite a pleasant surprise.
> >
> > Hardy
>
> On what do you base this? It is my understanding that the only
> constraint on the LMS algorithm is that the input vector be independent
> of the weight vector (this in itself has huge implications). As far as
> optimality is concerned, the LMS only provides optimality in the sense
> that the filter will estimate the Wiener optimal solution if the input
> is stationary (in addition to the independence requirement).
>
> LMS requires neither gaussian nor white inputs.
It's not a major difference that I can see, but it is an important
observation. LMS can be derived from maximum likelihood assuming a
Gaussian distribution. However, if you assume a different distribution,
you get a different algorithm, tailored to that distribution. It should
give better results. From my simulations it's not that striking, though.
See this paper:

OPTIMUM ERROR NONLINEARITIES FOR LMS ADAPTATION
S. C. Douglas and T. H.-Y. Meng

Hardy
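[To make the connection explicit (a standard argument, not quoted from
the paper): Laplacian noise has a likelihood proportional to exp(-|e|/b),
so maximum likelihood minimizes E[|e(n)|] rather than E[e^2(n)]. The
(sub)gradient of |e(n)| with respect to the weights is -sign(e(n)) x(n),
and stochastic descent on that cost gives exactly the sign-error update
w(n+1) = w(n) + mu sign(e(n)) x(n).]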
On Jul 6, 11:53 am, HardySpicer <gyansor...@gmail.com> wrote:
> On Jul 6, 2:44 am, maury <maury...@core.com> wrote:
> > On Wednesday, July 4, 2012 1:15:04 AM UTC-5, HardySpicer wrote:
> > > That the LMS derivation and equation is only optimal for Gaussian
> > > driving signals. If the driving noise through an unknown system has
> > > some other form of distribution (say Laplace as is the case with
> > > speech), then the best estimator of that FIR system is the simpler
> > > sign() LMS. I found this quite a pleasant surprise.
> > >
> > > Hardy
> >
> > On what do you base this? It is my understanding that the only
> > constraint on the LMS algorithm is that the input vector be
> > independent of the weight vector (this in itself has huge
> > implications). As far as optimality is concerned, the LMS only
> > provides optimality in the sense that the filter will estimate the
> > Wiener optimal solution if the input is stationary (in addition to
> > the independence requirement).
> >
> > LMS requires neither gaussian nor white inputs.
>
> It's not a major difference that I can see, but it is an important
> observation.
> LMS can be derived from maximum likelihood assuming a Gaussian
> distribution. However, if you assume a different distribution
> you get a different algorithm tailored to that distribution. It should
> give better results. From my simulations it's not that striking,
> though.
> See this paper
>
> OPTIMUM ERROR NONLINEARITIES FOR LMS ADAPTATION
> S. C. Douglas and T. H.-Y. Meng
>
> Hardy
and from that paper it says:

"for Laplacian plant noise, we may achieve a 3dB reduction in
misadjustment using a sign error nonlinearity for a given convergence
rate as compared to standard LMS adaptation"
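[A small experiment along those lines, added as an illustration; it is my
own sketch, not code from the paper, and all parameter values are made
up. It identifies a short FIR "plant" with Laplacian measurement noise
and compares the final coefficient error of LMS and sign-error LMS using
the same step size (so it is not exactly the paper's
equal-convergence-rate comparison).]

import numpy as np

rng = np.random.default_rng(0)
N, taps, mu = 20000, 8, 0.005
h = rng.standard_normal(taps)               # unknown FIR plant
x = rng.standard_normal(N)                  # white driving signal
noise = rng.laplace(scale=0.1, size=N)      # Laplacian plant noise
d = np.convolve(x, h)[:N] + noise           # noisy plant output

def run(step):
    w = np.zeros(taps)
    for n in range(taps, N):
        xv = x[n - taps + 1 : n + 1][::-1]  # regressor, newest sample first
        e = d[n] - np.dot(w, xv)
        w += step(e) * xv
    return np.sum((w - h) ** 2)             # final coefficient error

print("LMS      :", run(lambda e: mu * e))
print("sign-LMS :", run(lambda e: mu * np.sign(e)))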
On Thursday, July 5, 2012 8:03:04 PM UTC-5, HardySpicer wrote:
> On Jul 6, 11:53 am, HardySpicer <gyansor...@gmail.com> wrote:
> > On Jul 6, 2:44 am, maury <maury...@core.com> wrote:
> > > On Wednesday, July 4, 2012 1:15:04 AM UTC-5, HardySpicer wrote:
> > > > That the LMS derivation and equation is only optimal for Gaussian
> > > > driving signals. If the driving noise through an unknown system
> > > > has some other form of distribution (say Laplace as is the case
> > > > with speech), then the best estimator of that FIR system is the
> > > > simpler sign() LMS. I found this quite a pleasant surprise.
> > > >
> > > > Hardy
> > >
> > > On what do you base this? It is my understanding that the only
> > > constraint on the LMS algorithm is that the input vector be
> > > independent of the weight vector (this in itself has huge
> > > implications). As far as optimality is concerned, the LMS only
> > > provides optimality in the sense that the filter will estimate the
> > > Wiener optimal solution if the input is stationary (in addition to
> > > the independence requirement).
> > >
> > > LMS requires neither gaussian nor white inputs.
> >
> > It's not a major difference that I can see, but it is an important
> > observation.
> > LMS can be derived from maximum likelihood assuming a Gaussian
> > distribution. However, if you assume a different distribution
> > you get a different algorithm tailored to that distribution. It
> > should give better results. From my simulations it's not that
> > striking, though.
> > See this paper
> >
> > OPTIMUM ERROR NONLINEARITIES FOR LMS ADAPTATION
> > S. C. Douglas and T. H.-Y. Meng
> >
> > Hardy
>
> and from that paper it says
>
> for Laplacian plant noise, we may achieve a
> 3dB reduction in misadjustment using a sign error nonlinearity for a
> given convergence rate as compared to standard LMS adaptation
I always have a problem when people use coefficient misadjustment as a
measure of the performance of the LMS algorithm. The LMS algorithm is NOT
derived by determining the minimum misadjustment, nor does it try to
minimize coefficient misadjustment. The ONLY quantity the derivation
minimizes is the mean-square error, nothing else.