Did you know - LMS

Started by HardySpicer July 4, 2012
On 7/14/12 3:52 PM, niarn wrote:
>>> Using your notation can you then accept this formulation
>>> as a formulation of adaptive FIR filtering:
>>> "Given d[n] and x[n] and given some assumptions about e[n], then in some
>>> sense estimate/compute h[n,m]."
>
>> apart from the doubletalk issue, yes, the problem statement is to do
>> something about h[n,i] so that e[n] is reduced to a minimum in a
>> mean-square sense.  that is what the "LMS" is about.
>
> I assume then that you accept the formulation of the adaptive FIR
> filtering.
i think so. i just presented a formulation, so i might ask you the
same: do you understand and accept it as i presented it? do you see the
difference between e[n] and v[n]? i am still a little more careful to
present d[n] as a real signal that has its own independent origin, and that
origin has *no* e[n] in it. *you* (or your LMS filter) define e[n] in terms
of y[n] and d[n] in practice; d[n] is supplied as an input.
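For concreteness, here is a minimal sketch of the setup under discussion,
in NumPy. The filter length N, the step size mu, and the zero-padding of
the tap-input vector are placeholder choices, and the coefficient update
shown is just the standard LMS recursion, not anything particular to this
thread:

    import numpy as np

    def lms(x, d, N=8, mu=0.01):
        """Adaptive FIR filter: y[n] = SUM_i h[n,i]*x[n-i], e[n] = d[n] - y[n].

        x : input signal, d : desired signal (supplied, with its own origin),
        N : number of taps, mu : step size.  Returns (y, e, final taps)."""
        h = np.zeros(N)                  # h[n,i] at the current time n
        y = np.zeros(len(x))
        e = np.zeros(len(x))
        for n in range(len(x)):
            # tap-input vector [x[n], x[n-1], ..., x[n-N+1]], zero-padded at the start
            xv = np.array([x[n - i] if n - i >= 0 else 0.0 for i in range(N)])
            y[n] = h @ xv                # filter output
            e[n] = d[n] - y[n]           # the error LMS drives down in a mean-square sense
            h = h + mu * e[n] * xv       # LMS coefficient update
        return y, e, h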
>
> The next step is then to write the expression for the PDF of d[n], written
> as p(d[n];h[n,m]).  Note that this is not a conditional PDF; the semicolon
> is there to express that the PDF of d[n] is parameterized by h[n,m].
> In our model
>
>          N-1
>   d[n] = SUM{ h[n,i] * x[n-i] } + e[n] = y[n] + e[n]
>          i=0
>
> we now take e[n] to be normally distributed with zero mean and variance
> sigma.
in general, least-squares fitting *does* assume that the error is zero mean
with finite variance, but there is *no* assumption of a normal or gaussian
p.d.f.
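(To pin that down in the thread's notation: the criterion being minimized is

    J = E{ e[n]^2 } = E{ ( d[n] - SUM_{i=0..N-1} h[n,i]*x[n-i] )^2 }

and for that to make sense we only need E{e[n]} = 0 and E{e[n]^2} finite;
the shape of the p.d.f. of e[n] never enters.)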
> Typically, if I remember correctly, this is sometimes written in the
> following way:  e[n] ~ N(0, sigma)
>
> Then we have:  d[n] ~ N(y[n], sigma)
that does not directly follow. what if all of your h[n,i] are zero except one (say h[n,0]) and x[n] is not normal?
> or if we want to write the detailed expression
>
>   p(d[n];h[n,m]) = sqrt(2 pi sigma)^{-1} exp( -0.5 (d[n] - y[n])^2 / sigma )
>
> Does this make any sense?
sorta, it's stated in a manner that i think i know what you're saying
mathematically, but the assumption of normal p.d.f. is not necessary, is it?

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
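As a concrete check of the point being argued here: take the Gaussian
likelihood niarn wrote down and look at its negative logarithm. Up to a
constant it is just the sum-of-squares error, so maximizing the likelihood
and minimizing the squared error pick the same coefficients, while the
squared-error criterion itself only ever sees (d[n] - y[n])^2. A small
sketch (NumPy, with sigma meaning the variance as in niarn's post):

    import numpy as np

    def neg_log_likelihood(d, y, sigma):
        """-log p(d; h) for d[n] = y[n] + e[n] with e[n] ~ N(0, sigma).

        Up to the constant 0.5*len(d)*log(2*pi*sigma), this is the
        sum-of-squares error scaled by 1/(2*sigma), so the h that
        maximizes the likelihood also minimizes SUM (d[n] - y[n])^2."""
        d = np.asarray(d)
        y = np.asarray(y)
        return (0.5 * len(d) * np.log(2.0 * np.pi * sigma)
                + 0.5 * np.sum((d - y) ** 2) / sigma)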
> i think so. i just presented a formulation, so i might ask you the
> same: do you understand and accept it as i presented it?
Yes, I do. I first learned about LMS in a book by S. Haykin, where LMS is
couched, if not exactly, then in a way very similar to how I am couching it
here. The motivation for this other formulation is that it paves the way
for another level of statistical treatment.
>> Typically, if I remember correctly, this is sometimes written in the
>> following way:  e[n] ~ N(0, sigma)
>>
>> Then we have:  d[n] ~ N(y[n], sigma)
>
> that does not directly follow. what if all of your h[n,i] are zero
> except one (say h[n,0]) and x[n] is not normal?
x[n] is given to us; it is not considered a random variable in this model.
The coefficients h[n,m] are unknown but deterministic. They CAN be treated
as random variables, and in fact that is what is done in the code I pasted
previously, where the coefficients were assumed to evolve according to
first-order Markov processes, but that takes us into a Bayesian setup. For
now the important point is that x[n] is given and that we regard h[n,m] as
deterministic.
>> Does this make any sense?
>
> sorta, it's stated in a manner that i think i know what you're saying
> mathematically, but the assumption of normal p.d.f. is not necessary, is
> it?

e[n] is a random variable and we are free to choose any distribution for
it. If we have some prior knowledge about it, for instance that it is heavy
tailed, we might choose a different (heavy-tailed) distribution for e[n].
But we take it to be normal for now. Unfortunately, I don't have internet
access for a week (such places do exist). Maybe Hardy can take over;
otherwise I hope we can take the next 2 or so steps in a week from now.
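As an aside on the Bayesian setup niarn mentions: if the taps are modelled
as a first-order Markov (random-walk) state with Gaussian driving noise,
the standard way to handle that model is a Kalman filter over the
coefficients. The sketch below is not the code niarn pasted earlier (that
code is not shown in this thread); it is only an illustration, with the
state-noise variance q and measurement-noise variance r as assumed
placeholders:

    import numpy as np

    def kalman_taps(x, d, N=8, q=1e-4, r=1e-2):
        """Track FIR taps modelled as a random walk:
             h[n] = h[n-1] + w[n],        w[n] ~ N(0, q*I)
             d[n] = xv[n]' h[n] + e[n],   e[n] ~ N(0, r)
        Returns the tap trajectory and the innovation sequence."""
        h = np.zeros(N)                # posterior mean of the taps
        P = np.eye(N)                  # posterior covariance of the taps
        H = np.zeros((len(x), N))      # tap estimates over time
        e = np.zeros(len(x))
        for n in range(len(x)):
            xv = np.array([x[n - i] if n - i >= 0 else 0.0 for i in range(N)])
            P = P + q * np.eye(N)      # time update (random-walk state)
            e[n] = d[n] - xv @ h       # innovation
            s = xv @ P @ xv + r        # innovation variance
            k = P @ xv / s             # Kalman gain
            h = h + k * e[n]           # measurement update of the taps
            P = P - np.outer(k, xv @ P)
            H[n] = h
        return H, e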
Hi rbj,

Sorry about the delay. Don't know if you're still interested in this. I
have found a reference (a book) that describes exactly the procedure I had
planned to write down. The reference is very easy to follow and I have used
the book myself back when I was a student. The book is "Neural Networks for
Pattern Recognition" by Christopher M. Bishop. Chapter 6 is called "Error
functions" and it starts out by giving an easy to follow outline of the
principle of Maximum Likelihood (ML). 4 pages into chapter 6 on page 194,
second paragraph, I quote 

"We have derived the sum-of-squares error function from the principle of
maximum likelihood on the assumption of Gaussian distributed target data.
Of course the use of a sum-of-squares error does not require the target
data to have a Gaussian distribution. Later in this chapter we shall
consider the least-squares solution for an example problem with a strongly
non-Gaussian distribution. However, as we shall see, if we use a
sum-of-squares error, then the results we obtain cannot distinguish between
the true distribution and any other distribution having the same mean and
variance." 


Maybe that chapter is of interest to you.
On Sunday, July 22, 2012 8:52:20 AM UTC+12, niarn wrote:
> [...]
> "We have derived the sum-of-squares error function from the principle of
> maximum likelihood on the assumption of Gaussian distributed target data.
> Of course the use of a sum-of-squares error does not require the target
> data to have a Gaussian distribution. Later in this chapter we shall
> consider the least-squares solution for an example problem with a strongly
> non-Gaussian distribution. However, as we shall see, if we use a
> sum-of-squares error, then the results we obtain cannot distinguish between
> the true distribution and any other distribution having the same mean and
> variance."
>
> Maybe that chapter is of interest to you.
If you start from maximum likelihood and use a Laplace distribution, you
won't get ordinary LMS; you'll get sign() LMS. Of course, you could argue
that this derivation isn't LMS in the first place.
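For what it's worth, the difference only shows up in the coefficient
update: the Gaussian error model leads (via the gradient of 0.5*e^2) to the
usual e[n]*x[n-i] term, while the Laplace model leads (via the gradient of
|e|) to sign(e[n])*x[n-i], i.e. sign-error LMS. A minimal sketch of the two
updates, with mu and the tap-input vector xv laid out as in the earlier
sketch:

    import numpy as np

    def lms_update(h, xv, e_n, mu):
        """Ordinary LMS step: gradient of 0.5*e^2 (Gaussian error model)."""
        return h + mu * e_n * xv

    def sign_lms_update(h, xv, e_n, mu):
        """Sign-error LMS step: gradient of |e| (Laplace error model)."""
        return h + mu * np.sign(e_n) * xv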