
Basic problem about cost function of LMS noise whitening filter

Started by samwo123 December 10, 2007
X(z)---> H(z)----->Y(z)

I want to implement an adaptive noise whitening filter h(n).
The desired property of y(n) is that E{|y(n)|^2} = SigmaD (constant).
H(z) is the transfer function of an all-pole IIR filter of order N:
H(z) = 1/(1+sum(h_i*z^-i)).

The adaptation rule for h(n) is
h_i(n+1) = h_i(n) - mu*dJ/dh_i, where J is the cost function.

In the papers I have, J = E{|y(n)|^2}.

The goal of h(n) is to whiten the noise so that its output has constant
variance = SigmaD.

Why is that the same as minimizing the variance of the output, i.e., why is
J = E{|y(n)|^2}?
Shouldn't J be the CMA cost, J = E{|y(n)|^2 - SigmaD}?
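
In rough Python form, the adaptation loop I have in mind looks like the sketch
below. This is only an illustration: the input is placeholder white noise, the
order and step size are arbitrary, the expectation in J is replaced by the
instantaneous value |y(n)|^2, and the recursive part of the IIR gradient is
ignored (dy(n)/dh_i is approximated by -y(n-i)).

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)   # placeholder input; in practice, the noise to be whitened

N = 4          # filter order (arbitrary)
mu = 1e-3      # step size (arbitrary)
h = np.zeros(N)        # coefficients h_1 ... h_N
y_hist = np.zeros(N)   # [y(n-1), ..., y(n-N)]
y = np.zeros_like(x)

for n in range(len(x)):
    # all-pole filter: y(n) = x(n) - sum_i h_i * y(n-i)
    y[n] = x[n] - np.dot(h, y_hist)
    # instantaneous cost |y(n)|^2; with dy(n)/dh_i ~ -y(n-i),
    # h_i <- h_i - mu*dJ/dh_i becomes h_i <- h_i + mu*y(n)*y(n-i)
    # (the factor of 2 is folded into mu)
    h = h + mu * y[n] * y_hist
    # shift the output history
    y_hist = np.roll(y_hist, 1)
    y_hist[0] = y[n]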


Regards,
Sam

On Dec 10, 4:41 pm, "samwo123" <sup...@gmail.com> wrote:
> X(z)---> H(z)----->Y(z)
>
> I want to implement an adaptive noise whitening filter h(n).
> The desired property of y(n) is that E{|y(n)|^2} = SigmaD (constant).
> H(z) is the transfer function of an all-pole IIR filter of order N:
> H(z) = 1/(1+sum(h_i*z^-i)).
>
> The adaptation rule for h(n) is
> h_i(n+1) = h_i(n) - mu*dJ/dh_i, where J is the cost function.
>
> In the papers I have, J = E{|y(n)|^2}.
>
> The goal of h(n) is to whiten the noise so that its output has constant
> variance = SigmaD.
>
> Why is that the same as minimizing the variance of the output, i.e., why is
> J = E{|y(n)|^2}?
> Shouldn't J be the CMA cost, J = E{|y(n)|^2 - SigmaD}?
>
> Regards,
> Sam
This isn't a direct answer to your question, but have you thought about
implementing the whitener in the frequency domain, using an overlap-add or
overlap-save algorithm with FFT bin modifications?

John
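
Very roughly, something along these lines (an untested sketch, not a drop-in
implementation: the block length, overlap, smoothing constant, and the helper
name whiten_ola are all just illustrative choices, and a proper overlap-save
design would need more care):

import numpy as np

def whiten_ola(x, nfft=256, alpha=0.9, eps=1e-8):
    hop = nfft // 2
    win = np.hanning(nfft)          # Hann windows at 50% overlap sum to ~1
    mag = np.ones(nfft // 2 + 1)    # running per-bin magnitude estimate
    y = np.zeros(len(x))
    for start in range(0, len(x) - nfft, hop):
        X = np.fft.rfft(win * x[start:start + nfft])
        mag = alpha * mag + (1 - alpha) * np.abs(X)   # smooth the spectrum estimate
        Y = X / (mag + eps)                           # flatten (whiten) the bins
        y[start:start + nfft] += np.fft.irfft(Y, nfft)
    return y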
> Why is that the same as minimizing the variance of the output, i.e., why is
> J = E{|y(n)|^2}?
> Shouldn't J be the CMA cost, J = E{|y(n)|^2 - SigmaD}?
>
> Regards,
> Sam
Note that the two approaches that you suggested are equivalent. If SigmaD is a
positive constant, then minimizing E(|y[n]|^2) is equivalent to minimizing
E(|y[n]|^2 - SigmaD).

However, whitening noise isn't equivalent to forcing its variance to a constant
value. A noise process could have a known constant variance, but that says
nothing of its color. Colors of noise are usually classified by the shape of the
process's PSD, which in turn points to the level of correlation between
different samples from the process. White noise has a flat power spectrum,
meaning that any two samples taken from the process are uncorrelated. So your
adaptive whitener is really trying to generate an output sequence where
successive samples are not correlated with each other.

I'm not a guru on this topic at all, but the one filter structure that I know of
that has this property is the forward linear prediction-error filter. The basic
idea behind it is that you're trying to predict future values of a process based
upon previous observations; the output of the filter is the error between your
prediction and the actual new value. If the filter is long enough, you can
assume that your vector of observations (the tap-input vector) is "large" enough
to span the entire space of values that the process will take on (and therefore
give you a perfect prediction if your taps are right). Any residual error, which
is the output of the filter, is uncorrelated with the tap-input vector, and
therefore is uncorrelated with all past inputs of the filter. Thus, the sequence
observed at the filter output is a stream of samples that are (almost)
uncorrelated with each other, which looks like white noise.

Or something like that. I probably got some of the terms wrong, but I hope you
get the idea.

Jason
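
As a small illustration of the idea (not your adaptive setup, just a
fixed-coefficient example; the AR(2) coefficients, order, and record length are
arbitrary): colour some white noise, solve the Yule-Walker normal equations for
the predictor, and check that the prediction error is close to white again.

import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
w = rng.standard_normal(20000)
x = lfilter([1.0], [1.0, -1.5, 0.7], w)   # coloured (AR(2)) noise

p = 2
# sample autocorrelation r[0..p]
r = np.array([np.dot(x[:len(x) - k], x[k:]) / len(x) for k in range(p + 1)])

# Yule-Walker normal equations: R a = -r[1..p]
R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
a = np.linalg.solve(R, -r[1:])

# prediction-error filter A(z) = 1 + a1*z^-1 + a2*z^-2 applied to x
e = lfilter(np.concatenate(([1.0], a)), [1.0], x)

print("lag-1 correlation of x:", np.corrcoef(x[:-1], x[1:])[0, 1])
print("lag-1 correlation of e:", np.corrcoef(e[:-1], e[1:])[0, 1])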
>On Dec 10, 4:41 pm, "samwo123" <sup...@gmail.com> wrote:
>> X(z)---> H(z)----->Y(z)
>>
>> I want to implement an adaptive noise whitening filter h(n).
>> The desired property of y(n) is that E{|y(n)|^2} = SigmaD (constant).
>> H(z) is the transfer function of an all-pole IIR filter of order N:
>> H(z) = 1/(1+sum(h_i*z^-i)).
>>
>> The adaptation rule for h(n) is
>> h_i(n+1) = h_i(n) - mu*dJ/dh_i, where J is the cost function.
>>
>> In the papers I have, J = E{|y(n)|^2}.
>>
>> The goal of h(n) is to whiten the noise so that its output has constant
>> variance = SigmaD.
>>
>> Why is that the same as minimizing the variance of the output, i.e., why is
>> J = E{|y(n)|^2}?
>> Shouldn't J be the CMA cost, J = E{|y(n)|^2 - SigmaD}?
>>
>> Regards,
>> Sam
>
>This isn't a direct answer to your question, but have you thought
>about implementing the whitener in the frequency domain, using an
>overlap-add or overlap-save algorithm with FFT bin modifications?
>
>John
I haven't thought about that. I want to implement it as an adaptive component,
and I'm not very familiar with adaptive filtering in the frequency domain.

Regards,
Sam
>> Why is that the same as minimizing the variance of the output, i.e., why is
>> J = E{|y(n)|^2}?
>> Shouldn't J be the CMA cost, J = E{|y(n)|^2 - SigmaD}?
>>
>> Regards,
>> Sam
>
>Note that the two approaches that you suggested are equivalent. If
>SigmaD is a positive constant, then minimizing E(|y[n]|^2) is
>equivalent to minimizing E(|y[n]|^2 - SigmaD).
>
>However, whitening noise isn't equivalent to forcing its variance to a
>constant value. A noise process could have a known constant variance,
>but that says nothing of its color. Colors of noise are usually
>classified by the shape of the process's PSD, which in turn points to
>the level of correlation between different samples from the process.
>White noise has a flat power spectrum, meaning that any two samples
>taken from the process are uncorrelated. So your adaptive whitener is
>really trying to generate an output sequence where successive samples
>are not correlated with each other.
>
>I'm not a guru on this topic at all, but the one filter structure that
>I know of that has this property is the forward linear prediction-error
>filter. The basic idea behind it is that you're trying to predict
>future values of a process based upon previous observations; the
>output of the filter is the error between your prediction and the
>actual new value. If the filter is long enough, you can assume that
>your vector of observations (the tap-input vector) is "large" enough
>to span the entire space of values that the process will take on (and
>therefore give you a perfect prediction if your taps are right). Any
>residual error, which is the output of the filter, is uncorrelated
>with the tap-input vector, and therefore is uncorrelated with all past
>inputs of the filter. Thus, the sequence observed at the filter output
>is a stream of samples that are (almost) uncorrelated with each other,
>which looks like white noise.
>
>Or something like that. I probably got some of the terms wrong, but I
>hope you get the idea.
>
>Jason
Thank you for your explanation. I now have a better understanding of the
problem. I had a small typo in my original post, though: I should have used
SigmaD^2 instead of SigmaD.
>the output of the filter is the error between your prediction and the
>actual new value. If the filter is long enough,
So the cost function in that case would be J = E{|y-d|^2}, where y is the output
of the linear predictor and d is the known desired value. In my setup I have no
knowledge of d. I guess the best I can do is blind adaptation using
E{||y[n]|^2 - SigmaD^2|} or E{|y[n]|^2} as my cost function.
>However, whitening noise isn't equivalent to forcing its variance to a
>constant value. A noise process could have a known constant variance,
>but that says nothing of its color.
I'm still a little confused here. White noise, say w(k), has a constant PSD
across its spectrum. That translates to E{w(k)*conj(w(k-d))} = SigmaD^2 for
d = 0, and = 0 elsewhere. Aren't whitening the noise and forcing the process to
have constant variance therefore equivalent?

Regards,
Sam
On Dec 12, 11:10 am, "samwo123" <sup...@gmail.com> wrote:
> So the cost function in that case would be J = E{|y-d|^2}, where y is the
> output of the linear predictor and d is the known desired value. In my setup
> I have no knowledge of d. I guess the best I can do is blind adaptation using
> E{||y[n]|^2 - SigmaD^2|} or E{|y[n]|^2} as my cost function.
For an Nth-order forward linear prediction-error filter, you use a vector
containing the samples x[n-1], x[n-2], ..., x[n-N] to predict the value of x[n].
So, in effect, you do have a desired signal to use for adaptation: the current
value of the input signal x[n]. In the common LMS nomenclature:

input signal: x[n]
tap-input vector: u = [ x[n-1] x[n-2] ... x[n-N] ]
desired signal: x[n]

I hope this makes it clearer that this is not a blind adaptation algorithm. I
would recommend Haykin's Adaptive Filter Theory text if you have access to it;
there is a good chapter on linear prediction.
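
In code, that mapping looks roughly like this (a bare-bones sketch assuming
real-valued signals; the order and step size are arbitrary and untuned):

import numpy as np

def lms_prediction_error(x, N=8, mu=0.01):
    w = np.zeros(N)            # predictor taps
    e = np.zeros(len(x))
    for n in range(N, len(x)):
        u = x[n - N:n][::-1]   # tap-input vector [x[n-1], ..., x[n-N]]
        y = np.dot(w, u)       # prediction of x[n]
        e[n] = x[n] - y        # prediction error = (approximately) whitened sample
        w = w + mu * e[n] * u  # standard LMS update, desired signal = x[n]
    return e

Feeding it your coloured noise and looking at the autocorrelation of e should
show the whitening effect once the taps have converged.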
> >However, whitening noise isn't equivalent to forcing its variance to a
> >constant value. A noise process could have a known constant variance,
> >but that says nothing of its color.
>
> I'm still a little confused here. White noise, say w(k), has a constant
> PSD across its spectrum. That translates to E{w(k)*conj(w(k-d))} = SigmaD^2
> for d = 0, and = 0 elsewhere. Aren't whitening the noise and forcing the
> process to have constant variance therefore equivalent?
>
> Regards,
> Sam
They are not equivalent. Forcing a process to have constant variance would imply
that you are forcing it to be wide-sense stationary. Its stationarity has
nothing to do with the correlation between samples of the process, which is what
makes it "white." You can force the process's variance to be a constant, but
that makes no guarantee that the result will be white.

In the most ridiculous example, make a filter with all zero taps. The output
variance is a constant (zero), but it sure isn't white.

Jason
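
A quick numerical sanity check of that point (the smoothing filter here is an
arbitrary choice): both sequences below have unit variance, but only one of them
is white.

import numpy as np

rng = np.random.default_rng(2)
white = rng.standard_normal(50000)

# heavily smoothed (coloured) noise, rescaled back to unit variance
coloured = np.convolve(white, np.ones(8) / 8, mode="same")
coloured /= coloured.std()

for name, s in [("white", white), ("coloured", coloured)]:
    lag1 = np.corrcoef(s[:-1], s[1:])[0, 1]
    print(name, "variance:", round(s.var(), 3), "lag-1 correlation:", round(lag1, 3))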