"Jack L." <jack_nospam@nospam.dk> wrote in message news:<bw9zb.50644$jf4.2789643@news000.worldonline.dk>...

> Why is it that we use the minimized mean-square value of e(n) that gives the
> optimum filter? Or more precisely, why use the quantity "mean-square value"
> of a value - why not some, eg. mean-squareroot or whatever one's creativity
> can come up with. What is the consequence of using the mean-square value?

One good reason for using the MSE as a cost function is that it is
usually a  quadratic function of the filter coefficients. Thus the
optimization problem reduces to finding the eigenvector corresponding
to the minimum eigenvalue of a correlation matrix, which is closed
form. If you were to use a metric such as achievable bit-rate which is
a nonlinear function of the filter coefficients then you must use
nonlinear programming techniques to solve for the optimum coefficients
and you may not even achieve the global optimum.

Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FD0D68E.4D874717@arcanemethods.com>...
> Wouldn't it be
> correct to believe that if the result of the calculation is
> a least mean square approximation to the component's actual
> impulse response having a particular length and delay that
> the phase information would be optimally preserved as a side
> effect? 

Actually, I don't know. The problem is that it's very difficult to 
estimate the system impulse response h(t) from non-impulse input data
x(t) in time domain. 

      y(t) = x(t) (*) h(t)       (*) is convolution             [1]

The system output y(t) is just so "messed up" that isolating 
particular elements in the sequence h(t) in time domain becomes 
very difficult. Remember, every element in (the discrete-time sequence) 
y(t) depends on every element in the sequence h(t) as well as all 
past values of x(t). It's just too much of a mess to untangle. 
Transforming the whole mess into frequency domain straightens things 
out a bit, but it's still not easy. 

Most analyses (though not necessarily computations) are done in 
frequency domain to ensure stable inverse filters that also yeld 
results that make some sort of sense.

The basic idea is that if you take the signal y(t) and apply it to 
a new adaptive system G and adjust the impulse response g(t) until 
you reproduce the original input signal x(t), 

      v(t)= y(t) (*) g(t) = [x(t) (*) h(t)] (*) g(t) = x(t)     [2]

it follows that g(t) in some sense "undoes" the effect of h(t), and 
thus is an inverse filter to h(t). In time domain this becomes

    g(t) (*) h(t) = delta(t)                                   [3.a]
    g(t) = h^(-1)(t).                                          [3.b]

Now, if you take this analysis to frequency domain you will find 
that h(t) in most practical cases is described in terms of a rational 
transfer function H(f) and thus is completely characterized by its
poles and zeros. It follows that the inverse filter g(t) in frequency 
domain becomes 

     G(f) = H^(-1)(f) = 1/H(f)                                  [4]

such that the zeros of H(f) becomes the poles of G(f) and vice versa.

Enter the z transform and its requirements on filter stability. 
We know that for a filter to be stable (and we know that h(t) is stable),
all the poles must be inside the unit circle. Thus, all the *zeros* of
G(f) does, by definition, fall strictly inside the unit circle. Next, 
recall that zeros of a stable filter can occur everywhere in z domain. 
Hence, the *poles* of G(f) can potentially pop up anywhere in z domain. 
If a pole occurs on or outside the unit circle, the filter g(t) becomes 
unstable and is useless to us. 

However, further analysis shows that if a pole of G(f) outside the unit 
circle is "wrapped" inside the unit circle, the filter spectrum magnitude 
is preserved in the sense 

     |G(f)| = |1/H(f)|                                           [5]

(note that the absolute values in [5] not present in [4] essentially 
removes phase information in the frequency response) although the 
inverse deconvolution property [3.a] is lost. Some times we decide we 
are content with that. 

So in conclusion, what we want ideally, is an inverse filter g(t) 
that completely "undoes" the effects of h(t) in the sense [3.a] above. 
In the practical world, we have to contend ourselves with a stable filter 
that "merely" re-scales spectrum magnitudes in the sense [5], where the 
phase response (and thus the time-domain impulse response) is left 
unspecified.

> That's the problem I have, understanding the
> frequency domain implications of time domain calculation. 

In my experience, that's *the* problem of DSP.

Rune


Rune Allnor wrote:
> 
> Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FCFBDC9.4E24F00@arcanemethods.com>...
> > Rune Allnor wrote:
> > >
> > >
> > > I would stay away from spectrum divisions. If I understand you correctly,
> > > you have a system like
> > >
> > >                                      Y(f)
> > >    Y(f) = H(f)X(f)     =>    H(f) = ------
> > >                                      X(f)
> >
> > I have the equivalent h(t) = y(t) / x(t) where y(t) is a
> > broadband stimulus "impulse" and x(t) the response of a
> > component of the system to that stimulus.  I'm seeking an
> > FIR approximate inverse of that component of the system
> > under test.
> 
> Hum... does that "/" really mean "sample-wise division", or could it
> be that you look for some sort of deconvolution?

Yes, it represents deconvolution via the LMS approach of
Levinson-Durban as implemented with a Toeplitz solver.

> 
> I'm using the term "wavelet" the way common in seismics, i.e. as the
> pulse waveform emitted by the source. It has little (if anything) to do
> with that signal analysis thing that has become so popular.

Got it.

> 
> > but I don't
> > understand how phase information would be lost or how the
> > phase characteristic would affect the outcome.
> 
> The term "phase" only makes sense in frequency domain, even if
> computations are done on time-domain data. Since you deal
> with autocorrelation functions, phase information is lost because
> the Fourier transform of an autocorrelation function (of a real
> signal) is real. In the case of cross-correlation functions you
> find the principal value of the phase difference in the cross-spectrum.
> The English translation is that you find some phi restricted to
> 
>      0 <= phi < 2pi
> 
> where the "true" phase lag Phi is
> 
>     Phi = phi + 2pi*k
> 
> for some integer k. Again, phase information is lost.
> 
> Phase information is cruical for reconstructing the correct time-domain
> impulse response.
> 

Hmmm.  I've got to chew on that a bit.  Wouldn't it be
correct to believe that if the result of the calculation is
a least mean square approximation to the component's actual
impulse response having a particular length and delay that
the phase information would be optimally preserved as a side
effect?  That's the problem I have, understanding the
frequency domain implications of time domain calculation. 


Bob
-- 

"Things should be described as simply as possible, but no
simpler."

                                             A. Einstein

Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FCFBDC9.4E24F00@arcanemethods.com>...
> Rune Allnor wrote:
> > 
> > 
> > I would stay away from spectrum divisions. If I understand you correctly,
> > you have a system like
> > 
> >                                      Y(f)
> >    Y(f) = H(f)X(f)     =>    H(f) = ------
> >                                      X(f)
> 
> I have the equivalent h(t) = y(t) / x(t) where y(t) is a
> broadband stimulus "impulse" and x(t) the response of a
> component of the system to that stimulus.  I'm seeking an
> FIR approximate inverse of that component of the system
> under test.

Hum... does that "/" really mean "sample-wise division", or could it 
be that you look for some sort of deconvolution?

> > If you do the spectral division you risk problems if X(f) has zeros
> > close to the unit circle (i.e. vanishing spectrum coefficients for
> > some f). The system response just blows up and makes little sense.
> > So while the frequency division trick is "obvious" and easy to implement,
> > I wouldn't expect it to give useful results.
> 
> The usual trick is to add some kind of small
> "regularization" value to X(F) but I have doubts as to the
> efficacy of that as well as the time domain aliasing that
> can result.

I have seen that sort of trick used in other contexts. It solves 
the immediate numerical concern, but, as you say, it may introduce
artifacts in the signal. 

> > The pros of Levinson-based LPC methods is that the inverse filter is
> > stable, and is easy to implement/has fast run-time once the filter
> > coefficients have been estimated.. The cons is that it may be difficult
> > to estimate those coefficients, as the estimate is based on a signal
> > autocovariance function that may be difficult to estimate, and that
> > phase information is lost. If your (pulse) input signal is something
> > other than a minimum-phase wavelet, the LPC inverse filter will not
> > represent the correct source wavelet.
> 
> I'm not working with wavelets here and am not sure as to how
> your comments about them relate. 

I'm using the term "wavelet" the way common in seismics, i.e. as the 
pulse waveform emitted by the source. It has little (if anything) to do 
with that signal analysis thing that has become so popular.

> I just have the time
> domain responses of two points on either side of a component
> of the system to which an impulse is applied.  The
> Levinson-Durban approach involves taking the autocorrelation
> of x(t) and the cross correlation of y(t) and x(t) before
> handing the results to a Toeplitz solver 

I think I have seen something similar in a discussion of Wiener 
filters...

> but I don't
> understand how phase information would be lost or how the
> phase characteristic would affect the outcome.

The term "phase" only makes sense in frequency domain, even if 
computations are done on time-domain data. Since you deal 
with autocorrelation functions, phase information is lost because 
the Fourier transform of an autocorrelation function (of a real 
signal) is real. In the case of cross-correlation functions you 
find the principal value of the phase difference in the cross-spectrum.
The English translation is that you find some phi restricted to 

     0 <= phi < 2pi

where the "true" phase lag Phi is

    Phi = phi + 2pi*k

for some integer k. Again, phase information is lost.

Phase information is cruical for reconstructing the correct time-domain 
impulse response. 

Rune

Rune Allnor wrote:
> 
> 
> I would stay away from spectrum divisions. If I understand you correctly,
> you have a system like
> 
>                                      Y(f)
>    Y(f) = H(f)X(f)     =>    H(f) = ------
>                                      X(f)

I have the equivalent h(t) = y(t) / x(t) where y(t) is a
broadband stimulus "impulse" and x(t) the response of a
component of the system to that stimulus.  I'm seeking an
FIR approximate inverse of that component of the system
under test.

> 
> If you do the spectral division you risk problems if X(f) has zeros
> close to the unit circle (i.e. vanishing spectrum coefficients for
> some f). The system response just blows up and makes little sense.
> So while the frequency division trick is "obvious" and easy to implement,
> I wouldn't expect it to give useful results.

The usual trick is to add some kind of small
"regularization" value to X(F) but I have doubts as to the
efficacy of that as well as the time domain aliasing that
can result.

> 
> The pros of Levinson-based LPC methods is that the inverse filter is
> stable, and is easy to implement/has fast run-time once the filter
> coefficients have been estimated.. The cons is that it may be difficult
> to estimate those coefficients, as the estimate is based on a signal
> autocovariance function that may be difficult to estimate, and that
> phase information is lost. If your (pulse) input signal is something
> other than a minimum-phase wavelet, the LPC inverse filter will not
> represent the correct source wavelet.

I'm not working with wavelets here and am not sure as to how
your comments about them relate.  I just have the time
domain responses of two points on either side of a component
of the system to which an impulse is applied.  The
Levinson-Durban approach involves taking the autocorrelation
of x(t) and the cross correlation of y(t) and x(t) before
handing the results to a Toeplitz solver but I don't
understand how phase information would be lost or how the
phase characteristic would affect the outcome.

Thanks,

Bob
-- 

"Things should be described as simply as possible, but no
simpler."

                                             A. Einstein


Matt Timmermans wrote:
> 
> Hi Bob,
> 
> I have never actually tried to use your first method, but it has two
> problems that I can see:
> 
> 1) You're asking for the actual inverse of the denominator, and I don't see
> any guarantees that the actual inverse will be causal.  And you have to
> decide what to do when the denominator response is close to zero.
> 
> 2) The impulse response you're appoximating is infinite, so there will be
> some aliasing as it wraps around in the convolution buffer.  I don't know
> much about the practical nature of this error, which would make me avoid it.
> Presumably it's moot if you use a long enough buffer, but we don't know how
> long that really is.  And if the imulse response is acausal, this doesn't
> help.
> 
> The Toeplitz solver will be more predictable.

Thanks for the sanity check, Matt.  Those are the same
conclusions I had come to.


Bob
-- 

"Things should be described as simply as possible, but no
simpler."

                                             A. Einstein

Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FCE3B14.86BC05B@arcanemethods.com>...
> Matt Timmermans wrote:
> > 
> > - Orthogonal transforms that preserve energy, like the Fourier transform,
> > also preserve squared error.  So if your goal is to minimize squared error,
> > you can do it in the time or frequency domain and it will mean the same
> > thing.  You can, in fact, combine time and frequency-based goals  and
> > minimize the error w.r.t. both simultaneously without difficulty.
> > 
> 
> Matt, thanks for the great list of attributes.  I wonder if
> you could expand on this one a bit.  I'm trying to determine
> whether it is better to form a transfer function impulse
> response by either the complex division of the transforms of
> the numerator and denominator impulse responses followed by
> the inverse transform or by doing the division in the time
> domain with a Toeplitz solver (which minimizes the squared
> error in the time domain at the expense of an O(N^2)
> calculation.)  If you have insight on the pros and cons of
> the two approaches or can say more about how to combine
> methods for better or faster results I'd greatly appreciate
> it.

I would stay away from spectrum divisions. If I understand you correctly, 
you have a system like

                                     Y(f)
   Y(f) = H(f)X(f)     =>    H(f) = ------
                                     X(f)

in frequency domain, where the output Y(f) and the input X(f) are known 
and the system function H(f) is to be estimated?

If you do the spectral division you risk problems if X(f) has zeros
close to the unit circle (i.e. vanishing spectrum coefficients for 
some f). The system response just blows up and makes little sense. 
So while the frequency division trick is "obvious" and easy to implement, 
I wouldn't expect it to give useful results.

The pros of Levinson-based LPC methods is that the inverse filter is 
stable, and is easy to implement/has fast run-time once the filter 
coefficients have been estimated.. The cons is that it may be difficult 
to estimate those coefficients, as the estimate is based on a signal 
autocovariance function that may be difficult to estimate, and that 
phase information is lost. If your (pulse) input signal is something 
other than a minimum-phase wavelet, the LPC inverse filter will not 
represent the correct source wavelet.

There have been some efforts to estimate the mixed-phase source wavelet 
in seismic problems. The approaches I've seen involves measuring the 
estimation error using "arcane" norms (I think L_5 was popular) and also 
involved a search using genethic algorithms. Hardly a real-time method, 
in other words. 

If you want to know more, check out Porsani and Ursin's papers in 
Geophysics from 1998 and onwards.

Rune

"Bob Cain" <arcane@arcanemethods.com> wrote in message
news:3FCE3B14.86BC05B@arcanemethods.com...
> Matt, thanks for the great list of attributes.  I wonder if
> you could expand on this one a bit.  I'm trying to determine
> whether it is better to form a transfer function impulse
> response by either the complex division of the transforms of
> the numerator and denominator impulse responses followed by
> the inverse transform or by doing the division in the time
> domain with a Toeplitz solver (which minimizes the squared
> error in the time domain at the expense of an O(N^2)
> calculation.)  If you have insight on the pros and cons of
> the two approaches or can say more about how to combine
> methods for better or faster results I'd greatly appreciate
> it.

Hi Bob,

I have never actually tried to use your first method, but it has two
problems that I can see:

1) You're asking for the actual inverse of the denominator, and I don't see
any guarantees that the actual inverse will be causal.  And you have to
decide what to do when the denominator response is close to zero.

2) The impulse response you're appoximating is infinite, so there will be
some aliasing as it wraps around in the convolution buffer.  I don't know
much about the practical nature of this error, which would make me avoid it.
Presumably it's moot if you use a long enough buffer, but we don't know how
long that really is.  And if the imulse response is acausal, this doesn't
help.

The Toeplitz solver will be more predictable.


"Jack L." wrote:

> Hello group.
>
> Quote from Simon Haykin's book "Adaptive Filter Theory", 4th edition, on the
> Wiener filter (chap. 2):
>
> "We now summarize the essence of the filtering problem with the following
> statement:
>
> Design a linear discrete-time filter whose output y(n) provides an estimate
> of a desired response d(n), given a set of input samples u(0), u(1),
> u(2),..., such that the MEAN-SQUARE VALUE of the estimation error e(n),
> defined as the difference between the desired response d(n) and the actual
> response y(n), is minimized."
>
> Why is it that we use the minimized mean-square value of e(n) that gives the
> optimum filter? Or more precisely, why use the quantity "mean-square value"
> of a value - why not some, eg. mean-squareroot or whatever one's creativity
> can come up with. What is the consequence of using the mean-square value?
>
> --
> Mvh / Best regards,
> Jack, Copenhagen
>
> The email address is for real. :)

I have seen many other criteria -for instance E[e^4(t)], modulus of  error and
of course H infinity optimisation.
For a dc free error the mean squared value is average power of the error so it
makes some engineering sense.
(its also easy to get an answer!)
Tom

Matt Timmermans wrote:
> 
> - Orthogonal transforms that preserve energy, like the Fourier transform,
> also preserve squared error.  So if your goal is to minimize squared error,
> you can do it in the time or frequency domain and it will mean the same
> thing.  You can, in fact, combine time and frequency-based goals  and
> minimize the error w.r.t. both simultaneously without difficulty.
> 

Matt, thanks for the great list of attributes.  I wonder if
you could expand on this one a bit.  I'm trying to determine
whether it is better to form a transfer function impulse
response by either the complex division of the transforms of
the numerator and denominator impulse responses followed by
the inverse transform or by doing the division in the time
domain with a Toeplitz solver (which minimizes the squared
error in the time domain at the expense of an O(N^2)
calculation.)  If you have insight on the pros and cons of
the two approaches or can say more about how to combine
methods for better or faster results I'd greatly appreciate
it.

Bob
-- 

"Things should be described as simply as possible, but no
simpler."

                                             A. Einstein