
Why minimising in the mean-error sense.

Started by Jack L. December 2, 2003
"Bob Cain" <arcane@arcanemethods.com> wrote in message
news:3FCE3B14.86BC05B@arcanemethods.com...
> Matt, thanks for the great list of attributes.  I wonder if
> you could expand on this one a bit.  I'm trying to determine
> whether it is better to form a transfer function impulse
> response by either the complex division of the transforms of
> the numerator and denominator impulse responses followed by
> the inverse transform, or by doing the division in the time
> domain with a Toeplitz solver (which minimizes the squared
> error in the time domain at the expense of an O(N^2)
> calculation).  If you have insight on the pros and cons of
> the two approaches, or can say more about how to combine
> methods for better or faster results, I'd greatly appreciate
> it.
Hi Bob,

I have never actually tried to use your first method, but it has two
problems that I can see:

1) You're asking for the actual inverse of the denominator, and I don't
see any guarantees that the actual inverse will be causal.  And you have
to decide what to do when the denominator response is close to zero.

2) The impulse response you're approximating is infinite, so there will
be some aliasing as it wraps around in the convolution buffer.  I don't
know much about the practical nature of this error, which would make me
avoid it.  Presumably it's moot if you use a long enough buffer, but we
don't know how long that really is.  And if the impulse response is
acausal, this doesn't help.

The Toeplitz solver will be more predictable.
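Matt's point 2 is easy to demonstrate numerically.  The exact inverse of
the hypothetical FIR denominator [1, -0.9] is the infinite sequence
0.9^n, so inverting inside a finite FFT buffer circularly aliases the
tail.  A minimal numpy sketch (the filter and buffer sizes are
illustrative, not from the thread):

```python
import numpy as np

a = 0.9
denom = np.array([1.0, -a])   # FIR whose exact inverse is a**n for n >= 0

def fft_inverse(d, N):
    """Invert a filter inside a length-N FFT buffer: ifft(1 / fft(d))."""
    D = np.fft.fft(d, N)
    return np.real(np.fft.ifft(1.0 / D))

n = np.arange(32)
true_inv = a ** n                      # first 32 taps of the infinite inverse

short = fft_inverse(denom, 32)         # short buffer: the tail wraps around
long_ = fft_inverse(denom, 4096)[:32]  # long buffer: aliasing is negligible

# The wrapped tail biases every tap by a factor 1/(1 - a**N).
err_short = np.max(np.abs(short - true_inv))
err_long = np.max(np.abs(long_ - true_inv))
print(err_short, err_long)             # err_short ~3.6e-2, err_long near 0
```

A slowly decaying (or acausal) inverse only makes the short-buffer error
worse, which is Matt's "we don't know how long the buffer needs to be".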
Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FCE3B14.86BC05B@arcanemethods.com>...
> Matt Timmermans wrote:
> >
> > - Orthogonal transforms that preserve energy, like the Fourier transform,
> > also preserve squared error.  So if your goal is to minimize squared
> > error, you can do it in the time or frequency domain and it will mean
> > the same thing.  You can, in fact, combine time and frequency-based
> > goals and minimize the error w.r.t. both simultaneously without
> > difficulty.
>
> Matt, thanks for the great list of attributes.  I wonder if
> you could expand on this one a bit.  I'm trying to determine
> whether it is better to form a transfer function impulse
> response by either the complex division of the transforms of
> the numerator and denominator impulse responses followed by
> the inverse transform, or by doing the division in the time
> domain with a Toeplitz solver (which minimizes the squared
> error in the time domain at the expense of an O(N^2)
> calculation).  If you have insight on the pros and cons of
> the two approaches, or can say more about how to combine
> methods for better or faster results, I'd greatly appreciate
> it.
I would stay away from spectrum divisions.  If I understand you correctly,
you have a system like

                                Y(f)
    Y(f) = H(f)X(f) => H(f) = ------
                                X(f)

in the frequency domain, where the output Y(f) and the input X(f) are
known and the system function H(f) is to be estimated?  If you do the
spectral division you risk problems if X(f) has zeros close to the unit
circle (i.e. vanishing spectrum coefficients for some f).  The system
response just blows up and makes little sense.  So while the frequency
division trick is "obvious" and easy to implement, I wouldn't expect it
to give useful results.

The pros of Levinson-based LPC methods are that the inverse filter is
stable, and that it is easy to implement and has a fast run-time once the
filter coefficients have been estimated.  The cons are that it may be
difficult to estimate those coefficients, as the estimate is based on a
signal autocovariance function that may be difficult to estimate, and
that phase information is lost.  If your (pulse) input signal is
something other than a minimum-phase wavelet, the LPC inverse filter will
not represent the correct source wavelet.

There have been some efforts to estimate the mixed-phase source wavelet
in seismic problems.  The approaches I've seen involve measuring the
estimation error using "arcane" norms (I think L_5 was popular) and also
involved a search using genetic algorithms.  Hardly a real-time method,
in other words.  If you want to know more, check out Porsani and Ursin's
papers in Geophysics from 1998 and onwards.

Rune
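The blow-up is easy to reproduce: feed a narrowband input (so X(f) has
essentially empty bins) through any small FIR and divide the spectra.  A
numpy sketch with made-up signals:

```python
import numpy as np

# Hypothetical 3-tap system and a narrowband input: a pure cosine puts
# energy in only two FFT bins, leaving the rest at round-off level.
h = np.array([1.0, -0.5, 0.25])
n = np.arange(64)
x = np.cos(2 * np.pi * 8 * n / 64)
y = np.convolve(x, h)[:64]

X = np.fft.fft(x)
Y = np.fft.fft(y)

# Naive spectral division: the near-empty bins of X(f) dominate the result.
H_naive = Y / (X + 1e-300)     # tiny offset only to dodge a literal 0/0
blowup = np.max(np.abs(H_naive))
print(blowup)                  # astronomically large, i.e. useless
```

The estimate is dominated by noise in exactly the bins where the input
carries no information, which is Rune's point.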

Matt Timmermans wrote:
>
> Hi Bob,
>
> I have never actually tried to use your first method, but it has two
> problems that I can see:
>
> 1) You're asking for the actual inverse of the denominator, and I don't
> see any guarantees that the actual inverse will be causal.  And you have
> to decide what to do when the denominator response is close to zero.
>
> 2) The impulse response you're approximating is infinite, so there will
> be some aliasing as it wraps around in the convolution buffer.  I don't
> know much about the practical nature of this error, which would make me
> avoid it.  Presumably it's moot if you use a long enough buffer, but we
> don't know how long that really is.  And if the impulse response is
> acausal, this doesn't help.
>
> The Toeplitz solver will be more predictable.
Thanks for the sanity check, Matt.  Those are the same conclusions I had
come to.

Bob
--
"Things should be described as simply as possible, but no simpler."
                                              A. Einstein

Rune Allnor wrote:
>
> I would stay away from spectrum divisions.  If I understand you correctly,
> you have a system like
>
>                                 Y(f)
>     Y(f) = H(f)X(f) => H(f) = ------
>                                 X(f)
I have the equivalent h(t) = y(t) / x(t) where y(t) is a broadband stimulus "impulse" and x(t) the response of a component of the system to that stimulus. I'm seeking an FIR approximate inverse of that component of the system under test.
>
> If you do the spectral division you risk problems if X(f) has zeros
> close to the unit circle (i.e. vanishing spectrum coefficients for
> some f).  The system response just blows up and makes little sense.
> So while the frequency division trick is "obvious" and easy to implement,
> I wouldn't expect it to give useful results.
The usual trick is to add some kind of small "regularization" value to
X(f), but I have doubts as to the efficacy of that, as well as about the
time-domain aliasing that can result.
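For the record, the regularized division Bob mentions is usually written
in the Tikhonov/Wiener-like form H(f) = Y(f)*conj(X(f)) / (|X(f)|^2 + eps).
A numpy sketch (the function name and the test signals are hypothetical):

```python
import numpy as np

def regularized_deconv(y, x, eps=1e-6):
    """Regularized spectral division: a sketch of the usual trick.

    Estimates h with y ~ h (*) x by damping the bins where |X(f)| is
    small instead of dividing by a near-zero value.
    """
    n = len(x)
    X = np.fft.fft(x, n)
    Y = np.fft.fft(y, n)
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
    return np.real(np.fft.ifft(H))

# Hypothetical check with a broadband input and circular convolution,
# where the method is at its best (no near-zero bins, no edge effects).
h_true = np.zeros(64)
h_true[:3] = [1.0, -0.5, 0.25]
rng = np.random.default_rng(1)
x = rng.standard_normal(64)
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h_true)))
err = np.max(np.abs(regularized_deconv(y, x) - h_true))
print(err)   # small here; a narrowband x would degrade it badly
```

The choice of eps trades numerical safety against exactly the bias and
time-domain aliasing Bob worries about.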
>
> The pros of Levinson-based LPC methods is that the inverse filter is
> stable, and is easy to implement/has fast run-time once the filter
> coefficients have been estimated.  The cons is that it may be difficult
> to estimate those coefficients, as the estimate is based on a signal
> autocovariance function that may be difficult to estimate, and that
> phase information is lost.  If your (pulse) input signal is something
> other than a minimum-phase wavelet, the LPC inverse filter will not
> represent the correct source wavelet.
I'm not working with wavelets here and am not sure how your comments
about them relate.  I just have the time-domain responses of two points
on either side of a component of the system to which an impulse is
applied.  The Levinson-Durbin approach involves taking the
autocorrelation of x(t) and the cross-correlation of y(t) and x(t)
before handing the results to a Toeplitz solver, but I don't understand
how phase information would be lost or how the phase characteristic
would affect the outcome.

Thanks,

Bob
--
"Things should be described as simply as possible, but no simpler."
                                              A. Einstein
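Bob's description corresponds to the Wiener/least-squares normal
equations R_xx h = r_yx, whose Toeplitz structure is what the Levinson
recursion exploits.  A sketch using scipy's solver (the signals and tap
count are made up for illustration):

```python
import numpy as np
from scipy.linalg import solve_toeplitz   # Levinson-recursion solver, O(n^2)

def wiener_fir(x, y, ntaps):
    """Least-squares FIR h with y ~ h (*) x, via Toeplitz normal equations."""
    mid = len(x) - 1
    # Autocorrelation of x at lags 0..ntaps-1: first column/row of R_xx
    r_xx = np.correlate(x, x, mode="full")[mid:mid + ntaps]
    # Cross-correlation of y with x at the same lags: right-hand side r_yx
    r_yx = np.correlate(y, x, mode="full")[mid:mid + ntaps]
    return solve_toeplitz((r_xx, r_xx), r_yx)

# Hypothetical check: recover a known 3-tap system from a broadband input.
rng = np.random.default_rng(2)
x = rng.standard_normal(4096)
h_true = np.array([1.0, -0.5, 0.25])
y = np.convolve(x, h_true)[:len(x)]
h_est = wiener_fir(x, y, 3)
print(h_est)   # close to [1.0, -0.5, 0.25]
```

Note that the cross-correlation r_yx, unlike the autocorrelation r_xx,
does carry relative phase information between the two signals.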
Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FCFBDC9.4E24F00@arcanemethods.com>...
> Rune Allnor wrote:
> >
> > I would stay away from spectrum divisions.  If I understand you
> > correctly, you have a system like
> >
> >                                 Y(f)
> >     Y(f) = H(f)X(f) => H(f) = ------
> >                                 X(f)
>
> I have the equivalent h(t) = y(t) / x(t) where y(t) is a
> broadband stimulus "impulse" and x(t) the response of a
> component of the system to that stimulus.  I'm seeking an
> FIR approximate inverse of that component of the system
> under test.
Hum... does that "/" really mean "sample-wise division", or could it be that you look for some sort of deconvolution?
> > If you do the spectral division you risk problems if X(f) has zeros
> > close to the unit circle (i.e. vanishing spectrum coefficients for
> > some f).  The system response just blows up and makes little sense.
> > So while the frequency division trick is "obvious" and easy to implement,
> > I wouldn't expect it to give useful results.
>
> The usual trick is to add some kind of small
> "regularization" value to X(f) but I have doubts as to the
> efficacy of that as well as the time domain aliasing that
> can result.
I have seen that sort of trick used in other contexts. It solves the immediate numerical concern, but, as you say, it may introduce artifacts in the signal.
> > The pros of Levinson-based LPC methods is that the inverse filter is
> > stable, and is easy to implement/has fast run-time once the filter
> > coefficients have been estimated.  The cons is that it may be difficult
> > to estimate those coefficients, as the estimate is based on a signal
> > autocovariance function that may be difficult to estimate, and that
> > phase information is lost.  If your (pulse) input signal is something
> > other than a minimum-phase wavelet, the LPC inverse filter will not
> > represent the correct source wavelet.
>
> I'm not working with wavelets here and am not sure as to how
> your comments about them relate.
I'm using the term "wavelet" in the way common in seismics, i.e. as the
pulse waveform emitted by the source.  It has little (if anything) to do
with that signal-analysis thing that has become so popular.
> I just have the time
> domain responses of two points on either side of a component
> of the system to which an impulse is applied.  The
> Levinson-Durbin approach involves taking the autocorrelation
> of x(t) and the cross correlation of y(t) and x(t) before
> handing the results to a Toeplitz solver
I think I have seen something similar in a discussion of Wiener filters...
> but I don't
> understand how phase information would be lost or how the
> phase characteristic would affect the outcome.
The term "phase" only makes sense in the frequency domain, even if
computations are done on time-domain data.  Since you deal with
autocorrelation functions, phase information is lost because the Fourier
transform of an autocorrelation function (of a real signal) is real.  In
the case of cross-correlation functions you find the principal value of
the phase difference in the cross-spectrum.  The English translation is
that you find some phi restricted to

    0 <= phi < 2pi

where the "true" phase lag Phi is

    Phi = phi + 2pi*k

for some integer k.  Again, phase information is lost.

Phase information is crucial for reconstructing the correct time-domain
impulse response.

Rune
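Rune's claim about the autocorrelation is easy to verify numerically: its
Fourier transform is the (real) power spectrum, so any two signals with
the same magnitude spectrum share the same autocorrelation.  A numpy
sketch with an arbitrary random signal:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(256)

# Autocorrelation at all lags -255..255 (zero lag lands at index 255)
acf = np.correlate(x, x, mode="full")

# Its Fourier transform is (numerically) real: the power spectrum.
# ifftshift moves the zero lag to index 0 before transforming.
S = np.fft.fft(np.fft.ifftshift(acf))
phase_gone = np.max(np.abs(S.imag)) / np.max(np.abs(S.real))

# Time-reversing x changes its phase spectrum but not its magnitude
# spectrum -- and it leaves the autocorrelation untouched:
acf_rev = np.correlate(x[::-1], x[::-1], mode="full")
same_acf = np.allclose(acf, acf_rev)
print(phase_gone, same_acf)   # tiny, True
```

So x and its time-reversal, two very different impulse responses, are
indistinguishable from the autocorrelation alone.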

Rune Allnor wrote:
>
> Bob Cain <arcane@arcanemethods.com> wrote in message
> news:<3FCFBDC9.4E24F00@arcanemethods.com>...
> > Rune Allnor wrote:
> > >
> > > I would stay away from spectrum divisions.  If I understand you
> > > correctly, you have a system like
> > >
> > >                                 Y(f)
> > >     Y(f) = H(f)X(f) => H(f) = ------
> > >                                 X(f)
> >
> > I have the equivalent h(t) = y(t) / x(t) where y(t) is a
> > broadband stimulus "impulse" and x(t) the response of a
> > component of the system to that stimulus.  I'm seeking an
> > FIR approximate inverse of that component of the system
> > under test.
>
> Hum... does that "/" really mean "sample-wise division", or could it
> be that you look for some sort of deconvolution?
Yes, it represents deconvolution via the LMS approach of Levinson-Durbin
as implemented with a Toeplitz solver.
>
> I'm using the term "wavelet" the way common in seismics, i.e. as the
> pulse waveform emitted by the source.  It has little (if anything) to do
> with that signal analysis thing that has become so popular.
Got it.
> > but I don't
> > understand how phase information would be lost or how the
> > phase characteristic would affect the outcome.
>
> The term "phase" only makes sense in frequency domain, even if
> computations are done on time-domain data.  Since you deal
> with autocorrelation functions, phase information is lost because
> the Fourier transform of an autocorrelation function (of a real
> signal) is real.  In the case of cross-correlation functions you
> find the principal value of the phase difference in the cross-spectrum.
> The English translation is that you find some phi restricted to
>
>     0 <= phi < 2pi
>
> where the "true" phase lag Phi is
>
>     Phi = phi + 2pi*k
>
> for some integer k.  Again, phase information is lost.
>
> Phase information is crucial for reconstructing the correct time-domain
> impulse response.
Hmmm.  I've got to chew on that a bit.  Wouldn't it be correct to believe
that if the result of the calculation is a least-mean-square
approximation to the component's actual impulse response, having a
particular length and delay, then the phase information would be
optimally preserved as a side effect?  That's the problem I have:
understanding the frequency-domain implications of a time-domain
calculation.

Bob
--
"Things should be described as simply as possible, but no simpler."
                                              A. Einstein
Bob Cain <arcane@arcanemethods.com> wrote in message news:<3FD0D68E.4D874717@arcanemethods.com>...
> Wouldn't it be
> correct to believe that if the result of the calculation is
> a least mean square approximation to the component's actual
> impulse response having a particular length and delay that
> the phase information would be optimally preserved as a side
> effect?
Actually, I don't know.  The problem is that it's very difficult to
estimate the system impulse response h(t) from non-impulse input data
x(t) in the time domain,

    y(t) = x(t) (*) h(t),      (*) is convolution             [1]

The system output y(t) is just so "messed up" that isolating particular
elements in the sequence h(t) in the time domain becomes very difficult.
Remember, every element in (the discrete-time sequence) y(t) depends on
every element in the sequence h(t) as well as on all past values of x(t).
It's just too much of a mess to untangle.  Transforming the whole mess
into the frequency domain straightens things out a bit, but it's still
not easy.

Most analyses (though not necessarily computations) are done in the
frequency domain to ensure stable inverse filters that also yield results
that make some sort of sense.  The basic idea is that if you take the
signal y(t) and apply it to a new adaptive system G, and adjust the
impulse response g(t) until you reproduce the original input signal x(t),

    v(t) = y(t) (*) g(t) = [x(t) (*) h(t)] (*) g(t) = x(t),   [2]

it follows that g(t) in some sense "undoes" the effect of h(t), and thus
is an inverse filter to h(t).  In the time domain this becomes

    g(t) (*) h(t) = delta(t)                                  [3.a]
             g(t) = h^(-1)(t).                                [3.b]

Now, if you take this analysis to the frequency domain you will find that
h(t) in most practical cases is described in terms of a rational transfer
function H(f), and thus is completely characterized by its poles and
zeros.  It follows that the inverse filter g(t) in the frequency domain
becomes

    G(f) = H^(-1)(f) = 1/H(f)                                 [4]

such that the zeros of H(f) become the poles of G(f) and vice versa.

Enter the z transform and its requirements on filter stability.  We know
that for a filter to be stable (and we know that h(t) is stable), all the
poles must be inside the unit circle.  Thus, all the *zeros* of G(f) do,
by definition, fall strictly inside the unit circle.  Next, recall that
the zeros of a stable filter can occur anywhere in the z domain.
Hence, the *poles* of G(f) can potentially pop up anywhere in the z
domain.  If a pole occurs on or outside the unit circle, the filter g(t)
becomes unstable and is useless to us.  However, further analysis shows
that if a pole of G(f) outside the unit circle is "wrapped" inside the
unit circle, the filter spectrum magnitude is preserved in the sense

    |G(f)| = |1/H(f)|                                         [5]

(note that the absolute values in [5], not present in [4], essentially
remove the phase information in the frequency response), although the
inverse deconvolution property [3.a] is lost.  Sometimes we decide we are
content with that.

So in conclusion: what we want, ideally, is an inverse filter g(t) that
completely "undoes" the effects of h(t) in the sense of [3.a] above.  In
the practical world, we have to content ourselves with a stable filter
that "merely" re-scales spectrum magnitudes in the sense of [5], where
the phase response (and thus the time-domain impulse response) is left
unspecified.
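The "wrapping" step can be checked numerically: reflecting a root z0 to
1/conj(z0) changes the magnitude response only by the constant factor
|z0| at every frequency.  A sketch with an arbitrary root (the numbers
are hypothetical, not from the thread):

```python
import numpy as np

# A single hypothetical root outside the unit circle, and its reflection.
z0 = 1.6 * np.exp(1j * 0.7)        # non-minimum-phase root, |z0| = 1.6
z0_refl = 1.0 / np.conj(z0)        # reflected inside, |z0_refl| = 1/1.6

w = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
zinv = np.exp(-1j * w)             # z^{-1} evaluated on the unit circle

mag_out = np.abs(1.0 - z0 * zinv)       # |1 - z0 z^{-1}|
mag_in = np.abs(1.0 - z0_refl * zinv)   # |1 - (1/conj(z0)) z^{-1}|

# The ratio is the constant |z0| at every frequency: reflection changes
# only overall gain (and phase), never the shape of |G(f)|.
ratio = mag_out / mag_in
spread = ratio.max() - ratio.min()
print(ratio[0], spread)
```

This is exactly the magnitude-preserving, phase-discarding trade in [5].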
> That's the problem I have, understanding the
> frequency domain implications of time domain calculation.
In my experience, that's *the* problem of DSP.

Rune
"Jack L." <jack_nospam@nospam.dk> wrote in message news:<bw9zb.50644$jf4.2789643@news000.worldonline.dk>...

> Why is it that we use the minimized mean-square value of e(n) that gives
> the optimum filter?  Or more precisely, why use the quantity "mean-square
> value" of a value - why not some, e.g. mean-squareroot or whatever one's
> creativity can come up with.  What is the consequence of using the
> mean-square value?
One good reason for using the MSE as a cost function is that it is
usually a quadratic function of the filter coefficients.  The
optimization problem then has a closed-form solution: in the
unconstrained case you solve the linear normal equations, and in the
norm-constrained case it reduces to finding the eigenvector
corresponding to the minimum eigenvalue of a correlation matrix.  If you
were to use a metric such as achievable bit-rate, which is a nonlinear
function of the filter coefficients, then you would have to use
nonlinear programming techniques to solve for the optimum coefficients,
and you might not even reach the global optimum.
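A concrete illustration of why the quadratic cost is so convenient: the
unconstrained Wiener problem collapses to one linear solve, with a
vanishing gradient at the optimum.  A numpy sketch (all signals and
sizes are hypothetical):

```python
import numpy as np

# Quadratic MSE cost J(w) = E|d - w^T u|^2 gives linear normal equations
# R w = p, solvable in closed form -- no iterative search required.
rng = np.random.default_rng(4)
N, taps = 5000, 4
u = rng.standard_normal(N)
w_true = np.array([0.9, -0.3, 0.2, -0.1])
d = np.convolve(u, w_true)[:N] + 0.01 * rng.standard_normal(N)

# Data matrix of delayed inputs; R is its (Toeplitz) correlation matrix.
U = np.column_stack([np.roll(u, k) for k in range(taps)])
R = U.T @ U / N
p = U.T @ d / N
w_opt = np.linalg.solve(R, p)           # the Wiener solution, in one step

# At the optimum the gradient 2(R w - p) vanishes: the hallmark of a
# quadratic cost with a single global minimum.
grad_norm = np.linalg.norm(R @ w_opt - p)
print(w_opt, grad_norm)                 # w_opt ~ w_true, grad_norm ~ 0
```

A non-quadratic cost like achievable bit-rate offers no such linear
structure, hence the resort to nonlinear programming.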