
LMS Pseudo Inverses of Complex Matrices

Started by Bob Cain January 28, 2005
I have an overconstrained system of equations (m equations 
in n variables with m>n) where the variables are complex. 
Using the Moore-Penrose pseudoinverse method of solution is 
giving me results that are terribly counterintuitive.  Does 
anyone know what exactly is minimized in the least squares 
sense when the variables and solution coefficients are complex?


Thanks,

Bob
-- 

"Things should be described as simply as possible, but no 
simpler."

                                              A. Einstein
Bob Cain wrote:

> I have an overconstrained system of equations (m equations
> in n variables with m>n) where the variables are complex.
> Using the Moore-Penrose pseudoinverse method of solution is
> giving me results that are terribly counterintuitive. Does
> anyone know what exactly is minimized in the least squares
> sense when the variables and solution coefficients are complex?
Bob,

Please dissipate a bit of my ignorance. I had thought that with an
overconstrained system, either there's a solution or not. How does
matrix inversion bear on that?

Jerry
-- 
Engineering is the art of making what you want from things you can get.
Bob Cain wrote:

> I have an overconstrained system of equations (m equations
> in n variables with m>n) where the variables are complex.
> Using the Moore-Penrose pseudoinverse method of solution is
> giving me results that are terribly counterintuitive. Does
> anyone know what exactly is minimized in the least squares
> sense when the variables and solution coefficients are complex?
Bob,

For the case of real data, what is minimized is the sum of the squares
of the difference between your measurements and the measurements that
would result from the LMS solution. For example, if A is your matrix,
b is your vector of measurements, and x the LMS solution, then the
quantity being minimized is the sum of the squares of the difference
vector (b - Ax).

When (b - Ax) is complex-valued, you would replace the squares of the
real values with squares of the absolute values of the complex elements.

As to why your results are counterintuitive, let us assume a correct
implementation of your pseudoinverse software. I would ask:

1. What is the condition number of your matrix?
2. How noisy are the measurements you're trying to fit?

Your LMS solution will be affected by any noise in your measurements,
and that noise will be scaled by the condition number of your matrix.
In general, the larger the condition number, the "noisier" your
solution becomes.

The condition number is the ratio of the largest singular value of the
matrix A to the smallest (these numbers should be provided by your
pseudoinverse software). Thus, an identity matrix has a condition
number of one and is obviously numerically well-conditioned, i.e.,
easily inverted. As another example, consider the 2x2 system

   | m1 |   | 1.0e+05  0 |  | x1 |
   |    | = |            |  |    |
   | m2 |   | 0        1 |  | x2 |

whose condition number is 1.0e+05 and whose solution is given by

   | x1 |   | 1.0e-05  0 |  | m1 |             |     m1     |
   |    | = |            |  |    | = 1.0e-05 * |            |
   | x2 |   | 0        1 |  | m2 |             | 1.0e+05 m2 |

Assuming the errors in m1 and m2 are comparable, note how the errors in
m2 are significantly amplified relative to those in m1. You can easily
see how this increases the variability in the estimate of x2.

Even though this is not an example of an overdetermined system, the
same ideas apply. I hope this has been more helpful than confusing!

Mike
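
To make the noise-amplification point concrete, here is a minimal
Matlab/Octave sketch of the 2x2 example above (the true solution and
the noise level are made up for illustration):

    % Ill-conditioned 2x2 system from the example above
    A = [1.0e+05 0; 0 1];
    x_true = [1; 2];
    m = A * x_true;              % noise-free "measurements"

    disp(cond(A))                % largest/smallest singular value: 1e5

    % Add comparable noise to both measurements and solve again
    noise = 1e-3 * randn(2, 1);
    x_hat = A \ (m + noise);

    % The noise in m2 passes straight through to x_hat(2), while the
    % noise in m1 is divided by 1e5 in x_hat(1)
    disp(abs(x_hat - x_true))    % roughly [1e-8; 1e-3]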
On Fri, 28 Jan 2005 14:24:16 -0800, Bob Cain
<arcane@arcanemethods.com> wrote:

>>I have an overconstrained system of equations (m equations
>>in n variables with m>n) where the variables are complex.
>>Using the Moore-Penrose pseudoinverse method of solution is
>>giving me results that are terribly counterintuitive. Does
>>anyone know what exactly is minimized in the least squares
>>sense when the variables and solution coefficients are complex?
If Q = [XZ - I], where X is the original matrix, Z is its
pseudoinverse (Moore-Penrose pseudoinverse), and I is the identity
matrix, then

   SUM_m SUM_n (q_mn)^2

is minimized, where q_mn is the element of Q in row m, column n.

If X is complex, as it is in your case, then I believe that

   SUM_m SUM_n (q_mn)(q_mn)*

is minimized, where "*" denotes the complex conjugate.

Note that the computation of the pseudoinverse,

   Z = inv(X^T X) X^T      (T = transpose)

must use conjugate (Hermitian) transposes if X is complex:

   Z = inv(X^H X) X^H      (H = conjugate transpose)

Is it possible that this is your problem?

-- 
Greg
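
In Matlab terms, the distinction Greg describes is the difference
between .' (plain transpose) and ' (conjugate, i.e. Hermitian,
transpose). A small sketch of the two computations, using a made-up
complex test matrix:

    % Made-up overdetermined complex system: 4 equations, 2 unknowns
    X = randn(4, 2) + 1i * randn(4, 2);

    Z_wrong = inv(X.' * X) * X.';  % plain transpose: not the pseudoinverse
    Z_right = inv(X'  * X) * X';   % Hermitian transpose: correct for complex X

    % Z_right agrees with the built-in pinv; Z_wrong generally does not
    disp(norm(Z_right - pinv(X)))  % ~ 0
    disp(norm(Z_wrong - pinv(X)))  % generally not small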
Michael Soyka wrote:

> Bob Cain wrote:
>
>> I have an overconstrained system of equations (m equations in n
>> variables with m>n) where the variables are complex. [...]
>
> [...]
>
> The condition number is the ratio of the largest singular value of
> the matrix A to the smallest (these numbers should be provided by
> your pseudoinverse software). Thus, an identity matrix has a
> condition number of one and is obviously numerically
> well-conditioned, i.e., easily inverted. As another example,
> consider the 2x2 system
>
>    | m1 |   | 1.0e+05  0 |  | x1 |
>    |    | = |            |  |    |
>    | m2 |   | 0        1 |  | x2 |
>
> whose condition number is 1.0e+05 [...]
>
> Mike
I'm not familiar with the term "Moore-Penrose pseudoinverse", but when
you solve that problem using the singular value decomposition you can
choose a cutoff value for the singular values (the diagonal in the
simple example above) -- if you have some that are too small you can
easily jigger the solution to ignore them.

-- 
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
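
Matlab's pinv does exactly this truncation if you pass it a tolerance;
done by hand it looks something like the sketch below, where A and b
stand for the overdetermined system discussed in the thread and the
cutoff is an arbitrary choice:

    [U, S, V] = svd(A, 'econ');
    s = diag(S);

    tol = 1e-6 * max(s);         % arbitrary cutoff; tune for your data
    keep = s > tol;              % singular values worth inverting

    % Truncated pseudoinverse: small singular values are dropped, not inverted
    Apinv = V(:, keep) * diag(1 ./ s(keep)) * U(:, keep)';
    x = Apinv * b;

    % Equivalent one-liner with the built-in:  x = pinv(A, tol) * b;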
Bob Cain wrote:
> I have an overconstrained system of equations (m equations
> in n variables with m>n) where the variables are complex.
> Using the Moore-Penrose pseudoinverse method of solution is
> giving me results that are terribly counterintuitive. Does
> anyone know what exactly is minimized in the least squares
> sense when the variables and solution coefficients are complex?
In a complex-valued world, the error norm is

        N
   E = SUM e(n)*conj(e(n)).                       [1]
       n=1

Now, you don't say how you have implemented the pseudoinverse. The
usual way is to use the singular value decomposition, SVD:

Problem: Solve

   Ax = b + e                                     [2]

for the coefficients x that minimize the norm of the error e.

Solution: Use the SVD to decompose A as

   A = USV                                        [3]

and compute the Moore-Penrose pseudoinverse A# of A as

   A# = V'S^{-1}U'                                [4]

where U' is the (conjugate) transpose of U, and find the solution x as

   x = A#b = V'S^{-1}U'b                          [5]

So far so good; this is basic linear algebra theory that can be found
in any decent textbook.

Now, Tim mentioned the singular value problem. S is diagonal with the
singular values sigma_i on the diagonal, so S^{-1} is also diagonal and
has the numbers 1/sigma_i on the diagonal. This causes problems. If
there is a very small sigma in S, the corresponding term in S^{-1} will
be very large, and the solution x is dominated by numerical noise.

The condition number c for A'A is defined as

   c = max{sigma_i}/min{sigma_i}                  [6]

(or was it min{sigma_i}/max{sigma_i}? I can't remember...) and this
number gives you an impression of how bad the problem is.

There are several ways of handling the problem of numerical noise. The
naive way is to drop the "small" singular values (with the
corresponding left and right singular vectors) from the solution. This
is sometimes known as "the method of principal components". The problem
here is to decide which singular values are "small" and should be
kicked out, and which are "large" and should be retained.

An alternative is to use some sort of regularization scheme where you
impose constraints on your solution (e.g. that the solution should be
"smooth"). When I have used such schemes (I have never implemented
them), the user was required to supply a weighting parameter, which in
some cases can be difficult to choose.

Another, more mundane issue is what programming language you used for
your test. If you used matlab and implemented eq. [4] above yourself
(as opposed to using the PINV function), be aware that the matlab SVD
function returns U, S and V a bit differently than you might think:
>> help SVD
    SVD    Singular value decomposition.
       [U,S,V] = SVD(X) produces a diagonal matrix S, of the same
       dimension as X and with nonnegative diagonal elements in
       decreasing order, and unitary matrices U and V so that
       X = U*S*V'.

Note that the V that is returned here is the conjugate transpose of the
V that occurs in the formulas above, so you need to shuffle the numbers
a bit if you implement such stuff yourself. I think I might have seen a
similar structure in some FORTRAN numerical package somewhere, so make
sure you know the storage format of your computed results before you
use them.

Rune
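
So with Matlab's convention X = U*S*V', the pseudoinverse is built with
V as returned, not transposed. A short sketch checking a hand-rolled
implementation against the built-in PINV, on a made-up complex matrix:

    X = randn(5, 3) + 1i * randn(5, 3);   % made-up complex test matrix

    [U, S, V] = svd(X, 'econ');

    % Matlab returns X = U*S*V', so the pseudoinverse is V * inv(S) * U'
    Z = V * diag(1 ./ diag(S)) * U';

    disp(norm(X - U * S * V'))   % ~ 0: reconstruction check
    disp(norm(Z - pinv(X)))      % ~ 0: matches the built-in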

Jerry Avins wrote:

> Please dissipate a bit of my ignorance. I had thought that with an
> overconstrained system, either there's a solution or not. How does
> matrix inversion bear on that?
In an overconstrained system Ax=B, with A having more rows than
columns, there is an in-between: an x which gives the least mean square
error between B and Ax. The Moore-Penrose pseudoinverse gives you that.
It is calculated as:

   inv(A'A)*A'

Then

   x = inv(A'A)*A'*B

is the x such that sqrt(sum((B-Ax)^2)) is minimized for real matrices.
What is unclear to me is what is minimized when all the matrices are
complex rather than real. What's the metric?


Bob
-- 

"Things should be described as simply as possible, but no 
simpler."

                                              A. Einstein
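
For the real-valued case Bob describes, the minimizing property is easy
to check numerically; a minimal sketch with made-up data, showing that
perturbing the LMS solution can only increase the residual:

    % Made-up overdetermined real system: 5 equations, 2 unknowns
    A = randn(5, 2);
    B = randn(5, 1);

    x = inv(A' * A) * A' * B;    % normal-equations form of the pseudoinverse
                                 % (A \ B and pinv(A)*B give the same x, more stably)

    r0 = norm(B - A * x);                          % residual at the LMS solution
    r1 = norm(B - A * (x + 0.01 * randn(2, 1)));   % residual at a perturbed x
    disp([r0 r1])                                  % r0 <= r1, always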

Bob Cain wrote:
> Jerry Avins wrote:
>
>> Please dissipate a bit of my ignorance. I had thought that with an
>> overconstrained system, either there's a solution or not. How does
>> matrix inversion bear on that?
>
> In an overconstrained system Ax=B, with A having more rows than
> columns, there is an in-between: an x which gives the least mean
> square error between B and Ax. The Moore-Penrose pseudoinverse gives
> you that. It is calculated as:
>
>    inv(A'A)*A'
>
> Then
>
>    x = inv(A'A)*A'*B
>
> is the x such that sqrt(sum((B-Ax)^2)) is minimized for real
> matrices. What is unclear to me is what is minimized when all the
> matrices are complex rather than real. What's the metric?
>
> Bob
If you look closely you will find that the various transposes of real
matrices become Hermitian transposes (i.e. transposed complex
conjugates), and the effect on the residual vector is that you are
minimizing the sum of squared magnitudes. All of this maps the quantity
being minimized back to the reals, where the notion of a minimum is in
fact defined.
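
The same numerical check works in the complex case: with the Hermitian
transpose, the real-valued quantity sum(abs(B - A*x).^2) is what the
pseudoinverse solution minimizes. A sketch with made-up complex data:

    % Made-up overdetermined complex system: 6 equations, 2 unknowns
    A = randn(6, 2) + 1i * randn(6, 2);
    B = randn(6, 1) + 1i * randn(6, 1);

    x = inv(A' * A) * A' * B;            % ' is the Hermitian transpose here

    cost = @(v) sum(abs(B - A * v).^2);  % the real-valued metric being minimized

    disp(cost(x))                                            % minimum
    disp(cost(x + 0.01 * (randn(2, 1) + 1i * randn(2, 1))))  % always larger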