Forums

The importance of the Ubiquitous Covariance Matrix

Started by bmh161 August 28, 2006
Hello all.  This is my first post.  However, I have read many of the
discussions here in the past and found them to be very interesting.  I
especially appreciate the patience that the wiser DSP practitioners show
to enthusiastic newcomers to the field, like myself.

At any rate, I feel that my question must be basic because I cannot find
an answer, as hard as I look.  On the other hand, I might be forcing the
answer I'm looking for into a box and ignoring the possibility that I
have already read and re-read what I am looking for in the first place.

What I am trying to understand is the importance of the covariance matrix
in estimation theory.  It has shown up in both areas of statistical signal
processing that I have attempted to study (Kalman Filtering and MUSIC so
far).  Therefore, I feel like it is well worth my effort to get a deep
understanding for what makes the properties of this matrix so popular to
estimation theory.  

I have tried pretty hard to find the answer on my own, so please don't
think I'm looking for someone to do my work for me.  But I would like to
get some insight and perhaps a common sense explanation of why this is
such an important topic in statistical DSP.



Thank you all very much for your time in considering my question.



bmh161 said the following on 28/08/2006 11:25:
> What I am trying to understand is the importance of the covariance matrix
> in estimation theory. It has shown up in both areas of statistical signal
> processing that I have attempted to study (Kalman Filtering and MUSIC so
> far). Therefore, I feel like it is well worth my effort to get a deep
> understanding for what makes the properties of this matrix so popular to
> estimation theory.
>
> I have tried pretty hard to find the answer on my own so please don't
> think I'm looking for someone to do my work for me. But I would like to
> get some insight and perhaps a common sense explanation of why this is
> such an important topic in statistical DSP.
It's important because most optimal estimators (optimal in the sense of
best ensemble average performance) require it. Computing the ensemble
average requires expectations (see
http://en.wikipedia.org/wiki/Expected_value); any time you require the
expectation of x*y (or something similar), you get E{x*y}, which is the
definition of the covariance matrix.

Kalman and MUSIC are quite advanced estimators for a beginner to start
with. Try reading up on simpler estimators, such as LLMSE (linear least
mean-square error) estimators - the derivations are simple, and will
demonstrate how covariance matrices are required.

I would love to recommend a good book to start with, but in reality I've
only ever used two books on the topic, so I don't know how good they are
compared to others in the field. [1] is a classic, but very dense
mathematically. [2] is much more modern, and far easier to follow, but
perhaps lacking in some details.

[1] L.L. Scharf, "Statistical Signal Processing"
[2] A.H. Sayed, "Fundamentals of Adaptive Filtering"

--
Oli
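To make the definition concrete, here is a small numerical sketch of E{x x*}: the ensemble average of outer products converges to the covariance matrix. (Hypothetical NumPy example; the mixing matrix A and the snapshot count are made up for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)

# N realizations of a 3-element zero-mean data vector x = A w,
# where w is white.  The covariance matrix R = E{x x*} is then A A*,
# and the sample average of the outer products x x* approximates it.
N = 100_000
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 1.0]])          # arbitrary mixing matrix
x = A @ rng.standard_normal((3, N))      # correlated zero-mean data

R_sample = (x @ x.conj().T) / N          # sample estimate of E{x x*}
R_true = A @ A.T                         # exact covariance for this model

print(np.round(R_sample, 2))
print(np.round(R_true, 2))
```

With enough snapshots the two matrices agree to a couple of decimal places, which is exactly the "ensemble average" Oli mentions.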
Oli Filth <catch@olifilth.co.uk> writes:

> bmh161 said the following on 28/08/2006 11:25:
> [...]
>
> I would love to recommend a good book to start with, but in reality
> I've only ever used two books on the topic, so I don't know how good
> they are compared to others in the field. [1] is a classic, but very
> dense mathematically. [2] is much more modern, and far easier to
> follow, but perhaps lacking in some details.
>
> [1] L.L. Scharf, "Statistical Signal Processing"
> [2] A.H. Sayed, "Fundamentals of Adaptive Filtering"
I've always loved the down-to-earth style of Brown in

@book{brown,
  title     = "Introduction to Random Signal Analysis and Kalman Filtering",
  author    = "{Robert~Grover~Brown}",
  publisher = "John Wiley and Sons",
  year      = "1983"
}

--Randy <loving his new xemacs macro to extract BibTeX references from
his database!>

--
% Randy Yates             % "Maybe one day I'll feel her cold embrace,
%% Fuquay-Varina, NC      %  and kiss her interface,
%%% 919-577-9882          %  til then, I'll leave her alone."
%%%% <yates@ieee.org>     %  'Yours Truly, 2095', *Time*, ELO
http://home.earthlink.net/~yatescr
One more thing I understand: if you have a signal corrupted with noise,
it can be treated as a set of random variables, and if you compute the
covariance matrix of it, the diagonal elements will give you the
signal-to-noise ratio. It was quite a long time back that I studied
this, so I may be wrong - I'll post again if what I said above turns out
to be wrong.

regards
indraneel.

> It's important because most optimal estimators (optimal in the sense of
> best ensemble average performance) require it. Computing the ensemble
> average requires expectations (see
> http://en.wikipedia.org/wiki/Expected_value); any time you require the
> expectation of x*y (or something similar), you get E{x*y}, which is the
> definition of the covariance matrix.
Oli,

My follow-up question is this:

Just so I am clear on your notation, I will make some assumptions.
Please correct my wording and/or usage as necessary.

Let's assume that x and y are sequences of sampled data. From a random
processes perspective, assuming such implies that each sample in each
sequence is a particular realization associated with that data point's
respective random variable, which is a member of a particular ensemble
realization. Also, I'm assuming x* denotes the conjugate transpose of x.

With that out of the way, what is the significance of multiplying x* and
y? Playing around with some made-up sequences, I see that the result of
such a multiplication gives a matrix whose entries correspond to all of
the possible products between the elements of x and y. What I don't
understand is what statistical information is gained by computing the
expected value of each of these matrix entries.

My intuition leads me to think that not much would be gained in the way
of an understanding of the relationship between x and y, which is what I
get the impression E{x*y} is trying to produce. Please set my intuition
straight :-)
bmh161 said the following on 28/08/2006 23:00:
>> It's important because most optimal estimators (optimal in the sense of
>> best ensemble average performance) require it. [...] any time you
>> require the expectation of x*y (or something similar), you get E{x*y},
>> which is the definition of the covariance matrix.
>
> My follow-up question is this:
> Just so I am clear on your notation I will make some assumptions. Please
> correct my wording and/or usage as necessary.
>
> Let's assume that x and y are sequences of sampled data.
Kind of. More generally, x and y are just sets of data; they're not necessarily sequences in time.
> Also, I'm assuming x* denotes the conjugate transpose of x.
Yes.
> With that out of the way, what is the significance of multiplying x* and
> y? Playing around with some made-up sequences I see that the result of
> such a multiplication gives a matrix whose entries correspond to all of
> the possible products between the elements of x and y. What I don't
> understand is what statistical information is gained by computing the
> expected value of each of these matrix entries.
>
> My intuition leads me to thinking that not much would be gained in the way
> of an understanding of the relationship between x and y, which is what I
> get the impression E{x*y} is trying to produce.
Covariance matrices provide information on the way that one set of
variables affects another set of variables. This information is useful,
for instance, if we want to estimate one set from observations of the
other.

I'll give an example, which demonstrates how crucial covariance matrices
are to even the simplest of estimators. Let's say you have a process, x,
that you want to estimate, but the only observations you have are y.

         +--------+
  x ---->| System |----> y
         +--------+

Let's also say that you intend to estimate x by applying an appropriate
transform to y:

  x_est = K y

Each value of K that you choose will lead to an estimation error. A
common way of quantifying this is the mean-square-error criterion, which
leads to a "cost function" J(K):

  J(K) = E|x_est - x|^2

Expanding, we have:

  J(K) = E|K y - x|^2

If we expand the quadratic expression, we get terms in:

  E|y|^2 = R_y
  E|x*y| = R_xy
  E|y*x| = R_yx
  E|x|^2 = R_x

where the various R are the covariance matrices.

--
Oli
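Minimizing that cost function gives the classic LLMSE solution K = R_xy R_y^{-1}, built entirely from covariance matrices. A toy NumPy sketch of the setup above (the system matrix H and noise level are invented for illustration; the covariances are sample-estimated):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear system: we observe y = H x + v and estimate x
# by x_est = K y, with K = R_xy * inv(R_y) as in the LLMSE derivation.
N = 200_000
H = np.array([[1.0, 0.3],
              [0.2, 1.0]])                  # made-up system matrix
x = rng.standard_normal((2, N))             # process to estimate
v = 0.5 * rng.standard_normal((2, N))       # observation noise
y = H @ x + v

R_y  = (y @ y.T) / N                        # sample E{y y*}
R_xy = (x @ y.T) / N                        # sample E{x y*}
K = R_xy @ np.linalg.inv(R_y)               # MSE-optimal linear transform

x_est = K @ x * 0 + K @ y                   # estimate from observations
mse_est   = np.mean((x_est - x) ** 2)       # error of LLMSE estimate
mse_naive = np.mean((y - x) ** 2)           # error of using y directly

print(mse_est, mse_naive)
```

The LLMSE estimate beats the raw observation, and notice that K was computed from nothing but the covariance matrices - which is the whole point.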
Hi Oli,

Very good reply indeed. I would like to draw your attention to what
indraneel said: is it right that you can estimate the SNR of a signal
corrupted by noise just from the covariance matrix? Kindly reply to
this.

regards
particle

PARTICLEREDDY said the following on 29/08/2006 05:48:
> I would like to draw your attention to what indraneel said: is it right
> that you can estimate the SNR of a signal corrupted by noise just from
> the covariance matrix?
If we have a system:

  y = x + v

where v is the noise (independent from x), then we can compute the
covariance matrix of y as:

  R_yy = E{yy*}
       = E{[x+v][x+v]*}
       = R_xx + R_vv

If we know R_xx, then we can obtain the noise covariance matrix R_vv.
Assuming that each noise value is independent, then R_vv is diagonal,
with each diagonal entry equal to the variance of the corresponding
noise value.

--
Oli
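That decomposition is easy to check numerically. A small sketch (the signal and noise powers below are invented; R_xx is assumed known, as in the argument above):

```python
import numpy as np

rng = np.random.default_rng(2)

# y = x + v with x and v independent, so R_yy = R_xx + R_vv.
# Knowing R_xx, the diagonal of R_yy - R_xx recovers the per-channel
# noise variances, and hence an SNR estimate.
N = 100_000
sigma_x, sigma_v = 2.0, 0.5                 # made-up signal/noise levels
x = sigma_x * rng.standard_normal((3, N))   # independent signal channels
v = sigma_v * rng.standard_normal((3, N))   # independent noise
y = x + v

R_yy = (y @ y.T) / N                        # sample E{y y*}
R_xx = sigma_x**2 * np.eye(3)               # known signal covariance
R_vv_est = R_yy - R_xx                      # estimated noise covariance

noise_var = np.diag(R_vv_est)               # approx. sigma_v**2 each
snr = np.diag(R_xx) / noise_var             # per-channel SNR estimate

print(noise_var)
print(snr)
```

So indraneel's recollection is close: the diagonal entries are variances (powers), and SNR follows once the signal covariance is known, not directly from the diagonal alone.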