# Comparing several Correlation Functions

Started by September 8, 2014
```Hi Folks,

I have one dependent variable y that I think depends upon several
independent variables x_i each with an unknown timelag. Now I can use the
correlation function between y and each of my x_i to determine the timelag.
My question is this:

How can I compare the correlation function between y and x_1 with the
correlation function between y and x_2 and so on? Or, asked differently,

How can I interpret the numerical value of the correlation function?

Some correlations must be more significant than others. I suspect that
several of the x's can be ignored and y could still be adequately modelled.
Can I somehow normalize the correlation functions and come to the
conclusion, say, that x_1 and x_3 correlate so significantly and x_2 so
insignificantly that I can ignore x_2?

Thank you!! Best, Patrick

_____________________________
Posted through www.DSPRelated.com
```
```On Mon, 08 Sep 2014 10:51:27 -0500, "patbangert" <101588@dsprelated>
wrote:

>Hi Folks,
>
>I have one dependent variable y that I think depends upon several
>independent variables x_i each with an unknown timelag. Now I can use the
>correlation function between y and each of my x_i to determine the timelag.
>My question is this:
>
>How can I compare the correlation function between y and x_1 with the
>correlation function between y and x_2 and so on? Or, asked differently,
>
>How can I interpret the numerical value of the correlation function?
>
>Some correlations must be more significant than others. I suspect that
>several of the x's can be ignored and y could still be adequately modelled.
>Can I somehow normalize the correlation functions and come to the
>conclusion, say, that x_1 and x_3 correlate so significantly and x_2 so
>insignificantly that I can ignore x_2?
>
>Thank you!! Best, Patrick

I'm going to make some assumptions, and I'll try to state them all so
that if any don't apply, you can adjust as necessary.

I'm assuming you have a reference function for what each of your x_i
looks like, and that your cross-correlation process is a sliding dot
product of a sampled vector of y with each of the x_i reference
functions.

If that's the case, you can normalize the power in each of the x_i
reference functions to some value (e.g., normalize them all to unity
power). If you do that, then the relative outputs of the various
cross-correlation functions will be proportional to the power of each
of the x_i independent variables present in y, with a peak at the time
delay for each.

Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com
```
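A minimal sketch of the suggestion above, assuming NumPy and a toy model that is not from the thread: y is built as 2.0 times x_1 delayed by 5 samples plus 0.5 times x_3 delayed by 12 samples plus a little noise, and x_2 does not enter y at all. The helper names `unit_power` and `peak_of_xcorr` are hypothetical. With every x_i scaled to unity power, the peak of each sliding dot product sits at the delay, and in this toy model its value approximates the gain of that x_i in y (the square of the gain is the power that x_i contributes).

```python
import numpy as np

def unit_power(x):
    """Remove the mean and scale so the average power (mean square) is 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return x / np.sqrt(np.mean(x ** 2))

def peak_of_xcorr(y, x, max_lag):
    """Sliding dot product of y[n] with x[n - lag] for lag = 0..max_lag.

    Each dot product is divided by the number of overlapping samples.
    Returns (lag, value) at the lag where the magnitude is largest.
    """
    n = len(y)
    values = np.array([np.dot(y[lag:], x[:n - lag]) / (n - lag)
                       for lag in range(max_lag + 1)])
    best = int(np.argmax(np.abs(values)))
    return best, float(values[best])

# Toy data (an assumption for illustration, not the poster's data).
rng = np.random.default_rng(0)
N = 4000
x1 = unit_power(rng.standard_normal(N))
x2 = unit_power(rng.standard_normal(N))   # not present in y
x3 = unit_power(rng.standard_normal(N))
y = (2.0 * np.roll(x1, 5) + 0.5 * np.roll(x3, 12)
     + 0.1 * rng.standard_normal(N))

results = {}
for name, x in (("x_1", x1), ("x_2", x2), ("x_3", x3)):
    results[name] = peak_of_xcorr(y, x, max_lag=50)
    print(name, "lag =", results[name][0], "peak =", round(results[name][1], 3))
```

Because all three references have the same power, the three peak values can be compared directly. This only holds cleanly when the x_i are not correlated with one another.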
```>I'm assuming you have a reference function for what each of your x_i
>look like, and that your cross correlation process is a sliding dot
>product of a sampled vector of y with each of the x_i reference
>functions.

My y and the x_i are known by empirical data only as a long time-series
each.

>If that's the case, you can normalize the power in each of the x_i
>reference functions to some value (e.g., normalize them all to unity
>power). If you do that, then the relative outputs of the various
>cross-correlation functions will be proportional to the power of each
>of the x_i independent variables present in y, with a peak at the time
>delay for each.

So that means, I would divide the values in any variable by the integral
over all its values, right? Since I'm dealing with empirical data, that is
essentially a sum over all values.

Best, Patrick

```
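On the normalization question above: "unity power" usually means a mean square of 1, so the divisor is the RMS value (the square root of the mean of the squares), not the plain sum of the values. A small sketch, assuming NumPy, made-up sample values, and hypothetical helper names; removing the mean first is my assumption. The second helper divides by both norms, which gives the ordinary correlation coefficient in [-1, 1], a number that can be read directly regardless of the units of y and x_i.

```python
import numpy as np

def to_unit_power(x):
    """Subtract the mean, then divide by the RMS so that mean(x**2) == 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return x / np.sqrt(np.mean(x ** 2))

def corr_coeff_at_lag(y, x, lag):
    """Correlation coefficient between y[n] and x[n - lag], for lag >= 0."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    a = y[lag:] - y[lag:].mean()
    b = x[:n - lag] - x[:n - lag].mean()
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))

# Made-up samples, purely for illustration.
x = np.array([3.0, 5.0, 4.0, 8.0, 6.0, 10.0, 7.0, 9.0])

u = to_unit_power(x)
by_sum = x / x.sum()          # dividing by the sum does NOT give unity power

power_u = float(np.mean(u ** 2))
power_by_sum = float(np.mean(by_sum ** 2))

# y is an exact scaled, offset copy of x delayed by 2 samples.
y = np.roll(3.0 * x + 7.0, 2)
rho = corr_coeff_at_lag(y, x, lag=2)

print(power_u, power_by_sum, rho)
```

A coefficient near +1 or -1 at the best lag means y is almost a scaled copy of that x_i; a value near 0 means almost no linear relationship at that lag.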
```Hi,

You may need to orthogonalize your x_i vectors (Gram-Schmidt procedure) or
use an equivalent procedure (e.g., a least-squares projection via the
Moore-Penrose pseudoinverse).
If the x_i are correlated among themselves, their correlation coefficients
with the signal are not easily comparable, because there is no "energy
conservation" as there would be with an orthogonal (orthonormal) basis
("energy" in the literal sense if the signal is a voltage, current, or the
like).

A possible approach is to put all x_i as column vectors into a matrix M and
calculate c=pinv(M)*y, where y is the signal as a column vector. The vector
c gives the least-squares optimal fit, i.e. it minimizes |M*c-y|^2, which is
the "energy" that is not accounted for by the projection.

```
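A minimal sketch of the pseudoinverse approach above, assuming NumPy, synthetic data that is not from the thread, and columns that have already been shifted by their individual time lags. Here x_2 is deliberately built to be correlated with x_1 but does not enter y. Its plain correlation coefficient with y comes out large, while its least-squares coefficient comes out near zero, which is the reason the individual correlations are hard to compare when the x_i are correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5000

# Synthetic, already time-aligned inputs (assumptions for illustration).
x1 = rng.standard_normal(N)
x3 = rng.standard_normal(N)
# x2 is about 0.9 correlated with x1 but does not contribute to y.
x2 = 0.9 * x1 + np.sqrt(1.0 - 0.9 ** 2) * rng.standard_normal(N)
y = 2.0 * x1 + 0.5 * x3 + 0.1 * rng.standard_normal(N)

# Put the x_i into the columns of M and solve c = pinv(M) * y.
M = np.column_stack([x1, x2, x3])
c = np.linalg.pinv(M) @ y

# |M*c - y|^2 is the energy not accounted for by the projection.
residual_energy = float(np.sum((M @ c - y) ** 2))
explained = 1.0 - residual_energy / float(np.sum(y ** 2))

# Plain correlation coefficients of y with each column, for comparison.
corr = [float(np.corrcoef(y, col)[0, 1]) for col in M.T]

print("least-squares coefficients:", np.round(c, 3))
print("correlation coefficients:  ", np.round(corr, 3))
print("fraction of energy explained:", round(explained, 4))
```

To judge whether one x_i can be dropped, one option is to remove its column from M, refit, and see how much the residual energy grows.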