Hi Folks,

I have one dependent variable y that I think depends upon several independent variables x_i, each with an unknown time lag. I can use the correlation function between y and each of my x_i to determine the time lag. My question is this:

How can I compare the correlation function between y and x_1 with the correlation function between y and x_2, and so on? Or, asked differently:

How can I interpret the numerical value of the correlation function?

Some correlations must be more significant than others. I suspect that several of the x's can be ignored and y could still be adequately modelled. Can I somehow normalize the correlation functions and come to the conclusion, say, that x_1 and x_3 correlate so significantly and x_2 so insignificantly that I can ignore x_2?

Thank you!!

Best,
Patrick
_____________________________
Posted through www.DSPRelated.com
Comparing several Correlation Functions
Started by ●September 8, 2014
Reply by ●September 8, 2014
On Mon, 08 Sep 2014 10:51:27 -0500, "patbangert" <101588@dsprelated> wrote:

>Hi Folks,
>
>I have one dependent variable y that I think depends upon several
>independent variables x_i each with an unknown timelag. Now I can use the
>correlation function between y and each of my x_i to determine the timelag.
>My question is this:
>
>How can I compare the correlation function between y and x_1 with the
>correlation function between y and x_2 and so on? Or, asked differently,
>
>How can I interpret the numerical value of the correlation function?
>
>Some correlations must be more significant than others. I suspect that
>several of the x's can be ignored and y could still be adequately modelled.
>Can I somehow normalize the correlation functions and come to the
>conclusion, say, that x_1 and x_3 correlate so significantly and x_2 so
>insignificantly that I can ignore x_2?
>
>Thank you!! Best, Patrick

I'm going to make some assumptions, and I'll try to state them all so that if any don't apply, you can adjust as necessary.

I'm assuming you have a reference function for what each of your x_i looks like, and that your cross-correlation process is a sliding dot product of a sampled vector of y with each of the x_i reference functions.

If that's the case, you can normalize the power in each of the x_i reference functions to some value (e.g., normalize them all to unity power). If you do that, then the relative outputs of the various cross-correlation functions will be proportional to the power of each of the x_i independent variables present in y, with a peak at the time delay for each.

Does that answer your question?

Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com
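The sliding dot product with normalized inputs can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the helper name `normalized_xcorr`, the synthetic series, the lags of 7 and 20 samples, and the `max_lag` of 50 are all made up for the example, and both y and each x_i are standardized (zero mean, unit variance) so that every output is a correlation coefficient between -1 and 1. Only non-negative lags are searched, i.e. x_i is assumed to lead y.

```python
import numpy as np

def normalized_xcorr(y, x, max_lag):
    """Correlation coefficient between y[t] and x[t - lag], lag = 0..max_lag.

    Hypothetical helper: both series are standardized first, so the
    values are comparable across different x_i.
    """
    y = (y - y.mean()) / y.std()
    x = (x - x.mean()) / x.std()
    n = len(y)
    lags = np.arange(max_lag + 1)
    # For each lag, average the product over the overlapping samples.
    r = np.array([np.mean(y[lag:] * x[:n - lag]) for lag in lags])
    return lags, r

# Synthetic data (assumption): x1 enters y delayed by 7 samples,
# x3 delayed by 20 samples, and x2 does not enter at all.
rng = np.random.default_rng(0)
n = 5000
x1, x2, x3 = rng.standard_normal((3, n))
y = np.zeros(n)
y[7:] += 2.0 * x1[:-7]
y[20:] += 1.0 * x3[:-20]
y += 0.5 * rng.standard_normal(n)

results = {}
for name, x in [("x1", x1), ("x2", x2), ("x3", x3)]:
    lags, r = normalized_xcorr(y, x, max_lag=50)
    k = int(np.argmax(np.abs(r)))          # lag with the strongest correlation
    results[name] = (int(lags[k]), float(r[k]))
    print(f"{name}: best lag = {lags[k]}, peak r = {r[k]:.3f}")
```

For roughly white series of length N, peaks much larger than about 2/sqrt(N) in magnitude are unlikely to be chance; real data with strong autocorrelation needs a more careful threshold.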
Reply by ●September 9, 2014
>I'm assuming you have a reference function for what each of your x_i
>look like, and that your cross correlation process is a sliding dot
>product of a sampled vector of y with each of the x_i reference
>functions.

My y and the x_i are known only from empirical data, each as a long time series.

>If that's the case, you can normalize the power in each of the x_i
>reference functions to some value (e.g., normalize them all to unity
>power). If you do that, then the relative outputs of the various
>cross-correlation functions will be proportional to the power of each
>of the x_i independent variables present in y, with a peak at the time
>delay for each.

So that means I would divide the values in any variable by the integral over all its values, right? Since I'm dealing with empirical data, that is essentially a sum over all values.

Thank you for your help!

Best,
Patrick
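For reference, "unity power" for a sampled series is usually taken to mean a mean square value of 1, which is obtained by dividing by the RMS value rather than by the plain sum of the samples (the sum of a zero-mean series is close to zero, so dividing by it would not give a stable scale). A minimal sketch, assuming NumPy and a hypothetical helper `to_unit_power` that also removes the mean first:

```python
import numpy as np

def to_unit_power(x):
    """Scale a series to zero mean and unit average power (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the constant offset
    return x / np.sqrt(np.mean(x ** 2))   # divide by the RMS value

# Made-up example series with an offset of 3 and a spread of about 10.
rng = np.random.default_rng(1)
x = 3.0 + 10.0 * rng.standard_normal(1000)
u = to_unit_power(x)
power = np.mean(u ** 2)   # average power of the normalized series
print(power)
```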
Reply by ●September 9, 2014
Hi,

you may need to orthogonalize your x_i vectors (Gram-Schmidt procedure) or use some equivalent procedure (e.g. least-squares projection via the Moore-Penrose pseudoinverse).

If the x_i are correlated among themselves, their correlation coefficients with the signal are not easily comparable, because there is no concept of "energy conservation" as there would be with an orthogonal (orthonormal) basis ("energy" if the signal is a voltage, current, or the like).

A possible approach is to put all x_i as column vectors into a matrix M and calculate c=pinv(M)*y, where y is the signal as a column vector. c gives the least-squares optimal fit, i.e. it minimizes |M*c-y|^2 (the "energy" that is not accounted for by the projection).
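The pinv approach might look like this in Python with NumPy. It is a sketch under assumptions, not a recipe: the data are synthetic, the coefficients 2.0 and 1.0 are made up, and each x_i is assumed to have already been shifted by its estimated time lag so that the columns of M line up with y. Here x2 is deliberately built to be strongly correlated with x1 while not entering y at all.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
x1 = rng.standard_normal(n)
x2 = 0.9 * x1 + 0.3 * rng.standard_normal(n)   # strongly correlated with x1
x3 = rng.standard_normal(n)
# Assumed model for the example: y depends on x1 and x3 only.
y = 2.0 * x1 + 1.0 * x3 + 0.5 * rng.standard_normal(n)

M = np.column_stack([x1, x2, x3])      # one (lag-aligned) x_i per column
c = np.linalg.pinv(M) @ y              # least-squares coefficients
residual_energy = np.sum((M @ c - y) ** 2)   # |M*c - y|^2

# Plain correlation coefficient of y with x2, for comparison.
r2 = np.corrcoef(y, x2)[0, 1]

print("least-squares coefficients:", np.round(c, 3))
print("correlation of y with x2:", round(r2, 3))
print("unexplained energy:", round(residual_energy, 1))
```

In this setup x2 correlates strongly with y only because it resembles x1, while its least-squares coefficient comes out close to zero, which is the kind of case where comparing individual correlation values is misleading.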