Reply by Rune Allnor April 26, 2006
martini skrev:
> I have written a cross correlation algorithm which is giving me quite a few
> problems. From what it looks like, and from what the descriptions of cross
> correlation on this site are telling me, I think I found the
> problem... however, I can't think of how to fix it. Hopefully you can help.
> Here is the situation, which has been described in more detail in other
> threads.
>
> I have a template of data, of length N.
> I have a sample of data of length M, with N>>M
>
> I need to find where the sample data best fits on the template data - so I
> am using the FFT method of cross correlation to find the timeshift which
> best matches the data. Here is the problem:
Be very careful when using the DFT to estimate correlation. There have been a couple of threads here the last week or so where the properties of time-domain -> spectrum domain transforms of correlation functions were discussed. It is not obvious that the FFT will give correct results for such computations.
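One well-known pitfall of this kind (whether it is the one discussed in those threads is an assumption here) is that multiplying DFTs gives a circular correlation, so lags where the sample runs off the end of the template wrap around to its beginning. Zero-padding both signals to at least N+M-1 points removes the wrap. A minimal sketch, using a naive O(n^2) DFT purely for illustration; the function names and test data are hypothetical:

```python
import cmath

def dft(x):
    """Naive DFT, O(n^2); a stand-in for an FFT routine."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * i * k / n) for k in range(n))
            for i in range(n)]

def idft(X):
    """Naive inverse DFT."""
    n = len(X)
    return [sum(X[i] * cmath.exp(2j * cmath.pi * i * k / n) for i in range(n)) / n
            for k in range(n)]

def dft_xcorr(template, sample, size):
    """Correlation via T * conj(S), both inputs zero-padded to `size`.

    Entry k of the result is the score with the sample placed at
    offset k in the template, taken modulo `size` (i.e. circular).
    """
    t = list(template) + [0.0] * (size - len(template))
    s = list(sample) + [0.0] * (size - len(sample))
    T, S = dft(t), dft(s)
    return [v.real for v in idft([a * b.conjugate() for a, b in zip(T, S)])]

# Made-up data: the sample fits best at offset 0.
template = [3, 1, 0, 0, 0, 0, 0, 2]
sample = [2, 3]

# No extra padding: at lag 7 the sample's last point wraps onto template[0].
circular = dft_xcorr(template, sample, len(template))

# Padded to N + M - 1: lag 7 now sees zeros instead of wrapped data.
padded = dft_xcorr(template, sample, len(template) + len(sample) - 1)

print([round(v, 6) for v in circular])
print([round(v, 6) for v in padded])
```

In the padded result only lags 0 to N-M correspond to the sample lying fully inside the template; the remaining entries are partial overlaps.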
> It seems that by this description of the cross correlation:
>
> "You slide one function over the other, one sample at a time, and at
> each slid point you:
>
> - multiply corresponding samples of each function
> - sum all those products
>
> Summing the products is like taking the area under the resulting curve
> of the products.
>
> Now, if the signals are similar, and unslid, then their products will
> be like squaring one signal - which will all be positive going, so the
> area under this curve will all add (constructive interference -
> remember physics of waves?) and so will be big. If the signals are not
> similar, or are slid, then some positive signal samples will multiply
> negative signal samples, so some of the resulting product samples will
> be negative, and these negative going parts of the curve will subtract
> from the positive parts (destructive interference) so the area will be
> less."
This recipe has nothing to do with the DFT. This one works in time domain.
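The quoted slide, multiply and sum recipe can be written directly in the time domain. This is only a sketch of that recipe, restricted to offsets where the sample lies fully inside the template; the function name and the data are made up for illustration:

```python
def sliding_xcorr(template, sample):
    """Raw time-domain cross-correlation.

    For each offset k, multiply the sample point by point with the
    template section starting at k and sum the products.
    """
    n, m = len(template), len(sample)
    return [sum(template[k + i] * sample[i] for i in range(m))
            for k in range(n - m + 1)]

template = [0, 0, 1, 2, 1, 0, 0]
sample = [1, 2, 1]

scores = sliding_xcorr(template, sample)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)  # [1, 4, 6, 4, 1] 2
```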
> that if the signals are of different magnitude, they can still have a very
> good correlation. Take for example the plot shown here:
>
> http://www.personal.psu.edu/rdm186/random_images/phase_mag.jpg
I commented on that in your other thread.
> The red section is the template data, and the green/blue sections are the
> sample data, timeshifted. The green section is at the timeshift corresponding
> to the highest correlation via the FFT method, whereas the blue section is
> where it actually fits best. You can see that the phases are VERY
> similar in these two sections of the template; however, the section where
> the FFT returned the highest correlation has considerably different
> magnitudes.
As expected. Read my post in your other thread.
> However, from the above description, it makes sense that the
> FFT would return this as the highest correlation, because the
> multiplication of the two data sets at this section of the curve would
> yield a higher product than at the 'correct' section of the curve - since
> the template data has a higher magnitude at this point.
The DFT has nothing to do with what you posted above. That was valid for time domain.
> Is there a way to consider magnitudes with xcorrelation?
Yes. I did that in the MEX file I posted a week ago. Read it. If you have questions, just ask. Everything is taken care of in that file.

Rune
Reply by Ikaro April 26, 2006
The coherence might be useful here:

http://www.mathworks.com/access/helpdesk/help/toolbox/signal/mscohere.html

Reply by Ikaro April 26, 2006
Have you tried dividing the cross-correlation of each section by the
autocorrelation of the template at the section (or the power)?
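A sketch of this idea, with one substitution that should be stated plainly: instead of dividing by the section power alone, the version below divides by the square root of (section energy times sample energy), which is the common normalized cross-correlation and keeps every score between -1 and 1. The function name and the data are hypothetical:

```python
import math

def normalized_xcorr(template, sample):
    """Cross-correlation normalized by the energy of each template section."""
    m = len(sample)
    sample_energy = sum(v * v for v in sample)
    scores = []
    for k in range(len(template) - m + 1):
        window = template[k:k + m]
        raw = sum(w * s for w, s in zip(window, sample))
        window_energy = sum(w * w for w in window)
        denom = math.sqrt(window_energy * sample_energy)
        # An all-zero section has no defined shape; score it as 0.
        scores.append(raw / denom if denom > 0 else 0.0)
    return scores

# Made-up data: a large-magnitude section at offset 1, the true fit at offset 5.
template = [0, 3, 1, 3, 0, 1, 2, 1, 0]
sample = [1, 2, 1]

raw = [sum(template[k + i] * sample[i] for i in range(3)) for k in range(7)]
norm = normalized_xcorr(template, sample)

print(max(range(7), key=raw.__getitem__))   # 1: raw score favors the big section
print(max(range(7), key=norm.__getitem__))  # 5: normalized score finds the fit
```

This score depends only on shape. Two sections with the same shape but different amplitudes get the same score, so on its own it does not take magnitude into account.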

Reply by martini April 26, 2006
I have written a cross correlation algorithm which is giving me quite a few
problems.  From what it looks like, and from what the descriptions of cross
correlation on this site are telling me, I think I found the
problem... however, I can't think of how to fix it.  Hopefully you can help.
Here is the situation, which has been described in more detail in other
threads.

I have a template of data, of length N.
I have a sample of data of length M, with N>>M

I need to find where the sample data best fits on the template data - so I
am using the FFT method of cross correlation to find the timeshift which
best matches the data.  Here is the problem:

It seems that by this description of the cross correlation:

"You slide one function over the other, one sample at a time, and at
each slid point you:

- multiply corresponding samples of each function
- sum all those products

Summing the products is like taking the area under the resulting curve
of the products.

Now, if the signals are similar, and unslid, then their products will
be like squaring one signal - which will all be positive going, so the
area under this curve will all add (constructive interference -
remember physics of waves?) and so will be big. If the signals are not
similar, or are slid, then some positive signal samples will multiply
negative signal samples, so some of the resulting product samples will
be negative, and these negative going parts of the curve will subtract
from the positive parts (destructive interference) so the area will be
less."


that if the signals are of different magnitude, they can still have a very
good correlation.  Take for example the plot shown here:

http://www.personal.psu.edu/rdm186/random_images/phase_mag.jpg

The red section is the template data, and the green/blue sections are the
sample data, timeshifted.  The green section is at the timeshift corresponding
to the highest correlation via the FFT method, whereas the blue section is
where it actually fits best.  You can see that the phases are VERY
similar in these two sections of the template; however, the section where
the FFT returned the highest correlation has considerably different
magnitudes.  However, from the above description, it makes sense that the
FFT would return this as the highest correlation, because the
multiplication of the two data sets at this section of the curve would
yield a higher product than at the 'correct' section of the curve - since
the template data has a higher magnitude at this point.


Is there a way to consider magnitudes with xcorrelation?
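
One possible way to make magnitude count, offered as an assumption and not as the method from any reply in this thread, is to score each offset by the sum of squared differences (SSD) between the sample and the template section. Expanding the square gives SSD(k) = E_section(k) - 2*xcorr(k) + E_sample, so it is the raw cross-correlation corrected by the energy of each section. A minimal sketch with made-up data and hypothetical function names:

```python
def raw_xcorr(template, sample):
    """Raw sliding cross-correlation (sample fully inside the template)."""
    m = len(sample)
    return [sum(template[k + i] * sample[i] for i in range(m))
            for k in range(len(template) - m + 1)]

def sum_squared_diff(template, sample):
    """Sum of squared differences at each offset; smaller means a better fit."""
    m = len(sample)
    return [sum((template[k + i] - sample[i]) ** 2 for i in range(m))
            for k in range(len(template) - m + 1)]

# Same shape appears twice: doubled in size at offset 1, true size at offset 5.
template = [0, 2, 4, 2, 0, 1, 2, 1, 0]
sample = [1, 2, 1]

xc = raw_xcorr(template, sample)
ssd = sum_squared_diff(template, sample)

print(max(range(len(xc)), key=xc.__getitem__))    # 1: largest raw correlation
print(min(range(len(ssd)), key=ssd.__getitem__))  # 5: smallest squared error
```

Because of the identity above, the correlation term could still be computed with an FFT, with the section energies obtained from a running sum of the squared template.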