
autocorrelogram have bias?

Started by rune2earth March 24, 2008
Hello 

I'm working on analyzing neuroscience data using the autocorrelation
function. I discovered that it has a negative bias (for all lags except
zero lag), that exactly matches up so that the sum of all is zero. This is
in disagreement with what most textbooks write, for example this taken from
wikipedia (autocorrelation, under properties):

"The autocorrelation of a continuous-time white noise signal will have a
strong peak (represented by a Dirac delta function) at τ = 0 and will be
absolutely 0 for all other τ."

Well, I work in discrete time, so maybe the above is still true, but in
discrete time it would be false. I worked out a simple proof of this.
The (normalized, biased) correlation estimate for any sequence y_t is
defined as

g_k = (1/n) * sum_{t=k+1}^{n} y_t * y_{t-k}

Furthermore, the mean value of the sequence is zero:

sum_{t=1}^{n} y_t = 0

Then we square this equation and divide by n:

(1/n) * [ sum_{t=1}^{n} y_t ]^2
  = (1/n) * sum_{t=1}^{n} y_t^2
  + (2/n) * sum_{k=1}^{n-1} sum_{t=k+1}^{n} y_t * y_{t-k} = 0

The first term is g_0 and we identify the double sums as the correlation
coefficients for k > 0:

g_0 + 2 * sum_{k=1}^{n-1} g_k = 0

Since g_0 = 1 (by the normalization of the autocorrelation), the sum of
the remaining coefficients must be

sum_{k=1}^{n-1} g_k = -1/2 .


Now, if we include both negative and positive time lags, we see that the
total sum of the autocorrelation is zero, regardless of the type of data:
random points, pink noise, a cosine, whatever. So for white noise, the
estimated autocorrelation is NOT zero for lags k > 0.
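To check this numerically, here is a minimal sketch in Python/NumPy (a
stand-in for MATLAB; the estimator is the biased, normalized one defined
above, with the sample mean subtracted so the zero-mean condition holds
exactly):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
y = rng.standard_normal(n)
y = y - y.mean()            # enforce zero sample mean, as in the derivation

# biased autocorrelation estimate, normalized so that g[0] = 1
g = np.array([np.dot(y[k:], y[:n - k]) for k in range(n)]) / np.dot(y, y)

print(g[0])            # exactly 1 by construction
print(g[1:].sum())     # -0.5 up to rounding, as derived above
```

The sum over the positive lags comes out at -1/2 for any data with zero
sample mean, exactly as the derivation predicts.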

Am I right about this, or did I make a mistake?

thanks, 

Rune




>Hello
>
>I'm working on analyzing neuroscience data using the autocorrelation
>function. I discovered that it has a negative bias (for all lags except
>zero lag), that exactly matches up so that the sum of all is zero.
-- snip --
>Am i right about this, or did i make a mistake?
>
>thanks,
>
>Rune
***********************************************************************
Hello Rune,

I worked on your proof, whose implications are indeed very contradictory,
and I think I found where the mistake is.

First of all, the formal definition of the autocorrelation sequence is
r[k] = E{y[n]y[n-k]}, that is, the expectation operator is used. When
instead we use samples of data, just as you did, we can only approximate
the autocorrelation sequence from the definition formula that you give.

Now, let y be zero-mean, unit-variance white noise and let
{y[0], y[1], ..., y[N-1]} be N samples of this process.

What is wrong with your proof is the following statement of yours:

"Furthermore, the mean value of the sequence is zero:
>Sum_{t=1}{N} y_t = 0"

Don't forget that you use samples and not random variables. The mean value
of the samples is the above expression divided by N! Furthermore, this
mean value is not zero, but just a small number close to zero, and its
numerator is by no means zero (you can try that in MATLAB).

So, if I square the expression (1/N)*Sum_{t=1}{N} y_t and require it to be
a small number close to zero, and follow your steps, I end up with

(1/N)*g_0 + 2*sum_{k=1}^{n} g_k = very small   (1)

instead of

g_0 + 2*sum_{k=1}^{n} g_k = 0   (2)

So g_0 has a factor 1/N in the equation and hence is very small, and
consequently (1) is something like an identity.

In summary, while doing your computations you ignored the division by N
and assumed that the sum of the white noise samples is zero, which it is
not.

Do you see it?

Manolis
A small correction: (1) should read

(1/N)*{g_0 + 2*sum_{k=1}^{n} g_k} = very small   (1)

that is, the 1/N factor also multiplies the other autocorrelation terms.

Manolis
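Manolis's point — that the sample sum of raw noise is small but not
zero — follows from an exact identity behind the whole derivation: the
two-sided sum of the unnormalized sample ACF equals the squared sum of
the samples. A short Python/NumPy sketch (my notation, not Manolis's):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
y = rng.standard_normal(n)       # raw white noise: sample sum is small, not zero

# unnormalized sample ACF at all lags -(n-1)..(n-1)
r = np.correlate(y, y, mode="full")

# exact identity: the two-sided sum of the ACF equals (sum of samples)^2
print(r.sum(), y.sum() ** 2)     # equal up to rounding
```

So the two-sided ACF sum is zero only if the sample sum is exactly zero,
which holds when the sample average has been subtracted and not otherwise.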
rune2earth wrote:
> Hello
>
> I'm working on analyzing neuroscience data using the autocorrelation
> function. I discovered that it has a negative bias (for all lags except
> zero lag), that exactly matches up so that the sum of all is zero. This is
> in disagreement with what most textbooks write, for example this taken from
> wikipedia (autocorrelation, under properties):
>
> "The autocorrelation of a continuous-time white noise signal will have a
> strong peak (represented by a Dirac delta function) at τ = 0 and will be
> absolutely 0 for all other τ."
-- snip --
> Am i right about this, or did i make a mistake?
>
> thanks,
>
> Rune
If I read your post correctly, you are mistaken in the meaning of "white
noise". The definition of sampled-time white noise is that every sample is
a zero-mean random variable that is independent of all other samples. This
being the case, the expected value of the product of any two samples is
identically zero, while the expected value of the square of a sample is
some positive value.

Your data is not white, therefore it doesn't have the autocorrelation of
white noise.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" gives you just what it says.
See details at http://www.wescottdesign.com/actfes/actfes.html
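Tim's distinction between the ensemble expectation E{y[n]y[n-k]} and an
estimate along one finite record can be illustrated numerically. A sketch
in Python/NumPy (sizes and seed are arbitrary choices): averaging across
many independent realizations, the product of samples at different lags
really does average to zero.

```python
import numpy as np

rng = np.random.default_rng(3)
trials, n = 5000, 64
Y = rng.standard_normal((trials, n))   # 5000 independent white-noise records

# ensemble estimate of E{y[t] y[t-k]}: average across independent
# realizations rather than along a single finite record
est = {k: np.mean(Y[:, k:] * Y[:, : n - k]) for k in range(3)}
for k, v in est.items():
    print(k, v)   # close to 1 for k = 0, close to 0 for k > 0
```

The single-record bias discussed in this thread disappears under the
ensemble average, which is exactly the expectation Tim refers to.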


>If I read your post correctly, you are mistaken in the meaning of "white
>noise". The definition of sampled-time white noise is that every sample
>is a zero-mean random variable that is independent of all other samples.
>This being the case, the expected value of the product of any two
>samples is identically zero, while the expected value of the square of a
>sample is some positive value.
>
>Your data is not white, therefore it doesn't have the autocorrelation of
>white noise.
Hi Tim,

The data doesn't have to be white. The result applies for all types of
noise, including white noise. The only condition is that the mean is zero.
If you subtract the sample average from each data point and perform the
ACF estimation written as a finite sum, you come to the conclusion that
the lags are never all clustered around zero, because the sum of all ACF
points has to balance the 1 at zero lag. It comes out of the equations
very simply. Nowhere in the derivation is there a requirement that the
data has to be white or anything else.

I spoke to a mathematician about this and we came to two realizations:

1) People usually derive results about autocorrelation based on INFINITE
data series, and falsely apply them to FINITE data series.

2) When estimating the finite ACF, 90% of the function is not useful
because of the increasing bias for higher lag times.

My point is that too few textbooks are aware of this. And even fewer
point out how to circumvent this increasing bias. I'm surprised about
this, because the autocorrelation function is a very common function.

regards,

Rune
Hi Manolis

Yes, I see your point. But you still get that bias when you use the most
common estimators of autocorrelation functions. If I generate white noise
data points in MATLAB and calculate the ACF and sum over all points, they
always come out almost equal to zero (from -N to N). Keep in mind that the
ACF is 1 at zero lag, thus the rest of the points have to have a slight
negative bias, all adding up to -1. So the more data points you include,
the more this -1 is "distributed" over many points, and thus the bias per
data point is smaller. That's part of the reason it is better to have
larger data sets than smaller ones.

You see the point? I encourage you to try it with MATLAB data.
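For what it's worth, the per-lag negative bias shows up clearly in a
Monte Carlo average; a sketch in Python/NumPy (in place of MATLAB; trial
count and record length are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
trials, n = 4000, 50
g1 = np.empty(trials)

for i in range(trials):
    y = rng.standard_normal(n)
    y = y - y.mean()                                 # subtract the sample average
    g1[i] = np.dot(y[1:], y[:-1]) / np.dot(y, y)     # normalized ACF at lag 1

# white noise has true autocorrelation 0 at lag 1, yet the sample
# estimate averages slightly negative, on the order of -1/n
print(g1.mean())
```

Any single record scatters widely around zero; the negative bias only
becomes obvious once many records are averaged, which is why it is easy
to overlook.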

 best regards,

Rune



On Apr 28, 1:29 am, "rune2earth" <r...@mfi.ku.dk> wrote:

> ...
>
> I spoke to a mathmatician about this and we came to two realizations,
>
> 1) People usually derive results about autocorrelation based on
> INIFINITE data series, and falsely apply them to FINITE data series.
And continuous to discrete. And Fourier transforms as well as autocorrelation. These are things that keep comp.dsp going.
> 2) When estimating the finite ACF 90% of the function is unuseful because
> of the increasing bias for higher lag times.
>
> My point is that too few textbooks are aware of this. And even less so,
> they point out how to circumvent this increasing bias. I'm surprised about
> this, because autocorrelation function is a very common function.
>
> regards,
>
> Rune
The ACF and power spectral density are only a transform apart, so they are
equally common in the world. Before the rise in computational power and
familiarization with the FFT, the ACF was commonly taught and used; the
power spectrum was too much bother unless it couldn't be avoided. Now the
teaching runs the other way, and the ACF is only generated from the power
spectrum if it can't be avoided.

Textbooks don't explain the biases in the discrete ACF because they don't
even include the ACF as an important function.

Dale B. Dalrymple
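The "only a transform apart" remark is the Wiener-Khinchin relation. A
quick Python/NumPy sketch showing the (circular) sample ACF recovered
from the power spectrum:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 256
y = rng.standard_normal(n)

# ACF from the power spectrum: inverse FFT of |FFT|^2 (circular lags)
r_fft = np.fft.ifft(np.abs(np.fft.fft(y)) ** 2).real

# the same quantity computed directly in the time domain
r_direct = np.array([np.dot(y, np.roll(y, -k)) for k in range(n)])

print(np.allclose(r_fft, r_direct))   # True
```

Note this gives the circular autocorrelation; zero-padding y to length
2n before the FFT yields the linear (non-wrapping) lags instead.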