Efficient Moving Average and Moving Variance Calculations

Started by John E. Hadstate May 18, 2008
Steven Smith in "Digital Signal Processing" describes an
efficient algorithm for computing a moving average.  This
algorithm is also mentioned in the Wikipedia article describing
Moving Average:

http://en.wikipedia.org/wiki/Moving_average

Rick Lyons once asked in this newsgroup about an efficient
algorithm for computing "moving variance":

http://groups.google.com/group/comp.dsp/browse_frm/thread/330ac90a92f8dfaf/02a3b89dcf21fdcc?hl=en&lnk=st&q=variance+group%3Acomp.dsp+author%3AHadstate#02a3b89dcf21fdcc

With minimal effort, one can modify the "Moving Average"
algorithm to efficiently compute a "Moving Variance" and a
"Moving Average" simultaneously at each time step.

Notice that one of the equations for computing a sample
variance over a window with N samples can be written as:


    N * Sum(X**2) - (Sum(X)**2)
V = ---------------------------
            N * (N - 1)


where X is the input.

To implement this efficiently, allocate two history buffers,
one for values of X and one for values of X**2, each containing
room for N points.  These buffers need to be initialized,
perhaps to the first sample of X and X**2 or perhaps to zero,
your choice.  Then initialize two variables, SX1 to be the
Sum(elements in X history buffer) and SX2 to be Sum(elements in
X**2 history buffer).

Then, at each time-step k, compute:

X1 = (new sample value)
X2 = X1 * X1

Y1 = (oldest X1 value from X1 history buffer)
Y2 = (oldest X2 value from X2 history buffer)

Overwrite oldest value in X1 history buffer with X1.
Overwrite oldest value in X2 history buffer with X2.

SX1(k) = SX1(k-1) + X1 - Y1
SX2(k) = SX2(k-1) + X2 - Y2

Calculate Moving Average:
M = SX1 / N

Calculate Moving Variance:
V = (N * SX2 - (SX1 * SX1)) / (N * (N - 1))
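
For concreteness, here is a minimal C sketch of the recipe above. The names (mv_state, mv_init, mv_update) are mine, not from the post, and the buffers are initialized to zero, so the first N outputs reflect that start-up choice:

    #include <stdio.h>

    #define N 8  /* window length; any N >= 2 works */

    /* State for the simultaneous moving average / moving variance. */
    typedef struct {
        double x1[N];   /* history of X           */
        double x2[N];   /* history of X**2        */
        double sx1;     /* running Sum(X)         */
        double sx2;     /* running Sum(X**2)      */
        int    head;    /* index of oldest sample */
    } mv_state;

    /* Initialize both history buffers (here: to zero) and the sums. */
    static void mv_init(mv_state *s)
    {
        for (int i = 0; i < N; i++) { s->x1[i] = 0.0; s->x2[i] = 0.0; }
        s->sx1 = 0.0; s->sx2 = 0.0; s->head = 0;
    }

    /* One time step: push sample x, return mean in *m, variance in *v. */
    static void mv_update(mv_state *s, double x, double *m, double *v)
    {
        double x2 = x * x;
        double y1 = s->x1[s->head];      /* oldest X    */
        double y2 = s->x2[s->head];      /* oldest X**2 */

        s->x1[s->head] = x;              /* overwrite oldest with newest */
        s->x2[s->head] = x2;
        s->head = (s->head + 1) % N;

        s->sx1 += x - y1;                /* SX1(k) = SX1(k-1) + X1 - Y1 */
        s->sx2 += x2 - y2;               /* SX2(k) = SX2(k-1) + X2 - Y2 */

        *m = s->sx1 / N;
        *v = (N * s->sx2 - s->sx1 * s->sx1) / ((double)N * (N - 1));
    }

    int main(void)
    {
        mv_state s;
        double m, v;
        mv_init(&s);
        for (int k = 0; k < 20; k++) {
            mv_update(&s, (double)(k % 5), &m, &v);
            printf("k=%2d  mean=%6.3f  var=%6.3f\n", k, m, v);
        }
        return 0;
    }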



Cool!
On Sun, 18 May 2008 08:28:51 -0400, "John E. Hadstate"
<jh113355@hotmail.com> wrote:

> Notice that one of the equations for computing a sample
> variance over a window with N samples can be written as:
>
>     N * Sum(X**2) - (Sum(X)**2)
> V = ---------------------------
>             N * (N - 1)
>
> where X is the input.
(snipped by Lyons)

Hi John,

Your above equation for "V" looks correct to me.
(For whatever that's worth.)

Regards,
[-Rick-]
On May 19, 12:28 am, "John E. Hadstate" <jh113...@hotmail.com> wrote:
> (snip)
>
> Calculate Moving Average:
> M = SX1 / N
>
> Calculate Moving Variance:
> V = (N * SX2 - (SX1 * SX1)) / (N * (N - 1))
A simpler way is to do this:

s(k+1) = beta*s(k) + (1-beta)*u^2(k)

where beta is a forgetting factor less than unity, u is the input and s is the variance.

The correct equation for recursive variance can be found by adding an extra sample to the batch variance:

s(k+1) = s(k) + (1/(k+1))*[u^2(k) - s(k)]

if I remember right, which converges for stationary u as k -> infinity. No use for tracking though - you need the forgetting factor version or a moving window, as has been explained. There is one more method which estimates the inverse of the variance, should you need that one!

K.
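
To make the two recursions concrete, here is a minimal C sketch of both updates. The value of beta and the toy input are arbitrary choices of mine; note also that s here is strictly an exponentially weighted mean square, which equals the variance only for zero-mean u (a point taken up later in the thread):

    #include <stdio.h>

    int main(void)
    {
        const double beta = 0.95;   /* forgetting factor, 0 < beta < 1 */
        double s_ff  = 0.0;         /* exponentially weighted mean square */
        double s_rec = 0.0;         /* exact recursive mean square        */

        for (int k = 0; k < 100; k++) {
            double u = (k % 2) ? 1.0 : -1.0;   /* toy zero-mean input */

            /* forgetting-factor version:
               s(k+1) = beta*s(k) + (1-beta)*u^2(k) */
            s_ff = beta * s_ff + (1.0 - beta) * u * u;

            /* exact recursive version: the gain 1/(k+1) shrinks to zero */
            s_rec += (u * u - s_rec) / (k + 1);
        }
        printf("forgetting-factor: %f   recursive: %f\n", s_ff, s_rec);
        return 0;
    }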
<kronecker@yahoo.co.uk> wrote in message 
news:3e0aadd8-6179-444b-a81d-ec7ab37f9986@d19g2000prm.googlegroups.com...
> On May 19, 12:28 am, "John E. Hadstate"
> <jh113...@hotmail.com> wrote:
>> (snip)
>
> A simpler way is to do this....
>
> s(k+1)=beta*s(k)+(1-beta)*u^2(k)
>
> where beta is a forgetting factor less than unity.
>
> u is the input and s is the variance.
Caveat Emptor!! This is not "a simpler way to do this". The method described by Hadstate computes what everyone in the world agrees is the sample variance for a population size of N, and correctly evaluates the sample variance in that population as new measurements replace old ones. The method of "kronecker@yahoo.co.uk" will only accidentally produce the same results.

The method described by Hadstate can be improved at the cost of one additional squaring per sample, but with the benefit of eliminating the need for the X**2 history buffer. If implemented in scaled fixed-point arithmetic, each word of the X**2 buffer should be twice as wide as the words of the X history buffer, so this could represent a significant savings in memory.
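
John doesn't spell out the improved variant, but a plausible reading is: keep only the X history, and when the oldest X is retired, square it to recover the outgoing X**2 term. A sketch under that assumption (names again mine), reusing the update structure from the first post:

    #include <stdio.h>

    #define N 8

    /* Single-buffer variant: only the X history is stored; the outgoing
       X**2 term is recovered as y1*y1, one extra squaring per sample. */
    typedef struct { double x1[N], sx1, sx2; int head; } mv1_state;

    static void mv1_update(mv1_state *s, double x, double *m, double *v)
    {
        double y1 = s->x1[s->head];        /* oldest X */
        s->x1[s->head] = x;                /* overwrite with newest */
        s->head = (s->head + 1) % N;
        s->sx1 += x - y1;
        s->sx2 += x * x - y1 * y1;         /* extra square replaces the X**2 buffer */
        *m = s->sx1 / N;
        *v = (N * s->sx2 - s->sx1 * s->sx1) / ((double)N * (N - 1));
    }

    int main(void)
    {
        mv1_state s = {{0}, 0.0, 0.0, 0};  /* zero-initialized history */
        double m, v;
        for (int k = 0; k < 20; k++) {
            mv1_update(&s, (double)(k % 5), &m, &v);
            printf("k=%2d  mean=%6.3f  var=%6.3f\n", k, m, v);
        }
        return 0;
    }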
On May 19, 3:10 am, "John E. Hadstate" <jh113...@hotmail.com> wrote:
> <kronec...@yahoo.co.uk> wrote in message
> news:3e0aadd8-6179-444b-a81d-ec7ab37f9986@d19g2000prm.googlegroups.com...
>
> Caveat Emptor!! This is not "a simpler way to do this". The
> method described by Hadstate computes what everyone in the
> world agrees is the sample variance for a population size of N,
> and correctly evaluates the sample variance in that population
> as new measurements replace old ones. The method of
> "kronec...@yahoo.co.uk" will only accidentally produce the same
> results.
What people usually want is the variance of the population, not the sample variance for a population of size N. The sample variance for a population of size N "will only accidentally produce the same results", since the sample variance is only an estimator of the variance. That is often a useful approximation. So is the exponentially averaged version.
> The method described by Hadstate can be improved at the cost of
> one additional squaring per sample, but with the benefit of
> eliminating the need for the X**2 history buffer. If
> implemented in scaled fixed-point arithmetic, each word of the
> X**2 buffer should be twice as wide as the words of the X
> history buffer, so this could represent a significant savings
> in memory.
Even this improvement still requires N memory elements instead of the 2 required for the exponentially averaged version, so exponential averaging could represent a significant savings in memory. In systems where hundreds of channels are decomposed into thousands of bins and each bin has this calculation performed, the savings is significant.

The use of exponential averaging to calculate signal and background parameters was common in signal analyzers in the last half of the last century, when memory prices were considerably greater than they are today.

Dale B. Dalrymple
http://dbdimages.com
On May 19, 10:10 pm, "John E. Hadstate" <jh113...@hotmail.com> wrote:
> <kronec...@yahoo.co.uk> wrote in message
> news:3e0aadd8-6179-444b-a81d-ec7ab37f9986@d19g2000prm.googlegroups.com...
>
> (snip)
>
> Caveat Emptor!! This is not "a simpler way to do this". The
> method described by Hadstate computes what everyone in the
> world agrees is the sample variance for a population size of N,
> and correctly evaluates the sample variance in that population
> as new measurements replace old ones. The method of
> "kronec...@yahoo.co.uk" will only accidentally produce the same
> results.
Which method? The second method (not exponential smoothing) IS exact! It is called recursive variance in the literature, just as we can get a recursive mean in a similar way. I assume here that the signal is stationary. If it isn't, then essentially it is no longer variance but time-varying variance. Take your pick of the many methods; then I agree the moving window approaches come into their own.

K.
On May 19, 7:22 pm, kronec...@yahoo.co.uk wrote:
> On May 19, 10:10 pm, "John E. Hadstate" <jh113...@hotmail.com> wrote:
> > > A simpler way is to do this....
> > >
> > > s(k+1)=beta*s(k)+(1-beta)*u^2(k)
> > >
> > > where beta is a forgetting factor less than unity.
> >
> > Caveat Emptor!! This is not "a simpler way to do this".
> > (snip)
> > The method of "kronec...@yahoo.co.uk" will only accidentally
> > produce the same results.
it won't produce the same results. it's a different formula. it is computing a *different* weighted mean and mean-square.
> Which method? The second method (not exponential smoothing) IS exact!
> It is called recursive variance in the literature, just as we can get
> a recursive mean in a similar way. I assume here that the signal is
> stationary. If it isn't, then essentially it is no longer variance but
> time-varying variance. Take your pick of the many methods; then I
> agree the moving window approaches come into their own.
i don't see how your "second method" (if i understand what you mean by the term) is "not exponential smoothing". i *do* see how

    y[n] = beta*y[n-1] + (1-beta)*(x[n])^2

gives you a form of a moving mean square (with exponential weighting). the averaging is done with an IIR filter with DC gain = 1, and what John H says is what "everyone in the world agrees is the sample variance for a population size of N" is an FIR filter with DC gain of *almost* 1 (but a little more, to compensate for the bias in the estimator of variance using that weighting method). both are weighted averages, but they apply different weightings to the averaging function. your "beta" has for a counterpart "N" in John H's thing.

the impulse responses of the two averagers are:

    yours:   h[n] = (1-beta) * beta^n * u[n]

    John's:  h[n] = 1/(N-1) * (u[n] - u[n-N])

where u[n] is the unit step function, 1 for non-negative n and 0 for negative n.

now both of your impulse responses have a mean length. for yours it is 1/(1-beta) and for John's it is (N-1)/2. if you were to equate those, you would have a rough comparison: define

    beta = 1 - 2/(N-1)    or    N = 2/(1-beta) + 1

now, to unbias the estimator, John has to take the naive mean of the variances and multiply it by N/(N-1) to get the correct estimate of variance. so my question to you, K, is do you have to also unbias your estimate of variance by multiplying by an equivalent factor, perhaps by 1 + (1-beta)/2 ? that's something to think about.

not to denigrate John or anything, but i had in grad school (3 decades ago) come up with the same basic method for moving estimates of variance and mean of real-time data, and i implemented it on a 6800 based system (lotsa "add-with-carry" instructions for quintuple precision arithmetic). i also considered a first-order IIR averager like you did, K, and i would have to dig out some very old school papers, but i think i came up with the need to scale the estimator of variance by a factor similar to the above. have you considered this issue?

like the FIR moving average, it doesn't make much difference if N is very large or if beta is very close to 1.

r b-j
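
A small C check of the correspondence proposed above. It is worth noting that with beta = 1 - 2/(N-1), the conjectured factor 1 + (1-beta)/2 works out to exactly N/(N-1), the same unbiasing factor as the FIR case:

    #include <stdio.h>

    int main(void)
    {
        /* match the mean impulse-response lengths:
           1/(1-beta) for the IIR averager, (N-1)/2 for the length-N FIR */
        for (int N = 4; N <= 64; N *= 2) {
            double beta = 1.0 - 2.0 / (N - 1);
            double bias = 1.0 + (1.0 - beta) / 2.0;  /* conjectured factor */
            printf("N=%3d  beta=%8.5f  1+(1-beta)/2=%8.5f  N/(N-1)=%8.5f\n",
                   N, beta, bias, (double)N / (N - 1));
        }
        return 0;
    }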
"robert bristow-johnson" <rbj@audioimagination.com> wrote in 
message 
news:e666e54b-6a71-49d0-9938-5fe5ed895dab@m44g2000hsc.googlegroups.com...

> not to denigrate John or anything, but i had in grad school (3 decades
> ago) come up with the same basic method for moving estimates of
> variance and mean of real-time data, and i implemented it on a 6800
> based system (lotsa "add-with-carry" instructions for quintuple
> precision arithmetic).
I just re-invented this method for the umpteenth time. I don't claim original sin; somebody else showed me the trick back when I was an undergrad. My main purpose in posting it here, aside from making it accessible to those who haven't seen it, is to put it someplace where I can find it the next time I need it. Maybe Rick will put it into the next edition of his book. By the time it comes out, I might be wealthy enough to buy my own copy ;-)
kronecker@yahoo.co.uk writes:

> Which method? The second method (not exponential smoothing) IS exact!
> It is called recursive variance in the literature, just as we can get
> a recursive mean in a similar way. I assume here that the signal is
> stationary. If it isn't, then essentially it is no longer variance but
> time-varying variance. Take your pick of the many methods; then I
> agree the moving window approaches come into their own.
You are correct, the second method is exact. The problem is that it effectively "turns off" updates as time proceeds (1/(k+1) -> 0). That can be bad if the system has some sort of start-up transient (or, as you say, the system is not stationary).

Ciao,

Peter K.

--
"And he sees the vision splendid of the sunlit plains extended
And at night the wondrous glory of the everlasting stars."
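
A small numerical illustration of this point (the input levels, beta, and change-point below are arbitrary choices of mine): when the input variance steps up partway through, the exact recursive estimate has essentially stopped adapting, while the forgetting-factor version tracks the change.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const double beta = 0.99;
        double s_rec = 0.0, s_ff = 0.0;

        srand(1);
        for (int k = 0; k < 20000; k++) {
            /* zero-mean input whose amplitude steps up at k = 10000 */
            double a = (k < 10000) ? 1.0 : 3.0;
            double u = a * (2.0 * rand() / RAND_MAX - 1.0);

            s_rec += (u * u - s_rec) / (k + 1);      /* gain -> 0: stops adapting */
            s_ff   = beta * s_ff + (1.0 - beta) * u * u;

            if (k == 9999 || k == 19999)
                printf("k=%5d  recursive=%6.3f  forgetting-factor=%6.3f\n",
                       k, s_rec, s_ff);
        }
        return 0;
    }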