
Question about computing a "moving variance"

Started by Rick Lyons February 27, 2005
"Rick Lyons" <r.lyons@_BOGUS_ieee.org> wrote in message 
news:4221b997.481221828@news.sf.sbcglobal.net...
> .................... What occurred
> to me was that it might be useful to figure out how
> to build a network that computes an
> "N-point moving variance".  That is, a network where
> each new y(n) output is the variance of the
> current x(n) input sample and the N-1 previous input
> samples.
Rick,

I don't know either, but it brought to mind an old "analogue"
implementation.  The objective was to measure the signal arrival angle
from an array.  The array was split into two halves, and sum and
difference signals were computed.  The difference signal is proportional
to the angle of arrival.  The question was: "is the difference signal
currently indicating an angle of arrival from a consistent source, or is
it just noise?"  The solution was to look at the absence of variance over
a short time - which is similar to your question.  So, the data was
high-pass filtered and detected / rectified, which would be similar to
taking the sum of squares.  The absence of energy in this component
indicated low variance and, thus, a likely valid angle of arrival.  The
presence of energy in this component would be an indicator of the
variance.

Does that suggest anything?  High-pass filter followed by sum of squares?

Take a look at the companion page:
http://www.mnstate.edu/wasson/ed602calcvardevs.htm

Here the variance is calculated by taking the sum of the squared
differences between the samples and their mean (properly weighted).
Subtracting the mean of the samples (the output of a lowpass filter) from
the samples is like highpass filtering the samples with whatever time
constant / bandwidth you care to choose.  So, the variance is a sum of
squares of the output of a highpass filter, thereafter averaged /
lowpassed.

I don't know if this is any different from what anyone else said (in
terms of math and resulting filter structure), but I like this way of
looking at it for insight.

I guess they look like this, and you can change the configuration around
as you will:

[ASCII block diagram, flattened in the archive: an N-tap delay line
computing the FIR highpass with weights (N-1)/N, -1/N, -1/N, ..., -1/N,
feeding a |.|^2 squarer, followed by an N-tap moving averager with
weights 1/N that produces y[k].]

So, with a modification to your IIR trick, it might look like this if I
didn't make a mistake.  The first section is highpass followed by
squaring - pretty similar to the averager.  The unit sample response of
the highpass is:

   (N-1)/N, -1/N, -1/N, -1/N, ........     with a zero at f=0

[ASCII block diagram, flattened in the archive: a recursive comb (z^-N)
and accumulator (z^-1) realization of that highpass using the gains
(N-1)/N and -1/N, then the |.|^2 squarer, then the same comb-accumulator
structure with gain 1/N as the averager, producing y(n).]

The last stage is the averager.

Don't ask *me* if it's a good numerical approach......

Fred
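[Editor's sketch] For concreteness, here is a minimal NumPy sketch of the
chain Fred describes - mean-removal highpass, squarer, N-point averager.
The code and names are mine, not from the thread, and note one caveat:
each squared deviation here is taken from its own window's mean, so this
is a close relative of (not identical to) the textbook N-point sample
variance.

    import numpy as np

    def fred_moving_variance(x, N):
        """Highpass (subtract N-point running mean), square, average."""
        x = np.asarray(x, dtype=float)
        # FIR highpass: h = [(N-1)/N, -1/N, ..., -1/N], N taps, zero at f=0
        h = np.full(N, -1.0 / N)
        h[0] = (N - 1.0) / N
        d = np.convolve(x, h)[:len(x)]   # x[n] minus mean of last N samples
        # square, then the final N-point moving average (the 1/N stage)
        y = np.convolve(d * d, np.full(N, 1.0 / N))[:len(x)]
        return y

For zero-mean white noise input, y settles near the input's variance once
the delay lines fill.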
robert bristow-johnson <rbj@audioimagination.com> writes:
> [...]
> but
>
> E{ S^2 } = (N-1)/N * E{ (x[n]-m)^2 } = (N-1)/N * sigma^2
How is this derived?
--
% Randy Yates                  % "Bird, on the wing,
%% Fuquay-Varina, NC           %  goes floating by
%%% 919-577-9882               %  but there's a teardrop in his eye..."
%%%% <yates@ieee.org>          % 'One Summer Dream', *Face The Music*, ELO
http://home.earthlink.net/~yatescr
in article sm3h9yth.fsf@ieee.org, Randy Yates at yates@ieee.org wrote on
02/27/2005 21:56:

> robert bristow-johnson <rbj@audioimagination.com> writes:
>> [...]
>> but
>>
>> E{ S^2 } = (N-1)/N * E{ (x[n]-m)^2 } = (N-1)/N * sigma^2
>
> How is this derived?
oh, it's kinda a bitch.  i think it's only true under the assumption that
the samples (as we DSPers define "samples") are statistically independent
of each other.  that is,

   if n <> i, then  E{ x[n]*x[i] } = E{ x[n] } * E{ x[i] } = m^2

if x[n] and x[i] are controlled by the same statistics.

   if n = i, then  E{ x[n]*x[i] } = E{ x[n]^2 } = sigma^2 + m^2

i think with these definitions:

                  _         N-1
  est. mean:      x = (1/N)*SUM{ x[n] }
                            n=0

                              N-1          _
  est. variance:  S^2 = (1/N)*SUM{ (x[n] - x)^2 }
                              n=0

you can finagle the summations in S^2 to be

               N-1              _
  S^2 = (1/N)*SUM{ (x[n])^2 } - ( x )^2
               n=0

               N-1                 N-1             N-1
      = (1/N)*SUM{ x[n]^2 } - (1/N)*SUM{x[n]} * (1/N)*SUM{x[i]}
               n=0                 n=0             i=0

               N-1                      N-1  N-1
      = (1/N)*SUM{ x[n]^2 } - (1/N^2) * SUM{ SUM{ x[n]*x[i] } }
               n=0                      n=0  i=0

there are no expectations yet.  now when you apply the expectation,

                 N-1                           N-1  N-1
  E{S^2} = (1/N)*SUM{ E{x[n]^2} } - (1/N^2) * SUM{ SUM{ E{x[n]*x[i]} } }
                 n=0                           n=0  i=0

you get

  E{S^2} = (1/N)*(N*E{x[n]^2})
           - (1/N^2)*( (N^2-N)*E{x[n]*x[i]} + N*E{x[n]^2} )   (for n <> i)

in that double summation, there are N^2 terms, of which for N terms
i = n.  so there are N^2 - N terms where i <> n.  finally,

  E{S^2} = (1/N)*N*(sigma^2 + m^2)
           - (1/N^2)*( (N^2-N)*m^2 + N*(sigma^2 + m^2) )

         = (sigma^2 + m^2) - (1/N^2)*( N^2*m^2 + N*sigma^2 )

         = sigma^2 - (1/N^2)*( N*sigma^2 )

         = (N-1)/N * sigma^2

you can see that if you grab only one sample of a random variable, the
apparent estimate of variance (from that singular sample) is 0, no matter
what the probabilistic variance really is.  so a single sample might give
you an idea of the mean, if you're lucky, but you can't be lucky enough
for that single sample to give you any idea of the variance.  so getting
more samples can only increase your expectation of getting the variance.

this makes sense to me, but what doesn't (to me) is what happens if you
take a group of samples that is one less than the entire population: you
still use that biased formula, but if it's the entire population, you
don't.  i guess it still makes sense if the population is infinite, so
any finite subset is still a sample and the estimated variance needs to
be bumped up a little to correct for the bias.  but it still bothers me
when you flip the switch from using 1/(N-1) to using 1/N.  it seems
unnatural.

anyway, getting back to Rick's implementation, i see no reason why you
can't run a CIC filter on x[n] to get an estimate of N*E{x[n]} (the
moving sum) and another CIC filter (of the same length) to get an
estimate of N*E{x[n]^2} (the moving sum of squares), and from that you
can get the biased estimate of variance:

  S^2 = 1/N * (N*E{x[n]^2}) - ( 1/N * (N*E{x[n]}) )^2

and then you gotta multiply by N/(N-1) to unbias the estimator:

  N/(N-1) * S^2 = 1/(N-1) * ( N*E{x[n]^2} - 1/N * ( N*E{x[n]} )^2 )

Rick, wouldn't that be the final expression of a "moving variance"
algorithm?

--
r b-j                  rbj@audioimagination.com
"Imagination is more important than knowledge."
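[Editor's sketch] A direct rendering of that recipe in Python (my own
illustration - the function name is mine, and explicit running sums stand
in for the actual comb-and-integrator CIC sections):

    from collections import deque

    def moving_variance(x, N, unbiased=True):
        window = deque([0.0] * N, maxlen=N)  # last N samples, zeros at start
        s1 = 0.0   # moving sum of x[n]      (first CIC's output)
        s2 = 0.0   # moving sum of x[n]^2    (second CIC's output)
        out = []
        for v in x:
            old = window[0]                   # sample leaving the window
            window.append(v)
            s1 += v - old
            s2 += v * v - old * old
            biased = s2 / N - (s1 / N) ** 2   # (1/N)*sum(x^2) - mean^2
            out.append(biased * N / (N - 1) if unbiased else biased)
        return out

One caveat worth stating: computing sum(x^2)/N minus the squared mean
subtracts two nearly equal quantities when the mean is large compared
with sigma, so it can lose precision in floating point.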
robert bristow-johnson wrote:


Hello Robert,
my comments interspersed

> i've done some moving "power" implementations before which was about the
> same as this.  (i *did* remove DC with a DC block and kept a running sum
> of the last N values of (x[n])^2.)

I've done the same in a digital radio.  Once the data is "debiased", the
variance and power are simply related.
> i think what Clay said is the pretty straight-forward way to do what the
> statistics texts say (keep a running sum of x[n] and x[n]^2 using a CIC
> and compute mean and variance from that) and his reference to
>
> http://www.mnstate.edu/wasson/ed602calcvarraws.htm

I didn't specifically mention the CIC, but that is what I implied.
> reminds me of a question i intended to ask the group a few times from
> long ago.
>
> using some of the language of the webpage with link above, there is the
> "population variance" and the "sample variance".  and there is a
> slightly different formula for the variance of the population vs. the
> sample.
>
> i understand what the mathematics is about regarding this difference (to
> *some* point, but there is still a philosophical problem i have).

Basically it is rooted in using an estimate to calculate another
estimate.  So for the discrete case, the number of degrees of freedom is
reduced by one.  For example, if I have a sample consisting of 16 numbers
taken from a large population of numbers, and I calculate the mean of the
16 numbers, that value is of course an estimate of the mean of the
population from which the 16 numbers came.  So when I then use the 16
numbers and the estimate of the population mean, I have the case of using
an estimate to find another estimate.

Another way to think about it: if I have 16 numbers and the sample mean,
then I have seventeen numbers.  However, given any 15 of the original
numbers and the sample mean, I can easily find the 16th.  So we really
only have 15 degrees of freedom in this example.  So the N-1 case is the
unbiased estimator case for the population variance.
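[Editor's sketch] Clay's sixteen-numbers argument in miniature (the names
here are mine): given the sample mean and any 15 of the 16 values, the
16th is forced, so only 15 degrees of freedom remain.

    import random

    xs = [random.gauss(0.0, 1.0) for _ in range(16)]
    xbar = sum(xs) / 16
    # The "seventeenth number" adds no freedom: any 15 values plus the
    # mean determine the last value exactly.
    x16 = 16 * xbar - sum(xs[:15])
    assert abs(x16 - xs[15]) < 1e-9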
> let x[n] be some random variable with a known mean and known variance:
>
>    mean:      E{ x[n] }       =  integral{ x * p(x) dx }        =  m
>
>    variance:  E{ (x[n]-m)^2 } =  integral{ (x-m)^2 * p(x) dx }  =  sigma^2
>
> where the integral is over all x and p(x) is the p.d.f. of x[n].
>
> now we take a "sample" of these x[n] (that's the statistician's use of
> the term, we DSPers would say "N samples of these x[n]") and, from that
> sample, compute a measure or estimate of the mean and variance in the
> straight way:

I know the terminology seems a little weird at first, but think in terms
of human statistics, where the "population" refers to all humans, and the
"sample" is a subset of all humans.
> (you need a mono-spaced font to view this right.)
>
>                  _         N-1
>   est. mean:     x = (1/N)*SUM{ x[n] }
>                            n=0
>
>                              N-1          _
>   est. variance: S^2 = (1/N)*SUM{ (x[n] - x)^2 }
>                              n=0
>
>                              N-1              _
>                      = (1/N)*SUM{ (x[n])^2 } - ( x )^2
>                              n=0
>
> this computation is valid for the so-called "population mean" and
> "population variance" if N represents the number of *all* of the samples
> of x[n].  in fact, it is not an estimate, but is actually computing the
> actual mean and variance of the entire set of x[n].  but if N is smaller
> than the entire population of x[n], then this is a "sample mean" and
> "sample variance" and there is some risk in getting the value wrong
> because of the choice of which x[n] you use and which x[n] are left out
> of the estimation.
>
>               _
> in that case, x and S^2 are both random variables, and it turns out that
>             _
> the mean of x is the same as the mean of x[n], which is m (cool!).  but
> it turns out that the mean of S^2 (the estimated variance) is a little
> smaller than sigma^2.

In fact, viewing the estimators as random variables is the correct
approach.  The variance of x bar will equal the population variance
divided by the sample size.
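[Editor's sketch] A quick Monte Carlo check of that last statement (my
own code; the sample size and trial count are arbitrary):

    import random

    N, trials, sigma = 16, 100_000, 2.0
    xbars = [sum(random.gauss(0.0, sigma) for _ in range(N)) / N
             for _ in range(trials)]
    m = sum(xbars) / trials
    var_of_xbar = sum((v - m) ** 2 for v in xbars) / trials
    print(var_of_xbar, sigma ** 2 / N)   # both close to 4/16 = 0.25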
>      _
> E{ x } = E{ x[n] } = m
>
> but
>
> E{ S^2 } = (N-1)/N * E{ (x[n]-m)^2 } = (N-1)/N * sigma^2
>
> so, it's a "biased estimator" and we fix that bias by multiplying
> E{ S^2 } by N/(N-1).
>
> now here's what bothers me: it's the discontinuity of the application of
> the bias correction.  first of all, the "biased" estimator works even
> for N=1 (the mean is just the single sample x[0] and the variance is
> zero), but the "unbiased" estimator does not (division by 0).  i guess
> that doesn't bother me too much since, if you had a random variable of
> non-zero sigma, you have to turn that estimated variance of zero into
> the non-zero sigma^2, and dividing by zero seems like a plausible way to
> do it.  :-)
>
> but what really bothers me is that if N is even only 1 less than the
> entire population, then you unbias the variance estimate by multiplying
> by N/(N-1), but if N is the entire population you don't multiply by
> N/(N-1).
>
> let "P" be the size of the entire population.  if N = P-1 you multiply
> by (P-1)/(P-2), but when N = P, you don't multiply by (P-0)/(P-1).  in
> general you always multiply S^2 by
>
>    (P - (P-N))/(P - (P-N) - 1)
>
> except the case when P-N = 0.  then you multiply by 1.
>
> can any of you statistics gurus explain why this discontinuity exists?
> why doesn't the unbiased estimate of variance work for the entire
> population?

Part of the answer lies in what the comments in "Numerical Recipes in C"
say about this.  They say, "if the difference matters, you are probably
up to no good anyway."

But realistically, I think you are changing gears between having a sample
that *is* the population of data and a sample that is a subset of the
population.  For a fun experiment, pick 100 random numbers and call that
our population.  Now simply calculate the population mean and variance
using 100 degrees of freedom.  Next, pick random subsets of this data and
calculate the sample means and variances of each of your samples taken
from the population.  If you use biased estimators, the expectation of
your estimators will be off; compare with using unbiased estimators.

I hope this helps some.

Clay
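[Editor's sketch] Clay's experiment in code (my own; I draw the subsets
with replacement so the draws are i.i.d., matching the independence
assumption in the derivation above):

    import random

    population = [random.gauss(0.0, 1.0) for _ in range(100)]
    mu = sum(population) / len(population)
    pop_var = sum((x - mu) ** 2 for x in population) / len(population)

    n, trials = 8, 100_000
    biased = unbiased = 0.0
    for _ in range(trials):
        s = random.choices(population, k=n)   # i.i.d. draws
        m = sum(s) / n
        ss = sum((x - m) ** 2 for x in s)
        biased += ss / n                      # 1/n estimator
        unbiased += ss / (n - 1)              # 1/(n-1) estimator

    print(pop_var, biased / trials, unbiased / trials)
    # The 1/n average lands near (n-1)/n * pop_var; the 1/(n-1)
    # average lands near pop_var itself.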
> --
> r b-j                  rbj@audioimagination.com
> "Imagination is more important than knowledge."
robert bristow-johnson wrote:

   ...

> can any of you statistics gurus explain why this discontinuity exists?
> why doesn't the unbiased estimate of variance work for the entire
> population?
I don't qualify as a statistical guru, but my son does, so if you give me
a hard time, I'll sic him on you.

The discontinuity is an artifact of being too literal.  The population
variance is defined as the average of the squares of the deviations from
the mean.

                                   1  N
For those who think in math, σ² = -*SUM(xbar - x[n])², where xbar is
                                   N n=1
the population average and N is the number in the population.  For
samples of the population, s rather than σ is used.

The calculation of s carries two uncertainties: the sample average will
probably differ from the population average, and the samples chosen won't
accurately represent the population as a whole.  It turns out that the
calculation usually gives s a slightly smaller value than the true one.
Certainly, it is less accurate, and it is better not to make it look too
good.  Reasonable practice augments s by multiplying the formula's result
by N/(N-1), effectively dividing by (N-1) in the formula above.  This
affects small sample sizes more than large ones, reflecting the
circumstance that large samples probably need less correction.  When the
sample size is large enough, the correction is insignificant.

When estimating the standard deviation of the heights of twenty-year-old
men in New York City by sampling 500 of them, it doesn't matter whether
the division is by 500 or by 501.  If enough places were given to show
the difference, an intelligent reader would cry foul.  When the
statistics of a population of 30 are estimated from a sample of 29, you
might have a problem.  I just wouldn't do that.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
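[Editor's sketch] To put numbers on Jerry's point about sample size (a
trivial snippet of my own):

    # The bias-correction multiplier N/(N-1) for a few sample sizes:
    for N in (2, 5, 30, 500):
        print(N, N / (N - 1))
    # 2 -> 2.0 (huge), 30 -> ~1.034, 500 -> ~1.002 (insignificant)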
On Sun, 27 Feb 2005 16:11:12 -0800, "Fred Marshall"
<fmarshallx@remove_the_x.acm.org> wrote:

> >"Rick Lyons" <r.lyons@_BOGUS_ieee.org> wrote in message >news:4221b997.481221828@news.sf.sbcglobal.net... >> >> .................... What occurred >> to me was that it might be useful to figure out how >> to build a network that computes an >> "N-point moving variance". That is, a network where >> each new y(n) output is the variance of the >> current x(n) input sample and the N-1 previous input >> samples. >> >> > >Rick, > >I don't know either but it brought to mind an old "analogue" implementation: > >The objective was to measure the signal arrival angle from an array. >The array was split into two halves. Sum and difference signals were >computed. >
(all Fred's good stuff snipped)
>The last stage is the averager.
>
>Don't ask *me* if it's a good numerical approach......
>
>Fred
Hi Fred,

Thanks for the "different" viewpoint.  Your approach seems, to me, to
have a strong "signal processing" flavor, where previous posts dealt with
numerical computations only.  I'll print your post and study it later
today.

Thanks Fred,
[-Rick-]
Clay wrote:
> Part of the answer lies in what the comments in "Numerical Recipes in
> C" say about this.  They say, "if the difference matters, you are
> probably up to no good anyway."
I like that!

--
Jim Thomas                          Principal Applications Engineer
Bittware, Inc                       jthomas@bittware.com
http://www.bittware.com             (603) 226-0404 x536
Quidquid latine dictum sit, altum sonatur.
Whatever is said in Latin sounds profound.
Jerry Avins wrote:

>                                    1  N
> For those who think in math, σ² = -*SUM(xbar - x[n])², where
>                                    N n=1
I don't know if this works for others, but it's probably folly to expect
an actual sigma (if that's what it was) to survive usenet transfer.

--
Quidquid latine dictum sit, altum viditur.
Martin Eisenberg wrote:
> Jerry Avins wrote:
>
>>                                    1  N
>> For those who think in math, σ² = -*SUM(xbar - x[n])², where
>>                                    N n=1
>
> I don't know if this works for others, but it's probably folly to
> expect an actual sigma (if that's what it was) to survive usenet
> transfer.
It was.  Did exponent 2 come through?

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
"Martin Eisenberg" <martin.eisenberg@udo.edu> wrote in message 
news:1109691941.58213@ostenberg.wh.uni-dortmund.de...
> Jerry Avins wrote:
>>                                    1  N
>> For those who think in math, σ² = -*SUM(xbar - x[n])², where
>>                                    N n=1
>
> I don't know if this works for others, but it's probably folly to
> expect an actual sigma (if that's what it was) to survive usenet
> transfer.
>
> --
> Quidquid latine dictum sit, altum viditur.
Actually, the sigma came through fine for me.  I guess it will work on
anything that supports UTF-8 and has a font with a sigma in it.

What didn't work for me (until I changed fonts) is the previous and next
lines, with the "1 N" and "N n=1", which don't line up properly unless
you use a fixed-width font.

--
Eric Backus
R&D Design Engineer
Agilent Technologies, Inc.