
Question about computing a "moving variance"

Started by Rick Lyons February 27, 2005
"Rick Lyons" <r.lyons@_BOGUS_ieee.org> wrote in message 
news:4221b997.481221828@news.sf.sbcglobal.net...
> .................... What occurred
> to me was that it might be useful to figure out how
> to build a network that computes an
> "N-point moving variance".  That is, a network where
> each new y(n) output is the variance of the
> current x(n) input sample and the N-1 previous input
> samples.
Rick,

I don't know either, but it brought to mind an old "analogue"
implementation.  The objective was to measure the signal arrival angle
from an array.  The array was split into two halves, and sum and
difference signals were computed.  The difference signal is proportional
to the angle of arrival.  The question was: "is the difference signal
currently indicating an angle of arrival from a consistent source, or is
it just noise?"  The solution was to look at the absence of variance over
a short time - which is similar to your question.  So, the data was
high-pass filtered and detected / rectified, which would be similar to
taking the sum of squares.  The absence of energy in this component
indicated low variance and, thus, a likely valid angle of arrival.  The
presence of energy in this component would be an indicator of the
variance.

Does that suggest anything?  High-pass filter followed by sum of squares?

Take a look at the companion page:
http://www.mnstate.edu/wasson/ed602calcvardevs.htm

Here the variance is calculated by taking the sum of the squared
differences between the samples and their mean (properly weighted).
Subtracting the mean of the samples (the output of a lowpass filter) from
the samples is like highpass filtering the samples with whatever time
constant / bandwidth you care to choose.  So, the variance is a sum of
squares of the output of a highpass filter, thereafter averaged /
lowpassed.

I don't know if this is any different from what anyone else said (in
terms of math and resulting filter structure), but I like this way of
looking at it for insight.

I guess they look like this, and you can change the configuration around
as you will:

[ASCII block diagram, flattened in the archive: an N-tap delay line
computing the FIR highpass with weights (N-1)/N, -1/N, -1/N, ..., -1/N,
feeding a |.|^2 squarer, followed by an N-tap moving averager with
weights 1/N that produces y[k].]

So, with a modification to your IIR trick, it might look like this if I
didn't make a mistake.  The first section is highpass followed by
squaring - pretty similar to the averager.  The unit sample response of
the highpass is:

   (N-1)/N, -1/N, -1/N, -1/N, ........     with a zero at f=0

[ASCII block diagram, flattened in the archive: a recursive comb (z^-N)
and accumulator (z^-1) realization of that highpass using the gains
(N-1)/N and -1/N, then the |.|^2 squarer, then the same comb-accumulator
structure with gain 1/N as the averager, producing y(n).]

The last stage is the averager.

Don't ask *me* if it's a good numerical approach......

Fred
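[Editor's sketch] For concreteness, here is a minimal NumPy sketch of the
chain Fred describes - mean-removal highpass, squarer, N-point averager.
The code and names are mine, not from the thread, and note one caveat:
each squared deviation here is taken from its own window's mean, so this
is a close relative of (not identical to) the textbook N-point sample
variance.

    import numpy as np

    def fred_moving_variance(x, N):
        """Highpass (subtract N-point running mean), square, average."""
        x = np.asarray(x, dtype=float)
        # FIR highpass: h = [(N-1)/N, -1/N, ..., -1/N], N taps, zero at f=0
        h = np.full(N, -1.0 / N)
        h[0] = (N - 1.0) / N
        d = np.convolve(x, h)[:len(x)]   # x[n] minus mean of last N samples
        # square, then the final N-point moving average (the 1/N stage)
        y = np.convolve(d * d, np.full(N, 1.0 / N))[:len(x)]
        return y

For zero-mean white noise input, y settles near the input's variance once
the delay lines fill.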
robert bristow-johnson <rbj@audioimagination.com> writes:
> [...]
> but
>
> E{ S^2 } = (N-1)/N * E{ (x[n]-m)^2 } = (N-1)/N * sigma^2
How is this derived?
--
% Randy Yates                  % "Bird, on the wing,
%% Fuquay-Varina, NC           %  goes floating by
%%% 919-577-9882               %  but there's a teardrop in his eye..."
%%%% <yates@ieee.org>          % 'One Summer Dream', *Face The Music*, ELO
http://home.earthlink.net/~yatescr
in article sm3h9yth.fsf@ieee.org, Randy Yates at yates@ieee.org wrote on
02/27/2005 21:56:

> robert bristow-johnson <rbj@audioimagination.com> writes:
>> [...]
>> but
>>
>> E{ S^2 } = (N-1)/N * E{ (x[n]-m)^2 } = (N-1)/N * sigma^2
>
> How is this derived?
oh, it's kinda a bitch.  i think it's only true under the assumption that
the samples (as we DSPers define "samples") are statistically independent
of each other.  that is,

   if n <> i, then  E{ x[n]*x[i] } = E{ x[n] } * E{ x[i] } = m^2

if x[n] and x[i] are controlled by the same statistics.

   if n = i, then  E{ x[n]*x[i] } = E{ x[n]^2 } = sigma^2 + m^2

i think with these definitions:

                  _         N-1
  est. mean:      x = (1/N)*SUM{ x[n] }
                            n=0

                              N-1          _
  est. variance:  S^2 = (1/N)*SUM{ (x[n] - x)^2 }
                              n=0

you can finagle the summations in S^2 to be

               N-1              _
  S^2 = (1/N)*SUM{ (x[n])^2 } - ( x )^2
               n=0

               N-1                 N-1             N-1
      = (1/N)*SUM{ x[n]^2 } - (1/N)*SUM{x[n]} * (1/N)*SUM{x[i]}
               n=0                 n=0             i=0

               N-1                      N-1  N-1
      = (1/N)*SUM{ x[n]^2 } - (1/N^2) * SUM{ SUM{ x[n]*x[i] } }
               n=0                      n=0  i=0

there are no expectations yet.  now when you apply the expectation,

                 N-1                           N-1  N-1
  E{S^2} = (1/N)*SUM{ E{x[n]^2} } - (1/N^2) * SUM{ SUM{ E{x[n]*x[i]} } }
                 n=0                           n=0  i=0

you get

  E{S^2} = (1/N)*(N*E{x[n]^2})
           - (1/N^2)*( (N^2-N)*E{x[n]*x[i]} + N*E{x[n]^2} )   (for n <> i)

in that double summation, there are N^2 terms, of which for N terms
i = n.  so there are N^2 - N terms where i <> n.  finally,

  E{S^2} = (1/N)*N*(sigma^2 + m^2)
           - (1/N^2)*( (N^2-N)*m^2 + N*(sigma^2 + m^2) )

         = (sigma^2 + m^2) - (1/N^2)*( N^2*m^2 + N*sigma^2 )

         = sigma^2 - (1/N^2)*( N*sigma^2 )

         = (N-1)/N * sigma^2

you can see that if you grab only one sample of a random variable, the
apparent estimate of variance (from that singular sample) is 0, no matter
what the probabilistic variance really is.  so a single sample might give
you an idea of the mean, if you're lucky, but you can't be lucky enough
for that single sample to give you any idea of the variance.  so getting
more samples can only increase your expectation of getting the variance.

this makes sense to me, but what doesn't (to me) is what happens if you
take a group of samples that is one less than the entire population: you
still use that biased formula, but if it's the entire population, you
don't.  i guess it still makes sense if the population is infinite, so
any finite subset is still a sample and the estimated variance needs to
be bumped up a little to correct for the bias.  but it still bothers me
when you flip the switch from using 1/(N-1) to using 1/N.  it seems
unnatural.

anyway, getting back to Rick's implementation, i see no reason why you
can't run a CIC filter on x[n] to get an estimate of N*E{x[n]} (the
moving sum) and another CIC filter (of the same length) to get an
estimate of N*E{x[n]^2} (the moving sum of squares), and from that you
can get the biased estimate of variance:

  S^2 = 1/N * (N*E{x[n]^2}) - ( 1/N * (N*E{x[n]}) )^2

and then you gotta multiply by N/(N-1) to unbias the estimator:

  N/(N-1) * S^2 = 1/(N-1) * ( N*E{x[n]^2} - 1/N * ( N*E{x[n]} )^2 )

Rick, wouldn't that be the final expression of a "moving variance"
algorithm?

--
r b-j                  rbj@audioimagination.com
"Imagination is more important than knowledge."
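[Editor's sketch] A direct rendering of that recipe in Python (my own
illustration - the function name is mine, and explicit running sums stand
in for the actual comb-and-integrator CIC sections):

    from collections import deque

    def moving_variance(x, N, unbiased=True):
        window = deque([0.0] * N, maxlen=N)  # last N samples, zeros at start
        s1 = 0.0   # moving sum of x[n]      (first CIC's output)
        s2 = 0.0   # moving sum of x[n]^2    (second CIC's output)
        out = []
        for v in x:
            old = window[0]                   # sample leaving the window
            window.append(v)
            s1 += v - old
            s2 += v * v - old * old
            biased = s2 / N - (s1 / N) ** 2   # (1/N)*sum(x^2) - mean^2
            out.append(biased * N / (N - 1) if unbiased else biased)
        return out

One caveat worth stating: computing sum(x^2)/N minus the squared mean
subtracts two nearly equal quantities when the mean is large compared
with sigma, so it can lose precision in floating point.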
robert bristow-johnson wrote:


Hello Robert,
my comments interspersed

> i've done some moving "power" implementations before which was about the
> same as this.  (i *did* remove DC with a DC block and kept a running sum
> of the last N values of (x[n])^2.)

I've done the same in a digital radio.  Once the data is "debiased", the
variance and power are simply related.
> i think what Clay said is the pretty straight-forward way to do what the
> statistics texts say (keep a running sum of x[n] and x[n]^2 using a CIC
> and compute mean and variance from that) and his reference to
>
> http://www.mnstate.edu/wasson/ed602calcvarraws.htm

I didn't specifically mention the CIC, but that is what I implied.
> reminds me of a question i intended to ask the group a few times from
> long ago.
>
> using some of the language of the webpage with link above, there is the
> "population variance" and the "sample variance".  and there is a
> slightly different formula for the variance of the population vs. the
> sample.
>
> i understand what the mathematics is about regarding this difference (to
> *some* point, but there is still a philosophical problem i have).

Basically it is rooted in using an estimate to calculate another
estimate.  So for the discrete case, the number of degrees of freedom is
reduced by one.  For example, if I have a sample consisting of 16 numbers
taken from a large population of numbers, and I calculate the mean of the
16 numbers, that value is of course an estimate of the mean of the
population from which the 16 numbers came.  So when I then use the 16
numbers and the estimate of the population mean, I have the case of using
an estimate to find another estimate.

Another way to think about it: if I have 16 numbers and the sample mean,
then I have seventeen numbers.  However, given any 15 of the original
numbers and the sample mean, I can easily find the 16th.  So we really
only have 15 degrees of freedom in this example.  So the N-1 case is the
unbiased estimator case for the population variance.
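[Editor's sketch] Clay's sixteen-numbers argument in miniature (the names
here are mine): given the sample mean and any 15 of the 16 values, the
16th is forced, so only 15 degrees of freedom remain.

    import random

    xs = [random.gauss(0.0, 1.0) for _ in range(16)]
    xbar = sum(xs) / 16
    # The "seventeenth number" adds no freedom: any 15 values plus the
    # mean determine the last value exactly.
    x16 = 16 * xbar - sum(xs[:15])
    assert abs(x16 - xs[15]) < 1e-9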
> let x[n] be some random variable with a known mean and known variance:
>
>    mean:      E{ x[n] }       =  integral{ x * p(x) dx }        =  m
>
>    variance:  E{ (x[n]-m)^2 } =  integral{ (x-m)^2 * p(x) dx }  =  sigma^2
>
> where the integral is over all x and p(x) is the p.d.f. of x[n].
>
> now we take a "sample" of these x[n] (that's the statistician's use of
> the term, we DSPers would say "N samples of these x[n]") and, from that
> sample, compute a measure or estimate of the mean and variance in the
> straight way:

I know the terminology seems a little weird at first, but think in terms
of human statistics, where the "population" refers to all humans, and the
"sample" is a subset of all humans.
> (you need a mono-spaced font to view this right.)
>
>                  _         N-1
>   est. mean:     x = (1/N)*SUM{ x[n] }
>                            n=0
>
>                              N-1          _
>   est. variance: S^2 = (1/N)*SUM{ (x[n] - x)^2 }
>                              n=0
>
>                              N-1              _
>                      = (1/N)*SUM{ (x[n])^2 } - ( x )^2
>                              n=0
>
> this computation is valid for the so-called "population mean" and
> "population variance" if N represents the number of *all* of the samples
> of x[n].  in fact, it is not an estimate, but is actually computing the
> actual mean and variance of the entire set of x[n].  but if N is smaller
> than the entire population of x[n], then this is a "sample mean" and
> "sample variance" and there is some risk in getting the value wrong
> because of the choice of which x[n] you use and which x[n] are left out
> of the estimation.
>
>               _
> in that case, x and S^2 are both random variables, and it turns out that
>             _
> the mean of x is the same as the mean of x[n], which is m (cool!).  but
> it turns out that the mean of S^2 (the estimated variance) is a little
> smaller than sigma^2.

In fact, viewing the estimators as random variables is the correct
approach.  The variance of x bar will equal the population variance
divided by the sample size.
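[Editor's sketch] A quick Monte Carlo check of that last statement (my
own code; the sample size and trial count are arbitrary):

    import random

    N, trials, sigma = 16, 100_000, 2.0
    xbars = [sum(random.gauss(0.0, sigma) for _ in range(N)) / N
             for _ in range(trials)]
    m = sum(xbars) / trials
    var_of_xbar = sum((v - m) ** 2 for v in xbars) / trials
    print(var_of_xbar, sigma ** 2 / N)   # both close to 4/16 = 0.25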
>      _
> E{ x } = E{ x[n] } = m
>
> but
>
> E{ S^2 } = (N-1)/N * E{ (x[n]-m)^2 } = (N-1)/N * sigma^2
>
> so, it's a "biased estimator" and we fix that bias by multiplying
> E{ S^2 } by N/(N-1).
>
> now here's what bothers me: it's the discontinuity of the application of
> the bias correction.  first of all, the "biased" estimator works even
> for N=1 (the mean is just the single sample x[0] and the variance is
> zero), but the "unbiased" estimator does not (division by 0).  i guess
> that doesn't bother me too much since, if you had a random variable of
> non-zero sigma, you have to turn that estimated variance of zero into
> the non-zero sigma^2, and dividing by zero seems like a plausible way to
> do it.  :-)
>
> but what really bothers me is that if N is even only 1 less than the
> entire population, then you unbias the variance estimate by multiplying
> by N/(N-1), but if N is the entire population you don't multiply by
> N/(N-1).
>
> let "P" be the size of the entire population.  if N = P-1 you multiply
> by (P-1)/(P-2), but when N = P, you don't multiply by (P-0)/(P-1).  in
> general you always multiply S^2 by
>
>    (P - (P-N))/(P - (P-N) - 1)
>
> except the case when P-N = 0.  then you multiply by 1.
>
> can any of you statistics gurus explain why this discontinuity exists?
> why doesn't the unbiased estimate of variance work for the entire
> population?

Part of the answer lies in what the comments in "Numerical Recipes in C"
say about this.  They say, "if the difference matters, you are probably
up to no good anyway."

But realistically, I think you are changing gears between having a sample
that *is* the population of data and a sample that is a subset of the
population.  For a fun experiment, pick 100 random numbers and call that
our population.  Now simply calculate the population mean and variance
using 100 degrees of freedom.  Next, pick random subsets of this data and
calculate the sample means and variances of each of your samples taken
from the population.  If you use biased estimators, the expectation of
your estimators will be off; compare with using unbiased estimators.

I hope this helps some.

Clay
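[Editor's sketch] Clay's experiment in code (my own; I draw the subsets
with replacement so the draws are i.i.d., matching the independence
assumption in the derivation above):

    import random

    population = [random.gauss(0.0, 1.0) for _ in range(100)]
    mu = sum(population) / len(population)
    pop_var = sum((x - mu) ** 2 for x in population) / len(population)

    n, trials = 8, 100_000
    biased = unbiased = 0.0
    for _ in range(trials):
        s = random.choices(population, k=n)   # i.i.d. draws
        m = sum(s) / n
        ss = sum((x - m) ** 2 for x in s)
        biased += ss / n                      # 1/n estimator
        unbiased += ss / (n - 1)              # 1/(n-1) estimator

    print(pop_var, biased / trials, unbiased / trials)
    # The 1/n average lands near (n-1)/n * pop_var; the 1/(n-1)
    # average lands near pop_var itself.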
> --
> r b-j                  rbj@audioimagination.com
> "Imagination is more important than knowledge."
robert bristow-johnson wrote:

   ...

> can any of you statistics gurus explain why this discontinuity exists?
> why doesn't the unbiased estimate of variance work for the entire
> population?
I don't qualify as a statistical guru, but my son does, so if you give me
a hard time, I'll sic him on you.

The discontinuity is an artifact of being too literal.  The population
variance is defined as the average of the squares of the deviations from
the mean.

                                   1  N
For those who think in math, σ² = -*SUM(xbar - x[n])², where xbar is
                                   N n=1
the population average and N is the number in the population.  For
samples of the population, s rather than σ is used.

The calculation of s carries two uncertainties: the sample average will
probably differ from the population average, and the samples chosen won't
accurately represent the population as a whole.  It turns out that the
calculation usually gives s a slightly smaller value than the true one.
Certainly, it is less accurate, and it is better not to make it look too
good.  Reasonable practice augments s by multiplying the formula's result
by N/(N-1), effectively dividing by (N-1) in the formula above.  This
affects small sample sizes more than large ones, reflecting the
circumstance that large samples probably need less correction.  When the
sample size is large enough, the correction is insignificant.

When estimating the standard deviation of the heights of twenty-year-old
men in New York City by sampling 500 of them, it doesn't matter whether
the division is by 500 or by 501.  If enough places were given to show
the difference, an intelligent reader would cry foul.  When the
statistics of a population of 30 are estimated from a sample of 29, you
might have a problem.  I just wouldn't do that.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
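[Editor's sketch] To put numbers on Jerry's point about sample size (a
trivial snippet of my own):

    # The bias-correction multiplier N/(N-1) for a few sample sizes:
    for N in (2, 5, 30, 500):
        print(N, N / (N - 1))
    # 2 -> 2.0 (huge), 30 -> ~1.034, 500 -> ~1.002 (insignificant)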
On Sun, 27 Feb 2005 16:11:12 -0800, "Fred Marshall"
<fmarshallx@remove_the_x.acm.org> wrote:

> >"Rick Lyons" <r.lyons@_BOGUS_ieee.org> wrote in message >news:4221b997.481221828@news.sf.sbcglobal.net... >> >> .................... What occurred >> to me was that it might be useful to figure out how >> to build a network that computes an >> "N-point moving variance". That is, a network where >> each new y(n) output is the variance of the >> current x(n) input sample and the N-1 previous input >> samples. >> >> > >Rick, > >I don't know either but it brought to mind an old "analogue" implementation: > >The objective was to measure the signal arrival angle from an array. >The array was split into two halves. Sum and difference signals were >computed. >
(all Fred's good stuff snipped)
>The last stage is the averager.
>
>Don't ask *me* if it's a good numerical approach......
>
>Fred
Hi Fred,

Thanks for the "different" viewpoint.  Your approach seems, to me, to
have a strong "signal processing" flavor, where previous posts dealt with
numerical computations only.  I'll print your post and study it later
today.

Thanks Fred,
[-Rick-]
Clay wrote:
> Part of the answer lies in what the comments in "Numerical Recipes in
> C" say about this.  They say, "if the difference matters, you are
> probably up to no good anyway."
I like that!

--
Jim Thomas                          Principal Applications Engineer
Bittware, Inc                       jthomas@bittware.com
http://www.bittware.com             (603) 226-0404 x536
Quidquid latine dictum sit, altum sonatur.
Whatever is said in Latin sounds profound.
Jerry Avins wrote:

>                                    1  N
> For those who think in math, σ² = -*SUM(xbar - x[n])², where
>                                    N n=1
I don't know if this works for others, but it's probably folly to expect
an actual sigma (if that's what it was) to survive usenet transfer.

--
Quidquid latine dictum sit, altum viditur.
Martin Eisenberg wrote:
> Jerry Avins wrote:
>
>>                                    1  N
>> For those who think in math, σ² = -*SUM(xbar - x[n])², where
>>                                    N n=1
>
> I don't know if this works for others, but it's probably folly to
> expect an actual sigma (if that's what it was) to survive usenet
> transfer.
It was.  Did exponent 2 come through?

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
"Martin Eisenberg" <martin.eisenberg@udo.edu> wrote in message 
news:1109691941.58213@ostenberg.wh.uni-dortmund.de...
> Jerry Avins wrote:
>>                                    1  N
>> For those who think in math, σ² = -*SUM(xbar - x[n])², where
>>                                    N n=1
>
> I don't know if this works for others, but it's probably folly to
> expect an actual sigma (if that's what it was) to survive usenet
> transfer.
>
> --
> Quidquid latine dictum sit, altum viditur.
Actually, the sigma came through fine for me.  I guess it will work on
anything that supports UTF-8 and has a font with a sigma in it.

What didn't work for me (until I changed fonts) is the previous and next
lines, with the "1 N" and "N n=1", which don't line up properly unless
you use a fixed-width font.

--
Eric Backus
R&D Design Engineer
Agilent Technologies, Inc.