
simple perspective in using floats for recursive moving-average

Started by Randy Yates December 26, 2016
On Tue, 27 Dec 2016 07:22:53 -0500, Randy Yates wrote:

> Tim Wescott <seemywebsite@myfooter.really> writes:
>
>> On Mon, 26 Dec 2016 17:03:55 -0500, Randy Yates wrote:
>>
>>> Yup. And admittedly I don't understand floating-point truncation.
>>
>> I just tried for a nice analogy, and failed. But imagine that you want
>> to add 1/3 to 4/3, in a floating point notation that only allows an
>> 8-bit mantissa. 1/3 = 1r0101010x2^-2. 4/3 = 1r0101010x2^0. Note that
>> truncation has already happened, because 1/3 is a repeating "decimal"
>> in binary -- ignore that, and just go with the above values. To add
>> them, you first line them up, then add:
>>
>>       r010101010
>>   +  1r0101010
>>      ------------
>>      1r101010010
>>
>> This answer is exact in the stored numbers, but it is 10 bits. So, you
>> whack off the two trailing bits:
>>
>>   answer = 1r1010100x2^0
>>
>> This answer is off by (if I'm counting digits correctly) 2^-8. Every
>> time you do an operation, the answer is off.
>>
>> It gets more complicated in the real world, but this is a summation of
>> the effect.
>
> Right. Pretty straightforward. Although I wonder what complicates this?
There's more than one rounding method, and on some floating point hardware it's selectable. Also, on some floating point hardware (notably just about anything from Intel), the FPU registers are longer than the external word -- IIRC it's 80 bits for double; I don't know if they support longer words for single-precision. So not only is the truncation sometimes not so bad, but it's sensitive to how often the numbers are fetched from the FPU and put back in.

For at least one of my customers I'm doing some serious number crunching on code written in C++. I use Linux, my customer uses Windows. I get oh-so-slightly different answers running the exact same data through executables compiled with the exact same code, only with different tool chains.

Personally, when it gets down to that level of detail I figure that the most sensible thing to do is to run away screaming -- but sometimes you have to just tough it out.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
I'm looking for work -- see my website!
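[To make the quoted truncation example and the point about selectable rounding concrete, here is a minimal C sketch. It is my own illustration, not code from the thread: it adds 1/3 and 4/3 in single precision, compares against a wider sum of the same operands, and then repeats the add under a different IEEE-754 rounding mode via the standard <fenv.h> interface. Whether the mode change actually takes effect can depend on the compiler; with GCC, building with -frounding-math is the usual precaution.]

    #include <stdio.h>
    #include <fenv.h>

    /* volatile keeps the compiler from folding the arithmetic at
       compile time under the default rounding mode. */
    static volatile float a, b;

    int main(void)
    {
        a = 1.0f / 3.0f;   /* already rounded: 1/3 repeats forever in binary */
        b = 4.0f / 3.0f;

        /* The float sum is squeezed back into a 24-bit significand; the
           double sum of the same two operands is exact, so the difference
           is exactly the rounding error of the single-precision add. */
        float  s_near = a + b;
        double s_wide = (double)a + (double)b;
        printf("round to nearest : %.10f (error %g)\n",
               s_near, (double)s_near - s_wide);

        /* Same operation with truncation (round toward zero), as in the
           pencil-and-paper example quoted above. */
        fesetround(FE_TOWARDZERO);
        float s_trunc = a + b;
        printf("round toward zero: %.10f (error %g)\n",
               s_trunc, (double)s_trunc - s_wide);

        fesetround(FE_TONEAREST);  /* restore the default */
        return 0;
    }

[The two reported errors differ in sign and size, which is the "more than one rounding method" point in miniature.]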
Tim Wescott <seemywebsite@myfooter.really> writes:

> On Tue, 27 Dec 2016 07:22:53 -0500, Randy Yates wrote:
>
>> Tim Wescott <seemywebsite@myfooter.really> writes:
>>
>>> On Mon, 26 Dec 2016 17:03:55 -0500, Randy Yates wrote:
>>>
>>>> Yup. And admittedly I don't understand floating-point truncation.
>>>
>>> I just tried for a nice analogy, and failed. But imagine that you want
>>> to add 1/3 to 4/3, in a floating point notation that only allows an
>>> 8-bit mantissa. 1/3 = 1r0101010x2^-2. 4/3 = 1r0101010x2^0. Note that
>>> truncation has already happened, because 1/3 is a repeating "decimal"
>>> in binary -- ignore that, and just go with the above values. To add
>>> them, you first line them up, then add:
>>>
>>>       r010101010
>>>   +  1r0101010
>>>      ------------
>>>      1r101010010
>>>
>>> This answer is exact in the stored numbers, but it is 10 bits. So, you
>>> whack off the two trailing bits:
>>>
>>>   answer = 1r1010100x2^0
>>>
>>> This answer is off by (if I'm counting digits correctly) 2^-8. Every
>>> time you do an operation, the answer is off.
>>>
>>> It gets more complicated in the real world, but this is a summation of
>>> the effect.
>>
>> Right. Pretty straightforward. Although I wonder what complicates this?
>
> There's more than one rounding method, and on some floating point
> hardware it's selectable. Also, on some floating point hardware (notably
> just about anything from Intel), the FPU registers are longer than the
> external word -- IIRC it's 80 bits for double; I don't know if they
> support longer words for single-precision. So not only is the truncation
> sometimes not so bad, but it's sensitive to how often the numbers are
> fetched from the FPU and put back in.
It was 80 bits back in the '80s, when the 387 was a separate floating-point coprocessor chip, I believe.

I was thinking complications are also associated with NaNs, INFs, denormals, etc.
> For at least one of my customers I'm doing some serious number crunching
> on code written in C++. I use Linux, my customer uses Windows. I get
> oh-so-slightly different answers running the exact same data through
> executables compiled with the exact same code, only with different tool
> chains.
Are you dealing with data whose magnitudes are very near zero (i.e., at the minimum exponent)? If so, it may have something to do with the denormal modes used:

https://en.wikipedia.org/wiki/Denormal_number

but this is just a guess.
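[For what it's worth, a tiny sketch of my own, not anything from the thread, showing what that near-zero territory looks like in single precision. A build or platform that flushes denormals to zero would print 0 for the second value, which is one plausible way two tool chains give slightly different answers on near-zero data.]

    #include <stdio.h>
    #include <float.h>
    #include <math.h>

    int main(void)
    {
        float smallest_normal = FLT_MIN;                 /* ~1.18e-38, smallest normal float */
        float subnormal       = smallest_normal / 8.0f;  /* drops into the denormal range    */

        printf("FLT_MIN     = %g  (normal? %s)\n",
               smallest_normal, isnormal(smallest_normal) ? "yes" : "no");
        printf("FLT_MIN / 8 = %g  (normal? %s)\n",
               subnormal, isnormal(subnormal) ? "yes" : "no");
        return 0;
    }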
> Personally, when it gets down to that level of detail I figure that the
> most sensible thing to do is to run away screaming -- but sometimes you
> have to just tough it out.
Makes me want to run screaming to fixed-point.

--
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
I use these filters all the time, but there's this nagging worry that a single-cycle math error (caused by a glitch or an alpha particle or whatever) will produce a permanent DC offset in the output. If you want to get fancy you can detect this and reset your filter.

Bob
On Fri, 30 Dec 2016 05:56:16 -0800, radams2000 wrote:

> I use these filters all the time, but there's this nagging worry that a
> single-cycle math error (caused by a glitch or an alpha particle or
> whatever) will produce a permanent DC offset in the output. If you
> want to get fancy you can detect this and reset your filter.
>
> Bob
The technique used in CIC filters makes that a non-problem -- instead of saving the vector of past data and subtracting out the n-th oldest one from the sum (and thus preserving any errors to the sum forever), save a vector of past sums and subtract out the n-th oldest one. If you use wrap-on-overflow arithmetic, then your alpha-particle data-smash will only affect the filter output for at most n samples -- the same as if it happened to the input data.

In general it means you're saving data with a wider word length, but you're picking up some dependability.

--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
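[In code, Tim's suggestion might look something like the following minimal C sketch; the names and the length N are mine, not anything from the thread. Keep a history of the running sum in unsigned, wrap-on-overflow arithmetic and subtract the entry from N samples ago. An error injected into either the running sum or the history ages out of the output after at most N samples, exactly as described above.]

    #include <stdint.h>

    #define N 16                      /* moving-average length (example value) */

    static uint32_t sum_hist[N];      /* the last N values of the running sum  */
    static uint32_t running_sum;      /* x[n] = x[n-1] + u[n], modulo 2^32     */
    static unsigned idx;              /* circular index into sum_hist          */

    /* Returns the boxcar sum of the most recent N inputs. */
    uint32_t moving_sum(uint32_t input)
    {
        running_sum += input;                 /* wraps harmlessly on overflow   */

        uint32_t oldest_sum = sum_hist[idx];  /* running sum from N samples ago */
        sum_hist[idx] = running_sum;
        idx = (idx + 1) % N;

        /* (B - A) mod 2^32 recovers the sum of the last N inputs even if the
           running sum has wrapped in between, provided the true boxcar sum
           always fits in 32 bits. */
        return running_sum - oldest_sum;
    }

[The usual caveat applies: the accumulator word must be wide enough that the true N-sample sum never exceeds it, which is the "wider word length" Tim mentions.]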
Tim

That's true of CIC decimation filters. However, Randy's post made no mention of rate changes, so he was referring to a filter where you take the difference between the input and output of a delay line and integrate the result, in which case an error will cause a permanent DC offset.
When using CIC filters for changing the rate, decimation and interpolation behave differently. In decimation mode, the integrators come first (followed by differentiation after the decimation) and wrap around as you described. With interpolation, the differentiators come first, followed by integrators at the high rate, and those are the structures that will turn into white noise generators with a single math error. 

Bob
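[For reference, here is a rough single-stage C sketch of the decimation ordering Bob describes: integrator at the input rate, decimate by R, comb (differentiator) at the output rate. This is my own illustration with made-up names, not code from the thread; real CIC designs cascade several such stages and size the word width for the full filter gain.]

    #include <stdint.h>
    #include <stddef.h>

    #define R 8                       /* decimation ratio (example value) */

    /* Consumes R input samples and produces one output sample. */
    uint32_t cic1_decimate(const uint32_t *in, uint32_t *integrator,
                           uint32_t *comb_delay)
    {
        /* Integrator runs at the high (input) rate. */
        for (size_t k = 0; k < R; k++)
            *integrator += in[k];              /* wraps on overflow, by design */

        /* Comb runs at the low (output) rate: y = v[m] - v[m-1]. */
        uint32_t v   = *integrator;
        uint32_t out = v - *comb_delay;        /* modular subtraction */
        *comb_delay  = v;
        return out;
    }

[With wrap-on-overflow arithmetic, the comb's modular subtraction recovers the boxcar sum of the last R inputs even when the integrator wraps, which is the property the next post leans on.]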
On Sun, 01 Jan 2017 07:11:42 -0800, radams2000 wrote:

> Tim
>
> That's true of CIC decimation filters. However, Randy's post made no
> mention of rate changes, so he was referring to a filter where you take
> the difference between the input and output of a delay line and
> integrate the result, in which case an error will cause a permanent DC
> offset.
>
> When using CIC filters for changing the rate, decimation and
> interpolation behave differently. In decimation mode, the integrators
> come first (followed by differentiation after the decimation) and wrap
> around as you described. With interpolation, the differentiators come
> first, followed by integrators at the high rate, and those are the
> structures that will turn into white noise generators with a single
> math error.
Doesna matter -- do the math. Instead of saving past values of the input, save past values of the sum.

Yes, you have to save ALL past values of the sum, and not just the ones on the decimation boundary, but then, you don't have that niggling worry about persistent errors.

Because -- (B - A) mod C is still (B - A) mod C, even if you change the details of how you got there.

--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
Tim Wescott <tim@seemywebsite.com> writes:

> On Fri, 30 Dec 2016 05:56:16 -0800, radams2000 wrote:
>
>> I use these filters all the time, but there's this nagging worry that a
>> single-cycle math error (caused by a glitch or an alpha particle or
>> whatever) will produce a permanent DC offset in the output. If you
>> want to get fancy you can detect this and reset your filter.
>>
>> Bob
>
> The technique used in CIC filters makes that a non-problem -- instead of
> saving the vector of past data and subtracting out the n-th oldest one
> from the sum (and thus preserving any errors to the sum forever), save a
> vector of past sums and subtract out the n-th oldest one. If you use
> wrap-on-overflow arithmetic, then your alpha-particle data-smash will
> only affect the filter output for at most n samples -- the same as if it
> happened to the input data.
>
> In general it means you're saving data with a wider word length, but
> you're picking up some dependability.
Tim,

I don't understand. Are you saying form a filter with this difference equation:

   y[n] = y[n - 1] - y[n - N] + x[n] ?

If my math is correct, this works out to a z-transform of

   Y(z)/X(z) = 1/(1 - z^(-1) + z^(-N)),

and this is not the same as a boxcar sum, is it?

--
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
On Sun, 01 Jan 2017 22:09:42 -0500, Randy Yates wrote:

> Tim Wescott <tim@seemywebsite.com> writes:
>
>> On Fri, 30 Dec 2016 05:56:16 -0800, radams2000 wrote:
>>
>>> I use these filters all the time, but there's this nagging worry that
>>> a single-cycle math error (caused by a glitch or an alpha particle or
>>> whatever) will produce a permanent DC offset in the output. If you
>>> want to get fancy you can detect this and reset your filter.
>>>
>>> Bob
>>
>> The technique used in CIC filters makes that a non-problem -- instead
>> of saving the vector of past data and subtracting out the n-th oldest
>> one from the sum (and thus preserving any errors to the sum forever),
>> save a vector of past sums and subtract out the n-th oldest one. If
>> you use wrap-on-overflow arithmetic, then your alpha-particle
>> data-smash will only affect the filter output for at most n samples --
>> the same as if it happened to the input data.
>>
>> In general it means you're saving data with a wider word length, but
>> you're picking up some dependability.
>
> Tim,
>
> I don't understand. Are you saying form a filter with this difference
> equation:
>
>    y[n] = y[n - 1] - y[n - N] + x[n] ?
>
> If my math is correct, this works out to a z-transform of
>
>    Y(z)/X(z) = 1/(1 - z^(-1) + z^(-N)),
>
> and this is not the same as a boxcar sum, is it?
Given u[n] in and y[n] out:

   y[n] = y[n - 1] + u[n] - u[n - N]

vs.

   y[n] = sum_{k = n - N + 1}^{n} u[k]

vs.

   x[n] = x[n - 1] + u[n]
   y[n] = x[n] - x[n - N]

Unless I've done something tremendously stupid with my math, they all have the transfer function

   H(z) = (1 - z^-N)/(1 - z^-1) = 1 + z^-1 + ... + z^-(N-1).

The first is "traditional", the second is the canonical FIR way of doing it. The third looks like it'd have all sorts of overflow problems, and would if you were using floats, but if you're using fixed-point overflow arithmetic ("typical C implementation" fixed point), then it'll be just fine.

Looked at another way, the first and third cascade a difference that spans N taps and an integrator -- it's just that the third method switches the order of operations.

--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
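[A quick sanity check of that claim, as a C sketch. The test harness and names are my own, not from the thread: with bounded inputs and unsigned wrap-on-overflow arithmetic, the three formulations produce identical outputs sample for sample, including during the first N samples of start-up.]

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N 8        /* boxcar length */
    #define LEN 1000   /* test length   */

    int main(void)
    {
        uint32_t u[LEN];
        for (int n = 0; n < LEN; n++)
            u[n] = (uint32_t)rand() & 0xFFFF;   /* bounded so the true sum fits */

        uint32_t y1 = 0, y2, x = 0;
        uint32_t x_hist[N] = {0};               /* history of the running sum x */
        int idx = 0;

        for (int n = 0; n < LEN; n++) {
            /* Form 1: y[n] = y[n-1] + u[n] - u[n-N]             */
            y1 += u[n] - (n >= N ? u[n - N] : 0);

            /* Form 2: direct boxcar sum of the last N inputs     */
            y2 = 0;
            for (int k = (n >= N - 1 ? n - N + 1 : 0); k <= n; k++)
                y2 += u[k];

            /* Form 3: x[n] = x[n-1] + u[n]; y[n] = x[n] - x[n-N] */
            x += u[n];
            uint32_t y3 = x - x_hist[idx];
            x_hist[idx] = x;
            idx = (idx + 1) % N;

            assert(y1 == y2 && y2 == y3);
        }
        puts("all three forms agree");
        return 0;
    }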
On 12/25/2016 11:27 PM, Randy Yates wrote:
> A colleague was explaining that when he tried using floats in a
> recursive moving-average filter, there was some residual error which
> wasn't present in the standard (non-recursive) moving-average filter.
>
> Would it be correct and reasonable to frame the problem like this:
>
> In the recursive moving-average implementation, the output y[n - 1]
> needs to store perfect knowledge of the past N states, so that when you
> subtract the input N - 1 samples back from y[n - 1], you perfectly
> maintain the sum of inputs x[n - N + 2] to x[n - 1]. However, it does
> not.
>
> In general, every addition/subtraction to y[n - m], m < N, produces a
> quantization error; this quantization error is propagated into the new
> output y[n].
>
> So it is the fact that the output "state" of the filter (i.e., the
> output y[n - 1]) is not perfect, i.e., the state of the filter is
> tainted, that creates this residual error.
>
> This also illustrates why a fixed-point implementation (using the
> appropriate intermediate accumulator width) of the same filter
> works: the output state is known perfectly.
>
> Correct? Good way to view it? (Trivial???)
Am I correct in thinking the recursive version of the filter is one where the current value of the output is calculated from the previous value of the output, the current value of the input and an older value of the input? The idea being that the recursive calculation is fewer steps than the "standard" approach where each output is calculated from a series of values of the input.

--
Rick C
rickman <gnuarm@gmail.com> writes:

> On 12/25/2016 11:27 PM, Randy Yates wrote:
>> A colleague was explaining that when he tried using floats in a
>> recursive moving-average filter, there was some residual error which
>> wasn't present in the standard (non-recursive) moving-average filter.
>>
>> Would it be correct and reasonable to frame the problem like this:
>>
>> In the recursive moving-average implementation, the output y[n - 1]
>> needs to store perfect knowledge of the past N states, so that when you
>> subtract the input N - 1 samples back from y[n - 1], you perfectly
>> maintain the sum of inputs x[n - N + 2] to x[n - 1]. However, it does
>> not.
>>
>> In general, every addition/subtraction to y[n - m], m < N, produces a
>> quantization error; this quantization error is propagated into the new
>> output y[n].
>>
>> So it is the fact that the output "state" of the filter (i.e., the
>> output y[n - 1]) is not perfect, i.e., the state of the filter is
>> tainted, that creates this residual error.
>>
>> This also illustrates why a fixed-point implementation (using the
>> appropriate intermediate accumulator width) of the same filter
>> works: the output state is known perfectly.
>>
>> Correct? Good way to view it? (Trivial???)
>
> Am I correct in thinking the recursive version of the filter is one
> where the current value of the output is calculated from the previous
> value of the output, the current value of the input and an older value
> of the input? The idea being that the recursive calculation is fewer
> steps than the "standard" approach where each output is calculated
> from a series of values of the input.
Hi rickman,

Yes.

   y[n] = y[n - 1] - x[n - N] + x[n]

--
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
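[Closing the loop on the original question, here is a small experiment, my own sketch rather than anything posted in the thread, that shows the residual error directly: run that recursive form in single precision and compare it, sample by sample, against the boxcar sum recomputed from scratch in double precision. The recursive state accumulates rounding error that the direct sum never sees.]

    #include <stdio.h>
    #include <stdlib.h>

    #define N 64       /* moving-average length */
    #define LEN 100000 /* number of samples     */

    int main(void)
    {
        static float x[LEN];
        for (int n = 0; n < LEN; n++)
            x[n] = (float)rand() / (float)RAND_MAX;   /* arbitrary test signal */

        float  y_rec   = 0.0f;   /* recursive running sum, single precision */
        double max_err = 0.0;

        for (int n = 0; n < LEN; n++) {
            /* Recursive form: y[n] = y[n-1] + x[n] - x[n-N] */
            y_rec += x[n];
            if (n >= N)
                y_rec -= x[n - N];

            /* Reference: recompute the boxcar sum from scratch in double. */
            double y_ref = 0.0;
            for (int k = (n >= N - 1 ? n - N + 1 : 0); k <= n; k++)
                y_ref += (double)x[k];

            double err = (double)y_rec - y_ref;
            if (err < 0) err = -err;
            if (err > max_err) max_err = err;
        }

        /* The maximum deviation grows as more samples pass through the
           recursive state; the direct sum has no such accumulation. */
        printf("max |recursive - direct| = %g\n", max_err);
        return 0;
    }

[Dividing by N to get the average doesn't change the picture; the drift lives in the running sum, which is why the fixed-point version with an adequate accumulator width stays exact.]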