
How to simplify a rolling average in an FPGA

Started by garengllc September 30, 2015
I am trying to replicate some C code in an FPGA and am getting wrapped
around the axle of the rolling average that is being used.  I am sure
there is a simple way to do what is being done in C (or at least close),
but I seem to keep going in mental circles trying to converge onto it.  

The C code is pretty simple:
rolling_average =
rolling_average*((rolling_average_num-1)/rolling_average_num) +
newValue*(1.0/rolling_average_num)

*rolling_average - the updated rolling average
*rolling_average_num - this is basically the number of samples we have in
the average (with an upper bound of rolling_average_num_max [call it 8000
for this example])
*newValue - The new value to get put into the average.

The way I see it, once the rolling average has been running for a while,
((rolling_average_num-1)/rolling_average_num) will be a number slightly
smaller than 1 (call it 7999/8000 for simplicity's sake), and
(1.0/rolling_average_num) will be a very small number (call it 1/8000 in
this case), so I ought to be able to play some powers-of-two games for
doing the divisions, but I can't quite seem to come up with it.

I know I may not be able to simplify things to give me a bit perfect match
to the C code, but is there a way to compute this rolling average in an
FPGA that is lightweight and a close approximation?
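
For reference, here is a minimal, self-contained C rendering of the update
described above; the clamp of the sample count at rolling_average_num_max
follows the variable descriptions, and the function name and types are
only illustrative:

    #define ROLLING_AVERAGE_NUM_MAX 8000   /* the 8000-sample example above */

    static double rolling_average     = 0.0;
    static int    rolling_average_num = 0;

    void update_rolling_average(double newValue)
    {
        /* grow the effective window until it reaches the upper bound */
        if (rolling_average_num < ROLLING_AVERAGE_NUM_MAX)
            rolling_average_num++;

        rolling_average =
            rolling_average * ((double)(rolling_average_num - 1) / rolling_average_num) +
            newValue * (1.0 / rolling_average_num);
    }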


---------------------------------------
Posted through http://www.DSPRelated.com
>I know I may not be able to simplify things to give me a bit perfect match
>to the C code, but is there a way to compute this rolling average in an
>FPGA that is lightweight and a close approximation?
I may have worked it out, but would still like to hear others' opinions.
The code I ended up with is:

rolling_average = rolling_average -
                  (rolling_average-newValue)/rolling_average_num_max

I am looking into converting rolling_average_num_max to the closest power
of two, and then using that to shift the (rolling_average-newValue)
numerator.

---------------------------------------
Posted through http://www.DSPRelated.com
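
A minimal fixed-point sketch of that shift-based version, assuming
rolling_average_num_max is rounded to 8192 = 2^13 and the average is kept
in Q16.16 so the small correction term is not rounded away (names and
widths are illustrative):

    #include <stdint.h>

    #define AVG_SHIFT 13            /* 2^13 = 8192, the power of two nearest 8000 */

    static int32_t avg_q16;         /* rolling average in Q16.16 */

    void update_avg(int16_t newValue)
    {
        int32_t new_q16 = (int32_t)newValue << 16;

        /* rolling_average -= (rolling_average - newValue) / 8192,
           with the divide done as an arithmetic right shift */
        avg_q16 -= (avg_q16 - new_q16) >> AVG_SHIFT;
    }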
You can do a block average with just an accumulator over 2^n samples that
is cleared at the end of each block and restarted. Then all you need to do
is discard the n LSBs.

For a running average you can use a delay line as long as your number of
taps, subtract the sample leaving the delay line from the new sample, and
accumulate continuously (recursive implementation).
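
A minimal C model of that recursive running sum; the power-of-two length
and the circular buffer are assumptions for illustration:

    #include <stdint.h>

    #define LOG2_TAPS 3                     /* 2^3 = 8 taps, illustrative */
    #define NUM_TAPS  (1 << LOG2_TAPS)

    static int16_t  delay[NUM_TAPS];        /* delay line, one entry per tap */
    static uint32_t wr;                     /* write index into the delay line */
    static int32_t  acc;                    /* running sum of the last NUM_TAPS samples */

    int16_t moving_average(int16_t newValue)
    {
        acc += newValue - delay[wr];        /* add the new sample, drop the oldest */
        delay[wr] = newValue;
        wr = (wr + 1) & (NUM_TAPS - 1);

        return (int16_t)(acc >> LOG2_TAPS); /* divide by the tap count = discard LSBs */
    }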

A good reference is a book titled "Digital Filters design for FPGAs"

Kaz
---------------------------------------
Posted through http://www.DSPRelated.com
garengllc <62192@dsprelated> wrote:
> I am trying to replicate some C code in an FPGA and am getting wrapped
> around the axle of the rolling average that is being used.
(snip)
> The C code is pretty simple:
> rolling_average =
> rolling_average*((rolling_average_num-1)/rolling_average_num) +
> newValue*(1.0/rolling_average_num)
This looks more like decaying average, not rolling average.

Well, rolling average sounds more like moving average than decaying
average to me, but maybe not to everyone.

Moving average is the uniform weight for the last N samples.

Decaying average is a negative exponential weight that decays enough that
even with an infinite number of points it gives a nice value.

-- glen
On 9/30/2015 1:56 PM, garengllc wrote:
> I may have worked it out, but would still like to hear others' opinions.
> The code I ended up with is:
> rolling_average = rolling_average -
> (rolling_average-newValue)/rolling_average_num_max
>
> I am looking into converting rolling_average_num_max to the closest power
> of two, and then using that to shift the (rolling_average-newValue)
> numerator.
I think you are making this too complicated in all cases.  Your original
formula uses 2 divides, 2 multiplies and an add.

This second attempt at the formula is close, but still more work than you
need, and is wrong.  It is attempting to do one divide instead of two and
the multiplies.  It won't give accurate results because you are dividing
by the same constant at every step.  To have a correct value at every step
you need to use the ratios you calculated in your first attempt, if done
that way.

The basic definition of a rolling average only requires one add and one
divide for each result.  If you don't need the result with every input you
can just do the add to calculate the accumulated sum and do the divide
when you need it, not that it saves you much other than power perhaps.  A
divide is awkward in hardware to say the least.

sum <= sum + newValue ;
rolling_average_num <= rolling_average_num + 1 ;
-- The rolling average divide is not required until the result is read
rolling_average <= sum / rolling_average_num ;

Do you need intermediate results before the full 8000 samples are input?
If so you need to do at least one divide for each result.  The sum and
count need to be updated with each sample.

-- Rick
On 9/30/2015 2:45 PM, glen herrmannsfeldt wrote:
> (snip)
>
> This looks more like decaying average, not rolling average.
>
> Well, rolling average sounds more like moving average than decaying
> average to me, but maybe not to everyone.
>
> Moving average is the uniform weight for the last N samples.
>
> Decaying average is a negative exponential weight that decays enough
> that even with an infinite number of points it gives a nice value.
Are you thinking of an average that weights the most recent more heavily?
That is not what the OP is calculating.  I think he is asking about what
Wikipedia calls the cumulative moving average.  It is just an average of
all the previous samples, updated as each new one comes in.  All the
sample weights are equal.

https://en.wikipedia.org/wiki/Moving_average#Cumulative_moving_average

-- Rick
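
For reference, the incremental update given on that Wikipedia page is

    CMA(n+1) = CMA(n) + ( x(n+1) - CMA(n) ) / (n+1)

which has the same shape as the formula in the second post, with the fixed
rolling_average_num_max replaced by the running sample count.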
On Wed, 30 Sep 2015 12:56:59 -0500, garengllc wrote:

>> I know I may not be able to simplify things to give me a bit perfect
>> match to the C code, but is there a way to compute this rolling average
>> in an FPGA that is lightweight and a close approximation?
>
> I may have worked it out, but would still like to hear others' opinions.
> The code I ended up with is:
> rolling_average = rolling_average -
> (rolling_average-newValue)/rolling_average_num_max
Please don't delete context!

That's not what you started with.  What you started with isn't dividing by
rolling_average_num_max (whatever that is) but rolling_average_num, which
is, presumably, the number of points that have been averaged.
> I am looking into converting rolling_average_num_max to the closest
> power of two, and then using that to shift the
> (rolling_average-newValue) numerator.
The advantage to the cumulative moving average (thank you Rickman for
digging that up) is that it is the optimal filter to estimate an unknown
but unchanging value that is corrupted by white Gaussian noise.  Better,
it's not just optimal after some time -- it's always optimal.

When you think of it in the way that I worded it, it's actually a really
simple Kalman filter.

The downside to the filter is that you have that nasty divide in there.
So when you change it to

rolling_average = rolling_average - (rolling_average-newValue) * gain

where "gain" is constrained to be an inverse power of 2, you lose some of
the filter's optimality.

I'm not sure what the best way to evolve your gain is -- if you drop it
too soon (i.e., if you let it go as 1, 1/2, 1/4, 1/4, 1/8, etc.) then
you'll tend to "lock in" early errors.  On the other hand, if you drop it
too late (i.e., if you let it go as 1, 1/2, 1/2, 1/4, 1/4, etc.) then
it'll take longer for your filter to settle out to a "good" value.  I
couldn't tell you the details of what you gain and lose unless I did the
math.

Another point to ponder, unless you're never going to use this filter in
the long term, is that it will not be the optimal filter if the quantity
that you're estimating changes, and it will inevitably stop working with
finite word-length arithmetic at some point as the gain goes to zero.

I suspect that for a real-world problem, there's some minimum gain that
you want to get down to and stick at.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
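
A minimal fixed-point sketch of the gain scheduling described above, with
gain = 2^-shift and the shift growing roughly as log2 of the sample count
until it hits a floor; the schedule, names, and widths are illustrative
only:

    #include <stdint.h>

    #define MIN_GAIN_SHIFT 13              /* floor the gain at 2^-13, near 1/8000 */

    typedef struct {
        int32_t  est_q16;                  /* current estimate, Q16.16 */
        uint32_t n;                        /* samples folded in so far */
    } cma_state_t;

    void cma_step(cma_state_t *s, int16_t x)
    {
        uint32_t shift = 0;

        s->n++;
        while ((1u << shift) < s->n && shift < MIN_GAIN_SHIFT)
            shift++;                       /* gain = 2^-shift ~= 1/n, capped at the floor */

        /* est -= (est - x) * gain; this follows the faster of the two
           schedules mentioned above: 1, 1/2, 1/4, 1/4, 1/8, ... */
        s->est_q16 -= (s->est_q16 - ((int32_t)x << 16)) >> shift;
    }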
On 9/30/2015 6:15 PM, Tim Wescott wrote:
> (snip)
>
> The downside to the filter is that you have that nasty divide in there.
> So when you change it to
>
> rolling_average = rolling_average - (rolling_average-newValue) * gain
>
> where "gain" is constrained to be an inverse power of 2, you lose some
> of the filter's optimality.
>
> (snip)
I can't understand why the above formula is chosen to implement this
average.  It is exactly the same calculation as the one I provided.  When
implemented it has the limitation of accumulating errors from round off in
the division.  The actual calculation is

    AVG = (1/N) * sum( X(i), i = 1..N )

This is calculated for each value of N as samples enter the system and N
grows.  The sum can be maintained separate from the AVG and updated with a
single addition.  The AVG result is calculated by 1 division.  There are
actually 1 fewer additions this way and the same 1 division.  But the
round off errors of the divisions only impact the current output and are
not accumulated.  The problem of the divisor becoming too small to
calculate with is replaced with the issue of the sum becoming too large
for the word size, easily mitigated with double or multiple precision
arithmetic.

-- Rick
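
A minimal C sketch of that arrangement; the 64-bit accumulator width is an
assumption, chosen so that 8000 samples of 32-bit data cannot overflow it:

    #include <stdint.h>

    static int64_t  sum;     /* wide accumulator: one add per sample */
    static uint32_t count;   /* N, the number of samples so far */

    void accumulate(int32_t newValue)
    {
        sum += newValue;
        count++;
    }

    int32_t read_average(void)
    {
        /* the single divide, done only when a result is actually needed */
        return (count != 0) ? (int32_t)(sum / count) : 0;
    }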
On Wed, 30 Sep 2015 18:59:10 -0400, rickman wrote:

> (snip)
>
> This is calculated for each value of N as samples enter the system and N
> grows.  The sum can be maintained separate from the AVG and updated with
> a single addition.  The AVG result is calculated by 1 division.  There
> are actually 1 fewer additions this way and the same 1 division.  But
> the round off errors of the divisions only impact the current output and
> are not accumulated.  The problem of the divisor becoming too small to
> calculate with is replaced with the issue of the sum becoming too large
> for the word size, easily mitigated with double or multiple precision
> arithmetic.
Hmm.  Yes, that's right.  You still have the divide, which I assume is the
most expensive part, but you would turn some obscure problems into nice
straightforward ones.

I don't see how this method could be translated into the "only divide by
powers of two" method -- but I'm not putting much effort into the
thoughts!

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
On 9/30/2015 7:30 PM, Tim Wescott wrote:
> (snip)
>
> Hmm.  Yes, that's right.  You still have the divide, which I assume is
> the most expensive part, but you would turn some obscure problems into
> nice straightforward ones.
>
> I don't see how this method could be translated into the "only divide by
> powers of two" method -- but I'm not putting much effort into the
> thoughts!
The OP hasn't replied if he really needs a result on every new input
sample.  I'm thinking he really just needs a result on the final sample of
the block, but who knows.

-- Rick