DSPRelated.com
Forums

How to simplify a rolling average in an FPGA

Started by garengllc September 30, 2015
On Wed, 30 Sep 2015 12:56:59 -0500, garengllc wrote:

>>I know I may not be able to simplify things to give me a bit perfect
>>match to the C code, but is there a way to compute this rolling average
>>in an FPGA that is lightweight and a close approximation?
>
> I may have worked it out, but would still like to hear other's opinions.
> The code I ended up with is:
>    rolling_average = rolling_average -
>        (rolling_average-newValue)/rolling_average_num_max
>
> I am looking into converting rolling_average_num_max to the closest
> power of two, and then using that to shift the
> (rolling_average-newValue) numerator.
You didn't specify VHDL or Verilog.  Ok, toss a coin, VHDL it is.

Converting the above C into VHDL gives something like this:


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity averager is
  generic (
    g_width      : positive;
    g_pole_shift : positive
  );
  port (
    clk             : in  std_logic;
    newValue        : in  signed(g_width-1 downto 0);
    rolling_average : out signed(g_width-1 downto 0)
  );
end entity averager;

architecture comp_dsp of averager is

begin  -- architecture comp_dsp of entity averager

  naive_attempt : process (clk)
    constant width : positive := g_width + g_pole_shift;
    variable sum   : signed(width-1 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      sum := sum - (sum - newValue * 2**g_pole_shift) / 2**g_pole_shift;
      rolling_average <= sum(width-1 downto width-g_width);
    end if;
  end process naive_attempt;

end architecture comp_dsp;  -- of entity averager


The arithmetic will synthesise to two adders, despite
the use of * and / in the source.

I called this one "naive" because it's simple and obvious but
really bad - the intermediate result has full precision but
the output is just the most significant bits of the internal
value.

BTW, I haven't tested this; don't use it for anything that matters.

Regards,
Allan
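For readers following along in C, here is a minimal sketch of the
power-of-two version the OP describes (rolling_average_num_max rounded to
2^K); the K = 6 value, the 16-bit input width and the fixed-point scaling
are illustrative assumptions, not taken from the original code:

   #include <stdint.h>

   #define K 6                             /* assume rolling_average_num_max ~ 2^6 = 64 */

   static int32_t acc = 0;                 /* rolling_average held with K extra fraction
                                              bits so the divide-as-shift doesn't lose them */

   int16_t avg_update_pow2(int16_t newValue)
   {
       int32_t x = (int32_t)newValue << K; /* scale the new sample to match the accumulator */
       acc -= (acc - x) >> K;              /* the OP's update with the divide as a shift
                                              (assumes arithmetic shift for negative values) */
       return (int16_t)(acc >> K);         /* drop the fraction bits on the way out */
   }

As in the VHDL above, the arithmetic reduces to a couple of adders; the
multiply and divide by 2^K are just bit selection.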
On Thu, 01 Oct 2015 12:13:26 +0000, Allan Herriman wrote:

> I called this one "naive" because it's simple and obvious but
> really bad - the intermediate result has full precision but
> the output is just the most significant bits of the internal
> value.
And here's one that's slightly less suckful:

(Same entity declaration as before)

  -- This one uses the fraction saving method
  -- described in the Randy Yates / Richard Lyons
  -- DC Blocker paper.
  -- Variable names used here mostly match the
  -- names used in Figure 2 of that paper.
  fraction_saving : process (clk)
    variable s    : signed(g_width-1 downto 0);
    variable u    : signed(g_width+g_pole_shift-1 downto 0);
    variable f_d  : unsigned(g_pole_shift-1 downto 0) := (others => '0');
    variable y    : signed(g_width-1 downto 0) := (others => '0');
    variable w    : signed(g_width-1 downto 0);
    constant tail : signed(g_pole_shift-1 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      w := newValue;
      u := signed('0' & f_d) + (y & tail) - y;
      s := u(g_width+g_pole_shift-1 downto g_pole_shift);
      f_d := unsigned(u(g_pole_shift-1 downto 0));
      y := w + s;
      rolling_average <= y;
    end if;
  end process fraction_saving;


As before, I haven't tested this; don't use it for anything
that matters.

Regards,
Allan
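For anyone who'd rather see the fraction-saving trick in plain C (rickman
asks about it further down), this is my reading of what the process above
computes; K = 6, the 32/64-bit widths and the arithmetic-shift behaviour
on negative values are assumptions on my part:

   #include <stdint.h>

   #define K 6                               /* plays the role of g_pole_shift */

   static uint32_t f_d = 0;                  /* saved fraction: the K bits that would
                                                otherwise be thrown away each cycle */
   static int32_t  y   = 0;                  /* filter state / output */

   int32_t fraction_saving_update(int32_t w) /* w is newValue */
   {
       /* u = f_d + y*2^K - y, i.e. y*(2^K - 1) plus the fraction saved last time */
       int64_t u = (int64_t)f_d + ((int64_t)y << K) - y;

       int32_t s = (int32_t)(u >> K);        /* integer part: (1 - 2^-K) * y */
       f_d = (uint32_t)(u & ((1 << K) - 1)); /* save the fraction for next cycle */

       y = w + s;                            /* y[n] = x[n] + (1 - 2^-K)*y[n-1] */
       return y;
   }

Feeding the fraction back means the truncation error doesn't accumulate
as a DC offset, which, as I understand it, is the point of the method in
the paper.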
All, thank you so much for all the feedback and explanations, I appreciate
it!

I've read through everything once, but I think I need to go through it a
couple more times to make sure I am following everyone's
points/counterpoints.

One quick answer (so that you know that I didn't disappear into the
ether) is that I do need the output of rolling_average on every clock
(which is probably why the original C coder called it a rolling average
even though it isn't exactly acting like one).

The rolling_average_num_max is an odd value to me.  It is a constant
variable that will not change once the FPGA is built.  What it looks
like to me is that rolling_average_num keeps track of the depth of the
averaging window, from [0, rolling_average_num_max].  So even in the
beginning, when the average only has a few values in it, it will compute
the average based on those.  Once the system has been running for a
while, rolling_average_num will always be equal to
rolling_average_num_max, and that variable in the averaging equation
will be a constant.  Does that make sense?
---------------------------------------
Posted through http://www.DSPRelated.com
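Reading that description literally, the warm-up and steady-state
behaviour would look something like the sketch below.  This is only my
reading of the description, not the original C, and the 8000 maximum is
the figure mentioned later in the thread:

   #include <stdint.h>

   #define ROLLING_AVERAGE_NUM_MAX 8000

   static int32_t rolling_average     = 0;
   static int32_t rolling_average_num = 0;

   int32_t avg_update_warmup(int32_t newValue)
   {
       if (rolling_average_num < ROLLING_AVERAGE_NUM_MAX)
           rolling_average_num++;            /* depth of the window seen so far */

       rolling_average -= (rolling_average - newValue) / rolling_average_num;
       return rolling_average;
   }

While rolling_average_num is still counting up, this recursion tracks the
exact running mean of the samples seen so far (in real arithmetic); once
the count saturates, it degenerates into the constant-divisor form quoted
at the top of the thread.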
>And here's one that's slightly less suckful:
>
>(Same entity declaration as before)
>
>  -- This one uses the fraction saving method
>  -- described in the Randy Yates / Richard Lyons
>  -- DC Blocker paper.
>
>As before, I haven't tested this; don't use it for anything
>that matters.
>
>Regards,
>Allan
Just saw this post (I didn't notice I had two pages of replies, you guys
rock!), sorry.  Of course I am using Verilog, but that is OK, this is
pretty straightforward.  I am going to look over the paper and this code
today as well!
---------------------------------------
Posted through http://www.DSPRelated.com
On 10/1/2015 9:11 AM, garengllc wrote:
> The rolling_average_num_max is an odd value to me.  It is a constant
> variable that will not change once the FPGA is built.  What it looks
> like to me is that rolling_average_num keeps track of the depth of the
> averaging window, from [0, rolling_average_num_max].  So even in the
> beginning, when the average only has a few values in it, it will
> compute the average based on those.  Once the system has been running
> for a while, rolling_average_num will always be equal to
> rolling_average_num_max, and that variable in the averaging equation
> will be a constant.  Does that make sense?
No, not really.  I don't see any use in the calculations for
rolling_average_num_max other than to set a limit on the computation
period if desired.  Why is there a max at all?  The only limits are in
the resolution of the calculations.

If you cap rolling_average_num at rolling_average_num_max and continue
to calculate the average without removing the oldest data (as you would
in a boxcar average), you will be distorting your results.

--

Rick
On 10/1/2015 8:41 AM, Allan Herriman wrote:
> And here's one that's slightly less suckful:
>
> (Same entity declaration as before)
>
>   -- This one uses the fraction saving method
>   -- described in the Randy Yates / Richard Lyons
>   -- DC Blocker paper.
>   -- Variable names used here mostly match the
>   -- names used in Figure 2 of that paper.
>
> As before, I haven't tested this; don't use it for anything
> that matters.
I don't know the "fraction saving method", so I would have to work pretty hard to understand this code. If it implements the same algorithm as your first example it would not be an average. Your initial did not divide by the number of samples and in fact didn't keep track of the number of samples. I assume this one also doesn't divide by the number of samples because I don't see where you keep track of the number of samples. That would explain why there is no division or multiplication required to implement them. -- Rick
On Thu, 01 Oct 2015 09:34:25 -0400, rickman wrote:

> I don't know the "fraction saving method", so I would have to work
> pretty hard to understand this code.  If it implements the same
> algorithm as your first example it would not be an average.  Your
> initial example did not divide by the number of samples and in fact
> didn't keep track of the number of samples.  I assume this one also
> doesn't divide by the number of samples because I don't see where you
> keep track of the number of samples.  That would explain why there is
> no division or multiplication required to implement them.
You might know fraction saving as error feedback, or perhaps delta sigma
modulation.  I recommend reading the Yates / Lyons paper.

Based on both the C code and the description, the OP seems to be asking
for a filter with a u(t).exp(-t) shaped impulse response.  One does not
need to waste logic keeping track of the number of samples to do that.

Of course, I may have interpreted the requirements wrongly.

Allan
On 10/1/2015 10:12 AM, Allan Herriman wrote:
> You might know fraction saving as error feedback, or perhaps delta
> sigma modulation.  I recommend reading the Yates / Lyons paper.
>
> Based on both the C code and the description, the OP seems to be asking
> for a filter with a u(t).exp(-t) shaped impulse response.  One does not
> need to waste logic keeping track of the number of samples to do that.
>
> Of course, I may have interpreted the requirements wrongly.
I believe you need to reexamine the original post and possibly others as
well.  The OP is not designing a filter.  He is implementing an average
that updates the result on every new sample arriving.  Wikipedia calls
this the cumulative moving average.  It looks a bit like a filter because
of the way he wrote the equations, but it actually is just a simple
average.  Of course, in the strict sense that is also a filter, but it is
simple enough that there is no need to treat it as a complex filter.  The
code for this is just

   avg_num = avg_num + 1 ;
   sum     = sum + newValue ;
   avg     = sum / avg_num ;

Throw this in a clocked process (or an always @ (posedge clk) block) and
you have your average.  The only sticking point is a simple way to
calculate the divide.  Of course there needs to be initialization code,
and there may need to be code to restart the average if the OP needs
that.

The division can be done fairly easily for small values of avg_num using
a look up table based on a block RAM.  The reciprocal of avg_num would be
obtained from the lookup table and then multiplied by the sum to get the
average.  I believe the OP indicated the max value of the sample count is
8000, which should not consume too many resources in most FPGAs for the
reciprocal calculation.  This avoids the accumulated errors from the form
that looks like a filter.

--

Rick
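A rough C model of that reciprocal-table scheme, mainly to make the
scaling explicit.  The 16-bit samples, the 30-bit reciprocal precision
and the names are illustrative assumptions; the 8000-entry table depth is
the figure mentioned above:

   #include <stdint.h>

   #define MAX_N      8000      /* max sample count mentioned in the thread */
   #define RECIP_FRAC 30        /* fraction bits of the stored reciprocal; needs
                                   roughly log2(MAX_N) bits more than the desired
                                   output precision */

   static uint32_t recip_lut[MAX_N + 1];   /* would be a block RAM in the FPGA */

   void init_recip_lut(void)
   {
       recip_lut[0] = 0;                   /* unused entry */
       for (uint32_t n = 1; n <= MAX_N; n++)
           recip_lut[n] = (uint32_t)(((1ull << RECIP_FRAC) + n / 2) / n); /* round(2^30/n) */
   }

   static uint64_t sum     = 0;            /* running sum of the samples so far */
   static uint32_t avg_num = 0;            /* number of samples summed so far */

   /* One update of the cumulative moving average: accumulate the new sample
      and multiply by the tabulated reciprocal instead of dividing.  Restart
      (sum = 0, avg_num = 0) before avg_num would exceed MAX_N.              */
   uint16_t cma_update(uint16_t newValue)
   {
       avg_num++;
       sum += newValue;
       return (uint16_t)((sum * recip_lut[avg_num]) >> RECIP_FRAC);
   }

One lookup and one multiply per update, as described, and the sum itself
stays exact, so the accumulated-error problem of the filter-like form
doesn't arise.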

>The division can be done fairly easily for small values of avg_num using
>a look up table based on a block RAM.  The reciprocal of avg_num would
>be obtained from the lookup table and then multiplied by the sum to get
>the average.  I believe the OP indicated the max value of the sample
>count is 8000, which should not consume too many resources in most FPGAs
>for the reciprocal calculation.  This avoids the accumulated errors from
>the form that looks like a filter.
>
>--
>
>Rick
For a running averager: have n delay stages, subtract the oldest sample
(the one leaving the delay line) from the new sample, and accumulate
continuously.  For the (scaling back) division: just use 2^n taps (say
64), then discard 6 LSBs from the result.

I explained that already in my first response but it got drowned out.

Kaz
---------------------------------------
Posted through http://www.DSPRelated.com
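A minimal C sketch of that boxcar structure, with the 64-tap / 6-LSB
numbers from the post; the circular-buffer bookkeeping and the 32-bit
sample width are just one way to model the delay stages:

   #include <stdint.h>

   #define TAPS  64              /* 2^6 taps, as suggested above */
   #define SHIFT  6              /* dividing by 64 = discarding 6 LSBs */

   static int32_t  delay[TAPS];  /* the delay stages (zero at start) */
   static uint32_t head = 0;     /* index of the oldest sample */
   static int64_t  acc  = 0;     /* running sum of the last TAPS samples */

   int32_t boxcar_update(int32_t newValue)
   {
       acc += (int64_t)newValue - delay[head]; /* add the new sample, drop the oldest */
       delay[head] = newValue;                 /* the new sample takes its slot */
       head = (head + 1) % TAPS;               /* advance the circular buffer */
       return (int32_t)(acc >> SHIFT);         /* scale back: discard 6 LSBs
                                                  (assumes arithmetic shift if acc < 0) */
   }

In an FPGA the delay line would typically be a block RAM or shift
register rather than an array, but the add-new / subtract-oldest /
accumulate structure is the same.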
On Thu, 01 Oct 2015 11:22:07 -0400, rickman wrote:

> I believe you need to reexamine the original post and possibly others
> as well.  The OP is not designing a filter.  He is implementing an
> average that updates the result on every new sample arriving.
> Wikipedia calls this the cumulative moving average.
>
> The division can be done fairly easily for small values of avg_num
> using a look up table based on a block RAM.  The reciprocal of avg_num
> would be obtained from the lookup table and then multiplied by the sum
> to get the average.  I believe the OP indicated the max value of the
> sample count is 8000, which should not consume too many resources in
> most FPGAs for the reciprocal calculation.  This avoids the accumulated
> errors from the form that looks like a filter.
Dang.  That form looked so much like a single pole IIR filter I thought
it was one.

For the division, I was thinking that perhaps Newton-Raphson to find the
reciprocal might work well, as the reciprocal at any input sample will be
a good approximation for the next reciprocal, meaning that only a single
N-R iteration per input sample should give an accurate result.  Each N-R
iteration would require two multiplications.

Allan
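A tiny floating-point sketch of that idea; the fixed-point mapping an
FPGA would actually use, and the starting count, are assumptions on my
part:

   #include <stdio.h>

   /* One Newton-Raphson step toward 1/d:  r' = r * (2 - d*r).
      Those are the two multiplications mentioned above.        */
   static double nr_recip_step(double r, double d)
   {
       return r * (2.0 - d * r);
   }

   int main(void)
   {
       /* Track 1/avg_num as avg_num counts up, one N-R step per new sample,
          seeded with the previous reciprocal.  For very small counts the
          seed is too far off (1/(n-1) vs 1/n), so an exact start is assumed
          here; the steady-state behaviour is the interesting part.          */
       int    n     = 16;
       double recip = 1.0 / n;

       while (n < 8000) {
           n++;
           recip = nr_recip_step(recip, (double)n);
       }
       printf("n = %d  N-R recip = %.12f  exact = %.12f\n", n, recip, 1.0 / n);
       return 0;
   }

With one iteration per sample the relative error settles at roughly
1/n^2, so by the time the count is large this gives essentially the same
answer as the tabulated reciprocal.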