DSPRelated.com
Forums

How to simplify a rolling average in an FPGA

Started by garengllc September 30, 2015
On Wed, 30 Sep 2015 12:56:59 -0500, garengllc wrote:

>>I know I may not be able to simplify things to give me a bit perfect
>>match to the C code, but is there a way to compute this rolling average
>>in an FPGA that is lightweight and a close approximation?
>
> I may have worked it out, but would still like to hear other's opinions.
> The code I ended up with is:
>    rolling_average = rolling_average -
>        (rolling_average-newValue)/rolling_average_num_max
>
> I am looking into converting rolling_average_num_max to the closest
> power of two, and then using that to shift the
> (rolling_average-newValue) numerator.
You didn't specify VHDL or Verilog.  Ok, toss a coin, VHDL it is.

Converting the above C into VHDL gives something like this:


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity averager is
  generic (
    g_width      : positive;
    g_pole_shift : positive
  );
  port (
    clk             : in  std_logic;
    newValue        : in  signed(g_width-1 downto 0);
    rolling_average : out signed(g_width-1 downto 0)
  );
end entity averager;

architecture comp_dsp of averager is

begin  -- architecture comp_dsp of entity averager

  naive_attempt : process (clk)
    constant width : positive := g_width + g_pole_shift;
    variable sum   : signed(width-1 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      sum := sum - (sum - newValue * 2**g_pole_shift) / 2**g_pole_shift;
      rolling_average <= sum(width-1 downto width-g_width);
    end if;
  end process naive_attempt;

end architecture comp_dsp;  -- of entity averager


The arithmetic will synthesise to two adders, despite
the use of * and / in the source.

I called this one "naive" because it's simple and obvious but
really bad - the intermediate result has full precision but
the output is just the most significant bits of the internal
value.

BTW, I haven't tested this; don't use it for anything that matters.

Regards,
Allan
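For readers following along in C, here is a minimal sketch of the
power-of-two version the OP describes (rolling_average_num_max rounded to
2^K); the K = 6 value, the 16-bit input width and the fixed-point scaling
are illustrative assumptions, not taken from the original code:

   #include <stdint.h>

   #define K 6                             /* assume rolling_average_num_max ~ 2^6 = 64 */

   static int32_t acc = 0;                 /* rolling_average held with K extra fraction
                                              bits so the divide-as-shift doesn't lose them */

   int16_t avg_update_pow2(int16_t newValue)
   {
       int32_t x = (int32_t)newValue << K; /* scale the new sample to match the accumulator */
       acc -= (acc - x) >> K;              /* the OP's update with the divide as a shift
                                              (assumes arithmetic shift for negative values) */
       return (int16_t)(acc >> K);         /* drop the fraction bits on the way out */
   }

As in the VHDL above, the arithmetic reduces to a couple of adders; the
multiply and divide by 2^K are just bit selection.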
On Thu, 01 Oct 2015 12:13:26 +0000, Allan Herriman wrote:

> I called this one "naive" because it's simple and obvious but
> really bad - the intermediate result has full precision but
> the output is just the most significant bits of the internal
> value.
And here's one that's slightly less suckful:

(Same entity declaration as before)

  -- This one uses the fraction saving method
  -- described in the Randy Yates / Richard Lyons
  -- DC Blocker paper.
  -- Variable names used here mostly match the
  -- names used in Figure 2 of that paper.
  fraction_saving : process (clk)
    variable s    : signed(g_width-1 downto 0);
    variable u    : signed(g_width+g_pole_shift-1 downto 0);
    variable f_d  : unsigned(g_pole_shift-1 downto 0) := (others => '0');
    variable y    : signed(g_width-1 downto 0) := (others => '0');
    variable w    : signed(g_width-1 downto 0);
    constant tail : signed(g_pole_shift-1 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      w := newValue;
      u := signed('0' & f_d) + (y & tail) - y;
      s := u(g_width+g_pole_shift-1 downto g_pole_shift);
      f_d := unsigned(u(g_pole_shift-1 downto 0));
      y := w + s;
      rolling_average <= y;
    end if;
  end process fraction_saving;


As before, I haven't tested this; don't use it for anything
that matters.

Regards,
Allan
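For anyone who'd rather see the fraction-saving trick in plain C (rickman
asks about it further down), this is my reading of what the process above
computes; K = 6, the 32/64-bit widths and the arithmetic-shift behaviour
on negative values are assumptions on my part:

   #include <stdint.h>

   #define K 6                               /* plays the role of g_pole_shift */

   static uint32_t f_d = 0;                  /* saved fraction: the K bits that would
                                                otherwise be thrown away each cycle */
   static int32_t  y   = 0;                  /* filter state / output */

   int32_t fraction_saving_update(int32_t w) /* w is newValue */
   {
       /* u = f_d + y*2^K - y, i.e. y*(2^K - 1) plus the fraction saved last time */
       int64_t u = (int64_t)f_d + ((int64_t)y << K) - y;

       int32_t s = (int32_t)(u >> K);        /* integer part: (1 - 2^-K) * y */
       f_d = (uint32_t)(u & ((1 << K) - 1)); /* save the fraction for next cycle */

       y = w + s;                            /* y[n] = x[n] + (1 - 2^-K)*y[n-1] */
       return y;
   }

Feeding the fraction back means the truncation error doesn't accumulate
as a DC offset, which, as I understand it, is the point of the method in
the paper.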
All, thank you so much for all the feedback and explanations, I appreciate
it!

I've read through everything once, but I think I need to go through it a
couple more times to make sure I am following everyone's
points/counterpoints.

One quick answer (so that you know that I didn't disappear into the
ether) is that I do need the output of rolling_average on every clock
(which is probably why the original C coder called it a rolling average
even though it isn't exactly acting like one).

The rolling_average_num_max is an odd value to me.  It is a constant
variable that will not change once the FPGA is built.  What it looks
like to me is that rolling_average_num keeps track of the depth of the
averaging window, from [0, rolling_average_num_max].  So even in the
beginning, when the average only has a few values in it, it will compute
the average based on those.  Once the system has been running for a
while, rolling_average_num will always be equal to
rolling_average_num_max, and that variable in the averaging equation
will be a constant.  Does that make sense?
---------------------------------------
Posted through http://www.DSPRelated.com
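Reading that description literally, the warm-up and steady-state
behaviour would look something like the sketch below.  This is only my
reading of the description, not the original C, and the 8000 maximum is
the figure mentioned later in the thread:

   #include <stdint.h>

   #define ROLLING_AVERAGE_NUM_MAX 8000

   static int32_t rolling_average     = 0;
   static int32_t rolling_average_num = 0;

   int32_t avg_update_warmup(int32_t newValue)
   {
       if (rolling_average_num < ROLLING_AVERAGE_NUM_MAX)
           rolling_average_num++;            /* depth of the window seen so far */

       rolling_average -= (rolling_average - newValue) / rolling_average_num;
       return rolling_average;
   }

While rolling_average_num is still counting up, this recursion tracks the
exact running mean of the samples seen so far (in real arithmetic); once
the count saturates, it degenerates into the constant-divisor form quoted
at the top of the thread.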
>And here's one that's slightly less suckful:
>
>(Same entity declaration as before)
>
>  -- This one uses the fraction saving method
>  -- described in the Randy Yates / Richard Lyons
>  -- DC Blocker paper.
>
>As before, I haven't tested this; don't use it for anything
>that matters.
>
>Regards,
>Allan
Just saw this post (I didn't notice I had two pages of replies, you guys
rock!), sorry.  Of course I am using Verilog, but that is OK, this is
pretty straightforward.  I am going to look over the paper and this code
today as well!
---------------------------------------
Posted through http://www.DSPRelated.com
On 10/1/2015 9:11 AM, garengllc wrote:
> The rolling_average_num_max is an odd value to me.  It is a constant
> variable that will not change once the FPGA is built.  What it looks
> like to me is that rolling_average_num keeps track of the depth of the
> averaging window, from [0, rolling_average_num_max].  So even in the
> beginning, when the average only has a few values in it, it will
> compute the average based on those.  Once the system has been running
> for a while, rolling_average_num will always be equal to
> rolling_average_num_max, and that variable in the averaging equation
> will be a constant.  Does that make sense?
No, not really.  I don't see any use in the calculations for
rolling_average_num_max other than to set a limit on the computation
period if desired.  Why is there a max at all?  The only limits are in
the resolution of the calculations.

If you cap rolling_average_num at rolling_average_num_max and continue
to calculate the average without removing the oldest data (as you would
in a boxcar average), you will be distorting your results.

--

Rick
On 10/1/2015 8:41 AM, Allan Herriman wrote:
> And here's one that's slightly less suckful:
>
> (Same entity declaration as before)
>
>   -- This one uses the fraction saving method
>   -- described in the Randy Yates / Richard Lyons
>   -- DC Blocker paper.
>   -- Variable names used here mostly match the
>   -- names used in Figure 2 of that paper.
>
> As before, I haven't tested this; don't use it for anything
> that matters.
I don't know the "fraction saving method", so I would have to work pretty hard to understand this code. If it implements the same algorithm as your first example it would not be an average. Your initial did not divide by the number of samples and in fact didn't keep track of the number of samples. I assume this one also doesn't divide by the number of samples because I don't see where you keep track of the number of samples. That would explain why there is no division or multiplication required to implement them. -- Rick
On Thu, 01 Oct 2015 09:34:25 -0400, rickman wrote:

> I don't know the "fraction saving method", so I would have to work
> pretty hard to understand this code.  If it implements the same
> algorithm as your first example it would not be an average.  Your
> initial example did not divide by the number of samples and in fact
> didn't keep track of the number of samples.  I assume this one also
> doesn't divide by the number of samples because I don't see where you
> keep track of the number of samples.  That would explain why there is
> no division or multiplication required to implement them.
You might know fraction saving as error feedback, or perhaps delta sigma
modulation.  I recommend reading the Yates / Lyons paper.

Based on both the C code and the description, the OP seems to be asking
for a filter with a u(t).exp(-t) shaped impulse response.  One does not
need to waste logic keeping track of the number of samples to do that.

Of course, I may have interpreted the requirements wrongly.

Allan
On 10/1/2015 10:12 AM, Allan Herriman wrote:
> You might know fraction saving as error feedback, or perhaps delta
> sigma modulation.  I recommend reading the Yates / Lyons paper.
>
> Based on both the C code and the description, the OP seems to be asking
> for a filter with a u(t).exp(-t) shaped impulse response.  One does not
> need to waste logic keeping track of the number of samples to do that.
>
> Of course, I may have interpreted the requirements wrongly.
I believe you need to reexamine the original post and possibly others as
well.  The OP is not designing a filter.  He is implementing an average
that updates the result on every new sample arriving.  Wikipedia calls
this the cumulative moving average.  It looks a bit like a filter because
of the way he wrote the equations, but it actually is just a simple
average.  Of course, in the strict sense that is also a filter, but it is
simple enough that there is no need to treat it as a complex filter.  The
code for this is just

   avg_num = avg_num + 1 ;
   sum     = sum + newValue ;
   avg     = sum / avg_num ;

Throw this in a clocked process (or an always @ (posedge clk) block) and
you have your average.  The only sticking point is a simple way to
calculate the divide.  Of course there needs to be initialization code,
and there may need to be code to restart the average if the OP needs
that.

The division can be done fairly easily for small values of avg_num using
a look up table based on a block RAM.  The reciprocal of avg_num would be
obtained from the lookup table and then multiplied by the sum to get the
average.  I believe the OP indicated the max value of the sample count is
8000, which should not consume too many resources in most FPGAs for the
reciprocal calculation.  This avoids the accumulated errors from the form
that looks like a filter.

--

Rick
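A rough C model of that reciprocal-table scheme, mainly to make the
scaling explicit.  The 16-bit samples, the 30-bit reciprocal precision
and the names are illustrative assumptions; the 8000-entry table depth is
the figure mentioned above:

   #include <stdint.h>

   #define MAX_N      8000      /* max sample count mentioned in the thread */
   #define RECIP_FRAC 30        /* fraction bits of the stored reciprocal; needs
                                   roughly log2(MAX_N) bits more than the desired
                                   output precision */

   static uint32_t recip_lut[MAX_N + 1];   /* would be a block RAM in the FPGA */

   void init_recip_lut(void)
   {
       recip_lut[0] = 0;                   /* unused entry */
       for (uint32_t n = 1; n <= MAX_N; n++)
           recip_lut[n] = (uint32_t)(((1ull << RECIP_FRAC) + n / 2) / n); /* round(2^30/n) */
   }

   static uint64_t sum     = 0;            /* running sum of the samples so far */
   static uint32_t avg_num = 0;            /* number of samples summed so far */

   /* One update of the cumulative moving average: accumulate the new sample
      and multiply by the tabulated reciprocal instead of dividing.  Restart
      (sum = 0, avg_num = 0) before avg_num would exceed MAX_N.              */
   uint16_t cma_update(uint16_t newValue)
   {
       avg_num++;
       sum += newValue;
       return (uint16_t)((sum * recip_lut[avg_num]) >> RECIP_FRAC);
   }

One lookup and one multiply per update, as described, and the sum itself
stays exact, so the accumulated-error problem of the filter-like form
doesn't arise.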

>The division can be done fairly easily for small values of avg_num using
>a look up table based on a block RAM.  The reciprocal of avg_num would
>be obtained from the lookup table and then multiplied by the sum to get
>the average.  I believe the OP indicated the max value of the sample
>count is 8000, which should not consume too many resources in most FPGAs
>for the reciprocal calculation.  This avoids the accumulated errors from
>the form that looks like a filter.
>
>--
>
>Rick
For a running averager: have n delay stages, subtract the oldest sample
(the one leaving the delay line) from the new sample, and accumulate
continuously.  For the (scaling back) division: just use 2^n taps (say
64), then discard 6 LSBs from the result.

I explained that already in my first response but it got drowned out.

Kaz
---------------------------------------
Posted through http://www.DSPRelated.com
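A minimal C sketch of that boxcar structure, with the 64-tap / 6-LSB
numbers from the post; the circular-buffer bookkeeping and the 32-bit
sample width are just one way to model the delay stages:

   #include <stdint.h>

   #define TAPS  64              /* 2^6 taps, as suggested above */
   #define SHIFT  6              /* dividing by 64 = discarding 6 LSBs */

   static int32_t  delay[TAPS];  /* the delay stages (zero at start) */
   static uint32_t head = 0;     /* index of the oldest sample */
   static int64_t  acc  = 0;     /* running sum of the last TAPS samples */

   int32_t boxcar_update(int32_t newValue)
   {
       acc += (int64_t)newValue - delay[head]; /* add the new sample, drop the oldest */
       delay[head] = newValue;                 /* the new sample takes its slot */
       head = (head + 1) % TAPS;               /* advance the circular buffer */
       return (int32_t)(acc >> SHIFT);         /* scale back: discard 6 LSBs
                                                  (assumes arithmetic shift if acc < 0) */
   }

In an FPGA the delay line would typically be a block RAM or shift
register rather than an array, but the add-new / subtract-oldest /
accumulate structure is the same.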
On Thu, 01 Oct 2015 11:22:07 -0400, rickman wrote:

> I believe you need to reexamine the original post and possibly others
> as well.  The OP is not designing a filter.  He is implementing an
> average that updates the result on every new sample arriving.
> Wikipedia calls this the cumulative moving average.
>
> The division can be done fairly easily for small values of avg_num
> using a look up table based on a block RAM.  The reciprocal of avg_num
> would be obtained from the lookup table and then multiplied by the sum
> to get the average.  I believe the OP indicated the max value of the
> sample count is 8000, which should not consume too many resources in
> most FPGAs for the reciprocal calculation.  This avoids the accumulated
> errors from the form that looks like a filter.
Dang.  That form looked so much like a single pole IIR filter I thought
it was one.

For the division, I was thinking that perhaps Newton-Raphson to find the
reciprocal might work well, as the reciprocal at any input sample will be
a good approximation for the next reciprocal, meaning that only a single
N-R iteration per input sample should give an accurate result.  Each N-R
iteration would require two multiplications.

Allan
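A tiny floating-point sketch of that idea; the fixed-point mapping an
FPGA would actually use, and the starting count, are assumptions on my
part:

   #include <stdio.h>

   /* One Newton-Raphson step toward 1/d:  r' = r * (2 - d*r).
      Those are the two multiplications mentioned above.        */
   static double nr_recip_step(double r, double d)
   {
       return r * (2.0 - d * r);
   }

   int main(void)
   {
       /* Track 1/avg_num as avg_num counts up, one N-R step per new sample,
          seeded with the previous reciprocal.  For very small counts the
          seed is too far off (1/(n-1) vs 1/n), so an exact start is assumed
          here; the steady-state behaviour is the interesting part.          */
       int    n     = 16;
       double recip = 1.0 / n;

       while (n < 8000) {
           n++;
           recip = nr_recip_step(recip, (double)n);
       }
       printf("n = %d  N-R recip = %.12f  exact = %.12f\n", n, recip, 1.0 / n);
       return 0;
   }

With one iteration per sample the relative error settles at roughly
1/n^2, so by the time the count is large this gives essentially the same
answer as the tabulated reciprocal.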