>
> > Adaptively scale only if required to prevent overflowing
> > your data type limit.
>
> > Divide by 2 if the intermediate results are large (above
> > some threshold related to the number of bits).
>
> That is, divide the entire output vector after each level
> of butterfly, if any one output is above some threshold.
>
> > Count the
> > number of divides and include that resulting scale factor
> > as part of the result. You end up with a pseudo float
> > result vector (multiple mantissa's sharing one offset
> > exponent).
this is what i think they used to call "block floating-point". i
think the trick is, to keep it "fast" (as one might expect for an FFT)
is to decide at the beginning of the pass whether the pass is a divide-
by-two pass or an unscaled pass. when that is decided, different
optimized code for one of the two cases is performed (one mode that
divides by two right just before the butterfly stores its results and
the other mode that does not).
then the trick is to anticipate, in the worst case, whether overflow
*could* possibly happen if you don't scale. this is done in the
previous pass when each result is stored. assuming to start that the
real and imag components have magnitude no greater than 1 (full scale)
and overflow occurs when either the real or imaginary part of the
result exceeds 1 and, because the butterfly twiddle factor could turn
that square region 45 degrees creating a number sqrt(2) times as big,
then (because of the addition in the butterfly), you *could*
potentially have overflow if any of your values had real or imaginary
parts with magnitude greater than 1/(sqrt(2)+1) = sqrt(2)-1 = 0.4142
full scale. so, if in the previous pass *any* one either the real or
imag parts of the FFT exceed 0.4142 full scale (FS), you *could*
overflow the next pass if you don't divide by 2. so you start with a
cleared sticky bit at the beginning of the pass (either mode), if any
result exceeds 0.4142 in either real or imaginary parts, then set the
sticky bit and do not clear it again until the beginning of the next
pass. then, when the next pass begins, *if* the sticky bit is set,
run the FFT pass in divide-by-two mode (and clear the sticky bit),
otherwise run the FFT pass in normal mode.
i think, for safety, i would assume that the first pass is in the
divide-by-two mode. and still make sure none of your values that you
begin with exceed 0.8284 FS in either the real or imag parts (unless
it's a DIT FFT when you *know* you won't be rotating your data by 45
degrees in the first pass). for some reason (maybe cheaper to do in
hardware), the Motorola DSP56002 and later chips put that threshold at
0.25 FS.
r b-j
Reply by Ron N.●July 6, 20072007-07-06
On Jul 6, 1:50 pm, "Ron N." <rhnlo...@yahoo.com> wrote:
> On Jul 5, 11:28 pm, "malikos" <omar.ma...@nxp.com> wrote:
>
>
>
> > >On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote:
> > >> >Hi,
>
> > >> >This is my first post so please be patient.
> > >> >I am trying to implement an 8pt FFT. However i only have a 16bit
> > adder.
> > >> >When i add 2 large numbers together the result is 17bits. I can
> > remove
> > >> the
> > >> >LSB but what is the best way to make the result 16bits again. Assume
> > the
> > >> >FFT size will be 1024pts eventually.
>
> > >> >Thanks
>
> > >> any replies :-(
>
> > >my, you're patient. :-\
>
> > >i would add (or subtract) and divide by 2 (shift left) for each
> > >butterfly addition (or subtraction). for an 8-point FFT, then your
> > >result would be scaled by 1/8, but you would know it and could deal
> > >with it in the use or interpretation of the FFT results.
>
> > >a fixed 8-point FFT is so specialized and small that it seems
> > >reasonable that your code could be highly optimized for speed/
> > >efficiency by trading generality for tricks. take a look at your
> > >twiddle factors (the FFT coefficients) for the 8-point FFT to see what
> > >i mean by this.
>
> > >r b-j
>
> > Thanks for your reply. The problem is i would also like to build an 1024pt
> > FFT, however if i divide by 2 each time the last butterflies i get 0. Any
> > ideas here ?? I can use floating point unit as it is not build into my
> > DSP.
>
> Adaptively scale only if required to prevent overflowing
> your data type limit.
>
> Divide by 2 if the intermediate results are large (above
> some threshold related to the number of bits).
That is, divide the entire output vector after each level
of butterfly, if any one output is above some threshold.
> Count the
> number of divides and include that resulting scale factor
> as part of the result. You end up with a pseudo float
> result vector (multiple mantissa's sharing one offset
> exponent).
>
> IMHO. YMMV.
> --
> rhn A.T nicholson d.0.t C-o-M
Reply by Ron N.●July 6, 20072007-07-06
On Jul 5, 11:28 pm, "malikos" <omar.ma...@nxp.com> wrote:
> >On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote:
> >> >Hi,
>
> >> >This is my first post so please be patient.
> >> >I am trying to implement an 8pt FFT. However i only have a 16bit
> adder.
> >> >When i add 2 large numbers together the result is 17bits. I can
> remove
> >> the
> >> >LSB but what is the best way to make the result 16bits again. Assume
> the
> >> >FFT size will be 1024pts eventually.
>
> >> >Thanks
>
> >> any replies :-(
>
> >my, you're patient. :-\
>
> >i would add (or subtract) and divide by 2 (shift left) for each
> >butterfly addition (or subtraction). for an 8-point FFT, then your
> >result would be scaled by 1/8, but you would know it and could deal
> >with it in the use or interpretation of the FFT results.
>
> >a fixed 8-point FFT is so specialized and small that it seems
> >reasonable that your code could be highly optimized for speed/
> >efficiency by trading generality for tricks. take a look at your
> >twiddle factors (the FFT coefficients) for the 8-point FFT to see what
> >i mean by this.
>
> >r b-j
>
> Thanks for your reply. The problem is i would also like to build an 1024pt
> FFT, however if i divide by 2 each time the last butterflies i get 0. Any
> ideas here ?? I can use floating point unit as it is not build into my
> DSP.
Adaptively scale only if required to prevent overflowing
your data type limit.
Divide by 2 if the intermediate results are large (above
some threshold related to the number of bits). Count the
number of divides and include that resulting scale factor
as part of the result. You end up with a pseudo float
result vector (multiple mantissa's sharing one offset
exponent).
IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
Reply by robert bristow-johnson●July 6, 20072007-07-06
On Jul 6, 12:39 pm, Jerry Avins <j...@ieee.org> wrote:
> malikos wrote:
> > any ideas ????
>
> Sure. Use double precision.
16 bits just ain't enough for a 1024 pt FFT
> In most cases, dividing by two on every
> second pass is sufficient.
doing this (dividing by 2 every other pass) will preserve energy, or
the mean-square of the signal data. it will neither grow or decline
in r.m.s. amplitude.
> There are many fifed-point FFT
> implementations around; study some.
also good advice.
r b-j
Reply by Jerry Avins●July 6, 20072007-07-06
malikos wrote:
> any ideas ????
Sure. Use double precision. In most cases, dividing by two on every
second pass is sufficient. There are many fifed-point FFT
implementations around; study some.
Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Reply by malikos●July 6, 20072007-07-06
any ideas ????
Reply by malikos●July 6, 20072007-07-06
>On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote:
>> >Hi,
>>
>> >This is my first post so please be patient.
>> >I am trying to implement an 8pt FFT. However i only have a 16bit
adder.
>> >When i add 2 large numbers together the result is 17bits. I can
remove
>> the
>> >LSB but what is the best way to make the result 16bits again. Assume
the
>> >FFT size will be 1024pts eventually.
>>
>> >Thanks
>>
>> any replies :-(
>
>my, you're patient. :-\
>
>i would add (or subtract) and divide by 2 (shift left) for each
>butterfly addition (or subtraction). for an 8-point FFT, then your
>result would be scaled by 1/8, but you would know it and could deal
>with it in the use or interpretation of the FFT results.
>
>a fixed 8-point FFT is so specialized and small that it seems
>reasonable that your code could be highly optimized for speed/
>efficiency by trading generality for tricks. take a look at your
>twiddle factors (the FFT coefficients) for the 8-point FFT to see what
>i mean by this.
>
>r b-j
>
>
Thanks for your reply. The problem is i would also like to build an 1024pt
FFT, however if i divide by 2 each time the last butterflies i get 0. Any
ideas here ?? I can use floating point unit as it is not build into my
DSP.
Reply by robert bristow-johnson●July 5, 20072007-07-05
On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote:
> >Hi,
>
> >This is my first post so please be patient.
> >I am trying to implement an 8pt FFT. However i only have a 16bit adder.
> >When i add 2 large numbers together the result is 17bits. I can remove
> the
> >LSB but what is the best way to make the result 16bits again. Assume the
> >FFT size will be 1024pts eventually.
>
> >Thanks
>
> any replies :-(
my, you're patient. :-\
i would add (or subtract) and divide by 2 (shift left) for each
butterfly addition (or subtraction). for an 8-point FFT, then your
result would be scaled by 1/8, but you would know it and could deal
with it in the use or interpretation of the FFT results.
a fixed 8-point FFT is so specialized and small that it seems
reasonable that your code could be highly optimized for speed/
efficiency by trading generality for tricks. take a look at your
twiddle factors (the FFT coefficients) for the 8-point FFT to see what
i mean by this.
r b-j
Reply by malikos●July 5, 20072007-07-05
>Hi,
>
>This is my first post so please be patient.
>I am trying to implement an 8pt FFT. However i only have a 16bit adder.
>When i add 2 large numbers together the result is 17bits. I can remove
the
>LSB but what is the best way to make the result 16bits again. Assume the
>FFT size will be 1024pts eventually.
>
>Thanks
>
>
>