Hi, This is my first post so please be patient. I am trying to implement an 8pt FFT. However i only have a 16bit adder. When i add 2 large numbers together the result is 17bits. I can remove the LSB but what is the best way to make the result 16bits again. Assume the FFT size will be 1024pts eventually. Thanks

# FFT overflow

Started by ●July 5, 2007

Reply by ●July 5, 20072007-07-05

>Hi, > >This is my first post so please be patient. >I am trying to implement an 8pt FFT. However i only have a 16bit adder. >When i add 2 large numbers together the result is 17bits. I can removethe>LSB but what is the best way to make the result 16bits again. Assume the >FFT size will be 1024pts eventually. > >Thanks > > >any replies :-(

Reply by ●July 5, 20072007-07-05

On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote:> >Hi, > > >This is my first post so please be patient. > >I am trying to implement an 8pt FFT. However i only have a 16bit adder. > >When i add 2 large numbers together the result is 17bits. I can remove > the > >LSB but what is the best way to make the result 16bits again. Assume the > >FFT size will be 1024pts eventually. > > >Thanks > > any replies :-(my, you're patient. :-\ i would add (or subtract) and divide by 2 (shift left) for each butterfly addition (or subtraction). for an 8-point FFT, then your result would be scaled by 1/8, but you would know it and could deal with it in the use or interpretation of the FFT results. a fixed 8-point FFT is so specialized and small that it seems reasonable that your code could be highly optimized for speed/ efficiency by trading generality for tricks. take a look at your twiddle factors (the FFT coefficients) for the 8-point FFT to see what i mean by this. r b-j

Reply by ●July 6, 20072007-07-06

>On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote: >> >Hi, >> >> >This is my first post so please be patient. >> >I am trying to implement an 8pt FFT. However i only have a 16bitadder.>> >When i add 2 large numbers together the result is 17bits. I canremove>> the >> >LSB but what is the best way to make the result 16bits again. Assumethe>> >FFT size will be 1024pts eventually. >> >> >Thanks >> >> any replies :-( > >my, you're patient. :-\ > >i would add (or subtract) and divide by 2 (shift left) for each >butterfly addition (or subtraction). for an 8-point FFT, then your >result would be scaled by 1/8, but you would know it and could deal >with it in the use or interpretation of the FFT results. > >a fixed 8-point FFT is so specialized and small that it seems >reasonable that your code could be highly optimized for speed/ >efficiency by trading generality for tricks. take a look at your >twiddle factors (the FFT coefficients) for the 8-point FFT to see what >i mean by this. > >r b-j > >Thanks for your reply. The problem is i would also like to build an 1024pt FFT, however if i divide by 2 each time the last butterflies i get 0. Any ideas here ?? I can use floating point unit as it is not build into my DSP.

Reply by ●July 6, 20072007-07-06

Reply by ●July 6, 20072007-07-06

malikos wrote:> any ideas ????Sure. Use double precision. In most cases, dividing by two on every second pass is sufficient. There are many fifed-point FFT implementations around; study some. Jerry -- Engineering is the art of making what you want from things you can get. ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

Reply by ●July 6, 20072007-07-06

On Jul 6, 12:39 pm, Jerry Avins <j...@ieee.org> wrote:> malikos wrote: > > any ideas ???? > > Sure. Use double precision.16 bits just ain't enough for a 1024 pt FFT> In most cases, dividing by two on every > second pass is sufficient.doing this (dividing by 2 every other pass) will preserve energy, or the mean-square of the signal data. it will neither grow or decline in r.m.s. amplitude.> There are many fifed-point FFT > implementations around; study some.also good advice. r b-j

Reply by ●July 6, 20072007-07-06

On Jul 5, 11:28 pm, "malikos" <omar.ma...@nxp.com> wrote:> >On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote: > >> >Hi, > > >> >This is my first post so please be patient. > >> >I am trying to implement an 8pt FFT. However i only have a 16bit > adder. > >> >When i add 2 large numbers together the result is 17bits. I can > remove > >> the > >> >LSB but what is the best way to make the result 16bits again. Assume > the > >> >FFT size will be 1024pts eventually. > > >> >Thanks > > >> any replies :-( > > >my, you're patient. :-\ > > >i would add (or subtract) and divide by 2 (shift left) for each > >butterfly addition (or subtraction). for an 8-point FFT, then your > >result would be scaled by 1/8, but you would know it and could deal > >with it in the use or interpretation of the FFT results. > > >a fixed 8-point FFT is so specialized and small that it seems > >reasonable that your code could be highly optimized for speed/ > >efficiency by trading generality for tricks. take a look at your > >twiddle factors (the FFT coefficients) for the 8-point FFT to see what > >i mean by this. > > >r b-j > > Thanks for your reply. The problem is i would also like to build an 1024pt > FFT, however if i divide by 2 each time the last butterflies i get 0. Any > ideas here ?? I can use floating point unit as it is not build into my > DSP.Adaptively scale only if required to prevent overflowing your data type limit. Divide by 2 if the intermediate results are large (above some threshold related to the number of bits). Count the number of divides and include that resulting scale factor as part of the result. You end up with a pseudo float result vector (multiple mantissa's sharing one offset exponent). IMHO. YMMV. -- rhn A.T nicholson d.0.t C-o-M

Reply by ●July 6, 20072007-07-06

On Jul 6, 1:50 pm, "Ron N." <rhnlo...@yahoo.com> wrote:> On Jul 5, 11:28 pm, "malikos" <omar.ma...@nxp.com> wrote: > > > > > >On Jul 5, 11:29 am, "malikos" <omar.ma...@nxp.com> wrote: > > >> >Hi, > > > >> >This is my first post so please be patient. > > >> >I am trying to implement an 8pt FFT. However i only have a 16bit > > adder. > > >> >When i add 2 large numbers together the result is 17bits. I can > > remove > > >> the > > >> >LSB but what is the best way to make the result 16bits again. Assume > > the > > >> >FFT size will be 1024pts eventually. > > > >> >Thanks > > > >> any replies :-( > > > >my, you're patient. :-\ > > > >i would add (or subtract) and divide by 2 (shift left) for each > > >butterfly addition (or subtraction). for an 8-point FFT, then your > > >result would be scaled by 1/8, but you would know it and could deal > > >with it in the use or interpretation of the FFT results. > > > >a fixed 8-point FFT is so specialized and small that it seems > > >reasonable that your code could be highly optimized for speed/ > > >efficiency by trading generality for tricks. take a look at your > > >twiddle factors (the FFT coefficients) for the 8-point FFT to see what > > >i mean by this. > > > >r b-j > > > Thanks for your reply. The problem is i would also like to build an 1024pt > > FFT, however if i divide by 2 each time the last butterflies i get 0. Any > > ideas here ?? I can use floating point unit as it is not build into my > > DSP. > > Adaptively scale only if required to prevent overflowing > your data type limit. > > Divide by 2 if the intermediate results are large (above > some threshold related to the number of bits).That is, divide the entire output vector after each level of butterfly, if any one output is above some threshold.> Count the > number of divides and include that resulting scale factor > as part of the result. You end up with a pseudo float > result vector (multiple mantissa's sharing one offset > exponent). > > IMHO. YMMV. > -- > rhn A.T nicholson d.0.t C-o-M

Reply by ●July 6, 20072007-07-06

On Jul 6, 4:53 pm, "Ron N." <rhnlo...@yahoo.com> wrote:> On Jul 6, 1:50 pm, "Ron N." <rhnlo...@yahoo.com> wrote: >...> > > Adaptively scale only if required to prevent overflowing > > your data type limit. > > > Divide by 2 if the intermediate results are large (above > > some threshold related to the number of bits). > > That is, divide the entire output vector after each level > of butterfly, if any one output is above some threshold. > > > Count the > > number of divides and include that resulting scale factor > > as part of the result. You end up with a pseudo float > > result vector (multiple mantissa's sharing one offset > > exponent).this is what i think they used to call "block floating-point". i think the trick is, to keep it "fast" (as one might expect for an FFT) is to decide at the beginning of the pass whether the pass is a divide- by-two pass or an unscaled pass. when that is decided, different optimized code for one of the two cases is performed (one mode that divides by two right just before the butterfly stores its results and the other mode that does not). then the trick is to anticipate, in the worst case, whether overflow *could* possibly happen if you don't scale. this is done in the previous pass when each result is stored. assuming to start that the real and imag components have magnitude no greater than 1 (full scale) and overflow occurs when either the real or imaginary part of the result exceeds 1 and, because the butterfly twiddle factor could turn that square region 45 degrees creating a number sqrt(2) times as big, then (because of the addition in the butterfly), you *could* potentially have overflow if any of your values had real or imaginary parts with magnitude greater than 1/(sqrt(2)+1) = sqrt(2)-1 = 0.4142 full scale. so, if in the previous pass *any* one either the real or imag parts of the FFT exceed 0.4142 full scale (FS), you *could* overflow the next pass if you don't divide by 2. so you start with a cleared sticky bit at the beginning of the pass (either mode), if any result exceeds 0.4142 in either real or imaginary parts, then set the sticky bit and do not clear it again until the beginning of the next pass. then, when the next pass begins, *if* the sticky bit is set, run the FFT pass in divide-by-two mode (and clear the sticky bit), otherwise run the FFT pass in normal mode. i think, for safety, i would assume that the first pass is in the divide-by-two mode. and still make sure none of your values that you begin with exceed 0.8284 FS in either the real or imag parts (unless it's a DIT FFT when you *know* you won't be rotating your data by 45 degrees in the first pass). for some reason (maybe cheaper to do in hardware), the Motorola DSP56002 and later chips put that threshold at 0.25 FS. r b-j