
Re: Block Floating point FFT or 32 bits precision FFT

Started by Andrew V. Nesterov December 6, 2001
> Date: Wed, 5 Dec 2001 11:47:49 +0100
> From: "Curl" <curl@curl...>
> Subject: Block Floating point FFT or 32 bits precision FFT ?
> 
> Sorry, this is my second question in two days .. 
> I'd like to know if someone has tried these two algorithms : Block Floating
> point FFT and 32 bits (double precision) FFT..
> Which gives the better result ?

Comparing floating point and fixed point of the same bit width N,
it is easy to notice that the former gives a larger range while the
latter gives more bits of precision.

Think about it this way: in a fixed-point format all bits but the
sign bit are mantissa bits. In floating point the mantissa is
exp_bits narrower, i.e. N - 1 - exp_bits bits wide. Thus the largest
floating-point number is on the order of 2**(2**exp_bits), while the
largest fixed-point number is 2**(N-1) - 1. With just 8 exponent bits
that is roughly 2**256 (about 2**128 with an IEEE-style biased
exponent), which dwarfs 2**31. On the other hand, a 32-bit
floating-point format has only 32 - 1 - 8 = 23 bits left for the
mantissa, while fixed point has 31, so fixed point is more precise.
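
To make the trade-off concrete, here is a minimal C sketch. It assumes
an IEEE-754 32-bit float and a Q31 fixed-point format (1 sign bit, 31
fractional bits); it prints the largest representable value and the
spacing between adjacent representable values for each:

/* Range vs. precision of 32-bit float vs. Q31 fixed point
 * (assumes IEEE-754 single precision for float). */
#include <stdio.h>
#include <float.h>
#include <stdint.h>

int main(void)
{
    /* float: ~8 exponent bits give a huge range, but only
     * FLT_MANT_DIG (= 24, including the hidden bit) mantissa bits */
    printf("float max = %e, step near 1.0 = %e (%d mantissa bits)\n",
           (double)FLT_MAX, (double)FLT_EPSILON, FLT_MANT_DIG);

    /* Q31: every value is k / 2^31, so the step is uniform and much
     * finer than FLT_EPSILON, but the range is only [-1, 1) */
    double q31_step = 1.0 / 2147483648.0;        /* 2^-31 */
    double q31_max  = (double)INT32_MAX * q31_step;
    printf("Q31   max = %e, step          = %e (31 fraction bits)\n",
           q31_max, q31_step);
    return 0;
}

On a typical machine this prints a float range of about 3.4e38 with a
step of about 1.2e-7 near 1.0, versus a Q31 range of just under 1.0
with a uniform step of about 4.7e-10.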

Sometimes there is one hidden (implicit) bit, the MSB of the mantissa
in a floating-point format, but that does not change the picture much :)

Now, if you are going to use a 16-bit-exponent, 16-bit-mantissa format,
the range is going to be enormous, far more than you will ever need.
Even the 64-bit IEEE floating-point format allocates only 11 bits for
the exponent.
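
For a rough feel of what those exponent widths buy, here is a
back-of-the-envelope C sketch (the 16/16 split is the hypothetical
format from the paragraph above, not any real standard). It prints the
approximate maximum magnitude and relative precision of each split as
powers of two, assuming an IEEE-style biased exponent:

/* Rough range/precision of different exponent/mantissa splits
 * (order-of-magnitude only, biased exponent assumed). */
#include <stdio.h>

static void show(const char *name, int exp_bits, int mant_bits)
{
    long max_exp = (1L << (exp_bits - 1)) - 1;   /* largest exponent */
    printf("%-24s max ~ 2^%ld, relative precision ~ 2^-%d\n",
           name, max_exp, mant_bits);
}

int main(void)
{
    show("IEEE single (8/23+1)",  8, 24);   /* 23 stored + hidden bit */
    show("IEEE double (11/52+1)", 11, 53);
    show("hypothetical 16/16",    16, 16);  /* 16-bit-exponent idea */
    return 0;
}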

You may want to read Goldberg's paper "What Every Computer Scientist
Should Know About Floating-Point Arithmetic" at http://docs.sun.com/

Regards,

Andrew