# number of bits to compute DFT

Started by November 10, 2014
```Hello All,

I am reading the DFT topic where the quantization effect is described.

One of the examples in the book calculates the number of bits needed to
compute the DFT of a 1024-point sequence with an SNR of 30 dB.

Here b = 15 bits. My question is whether the sample size matters at all
for the computation of the DFT.

If the input is composed of 24 bits (for example), would the number of
bits used internally for the DFT be independent of the 24-bit size?

Thanks,

_____________________________
Posted through www.DSPRelated.com
```
```On 10.11.14 14.37, sharanbr wrote:
> I am reading the DFT topic where the quantization effect is described.
>
> One of the examples in the book calculates the number of bits needed to
> compute the DFT of a 1024-point sequence with an SNR of 30 dB.
>
> Here b = 15 bits. My question is whether the sample size matters at all
> for the computation of the DFT.

It depends on your algorithm. The fast FFT algorithm does not produce
optimal results with respect to SNR.
But if you compute the DFT by a matrix multiplication, there is no
significant dependency on the DFT size, since every resulting
coefficient is just a scalar product of the input vector with the
matching coefficient vector.

> If the input is composed of 24 bits (for example), would the number of
> bits used internally for the DFT be independent of the 24-bit size?

What algorithm do you intend to use?

Marcel
```
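[Editor's note: Marcel's "matrix multiplication" DFT can be sketched numerically. The Python/NumPy example below is illustrative, not from the thread; it shows each coefficient X[k] as one inner product of the input with a row of the DFT matrix, agreeing with the FFT in double precision. The fixed-point roundoff question debated later is not modelled here.]

```python
import numpy as np

# Direct DFT as a matrix-vector product: each output coefficient X[k]
# is the scalar product of the input vector with one row of the DFT
# matrix, as described in the post above.
def dft_matrix(N):
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N)

N = 64
rng = np.random.default_rng(0)
x = rng.standard_normal(N)

X_direct = dft_matrix(N) @ x   # O(N^2) direct DFT
X_fft = np.fft.fft(x)          # O(N log N) FFT

print(np.max(np.abs(X_direct - X_fft)))   # tiny in double precision
```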
```On 2014-11-10 19:24:21 +0000, Marcel Mueller said:

> On 10.11.14 14.37, sharanbr wrote:
>> ...  (snip)
>
> It depends on your algorithm. The fast FFT algorithm does not produce
> optimal results with respect to SNR.

The FFT gets its speed by careful factoring of the Fourier transform
matrix, and that results in lower error bounds which are also observed
in practice.  Well known in the literature for almost 50 years now.

> But if you compute the DFT by a matrix multiplication, there is no
> significant dependency on the DFT size, since every resulting
> coefficient is just a scalar product of the input vector with the
> matching coefficient vector.

And each of those inner products has an error coefficient which depends
on the length of the inner product.  See any numerical analysis text
published since WWII.

>> If input is composed of 24 bits (for example), would number of bits used
>> for DFT internally would be independent of 24-bit size?
>
> What algorithm do you intend to use?
>
>
> Marcel

```
```On 11/10/14 3:06 PM, Gordon Sande wrote:
> On 2014-11-10 19:24:21 +0000, Marcel Mueller said:
>
>> On 10.11.14 14.37, sharanbr wrote:
>>> ...  (snip)
>>
>> It depends on your algorithm. The fast FFT algorithm does not produce
>> optimal results with respect to SNR.
>
> The FFT gets its speed by careful factoring of the Fourier transform
> matrix, and that results in lower error bounds which are also observed
> in practice.  Well known in the literature for almost 50 years now.

i would say that, yes, the FFT does better if a single-precision
accumulator is used.  but if single-precision word width continues to
be used for the coefficients and data, just as with the FFT, while a
double-wide accumulator is used, as is common DSP practice, then i also
think the simple dot-product DFT might have better SNR.  (because there
would be only one rounding operation, immediately before writing X[k],
whereas with the vanilla-flavored radix-2 decimation-in-whatever FFT
alg, there would be many different rounding operations accumulating.
but i don't think they would accumulate very far, because it would be
like 1 + 1/4 + 1/16 + ..., which converges pretty fast.  so i don't see
the straight DFT as useful unless you're doing something like the
sliding DFT.)

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

```
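[Editor's note: the convergence remark above can be checked in one line. Assuming, as the post does, that each pass's roundoff contribution is attenuated by a factor of 4 relative to the previous one, the total is a geometric series:]

```python
# Quick check of the convergence claim above: the geometric series
# 1 + 1/4 + 1/16 + ... converges quickly to 1/(1 - 1/4) = 4/3, so the
# assumed per-pass roundoff contributions do not pile up with N.
total = sum(0.25 ** k for k in range(50))
print(total)   # approaches 4/3
```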
```On 10.11.14 14.37, sharanbr wrote:
...  (snip)

From the wikipedia entry on FFT:

"The upper bound on the relative error for the Cooley–Tukey algorithm is
O(ε log N), compared to O(ε N^(3/2)) for the naïve DFT formula (Gentleman
and Sande, 1966)."

There is an extensive literature covering FFT and DFT error analysis, some
of it quite recent.  A simple Google search on the subject will probably
provide you with far more reading material than you can handle.

And, of course, the issue has been mentioned before on comp.dsp, such as, for example:

http://www.dsprelated.com/showmessage/150283/1.php

Kevin McGee
```
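[Editor's note: the quoted bounds can be probed empirically. The sketch below is my own, not from the thread; it compares a single-precision naive DFT and a minimal single-precision radix-2 FFT against a double-precision reference. The printed numbers are not the formal constants, but the naive DFT's error should grow visibly faster with N, in line with O(ε N^(3/2)) versus O(ε log N).]

```python
import numpy as np

# Single-precision naive DFT: one length-N inner product per output bin.
def naive_dft32(x):
    n = np.arange(len(x), dtype=np.float64)
    W = np.exp((-2j * np.pi / len(x)) * np.outer(n, n)).astype(np.complex64)
    return W @ x

# Minimal radix-2 decimation-in-time FFT, kept in single precision.
# N must be a power of two.
def fft32(x):
    N = len(x)
    if N == 1:
        return x.copy()
    E, O = fft32(x[0::2]), fft32(x[1::2])
    tw = np.exp((-2j * np.pi / N) * np.arange(N // 2)).astype(np.complex64) * O
    return np.concatenate([E + tw, E - tw])

errs = {}
for N in (64, 256, 1024):
    rng = np.random.default_rng(1)
    x64 = rng.standard_normal(N)
    x32 = x64.astype(np.float32).astype(np.complex64)
    ref = np.fft.fft(x64)                    # double-precision reference
    errs[N] = (np.max(np.abs(fft32(x32) - ref)),        # FFT error
               np.max(np.abs(naive_dft32(x32) - ref)))  # naive DFT error
    print(N, errs[N])
```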
```On 11/10/2014 8:37 AM, sharanbr wrote:
> Hello All,
>
> I am reading the DFT topic where the quantization effect is described.
>
> One of the examples in the book calculates the number of bits needed to
> compute the DFT of a 1024-point sequence with an SNR of 30 dB.
>
> Here b = 15 bits. My question is whether the sample size matters at all
> for the computation of the DFT.
>
> If the input is composed of 24 bits (for example), would the number of
> bits used internally for the DFT be independent of the 24-bit size?

I think you are saying you wish for the output spectrum to have 30 dB of
SNR.  To achieve that you need to consider the size of your FFT.  My
understanding is that you add about 1/2 bit of noise from roundoff error
with each pass of the FFT.  So a 1024 point FFT will have 10 passes and
add 5 bits.  To get 30 dB out (ENOB = 5) you would need to start with
input data with 10 bits of significant data or 60 dB of SNR.  You would
need to do all calculations with at least 10 bits of precision.

--

Rick
```
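[Editor's note: Rick's back-of-envelope arithmetic, written out under his stated assumptions of ~1/2 bit of roundoff noise per pass and ~6 dB of SNR per bit:]

```python
import math

# Rick's estimate, step by step (assumptions from the post above:
# ~1/2 bit of roundoff noise added per FFT pass, ~6 dB per bit).
N = 1024
passes = int(math.log2(N))             # 10 passes for a radix-2 FFT
noise_bits = 0.5 * passes              # ~5 bits of accumulated roundoff
target_bits = 30 / 6                   # 30 dB out -> ENOB of ~5 bits
input_bits = target_bits + noise_bits  # ~10 significant bits at input
print(passes, noise_bits, input_bits)  # 10 5.0 10.0
```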
```On Tue, 11 Nov 2014 02:23:43 -0500, rickman wrote:

> On 11/10/2014 8:37 AM, sharanbr wrote:
>> ...  (snip)
>
> I think you are saying you wish for the output spectrum to have 30 dB of
> SNR.  To achieve that you need to consider the size of your FFT.  My
> understanding is that you add about 1/2 bit of noise from roundoff error
> with each pass of the FFT.  So a 1024 point FFT will have 10 passes and
> add 5 bits.  To get 30 dB out (ENOB = 5) you would need to start with
> input data with 10 bits of significant data or 60 dB of SNR.  You would
> need to do all calculations with at least 10 bits of precision.

I think you're doing your computation backwards.

Each bin of an N-point FFT acts like a bandpass filter, N points long.  So
for white noise at the input, the achievable SNR at each bin is (input SNR)
+ 3 dB * log2(N).

So for a 30 dB _input_ SNR and N = 1024 you could expect (at best) a 60 dB
_output_ SNR.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
```
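[Editor's note: Tim's processing-gain claim can be sketched with a quick NumPy experiment (mine, not from the thread): a bin-centred tone in white noise gains roughly 3 dB of per-bin SNR for each doubling of N.]

```python
import numpy as np

# A bin of an N-point FFT acts like a bandpass filter, so the per-bin
# SNR of a bin-centred tone in white noise improves by ~3 dB for every
# doubling of N (about 10*log10(N) of total processing gain).
rng = np.random.default_rng(2)
snrs = {}
for N in (256, 1024, 4096):
    n = np.arange(N)
    k = N // 8                             # tone lands exactly on bin k
    x = np.cos(2 * np.pi * k * n / N) + rng.standard_normal(N)
    X = np.fft.fft(x)
    sig = np.abs(X[k]) ** 2                # power in the tone's bin
    # average noise power per bin, excluding the tone's two bins
    noise = np.mean(np.delete(np.abs(X) ** 2, [k, N - k]))
    snrs[N] = 10 * np.log10(sig / noise)
    print(N, round(snrs[N], 1))
```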
```On Mon, 10 Nov 2014 07:37:13 -0600, sharanbr wrote:

> Hello All,
>
> I am reading the DFT topic where the quantization effect is described.
>
> One of the examples in the book calculates the number of bits needed to
> compute the DFT of a 1024-point sequence with an SNR of 30 dB.
>
> Here b = 15 bits. My question is whether the sample size matters at all
> for the computation of the DFT.
>
> If the input is composed of 24 bits (for example), would the number of
> bits used internally for the DFT be independent of the 24-bit size?

It's complicated.

If the input is composed of 24 bits, with a signal that uses the full
range of available values, then the bottom 19 bits are, roughly, random
junk.  (It's more like the bottom 17 bits or so, if you add guard bits --
I'm being a bit fast and loose with my math).

When you finish the FFT, then if the signal were a pure tone that fell
exactly on a bin's frequency, the SNR of the signal in that one bin would
be roughly 60 dB, meaning that the bottom 14 (OK, 12) bits are junk.

If by "sample size" you mean the number of samples in the FFT (i.e. 1024),
then it matters, because the SNR of that hypothetical pure tone improves
by 3 dB for each doubling of the sample size.

If by "sample size" you mean the number of bits in the sample, then (a)
you are not using the correct nomenclature ("precision" or "word width" is
the word you seek), and (b), if the signal is significantly noisier than
the quantization noise imposed by the word width then increasing the word
width will yield no effective increase in SNR.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
```
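[Editor's note: Tim's last point, that word width beyond the signal's own noise floor buys nothing, can be sketched as a toy experiment. The ~30 dB figure is chosen to match the thread's example; no clipping or guard bits are modelled.]

```python
import numpy as np

# Once the signal's own noise sits well above the quantization floor,
# a wider word buys no extra SNR: a tone with ~30 dB of added noise
# measures ~30 dB whether it is held in 8, 16, or 24 bits.
rng = np.random.default_rng(3)
n = np.arange(1 << 14)
tone = np.sin(2 * np.pi * 0.01 * n)
noise_std = np.sqrt(np.mean(tone ** 2)) * 10 ** (-30 / 20)
noisy = tone + noise_std * rng.standard_normal(len(n))

def quantize(x, bits):
    # round onto a signed fixed-point grid of the given word width
    scale = 2.0 ** (bits - 1)
    return np.round(x * scale) / scale

def snr_db(x):
    err = x - tone
    return 10 * np.log10(np.mean(tone ** 2) / np.mean(err ** 2))

snr = {bits: snr_db(quantize(noisy, bits)) for bits in (8, 16, 24)}
for bits, s in snr.items():
    print(bits, round(s, 2))
```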
```On 11/11/14 12:08 PM, Tim Wescott wrote:
> On Mon, 10 Nov 2014 07:37:13 -0600, sharanbr wrote:
>
>> ...  (snip)
>
> It's complicated.
>
> If the input is composed of 24 bits, with a signal that uses the full
> range of available values, then the bottom 19 bits are, roughly, random
> junk.  (It's more like the bottom 17 bits or so, if you add guard bits --
> I'm being a bit fast and loose with my math).
>
> When you finish the FFT, then if the signal were a pure tone that fell
> exactly on a bin's frequency, the SNR of the signal in that one bin would
> be roughly 60dB, meaning that the bottom 14 (OK, 12) bits are junk.
>

it appears we're talking fixed-point DFT.  then, to be sufficiently slow
and tight with the math, besides being clear on the 24-bit words of the
signal (and how many guard bits are on the left) and presumably 24-bit
words for the DFT coefficients (or "twiddle factors"), we need to be
clear what is the word width of the intermediate values.

and we need to be clear if we're doing:

              N-1
     X[k]  =  SUM{ x[n] e^(j 2 pi nk/N) }
              n=0

in which the signal is growing in amplitude and we'll need log2(N) guard
bits on the left, or if we're doing

                    N-1
     X[k]  =  (1/N) SUM{ x[n] e^(j 2 pi nk/N) }
                    n=0

in which the signal is actually decreasing in amplitude and bits are
falling off the edge on the right, or if we're doing

                          N-1
     X[k]  =  (1/sqrt(N)) SUM{ x[n] e^(j 2 pi nk/N) }
                          n=0

and the signal amplitude is *not* growing or shrinking (in an r.m.s.
sense), but there is still the need for 1/2 log2(N) guard bits on the
left, and another 1/2 log2(N) bits are gonna fall off the edge on the
right.

if the vanilla-flavored radix-2 FFT is used, in the first case there is
no division by 2 in the butterflies.  in the second case, every
butterfly has a division by 2 (a right shift by one bit), so a bit
falls offa the edge for each pass.  in the final case, for every
*alternate* pass there is a division by 2 (a right shift by one bit),
and for the other passes there is no divide nor shift in the butterflies.

this is something i worried about two decades ago when i was coding this
for the good-old DSP56K (before the DSP56002 and later, which had a
special overflow-anticipation bit that was used to efficiently implement
a block-floating-point FFT).

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

```
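[Editor's note: the three scaling conventions above can be checked numerically. A minimal sketch, using NumPy's unscaled forward FFT as the plain-SUM case, shows the r.m.s. growth that drives the guard-bit counts:]

```python
import numpy as np

# For white input, the unscaled DFT grows by ~sqrt(N) in r.m.s. level,
# the 1/N convention shrinks by ~sqrt(N), and the 1/sqrt(N) convention
# preserves the r.m.s. level exactly (Parseval), matching the
# guard-bit arithmetic in the post above.
rng = np.random.default_rng(4)
N = 1024
x = rng.standard_normal(N)
X = np.fft.fft(x)                     # unscaled forward DFT

def rms(v):
    return np.sqrt(np.mean(np.abs(v) ** 2))

print(round(rms(x), 3))               # input level
print(round(rms(X), 3))               # unscaled: up by ~sqrt(N) = 32
print(round(rms(X / N), 4))           # 1/N: down by ~sqrt(N)
print(round(rms(X / np.sqrt(N)), 3))  # 1/sqrt(N): unchanged
```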
```On Tue, 11 Nov 2014 12:41:01 -0500, robert bristow-johnson wrote:

> ...  (snip)

You just HAD to go and make it complicateder.

Yes to all that, but within a lot of reasonable assumptions I don't think
it changes the conclusions -- it just affects the details in how you
implement your FFT so that the assumptions you do make remain valid.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
```