# combining small CONSECUTIVE FFTs to make a larger FFT

Started by June 16, 2010
```Hello everybody.

I have got a DSP question for all of you.
I'd like to know if it is possible to obtain a large FFT (for example 1024
samples) combining small SEQUENTIAL FFTs(128 samples).

It is really important for me that the input samples are consecutive (so it
is not possible to apply the usual trick of even and odd samples).
I'm working on a real time system with a RT constraint of 128 samples.
Actually I'm accumulating samples for 8 iterations (128*8=1024) until to
obtain the 1024 samples to perform the FFT of the input signal. However I'd
like to decompose the FFT in small sequential FFTs in order to distribute
the compute for each frame.

If you have any question don't hesitate to contact me.

I'm looking forward to hearing from you, Andrea.

```
```On Wed, 16 Jun 2010 06:44:18 -0500, "Intuitive"
<andrea3891@n_o_s_p_a_m.gmail.com> wrote:

>I have got a DSP question for all of you.
>I'd like to know if it is possible to obtain a large FFT (for example 1024
>samples) combining small SEQUENTIAL FFTs(128 samples).
>
>It is really important for me that the input samples are consecutive (so it
>is not possible to apply the usual trick of even and odd samples).
>I'm working on a real time system with a RT constraint of 128 samples.
>Actually I'm accumulating samples for 8 iterations (128*8=1024) until to
>obtain the 1024 samples to perform the FFT of the input signal. However I'd
>like to decompose the FFT in small sequential FFTs in order to distribute
>the compute for each frame.

It's possible, but you have to modify the FFT algorithm slightly.

my comment at the end:  "Either of these can be reformulated for cases where the
overlap is less than (N-1) points."  In your case the "old" samples are all
zeroes.  You will build your 1024-point FFT in eight "hops", and each hop will
overlap the previous data set by 896 samples, instead of by just one sample.

The modification to the FFT algorithm is in the denominator of the exponential
terms.  A 128-point FFT (DFT) is defined as:

127
X(k) = SUM [x(n)*exp(-j2PIkn/128)]
n=0

However, each of your "small sequential FFTs" needs to be computed as:

127
X(k) = SUM [x(n)*exp(-j2PIkn/1024)] (note "1024" in denominator)
n=0

Greg
```
```On Wed, 16 Jun 2010 09:15:40 -0400, Greg Berchin <gberchin@comicast.net.invalid>
wrote:

>You will build your 1024-point FFT in eight "hops", and each hop will
>overlap the previous data set by 896 samples, instead of by just one sample.

Make that, "instead of by 1023 samples".
```
```On Jun 16, 7:44&#2013266080;am, "Intuitive" <andrea3891@n_o_s_p_a_m.gmail.com>
wrote:
> Hello everybody.
>
> I have got a DSP question for all of you.
> I'd like to know if it is possible to obtain a large FFT (for example 1024
> samples) combining small SEQUENTIAL FFTs(128 samples).
>
> It is really important for me that the input samples are consecutive (so it
> is not possible to apply the usual trick of even and odd samples).
> I'm working on a real time system with a RT constraint of 128 samples.
> Actually I'm accumulating samples for 8 iterations (128*8=1024) until to
> obtain the 1024 samples to perform the FFT of the input signal. However I'd
> like to decompose the FFT in small sequential FFTs in order to distribute
> the compute for each frame.
>
> If you have any question don't hesitate to contact me.
>
> I'm looking forward to hearing from you, Andrea.

Yes, and what you may want to look up is the difference between
decimation in time and decimation in frequency algorithms for the

IHTH,
Clay

```
```
Intuitive wrote:

> Hello everybody.
>
> I have got a DSP question for all of you.
> I'd like to know if it is possible to obtain a large FFT (for example 1024
> samples) combining small SEQUENTIAL FFTs(128 samples).

The best and the simplest is get the data back by inverse FFTs, then
compite the large FFT.  However you can combine two small FFTs into one
large, and so on, so forth. This is not going to be fast or elegant.
The even bins of the large FFT are done by summation of the bins of the
small FFTs. The odd bins done by the frequency shift of the small FFTs
by 1/2 of the bin (that is sinc interpolation in frq domain), and then
summation just like the even bins.

DSP and Mixed Signal Design Consultant
http://www.abvolt.com
```
```On Jun 16, 1:59&#2013266080;pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> Intuitive wrote:
> > Hello everybody.
>
> > I have got a DSP question for all of you.
> > I'd like to know if it is possible to obtain a large FFT (for example 1024
> > samples) combining small SEQUENTIAL FFTs(128 samples).
>
> The best and the simplest is get the data back by inverse FFTs, then
> compite the large FFT. &#2013266080;However you can combine two small FFTs into one
> large, and so on, so forth. This is not going to be fast or elegant.
> The even bins of the large FFT are done by summation of the bins of the
> small FFTs. The odd bins done by the frequency shift of the small FFTs
> by 1/2 of the bin (that is sinc interpolation in freq domain), and then
> summation just like the even bins.

Vlad, it seems to me that Greg and Clay were on the right track.  The
whole idea behind the FFT is that it builds an N-point DFT out of two
point butterfly or a 4-point butterfly as the building block to the
FFT, think of it as a 128-point butterfly as the building block.

I used to know how to do this by heart.  Even a decade ago when we
were all hanging out here on comp.dsp .  But now I have to think about
it and rederive the wheel.  but i'm pretty sure you can bust up the
1024 term summation into a few 128 term summations of contiguous
samples (i think it would be the Decimation-in-Frequency alg) and make
some adjustments to the terms going in or coming out.  but i really
feel to lazy to attack this.  maybe i can find it in some lit, but
that's work also.

r b-j

```
```
robert bristow-johnson wrote:

> On Jun 16, 1:59 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
>
>>Intuitive wrote:
>>
>>>Hello everybody.
>>
>>>I have got a DSP question for all of you.
>>>I'd like to know if it is possible to obtain a large FFT (for example 1024
>>>samples) combining small SEQUENTIAL FFTs(128 samples).
>>
>>The best and the simplest is get the data back by inverse FFTs, then
>>compite the large FFT.  However you can combine two small FFTs into one
>>large, and so on, so forth. This is not going to be fast or elegant.
>>The even bins of the large FFT are done by summation of the bins of the
>>small FFTs. The odd bins done by the frequency shift of the small FFTs
>>by 1/2 of the bin (that is sinc interpolation in freq domain), and then
>>summation just like the even bins.
>
>
> Vlad, it seems to me that Greg and Clay were on the right track.  The
> whole idea behind the FFT is that it builds an N-point DFT out of two
> N/2-point DFTs (with some pre- or post-adjustment).  Instead of a 2-
> point butterfly or a 4-point butterfly as the building block to the
> FFT, think of it as a 128-point butterfly as the building block.
>
> I used to know how to do this by heart.  Even a decade ago when we
> were all hanging out here on comp.dsp .  But now I have to think about
> it and rederive the wheel.  but i'm pretty sure you can bust up the
> 1024 term summation into a few 128 term summations of contiguous
> samples (i think it would be the Decimation-in-Frequency alg) and make
> some adjustments to the terms going in or coming out.  but i really
> feel to lazy to attack this.  maybe i can find it in some lit, but
> that's work also.

Robert, I agree that you can do a big FFT by changing the constituent
FFTs into something different, like suggested by Greg and Clay. However
could you do that using unmodified small FFTs as building blocks?

VLV

```
```On Wed, 16 Jun 2010 06:44:18 -0500, "Intuitive"

<andrea3891@n_o_s_p_a_m.gmail.com> wrote:
>I have got a DSP question for all of you.
>I'd like to know if it is possible to obtain a large FFT (for example 1024
>samples) combining small SEQUENTIAL FFTs(128 samples).

>It is really important for me that the input samples are consecutive (so it
>is not possible to apply the usual trick of even and odd samples).
>I'm working on a real time system with a RT constraint of 128 samples.
>Actually I'm accumulating samples for 8 iterations (128*8=1024) until to
>obtain the 1024 samples to perform the FFT of the input signal. However I'd
>like to decompose the FFT in small sequential FFTs in order to distribute
>the compute for each frame.

Similar questions have come up before, as in these 2 threads:

2009:

2002:

I think the critical point to make is this.  There are 16 basic types
of FFT (see the Tran-Thong reference in the first thread). There are 8
DIT and 8 DIF algorithms of different input/output orders and geometry
types.  There are also a lot more non-basic types (see Rabiner and
Gold's book and others).  But in every single case (using N = 8 graphs
as an example), the 0 and 4 points are always combined in the same 1st
stage butterfly.  The butterfly may move around or look different
because of the decimation, input/output order or geometry, but the 0
and 4 input points are joined at the hip.  So are the input points 1
and 5, 2 and 6, and 3 and 7.  Similar concepts apply to all larger N

Another important thing to note about all FFT algorithms is that it is
impossible to calculate any one of the N outputs without having all
the N inputs available.  Unless you can guarantee that an input is 0,
each and every input affects each and every output, no matter what FFT
algorithm you're using.

I think I understand what you're trying to do by 'distributing' the
processing after receiving a certain number of sequential points, but
the problem is that all known FFT algorithms don't work that way.

As noted in the above 2 threads, you can in fact combine smaller
groups of sequential samples into a larger transform by zero padding
and using linearity.  For instance, you could obtain the 1st 512
points of a 1024 point sequence, zero pad up to 1024, then use a
pruned 1024 point FFT to compute 1024 frequency outputs.  Then, after
you get the second 512 points, you add 512 zeroes to the front end and
compute a second pruned 1024 FFT to compute another 1024 points.  Then
add the first 1024 to the second 1024:

Pruning the 2 FFT's and making use of real input only processing will
lessen the number of calculations, but I'm sure the overall result
requires more overhead than an N point FFT without the zero padding.

There are other options.  For real input waveforms, you could use the
'2N real with an N point complex' approach (I think this is what you
meant by the 'usual trick of even and odd samples').  You could also
use a sliding DFT.  It's not very efficient when computing all N
outputs (N calculations for each new sample), but you get N updated
outputs after every new sample.

Maybe there are other ways.  What exactly do you need the FFT for?
Are you trying to estimate something?  A frequency or amplitude
perhaps?  Other methods may be more suitable and less computationally
costly.

Kevin McGee
```
```On Jun 16, 7:44&#2013266080;am, "Intuitive" <andrea3891@n_o_s_p_a_m.gmail.com>
wrote:
> Hello everybody.
>
> I have got a DSP question for all of you.
> I'd like to know if it is possible to obtain a large FFT (for example 1024
> samples) combining small SEQUENTIAL FFTs(128 samples).
>
> It is really important for me that the input samples are consecutive (so it
> is not possible to apply the usual trick of even and odd samples).
> I'm working on a real time system with a RT constraint of 128 samples.
> Actually I'm accumulating samples for 8 iterations (128*8=1024) until to
> obtain the 1024 samples to perform the FFT of the input signal. However I'd
> like to decompose the FFT in small sequential FFTs in order to distribute
> the compute for each frame.
>
> If you have any question don't hesitate to contact me.
>
> I'm looking forward to hearing from you, Andrea.

Try googling "Pipelined FFT" you will find a bunch of pertinent
papers.

Clay
```
```Hi guys!!

In this post I'll try to provide a better explanation of my problem:

I'm developing a Real-time Signal Processing Algorithm on an embedded
system.
Design constrain impose me to elaborate the audio stream using frame of 128
samples.
However some signal processing operations (in the frequency domain) have to
be executed at 1024 samples.
So every 8 frame (1024/8) a 1024 samples fft is executed.
In this way the workload required by the algorithm it is not costant. Some
spikes (caused by the fft calculation) are presented every 8 frame.
I'd like to distribuite the calculation of the 1024 samples FFT in all the
eight frames using smaller FFT (I'm using an optimized DSPLib).
However it is really important to consider that I don't know the value of
the whole sequence because I'm working RT.

At this moment I'm following the advice of Vladimir. I tested the second
approach in matlab and It works. However I don't understand the his first
proposed apporach.
What is the meaning of:
"The best and the simplest is get the data back by inverse FFTs, then
compite the large FFT."
Can you give me a more simple explanation? (Possible step by step).

I don't understand the approach proposed by Greg too. Can you give a better
explanation? (Possible step by step).

Thank you everybody, Andrea!
```