DSPRelated.com
Forums

Very large FFT

Started by AdriaanB April 26, 2007
Hi all,

I am not so new to DSP but I would still like to bounce an idea off the
experts just to check that I have all my ducks in a row. 

A very large FFT say 1M points is to be done on a real sequence. Am I
correct in saying that if I treat this 1mill points as a 1024*1024 matrix
and first do 1024 FFTs of length 1024 (1 FFT on each row), then transpose
the matrix, then again 1024 FFTs of size 1024(1 on each row) and then
transpose the matrix again, that I will have done my 1mill point FFT ? The
answer can be read out row by row.

Ignore the following to simplify the discussion for now:
-phase information
-complex values 

Transposing the matrix is only to ease the memory usage so that I do not
have to jump around in the memory to make FFTs down the columns. 

Thanks in advance.

Adriaan



_____________________________________
Do you know a company who employs DSP engineers?  
Is it already listed at http://dsprelated.com/employers.php ?
AdriaanB wrote:

> A very large FFT say 1M points is to be done on a real sequence. Am I > correct in saying that if I treat this 1mill points as a 1024*1024 matrix > and first do 1024 FFTs of length 1024 (1 FFT on each row), then transpose > the matrix, then again 1024 FFTs of size 1024(1 on each row) and then > transpose the matrix again, that I will have done my 1mill point FFT ? The > answer can be read out row by row.
I don't believe so, but it might not be a bad way to do it for the speed up of faster cache access. Well, if you leave out phase then it probably works, but phase is an important part of the FFT, so you can't leave it out. Numerical Recipes has a pretty understandable (by non-experts) explanation of FFT. -- glen
On 26 Apr, 16:01, "AdriaanB" <adriaan.bredek...@vibro-meter.com>
wrote:
> Hi all, > > I am not so new to DSP but I would still like to bounce an idea off the > experts just to check that I have all my ducks in a row. > > A very large FFT say 1M points is to be done on a real sequence. Am I > correct in saying that if I treat this 1mill points as a 1024*1024 matrix > and first do 1024 FFTs of length 1024 (1 FFT on each row), then transpose > the matrix, then again 1024 FFTs of size 1024(1 on each row) and then > transpose the matrix again, that I will have done my 1mill point FFT ? The > answer can be read out row by row.
Nope. A 1M FFT is *formally* equivalent to multiplying a 1M x 1M matrix with a 1M vector. The matrix is known, though, so all the coefficients are computed on the fly, and you don't actually use more than 2M or maybe 3M elements in memory. Rune
> I am not so new to DSP but I would still like to bounce an idea off the > experts just to check that I have all my ducks in a row. > > A very large FFT say 1M points is to be done on a real sequence. Am I > correct in saying that if I treat this 1mill points as a 1024*1024 matrix > and first do 1024 FFTs of length 1024 (1 FFT on each row), then transpose > the matrix, then again 1024 FFTs of size 1024(1 on each row) and then > transpose the matrix again, that I will have done my 1mill point FFT ? The > answer can be read out row by row.
Here is a link of a presentation I did a few years back about this: http://www.dilloneng.com/documents/DE_HPEC2004_Posters.pdf The basic steps are: 1. 1M points stored in row order 2. Column FFTs (1k 1k ffts) 3. Complex multiply each point by complex phase factor 4. Row FFTs (1k 1k ffts) 5. Results are in column order Tom
Rune Allnor wrote:
> A 1M FFT is *formally* equivalent to multiplying a 1M x 1M matrix > with > a 1M vector.
That't mathematically true. But it is not the way how FFT works. FFT uses symmetries in this matrix introduced by the prime factors of it's size. Only if the size is a prime number, there is no redundency.
> The matrix is known, though, so all the coefficients are > computed on the fly, and you don't actually use more than 2M or maybe > 3M elements in memory.
That's true anyway. Marcel
Hi again,

I looked at http://www.jjj.de/fxt/fxtbook.pdf on page 329 where the guy
explains a method for a 1D FFT of arbitary large size and started to
implement it with Matlab around an hour before going home. That is then
work to be continued tomorrow.

He seems to follow the general thing that was also mentioned here of
splitting it in a row * column and then doing FFTs row wise, then
transpose and repeat the rows again (in effect on the columns) and then
transposing it again. New question is: Why is the extra twiddle factor
multiplication needed halfway through the algorithm?

Some quick answers to comments from previous posting authors:
-Yes phase will have to be taken into account but I can handle that
easily.
-I am not worried about speed since my application does a 256 - 8k point
FFT(depending on config for that channel) every 6.25ms as part of a larger
signal chain (integrations, averaging band extraction etc.) and the VERY
large FFT is done when the timeslot has some extra free time available.
Obviously the small FFT channels have more free time available. Since the
very large one is done in the background, it is finished when it is
finished. The background FFT is anyway processed in SDRAM which means it
is slow.

R


 



>Rune Allnor wrote: >> A 1M FFT is *formally* equivalent to multiplying a 1M x 1M matrix >> with >> a 1M vector. > >That't mathematically true. But it is not the way how FFT works. FFT >uses symmetries in this matrix introduced by the prime factors of it's >size. Only if the size is a prime number, there is no redundency. > >> The matrix is known, though, so all the coefficients are >> computed on the fly, and you don't actually use more than 2M or maybe >> 3M elements in memory. > >That's true anyway. > > >Marcel >
_____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?
AdriaanB wrote:
(snip)

> He seems to follow the general thing that was also mentioned here of > splitting it in a row * column and then doing FFTs row wise, then > transpose and repeat the rows again (in effect on the columns) and then > transposing it again. New question is: Why is the extra twiddle factor > multiplication needed halfway through the algorithm?
There is no extra twiddle factor, it is the same one that comes on any step of the FFT. It just looks extra because of the way you are separating it. I wondered some time ago about an FFT of a whole CD, so maybe 44100*60*70 = 185220000 samples (per channel). -- glen
HI

Yes it is clear to me that this is the same Twiddle array that I previous
calculated that for increasing the speed of the FFT calc, but why is he
making it then again ? 

It looks like
- FFT with twiddle 
-Twiddle again (???)
- FFT with twiddle
-final answer

I noticed the same in a PhD thesis  on radio astronomy in the appendix C
where the author made an extra Twiddle step. In the referenced
presentation from Dillon Engineering there was also this step. 

What is the purpose of it? Is the 1D FFT calculated as 2D matrix really
that much different to the image processing 2D FFT ?

R






>AdriaanB wrote: >(snip) > >> He seems to follow the general thing that was also mentioned here of >> splitting it in a row * column and then doing FFTs row wise, then >> transpose and repeat the rows again (in effect on the columns) and
then
>> transposing it again. New question is: Why is the extra twiddle factor >> multiplication needed halfway through the algorithm? > >There is no extra twiddle factor, it is the same one that >comes on any step of the FFT. It just looks extra because of >the way you are separating it. > >I wondered some time ago about an FFT of a whole CD, so maybe >44100*60*70 = 185220000 samples (per channel). > >-- glen > >
_____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?
> Yes it is clear to me that this is the same Twiddle array that I previous > calculated that for increasing the speed of the FFT calc, but why is he > making it then again ? > > It looks like > - FFT with twiddle > -Twiddle again (???) > - FFT with twiddle > -final answer > > I noticed the same in a PhD thesis on radio astronomy in the appendix C > where the author made an extra Twiddle step. In the referenced > presentation from Dillon Engineering there was also this step. > > What is the purpose of it? Is the 1D FFT calculated as 2D matrix really > that much different to the image processing 2D FFT ? >
Without it you don't get a 1M FFT when you are done. It would be nice not to have to do it, but if you don't do the extra twiddle multiply you would have a 2D FFT 1k x 1k FFT and not a 1M FFT. Tom
... a more mathematical answer to that please ?

It is clear that when the recipe says you add salt and two eggs it is best
to add exactly that. I would just rather like to know the why without
having to follow it blindly. 

In the jjj.de .pdf file on page 329 and some pages before where J&#4294967295;rg shows
a Image processing 2D FFT, the difference is exactly the extra Twiddle
factor you have to make to get back to a 1D 1M point FFT. 

R




>> Yes it is clear to me that this is the same Twiddle array that I
previous
>> calculated that for increasing the speed of the FFT calc, but why is
he
>> making it then again ? >> >> It looks like >> - FFT with twiddle >> -Twiddle again (???) >> - FFT with twiddle >> -final answer >> >> I noticed the same in a PhD thesis on radio astronomy in the appendix
C
>> where the author made an extra Twiddle step. In the referenced >> presentation from Dillon Engineering there was also this step. >> >> What is the purpose of it? Is the 1D FFT calculated as 2D matrix
really
>> that much different to the image processing 2D FFT ? >> > >Without it you don't get a 1M FFT when you are done. It would be nice >not to have to do it, but if you don't do the extra twiddle multiply >you would have a 2D FFT 1k x 1k FFT and not a 1M FFT. > >Tom > >
_____________________________________ Do you know a company who employs DSP engineers? Is it already listed at http://dsprelated.com/employers.php ?