comp.dsp | use of radix-2 ffts

Hello again,

Can a radix-4 fft be used to construct any type of radix-2 fft. e.g.
can I manipulate a 64 pt radix-4 fft to give me a 128 pt radix-2 fft ?
If so, is the radix-2 fft of any use in dsp as it usually results in
larger and slower ffts than the radix-4 version.


As always thanks for all replies.

Bob Carter
--
My ignorance is shameful, but I would rather be ashamed than ignorant.
Better, I would rather be neither !!!!

Reply by Greg Berchin ●October 8, 20032003-10-08

On 8 Oct 2003 05:20:19 -0700, stenasc@yahoo.com (Bob) wrote:

>>Can a radix-4 fft be used to construct any type of radix-2 fft. e.g.
>>can I manipulate a 64 pt radix-4 fft to give me a 128 pt radix-2 fft ?

If your input data are strictly real, you can.

In the more general case, you can use a mixed-radix FFT with one
stage of radix-2 and the remainder radix-4.

>>If so, is the radix-2 fft of any use in dsp as it usually results in
>>larger and slower ffts than the radix-4 version.

As always, the answer to your question depends upon the specific
circumstances.  If one is only implementing a single FFT in a
unique situation, then it makes sense to use as many radix-4
stages as possible.  However, if one is implementing in
general-purpose hardware, for example, and several different sizes
must be accommodated, then using a radix-2 provides greater
flexibility for just a small increase in resource usage.

Greg Berchin

Reply by Rune Allnor ●October 8, 20032003-10-08

stenasc@yahoo.com (Bob) wrote in message news:<20540d3a.0310080420.6dc88017@posting.google.com>...
> Hello again,
> 
> Can a radix-4 fft be used to construct any type of radix-2 fft. e.g.
> can I manipulate a 64 pt radix-4 fft to give me a 128 pt radix-2 fft ?
> If so, is the radix-2 fft of any use in dsp as it usually results in
> larger and slower ffts than the radix-4 version.

Some quite interesting questions... usually one requires that a "radix
N"
FFT is based on N being a prime number. Could you define the
"primitive" ("butterfly") of the radix 4 FFT? While I can't see how to
construct such
a "primitive", I am quite sure you will find that there are fewer 
computations involved in computing the 4 point DFT by the radix 2 FFT,
than by using a radix 4 "primitive", whatever this may be.

Rune

Reply by robert bristow-johnson ●October 8, 20032003-10-08

In article 20540d3a.0310080420.6dc88017@posting.google.com, Bob at
stenasc@yahoo.com wrote on 10/08/2003 08:20:

> Can a radix-4 fft be used to construct any type of radix-2 fft. e.g.
> can I manipulate a 64 pt radix-4 fft to give me a 128 pt radix-2 fft ?

sure, you can do two 64 pt FFTs (decimation-in-frequency) and then do the
final radix-2 pass to complete 128 pt FFT.  or you can do the initial
radix-2 pass and follow it with two 64 pt FFTs (decimation-in-time).  check
Oppenhiem & Schafer for nice little drawings.

> If so, is the radix-2 fft of any use in dsp as it usually results in
> larger and slower ffts than the radix-4 version.

if the processor doing the FFT has a (severely) limited number of internal
registers (both for real and imaginary butterfly values and for pointers to
the 4 in-place locations in the FFT buffer), then there is likely no
efficient way to implement a true radix-4 FFT (i.e. a radix-2 FFT will be
the best you can hope for).

you can define any radix-2^m FFT from the radix-2 definition by
encapsulating 2^m in-place points within m passes.  e.g. for radix-8, you
would encapsulate the 12 radix-2 butterflies that span 3 passes and make it
into one big mondo radix-8 butterfly that has 8 (complex) inputs and 8
in-place outputs.

one reason that the radix-4 is popular is that it only requires 4 complex
registers (8 real registers) for the FFT data and 1 complex register (2
real) for the twiddle factor.  the twiddle factor for latter 2 FFT points
within the butterfly is the same as the twiddle factor for the first 2
points except for switching the real and imaginary parts and a couple of
sign changes that can be absorbed into the butterfly code without a
computational penalty (i.e. use the "real" twiddle instead of the "imag" and
vise-versa and "add" gets changed to "subtract" and vise-versa).

the way i have written a radix-2 FFT is to do one radix-2 butterfly and then
the other radix-2 butterfly 90 degrees away (the other one in the radix-4
butterfly within the same pass) so that i would not have to reload the
twiddle factor.  i also wrote my code to avoid multiplication by 1 or j
(multiplying by j simple reverses real and imag (and changes sign of the
real) and can be done with no computational cost if the code is structured
to do it.

the radix-8 (or higher radix-2^m) requires more twiddle data (than 2 real
numbers) to get loaded in to a single radix-2^m butterfly.
 
> As always thanks for all replies.

hope you didn't byte off more that you can chew.  (always a risk when i
answer a question on comp.dsp .)

r b-j

Reply by Steven G. Johnson ●October 9, 20032003-10-09

Rune Allnor wrote:
> Some quite interesting questions... usually one requires that a "radix
> N" FFT is based on N being a prime number.

Nope, the radix can be anything you want (as long as it is a factor of 
the size to be transformed, of course), at least in the Cooley-Tukey FFT 
algorithm.

In general, Cooley-Tukey breaks a DFT of size n = n1 * n2 into n1 DFTs 
of size n2, followed by multiplication by roots of unity ("twiddle 
factors") followed by n2 DFTs of size n1.  A "radix-r" FFT is just the 
special case of n1 = r (for decimation in time) or n2 = r (for 
decimation in frequency), where you use the same factor r recursively.

 > Could you define the
> "primitive" ("butterfly") of the radix 4 FFT? While I can't see how to
> construct such
> a "primitive", I am quite sure you will find that there are fewer 
> computations involved in computing the 4 point DFT by the radix 2 FFT,
> than by using a radix 4 "primitive", whatever this may be.

Nope, quite the opposite.  First of all, realize that the radix's 
"primitive" kernel (or "butterfly") is just a DFT of the size of the 
radix.  So, for a radix-4 FFT, you need a (usually hard-coded) size-4 
DFT.  Because this size-4 DFT includes some multiplications by i, -1, 
etcetera that can be optimized out, a radix-4 FFT algorithm generally 
requires fewer operations than a radix-2 algorithm.

Actually, the minimal operation count (for powers of two) is achieved by 
something called "split-radix" that is a mixture of radix 2 and 4.  At 
each step, it breaks your size-n DFT into three pieces, one of size n/2 
and two of size n/4.

However, FFT speed these days is mostly determined by other factors 
besides strict flop counts.  e.g. the FFTW library (www.fftw.org) often 
uses radices as large as 32 or 64.

Steven

Reply by Steven G. Johnson ●October 9, 20032003-10-09

Steven G. Johnson wrote:
> Nope, quite the opposite.  First of all, realize that the radix's 
> "primitive" kernel (or "butterfly") is just a DFT of the size of the 
> radix.  So, for a radix-4 FFT, you need a (usually hard-coded) size-4 
> DFT. 

(In case it wasn't clear, for a radix-r FFT you need a size-r DFT. 
Usually, you compute this size-r DFT in turn via some FFT algorithm 
(e.g. radix-2), but you hard-code it and optimize out things like 
trivial multiplications.)

Reply by Steven G. Johnson ●October 9, 20032003-10-09

Bob wrote:
> Can a radix-4 fft be used to construct any type of radix-2 fft. e.g.
> can I manipulate a 64 pt radix-4 fft to give me a 128 pt radix-2 fft ?
> If so, is the radix-2 fft of any use in dsp as it usually results in
> larger and slower ffts than the radix-4 version.

It's not clear what you mean by "be used to construct", and I suspect 
you're confusing the algorithm ("radix-r") with the sizes that it can 
handle by itself (r^n).

A radix-r FFT algorithm is best understood as a single step, which *can* 
be applied recursively but need not be.  The beauty of this class of FFT 
algorithms (Cooley-Tukey) is that it expresses a DFT of size n as r 
smaller DFTs of size n/r (plus some combining computations involving 
DFTs of size r).  Because these smaller transforms are also DFTs, you 
can apply any FFT algorithm you want in order to compute them, either 
the same radix or different radices.

Steven

Reply by James Van Buskirk ●October 10, 20032003-10-10

"Greg Berchin" <Bank-to-Turn@mchsi.com> wrote in message
news:uqa8ov00mbp3vprq2vcu6ujssisrq41qe6@4ax.com...

> As always, the answer to your question depends upon the specific
> circumstances.  If one is only implementing a single FFT in a
> unique situation, then it makes sense to use as many radix-4
> stages as possible.  However, if one is implementing in
> general-purpose hardware, for example, and several different sizes
> must be accommodated, then using a radix-2 provides greater
> flexibility for just a small increase in resource usage.

The increase in resource usage for radix-2 FFTs is not always
small.  The number of cache or memory reads and writes goes
up significantly compared to, for example, radix-8 FFTs.  On
a processor where these are the limiting resource radix-2 FFTs
are a bad idea, e.g. Alpha 21164.

-- 
write(*,*) transfer((/17.392111325966148d0,3.6351694777236872d228, &
6.0134700169991705d-154/),(/'x'/)); end

Reply by Greg Berchin ●October 10, 20032003-10-10

On Fri, 10 Oct 2003 19:05:42 GMT, "James Van Buskirk"
<not_valid@comcast.net> wrote:

>>The increase in resource usage for radix-2 FFTs is not always
>>small.  The number of cache or memory reads and writes goes
>>up significantly compared to, for example, radix-8 FFTs.  

Well, the original question had to do with radix-4 vs. radix-2.
Nonetheless, though a radix-8 FFT may provide better resource
utilization in a particular implementation, if the user needs a
16384-point FFT then they are out of luck.

>>On a processor where these are the limiting resource radix-2 
>>FFTs are a bad idea, e.g. Alpha 21164.

As I said, "The answer to your question depends upon the 
specific circumstances."

Greg Berchin

Reply by Ray Andraka ●October 10, 20032003-10-10

If for hardware, one possibility is to use a Winograd FFT, which is
basically a different matrix decomposition designed to minimize the
multiplications (winograd FFTs break the DFT into 3 successive matrix
multiplies, the first and last containing only factors of 1 and i so that
no multiplies are needed.  The middle matrix is a diagonal matrix of
constants).  In the case of the winograd FFT, there are decompositions for
several radices up to 16.  The Winograd algorithm is the basis of my FFT
library for FPGAs, which happens to be the fastest possible for single
sample at a time in Xilinx FPGAs (speed is limited by the minimum pulse
width on the SRL16 primitives...over 300 MS/sec in virtexII-P-7 for the 16
point kernel with a 73 clock latency).

"Steven G. Johnson" wrote:

> Steven G. Johnson wrote:
> > Nope, quite the opposite.  First of all, realize that the radix's
> > "primitive" kernel (or "butterfly") is just a DFT of the size of the
> > radix.  So, for a radix-4 FFT, you need a (usually hard-coded) size-4
> > DFT.
>
> (In case it wasn't clear, for a radix-r FFT you need a size-r DFT.
> Usually, you compute this size-r DFT in turn via some FFT algorithm
> (e.g. radix-2), but you hard-code it and optimize out things like
> trivial multiplications.)

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Previous12 Next

use of radix-2 ffts

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group