DSPRelated.com
Forums

FFT algorithms

Started by Richard Owlett November 24, 2007
Richard Owlett wrote:

(I wrote)
>> Square waves are not orthogonal, which makes the transform
>> harder, but not impossible.
> Why would they be any less orthogonal than sets of sines and cosines?
> They should also span the space as well as sines and cosines.
They span the space fine, but aren't orthogonal.
>> Still, you might look at the ICT. It is a transform specifically
>> designed to make the forward transform fast on a machine with no
>> multiply instructions. You might find from there some of the
>> properties of transforms on non-orthogonal basis functions.
>> The ICT assumes that the inverse transform is done on a much
>> faster processor (on earth). The forward case can be done
>> fast on a CDP1802.
> By ICT do you mean Inverse Cosine Transform? That was the only relevant
> Google hit I found. My signal is real but not necessarily even (or odd).
> I'd be interested in anything that eliminated multiplications.
Integer Cosine Transform.

http://tda.jpl.nasa.gov/progress_report/42-119/119M.pdf

Consider that you have to do the forward transform on a CDP1802, maybe
similar in power to a 6502. You have a limited time in which to do the
transform, and limited memory for intermediate data.

The inverse can be done using the fastest hardware you can find, and
you are much less restricted in the time available. Such restrictions
led to an interesting new transform. (Well, not so new now.)

-- glen
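[To give a feel for what "no multiply instructions" buys, here is a
hedged sketch in the spirit of integer-cosine-style transforms. The
coefficients are the well-known 4-point integer DCT approximation used
by H.264, not necessarily the ones in the JPL report above; the point
is that the forward pass needs only adds, subtracts, and shifts.]

/* Sketch: 4-point integer approximation to the DCT (H.264-style
   coefficients, NOT necessarily the JPL ICT).  The forward transform
   uses only adds, subtracts, and one shift, so it runs on a CPU
   without a multiply instruction. */
void int_dct4(const int x[4], int y[4])
{
    int s03 = x[0] + x[3], d03 = x[0] - x[3];
    int s12 = x[1] + x[2], d12 = x[1] - x[2];
    y[0] = s03 + s12;          /* row [ 1  1  1  1 ] */
    y[1] = (d03 << 1) + d12;   /* row [ 2  1 -1 -2 ] */
    y[2] = s03 - s12;          /* row [ 1 -1 -1  1 ] */
    y[3] = d03 - (d12 << 1);   /* row [ 1 -2  2 -1 ] */
}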
Ron N. wrote:

> On Nov 25, 4:30 am, Richard Owlett <rowl...@atlascomm.net> wrote:
>> glen herrmannsfeldt wrote:
>>> Richard Owlett wrote:
>>> (snip)
>>>> I'm not interested in the computation time but in whether or not I
>>>> can follow what's going on. Eventually I want to experiment with
>>>> using square waves instead of sinusoids as my basis functions. I
>>>> suspect it will be a better model for my situation.
>>> Square waves are not orthogonal, which makes the transform
>>> harder, but not impossible.
>> Why would they be any less orthogonal than sets of sines and cosines?
>> They should also span the space as well as sines and cosines.
>
> It depends on which waves. Even waves are orthogonal to odd
> waves (sinusoidal or square), periodic waves are orthogonal
> to waves at 2^N times the frequency (sinusoidal or square).
> But square waves are not orthogonal to square waves at three
> times the frequency (or other odd multiples), whereas
> sinusoids are.
Oh, the trouble one gets into when reasoning by analogy without
checking ;)
> There are Walsh functions which do span a space and are
> orthogonal, but they are not purely "square" waves, if square
> means periodic with a constant 50% duty cycle. Since they
> are not completely constant in periodicity or duty cycle,
> using them to determine "frequency content", for more common
> usages of that term, might be problematic.
I can't see an EXPLICIT requirement for a 50% duty cycle. However, I'm
looking for simplicity, and a ZERO mean value can easily be shown to be
a requirement. So a 50% duty cycle probably is IMPLICIT. Thanks.
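[A quick numerical check of Ron's claim above; this is a sketch added
for illustration, not code from the thread. It correlates a square wave
against the square wave at three times its frequency, and the same pair
of sines for comparison.]

#include <math.h>
#include <stdio.h>

#define TWOPI 6.2831853071795864769
#define NSAMP 1200   /* divisible by 6, so both periods fit exactly */

static double square(double t) { return sin(t) >= 0 ? 1.0 : -1.0; }

int main(void)
{
    double dot_sq = 0.0, dot_sin = 0.0;
    for (int n = 0; n < NSAMP; ++n) {
        double t = TWOPI * n / NSAMP;
        dot_sq  += square(t) * square(3*t);   /* square vs. 3x square */
        dot_sin += sin(t)    * sin(3*t);      /* sine vs. 3x sine     */
    }
    /* prints roughly 0.333 for the squares (not orthogonal!) and
       ~0 for the sines */
    printf("squares: %g   sines: %g\n", dot_sq / NSAMP, dot_sin / NSAMP);
    return 0;
}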
glen herrmannsfeldt wrote:

> (snip)
>
> Integer Cosine Transform.
>
> http://tda.jpl.nasa.gov/progress_report/42-119/119M.pdf
Written by and FOR mathematicians ;<
> Consider that you have to do the forward transform on a CDP1802,
When Googling for CDP1802 to see if it was what I recalled as an 1802,
I found: 'A CDP1802 is much like "486" or Pentium is to Windows PCs
today'

http://www.cdp1802.org/cdp1802.html

OK, so what's a little context between friends ;)
> (snip)
Richard Owlett wrote:
> In "Chapter 12 - The Fast Fourier Transform / FFT Programs" of _The > Scientist and Engineer's Guide to Digital Signal Processing_ > Smith says "There are also FFT routines that completely eliminate the > bit reversal sorting." > > Could someone point me to a description of one, preferably with BASIC or > FORTRAN sample code. > > TIA
KissFFT does not use bit-reversed addressing; I think bit-reversed
addressing is not a good match for a mixed-radix fft.

If you are interested in seeing the derivation, I put together a pdf
that contains the derivation used in kissfft.

http://sourceforge.net/projects/kissfft
(go to download, browse all files)

-- Mark
Mark Borgerding wrote:
> (snip regarding kissfft and its derivation pdf)
Downloaded. This group seems to insist I understand the math I took 40 years ago ;) Thanks
On Nov 24, 9:45 am, Richard Owlett <rowl...@atlascomm.net> wrote:
> In "Chapter 12 - The Fast Fourier Transform / FFT Programs" of _The > Scientist and Engineer's Guide to Digital Signal Processing_ > Smith says "There are also FFT routines that completely eliminate the > bit reversal sorting." > > Could someone point me to a description of one, preferably with BASIC or > FORTRAN sample code.
A number of references are given in the Wikipedia article on the
Cooley-Tukey FFT algorithm:

http://en.wikipedia.org/wiki/Cooley-Tukey_FFT_algorithm

The Cooley-Tukey FFT (the most common algorithm), along with other FFT
algorithms for that matter, intrinsically requires some sort of
re-ordering, in the sense that its operations map consecutive inputs to
non-consecutive outputs and vice versa.

The most well-known approach is to do the re-ordering all at once, with
a single bit-reversal pass (or, more generally, a mixed-base digit
reversal). Another approach that still works in-place is to arrange the
sequence of radices so that one can do a sequence of small
transpositions interleaved with the transform (alternating DIT/DIF
stages work well here). The simplest approaches work out-of-place, e.g.
the Stockham algorithm goes back and forth between two arrays with each
pass, transposing one digit at a time.

The easiest approach, in my mind, is simply to do the algorithm
recursively out-of-place; if you directly transcribe the Cooley-Tukey
algorithm from the mathematical description, working recursively, all
of the outputs are "automatically" put in the right place. (This was
the approach we used in the original FFTW; the current version of FFTW
implements a variety of algorithms as described in our Proc. IEEE
paper.)

For example, here is a very simple 15-line radix-2 FFT, working
recursively and out-of-place, that does not require any separate
bit-reversal pass. (To make it reasonably fast, one would want to use a
larger base case than the size-1 base case used here. It also uses a
relatively inaccurate trigonometric recurrence. But of course, if you
were actually interested in such practical issues you should be using a
pre-existing FFT library.)

#include <complex.h>

#define TWOPI 6.2831853071795864769252867665590057683943388

/* call with rec_fft(+1 or -1, N, input, output, 1, 1), where input and
   output are arrays of length N ... requires C99 complex-number
   support */
void rec_fft(int sign, int N, complex double *x, complex double *y,
             int is, int os)
{
    if (N == 1)
        *y = *x;
    else {
        int N2 = N/2;
        rec_fft(sign, N2, x, y, is*2, os);
        rec_fft(sign, N2, x + is, y + os*N2, is*2, os);
        complex double omega = 1.0,
                       omega1 = cexp(sign * I * TWOPI / N);
        for (int i = 0; i < N2; ++i) {
            complex double a = y[os*i], b = omega * y[os*(i + N2)];
            y[os*i] = a + b;
            y[os*(i + N2)] = a - b;
            omega *= omega1;
        }
    }
}

Regards,
Steven G. Johnson

PS. Numerical Recipes, which several other posters have suggested, is
not the best reference for information beyond the textbook radix-2
algorithm. Its advice on more sophisticated FFT algorithms is at least
20 years out of date, to the extent that it was ever true.
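[A minimal driver for the listing above, added here for illustration
(not part of the original post). Compiled together with rec_fft, it
forward-transforms an 8-point impulse, whose DFT should be a flat
spectrum of ones.]

#include <stdio.h>

int main(void)
{
    complex double x[8] = { 1 }, y[8];  /* impulse: x[0] = 1, rest 0 */
    rec_fft(-1, 8, x, y, 1, 1);         /* sign = -1: forward FFT */
    for (int i = 0; i < 8; ++i)         /* expect 1+0i in every bin */
        printf("%g %+gi\n", creal(y[i]), cimag(y[i]));
    return 0;
}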
On Nov 24, 11:30 pm, robert bristow-johnson
<r...@audioimagination.com> wrote:
> there are drawings of it in Oppenheim and Schafer (any era of O&S) for
> the case of N=8. i think i can see a pattern in them but, they do not
> appear to me to be as efficient, for most instruction sets, as the
> simple FFT where you have to bit reverse before or after. you need to
> maintain many more pointers and i'm not completely sure how you
> manipulate them all for the different FFT passes.
To the contrary, a separate bit-reversal pass means that you have an
additional "pass" over your data, which can have a non-negligible
impact on performance (if the rest of your FFT is fast... if you are
using vanilla radix-2 it doesn't make much difference). For large N,
the problem is the cache impact. For small N, e.g. N=8, you should just
read everything into registers, compute the transform, and write it
back out in the correct order (or whatever order you want). Even for N
larger than the cache (or registers, which can be viewed as a kind of
cache), the goal is to do as much as possible with the data while it is
in cache, and avoid going in and out of the cache as much as possible.

One point of potential confusion: the FFT variants that do not require
bit-reversal are not really different "algorithms". All of the ones
being referred to here are still variants of the same Cooley-Tukey
factorization of the DFT, with the same number of arithmetic
operations. The only difference lies in where/when data are
stored/accessed in memory.

Regards,
Steven G. Johnson
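[For reference, the separate "pass" under discussion is just a
permutation of the data. A common in-place form looks like this; a
sketch added for illustration, for power-of-two N.]

#include <complex.h>

/* In-place bit-reversal permutation: swap x[i] with x[rev(i)], once
   per pair.  j tracks the bit-reversed value of i: each iteration
   "adds 1" to j with the carry propagating from the high bit down. */
void bit_reverse(complex double *x, int N)
{
    for (int i = 0, j = 0; i < N; ++i) {
        if (i < j) {                 /* swap each pair exactly once */
            complex double t = x[i];
            x[i] = x[j];
            x[j] = t;
        }
        int m = N >> 1;              /* reversed increment of j */
        while (m >= 1 && j >= m) {
            j -= m;
            m >>= 1;
        }
        j += m;
    }
}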
On Nov 27, 12:35 pm, "Steven G. Johnson" <stev...@alum.mit.edu> wrote:
> (snip: Steven's points above about the cost of a separate bit-reversal
> pass, cache behavior, and no-bit-reversal variants being the same
> Cooley-Tukey factorization)
Steven, what's different (in terms of code) is the pointer arithmetic.
doing all sorts of goofy pointer arithmetic might be more costly, in
the aggregate throughout all of the FFT passes, than a single
bit-reversing pass at the beginning or end. the normal in-place FFT
should do better cache-wise than these normal-in, normal-out routines
because the top (or bottom) of the butterfly will share pages of memory
with the top (or bottom) of the next butterfly. and in-place should do
better cache-wise than an indexing scheme where two input nodes get
read in and crunched and then written to two totally different
locations for the output (and this cannot be in place, you would have
to have two buffers, each of size N, and ping-pong back and forth
between the two for each pass).

in addition, as i tried to point out earlier, if you're doing something
like fast convolution using the FFT, then if your forward FFT leaves
its results in bit-reversed order and your inverse FFT takes its input
in bit-reversed order (and leaves the results in normal order), then no
bit reversing will be necessary. (the transfer function, besides being
pre-computed, will also be pre-bit-reversed.)

r b-j
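[A sketch of the fast-convolution arrangement r b-j describes, added
for illustration. fft_dif and ifft_dit are hypothetical placeholders,
not any real library's API: a DIF forward FFT that takes natural order
in and leaves bit-reversed order out, and a DIT inverse FFT that does
the opposite. The intermediate ordering cancels, so no reversal pass is
ever run.]

#include <complex.h>

/* Hypothetical placeholder routines (see note above). */
void fft_dif(complex double *x, int n);   /* natural in, bit-rev out */
void ifft_dit(complex double *x, int n);  /* bit-rev in, natural out */

/* Fast convolution with no bit-reversal pass: H_br is the transfer
   function, pre-computed AND pre-bit-reversed, so the pointwise
   multiply happens entirely in bit-reversed order. */
void fast_convolve(complex double *x, const complex double *H_br, int n)
{
    fft_dif(x, n);                 /* natural in -> bit-reversed out */
    for (int i = 0; i < n; ++i)
        x[i] *= H_br[i];           /* both factors in the same order */
    ifft_dit(x, n);                /* bit-reversed in -> natural out */
}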
Hi Richard,
I haven't seen these alternate FFTs in years, but I can look around if
you have a need. However, they are just as cryptic and difficult to
understand as the conventional ones. Also, I doubt that you could
modify one to use square waves. It sounds like you should be using the
"DFT by Correlation" method. See page 157 in Chapter 8 of my on-line
book (www.DSPguide.com). This can directly be modified to use square
waves. Good luck and let me know if I can help.
Regards,
Steve

Steve.Smith1@SpectrumSDI.com    
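[For readers without the book at hand, a minimal sketch of the "DFT by
Correlation" idea, written from the general description above rather
than copied from page 157: each output bin is the input correlated with
a basis wave, so swapping sinusoids for square waves is a one-line
change.]

#include <math.h>

#define TWOPI 6.2831853071795864769

/* Correlate the input with each basis wave to get the real and
   imaginary parts of bins 0..N/2.  O(N^2), but trivially adaptable:
   e.g. replace c and s with copysign(1.0, c) and copysign(1.0, s) to
   correlate against square waves instead (a hypothetical variant,
   subject to the non-orthogonality caveats discussed earlier). */
void dft_by_correlation(const double *x, double *re, double *im, int N)
{
    for (int k = 0; k <= N/2; ++k) {
        re[k] = 0.0;
        im[k] = 0.0;
        for (int n = 0; n < N; ++n) {
            double c = cos(TWOPI * k * n / N);
            double s = sin(TWOPI * k * n / N);
            re[k] += x[n] * c;
            im[k] -= x[n] * s;
        }
    }
}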
Steven G. Johnson wrote:
(snip regarding bit-reversal in FFTs)

> The most well-known approach is to do the re-ordering all at once,
> with a single bit-reversal pass (or, more generally, a mixed-base
> digit reversal). Another approach that still works in-place is to
> arrange the sequence of radices so that one can do a sequence of small
> transpositions interleaved with the transform (alternating DIT/DIF
> stages work well here). The simplest approaches work out-of-place,
> e.g. the Stockham algorithm goes back and forth between two arrays
> with each pass, transposing one digit at a time.
I still remember a Fortran FFT subroutine from long ago that did bit
reversal using nested DO loops. Depending on N (the number of points)
it would (illegally) GOTO the appropriate nested DO and (maybe) GOTO
out at the appropriate point. Not very portable, but it must have
worked on at least one machine.

-- glen
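[Since the Stockham algorithm came up more than once: here is one
recursive formulation of it, a sketch written from the description
quoted above rather than from any particular library. Each pass
ping-pongs between the two arrays and moves one "digit" into place as
it goes, so the output comes out in natural order with no bit-reversal
pass. Call it as stockham(sign, N, 1, 0, x, work); the result lands
back in x.]

#include <complex.h>

#define TWOPI 6.2831853071795864769

/* Radix-2 Stockham autosort FFT (sketch): n is the current length,
   s the stride, and eo flips each pass to track which array currently
   holds the data.  Each pass writes from x into y with the butterfly
   outputs interleaved, which performs the re-ordering incrementally. */
void stockham(int sign, int n, int s, int eo,
              complex double *x, complex double *y)
{
    if (n == 1) {
        if (eo)                       /* odd number of passes taken:  */
            for (int q = 0; q < s; q++)
                y[q] = x[q];          /* copy so result ends up in x  */
        return;
    }
    int m = n / 2;
    for (int p = 0; p < m; p++) {
        complex double wp = cexp(sign * I * TWOPI * p / n);
        for (int q = 0; q < s; q++) {
            complex double a = x[q + s*p];
            complex double b = x[q + s*(p + m)];
            y[q + s*(2*p)]     = a + b;          /* butterfly outputs */
            y[q + s*(2*p + 1)] = (a - b) * wp;   /* written interleaved */
        }
    }
    stockham(sign, m, 2*s, !eo, y, x);  /* ping-pong: swap the arrays */
}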