DSPRelated.com
Forums

FFT By Decimation In Time paper or succinct explanatory resource

Started by Benjamin W D Jordan May 1, 2005
Hi All,

I have recently completed my DSP course (subject) which is part of my 
engineering degree and I am very interested in DSP applications to Audio in 
particular. I have decided to undertake a project which involves FFT 
(probably I will use a Radix-2 to keep it simple for me) in hardware - I 
will be coding an FFT in VHDL.

The main purpose of this project is to own the core functionality, so I 
intend to write the IP core myself, rather than use one of the myriad 
available for download from sites such as www.opencores.org. (Let alone the 
fact that free IP usually requires just as much effort getting it to work in 
a given application as writing your own from scratch...)

My only problem is that I still have what I would call a mediocre grasp of 
how to do it and what decimation in time actually is. I know what FFT is, 
and why I want to use it - just can't actually do it yet. Can anyone tell me 
where on the web I may find a succinct, clear and 
easy-for-an-undergrad-to-understand reference on Decimation In Time? What I 
mean is, it can have the complex maths, but it must have more plain English 
than academic gushing, and not be just a bunch of symbols with no comments 
either.

Hey, I know it's a tall ask, but I'm sure that some of you guys have been 
where I am now standing and can help point the way forward.

Thanks in advance,

Ben Dennis. 


"Benjamin W D Jordan" <ben.xjordanx@optusnet.com.au> wrote in message 
news:4274c0d4$0$4656$afc38c87@news.optusnet.com.au...
> Hi All,
> [snip]
> My only problem is that I still have what I would call a mediocre grasp of
> how to do it and what decimation in time actually is. I know what FFT is,
> and why I want to use it - just can't actually do it yet.
> [snip]
Hi Ben,

A book I really like is "The Fast Fourier Transform and Its Applications" by
E. Oran Brigham; you may be able to borrow it from your local library before
you decide whether it's for you or not.

Best of Luck - Mike
in article 4274c0d4$0$4656$afc38c87@news.optusnet.com.au, Benjamin W D
Jordan at ben.xjordanx@optusnet.com.au wrote on 05/01/2005 07:43:

> I have recently completed my DSP course (subject) which is part of my
> engineering degree and I am very interested in DSP applications to Audio in
> particular. I have decided to undertake a project which involves FFT
> (probably I will use a Radix-2 to keep it simple for me) in hardware - I
> will be coding an FFT in VHDL.
... i might pee in my pants looking at that cliff to climb.
> Can anyone tell me where on the web I may find some succinct, clear and
> easy-for-an-undergrad-to-understand reference on Decimation In Time. What I
> mean is, it can have the complex maths, but it must have more plain English
> than academic gushing, and not be just a bunch of symbols with no comments
> either.
i dunno if Steven G. Johnson and any other FFTW folks will object to any of
this.  maybe you should get Oppenheim and Schafer and look at the FFT chapter
in that (for reference).

DFT:
           N-1
    X[k] = SUM{ x[n] exp(-j*2*pi*n*k/N) }
           n=0

iDFT:
               N-1
    x[n] = 1/N SUM{ X[k] exp(+j*2*pi*n*k/N) }
               k=0

to blast either summation out in the straight-forward manner needs N-1
complex multiplies (we don't need to multiply by exp(0)) and N-1 complex
additions for *each* output value, X[k] or x[n].  so that's a total of
N*(N-1) complex multiplies and additions (about N^2).

the Cooley-Tukey Decimation-in-Time algorithm:  suppose N is an even
number... we split the sum into two summations, one for even indices, the
other odd.

           N-1
    X[k] = SUM{ x[n] exp(-j*2*pi*n*k/N) }
           n=0

           N-2                               N-1
         = SUM{ x[n] exp(-j*2*pi*n*k/N) }  + SUM{ x[n] exp(-j*2*pi*n*k/N) }
           n=0 (even)                        n=1 (odd)

           N/2-1
         = SUM{ x[2n] exp(-j*2*pi*(2n)*k/N) }
           n=0

           N/2-1
         + SUM{ x[2n+1] exp(-j*2*pi*(2n+1)*k/N) }
           n=0

           N/2-1
    X[k] = SUM{ x[2n] exp(-j*2*pi*n*k/(N/2)) }
           n=0

                              N/2-1
         + exp(-j*2*pi*k/N)   SUM{ x[2n+1] exp(-j*2*pi*n*k/(N/2)) }
                              n=0

now what we did is turn that N-point DFT into two N/2 point DFTs with an
additional N/2 (or N/2 - 1, since we don't need to multiply by exp(0))
multiplications.  let's see if that saved us anything:

    single N point DFT:       N*(N-1) = N^2 - N multiplies (and additions)

    splitting into two DFTs:  2*(N/2)*(N/2 - 1) + (N/2 - 1) multiplies

                            = (N/2 - 1)*(N + 1)

                            = (N^2 - N)/2 - 1

    comparing:  N^2 - N  >  (N^2 - N)/2 - 1

clearly splitting this into two N/2 point DFTs saved us at least half of the
multiplications and saved even more additions.  so why stop there?  splitting
the N/2 point DFTs each into two N/4 DFTs (a total of four N/4 DFTs) will
save us even more complex multiplications and additions.  but, to do that,
N/2 needs to be an even number (N is a multiple of 4).  you can see that, if
N is a power of 2, we can keep splitting these DFTs up until we're left with
the 2-point DFTs.
however, there is more intermediate data than just doing it the
straightforward way.  so try drawing this out and see what it gets you.

you can see the beginnings of the "bit-reversed indexing" in the DIT FFT.  in
the first stage, you picked out all of the x[n] with odd n, that is with the
LSB (least significant bit) equal to 1, and you put those in the second half
of the buffer (those have indices with the MSB equal to 1).  so the evens
(with LSB=0) go in the first group (MSB=0), and the odds (LSB=1) go into the
second group (MSB=1).  so, for all of these x[n], the value in the LSB
migrated to the position of the MSB.  that is the beginning of a
bit-reversing process on the indices of x[n].  now when you do the same for
the smaller DFTs as you continue to divide up the input data, the next least
significant bit will move to the next most significant bit position.  so your
input will have to be swapped in a bit-reversed fashion.

if it were software, my suggestion is a Decimation-in-Frequency algorithm for
the forward FFT (normal order in, bit-reversed order out) and the
Decimation-in-Time algorithm for the inverse FFT (bit-reversed order in,
normal order out).

you need a book like O&S (or a website) to look at and see how this shakes
out, but i hope this gets you started on the concept.

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
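the even/odd recursion above is easy to sanity-check in software before
committing anything to VHDL.  a toy model of it (plain Python, purely
illustrative -- a real hardware design would be an iterative, fixed-point
butterfly network with an explicit bit-reversal stage, not a recursion):

```python
# Toy radix-2 DIT FFT, following the even/odd split described above.
# Input length must be a power of 2.
import cmath

def dit_fft(x):
    N = len(x)
    if N == 1:
        return x[:]                      # 1-point DFT is the identity
    even = dit_fft(x[0::2])              # N/2-point DFT of even samples
    odd  = dit_fft(x[1::2])              # N/2-point DFT of odd samples
    X = [0j] * N
    for k in range(N // 2):
        w = cmath.exp(-2j * cmath.pi * k / N)   # twiddle factor
        X[k]          = even[k] + w * odd[k]
        X[k + N // 2] = even[k] - w * odd[k]    # uses exp(-j*pi) = -1
    return X

def direct_dft(x):
    # brute-force O(N^2) DFT, straight from the definition
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in range(N)) for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
assert all(abs(a - b) < 1e-9 for a, b in zip(dit_fft(x), direct_dft(x)))
```

the assert at the bottom checks the recursive result against the brute-force
O(N^2) DFT definition, which is a useful habit to keep when you later compare
the VHDL output against a software reference.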
small screwup (corrected below).  also, here might be a nice online
tutorial:  http://astron.berkeley.edu/~jrg/ngst/fft/fft.html

in article BE9AC1C5.6CD4%rbj@audioimagination.com, robert bristow-johnson at
rbj@audioimagination.com wrote on 05/01/2005 17:26:

> 
>            N/2-1
>     X[k] = SUM{ x[2n] exp(-j*2*pi*n*k/(N/2)) }
>            n=0
> 
>            N/2-1
>          + exp(-j*2*pi*k/N) SUM{ x[2n+1] exp(-j*2*pi*n*k/(N/2)) }
>            n=0
> 
now what we did is turn that N-point DFT into two N/2 point DFTs with an
additional N (or N - 1, since we don't need to multiply by exp(0))
multiplications.  let's see if that saved us anything:
> 
>    single N point DFT:       N*(N-1) = N^2 - N multiplies (and additions)
> 
>    splitting into two DFTs:  2*(N/2)*(N/2 - 1) + (N/2 - 1) multiplies
this should be   2*(N/2)*(N/2 - 1) + (N - 1)

               = (N^2)/2 - 1

comparing:       N^2 - N  >  (N^2)/2 - 1

                 N^2 - 2*N + 2  >  0    (true for all real and positive N)

clearly splitting this into two N/2 point DFTs saved us about half of the
multiplications and saved even more additions.

...

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
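plugging actual numbers into the corrected count makes the saving concrete.
a quick check (illustrative Python):

```python
# Complex-multiply counts: direct N-point DFT vs. one even/odd split,
# using the corrected formulas above.
for N in (8, 64, 1024):
    direct = N * (N - 1)                             # N-1 per output bin
    split = 2 * (N // 2) * (N // 2 - 1) + (N - 1)    # two N/2 DFTs + N-1 twiddles
    assert split == N * N // 2 - 1                   # matches (N^2)/2 - 1
    print(N, direct, split)
```

for N = 1024 that's 1047552 multiplies direct vs. 524287 after a single
split -- about half, before any further splitting.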
Thanks very much for your replies.

I think your explanation is the most understandable I have yet read. Of 
course, I understand it is wise to go back over my university texts (and I 
do hear all over how Oppenheim's book is the best reference to start with) 
to solidify the concepts.

Thanks Robert, you are very kind to go to the trouble to put such a good 
description here. I look forward to getting the FFT going!

Best Regards,
Ben.

"robert bristow-johnson" <rbj@audioimagination.com> wrote in message 
news:BE9B20E3.6CF4%rbj@audioimagination.com...
> small screwup (corrected below).  also, here might be a nice online
> tutorial:  http://astron.berkeley.edu/~jrg/ngst/fft/fft.html
> [snip]
Let me know how this goes, and what platform you plan to implement your
algorithm in after its synthesized version works in VHDL :).

It will most likely be a Xilinx Spartan 3 or 3E since they are cheap and 
plentiful. I will be using external SRAM and a host uC also, but the grunt 
will be done in the FPGA.

-Ben.

"Benry" <henrybg@gmail.com> wrote in message 
news:1115036241.831259.98250@o13g2000cwo.googlegroups.com...
> Let me know how this goes, and what platform you plan to implement your
> algorithm in after its synthesized version works in VHDL :).
I'm currently looking at FFT algorithms that take advantage of
symmetries. A real vector x of length N is called symmetric /
antisymmetric (s/a), if

x(n) = + / - x(N-n),  n=1, 2, ..., N/2-1.

With a little post-processing, any real vector of length N can be
transformed by an N/2-point complex FFT (cFFT) routine (see for example
[1] for an overview). I call such an algorithm an rFFT. In [2], an
algorithm is described which transforms a real s/a vector x of length N
into a real vector y of length N/2. X = rFFT[x] can then be
reconstructed with a little post-processing from Y = rFFT[y]. This
procedure reduces the computational and memory cost of the FFT for s/a
vectors by about a factor of 2.
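For reference, the packing trick from [1] can be sketched in a few lines
(illustrative Python; the direct_dft call stands in for a real N/2-point
cFFT routine, and the function name is mine):

```python
# rFFT sketch: transform a real vector of length N with one N/2-point
# complex transform by packing even samples into the real part and odd
# samples into the imaginary part, then unpacking with symmetry.
import cmath

def direct_dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                for n in range(N)) for k in range(N)]

def rfft_via_half_cfft(x):
    N = len(x)
    M = N // 2
    z = [complex(x[2 * n], x[2 * n + 1]) for n in range(M)]  # pack
    Z = direct_dft(z)              # stand-in for an M-point complex FFT
    X = []
    for k in range(M + 1):
        Zk = Z[k % M]
        Zc = Z[(M - k) % M].conjugate()
        E = (Zk + Zc) / 2          # DFT of the even samples
        O = (Zk - Zc) / 2j         # DFT of the odd samples
        X.append(E + cmath.exp(-2j * cmath.pi * k / N) * O)
    return X                       # bins 0..N/2; the rest follow by symmetry

x = [0.5, 1.0, -2.0, 3.0, 0.0, 1.5, 2.5, -1.0]
full = direct_dft(x)
half = rfft_via_half_cfft(x)
assert all(abs(full[k] - half[k]) < 1e-9 for k in range(len(x) // 2 + 1))
```

Only bins 0..N/2 are computed because the remaining bins of a real input
are their complex conjugates.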

However, it requires multiplication by additional "twiddle factors" of
the form 1/(2 sin(2 pi k/N)), k = 1, 2, ..., N/2-1. For large N, this
could lead to numerical stability problems.
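The size of these factors is easy to see numerically (illustrative Python):

```python
import math

# The extra twiddle 1/(2 sin(2*pi*k/N)) grows roughly like N/(4*pi*k)
# for small k, so for large N the first few factors are huge:
N = 1 << 16
for k in (1, 2, 3):
    print(k, 1.0 / (2.0 * math.sin(2.0 * math.pi * k / N)))
```

For N = 65536 the k=1 factor is roughly 5.2e3, so any rounding error in the
corresponding bin gets amplified by about that amount.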

To remedy this, another algorithm is given in [3] to compute the DFT of
a s/a vector x of length N. Oddly, it transforms x into *two* real s/a
vectors h and g, each of length N/2 (admittedly without the numerically
unstable "twiddle factors" used above), and reconstructs X = rFFT[x]
from H = rFFT[h] and G = rFFT[g]. At first glance, this is equivalent
in computational and memory cost to the DIT FFT procedure described
below by rbj, but with additional pre- and post-processing. It works
only for s/a vectors x.

I don't see the point of such an algorithm - it uses a similar
procedure as the DIT to reduce an N-point DFT into two N/2-point DFTs.
While this step is the heart of the FFT, this new algorithm does not
take advantage of the symmetries, is more complicated than DIT and
works only for s/a vectors. It seems a useless contribution to the FFT
literature.

Am I missing something?

Regards,
Andor


robert bristow-johnson wrote:
>          N-1
>   X[k] = SUM{ x[n] exp(-j*2*pi*n*k/N) }
>          n=0
>
>          N/2-1
>        = SUM{ x[2n] exp(-j*2*pi*n*k/(N/2)) }
>          n=0
>
>                             N/2-1
>        + exp(-j*2*pi*k/N)   SUM{ x[2n+1] exp(-j*2*pi*n*k/(N/2)) }
>                             n=0
[1] H. V. Sorensen et al., "Real-Valued Fast Fourier Transform Algorithms",
    IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-35,
    No. 6, June 1987

[2] L. R. Rabiner, "On the Use of Symmetry in FFT Computation", IEEE Trans.
    on Acoustics, Speech, and Signal Processing, Vol. ASSP-27, No. 3,
    June 1979

[3] C. Lu, R. Tolimieri, "New Algorithms for the FFT Computation of Symmetric
    and Translational Complex Conjugate Sequences", Proc. ICASSP'92,
    pp. 13-16, March 1992
Some more to the Data Base:

Notes on the FFT (Burrus)
  http://www.dsp.rice.edu/research/fft/fftnote.asc
Yahoo FFT directory
  http://dir.yahoo.com/Science/Mathematics/Software/Fast_Fourier_Transform_FFT_/
One-dimensional real fast Fourier transforms
  http://www.hr/josip/DSP/fft.html
FXT package FFT code (Arndt)
  http://www.jjj.de/fxt/
FFT (Don Cross)
  http://www.intersrv.com/~dcross/fft.html
Public domain FFT code
  http://risc1.numis.nwu.edu/ftp/pub/transforms
  http://risc1.numis.nwu.edu/fft/
DFT (Paul Bourke)
  http://www.swin.edu.au/astronomy/pbourke/sigproc/dft/
FFT code for TMS320 processors
  ftp://ftp.ti.com/mirrors/tms320bbs/
Fast Fourier Transforms (Kifowit)
  http://ourworld.compuserve.com/homepages/steve_kifowit/fft.htm
Nielsen's MIXFFT page
  http://home.get2net.dk/jjn/fft.htm
Parallel FFT homepage
  http://www.arc.unm.edu/Workshop/FFT/fft/fft.html
FFT public domain algorithms
  http://www.arc.unm.edu/Workshop/FFT/fft/fft.html
FFT links
  http://momonga.t.u-tokyo.ac.jp/~ooura/fftlinks.html
FFT performance, accuracy and code (Mayer)
  http://www.geocities.com/ResearchTriangle/8869/fft_summary.html
Prime-length FFT
  http://www.dsp.rice.edu/software/RU-FFT/pfft/pfft.html

FFT software websites:

FFTW
  http://www.fftw.org/index.html
  http://www.fftw.org/benchfft/doc/ffts.html
FFTPACK
  http://www.netlib.org/fftpack/
FFT for Pentium (Bernstein)
  ftp://koobera.math.uic.edu/www/djbfft.html

Papers:

"The quick Fourier transform: an FFT based on symmetries",
Haitao Guo, G. A. Sitton, C. S. Burrus,
IEEE Transactions on Signal Processing, Vol. 46, No. 2, Feb. 1998, pp. 335-341

"Automatic generation of prime length FFT programs",
I. W. Selesnick, C. S. Burrus,
IEEE Transactions on Signal Processing, Vol. 44, No. 1, Jan. 1996, pp. 14-24

"FFT algorithms for prime transform sizes and their implementations on VAX,
IBM3090VF, and IBM RS/6000",
C. Lu, J. W. Cooley, R. Tolimieri,
IEEE Transactions on Signal Processing, Vol. 41, No. 2, Feb. 1993, pp. 638-648

"A low-power, high-performance, 1024-point FFT processor",
B. M. Baas,
IEEE Journal of Solid-State Circuits, Vol. 34, No. 3, March 1999, pp. 380-387

--Bhooshan

This message was sent using the Comp.DSP web interface on www.DSPRelated.com
"Andor" <an2or@mailcircuit.com> wrote in message
news:1115192133.262524.265050@z14g2000cwz.googlegroups.com...

> Am I missing something?
! anti_sym.f90
module mykinds
   implicit none
   integer, parameter :: dp = selected_real_kind(15,30)
end module mykinds

module matrices
   use mykinds
   implicit none
contains
   function RF(n)
      integer, intent(in) :: n
      real(dp) RF(0:n-1,0:n-1)
      integer i, j
      real(dp) pi
      pi = 4*atan(1.0_dp)
      do i = 0, n/2
         do j = 0, n-1
            RF(i,j) = cos(2*pi*mod(i*j,n)/n)
         end do
      end do
      do i = 1, (n-1)/2
         do j = 0, n-1
            RF(i+n/2,j) = sin(2*pi*mod(i*j,n)/n)
         end do
      end do
   end function RF

   function X(n)
      integer, intent(in) :: n
      real(dp) X(0:n-1,0:n-1)
      integer i
      X = 0
      X(0,0) = 1
      do i = 1, n/2
         X(i,i) = 1
         X(i,n-i) = 1
      end do
      do i = 1, (n-1)/2
         X(i+n/2,i) = 1
         X(i+n/2,n-i) = -1
      end do
   end function X

   function X_inverse(n)
      integer, intent(in) :: n
      real(dp) X_inverse(0:n-1,0:n-1)
      integer i
      X_inverse = 0
      X_inverse(0,0) = 1
      do i = 1, n/2
         X_inverse(i,i) = X_inverse(i,i)+0.5_dp
         X_inverse(i,n-i) = X_inverse(i,n-i)+0.5_dp
      end do
      do i = 1, (n-1)/2
         X_inverse(i+n/2,i) = 0.5_dp
         X_inverse(i+n/2,n-i) = -0.5_dp
      end do
      X_inverse = transpose(X_inverse)
   end function X_inverse

   subroutine print_matrix(A)
      real(dp), intent(in) :: A(0:,0:)
      character(80) fmt
      write(fmt,'(a,i0,a)') '(',size(A,2),'(f6.3))'
      write(*,fmt) transpose(A)
   end subroutine print_matrix
end module matrices

program anti_sym
   use mykinds
   use matrices
   implicit none
   integer n
   write(*,'(a)',advance='no') ' Enter the matrix order:> '
   read(*,*) n
   write(*,'(a,i0,a)') ' RF(',n,') ='
   call print_matrix(RF(n))
   write(*,'(a,i0,a)') ' X(',n,') ='
   call print_matrix(X(n))
   write(*,'(a,i0,a,i0,a)') ' RF(',n,')*x_inverse(',n,') ='
   call print_matrix(matmul(RF(n),X_inverse(n)))
end program anti_sym

Output:

C:\LF9556\James\fft\anti_sym>anti_sym
 Enter the matrix order:> 7
 RF(7) =
 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 1.000 0.623-0.223-0.901-0.901-0.223 0.623
 1.000-0.223-0.901 0.623 0.623-0.901-0.223
 1.000-0.901 0.623-0.223-0.223 0.623-0.901
 0.000 0.782 0.975 0.434-0.434-0.975-0.782
 0.000 0.975-0.434-0.782 0.782 0.434-0.975
 0.000 0.434-0.782 0.975-0.975 0.782-0.434
 X(7) =
 1.000 0.000 0.000 0.000 0.000 0.000 0.000
 0.000 1.000 0.000 0.000 0.000 0.000 1.000
 0.000 0.000 1.000 0.000 0.000 1.000 0.000
 0.000 0.000 0.000 1.000 1.000 0.000 0.000
 0.000 1.000 0.000 0.000 0.000 0.000-1.000
 0.000 0.000 1.000 0.000 0.000-1.000 0.000
 0.000 0.000 0.000 1.000-1.000 0.000 0.000
 RF(7)*x_inverse(7) =
 1.000 1.000 1.000 1.000 0.000 0.000 0.000
 1.000 0.623-0.223-0.901 0.000 0.000 0.000
 1.000-0.223-0.901 0.623 0.000 0.000 0.000
 1.000-0.901 0.623-0.223 0.000 0.000 0.000
 0.000 0.000 0.000 0.000 0.782 0.975 0.434
 0.000 0.000 0.000 0.000 0.975-0.434-0.782
 0.000 0.000 0.000 0.000 0.434-0.782 0.975

C:\LF9556\James\fft\anti_sym>anti_sym
 Enter the matrix order:> 8
 RF(8) =
 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 1.000 0.707 0.000-0.707-1.000-0.707 0.000 0.707
 1.000 0.000-1.000 0.000 1.000 0.000-1.000 0.000
 1.000-0.707 0.000 0.707-1.000 0.707 0.000-0.707
 1.000-1.000 1.000-1.000 1.000-1.000 1.000-1.000
 0.000 0.707 1.000 0.707 0.000-0.707-1.000-0.707
 0.000 1.000 0.000-1.000 0.000 1.000 0.000-1.000
 0.000 0.707-1.000 0.707 0.000-0.707 1.000-0.707
 X(8) =
 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
 0.000 1.000 0.000 0.000 0.000 0.000 0.000 1.000
 0.000 0.000 1.000 0.000 0.000 0.000 1.000 0.000
 0.000 0.000 0.000 1.000 0.000 1.000 0.000 0.000
 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000
 0.000 1.000 0.000 0.000 0.000 0.000 0.000-1.000
 0.000 0.000 1.000 0.000 0.000 0.000-1.000 0.000
 0.000 0.000 0.000 1.000 0.000-1.000 0.000 0.000
 RF(8)*x_inverse(8) =
 1.000 1.000 1.000 1.000 1.000 0.000 0.000 0.000
 1.000 0.707 0.000-0.707-1.000 0.000 0.000 0.000
 1.000 0.000-1.000 0.000 1.000 0.000 0.000 0.000
 1.000-0.707 0.000 0.707-1.000 0.000 0.000 0.000
 1.000-1.000 1.000-1.000 1.000 0.000 0.000 0.000
 0.000 0.000 0.000 0.000 0.000 0.707 1.000 0.707
 0.000 0.000 0.000 0.000 0.000 1.000 0.000-1.000
 0.000 0.000 0.000 0.000 0.000 0.707-1.000 0.707

Discussion:

This shows that a real-half complex DFT can be decomposed into a block
diagonal matrix times a sparse matrix.
This is, in fact, one approach to FFT algorithms for real data.  Actually,
for symmetric or antisymmetric inputs one would take

   RF(n) = (RF(n)*X(n)) * X_inverse(n)

because computing the action of X_inverse(n) is a no-op for such data.  It is
left as an exercise to the reader to show that RF(n)*X(n) is also block
diagonal.  For symmetric data, one only has to compute the action of the
upper diagonal block, whereas for antisymmetric data, one needs only the
lower diagonal block.

I like to think about the issue in this way because you can see at a glance
that half (in order; actually 3/4 in area) of the original transform matrix
gets blasted away by the symmetry right off the bat.

-- 
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end
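For anyone who would rather poke at this without a Fortran compiler, the
same experiment translates directly to plain Python (illustrative; RF,
X_inverse and the block check mirror the Fortran program above):

```python
# Build RF(n) and X_inverse(n) as in the Fortran program and confirm
# that RF(n)*X_inverse(n) is block diagonal.
import math

def RF(n):
    A = [[0.0] * n for _ in range(n)]
    for i in range(n // 2 + 1):                 # cosine rows 0 .. n/2
        for j in range(n):
            A[i][j] = math.cos(2 * math.pi * (i * j % n) / n)
    for i in range(1, (n - 1) // 2 + 1):        # sine rows n/2+1 .. n-1
        for j in range(n):
            A[i + n // 2][j] = math.sin(2 * math.pi * (i * j % n) / n)
    return A

def X_inverse(n):
    A = [[0.0] * n for _ in range(n)]
    A[0][0] = 1.0
    for i in range(1, n // 2 + 1):
        A[i][i] += 0.5
        A[i][n - i] += 0.5
    for i in range(1, (n - 1) // 2 + 1):
        A[i + n // 2][i] = 0.5
        A[i + n // 2][n - i] = -0.5
    return [list(row) for row in zip(*A)]       # transpose, as in Fortran

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

for n in (7, 8):
    P = matmul(RF(n), X_inverse(n))
    m = n // 2 + 1                   # rows 0..n/2 hold the cosine block
    off = max(abs(P[i][j]) for i in range(n) for j in range(n)
              if (i < m) != (j < m))
    assert off < 1e-12               # off-diagonal blocks vanish
```

Running it for n = 7 and n = 8 reproduces (up to rounding) the block
structure visible in the printed RF(n)*x_inverse(n) matrices above.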