Reply by Tel July 5, 2009
>Tran-Thong, "Algebraic Formulation of the Fast Fourier Transform,"
>IEEE Circuits and Systems magazine, vol. 3, no. 2, June 1981, pp.
>9-19.
>
>It shows all 16 basic types of FFT: in-place, same input, same output,
>isogeometric, bit reversed/seq. out, seq. in/bit reversed out, DIT and
>DIF.  But there are other FFTs that don't fit neatly into the
>categories shown in the above, such as Figs. 10-3 to 10-8 in:
>
>L. R. Rabiner, B. Gold, Theory and Application of Digital Signal
>Processing.  Englewood Cliffs, NJ, Prentice-Hall, 1975.
Thanks for the advice, Kevin; I was looking for suggestions for good DSP reference books. Terry.
Reply by kevin July 5, 2009
On Jul 4, 2:51 pm, "Tel" <terry.bent...@mbda-systems.com> wrote:

You can split up a DFT into first and second halves and add the
results.  Using the first half of inputs, you compute N frequency
outputs (k = 0 to N-1).  Then do the same for the second half.  Then,
using the linearity property of the DFT ( DFT(x + y) = DFT(x) +
DFT(y) ), you add the first N results to the second N results.  There are
ways of improving things, but it's still not a very efficient way of
going about it.
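Kevin's description can be checked directly, assuming each half is zero-padded out to the full length N so that every sample keeps its original time index before its DFT. A NumPy sketch (editor's illustration; the variable names are mine, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
x = rng.standard_normal(N)

# Zero-pad each half to full length N, keeping each sample at its
# original time index, then take a full-size DFT of each half.
first = np.concatenate([x[:N // 2], np.zeros(N // 2)])
second = np.concatenate([np.zeros(N // 2), x[N // 2:]])

# By linearity, DFT(first) + DFT(second) == DFT(x).
X_sum = np.fft.fft(first) + np.fft.fft(second)
X_ref = np.fft.fft(x)
print(np.allclose(X_sum, X_ref))  # True
```

Note that this still costs two full-size transforms, which is exactly why it is not an efficient way of going about it.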

For large N, the computational advantage is with the FFT, and, as
Steven Johnson pointed out, FFT algorithms mix upper and lower halves
of the inputs before doing a half size FFT.  And if you ever find one
that doesn't, please let me know, because I've never seen one either.

There are ways of computing FFTs by overlapping data input and
calculations, and they form an area of study known as pipelined
processors.  They can be blazing fast.  But before looking into
something like that, you probably need to become a lot more familiar
with the many kinds of FFT algorithms (most pipelined architectures
depend on the FFT type).  I would suggest:

Tran-Thong, "Algebraic Formulation of the Fast Fourier Transform,"
IEEE Circuits and Systems magazine, vol. 3, no. 2, June 1981, pp.
9-19.

It shows all 16 basic types of FFT: in-place, same input, same output,
isogeometric, bit reversed/seq. out, seq. in/bit reversed out, DIT and
DIF.  But there are other FFTs that don't fit neatly into the
categories shown in the above, such as Figs. 10-3 to 10-8 in:

L. R. Rabiner, B. Gold, Theory and Application of Digital Signal
Processing.  Englewood Cliffs, NJ, Prentice-Hall, 1975.

I would strongly suggest that you read or obtain a copy of that book
(used copy on ebay, or visit university libraries until you find it).
Despite being old and out of print, it still has one of the best
introductions to pipelined FFT processors (Chapter 10 in the book).
Some of the technology related stuff is dated, but the architectures
are still valid.  And the commutation operations seen throughout
almost all FFT pipelines can be interpreted as bit reversal or corner
turn buffers.  And they can just as easily be implemented with RAM,
rather than shift cells, as is usually seen in the older literature.
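The bit-reversed addressing behind those buffers is straightforward to generate; a small sketch (editor's illustration) of the permutation for an N = 8 stage:

```python
def bit_reverse_indices(n_bits):
    """Return the bit-reversed permutation of 0 .. 2**n_bits - 1."""
    n = 1 << n_bits
    # Reverse the n_bits-wide binary representation of each index.
    return [int(format(i, f'0{n_bits}b')[::-1], 2) for i in range(n)]

# For N = 8, sequential addresses map to bit-reversed order:
print(bit_reverse_indices(3))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

In a RAM-based buffer this list is simply the read (or write) address sequence, and the permutation is its own inverse, so applying it twice restores sequential order.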

Then there's a large number of papers and patents about pipelined FFT
processors.  I wouldn't doubt that there's been 100's of
implementations (maybe many more, since proprietary stuff is not
available to just anyone).

But maybe you don't need super fast and efficient architectures.  If a
simple, direct form of the DFT can do the job, then that may be all
you need.
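For reference, the simple direct form of the DFT is only a few lines; an O(N^2) sketch in NumPy (editor's illustration, not code from the thread):

```python
import numpy as np

def direct_dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2j*pi*n*k/N)."""
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # full DFT matrix
    return W @ x

x = np.random.default_rng(1).standard_normal(16)
print(np.allclose(direct_dft(x), np.fft.fft(x)))  # True
```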

Kevin McGee

P.S. If you draw a line from the lowest left input up to the top right
output of a seq. in, bit reversed out DIF FFT graph, then all the
butterflies intersected by the line and those to its upper left will
not contain any multiplications.  But if you're interested in reducing
the number of multipliers (which is different than reducing the number
of multiplications), then higher radix pipelines (or modified radix 2
pipelines) are the way to go.  For instance, an N = 64 pipeline
requires only 1 full multiplier, and 2 multipliers to do the +/- .707
twiddles.  And the latter can be implemented with a couple of adders.
And you could do a 4096 with just 3 full multipliers.
Reply by Tel July 4, 2009
>Unfortunately, I'm not aware of any FFT algorithm that does what you
>want: all of the ones I know of require that the consecutive halves of
>the data be mixed in some way before performing the two half-size
>FFTs.
>
>Regards,
>Steven G. Johnson
Thanks for your input, Steven. I think you're probably correct, but having done a little more reading, I think the only hope is to use a modified decimation-in-time FFT, with input in normal order and output in bit-reversed order. Oppenheim indicates that all butterfly coefficients on the top-half inputs are W_N^0 (i.e., unity), which might mean avoiding any multiplication. I will look into this further. Terry.
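The structure Terry refers to is the standard DIT recombination: the twiddle multiplications do come after the two half-size FFTs, but the split is into even- and odd-indexed samples rather than consecutive halves, which is what conflicts with the streaming requirement. A NumPy sketch (editor's illustration):

```python
import numpy as np

N = 128
x = np.random.default_rng(2).standard_normal(N)

# DIT split: even- and odd-indexed samples (interleaved, not halves).
E = np.fft.fft(x[0::2])  # N/2-point FFT of even-indexed samples
O = np.fft.fft(x[1::2])  # N/2-point FFT of odd-indexed samples

# Twiddle factors applied AFTER the half-size FFTs.
k = np.arange(N // 2)
W = np.exp(-2j * np.pi * k / N)
X = np.concatenate([E + W * O, E - W * O])

print(np.allclose(X, np.fft.fft(x)))  # True
```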
Reply by Steven G. Johnson July 3, 2009
On Jul 3, 10:39 am, "Tel" <terry.bent...@mbda-systems.com> wrote:
> For this method to work it is necessary to apply the required complex
> additions, subtractions and multiplications AFTER the FFTs, rather than
> before, which the conventional DIF method uses.
>
> I had assumed that this was possible because the FFT is linear, and so any
> additions/subtractions and multiplications applied to inputs to the FFT
> should be allowed to be applied after the FFT outputs.
The steps of FFT algorithms, being linear, can be viewed as
matrix-vector multiplications.  So, your assumption is equivalent to
saying: "matrix-vector multiplication is linear, so ABx should equal
BAx for any matrices A and B and any vector x."  It's not true because
linearity does not imply commutability, and matrices in general do not
commute (AB != BA).

Unfortunately, I'm not aware of any FFT algorithm that does what you
want: all of the ones I know of require that the consecutive halves of
the data be mixed in some way before performing the two half-size
FFTs.

Regards,
Steven G. Johnson
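Steven's point that linearity does not imply commutability is easy to check numerically; a two-line counterexample (editor's illustration, with arbitrary matrices rather than FFT factors):

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [1.0, 1.0]])
x = np.array([1.0, 2.0])

# Both A @ (B @ x) and B @ (A @ x) are linear in x, yet they differ,
# because A and B do not commute (A @ B != B @ A).
print(A @ (B @ x))  # [4. 3.]
print(B @ (A @ x))  # [3. 5.]
```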
Reply by Tel July 3, 2009
I have followed a previous post where a decimation in time (DIT) method had
been used to combine two half-size FFTs into a full size FFT.  I tried to
adapt this method by using a Decimation In Frequency (DIF) approach.  The
reason behind this approach is that it is for an FPGA target, and samples
are received over consecutive acquisition periods.  If a half-size FFT can
be performed when half the samples have arrived, and a second one when the
full set of samples arrives, it would be possible to halve the total
processing time (compared to waiting for all the samples to arrive and
performing a full size FFT).

For this method to work it is necessary to apply the required complex
additions, subtractions and multiplications AFTER the FFTs, rather than
before, which the conventional DIF method uses.

I had assumed that this was possible because the FFT is linear, and so any
additions/subtractions and multiplications applied to inputs to the FFT
should be allowed to be applied after the FFT outputs.

Unfortunately the method did not appear to work correctly in a Matlab
model.  The even FFT bins come out correctly, but there are discrepancies
with the odd bin values.  I am assuming this is related to the
multiplication by the twiddle factor.

If anyone has any ideas on the feasibility of this approach, and how my
implementation might be corrected, it would be greatly appreciated.

I have attached the Matlab model below.

Thanks,

Terry.

%Reference FFT data, 128 points.
% Note: 'clear all' command must be issued in Matlab on multiple runs of
% this file.
x = randn(128,1);
X = fft(x);

%Decimation In Frequency (DIF) FFT performed using PRE-processed data 
%(before half-length FFT) 
xe = x(1:1:64) + x(65:1:128);
Xe = fft(xe);
xo = exp(-j*2*pi*[0:63].'/128) .*(x(1:1:64) - x(65:1:128));
Xo = fft(xo);
Xdif(1:2:127) = Xe;
Xdif(2:2:128) = Xo;
Xdif = Xdif.';  % use .' (non-conjugate); ' would conjugate the complex bins
% Compare Magnitude of Pre-Processed result against full-size FFT
max(abs(Xdif)-abs(X))

%Decimation In Frequency (DIF) FFT performed using POST-processed data
%(after half-length FFT) 
xlo = x(1:1:64);
xhi = x(65:1:128);
Xlo = fft(xlo);
Xhi = fft(xhi);
% Add FFT bins to calculate Even bins.
Xev = Xlo + Xhi;
% Multiply difference of FFT bins by Twiddle Factor to calculate Odd bins.
Xod = exp(-j*2*pi*[0:63].'/128) .* (Xlo - Xhi);
Xdifpop(1:2:127) = Xev;
Xdifpop(2:2:128) = Xod;
Xdifpop = Xdifpop.';  % .' avoids the complex conjugation that ' applies
% Compare Magnitude of Post-Processed result against full-size FFT
% Note: Comparison of Real + Imaginary data gave non-zero differences
% due to phase reversals.
max(abs(Xdifpop) - abs(X))
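Translating both variants of the Matlab model into NumPy makes the failure mode concrete: the pre-processed (true DIF) version matches the full-size FFT in every bin, while the post-processed version matches only in the even bins. A sketch of the check (editor's illustration; variable names are mine):

```python
import numpy as np

N = 128
x = np.random.default_rng(3).standard_normal(N)
X = np.fft.fft(x)
W = np.exp(-2j * np.pi * np.arange(N // 2) / N)  # twiddle factors

xlo, xhi = x[:N // 2], x[N // 2:]

# True DIF: twiddle applied BEFORE the half-size FFT -> all bins correct.
even_pre = np.fft.fft(xlo + xhi)
odd_pre = np.fft.fft(W * (xlo - xhi))
print(np.allclose(even_pre, X[0::2]), np.allclose(odd_pre, X[1::2]))

# Post-processed: twiddle applied AFTER the half-size FFTs.
even_post = np.fft.fft(xlo) + np.fft.fft(xhi)   # correct, by linearity
odd_post = W * (np.fft.fft(xlo) - np.fft.fft(xhi))  # NOT correct
print(np.allclose(even_post, X[0::2]), np.allclose(odd_post, X[1::2]))
```

The odd bins fail because the twiddle multiplication is pointwise in the time domain; after the FFT it becomes a circular convolution in frequency rather than a pointwise multiplication, so it cannot simply be moved past the half-size FFTs.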