On 10 Mar 2006 05:22:41 -0800, "Srini" <RSVasan1007@gmail.com> wrote:

>When I try to apply fft to sequence lengths which are not powers of 2 -
>I decided to extend the data with zeros to the next higher power of 2
>and then apply fft. Problem is the transform has more coefficients than
>the input. We cannot just discard the excess clearly. (When i try to
>apply the inverse transform, I am too far away from the original
>signals)
>
>How do we deal with it in an Image Processing context? I want to be
>able to process odd sized images.
>
>all advice would be appreciated.
>
>regards
>
>srini

Hi,
  maybe the following will help ya'.

Good Luck,
[-Rick-]
-------------------------------------------
Notes  on  the  FFT

                    C. Sidney Burrus

    Department of Electrical and Computer Engineering
Rice University,  Houston, TX  77251-1892, e-mail: csb@rice.edu

	This is a note describing results on efficient algorithms to
calculate the discrete Fourier transform (DFT). The purpose is to
report work done at Rice University, but other contributions used by
the DSP research group at Rice are also cited.  Perhaps the most
interesting is the discovery that the Cooley-Tukey FFT was described
by Gauss in 1805 [1].  That gives some indication of the age of
research on the topic, and the fact that a recently compiled
bibliography [2] on efficient algorithms contains over 3400 entries
indicates its volume.  An expanded version of this bibliography is
published as a book [2] with the references in a data base on a disk.
Four IEEE Press reprint books contain papers on the FFT [3,4,5,6].

	There are several books [7,8,9,10,11,12,13,14,15] that give a
good modern theoretical background for this work, one book [16] that
gives the basic theory plus both FORTRAN and TMS 320 assembly language
programs, and another book [17] that contains a chapter on advanced
FFT topics.  The history of the FFT is outlined in [18,1] and
excellent survey articles can be found in [19,20].  The foundation of
much of the modern work on efficient algorithms was done by S.
Winograd. This can be found in [21,22,23]. An outline and discussion
of his theorems can be found in [17] as well as [7,8,9,10].

	Efficient FFT algorithms for length-2M were described by Gauss
and discovered in modern times by Cooley and Tukey [24].  These have
been highly developed and good examples of FORTRAN programs can be
found in [16].  Several new algorithms have been published that
require the least known amount of total arithmetic [25,26,27,28 ,29].
Of these, the split-radix FFT [26,27,30,31] seems to have the best
structure for programming, and an efficient program has been written
[32] to implement it. A mixture of decimation-in-time and
decimation-in-frequency with very good efficiency is given in [33 ].
Theoretical bounds on the number of multiplications required for the
FFT based on Winograd's theories are given in [10,34]. Schemes for
calculating an in-place, in-order radix-2 FFT are given in [35,
36,37].  Discussion of various forms of unscramblers is given in
[38,39,40,41,42,43,44,45,46].  A discussion of the relation of the
computer architecture, algorithm and compiler can be found in [47,48].

	The "other" FFT is the prime factor algorithm (PFA) which uses
an index map originally developed by Thomas and by Good.  The theory
of the PFA was derived in [49] and further developed and an efficient
in-order and in-place program given in [50,16]. More results on the
PFA are given in [51,52,37,53,54]. A method has been developed to use
dynamic programming to design optimal FFT programs that minimize the
number of additions and data transfers as well as multiplications
[55].  This new approach designs custom algorithms for a particular
computer architecture.  An efficient and practical development of
Winograd's ideas has given a design method that does not require the
rather difficult Chinese remainder theorem [17,56] for short prime
length FFT's. These ideas have been used to design modules of length
11,13, 17, 19, and 25 [57]. Other methods for designing short DFT's
can be found in [58,59].  A use of these ideas with distributed
arithmetic and table look-up rather than multiplication is given in
[60].  A program that implements the nested Winograd Fourier transform
algorithm (WFTA) is given in [7] but it has not proven as fast or as
versatile as the PFA [50].  An interesting use of the PFA was
announced [61] in searching for large prime numbers.

	These efficient algorithms can not only be used on DFT's but
on other transforms with a similar structure.  They have been applied
to the discrete Hartley transform [62,63] and the discrete cosine
transform [29,64,65].

	The fast Hartley transform has been proposed as a superior
method for real data analysis but that has been shown not to be the
case.  A well-designed real-data FFT [66] is always as good as or
better than a well-designed Hartley transform [62,67,68,69].  The
Bruun algorithm [70,71] also looks promising for real data
applications as does the Rader-Brenner algorithm [72,73,69].  A novel
approach to calculating the inverse DFT is given in [74].

	General length algorithms include [75,76,77].  For lengths
that are not highly composite or prime, the chirp z-transform is a
good candidate [16,78] for longer lengths and an efficient order-N^2
algorithm called the QFT [79,80,81] for shorter lengths. A method
which automatically generates near-optimal prime length Winograd based
programs has been given in [56,82,83,84,85]. Special methods are
available for very long lengths [86,87].

	The use of the FFT to calculate discrete convolution was one
of its earliest uses. Although the more direct rectangular transform
[88] would seem to be more efficient, use of the FFT or PFA is still
probably the fastest method on a general purpose computer or DSP chip
[89,66,67, 90] although the use of distributed arithmetic [60] or
number theoretic transforms [91] with special hardware may be even
faster. Special algorithms for use with the short-time Fourier
transform [92] and for the calculation of a few DFT values [93,94,95]
have also been developed.  An excellent analysis of efficient
programming the FFT on DSP microprocessors is given in [96,48].
Formulations of the DFT in terms of tensor or Kronecker products look
promising for developing algorithms for parallel and vector computer
architectures [97,11,98,99,100, 101,102].

	The study of efficient algorithms not only has a long history
and large bibliography, it is still an exciting research field where
new results are used in practical applications.

References

[1]  M.  T.  Heideman,  D.  H.  Johnson,  and  C.  S.  Burrus, "Gauss
and  the  history  of  the  FFT,"  IEEE Acoustics,  Speech, and
Signal  Processing  Magazine,  vol. 1,  pp. 14-21,  October 1984. also
in IEEE Press FFT Reprints, by P. Duhamel, 1995.

[2]  H. V. Sorensen, C. S. Burrus, and M. T. Heideman, Fast Fourier
Transform Database. Boston:  PWS Publishing, 1995.  Update of
Technical Report TR-8402.

[3]  L. R. Rabiner and C. M. Rader, eds., Digital Signal Processing,
selected reprints.  New York:  IEEE Press, 1972.

[4]  DSP Committee, ed., Digital Signal Processing II, selected
reprints. New York:  IEEE Press, 1979.

[5]  DSP Committee, ed., Programs for Digital Signal Processing.  New
York:  IEEE Press, 1979.

[6]  P. Duhamel, ed., Papers on the Fast Fourier Transform.  New York:
IEEE Press, to appear in 1995.

[7]  J. H. McClellan and C. M. Rader, Number Theory in Digital Signal
Processing. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1979.

[8]  H.  J.  Nussbaumer,  Fast  Fourier  Transform  and  Convolution
Algorithms.   Heidelberg,  Germany: Springer-Verlag, second ed., 1981,
1982.

[9]  R. E. Blahut, Fast Algorithms for Digital Signal Processing.
Reading, MA: Addison-Wesley, Inc., 1984.

[10]  M. T. Heideman, Multiplicative Complexity, Convolution, and the
DFT. Springer-Verlag, 1988.

[11]  R. Tolimieri, M. An, and C. Lu, Algorithms for Discrete Fourier
Transform and Convolution.  New York:  Springer-Verlag, 1989.

[12]  D.  G.  Myers,  Digital  Signal  Processing,  Efficient
Convolution and  Fourier  Transform  Techniques.  Sydney, Australia:
Prentice-Hall, 1990.

[13]  R.  E.  Blahut,  Algebraic  Methods  for  Signal  Processing
and Communications  Coding.   New  York: Springer-Verlag, 1992.

[14]  W. L. Briggs and V. E. Henson, The DFT: An Owner's Manual for
the Discrete Fourier Transform.  Philadelphia:  SIAM, 1995.

[15]  W. W. Smith and J. M. Smith, Handbook of Real-Time Fast Fourier
Transforms.  New York:  IEEE Press, 1995.

[16]  C. S. Burrus and T. W. Parks, DFT/FFT and Convolution
Algorithms. New York:  John Wiley & Sons, 1985.

[17]  J. S. Lim and A. V. Oppenheim, Advanced Topics in Signal
Processing, ch. 4.  Prentice-Hall, 1988.

[18]  J. W. Cooley, "How the FFT gained acceptance," IEEE Signal
Processing Magazine, vol. 9, pp. 10-13, January  1992.  Also
presented at the  ACM  Conference  on  the  History  of  Scientific
and  Numerical Computation, Princeton, NJ, May 1987 and published in:
A History of Scientific Computing, edited by S. G. Nash,
Addison-Wesley, 1990, pp. 133-140.

[19]  P. Duhamel and M. Vetterli, "Fast Fourier transforms: a tutorial
review and a state of the art," Signal Processing, vol. 19, pp.
259-299, April 1990.

[20]  J. W. Cooley, "The structure of FFT algorithms," April 1990.
Notes for a Tutorial at IEEE ICASSP-90.

[21]  S. Winograd, "On computing the discrete Fourier transform,"
Mathematics of Computation, vol. 32, pp. 175-199, January 1978.

[22]  S.  Winograd,  "On  the  multiplicative  complexity  of  the
discrete  Fourier  transform,"  Advances  in Mathematics, vol. 32, pp.
83-117, May 1979.  also in[7].

[23]  S. Winograd, Arithmetic Complexity of Computation. SIAM CBMS-NSF
Series, No. 33, Philadelphia: SIAM, 1980.

[24]  J. W. Cooley and J. W. Tukey, "An algorithm for the machine
calculation of complex Fourier series," Math. Computat., vol. 19, pp.
297-301, 1965.

[25]  R. Yavne, "An economical method for calculating the discrete
Fourier transform," in Proceedings of the Fall Joint Computer
Conference, pp. 115-125, 1968.

[26]  P. Duhamel and H. Hollmann, "Split radix FFT algorithm,"
Electronic Letters, vol. 20, pp. 14-16, January 5 1984.

[27]  P. Duhamel, "Implementation of `split-radix' FFT algorithms for
complex, real, and real-symmetric data," IEEE Trans. on ASSP, vol. 34,
pp. 285-295, April 1986.  A shorter version appeared in the ICASSP-85
Proceedings, p. 20.6, March 1985.

[28]  J.  B.  Martens,  "Recursive  cyclotomic  factorization  -  a
new algorithm  for  calculating  the  discrete Fourier transform,"
IEEE Trans. on ASSP, vol. 32, pp. 750-762, August 1984.

[29]  M.  Vetterli  and  H.  J.  Nussbaumer,  "Simple  FFT  and  DCT
algorithms  with  reduced  number  of operations," Signal Processing,
vol. 6, pp. 267-278, August 1984.

[30]  M. Vetterli and P. Duhamel, "Split-radix algorithms for length -
pm DFT's," IEEE Trans. on ASSP, vol. 37, pp. 57-64, January 1989.
Also, ICASSP-88 Proceedings, pp. 1415-1418, April 1988.

[31]  R. Stasinski, "The techniques of the generalized fast Fourier
transform algorithm," IEEE Transactions on Signal Processing, vol. 39,
pp. 1058-1069, May 1991.

[32]  H.  V.  Sorensen,  M.  T.  Heideman,  and  C.  S.  Burrus,  "On
calculating  the  split-radix  FFT,"  IEEE Transactions on Acoustics,
Speech, and Signal Processing, vol. 34, pp. 152-156, February 1986.

[33]  A. Saidi,  "Decimation-in-time-frequency FFT algorithm,"  in
Proceedings of the IEEE International Conference  on  Acoustics,
Speech, and  Signal  Processing,  (IEEE  ICASSP-94,  Adelaide,
Australia), pp. III:453-456, April 19-22 1994.

[34]  M. T. Heideman and C. S. Burrus, "On the number of
multiplications necessary to compute a length-2n DFT," IEEE
Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp.
91-95, February 1986.

[35]  J. K. Beard, "An in-place, self-reordering FFT," in Proceedings
of the ICASSP-78, (Tulsa), pp. 632- 633, April 1978.

[36]  H. W. Johnson and C. S. Burrus, "An in-place, in-order radix-2
FFT," in ICASSP-84 Proceedings, p. 28A.2, March 1984.

[37]  C. Temperton, "Self-sorting in-place fast Fourier transforms,"
SIAM Journal of Sci. Statist. Comput., vol. 12, no. 4, pp. 808-823,
1991.

[38]  C. S. Burrus, "Unscrambling for fast DFT algorithms," IEEE
Transactions on Acoustics, Speech, and Signal Processing, vol. 36, pp.
1086-1087, July 1988.

[39]  P. R"osel, "Timing of some bit reversal algorithms," Signal
Processing, vol. 18, pp. 425-433, December 1989.

[40]  J. Jeong and W. J. Williams, "A fast recursive bit-reversal
algorithm," in Proceedings of the ICASSP- 90, (Albuquerque, NM), pp.
1511-1514, April 1990.

[41]  D. M. W. Evans, "A second improved digit-reversal permutation
algorithm for fast transforms," IEEE Transactions on Acoustics,
Speech, and Signal Processing, vol. 37, pp. 1288-1291, August 1989.

[42]  J.  J.  Rodriguez,  "An  improved  FFT  digit-reversal
algorithm," IEEE  Transactions  on  Acoustics, Speech, and Signal
Processing, vol. 37, pp. 1298-1300, August 1989.

[43]  J. S. Walker, "A new bit reversal algorithm," IEEE Transactions
on Acoustics,  Speech,  and Signal Processing, vol. 38, pp. 1472-1473,
August 1990.

[44]  A.  A.  Yong,  "A  better  FFT  bit-reversal  algorithm  without
tables,"  IEEE  Transactions  on  Signal Processing, vol. 39, pp.
2365-2367, October 1991.

[45]  D. Sundararajan, M. O. Ahamad, and M. N. S. Swamy, "A fast FFT
bit-reversal algorithm," IEEE Transactions on Circuits and Systems,
II, vol. 41, pp. 701-703, October 1994.

[46]  J. M. Rius and R. De Porrata-Doria, "New FFT bit-reversal
algorithm," IEEE Transactions on Signal Processing, vol. 43, pp.
991-994, April 1995.

[47]  L. R. Morris, Digital Signal Processing Software.  Toronto,
Canada: DSPSW, Inc., 1982, 1983.

[48]  R. Meyer and K. Schwarz, "FFT implementation on DSP-chips,"
Sept. 18 1990.  preprint.

[49]  D. P. Kolba and T. W. Parks, "A prime factor FFT algorithm using
high speed convolution," IEEE Trans. on ASSP, vol. 25, pp. 281-294,
August 1977.  also in[7].

[50]  C. S. Burrus and P. W. Eschenbacher, "An in-place, in-order
prime factor FFT algorithm," IEEE Transactions on Acoustics, Speech,
and Signal Processing, vol. 29, pp. 806-817, August 1981. Reprinted in
it DSP Software, by L.R. Morris, 1983 and IEEE Press FFT Reprints, by
P. Duhamel, 1995.

[51]  C.  Temperton,  "A  self-sorting  in-place  prime  factor
real/half-complex  FFT  algorithm,"  Journal  of Computational
Physics, vol. 75, pp. 199-216, 1988.

[52]  C.  Temperton,  "Nesting  strategies  for  prime  factor  FFT
algorithms,"  Journal  of  Computational Physics, vol. 82, pp.
247-268, June 1989.

[53]  C. Temperton, "A generalized prime factor FFT algorithm for any
n = 2p 3q5r," SIAM Journal of Sci.  Stat. Comp., 1992.  to appear.

[54]  R. Stasinski, "Prime factor DFT algorithms for new small-N DFT
modules," IEE Proceedings, Part G, vol. 134, no. 3, pp. 117-126, 1987.

[55]  H.  W.  Johnson  and  C.  S.  Burrus,  "The  design  of  optimal
DFT  algorithms  using  dynamic programming," IEEE Transactions on
Acoustics, Speech, and Signal Processing, vol. 31, pp. 378-387, April
1983.

[56]  H. W. Johnson and C. S. Burrus, "On the structure of efficient
DFT algorithms," IEEE Transactions on Acoustics, Speech, and Signal
Processing, vol. 33, pp. 248-254, February 1985.

[57]  H. W. Johnson and C. S. Burrus, "Large DFT modules: N = 11, 13,
17, 19, and 25," Tech. Rep. 8105, Department of Electrical
Engineering, Rice University, Houston, TX 77251-1892, 1981.

[58]  C. Temperton, "A new set of minimum-add small-n rotated DFT
modules," Journal of Computational Physics, vol. 75, pp. 190-198,
1988.

[59]  C.  Lu,  J.  W.  Cooley,  and  R.  Tolimieri,  "FFT  algorithms
for prime  transform  sizes  and  their implementations on VAX,
IBM3090VF, and IBM RS/6000," IEEE Transactions on Signal Processing,
vol. 41, pp. 638-648, February 1993.

[60]  S.  Chu  and  C.  S.  Burrus,  "A  prime  factor  FFT  algorithm
using  distributed  arithmetic,"  IEEE Transactions on Acoustics,
Speech, and Signal Processing, vol. 30, pp. 217-227, April 1982.

[61]  K. T. Chen, "A new record: the largest known prime number," IEEE
Spectrum, vol. 27, p. 47, February 1990.

[62]  H. V. Sorensen, D. L. Jones, C. S. Burrus, and M. T. Heideman,
"On computing the discrete Hartley transform," IEEE Transactions on
Acoustics, Speech, and Signal Processing, vol. 33, pp. 1231-1238,
October 1985.

[63]  R. N. Bracewell, The Hartley Transform.  Oxford Press, 1986.

[64]  F.  M.  Wang  and  P.  Yip,  "Fast  prime  factor  decomposition
algorithms  for  a  family  of  discrete trigonometric transforms,"
Circuits, Systems, and Signal Processing, vol. 8, no. 4, pp. 401-419,
1989.

[65]  K. R. Rao and P. Yip, Discrete Cosine Transform:  Algorithms,
Advantages, Applications. San Diego, CA: Academic Press, 1990.

[66]  H. V. Sorensen, D. L. Jones, M. T. Heideman, and C. S. Burrus,
"Real valued fast Fourier transform algorithms,"  IEEE  Transactions
on Acoustics,  Speech,  and  Signal  Processing,  vol. 35,  pp.
849-863, June 1987.  also in IEEE Press FFT Reprints, by P. Duhamel,
1995.

[67]  P. Duhamel and M. Vetterli, "Improved Fourier and Hartley
transfrom algorithms, application to cyclic convolution of real data,"
IEEE Trans. on ASSP, vol. 35, pp. 818-824, June 1987.

[68]  M. Popovic and D. Sevic, "A new look at the comparison of the
fast Hartley and Fourier transforms," IEEE Transactions on Signal
Processing, vol. 42, pp. 2178-2182, August 1994.

[69]  P.  R.  Uniyal,  "Transforming  real-valued  sequences:   fast
Fourier  versus  fast  Hartley  transform algorithms," IEEE
Transactions on Signal Processing, vol. 42, pp. 3249-3254, November
1994.

[70]  G. Bruun,  "Z-transform DFT filters and FFTs,"  IEEE
Transactions on  ASSP, vol. 26,  pp. 56-63, February 1978.

[71]  R. Storn, "On the Bruun algorithm and its inverse," Frequenz,
vol. 46, pp. 110-116, 1992.  a German journal.

[72]  C. M. Rader and N. M. Brenner, "A new principle for fast Fourier
transformation," IEEE Transactions on Acoustics, Speech, and Signal
Processing, vol. ASSP-24, pp. 264-266, June 1976.

[73]  K.  M.  Cho  and  G.  C.  Temes,  "Real-factor  FFT
algorithms," in  Proceedings  of  IEEE  ICASSP-78, (Tulsa, OK), pp.
634-637, April 1978.

[74]  P. Duhamel, B. Piron, and J. M. Etcheto, "On computing the
inverse DFT," IEEE Transactions on ASSP, vol. 36, pp. 285-286,
February 1978.

[75]  R. Singleton, "An algorithm for computing the mixed radix fast
Fourier transform," IEEE Transactions on Audio and Electroacoustics,
vol. AU-17, pp. 93-103, June 1969.

[76]  J. A. Glassman, "A generalization of the fast Fourier
transform," IEEE Transactions on Computers, vol. C-19, pp. 105-116,
Feburary 1970.

[77]  W. E. Ferguson, Jr., "A simple derivation of Glassman general-n
fast Fourier transform," Comput. and Math. with Appls., vol. 8, no. 6,
pp. 401-411, 1982.  Also, in Report AD-A083811, NTIS, Dec. 1979.

[78]  L. Rabiner, R. Schafer, and C. Rader, "The chirp z-transform
algorithm," IEEE Transactions on Audio Electroacoustics, vol. AU-17,
pp. 86-92, June 1969.

[79]  G. A. Sitton, "The QFT algorithm," unpublished paper, 1985.

[80]  H. Guo, G. A. Sitton, and C. S. Burrus, "The quick discrete
Fourier transform," in Proceedings of the IEEE International
Conference on Acoustics, Speech, and Signal Processing, (IEEE
ICASSP-94, Adelaide, Australia), pp. III:445-448, April 19-22 1994.

[81]  H. Guo, G. A. Sitton, and C. S. Burrus, "The quick Fourier
transform, an FFT based on symmetries," IEEE Transactions on Signal
Processing, submitted Oct. 1994.

[82]  I.  W.  Selesnick  and  C.  S.  Burrus,  "Automating  the
design of  prime  length  FFT  programs,"  in Proceedings of the IEEE
International Symposium on Circuits and Systems, (ISCAS-92, San Diego,
CA), pp. 133-136, May 1992.

[83]  I.  W.  Selesnick  and  C.  S.  Burrus,   "Multidimensional
mapping techniques  for  convolution," in  Proceedings  of  the  IEEE
International  Conference  on  Signal  Processing,  (IEEE  ICASSP-93,
Minneapolis), pp. III-288-291, April 1993.

[84]  I.  W.  Selesnick  and  C.  S.  Burrus,  "Automatic  generation
of prime  length  FFT  programs,"  IEEE Transactions on Signal
Processing, to appear. 1995.

[85]  I.  W.  Selesnick  and  C.  S.  Burrus,  "Extending  Winograd's
small  convolution  algorithm  to  longer lengths," in Proceedings of
the IEEE International Symposium on Circuits and Systems, (IEEE ISCAS-
94, London), pp. 2.449-452, June 30 1994.

[86]  W.  K.  Hocking,  "Performing  Fourier  transforms  on
extremely long  data  streams,"  Computers  in Physics, vol. 3, pp.
59-65, January 1989.

[87]  R. Stasinski, "Extending sizes of effective convolution
algorithms," Electronics Letters, vol. 26, no. 19, pp. 1602-1604,
1990.

[88]  R. C. Agarwal and J. W. Cooley, "New algorithms for digital
convolution," IEEE Trans. on ASSP, vol. 25, pp. 392-410, October 1977.

[89]  I. Pitas and C. S. Burrus, "Time and error analysis of digital
convolution by rectangular transforms," Signal Processing, vol. 5, pp.
153-162, March 1983.

[90]  R. Meyer, R. Reng, and K. Schwarz, "Convolution algorithms on
DSP processors," in Proceedings of the ICASSP-91, (Toronto, Canada),
pp. 2193-2196, May 1991.

[91]  R. C. Agarwal and C. S. Burrus, "Number theoretic transforms to
implement fast digital convolution," Proceedings of the IEEE, vol. 63,
pp. 550-560, April 1975. also in IEEE Press DSP Reprints II, 1979.

[92]  H. V. Sorensen and C. S. Burrus, "Efficient computation of the
short-time FFT," in Proceedings of the IEEE International Conference
on Acoustics, Speech, and Signal Processing, (New York), pp. 1894-
1897, April 1988.

[93]  H.  V.  Sorensen,  C.  S.  Burrus,  and  D.  L.  Jones,  "A  new
efficient  algorithm  for  computing  a  few DFT points," in
Proceedings of the IEEE International Symposium on Circuits and
Systems, (Espoo, Finland), pp. 1915-1918, June 1988.

[94]  C. Roche, "A split-radix partial input/output fast Fourier
transform algorithm," IEEE Transactions on Signal Processing, vol. 40,
pp. 1273-1276, May 1992.

[95]  H. V. Sorensen and C. S. Burrus, "Efficient computation of the
DFT with only a subset of input or output points," IEEE Transactions
on Signal Processing, vol. 41, pp. 1184-1200, March 1993.

[96]  R. Meyer and K. Schwarz, "FFT implementation on DSP-chips _
theory and practice," in Proceedings of the ICASSP-90, (Albuquerque,
NM), pp. 1503-1506, April 1990.

[97]  H. V. Sorensen, C. A. Katz, and C. S. Burrus, "Efficient FFT
algorithms for DSP processors using tensor product decomposition,"  in
Proceedings  of  the  IEEE  International  Conference  on  Acoustics,
Speech, and Signal Processing, (Albuquerque, NM), pp. 1507-1510, April
1990.

[98]  J. Johnson, R. W. Johnson, D. Rodriguez, and R. Tolimieri, "A
methodology for designing, modifying, and implementing Fourier
transform algorithms on various architectures," Circuits, Systems and
Signal Processing, vol. 9, no. 4, pp. 449-500, 1990.

[99]  C. Van Loan, Matrix Frameworks for the Fast Fourier Transform.
Philadelphia, PA: SIAM, 1992.

[100]  R. Tolimieri, M. An, and C. Lu, Mathematics of Multidimensional
Fourier Transform Algorithms. New York:  Springer-Verlag, 1993.