Reply by Richard Owlett March 8, 20112011-03-08
posted from comp.dsp
Apologies for "top posting" if required

I'm not an expert in either DSP or math

However I needed to do  FFT's on large one dimensional arrays [~1 
second of audio sampled at 44.1 kHz]. I was restricted at the 
time to use Scilab 4.x . Someone suggested that I use a sample 
length which had a *FEW* (not repeated) mutually prime factors. 
For my *PARTICULAR* circumstances it worked well.

YMMV ;/


Felipe G. Nievinski wrote:
> The documentation for Matlab's fftw interface states that: > > fftw('planner', method) sets the method by which the tuning > algorithm searches for a good FFT algorithm when the dimension of the > FFT is _not_ a power of 2. > > So when the input dimension _is_ a power of 2 there is no potential > speed up in tuning the FFTW library? I seem to be getting different > performance for different tuning even with power-2 input. > > The doc also says: > > MATLAB records the optimal algorithm in an internal data base and > uses it to compute FFTs of the same size throughout the current > session. > > I'm concerned with running fft2. If Matlab remembers only tuning, then > fft2 will always be suboptimum for non-square input. In contrast, if > Matlab remembers all tunings performed in the current session, then > I'd need to perform two tunings: > > fftw('planner', 'measure'); fft(input(:,1)); > fftw('planner', 'measure'); fft(input(1,:)); > > I'm using Matlab R2009b. > > Thanks, > -Felipe.
Reply by Martin Brown March 8, 20112011-03-08
On 08/03/2011 19:29, Albert van der Horst wrote:
> In article<0df27dd3-9223-4fc5-ba5a-e46a3bca05cb@b13g2000prf.googlegroups.com>, > Felipe G. Nievinski<fgnievinski@gmail.com> wrote: >> The documentation for Matlab's fftw interface states that: >> >> fftw('planner', method) sets the method by which the tuning >> algorithm searches for a good FFT algorithm when the dimension of the >> FFT is _not_ a power of 2. >> >> So when the input dimension _is_ a power of 2 there is no potential >> speed up in tuning the FFTW library? I seem to be getting different >> performance for different tuning even with power-2 input.
I'd expect there to be some, but not a great gain over the initial assumption of using the largest radix that is suitable for the length.
> > No. The speed attainable depends on the dimension of the FFT, > and the algorithms differ considerably accordingly. > FFT's for sizes with many small factors (144, 180) can have > a speed advantage over comparably sized powers of 2.
Actually it begs an interesting practical question to list all the FFT lengths on a given platform for which there is a strict advantage in choosing a non-power of two transform when you do have a choice. I reckon there are two interesting cases worth searching for. Larger non-power of two transforms that are faster in absolute timing than the next smallest power of two transform - and these are rare. On my FFT code for instance only 36 as 6^2 beats 32 as 2^5. Non-power of two transforms that are faster than the next highest power of two transform - these are far more common and useful in practice. Listing only the ones that are MIPS wise competitive against optimised powers of two culls the list down to manageable proportions. In my case: 81, 216, 729, 1296, 2560, 7776, 8000 My 8192 transform is a bit weak, or rather 4096 is particularly strong being 16^3, 8^4, 4^6, 2^12 - though usually the largest radix wins out. I expect FFTW will behave better since their optimisations are smarter (and it has many more permutations of kernel available to try). I'd be a little bit surprised if the winner for powers of two was not in most cases predictable by the usual heuristic of taking the largest radix available to get a direct hit or within a factor of 2 or 4 fixup. OTOH funny things sometimes happen with cache management boundaries. If you do benchmark FFTW for all transforms 2^n3^p5^q upto 10000 complex real length and list the ones that meet these two criteria after tuning I'd be interested to know what the pattern looks like. Regards, Martin Brown
Reply by Albert van der Horst March 8, 20112011-03-08
In article <0df27dd3-9223-4fc5-ba5a-e46a3bca05cb@b13g2000prf.googlegroups.com>,
Felipe G. Nievinski <fgnievinski@gmail.com> wrote:
>The documentation for Matlab's fftw interface states that: > > fftw('planner', method) sets the method by which the tuning >algorithm searches for a good FFT algorithm when the dimension of the >FFT is _not_ a power of 2. > >So when the input dimension _is_ a power of 2 there is no potential >speed up in tuning the FFTW library? I seem to be getting different >performance for different tuning even with power-2 input.
No. The speed attainable depends on the dimension of the FFT, and the algorithms differ considerably accordingly. FFT's for sizes with many small factors (144, 180) can have a speed advantage over comparably sized powers of 2. <SNIP>
> >Thanks, >-Felipe.
-- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- being exponential -- ultimately falters. albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply by Felipe G. Nievinski March 7, 20112011-03-07
The documentation for Matlab's fftw interface states that:

    fftw('planner', method) sets the method by which the tuning
algorithm searches for a good FFT algorithm when the dimension of the
FFT is _not_ a power of 2.

So when the input dimension _is_ a power of 2 there is no potential
speed up in tuning the FFTW library? I seem to be getting different
performance for different tuning even with power-2 input.

The doc also says:

     MATLAB records the optimal algorithm in an internal data base and
uses it to compute FFTs of the same size throughout the current
session.

I'm concerned with running fft2. If Matlab remembers only tuning, then
fft2 will always be suboptimum for non-square input. In contrast, if
Matlab remembers all tunings performed in the current session, then
I'd need to perform two tunings:

    fftw('planner', 'measure');  fft(input(:,1));
    fftw('planner', 'measure');  fft(input(1,:));

I'm using Matlab R2009b.

Thanks,
-Felipe.