posted from comp.dsp
Apologies for "top posting" if required
I'm not an expert in either DSP or math
However I needed to do FFT's on large one dimensional arrays [~1
second of audio sampled at 44.1 kHz]. I was restricted at the
time to use Scilab 4.x . Someone suggested that I use a sample
length which had a *FEW* (not repeated) mutually prime factors.
For my *PARTICULAR* circumstances it worked well.
YMMV ;/
Felipe G. Nievinski wrote:

> The documentation for Matlab's fftw interface states that:
>
> fftw('planner', method) sets the method by which the tuning
> algorithm searches for a good FFT algorithm when the dimension of the
> FFT is _not_ a power of 2.
>
> So when the input dimension _is_ a power of 2 there is no potential
> speed up in tuning the FFTW library? I seem to be getting different
> performance for different tuning even with power-2 input.
>
> The doc also says:
>
> MATLAB records the optimal algorithm in an internal data base and
> uses it to compute FFTs of the same size throughout the current
> session.
>
> I'm concerned with running fft2. If Matlab remembers only tuning, then
> fft2 will always be suboptimum for non-square input. In contrast, if
> Matlab remembers all tunings performed in the current session, then
> I'd need to perform two tunings:
>
> fftw('planner', 'measure'); fft(input(:,1));
> fftw('planner', 'measure'); fft(input(1,:));
>
> I'm using Matlab R2009b.
>
> Thanks,
> -Felipe.

Reply by Martin Brown●March 8, 20112011-03-08

On 08/03/2011 19:29, Albert van der Horst wrote:

> In article<0df27dd3-9223-4fc5-ba5a-e46a3bca05cb@b13g2000prf.googlegroups.com>,
> Felipe G. Nievinski<fgnievinski@gmail.com> wrote:
>> The documentation for Matlab's fftw interface states that:
>>
>> fftw('planner', method) sets the method by which the tuning
>> algorithm searches for a good FFT algorithm when the dimension of the
>> FFT is _not_ a power of 2.
>>
>> So when the input dimension _is_ a power of 2 there is no potential
>> speed up in tuning the FFTW library? I seem to be getting different
>> performance for different tuning even with power-2 input.

I'd expect there to be some, but not a great gain over the initial
assumption of using the largest radix that is suitable for the length.

>
> No. The speed attainable depends on the dimension of the FFT,
> and the algorithms differ considerably accordingly.
> FFT's for sizes with many small factors (144, 180) can have
> a speed advantage over comparably sized powers of 2.

Actually it begs an interesting practical question to list all the FFT
lengths on a given platform for which there is a strict advantage in
choosing a non-power of two transform when you do have a choice.
I reckon there are two interesting cases worth searching for.
Larger non-power of two transforms that are faster in absolute timing
than the next smallest power of two transform - and these are rare.
On my FFT code for instance only 36 as 6^2 beats 32 as 2^5.
Non-power of two transforms that are faster than the next highest power
of two transform - these are far more common and useful in practice.
Listing only the ones that are MIPS wise competitive against optimised
powers of two culls the list down to manageable proportions. In my case:
81, 216, 729, 1296, 2560, 7776, 8000
My 8192 transform is a bit weak, or rather 4096 is particularly strong
being 16^3, 8^4, 4^6, 2^12 - though usually the largest radix wins out.
I expect FFTW will behave better since their optimisations are smarter
(and it has many more permutations of kernel available to try).
I'd be a little bit surprised if the winner for powers of two was not in
most cases predictable by the usual heuristic of taking the largest
radix available to get a direct hit or within a factor of 2 or 4 fixup.
OTOH funny things sometimes happen with cache management boundaries.
If you do benchmark FFTW for all transforms 2^n3^p5^q upto 10000 complex
real length and list the ones that meet these two criteria after tuning
I'd be interested to know what the pattern looks like.
Regards,
Martin Brown

Reply by Albert van der Horst●March 8, 20112011-03-08

In article <0df27dd3-9223-4fc5-ba5a-e46a3bca05cb@b13g2000prf.googlegroups.com>,
Felipe G. Nievinski <fgnievinski@gmail.com> wrote:

>The documentation for Matlab's fftw interface states that:
>
> fftw('planner', method) sets the method by which the tuning
>algorithm searches for a good FFT algorithm when the dimension of the
>FFT is _not_ a power of 2.
>
>So when the input dimension _is_ a power of 2 there is no potential
>speed up in tuning the FFTW library? I seem to be getting different
>performance for different tuning even with power-2 input.

No. The speed attainable depends on the dimension of the FFT,
and the algorithms differ considerably accordingly.
FFT's for sizes with many small factors (144, 180) can have
a speed advantage over comparably sized powers of 2.
<SNIP>

>
>Thanks,
>-Felipe.

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Reply by Felipe G. Nievinski●March 7, 20112011-03-07

The documentation for Matlab's fftw interface states that:
fftw('planner', method) sets the method by which the tuning
algorithm searches for a good FFT algorithm when the dimension of the
FFT is _not_ a power of 2.
So when the input dimension _is_ a power of 2 there is no potential
speed up in tuning the FFTW library? I seem to be getting different
performance for different tuning even with power-2 input.
The doc also says:
MATLAB records the optimal algorithm in an internal data base and
uses it to compute FFTs of the same size throughout the current
session.
I'm concerned with running fft2. If Matlab remembers only tuning, then
fft2 will always be suboptimum for non-square input. In contrast, if
Matlab remembers all tunings performed in the current session, then
I'd need to perform two tunings:
fftw('planner', 'measure'); fft(input(:,1));
fftw('planner', 'measure'); fft(input(1,:));
I'm using Matlab R2009b.
Thanks,
-Felipe.