stevenj@alum.mit.edu writes:
> [...]
> Second, you shouldn't believe these kind of formulas for predicting
> when one "algorithm" is faster than another...they have been total
> garbage for many years.  The reason is that you can't predict the
> performance of an algorithm by simply counting the number of
> arithmetic operations any more; it is dominated by other factors that
> depend on the implementation details.  

Hi Steven,

What other details are you referring to? Caching and pipeline stalls? 
-- 
%  Randy Yates                  % "With time with what you've learned, 
%% Fuquay-Varina, NC            %  they'll kiss the ground you walk 
%%% 919-577-9882                %  upon."
%%%% <yates@ieee.org>           % '21st Century Man', *Time*, ELO
http://home.earthlink.net/~yatescr

Thank you so much for your patience and persistence! I was asking almost
the same questions over and over again and always got answers.
Since we are going to use some hardware-solution, I will not bother again
with that kind of questions - I just had to write a report, why to do so.


>On Feb 15, 9:12 am, "quaste" <thomas.wald...@ctr.at> wrote:
>> I've a question concerning performance tests in MATLAB. Basically I'm
>> trying to get some (M) DFT-values from an size N signal. (actually
N=1024
>> for my application)
>> I tried to find out the maximal M for which I can use goertzel, czt
>> (chirp-z-transformation), ... that's it - I don't know any other
>> 'narrow-band-algorithm', which seem to pay off at all (by the way:
what
>> are for - '... apply some filter ...' -> the algorithm loses).
>> Now comes the point: the fft-function wins every time BY FAR, even if
I
>> try to bother it with some reconfigurations using the fftw-function.
>> EG: Theoretically goertzel should be faster if M < 5/6 log2(N) (M <
8.333
>> in my case) - I've found the formula on the net, but even the time for
>> calculating 1 point exceeds the time fft needs.
>> Man, they really got it optimized. Is there a way to get, lets call
it,
>> more fair results, without coding all the algorithms afoot?
>
>First of all, to really do a fair comparison of Goertzel with Matlab's
>built-in FFT, you almost certainly need to hand-code Goertzel in C or
>some other compiled language.  Otherwise you are probably just
>measuring the overhead of Matlab.
>
>Second, you shouldn't believe these kind of formulas for predicting
>when one "algorithm" is faster than another...they have been total
>garbage for many years.  The reason is that you can't predict the
>performance of an algorithm by simply counting the number of
>arithmetic operations any more; it is dominated by other factors that
>depend on the implementation details.  For example, if you look at
>different FFT implementations, the number of arithmetic operations
>varies by at most 30% or so, while the performance varies by more than
>a factor of 10 (see e.g. www.fftw.org/speed).
>
>Third, CZT will never be faster than an FFT for computing a subset of
>the FFT outputs.  The advantage of CZT comes if you want to
>interpolate a portion of the FFT spectrum more finely, without zero-
>padding to a much larger FFT.
>
>Fourth, you should be aware that Goertzel and similar algorithms are
>substantially less accurate than FFTs, in the face of finite numerical
>precision, for large N.
>
>Fifth, there is another alternative to computing a subset of the FFT
>outputs, a so-called "pruned" FFT, which may sometimes be faster, at
>least for a small number of contiguous outputs.  See www.fftw.org/
>pruned.html
>
>Regards,
>Steven G. Johnson
>
>

On Feb 15, 9:12 am, "quaste" <thomas.wald...@ctr.at> wrote:
> I've a question concerning performance tests in MATLAB. Basically I'm
> trying to get some (M) DFT-values from an size N signal. (actually N=1024
> for my application)
> I tried to find out the maximal M for which I can use goertzel, czt
> (chirp-z-transformation), ... that's it - I don't know any other
> 'narrow-band-algorithm', which seem to pay off at all (by the way: what
> are for - '... apply some filter ...' -> the algorithm loses).
> Now comes the point: the fft-function wins every time BY FAR, even if I
> try to bother it with some reconfigurations using the fftw-function.
> EG: Theoretically goertzel should be faster if M < 5/6 log2(N) (M < 8.333
> in my case) - I've found the formula on the net, but even the time for
> calculating 1 point exceeds the time fft needs.
> Man, they really got it optimized. Is there a way to get, lets call it,
> more fair results, without coding all the algorithms afoot?

First of all, to really do a fair comparison of Goertzel with Matlab's
built-in FFT, you almost certainly need to hand-code Goertzel in C or
some other compiled language.  Otherwise you are probably just
measuring the overhead of Matlab.

Second, you shouldn't believe these kind of formulas for predicting
when one "algorithm" is faster than another...they have been total
garbage for many years.  The reason is that you can't predict the
performance of an algorithm by simply counting the number of
arithmetic operations any more; it is dominated by other factors that
depend on the implementation details.  For example, if you look at
different FFT implementations, the number of arithmetic operations
varies by at most 30% or so, while the performance varies by more than
a factor of 10 (see e.g. www.fftw.org/speed).

Third, CZT will never be faster than an FFT for computing a subset of
the FFT outputs.  The advantage of CZT comes if you want to
interpolate a portion of the FFT spectrum more finely, without zero-
padding to a much larger FFT.

Fourth, you should be aware that Goertzel and similar algorithms are
substantially less accurate than FFTs, in the face of finite numerical
precision, for large N.

Fifth, there is another alternative to computing a subset of the FFT
outputs, a so-called "pruned" FFT, which may sometimes be faster, at
least for a small number of contiguous outputs.  See www.fftw.org/
pruned.html

Regards,
Steven G. Johnson

Hello!

I've a question concerning performance tests in MATLAB. Basically I'm
trying to get some (M) DFT-values from an size N signal. (actually N=1024
for my application)
I tried to find out the maximal M for which I can use goertzel, czt
(chirp-z-transformation), ... that's it - I don't know any other
'narrow-band-algorithm', which seem to pay off at all (by the way: what
are for - '... apply some filter ...' -> the algorithm loses).
Now comes the point: the fft-function wins every time BY FAR, even if I
try to bother it with some reconfigurations using the fftw-function.
EG: Theoretically goertzel should be faster if M < 5/6 log2(N) (M < 8.333
in my case) - I've found the formula on the net, but even the time for
calculating 1 point exceeds the time fft needs.
Man, they really got it optimized. Is there a way to get, lets call it,
more fair results, without coding all the algorithms afoot?

best regards,
Thomas