stevenj@alum.mit.edu wrote in news:1140208214.201862.187120
@g14g2000cwa.googlegroups.com:
> We have benchmarks comparing many different FFT codes at
> www.fftw.org/speed
I did have a look at it.
>
> Be sure you are using FFTW properly (if you re-create the plan for each
> transform you will slow it down tremendously). See the FAQ:
> http://www.fftw.org/faq/section3.html#slow
My plans are created once for each type of array before the start of the
simulation proper. So, this is not the problem.
>
> (Note that our DCT code currently doesn't exploit SSE, unfortunately,
> so for that particular problem Intel may beat us by a large margin if
> they do. For such small DCTs you can speed FFTW up tremendously,
> however, by generating size-specific DCT codelets; see the "generating
> your own code" section of the manual.)
>
> Note also that for such small transforms working out-of-place might be
> faster. You should also definitely use FFTW's configure script to
> compile FFTW with gcc (e.g. via MinGW), since it turns out that the
> proper choice of compiler flags makes a big difference and the best
> choice is somewhat counter-intuitive.
This I will try.
Thank you very much. FFTW is still a great set of transforms. I am amazed
that, even as the code stands, I can run DNS simulations on a home PC
that I used to run on a Cray X-MP in the early 90s.
--
>
> Best of luck,
> Steven G. Johnson
>
>
Reply by ●February 17, 2006
We have benchmarks comparing many different FFT codes at
www.fftw.org/speed
Be sure you are using FFTW properly (if you re-create the plan for each
transform you will slow it down tremendously). See the FAQ:
http://www.fftw.org/faq/section3.html#slow
(Note that our DCT code currently doesn't exploit SSE, unfortunately,
so for that particular problem Intel may beat us by a large margin if
they do. For such small DCTs you can speed FFTW up tremendously,
however, by generating size-specific DCT codelets; see the "generating
your own code" section of the manual.)
Note also that for such small transforms working out-of-place might be
faster. You should also definitely use FFTW's configure script to
compile FFTW with gcc (e.g. via MinGW), since it turns out that the
proper choice of compiler flags makes a big difference and the best
choice is somewhat counter-intuitive.
Best of luck,
Steven G. Johnson
Reply by ##●February 17, 2006
Hello,
I am using FFTW in a fluid simulation code for doing 1D and 2D FFTs and
the cosine transforms associated with Chebyshev differentiation. All this
is done on a P4 PC with WinXP. The code is written in C using the Dev-C++
IDE, i.e. the compiler is gcc.
To my surprise, it turns out that the simulation run times are dominated
by the FFTs (from past experience, I expected the matrix inversion in the
elliptic part of the code to dominate CPU usage). Typical transform sizes
are 128/192/256/384/512 for the 1D/2D FFTs and 129/257 for the cosine
transform, all in double precision and in place.
So, I am on the lookout for FFTs that run faster on the P4 than FFTW. I am
aware of the Intel offering, so it is clearly possible, but since I am
doing this on my home PC in my spare time I would rather use freeware.
So, has anyone heard of or used freeware FFTs that run faster than FFTW
specifically on a P4 home PC? Any suggestions would be appreciated.
Thanks.
--