Reply by m.baldasseroni ● February 13, 2007
First of all, thanks for your always "positive" attitude toward my
comments on FFTW, toward my work, and toward a person who is trying to
learn from his mistakes!
However, since October 2006 I have spent my time studying and working
on different aspects of my work, and I have improved my knowledge of
more than just the trite compile process!
I'm guessing that you have not read what I have written during the last
>>I'm guessing that you screwed up the compilation process. (From the
>>sound of things, you haven't even checked whether the Altivec code is
>>actually being used.)
In any case, I got past the step you are writing about 3 months
ago and, I repeat again, I'm sure that Altivec optimization is
To support my hypothesis, I built an fftw library with Altivec
support and started my application with Altivec coprocessor support
disabled through the taskSpawn() call of vxWorks. In
this case the expected error occurs (Altivec unavailable). In
particular the error occurs on the first assembler instruction
"stvx". This fact shows that Altivec optimization is active !
>>We routinely cross-compile FFTW for other platforms using the
>>configure script, and a Google search reveals numerous people using
>>autoconf configure script with vxWorks cross-compilers
In the fftw guide (par 8.3, line 32) I suppose you suggest to set the
various options and compiler characteristics in config.h file!
Do you think that I have read it correctly or not?
Three months ago I followed your instructions and, for example, I
enabled the Altivec optimization (config.h, lines 69-70: /* Define to
enable Altivec optimizations. */ #define HAVE_ALTIVEC).
However, following other instructions that I think it is useless to
repeat, I have now built code with fftw. When I used the script
you suggested to me (using cygwin etc.), I obtained the same
config.h file that I had previously configured by hand.
I underline that I'm spending my time on FFTW because I have to obtain
the same results reported on the fftw.org site, because my work group
needs it, and so I think that I'm making a mistake that I'm not able to
I hope I have been explicit enough and, in any case, I underline that
the main purpose of a forum should not be to voice personal criticism
of how other people approach problems, but to help colleagues and to
discuss, with positive intent, the main topic of the forum!
I hope this can be the first step toward a good dialogue and not toward
personal attacks!
Reply by ● February 13, 2007
On Feb 13, 10:52 am, "m.baldasseroni" <mbaldasser...@progesi.it>
> In any case, I got past the step you are writing about 3 months
> ago and, I repeat again, I'm sure that Altivec optimization is
Try calling the fftwf_print_plan function to see what algorithm FFTW is
using (the codelet names have a "v" in them for vectorized versions).
You can also pass the FFTW_NO_SIMD planner flag, which disables the
SIMD code. If FFTW_NO_SIMD does not slow things down considerably, then
the Altivec code is not being used.
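As a rough aid for the first check, the sketch below scans text captured
from fftwf_print_plan and counts the quoted codelet names that contain a
"v". It is only an illustration: the helper name count_codelets, the
codelet_count struct, and the assumption that codelet names appear inside
double quotes in the printed plan are assumptions made here, not part of
FFTW's API, so compare it against the actual output on your system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: how many quoted codelet names were seen,
   and how many of them contain a 'v' (taken here as "vectorized"). */
typedef struct {
    int total;
    int vectorized;
} codelet_count;

/* Scan plan text (as captured from fftwf_print_plan) for "quoted" names.
   Assumption: each codelet name is enclosed in double quotes. */
static codelet_count count_codelets(const char *plan_text)
{
    codelet_count result = { 0, 0 };
    const char *p = plan_text;

    assert(plan_text != NULL);

    while ((p = strchr(p, '"')) != NULL) {
        const char *start = p + 1;              /* first char of the name */
        const char *end = strchr(start, '"');   /* closing quote */
        if (end == NULL)
            break;                              /* unbalanced quote: stop */

        result.total++;
        if (memchr(start, 'v', (size_t)(end - start)) != NULL)
            result.vectorized++;

        p = end + 1;                            /* continue after the name */
    }
    return result;
}
```

If the vectorized count is zero for every plan you print, that would
suggest the SIMD codelets are not being selected, whatever config.h says.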
Note that you are using an *ancient* version of gcc (2.95). If you
read our FAQ, you'll find that some versions of gcc 2.95 don't compile
FFTW's altivec code correctly. I would strongly recommend upgrading
to gcc 3.4.4 at least. (I believe I recommended upgrading your gcc a
year ago, too.)
Note also that the configure script picks different compiler flags
than you are using, which probably make a 20% difference in speed. And
with older buggy gcc versions, the choice of optimization flags can
also affect correctness.
> In the fftw guide (par 8.3, line 32) I suppose you suggest to set the
> various options and compiler characteristics in config.h file!
This is as a last resort, for people who know what they are doing, in
cases where the configure script is not applicable. It is vastly
preferable to use our configure scripts and Makefiles, which we know
compile the correct files, enable the correct options, and use
carefully selected compiler optimizations.
> I underline that I'm spending my time on FFTW because I have to obtain
> the same results reported on the fftw.org site,
I'm not sure how you expect to get the "same" results as in the graphs
on the FFTW web site. Those PowerPC benchmarks were performed on a G5
(and other people have performed similar benchmarks with older FFTW
versions, see e.g. http://findsabrina.org/altivec/). You are
apparently using a G4, with an unspecified clock speed.
Many things can go wrong with benchmarks. For example, are you sure
it's even producing the correct results? Our provided "bench" program
can perform correctness and speed tests. As another example, if you
are benchmarking by repeatedly forwards FFTing and then backwards
FFTing the same array, this is a diverging process (because FFTW is
unnormalized) and will lead to floating-point exceptions that will
slow things down dramatically (the easy solution is to initialize the
array to zero). Or maybe there is some other bug. There's a limit to
how much other people can debug your code for you remotely, however.
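The divergence described above can be reproduced without FFTW. The
sketch below uses a hand-written unnormalized 4-point DFT as a stand-in
for FFTW (the size is chosen so the twiddle factors are exact), and the
names dft4 and round_trips_until_overflow are hypothetical helpers. Each
forward plus backward pass multiplies the data by N, so a nonzero array
eventually overflows single precision, while an all-zero array stays at
zero.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define N 4  /* transform size; small so the twiddle factors are exact */

typedef struct { float re, im; } cpx;

/* Unnormalized 4-point DFT. sign = -1 is forward, +1 is backward. */
static void dft4(const cpx *in, cpx *out, int sign)
{
    /* cos and sin of 2*pi*m/4 for m = 0..3, written out exactly */
    static const float cos_tab[N] = { 1.0f, 0.0f, -1.0f, 0.0f };
    static const float sin_tab[N] = { 0.0f, 1.0f, 0.0f, -1.0f };

    for (size_t k = 0; k < N; ++k) {
        float re = 0.0f, im = 0.0f;
        for (size_t j = 0; j < N; ++j) {
            size_t m = (j * k) % N;
            float c = cos_tab[m];
            float s = (float)sign * sin_tab[m];
            /* (a + bi) * (c + si) accumulated into the output bin */
            re += in[j].re * c - in[j].im * s;
            im += in[j].re * s + in[j].im * c;
        }
        out[k].re = re;
        out[k].im = im;
    }
}

/* True for ordinary finite values, false for infinities and NaN. */
static int is_finite(float x)
{
    return x >= -FLT_MAX && x <= FLT_MAX;
}

/* Fill a buffer with `fill`, then repeat forward + backward transforms
   with no 1/N scaling. Returns the 1-based round trip on which a
   non-finite value first appears, or -1 if all values stay finite. */
static int round_trips_until_overflow(float fill, int max_trips)
{
    cpx a[N], b[N];

    assert(max_trips >= 0);

    for (size_t j = 0; j < N; ++j) {
        a[j].re = fill;
        a[j].im = 0.0f;
    }

    for (int trip = 1; trip <= max_trips; ++trip) {
        dft4(a, b, -1);  /* forward */
        dft4(b, a, +1);  /* backward: data is now N times larger */
        for (size_t j = 0; j < N; ++j) {
            if (!is_finite(a[j].re) || !is_finite(a[j].im) ||
                !is_finite(b[j].re) || !is_finite(b[j].im))
                return trip;
        }
    }
    return -1;
}
```

With N = 4 and an array of ones, the values grow as 4^k and leave the
single-precision range on round trip 64, whereas a zero-filled array
never does. For a 2048-point transform the factor per round trip is
2048, so the same thing would happen after roughly a dozen round trips.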
In general, I would suggest compiling our provided self-test program,
checking for correctness first, and then using the same program to
check performance. e.g.
./bench -v2 -y 2048 # check correctness
./bench -v2 2048 # check speed
The "-v2" option will cause it to call fftw_print_plan so you can see
what algorithm it chooses.
In general, the less you diverge from our standard configure/
compilation process, the less you will have to debug and the easier it
is to help you. I'm sorry you feel insulted, but if your basic
approach is founded on ignorance, you need to be told (and others
trying to help you need to be aware too that they are dealing with
someone who doesn't know how to install "make" but is trying to hand-
configure, compile, and benchmark a large and complex piece of
software ... it affects the basic assumptions we make in offering you
advice). See also:
Steven G. Johnson
Reply by Ben Jackson ● February 13, 2007
On 2007-02-13, m.baldasseroni <firstname.lastname@example.org> wrote:
> particular the error occurs on the first assembler instruction
> "stvx". This fact shows that Altivec optimization is active !
That particular instruction (IIRC) would only prove that you have EABI
support (i.e. your compiler supports OTHER people using Altivec by doing
full 64-bit register saves). You need to ensure that the actual altivec
instructions are being used.
Ben Jackson AD7GD
> Good morning,
> I'm using FFTW library for my project.
> I'm working on a PowerPC 7447 processor with vxWorks operating system.
> I have built the fftw library with the gcc 2.95 compiler using the
> following options for the object files:
> -O3 -fomit-frame-pointer -fstrict-aliasing -fvec-eabi -mcpu=7450
> I'm working in single precision.
[Sorry for the off-topic post]
Try these CFLAGS:
-O3 -fno-schedule-insns -fvec-eabi -mcpu=7450
FFTW's codelets are generated in a way that minimizes the number of
register spills independently of the number of registers. This
remarkable property depends upon the compiler preserving the order in
which the code is written, however. In gcc, the -O3 flag destroys the
order. The suggested CFLAGS instruct gcc to preserve the original