Technical discussions related to Audio Signal Processing (digital effects, acoustics, noise reduction, musical signal processing, etc).
I'm new(ish) to FFT and to broaden my knowledge/experience I've been comparing FFTW and the Ooura FFT library. I'm getting a result with the latter that I don't quite understand, however. I have been analysing 1024 samples from a WAV file (16-bit, sr 44100) using FFTW's real-to-complex transform and the equivalent rdft from the Ooura library. I get the same output apart from the DC bin. FFTW gives me 22203 0, whereas Ooura gives me 22203 328721. I understand that the DC is just the average of the samples (22203 in this case), but shouldn't the phase necessarily be 0, as in the FFTW result? Why does Ooura give me 328721 for the imaginary part of the DC? Many thanks in advance, Cortez______________________________
On Tue, 30 Sep 2008, r...@hotmail.co.uk wrote: > FFTW gives me 22203 0, whereas Ooura gives me 22203 328721. DC has no phase, so the phase of it is meaningless. -- Grzegorz Kraszewski <k...@teleinfo.pb.edu.pl> http://teleinfo.pb.edu.pl/~krashan______________________________
Cortez- > I'm new(ish) to FFT and to broaden my knowledge/experience I've been > comparing FFTW and the Ooura FFT library. I'm getting a result with the > latter that I don't quite understand, however. I have been analysing 1024 > samples from a WAV file (16-bit, sr 44100) using FFTW's real-to-complex > transform and the equivalent rdft from the Ooura library. I get the same > output apart from the DC bin. FFTW gives me 22203 0, whereas Ooura gives > me 22203 328721. I understand that the DC is just the average of the > samples (22203 in this case), but shouldn't the phase necessarily be 0, > as in the FFTW result? Why does Ooura give me 328721 for the imaginary > part of the DC? If you calculate magnitude for bin 0 you should get identical results for both algorithms, and obviously that's not going to happen, so something is strange. My guesses would be that you fed something less than the FFT length to the test programs, so their default value for zero filling was different, or possibly you are running under Linux and the Gcc tools + optimization options produced executables that produce some anomaly. -Jeff______________________________
> If you calculate magnitude for bin 0 you should get identical results for both > algorithms, and obviously that's not going to happen, so something is strange. My > guesses would be that you fed something less than the FFT length to the test > programs, so their default value for zero filling was different, or possibly you are > running under Linux and the Gcc tools + optimization options produced executables > that produce some anomaly. > > -Jeff > Thanks Jeff. I should perhaps come clean and admit that I'm actually calling rdft from within Common Lisp via an FFI (I do the same with FFTW). I'm on WinXP, but I made both a unix dynamic library (.so) and a .dll using Dev-C++ (it uses gcc). So possibly there was some anomaly in compiling, as you suggest. I've tried both the .so and the .dll, and I get the same result. My data is just the first 1024 samples from the wav file. Cortez______________________________
Cortez- > > If you calculate magnitude for bin 0 you should get identical > results for both > > algorithms, and obviously that's not going to happen, so something > is strange. My > > guesses would be that you fed something less than the FFT length to > the test > > programs, so their default value for zero filling was different, or > possibly you are > > running under Linux and the Gcc tools + optimization options > produced executables > > that produce some anomaly. > > > > -Jeff > > Thanks Jeff. I should perhaps come clean and admit that I'm actually > calling rdft from within Common Lisp via an FFI (I do the same with > FFTW). I'm on WinXP, but I made both a unix dynamic library (.so) and > a .dll using Dev-C++ (it uses gcc). So possibly there was some anomaly > in compiling, as you suggest. I've tried both the .so and the .dll, > and I get the same result. My data is just the first 1024 samples from > the wav file. What if you use CoolEdit or MATLAB or other 'independent' tool? What do they give for the dc bin? Also, how do you know you got the first 1024 samples from the .wav file? What are you using for a .wav reader? If you send me your .wav file I can run it with Hypersignal and give you back real/imag and mag plots. If I had to guess I'd say the Ooura test program somehow doesn't "see" quite the same input data as FFTW. -Jeff PS. Please post to the group and cc me.______________________________
> What if you use CoolEdit or MATLAB or other 'independent' tool? What do they give > for the dc bin? I've got Scilab, but I'll have to go through the fft tutorial first... > Also, how do you know you got the first 1024 samples from the .wav file? What are > you using for a .wav reader? If you send me your .wav file I can run it with > Hypersignal and give you back real/imag and mag plots. I'm using my own Lisp code for reading wav files. I read the samples into a vector, and for this test I'm just selecting the first 1024 values. Just to be on the safe side I then read these into a C array via my FFI code, and pass this as the data to rdft (same with FFTW). The file I'm using can be found at http://www.captain.at/data/theevent-zoom.wav (it's a sonification of a cosmic ray event). I'll send it to you privately. > If I had to guess I'd say the Ooura test program somehow doesn't "see" quite the same > input data as FFTW. > > -Jeff > > PS. Please post to the group and cc me. Apologies Best, Cortez______________________________
Ah, it seems there was a bug in my wav file reading program. The data I was processing contained errors. Apologies, I shall look at it again once I've fixed the (minor) bug. Cortez______________________________
Christopher- > > What if you use CoolEdit or MATLAB or other 'independent' tool? > What do they give > > for the dc bin? > > I've got Scilab, but I'll have to go through the fft tutorial first... > > > Also, how do you know you got the first 1024 samples from the .wav > file? What are > > you using for a .wav reader? If you send me your .wav file I can > run it with > > Hypersignal and give you back real/imag and mag plots. > > I'm using my own Lisp code for reading wav files. I read the samples > into a vector, and for this test I'm just selecting the first 1024 > values. Just to be on the safe side I then read these into a C array > via my FFI code, and pass this as the data to rdft (same with FFTW). > > The file I'm using can be found at > http://www.captain.at/data/theevent-zoom.wav (it's a sonification of a > cosmic ray event). I'll send it to you privately. Got your .wav file. Here are some Hypersignal captures: 1) .wav file in time domain display: http://www.signalogic.com/images/evt_zoom.jpg 2) Real/imag data: http://www.signalogic.com/images/evt_zoom_RI.jpg The above display used 1024 pt FFT, no overlap, no window. Hypersignal is showing values of -433.3 for the dc bin real and zero for dc bin imag. I think you would multiply by 1024 to get the "raw" dc bin value. 3) 2-D Spectrograph: http://www.signalogic.com/images/evt_zoom_2Dspec.jpg The above display used 1024 pt frame, 4096 pt FFT, 75% overlap, Hamming window, and a log-log format to expand the area around your "event". What do you get for dc bin real after your .wav reader bug is fixed? -Jeff______________________________
Much obliged, Jeff. > What do you get for dc bin real after your .wav reader bug is fixed? > > -Jeff > I'm scaling my input between -1 and 1 now, like Matlab/Scilab, and I'm getting precisely the same values for real and imaginary parts as Scilab and FFTW - apart, still, from the imaginary part of DC. So, scaling between -1 and 1 Scilab and FFTW give me an absolute value of 1.3224182 for the real part of the DC bin, and I get the same with the Ooura rdft (I'm calling both rdft and FFTW's real2complex from Lisp). But DC has 0.0317687 for its imaginary part with rdft. I'm beginning to suspect that the parameters I'm passing just aren't correct. The documentation for the Ooura library is quite terse and to be honest I'm not entirely sure I've interpreted it correctly. The exact parameters are - void rdft(int n, int isgn, double *a, int *ip, double *w) n is obviously just the length of the data (a), isgn is the direction of the transform, and ip is the work area for bit reversal (with length >= 2+sqrt(n/2)). Is w a window parameter? Right now I'm passing an array filled with zeros (doubles), of length n/2, assuming that this is equivalent to an unwindowed transform. Grateful for any clarification, Chris______________________________
Chris- > Much obliged, Jeff. > > > What do you get for dc bin real after your .wav reader bug is fixed? > > > > -Jeff > > I'm scaling my input between -1 and 1 now, like Matlab/Scilab, and I'm > getting precisely the same values for real and imaginary parts as > Scilab and FFTW - apart, still, from the imaginary part of DC. So, > scaling between -1 and 1 Scilab and FFTW give me an absolute value of > 1.3224182 for the real part of the DC bin, and I get the same with the > Ooura rdft (I'm calling both rdft and FFTW's real2complex from Lisp). > But DC has 0.0317687 for its imaginary part with rdft. > > I'm beginning to suspect that the parameters I'm passing just aren't > correct. The documentation for the Ooura library is quite terse and to > be honest I'm not entirely sure I've interpreted it correctly. The > exact parameters are - > > void rdft(int n, int isgn, double *a, int *ip, double *w) > > n is obviously just the length of the data (a), isgn is the direction > of the transform, and ip is the work area for bit reversal (with > length >= 2+sqrt(n/2)). Is w a window parameter? Right now I'm passing > an array filled with zeros (doubles), of length n/2, assuming that > this is equivalent to an unwindowed transform. Did you try passing all 1's for *w? If that's actually window data, then "no window" would be a rectangular window which would be all 1's. Also I wouldn't assume that w is symmetric. There could be reasons for not doing that. > Grateful for any clarification, Given your Scilab and FFTW results -- well-documented routines -- I still question your .wav reader. If I just take the first 1024 points of event_zoom.wav, Hypersignal is showing a value of -0.113 for the real dc bin without the 1024 scaling and about -116 with it. Are you sure the reader is handling the .wav file header correctly? Hypersignal is showing a 44 byte header in this case. I can post the 1024 pt plots if you want. -Jeff______________________________
> Did you try passing all 1's for *w?
Yes, it didn't make any difference.
> Given your Scilab and FFTW results -- well-documented routines -- I
still question
> your .wav reader. If I just take the first 1024 points of
event_zoom.wav,
> Hypersignal is showing a value of -0.113 for the real dc bin without
the 1024 scaling
> and about -116 with it. Are you sure the reader is handling the
.wav file header
> correctly? Hypersignal is showing a 44 byte header in this case.
Yes, it's reading everything correctly. For example (warning: Lisp
code ahead!) -
;; Open the wav file and store it in a variable.
;; Returns an instance of the class WAV-FILE, whose
;; slots hold the data (values of the various header
;; fields and the actual samples).
(setf wav (open-wav-file "/theevent-zoom.wav"))
=> #<WAV-FILE @ #x215a0e52>
;; Check the subchunk2 size.
(subchunk2-size wav)
=> 179758
;; Check the size (in bytes) of the original file.
(with-open-file (str "/theevent-zoom.wav")
(file-length str))
=> 179802
(- 179802 179758)
=> 44
Scilab gives me the following for the first sample -
-->wavread('/theevent-zoom.wav', 1)
ans = 0.0051880
My Lisp code gives me -
(aref (samples wav) 0) ; Accessing the first index to the vector.
=> 0.0051880
I've also visualized the data in my own Lisp-based graphical app and
everything seems to check out OK...
I called the FFTW routines with exactly the same data as the Ooura,
and got 0 for the DC imag part. I've just checked it with the C
version of the telafft library (http://www.jjj.de/fft/fftpage.html) -
calling that from Lisp again gives a 0 DC imag part (well actually it
simply omits it and just gives the real part, like Scilab). So there's
obviously some problem with the way I'm calling the Ooura routine, or
setting up the data for it, or perhaps there's an issue with the .dll
I'm using. I'll have another look at, but for now I think I'm going to
stick with the telafft library.
Many thanks for your help and advice.
Best wishes,
Chris
Chris-
> > Did you try passing all 1's for *w?
>
> Yes, it didn't make any difference.
>
> > Given your Scilab and FFTW results -- well-documented routines -- I
> still question
> > your .wav reader. If I just take the first 1024 points of
> event_zoom.wav,
> > Hypersignal is showing a value of -0.113 for the real dc bin without
> the 1024 scaling
> > and about -116 with it. Are you sure the reader is handling the
> .wav file header
> > correctly? Hypersignal is showing a 44 byte header in this case.
>
> Yes, it's reading everything correctly. For example (warning: Lisp
> code ahead!) -
>
> ;; Open the wav file and store it in a variable.
> ;; Returns an instance of the class WAV-FILE, whose
> ;; slots hold the data (values of the various header
> ;; fields and the actual samples).
> (setf wav (open-wav-file "/theevent-zoom.wav"))
> => #<WAV-FILE @ #x215a0e52> ;; Check the subchunk2 size.
> (subchunk2-size wav)
> => 179758
>
> ;; Check the size (in bytes) of the original file.
> (with-open-file (str "/theevent-zoom.wav")
> (file-length str))
> => 179802
>
> (- 179802 179758)
> => 44
>
> Scilab gives me the following for the first sample -
>
> -->wavread('/theevent-zoom.wav', 1)
> ans = 0.0051880
>
> My Lisp code gives me -
>
> (aref (samples wav) 0) ; Accessing the first index to the vector.
> => 0.0051880
Hypersignal gives the same. So the consensus says your .wav reader is fine.
> I've also visualized the data in my own Lisp-based graphical app and
> everything seems to check out OK...
>
> I called the FFTW routines with exactly the same data as the Ooura,
> and got 0 for the DC imag part. I've just checked it with the C
> version of the telafft library (http://www.jjj.de/fft/fftpage.html) -
> calling that from Lisp again gives a 0 DC imag part (well actually it
> simply omits it and just gives the real part, like Scilab). So there's
> obviously some problem with the way I'm calling the Ooura routine, or
> setting up the data for it, or perhaps there's an issue with the .dll
> I'm using. I'll have another look at, but for now I think I'm going to
> stick with the telafft library.
Are all the other Ooura values Ok? It could just be that Ooura "uses"
the dc bin
imag value for some other purpose, for example an overall algorithm statistic
(max
value, avg, etc). My favorite would be max value -- useful for plotting. In
FFT
terms, it's a meaningless "spot" anyway.
-Jeff
______________________________> Are all the other Ooura values Ok? It could just be that Ooura "uses" the dc bin > imag value for some other purpose, for example an overall algorithm statistic (max > value, avg, etc). My favorite would be max value -- useful for plotting. In FFT > terms, it's a meaningless "spot" anyway. > > -Jeff > Thanks again Jeff. I think I've solved it - it's the magnitude of Nyquist. In fact checking the sources again I found this reference to the output of the Ooura real-to-complex, which I must admit at first somewhat mystified me - output data a[2*k] = R[k], 0<=k<n/2 a[2*k+1] = I[k], 0<k<n/2 a[1] = R[n/2] And indeed checking the value at the n/2-th index of the full output it's the same as the DC 'imaginary' part (I was of course discarding the second half of my output so it wasn't showing up previously). The FFTW tutorial mentions that some FFT implementations 'store the Nyquist element where the DC imaginary part would go, in order to make the input and output arrays the same size'. So it seems that that is what Ooura is doing. Many thanks again, and apologies if I wasted any of your time! Chris______________________________
> And indeed checking the value at the n/2-th index of the full output > it's the same as the DC 'imaginary' part (I was of course discarding > the second half of my output so it wasn't showing up previously). Or rather checking that index in Scilab gives me the same value as the Ooura DC 'imag' part. Ooura discards the second (reflected) half of the output, Scilab apparently doesn't. Chris______________________________
Chris- >> Are all the other Ooura values Ok? It could just be that Ooura > "uses" the dc bin >> imag value for some other purpose, for example an overall algorithm > statistic (max >> value, avg, etc). My favorite would be max value -- useful for > plotting. In FFT >> terms, it's a meaningless "spot" anyway. >> >> -Jeff >> Thanks again Jeff. I think I've solved it - it's the magnitude of > Nyquist. In fact checking the sources again I found this reference to > the output of the Ooura real-to-complex, which I must admit at first > somewhat mystified me - > > output data > a[2*k] = R[k], 0<=k<n/2 > a[2*k+1] = I[k], 0<k<n/2 > a[1] = R[n/2] > > And indeed checking the value at the n/2-th index of the full output > it's the same as the DC 'imaginary' part (I was of course discarding > the second half of my output so it wasn't showing up previously). The > FFTW tutorial mentions that some FFT implementations 'store the > Nyquist element where the DC imaginary part would go, in order to make > the input and output arrays the same size'. So it seems that that is > what Ooura is doing. > > Many thanks again, and apologies if I wasted any of your time! No waste of time at all, it's a small mystery solved. What's happening makes sense and is good to keep in mind. The Ooura doc must be really terse. I would have thought it would mention something like this and/or offer config options for different output behavior that matches other algorithms. -Jeff______________________________