Well, not a filter exactly, just an FFT in-out test.
I've got a simple test program running in a loop (using fftw):
* Read 2048 samples of audio at 48000 Hz (16-bit stereo, 4 bytes per frame)
* convert to complex float (copy audio to real array, zero imaginary array)
* use fftwf (2048 points) to get circular complex spectrum
* use fftwf again to return to time samples (adjust by 1/N)
* convert back to 16 bit integers (copy real only, ignore imaginary)
* send to audio device
- The sound output is crackly and with regular clicks, rather loud.
Louder than one would expect from just fp precision noise errors.
- Tried again using the double precision version of fftw, same results.
- Tried again at N=512 and even 256 to check if it was a fp dynamic range problem, still there.
- I am not doing any filtering: just forward fft, then back.
Arguably, this should amount to multiplying by the identity (unity) matrix, i.e. out=in
(with perhaps a small amount of fp noise)
- Timings are OK: the FFT takes 400 µs and audio frames are about 40 ms. Checked with a scope.
- Remove just the FFT in-out transform bit (but leave the fp conversions), audio sounds OK.
- Tried modifying the software to do a 2:1 overlap-add; gurgling much reduced, but still there.
What may I be doing wrong? Any tests I can do?
This is with ALSA in a linux SBC.
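The integer/float conversion steps in the list above can be checked in isolation. Below is a minimal sketch in Python (standard library only), assuming interleaved 16-bit stereo and no scaling to ±1.0; the helper names `split_stereo` and `join_stereo` are made up for illustration and are not from the original program.

```python
import array

INT16_MIN, INT16_MAX = -32768, 32767

def split_stereo(frames):
    """Split interleaved int16 samples (L, R, L, R, ...) into two float lists."""
    left = [float(s) for s in frames[0::2]]
    right = [float(s) for s in frames[1::2]]
    return left, right

def join_stereo(left, right):
    """Round to the nearest integer, clip to the int16 range, re-interleave."""
    out = array.array("h")
    for l, r in zip(left, right):
        for v in (l, r):
            q = int(round(v))
            out.append(max(INT16_MIN, min(INT16_MAX, q)))
    return out

# Three stereo frames, including both extremes of the 16-bit range.
frames = array.array("h", [0, 1, -32768, 32767, 1234, -1234])
left, right = split_stereo(frames)
restored = join_stereo(left, right)

# Out-of-range floats must be clipped rather than allowed to wrap around.
clipped = join_stereo([40000.0], [-40000.0])

print(restored == frames, list(clipped))
```

If the conversion alone returns the input unchanged, as the "remove just the FFT bit" test in the post suggests, the fault has to be in the transform stage.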
You are doing a forward transform and then an inverse one, yes?
It may be very instructive to calculate X from x with a forward transform, then x' from X with a reverse transform, and then compare x and x' -- they should be identical, or at least differ only by numerical scrud. I'd reduce the problem down to a 32- or 64-point vector for x until I got x and x' identical, and then go back up from there.
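A sketch of that comparison, in Python with a naive O(N^2) DFT standing in for fftw (an assumption made so the example is self-contained). The sign convention and the lack of normalisation mirror what fftw does, hence the explicit 1/N.

```python
import cmath
import math

def dft(x, sign):
    """Naive DFT. sign=-1 is the forward transform, sign=+1 the
    unnormalised inverse (the caller divides by N)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

N = 64
# Arbitrary, non-symmetric test pattern so that any reordering shows up.
x = [math.sin(2 * math.pi * 3 * n / N) + 0.25 * n / N for n in range(N)]

X = dft(x, -1)                            # forward: x -> X
x_back = [v / N for v in dft(X, +1)]      # inverse: X -> x', scaled by 1/N

max_err = max(abs(a - b) for a, b in zip(x, x_back))
max_imag = max(abs(v.imag) for v in x_back)
print("max |x - x'| =", max_err, " max |imag(x')| =", max_imag)
```

Both numbers should be at the level of rounding error. Anything larger points at indexing, scaling, or transform direction.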
Yes, quite. My maths says that a forward FFT followed by an inverse FFT should produce the same array (apart from crud, and normalisation) as the two matrix multiplications just end up as the unity matrix. Nothing to do with windowing or what the original looks like.
I think I'll try next with a simulated array (pattern of numbers) see what comes back.
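The matrix argument can be checked numerically as well. This is a small Python sketch that builds the forward and inverse DFT matrices explicitly for N=8 (the size is an arbitrary choice for illustration) and measures how far their product is from the identity.

```python
import cmath
import math

N = 8

# Forward DFT matrix F[j][k] = exp(-2*pi*i*j*k/N)
F = [[cmath.exp(-2j * math.pi * j * k / N) for k in range(N)]
     for j in range(N)]

# Inverse DFT matrix, including the 1/N normalisation
F_inv = [[cmath.exp(+2j * math.pi * j * k / N) / N for k in range(N)]
         for j in range(N)]

# Plain matrix product P = F_inv * F
P = [[sum(F_inv[i][m] * F[m][k] for m in range(N)) for k in range(N)]
     for i in range(N)]

# Largest deviation of any element of P from the identity matrix
worst = max(abs(P[i][k] - (1.0 if i == k else 0.0))
            for i in range(N) for k in range(N))
print("largest deviation from identity:", worst)
```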
An FFT introduces an artificial jump when sampling from a real signal. Have you used any windowing, e.g. Hann/Hamming/Blackman, which reduces this effect?
Well, no. Sampling from a real signal introduces a jump. The FFT itself loses no information, and "edinsam" is perfectly correct that an FFT followed by an IFFT should cough up a result that is theoretically identical, but with some numerical crud added.
Getting to an actual overlap-and-add filter will require more work -- I assume we'll get there in due time, but if he's got problems at this stage, it's best to work them out now, rather than later when things are altogether more complex.
The double FFT operation was returning the input array "time reversed"
I tried printing the output of a 64 point FFT-IFFT as you suggested, and it showed it clearly.
Placing Ar=1.0 in one element and zero in the rest as input returned Ar=1.0 in the mirrored element and the rest zero (or very near)
Similar with other tests, i.e. each Ar <--> its mirrored Ar etc
Odd that this was not noticeable by listening to the sound (apart from the clicks)
I'm glad that things are better. That shouldn't happen with an FFT followed by an IFFT, whereas doing an FFT followed by another FFT will result in a time-reversed signal. Either there's something screwy about the fftw library that "everyone knows" (except for you and me), or you weren't properly telling it to do the IFFT.
(I can't help you with details of the fftw library, unfortunately -- I think I used it once, about ten years ago.)
Put it down to program laziness. fftw requires arrays to be "prepared" so that the FFTs are optimised for the machine being used. In haste, I used only one "prepare" (for forward FFT) so what I was doing was two forward FFTs instead of a forward and an inverse.
Conceptually the same of course, except for the way the reals and imaginaries are returned!!!
BTW fftw is very fast and very easy to use, cannot recommend it enough.
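In fftw the direction is fixed when the plan is created, through the sign argument (FFTW_FORWARD or FFTW_BACKWARD), so reusing a forward plan for the return trip gives two forward transforms. The effect can be reproduced with a naive DFT in Python (a stand-in for fftw, assumed here so the example runs on its own): two forward passes return the input indexed as x[(-n) mod N], while conjugating before and after a forward pass gives a true inverse.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive DFT; sign=-1 forward, sign=+1 unnormalised inverse."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

N = 8
x = [float(n) for n in range(N)]          # ramp 0..7, easy to read back

X = dft(x, -1)

# Two forward transforms (the bug): time-reversed input after 1/N scaling.
twice = [round((v / N).real) for v in dft(X, -1)]
expected_reversed = [int(x[(-n) % N]) for n in range(N)]

# Forward then inverse (the fix): the original ramp.
proper = [round((v / N).real) for v in dft(X, +1)]

# An inverse built from a forward-only routine: conj(DFT(conj(X))) / N.
via_conj = [round((v.conjugate() / N).real)
            for v in dft([c.conjugate() for c in X], -1)]

print("two forward passes:", twice)
print("forward + inverse: ", proper)
print("conjugate trick:   ", via_conj)
```

Element 0 stays put and every other element n swaps with N-n. That explains why a steady tone still sounds roughly right while each block boundary clicks.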
Also, reading your post, are you sampling from an analogue signal? Have you thought of digitising the entire audio, which would create exactly contiguous samples at the boundaries?
From the description, it seems evident that you're using an arbitrary time record.
Here's a good test:
Don't FFT the record (that you would have FFTd) at all. Just play it in an endless loop. How does that sound? What have you done in constructing that record to avoid clicks where the ends come together?
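The loop test can be previewed numerically by comparing the step from the last sample back to the first with the largest step inside the record. This is a hedged Python sketch; the two sine records (exactly 10 cycles and 10.25 cycles in 2048 samples) are invented test data, not the poster's audio.

```python
import math

def wrap_jump(record):
    """Size of the step from the last sample back to the first when looping."""
    return abs(record[0] - record[-1])

def max_step(record):
    """Largest sample-to-sample step inside the record."""
    return max(abs(b - a) for a, b in zip(record, record[1:]))

N = 2048
# A whole number of cycles: the ends join up smoothly when looped.
whole = [math.sin(2 * math.pi * 10 * n / N) for n in range(N)]
# 10.25 cycles: the record ends near +1 and restarts at 0.
partial = [math.sin(2 * math.pi * 10.25 * n / N) for n in range(N)]

for name, rec in (("whole", whole), ("partial", partial)):
    print(name, "wrap jump:", wrap_jump(rec), "max inner step:", max_step(rec))
```

A wrap jump many times larger than any inner step is what is heard as a click once per loop.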
As Tim noted earlier, you should be able to do the forward and inverse transforms and get virtually identical input/output. If not then you need to check your work - look for indexing problems, scaling problems, etc.
Once that's done you've got the easy part. You've still got a long way to go before you can actually do any real frequency domain processing. If you modify the contents of the frequency bins you'll start to create discontinuities in the output and the only way to get around them is to use overlap-save or overlap-add processing.
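As an illustration of the overlap-add route, here is a minimal Python sketch of block filtering in the frequency domain, checked against direct convolution. It assumes a short FIR response `h`, a naive DFT in place of fftw, and a transform length of block length plus filter length minus one so that circular convolution does not wrap; all names and sizes are illustrative.

```python
import cmath
import math

def dft(x, sign):
    """Naive DFT; sign=-1 forward, sign=+1 unnormalised inverse."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

def overlap_add(signal, h, block_len):
    """Filter signal with h one block at a time, adding the overlapping tails."""
    nfft = block_len + len(h) - 1              # room for the filter tail
    H = dft(list(h) + [0.0] * (nfft - len(h)), -1)
    out = [0.0] * (len(signal) + len(h) - 1)
    for start in range(0, len(signal), block_len):
        block = signal[start:start + block_len]
        padded = list(block) + [0.0] * (nfft - len(block))
        Y = [a * b for a, b in zip(dft(padded, -1), H)]   # multiply spectra
        y = [v.real / nfft for v in dft(Y, +1)]           # back to time, 1/N
        for i in range(len(block) + len(h) - 1):
            out[start + i] += y[i]             # tail overlaps the next block
    return out

def direct_convolve(x, h):
    """Reference time-domain convolution."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out

signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(40)]
h = [0.1, 0.2, 0.4, 0.2, 0.1]

ola = overlap_add(signal, h, block_len=16)
ref = direct_convolve(signal, h)
max_diff = max(abs(a - b) for a, b in zip(ola, ref))
print("max difference from direct convolution:", max_diff)
```

Without the zero padding, the tail of each filtered block wraps around into its start, which is one source of the block-edge discontinuities described above.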
Once you've got your overlap processing working then you'll still find that it may be hard to make changes in the frequency domain that propagate correctly in time. To get around that you need to use phase-vocoder techniques - analyzing the phase rates of each FFT bin and preserving that in the output.
Frequency domain audio processing is interesting, but there is a whole closet full of gotchas lurking.
Thanks. This was just an experiment as the original code uses a FIR filter which works quite happily using the same sized blocks. The original uses the standard overlap-add convolution method as the samples are fixed in size. I am aware that a similar overlap technique is required for FFT filtering, but was surprised to find that a simple one to one transform (which is effectively a double matrix multiplication resulting in the unity matrix) is giving me those clicks.
You probably have a bug in your code. The clicks come from big discontinuities in the signal which shouldn't be there if you are just doing a forward and reverse transform with no alterations.
I would isolate the relevant code, print your input values and your output values to make sure they are working correctly.
Any noise introduced by the type conversion should be inaudible.
Hope this helps.
I am sure that's the problem :-)