downsampling -> FFT -> upsampling

Started by Fred T. Weiler April 5, 2005
Hi!

I'm experimenting with FFT and inverse FFT, doing some filtering in real-time.
It sounds good but it's a bit too CPU heavy so I decided to downsample the
signal, apply the FFT and inverse FFT at samplerate/2, and then interpolate
the result to upsample the output back to the original samplerate.

However, I get some strange overtones when I resample the signal. I've tried
different FFT lengths (between 512 and 4096 samples on a 44.1 kHz signal)
and different windowing techniques, but no luck. I've checked the resample
code by applying it on the dry input signal, and it seems to be correct.

Any thoughts on what might be wrong and how this should be done?
I know that the FFT bins will appear on different frequencies when I downsample
the signal and therefore it might sound a bit different, but these are artifacts
which sound awful, like hiss/buzz/noise. For a while I thought that this was
caused by the fact that the window length in fact increases in relation to the
actual signal because the signal is downsampled. So I tried other FFT lengths
to adjust for the reduced number of samples, but no luck.

And I've considered other techniques to reduce the CPU load. For example
I saved the result of the FFT and reused it once or twice to do the inverse FFT, but
the result wasn't too good (slight metallic sound caused by repeating the same
slice). And I've tried different overlap techniques, but they haven't been that useful
and haven't reduced the CPU load that much. Do you have any optimization tricks
to share?

Fred


"Fred T. Weiler" wrote:
> Hi!
>
> I'm experimenting with FFT and inverse FFT, doing some filtering in real-time.
> It sounds good but it's a bit too CPU heavy

So why not filter in the time domain, which in most cases is much less CPU intensive?

Erik
--
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"Microsoft, and other companies with shoddy security, ....."
-- Bruce Schneier, crypto-guru, to a US Senate committee.

> > Hi!
> >
> > I'm experimenting with FFT and inverse FFT, doing some filtering in real-time.
> > It sounds good but it's a bit too CPU heavy
>
> So why not filter in the time domain which in most cases is much less
> CPU intensive.
>
> Erik

I do that too. But I'm implementing a filter which allows the user to switch between an FFT filter and a filter in the time domain.

Fred

Fred T. Weiler wrote:
> Hi!
>
> I'm experimenting with FFT and inverse FFT, doing some filtering in real-time.
> It sounds good but it's a bit too CPU heavy so I decided to downsample the
> signal, apply the FFT and inverse FFT at samplerate/2, and then interpolate
> the result to upsample the output back to the original samplerate.
>
> However, I get some strange overtones when I resample the signal. I've tried
> different FFT lengths (between 512 and 4096 samples on a 44.1 kHz signal)
> and different windowing techniques, but no luck. I've checked the resample
> code by applying it on the dry input signal, and it seems to be correct.

Exactly what do you do? I believe there is a post filter involved after the upsample stage. If you leave that out, you'd get all sorts of artifacts in the data.

Rune

> Exactly what do you do? I believe there is a post filter involved after the
> upsample stage. If you leave that out, you'd get all sorts of artifacts
> in the data.
>
> Rune

Hi Rune,

I take a buffer of size bufsize and:

1. Throw away every second sample to obtain a buffer of size bufsize/2.
2. Perform an FFT on that stream (size: bufsize/2).
3. Adjust the bins and perform the inverse FFT.
4. Add one (linear) interpolated sample making the buffer twice as big again (= bufsize), which is the original size.

Shouldn't it work? It doesn't.

NOTE: It works fine if I do just (2) and (3). But if I add (1) and (4) then it sounds *very* strange. Hard to describe in words, a bit like a comb filter with a metallic touch and I can hear a tendency of echoes.

Here's the code:

    // Downsample.
    //
    pos = 0;
    for(count=0; count<SampleCount; count += 2)
    {
        _arrDs1[pos] = arr1[SampleStart + count];
        pos++;
    }

    // Process the downsampled signal.
    //
    ProcessReplacingDs(_arrDs1, _outDs, 0, pos);

    // Upsample.
    //
    pos = 0;
    for(count=0; count<SampleCount; count += 2)
    {
        output[SampleStart + count] = 0.5*(_lastDs + _outDs[pos]);
        output[SampleStart + count +1] = _outDs[pos];
        _lastDs = _outDs[pos];
        pos++;
    }

//Fred

On Tue, 05 Apr 2005 11:10:59 +0000, Fred T. Weiler wrote:

>> Exactly what do you do? I believe there is a post filter involved after the
>> upsample stage. If you leave that out, you'd get all sorts of artifacts
>> in the data.
>>
>> Rune
>
> Hi Rune,
>
> I take a buffer of size bufsize and:
>
> 1. Throw away every second sample to obtain a buffer of size bufsize/2.

This is badness: you introduce aliasing into your signal right here, unless you first filter it to ensure that there is no content above Fs/4. Straight decimation folds all of the energy from Fs/4 to Fs/2 into 0..Fs/4, so you have to make sure that there's none of the former. [Since this process reverses the harmonic structure, this aliased version of the sound typically sounds really bad.]

> 2. Perform an FFT on that stream (size: bufsize/2).
>
> 3. Adjust the bins and perform the inverse FFT.

Direct tweakage of frequency bins is rarely a good idea, because that corresponds to boosting or cutting whole-buffer sinusoids, in the time domain: you end up with lots of ugly ringing. You don't say how you tweak your frequency bins, but the "proper" way to do it is by multiplying by the frequency-domain version of some reasonable impulse response.

> 4. Add one (linear) interpolated sample making the buffer twice as big
> again (= bufsize), which is the original size.

Linear interpolation won't be helping the output, but it probably won't be adding the metallic ring. It _will_ be causing the output to droop towards high frequencies. You need to do proper sinc interpolation (or something similar) in order to keep the resulting spectrum flat in-band.

> Shouldn't it work? It doesn't.

You also don't mention doing overlap-save, which (in the context of the comment that you made in another post about windowing) makes me think that you're probably trying to do something vocoder-ish, rather than something EQ-ish. The difference is that the vocoder situation is more about dynamic movement, and it's not going to matter too much, long term, how the filtered output sounds: it'll sound bad in hi-fi EQ terms. Getting frequency domain filtering "right", in the low distortion, constrained ringing, no aliasing (i.e., linear shift-invariant processing) sense typically requires more mucking about getting things right than this.

> NOTE: It works fine if I do just (2) and (3). But if I add (1) and (4)
> then it sounds *very* strange. Hard to describe in words, a bit like a
> comb filter with a metallic touch and I can hear a tendency of echoes.

Metallic and echoey is most likely a combination of aliasing and block-boundary circular-convolution errors. If you're lucky, you might also have a bug that's causing residual signal energy to leak into future frames.

Terms that you probably want to google for are "overlap-save" and "anti-alias filter", with possible side orders of "multi-rate", "sample rate conversion" and/or "sinc interpolation".

Cheers,
--
Andrew

Hi Fred

> And I've considered other techniques to reduce the CPU load. For example
> I saved the result of the FFT and reused it once or twice to do the inverse FFT, but
> the result wasn't too good (slight metallic sound caused by repeating the same
> slice). And I've tried different overlap techniques, but they haven't been that useful
> and haven't reduced the CPU load that much. Do you have any optimization tricks
> to share?

First off, I'm assuming that this is real time. I am doing a similar thing, albeit at RF, where I take a 200 kHz wide IF passband at 10 MHz, downconvert it, perform an arbitrary automated equalisation on the passband using FFT/IFFT, and upconvert it back to the IF frequency.

First though, I do have an analog filter in the IF prior to my digital downconvert. This would be equivalent to applying a basic bandpass time domain filter in your case. There's also a filter on the output.

Secondly, if it's real time, you don't mention if you use an overlap-add or overlap-save method to stitch back together your sample batches with an appropriate windowing function. If you don't do this you will lose information and potentially suffer strange artifacts due to discontinuities at the sampling extremities.

I'm slightly surprised you're having difficulty with the amount of CPU available to you: what platform is this on? If it's on a PC, echoing can be as simple as leaving the Mic record setting on. In addition, I find displaying fancy FFT displays on a PC often hogs more CPU than the FFT itself.

Cheers,
Howard

> This is badness: you introduce aliasing into your signal right here,
> unless you first filter it to ensure that there is no content above Fs/4.
> Straight decimation folds all of the energy from Fs/4 to Fs/2 into
> 0..Fs/4, so you have to make sure that there's none of the former.

But the original sample rate is 44.1 kHz and I downsample to 22.05 kHz. I'm aware of the fact that frequencies above 11.025 kHz will be distorted, but that doesn't matter. I've monitored the downsampled signal and it sounds just fine downsampled at 22.05 kHz.

> Direct tweakage of frequency bins is rarely a good idea, because that
> corresponds to boosting or cutting whole-buffer sinusoids, in the time
> domain: you end up with lots of ugly ringing. You don't say how you tweak
> your frequency bins, but the "proper" way to do it is by multiplying by
> the frequency-domain version of some reasonable impulse response.

I have stored an impulse response, which I use to do a complex multiplication of the buffer which I feed. Nothing's wrong with the actual filter. That part of the code works fine at the original sample rate.

> You also don't mention doing overlap-save,

I do. I've experimented with 50% overlap and 75% overlap and 4-5 different window types. Nothing's wrong with that part. As I wrote in the original post, everything works fine if I stick to the original sample rate. Nothing's wrong with the actual filter at the original sample rate. However, strange things happen at half the sample rate, which is really confusing, because the FFT procedure doesn't depend on the sample rate at all. It just receives a sequence of samples no matter the sample rate.

> Metallic and echoey is most likely a combination of aliasing and
> block-boundary circular-convolution errors. If you're lucky, you might
> also have a bug that's causing residual signal energy to leak into future
> frames.
>
> Terms that you probably want to google for are "overlap-save" and
> "anti-alias filter", with possible side orders of "multi-rate" "sample
> rate conversion" and/or "sinc interpolation".

As I mentioned, nothing's wrong with the actual FFT. It's when I apply the downsampling and upsampling code that I get these strange artifacts.

Fred

> First off, I'm assuming that this is real time.

It is.

> First though, I do have an analog filter in the IF prior to my digital
> downconvert. This would be equivalent to applying a basic bandpass time
> domain filter in your case. There's also a filter on the output.

The downsampling is not that advanced, because it just downsamples by half. I've commented out the FFT/iFFT routine, downsampled the signal and then reconstructed it by upsampling with the code I showed (linear interpolation), and it works just fine. A slight loss in the higher frequencies, but that's OK.

> Secondly, if it's real time, you don't mention if you use an overlap-add or
> overlap-save method to stitch back together your sample batches with an
> appropriate windowing function.

I do. No problem there. I've even fine-tuned it by testing a number of different overlaps and window functions.

> If you don't do this you will lose information

Absolutely. I know.

> and potentially suffer strange artifacts due to discontinuities at the sampling extremities.

Remember that everything's fine at the original sample rate, so it cannot have anything to do with the actual overlap and window functions, which just operate on a stream of samples no matter the sample rate.

> I'm slightly surprised you're having difficulty with the amount of CPU
> available to you: what platform is this on?

I don't mean that I have problems with it. I just want to optimize it, and it's a bit too CPU heavy compared to how I want it to be.

> If it's on a PC, echoing can be as simple as leaving the Mic record setting
> on.

That's not the case. The test input is a raw sample file (so I can test different implementations and compare the result).

Fred

"Erik de Castro Lopo" <nospam@mega-nerd.com> wrote in message 
news:42524DB2.396F6AB2@mega-nerd.com...
> "Fred T. Weiler" wrote:
>>
>> Hi!
>>
>> I'm experimenting with FFT and inverse FFT, doing some filtering in
>> real-time.
>> It sounds good but it's a bit too CPU heavy
>
> So why not filter in the time domain which in most cases is much less
> CPU intensive.

Erik,

If the filter is FIR and of any appreciable length then the longer the filter the more time is saved by the FFT/*/IFFT method. But you knew that didn't you? Anyway, so much for "most cases"?

Fred