DSPRelated.com
Forums

downsampling -> FFT -> upsampling

Started by Fred T. Weiler April 5, 2005
Fred T. Weiler wrote:
> > Leaving aside the downsampling, if you are splitting your input into
> > consecutive blocks of N samples, and assuming that the impulse
> > response of your filter is less than N samples long, then zero pad both
> > blocks of input samples and the filter impulse response to length 2N
> > before applying a 2N point FFT, multiplying and then applying a 2N
> > point IFFT to give 2N sample output blocks which should be overlapped
> > 50%. The only window functions necessary are rectangular, i.e. implied
> > by taking blocks.
> > I think that will work. Hope it helps.
> >
> > Donald
>
> Uh? You'll need some sort of window function to get rid of the glitches at
> the edges of each block. I tried a Hamming window and it works fine.
> Haven't tried Hanning yet.
>
> Fred
I don't think there should be any glitches - the blocks of 2N samples
overlapped as described should fit together seamlessly.

If you think of what you are doing in the time domain, you're convolving a
block of N input samples with an N sample filter impulse response. That gives
a 2N-1 sample output sequence (plus a zero sample) 'starting' at the same
point as the 'start' of the block of input samples (call it sample 0). The
next block of N input samples convolved with N samples of a filter response
yields another 2N-1 output samples (plus a zero sample) 'starting' at sample
N. The overall output between samples N and 2N-1 is the sum of the outputs due
to the two blocks of N input samples just mentioned. This is 50% overlap of
the longer convolution results.

Taking length 2N FFTs of zero padded sample blocks, multiplying by length 2N
transformed filter coefficients (the filter impulse response) and IFFTing
yields 2N sample output blocks to be overlapped that represent the convolution
results described - exactly. No need for any windows.

Donald
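A minimal NumPy sketch of the scheme Donald describes (the function name,
block length, and test signal are illustrative assumptions, not from the
thread). Zero-padded 2N-point transforms plus 50%-overlapped output blocks
reproduce direct convolution to machine precision, with no window:

    import numpy as np

    def overlap_add_filter(x, h, N):
        """Filter x with impulse response h (len(h) <= N) via 2N-point
        FFTs; 2N-sample output blocks are overlapped by 50% and summed."""
        assert len(h) <= N
        H = np.fft.rfft(h, 2 * N)             # filter zero-padded to 2N
        n_blocks = -(-len(x) // N)            # ceil(len(x) / N)
        y = np.zeros((n_blocks + 1) * N)
        for b in range(n_blocks):
            block = x[b * N:(b + 1) * N]      # rectangular "window": just take a block
            X = np.fft.rfft(block, 2 * N)     # zero-pad the block to 2N
            y[b * N:(b + 2) * N] += np.fft.irfft(X * H, 2 * N)  # overlap and add
        return y[:len(x) + len(h) - 1]

    rng = np.random.default_rng(0)
    x = rng.standard_normal(1000)
    h = rng.standard_normal(32)
    err = np.max(np.abs(overlap_add_filter(x, h, 64) - np.convolve(x, h)))
    print(err)                                # ~1e-14: seamless, no glitches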
On Wed, 06 Apr 2005 12:52:06 +0000, Fred T. Weiler wrote:

>> Leaving aside the downsampling, if you are splitting your input into
>> consecutive blocks of N samples, and assuming that the impulse
>> response of your filter is less than N samples long, then zero pad both
>> blocks of input samples and the filter impulse response to length 2N
>> before applying a 2N point FFT, multiplying and then applying a 2N
>> point IFFT to give 2N sample output blocks which should be overlapped
>> 50%. The only window functions necessary are rectangular, i.e. implied
>> by taking blocks.
>> I think that will work. Hope it helps.
>>
>> Donald
>
> Uh? You'll need some sort of window function to get rid of the glitches at
> the edges of each block. I tried a Hamming window and it works fine.
> Haven't tried Hanning yet.
Not at all. You didn't google for "overlap save", did you? You don't get
glitches at the edge of the blocks, because the technique exactly produces a
linear filtering operation: the circular convolution artifacts that
*necessarily* arise from multiplying the transform of a non-unit-length
impulse response with that of a full buffer of signal are confined to the part
of the result that is discarded. No distortion leads to no glitches.

Overlap save is one of the favoured techniques for performing linear filtering
in the frequency domain. The other one referenced in textbooks is usually
called "overlap add", which is hardly ever used as it requires more work
(there's an advantage, though, I seem to remember, but I can't remember what
it is...).

Both of these techniques rely on the filter coefficients staying the same from
one block to the next to avoid block-to-block discontinuities. If that's not
going to be the case (as for a vocoder, for example), then you can fall back
on the windowed-overlap method that you're using, or something like it. It
doesn't sound as good, but it sounds better than regular overlap-save with no
overlap when the coefficients change.

--
Andrew
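For comparison, a sketch of overlap-save under the same caveats (names and
sizes are made up for illustration): each block's first len(h)-1 outputs carry
the circular-convolution wrap-around and are simply discarded, so nothing
distorted ever reaches the output.

    import numpy as np

    def overlap_save_filter(x, h, nfft):
        """Overlap-save FFT filtering: discard the first len(h)-1 samples
        of each block, where the circular-convolution artifacts live."""
        M = len(h)
        step = nfft - (M - 1)                 # valid output samples per block
        H = np.fft.rfft(h, nfft)
        xp = np.concatenate([np.zeros(M - 1), x, np.zeros(step)])
        out = []
        for start in range(0, len(x), step):
            block = xp[start:start + nfft]    # overlaps previous block by M-1
            y = np.fft.irfft(np.fft.rfft(block, nfft) * H, nfft)
            out.append(y[M - 1:])             # keep only the clean part
        return np.concatenate(out)[:len(x)]

    rng = np.random.default_rng(1)
    x = rng.standard_normal(1000)
    h = rng.standard_normal(32)
    err = np.max(np.abs(overlap_save_filter(x, h, 128)
                        - np.convolve(x, h)[:len(x)]))
    print(err)                                # ~1e-14: no block-edge glitches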
Fred T. Weiler wrote:
>>> Hi!
>>>
>>> I'm experimenting with FFT and inverse FFT, doing some filtering in
>>> real-time. It sounds good but it's a bit too CPU heavy
>>
>> So why not filter in the time domain, which in most cases is much less
>> CPU intensive.
>>
>> Erik
>
> I do that too. But I'm implementing a filter which allows the user to
> switch between an FFT filter and a filter in the time domain.
Why? To demonstrate the equivalence?

Jerry
--
Engineering is the art of making what you want from things you can get.
Fred T. Weiler wrote:

   ...

>>> 1. Throw away every second sample to obtain a buffer of size bufsize/2.
>>
>> As others have said, you shouldn't do this without filtering first. It's
>> easy enough to apply a halfband filter just to see what happens. But, read
>> on. There is likely a better way that will save a bunch of compute time.
>
> OK. Then I should lowpass filter at Fs/4 (= 11.025 kHz), right?
> But do I gain anything else than reducing a slight distortion in the upper
> frequency regions? The problem is that the resulting audio sounds like
> it has flams/delays in it plus some nasty comb filtering.
A harmonic at 18 kHz becomes a non-harmonic at 22.05 - 18 = 4.05 kHz. Look up
"folding".

...

Jerry
--
Engineering is the art of making what you want from things you can get.
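Jerry's numbers are easy to check. A small NumPy sketch (the 44.1 kHz rate and
18 kHz tone come from the thread; everything else is illustrative): decimating
an 18 kHz tone by simply dropping samples leaves a spectral peak near
22.05 - 18 = 4.05 kHz.

    import numpy as np

    fs = 44100.0
    t = np.arange(4096) / fs
    x = np.sin(2 * np.pi * 18000.0 * t)      # 18 kHz tone at 44.1 kHz

    x2 = x[::2]                              # drop every second sample
    fs2 = fs / 2                             # 22.05 kHz; Nyquist now 11.025 kHz

    spec = np.abs(np.fft.rfft(x2 * np.hanning(len(x2))))
    print(np.argmax(spec) * fs2 / len(x2))   # ~4050 Hz: the folded alias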
Fred T. Weiler wrote:
>> Aliasing is NOT a slight distortion in the upper frequency regions.
>> The upper half of the original spectrum is folded back into the new
>> downsampled spectrum, so the whole new spectrum is affected.
>>
>> bye
>> Andreas
>
> Guys,
>
> Given a signal obtained at sampling frequency F. Assume that the signal
> would have been sampled at the rate of F/2 instead of F. Then it would look
> exactly the same as if we traverse the signal sampled at rate F and throw
> away every second sample.
And it would be aliased. Before sampling at F/2, you must low-pass to below F/4.
> It's a completely different story if you start throwing away, say, every
> third sample or keep 4 samples and throw away every fifth. That's an error.
That's an error, yes. The same error.
> But I'm talking about obtaining the signal as it would look if it had
> been sampled at F/2. Then it's indeed possible to throw away every
> second sample. It's equivalent to how the signal would have looked
> if it had been sampled at F/2 instead of F.
Proper reconstruction requires that a signal be sampled at greater than twice
the highest frequency it contains. To guarantee that, it is usually low-pass
filtered before sampling.

Consider your idée fixe: if every second sample of a set of valid samples is
discarded, the result is still a set of valid samples. How many times in a
row would you apply that theorem? Why stop there?

Jerry
--
Engineering is the art of making what you want from things you can get.
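A short SciPy sketch of the cure Jerry is pointing at (the signal here is a
made-up stand-in): low-pass below the new Nyquist before discarding samples.

    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(2)
    x = rng.standard_normal(44100)   # one second of wideband "audio" at 44.1 kHz

    x_bad = x[::2]                   # naive: everything above fs/4 folds back

    # scipy.signal.decimate low-pass filters first (default: order-8
    # Chebyshev IIR) and then keeps every second sample:
    x_good = signal.decimate(x, 2)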
Fred T. Weiler wrote:
>> Leaving aside the downsampling, if you are splitting your input into
>> consecutive blocks of N samples, and assuming that the impulse
>> response of your filter is less than N samples long, then zero pad both
>> blocks of input samples and the filter impulse response to length 2N
>> before applying a 2N point FFT, multiplying and then applying a 2N
>> point IFFT to give 2N sample output blocks which should be overlapped
>> 50%. The only window functions necessary are rectangular, i.e. implied
>> by taking blocks.
>> I think that will work. Hope it helps.
>>
>> Donald
>
> Uh? You'll need some sort of window function to get rid of the glitches at
> the edges of each block. I tried a Hamming window and it works fine.
> Haven't tried Hanning yet.
>
> Fred
Properly done overlap-add shouldn't need a window. Something is wrong.

Jerry
--
Engineering is the art of making what you want from things you can get.
"Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote in message 
news:O9SdnW6nYYLhYs_fRVn-gA@centurytel.net...

> Note well: the longer an FFT, the more efficient it is.
No, that's not correct by itself. The FFT is O(n log n), so the number of
operations scales up more than linearly with the FFT size (which I know you
know). The filtering gets more efficient with increasing FFT size because the
overhead due to the overlap stuff goes down. If I recall correctly, if you
keep increasing the FFT size, eventually you get to a point where the log n
factor makes more difference than the decreased overhead, and from that point
on increasing the FFT size actually decreases the overall efficiency of the
filtering.

(And then there's all the hardware-dependent stuff: if you increase the FFT
size so that it no longer fits into cache, the speed goes down. Keep
increasing to the point where you have to swap to disk and the speed goes WAY
down. But let's ignore all that, it's too messy...)
> So, instead of decimating in time at all, just do an FFT that's 2x as
> long. Then filter accordingly by multiplying. Then IFFT. Don't
> interpolate in time. This is well known to be more efficient by a huge
> amount if the array is of any appreciable size. Mind the overlap stuff.
Right, the overhead from the overlap stuff goes down as the FFT size goes up.

--
Eric Backus
R&D Design Engineer
Agilent Technologies, Inc.
425-356-6010 Tel
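Eric's sweet-spot argument is easy to tabulate. A rough cost model (the filter
length is an assumption for illustration): charge each block two N log2 N
transforms' worth of work and divide by the N - M + 1 new output samples it
produces.

    import numpy as np

    M = 512                               # assumed filter length
    for logN in range(10, 21):
        N = 2 ** logN                     # FFT size
        useful = N - M + 1                # new output samples per block
        cost = 2 * N * np.log2(N) / useful
        print(f"N = 2^{logN:2d}: ops/sample ~ {cost:5.1f}")
    # Cost falls as the overlap overhead shrinks, bottoms out near N = 2^12,
    # then the log N factor slowly wins and the cost creeps back up.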
> Consider your idée fixe: if every second sample of a set of valid
> samples is discarded, the result is still a set of valid samples. How
> many times in a row would you apply that theorem? Why stop there?
I'm not saying it's "the same". If I said that then I expressed myself
poorly. What I mean is that you'll get distortion above the Nyquist
frequency, which is lower for the downsampled signal. Each time you
downsample by halving the sample rate, you get more of that distortion.

Fred
> A harmonic at 18 kHz becomes a non-harmonic at 22.05 - 18 = 4.05 kHz. Look
> up "folding".
>
> ...
Yes, folding distortion. I know. That's in theory. However, if I take an
audio file, throw away every second sample and listen to it at half the
sample rate, then the degradation of the sound isn't that noticeable, other
than that the sample rate has been cut in half. Can you explain why? Yes, I
know that there's folding distortion in theory, but not in practice.

Fred
Fred,

Comments are interspersed below:

Fred

"Fred T. Weiler" <nospam@nospam.com> wrote in message 
news:uKM4e.134147$dP1.471425@newsc.telia.net...
>> Fred,
>>
>> Well, I've read all the posts so far and it sounds like "there's nothing
>> wrong here or there but somewhere". Time to reevaluate. So my remarks are
>> intended to suggest areas to investigate.
>
> Sure. Thanks. I appreciate all input.
>
>>> 1. Throw away every second sample to obtain a buffer of size bufsize/2.
>>
>> As others have said, you shouldn't do this without filtering first. It's
>> easy enough to apply a halfband filter just to see what happens. But, read
>> on. There is likely a better way that will save a bunch of compute time.
>
> OK. Then I should lowpass filter at Fs/4 (= 11.025 kHz), right?
> But do I gain anything else than reducing a slight distortion in the upper
> frequency regions? The problem is that the resulting audio sounds like
> it has flams/delays in it plus some nasty comb filtering.
I was trying to say: "stop fighting this". Your "slight distortion" is
arm-waving to an outside observer and cannot be justified in brief
discussions.

Later you talk about "what if we had sampled at fs/2 in the first place with
no aliasing?" Well, if that is the case then you can throw out half the
samples with no worries. The issue I have with that is: "prove it".
Otherwise we will continue to caution you in this regard.
>>> 2. Perform an FFT on that stream (size: bufsize/2).
>>
>> This could be the problem. The array size needs to be >= L+M-1 where L is
>> the length of the filter and M is the length of the data sequence. You
>> need to zero pad to get this on both the filter and the data.
>
> Yup.
Yup as in "that's what I'm doing" or as in "Oh, I see?" .... just curious if we've made some headway.
>>> 3. Adjust the bins and perform the inverse FFT.
>>
>> It's still unclear to me what this operation called "adjust" really is.
>> The proper operation is to multiply with a reasonable filter FFT.
>
> Sorry. More precise: I do a complex multiplication on each element in the
> filter FFT and the input signal.
Still not precise enough .... the "input signal" is in the time domain. I think you mean multiply the FFT of the filter unit sample response [of length N>=L+M-1] by the FFT of the input signal [of equal length N>=L+M-1].
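In code, the operation being described is one line, provided both transforms
are taken at the same length N >= L+M-1 (the sizes below are arbitrary):

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.standard_normal(200)             # M = 200 data samples
    h = rng.standard_normal(50)              # L = 50 filter taps
    N = len(x) + len(h) - 1                  # N >= L + M - 1: no circular wrap

    # FFT of the zero-padded unit sample response times the FFT of the
    # zero-padded input block, then back to the time domain:
    y = np.fft.irfft(np.fft.rfft(x, N) * np.fft.rfft(h, N), N)
    print(np.max(np.abs(y - np.convolve(x, h))))  # ~1e-13: exact linear filtering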
>> Now, for interpolation you don't need to add interpolated samples in time
>> if you're already working in the frequency domain. Here's how:
>> In principle, you can replicate the spectrum to double the sample rate.
>> Doing this alone will introduce unwanted frequencies as you can imagine.
>> So, you want to replicate the spectrum (which is almost a no-op because
>> you don't have to actually *do it*). Then you apply a half-band filter
>> before the IFFT. This is much more efficient than interpolation in the
>> time domain.
>
> Hmmm, is it really? Applying the half-band filter will require some
> computation on the whole array. And the array size is equal to the size of
> the output samples. Sounds like it'd take at least around the same time.
I know it sounds like it would. It's counterintuitive. You need to count the
number of operations to find that convolution has a *lot* of computations.
See Stockham's paper from the '60s on the efficiency gained. He compares time
domain convolution with FFT/*/IFFT. As the arrays get larger the latter is
much, much better. That means both arrays, so it may not be the same thing if
the interpolating filter is really short, like [0.5 0 0.5].
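The gap Stockham measured is easy to reproduce. A hedged timing sketch (sizes
arbitrary, results machine-dependent):

    import timeit
    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(4)
    x = rng.standard_normal(100_000)
    h = rng.standard_normal(4_000)

    t_direct = timeit.timeit(lambda: np.convolve(x, h), number=3)
    t_fft = timeit.timeit(lambda: signal.fftconvolve(x, h), number=3)
    print(f"direct {t_direct:.3f} s, FFT {t_fft:.3f} s")
    # FFT/*/IFFT typically wins by orders of magnitude at these sizes;
    # for a 3-tap kernel like [0.5, 0, 0.5] direct convolution wins instead.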
>> If you do replicate the spectrum, I might suggest at the risk of being
>> hammered by purists that IF the spectrum is low around fs/4 and 3fs/4 then
>> you can *assume* that the samples in frequency between these two are zero
>> and don't bother to multiply away the nonzero terms around fs/2. In other
>> words multiply by a "perfect" halfband filter that is 1.0 in the passband
>> and 0.00 in the stopband. Of course, then you don't multiply at all!
>
> I'm not sure if I understood it. You mean that if the buffer size is
> [0 - top], then I'd double the array size and zero the items at
> [top+1 - top*2]?
> Won't I get phase problems then?
No. You do this: if the buffer / array size is [0 - top], where top = N-1 so
there are N samples, then you double the array size by *repeating* the
samples: F(N) = F(top+1) = F(0), F(N+1) = F(1), ..., F(2N-1) = F(top).

If the time series was real, then the FFT samples are even (conjugate
symmetric), so the new array is also even. THEN you zero out the middle - the
values from roughly 2N/4 = N/2 to 6N/4 = 3N/2 - either by eliminating them or
by multiplying by something akin to a halfband filter response, so the
spectrum is retained but at a higher sample rate. This doesn't affect the
evenness of the array, so the time domain IFFT is purely real and is at the
higher sample rate.
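A sketch of that recipe in NumPy (the function name and the even-N assumption
are mine; the handling of the old Nyquist bin is one detail glossed over
above): replicate the length-N spectrum to 2N, zero the middle, and IFFT at
the doubled rate.

    import numpy as np

    def upsample2_fft(x):
        """2x interpolation: replicate the spectrum, zero the middle
        (an ideal half-band filter), IFFT at the doubled rate.
        Assumes len(x) is even and x is real."""
        N = len(x)
        X = np.fft.fft(x)
        Y = np.tile(X, 2)                    # replicated spectrum, length 2N
        Y[N // 2 + 1:3 * N // 2] = 0         # zero the image band in the middle
        Y[N // 2] *= 0.5                     # split the old Nyquist bin so the
        Y[3 * N // 2] = Y[N // 2]            # array stays even / result stays real
        return 2.0 * np.fft.ifft(Y).real     # factor 2: twice as many samples

    t = np.arange(64)
    x = np.cos(2 * np.pi * 5 * t / 64)       # tone well below fs/4
    y = upsample2_fft(x)
    print(np.max(np.abs(y[::2] - x)))        # ~1e-15: original samples preserved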
>> Note well: the longer an FFT, the more efficient it is. So, instead of
>> decimating in time at all, just do an FFT that's 2x as long. Then filter
>> accordingly by multiplying. Then IFFT. Don't interpolate in time. This is
>> well known to be more efficient by a huge amount if the array is of any
>> appreciable size. Mind the overlap stuff.
>
> Yeah, well I've experimented with different sizes on the original sample
> rate. But no matter how much I optimize it could still be optimized even
> more by downsampling.
I'm not sure what you mean here. You have to pick a point of reference. It's
trivial to say: "if I have to process 1/2 the samples then the computing load
will be lower". If it's a good model or a good approach, then fine.

Maybe it all boils down to this: if you have 2x the number of samples
necessary, then it would be foolish to process them all. Then the only issue
is how to do the interpolation. So the question is: how to do efficient
interpolation? Maybe an IIR filter would be better than a FIR filter
approach, as has been suggested.

If the result isn't what you expect, then look for code errors, including how
the overlap-add things are done.

Fred
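On that closing question of efficient interpolation, SciPy packages two of
the approaches discussed in this thread (the test signal is a placeholder): a
polyphase FIR interpolator that never multiplies the stuffed zeros, and block
FFT resampling along the lines Fred Marshall sketched.

    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(5)
    x = rng.standard_normal(10_000)

    y_fir = signal.resample_poly(x, up=2, down=1)  # polyphase FIR interpolation by 2
    y_fft = signal.resample(x, 2 * len(x))         # FFT-based resampling of the block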