## Can an FFT be shared for spectral estimation and linear filtering?

Started by 2 years ago●13 replies●latest reply 2 years ago●206 views__I am familiar with the overlap-add and overlap-save methods for computing linear convolutions.__

I am also familiar with Welch's method for estimating power spectral density.

There are structural similarities between overlap-add/save and Welch's method (they both use an overlapped FFT) and I became interested in computing *both* using a shared FFT. This would allow me to compute the spectrum, and do some FIR filtering, using shared hardware.

The main challenge is that Welch's method applies a window function before the FFT, and we don't want that window function in overlap-add/save. I came across this free book, which explains that choosing a window function that satisfies the "COLA" (Constant OverLap-Add) condition means the effect of the window can be removed after the FFT->IFFT by overlap-adding in the time domain. (For example, a Hann window overlap-added to 1/2-shifted copies of itself sums to 1 at all points).

The problem I see is that this averaging to remove the window effectively "consumes" the FFT overlap. For each **pair** of 1/2-overlapped, windowed FFTs, we only recover the information from **one** (non-overlaped, non-windowed) FFT. So, if I understand correctly, there is "no overlap left" for doing my filtering (unless I increase the overlap).

This is neatly summarized by this inequality (explained in [1]):

$$N_{filter} \leq N_{fft} - N_{window} + 1$$

So if Nwindow = Nfft (like for ordinary Welch's method), then the maximum filter size is 1 tap. To get a reasonable filter size, the window must be significantly narrower than the FFT (and then we zero-pad up to the FFT size). In hardware, this effectively means doubling my FFT size and using a 3/4 overlap (step size = Nfft/4).

In hardware terms, zero-padding to double the FFT size almost exactly doubles the complexity (it's basically the same FFT, but computed twice in the same amount of time, with some extra twiddles). So, by "sharing" my FFT, I have slightly **increased** hardware cost.

What have I missed here? Is there really no way to share resources efficiently between spectral estimation and frequency-domain filtering?

[1] REALIZATION OF ARBITRARY FILTERS IN THE STFT DOMAIN, Michael M. Goodwin, 2009.

__EDIT__: What about applying the window in the frequency domain?

Of course, this is seriously worth considering in this case.

A quick web search and ... as is so often the case, DSP legend Rick Lyons has already explained this in depth:

https://www.embedded.com/dsp-tricks-frequency-doma...

I need to do some careful reading, but this looks exceedingly promising.

If it is me I will do fft as I want it then discard those bins I wanted them filtered.

I haven't tried something like this before but it is an interesting re-use/efficiency question. The first question I'd ask is a sort of step back question. I'm assuming you need both filtering and spectral estimation and perhaps these two processes are interdependent as well. In any case, my question would be are you limited to just using Welch's method for spectral estimation? Could you not just use your own FFT-based spectral estimation where you have full control over the overlap/window/averaging/etc. In other words, choose the spectral estimation technique that is friendly to your filtering needs. At a minimum the spectral estimation could be just:

yf = fft(y) % y is the time domain signal of interest

yf_dBW = 10*log10(abs(2*yf.*conj(yf)/N^2); % poor man's power spectrum in dBW

From there you can elaborate with averaging and/or windowing as needed. The overlapped processing optional if you really need it but in many cases applications which require spectral estimation, don't require overlapped processing. But engineers use it anyway because it comes along for the ride in Welch which is the first thing many engineers pick up, a familiar entity.

My apologies for not answering your question. It's a good one. I just started in backup and work-around mode as an option.

@kschutz I'm not specifically tied to Welch's method and I had also considered your "poor man's power spectrum". It's something I would probably resort to if I completely exhaust my hardware (FPGA) resources.

If I'm forced to choose, I would prefer to throw away the overlap in the spectrum estimate (like you suggested), than throw away the filtering. I've seen implementations, for example, where a subset of FFT bins are IFFTed (without any filtering) - which basically amounts to a rectangular "brick wall" anti-aliasing filter. I suspect the distortion caused by this "poor man's filter" would be a bigger problem for me than the "poor man's spectrum".

But, ideally, I'd like to "have my cake and eat it".

If I understand well your question, you want to make some filtering on the frequency domain while doing Welch's method. If your filter is simple (say for example lowpass, filtering out some high frequency bins) then you can do it within the Weighted Overlap Add approach with the same fft (and somewhat strange filter you need, modifying FFT bins directly). In this method, you need to apply 2 windows (analysis and synthesis windows). The point is if you have overlapping windows for the Welch's method, process it as this:

- Make square root of your window (Hann will come sine window, for example).

- Apply this window to your overlapping time domain data.

- Do your analysis performing FFT on it.

- Perform your linear filter on the FFT bins (modifying FFT bins).

- Do IFFT on windowed frame.

- Apply the same square root window that you used before.

- Do overlap add of this (adding to previous filtered frame).

- You are done.

Note: Your window needs to be a COLA window.

Hope this helps!

@samecues You understood exactly. And my filter is just a lowpass filter.

If I understood you correctly, you are saying that by changing my Hann "analysis" window to two sqrt(Hann) windows (one "analysis" window before the FFT, and one "synthesis" window after the FFT->IFFT), then I can achieve my goal using only a 1/2-overlapped FFT. Is that correct?

It is not obvious to me how this "synthesis" window solves the problem. Could you recommend any good books/references I could use to learn about this method?

Yes it is correct. In this WOLA paper you will find more info about this. I have had an approach very similar to yours.

Synthesis windows helps to suppress the non linear effects that will arise from that.

In your case this will have you to doing any spectrum processing without problem. You can do square root Hann directly if you Google for mlt sine window

I Hope this helps you!

@samecues I have read that WOLA article and some other papers related to WOLA. However, I cannot see anything related to a simple linear filter (like a lowpass filter).

You mentioned suppressing nonlinear effects (and the articles I read also discuss this). However, I just want a simple (linear) filter. Therefore, I don't understand what these nonlinear effects are in my case.

From what I have read, WOLA is useful for nonlinear processing, but OLA is better for ordinary linear filtering. Did I misunderstand something?

You don't need WOLA for linear filtering, but you need it for your approach (Welch analysis + filtering). You can do linear filtering with WOLA too, in your scenario it is a better option. That is, you don't need WOLA if you need to apply linear filtering ONLY, but in your scenario you can utilize the same FFT for filtering with it.

My first attempt with WOLA was a linear lowpass with great results ;)

As an example, WOLA is used in MP3 and others that need to do linear lowpass at the end too (non linear + linear filtering).

More info: WOLA is suitable within modifying bins directly. If you have the FFT of your lowpass filter, you need to apply it in a "bin sense", that's it, in a "throw your unwanted bins out" manner. Not in the "multiply your filter FFT manner".

I Hope this helps you.

My first answer though was too short, here are some more thoughts:

overlap/add for filtering is not needed if you just let data stream through for ever.

It seems you want to filter using DFT followed by iDFT and this results in frame processing which you then trying overlap/add approach. But if you got the stream don't break it into frames in the first place then stitch them up.

DFT followed by iDFT for the filtering is far more costly than just a FIR filter with DFT doing its spectrum job.

Your platform is FPGA and statements like y = iDFT(DFT(x)) are written easily in software so remember that although the concepts of DSP are same but soft dsp is completely a different mindset from hard dsp.

So in short I am not sure why you want to do DFT for filtering if an FIR solution is much easier. Yes you can use same one DFT as filter by setting stopband to zero but this needs to be followed by iDFT. If your clock speed allows using same DFT core in both directions then you may manage but at the cost of complexity.

@kaz My FIR filters are large, so DSP slice utilization is reduced immensely by implementing them in the frequency domain. (By a rough calculation, the reduction in my case is much more than 90%).

FFTs do consume more memory resources, so that needs careful optimization.

My clock frequency is one sixteenth of the sample rate, so resource sharing is certainly not an option.

clocking fft core at double sample rate will let you finish one frame then you have enough clocks to do iFFT on FFT output after zeroing stopband and using same one fft core (fft and ifft share same ip).

I suggest then you use averaging for your spectrum estimate rather than welch to keep it simpler.

At the iFFT output you start the process of re-stitching frames while receiving for next FFT frame. You will need one frame memory before you start the fft but the fft output can go direct to iFFT after zeroing stopband,

Controlling phase continuity at edges of stopband/passband(after ifft) is not easy) you can try ofdm post ifft windowing which is a fancy technique used by some to reduce phase discontinuity across adjacent ofdm symbols after ifft before passing them into air.

@kaz My sample rate is 4 GSps. So I was saying I can't clock my FFT core at 8 GHz (or 16 GHz, with 1/2-overlap) and reuse it for the IFFT. I will clock it at 250 MHz and implement a parallel FFT.

OFDM would only make a difference here if I were receiving OFDM symbols from a transmitter that is sending them, which is not the case in my system.

Those are very very high speeds for FPGA. I use 491MHz for Ultrascale and it is a struggle but it could be you are dealing with some special chips or hardcoded fft cores.

ofdm windowing applies both at downlink(after eNB ifft) and uplink (after UE ifft and received at eNB). The phase discontinuity needs some windowing in time domain between brickwall frames otherwise you will need convolution in frequency domain and that makes it even more complex.

As a stream filter I doubt DFT/iDFT will work cleanly at those edges.