STFT with Modifications

FBS Fixed Modifications

Consider applying a fixed (time-invariant) filter $ H(\omega_k)$ to each $ X_m(\omega_k)$ before resynthesizing the signal:

$\displaystyle Y_m(\omega_k) = X_m(\omega_k)H(\omega_k)$ (10.28)

where $ H(\omega_k)$ is the sampled frequency response of a filter with impulse response

$\displaystyle h(n) = \frac{1}{N} \sum_{k=0}^{N-1} H(\omega_k) e^{j\omega_kn}, \quad n=0,\ldots,N-1$ (10.29)

Let's examine the effect this has on the signal in the time domain:

\begin{eqnarray*}
y(m) &=& \frac{1}{N} \sum_{k=0}^{N-1} Y_m(\omega_k) e^{j\omega_k m} \\
&=& \frac{1}{N} \sum_{k=0}^{N-1} X_m(\omega_k)H(\omega_k) e^{j\omega_k m} \\
&=& \frac{1}{N} \sum_{k=0}^{N-1} \left\{ \sum_{n=-\infty}^\infty x(n)w(n-m)e^{-j\omega_kn} \right\} H(\omega_k) e^{j\omega_k m} \\
&=& \frac{1}{N} \sum_{n=-\infty}^\infty x(n)w(n-m) \sum_{k=0}^{N-1} H(\omega_k) e^{j\omega_k(m-n)} \\
&=& \sum_{n=-\infty}^\infty x(n) [ w(n-m) h(m-n)] \\
&=& \sum_{n=-\infty}^\infty x(n) [\tilde{w}(m-n)h(m-n)] \\
&=& (x*[\tilde{w} \cdot h])(m)
\end{eqnarray*}

We see that the result is $ x$ convolved with a windowed version of the impulse response $ h$ . Here $ \tilde{w}(n) = w(-n)$ denotes the flipped window, and the sum over $ k$ in the fourth line collapses to $ N\,h(m-n)$ by (10.29). This contrasts with the OLA technique, in which a windowed $ x$ is filtered by $ h$ and the window has no effect on the filter, provided the window satisfies the COLA constraint and enough zero padding is used to avoid time aliasing.

In other words, FBS gives

$\displaystyle y = x * [\tilde{w} \cdot h] \;\longleftrightarrow\;X \cdot [{\tilde W}\ast H]$ (10.30)

while OLA gives (for $ R=1$ )

$\displaystyle y = x * [W(0)\cdot h] \;\longleftrightarrow\;X \cdot [W(0)\cdot H]$ (10.31)

  • In FBS, the analysis window $ w$ smooths the filter frequency response by time-limiting the corresponding impulse response.

  • In OLA, the analysis window can only affect scaling.

For these reasons, FFT implementations of FIR filters normally use the Overlap-Add method.
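The FBS identity (10.30) can be checked numerically. The following is a minimal sketch, not a practical implementation: it evaluates the sliding DFT sum directly for every $ m$ and compares the result with a time-domain convolution. The signal, the Hann-shaped flipped window `wt`, the filter `h`, and the sizes `N`, `M`, `L` are all arbitrary choices made for illustration, and the window length is assumed not to exceed the DFT length so that no time aliasing of $ h$ occurs.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16                           # DFT length (number of FBS channels)
M = N                            # window length (assumed M <= N)
L = 64                           # number of signal samples
x = rng.standard_normal(L)       # test signal x(n), n = 0..L-1
wt = np.hanning(M + 2)[1:-1]     # flipped window w~(l) = w(-l), l = 0..M-1
h = rng.standard_normal(N)       # impulse response h(n), n = 0..N-1
H = np.fft.fft(h)                # sampled frequency response H(w_k)
omega = 2 * np.pi * np.arange(N) / N


def fbs_output(m):
    """Return y(m) = (1/N) sum_k X_m(w_k) H(w_k) exp(j w_k m)."""
    # Only samples with 0 <= m - n <= M - 1 fall under the window.
    n = np.arange(max(0, m - M + 1), m + 1)
    # X_m(w_k) = sum_n x(n) w(n - m) exp(-j w_k n), with w(n - m) = w~(m - n)
    Xm = (x[n] * wt[m - n]) @ np.exp(-1j * np.outer(n, omega))
    Ym = Xm * H                  # fixed spectral modification
    return np.sum(Ym * np.exp(1j * omega * m)) / N


y_fbs = np.array([fbs_output(m) for m in range(L)])

# Time-domain prediction: x convolved with the windowed impulse response.
y_ref = np.convolve(x, wt * h)[:L]

print(np.allclose(y_fbs.real, y_ref), np.allclose(y_fbs.imag, 0.0))
```

Because `wt * h` generally differs from `h`, the output is not the same as `np.convolve(x, h)`; the window has reshaped the filter, which is the smoothing effect described above.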

Time Varying Modifications in FBS

Consider now applying a time-varying modification:

$\displaystyle Y_m(\omega_k) = X_m(\omega_k)H_m(\omega_k) \qquad \hbox{($R=1$)}$ (10.32)


$\displaystyle H_m(\omega_k) \;\longleftrightarrow\;h_m(n) = \frac{1}{N} \sum_{k=0}^{N-1} H_m(\omega_k) e^{j\omega_kn}$ (10.33)

Here $ h_m(n)$ denotes the $ n$th tap of the FIR filter in effect at time $ m$ . Repeating the derivation above with $ H_m$ in place of $ H$ gives

\begin{eqnarray*}
y(m) &=& \frac{1}{N} \sum_{k=0}^{N-1} Y_m(\omega_k) e^{j\omega_k m} \\
&=& \frac{1}{N} \sum_{k=0}^{N-1} X_m(\omega_k)H_m(\omega_k) e^{j\omega_k m} \\
&=& \frac{1}{N} \sum_{k=0}^{N-1} \left\{ \sum_{n=-\infty}^\infty x(n)w(n-m)e^{-j\omega_kn} \right\} H_m(\omega_k) e^{j\omega_k m} \\
&=& \frac{1}{N} \sum_{n=-\infty}^\infty x(n)w(n-m) \sum_{k=0}^{N-1} H_m(\omega_k) e^{j\omega_k(m-n)} \\
&=& \sum_{n=-\infty}^\infty x(n) [ w(n-m) h_m(m-n)] \\
&=& \sum_{n=-\infty}^\infty x(n) [\tilde{w}(m-n)h_m(m-n)] \\
&=& (x*[\tilde{w} \cdot h_m])(m)
\end{eqnarray*}

Hence, the result is the convolution of $ x$ with the windowed $ h_m$ .
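The same kind of numerical check works in the time-varying case. In this sketch a different, randomly chosen filter `h_all[m]` is used at every time index $ m$ (an assumption made purely to stress the identity; nothing here suggests such rapid variation is musically useful), and the FBS output is compared sample by sample with the sum $ \sum_n x(n)\tilde{w}(m-n)h_m(m-n)$ . All names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16                                 # DFT length
M = N                                  # window length (assumed M <= N)
L = 48                                 # number of signal samples
x = rng.standard_normal(L)
wt = np.hanning(M + 2)[1:-1]           # flipped window w~(l), l = 0..M-1
h_all = rng.standard_normal((L, N))    # h_m(n): one filter per time index m
H_all = np.fft.fft(h_all, axis=1)      # H_m(w_k)
omega = 2 * np.pi * np.arange(N) / N

y_fbs = np.empty(L)
y_direct = np.empty(L)
for m in range(L):
    n = np.arange(max(0, m - M + 1), m + 1)      # samples under the window
    # Sliding DFT of the windowed signal at time m
    Xm = (x[n] * wt[m - n]) @ np.exp(-1j * np.outer(n, omega))
    # Apply the filter for time m and resynthesize by summing the channels
    y_fbs[m] = np.real(np.sum(Xm * H_all[m] * np.exp(1j * omega * m)) / N)
    # Time-domain prediction using only the filter in effect at time m
    y_direct[m] = np.sum(x[n] * wt[m - n] * h_all[m, m - n])

print(np.allclose(y_fbs, y_direct))
```

Each output sample depends only on the filter indexed by the same $ m$ , with no memory of earlier filters.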

Points to Note

  • We saw that in OLA with time-varying modifications and $ R=1$ (a ``sliding'' DFT), the window serves as a lowpass filter on each individual tap of the FIR filter being implemented.

  • In the more typical case in which $ R$ is the window length $ M$ divided by a small integer such as $ 2$ to $ 10$ , we may think of the window as specifying a type of cross-fade from the LTI filter for one frame to the LTI filter for the next frame.

  • Using a Bartlett (triangular) window with $ 50$% overlap ($ R=M/2$ ), the sequence of FIR filters used is obtained simply by linearly interpolating the LTI filter for one frame to the LTI filter for the next.

  • In FBS, there is no limitation on how fast the filter $ h_m$ may vary with time, but its length is limited to that of the window $ w$ .

  • In OLA, there is no limit on length (just add more zero-padding), but the filter taps are band-limited to the spectral width of the window.

  • FBS filters are time-limited by $ w$ , while OLA filters are band-limited by $ w$ (another dual relation).

  • Recall for comparison that each frame in the OLA method is filtered according to

    $\displaystyle Y_m = X_m \cdot H_m = [X*W_m] \cdot H_m \;\longleftrightarrow\; \underbrace{[x \cdot w_m]}_{x_m} * h_m$ (10.34)

    where $ w_m$ denotes $ \hbox{\sc Shift}_{mR}(w)$ .

  • Time-varying FBS filters are instantly in ``steady state'': each output sample $ y(m)$ is computed as though $ h_m$ had always been in effect.

  • In practice, FBS filters must be changed very slowly to avoid clicks and pops (discontinuity distortion is likely when the filter changes).

For more details, see [9].
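The cross-fade interpretation of OLA with a Bartlett window at $ 50$% overlap can be illustrated with a short sketch. It assumes a periodic triangular window of length $ M=2R$ satisfying $ w(n)+w(n+R)=1$ , uses direct convolution in place of zero-padded FFTs (the two are equivalent when enough zero padding is used), and draws the per-frame filters at random. All names and sizes are illustrative. The check is that summing the filtered windowed frames gives the same output as passing each input sample through a filter linearly interpolated between the two adjacent frame filters.

```python
import numpy as np

rng = np.random.default_rng(2)
R = 8                          # hop size
M = 2 * R                      # window length, giving 50% overlap
K = 6                          # number of hops spanned by the signal
L = K * R                      # number of signal samples
Nh = 5                         # length of each per-frame FIR filter
x = rng.standard_normal(L)

# Periodic Bartlett window: w(0) = 0, w(R) = 1, and w(n) + w(n + R) = 1.
w = 1.0 - np.abs(np.arange(M) - R) / R

# h[i] is the LTI filter for frame m = i - 1 (frame -1 covers the first hop).
h = rng.standard_normal((K + 1, Nh))

# OLA: window each frame, filter it with that frame's filter, and sum.
y_ola = np.zeros(L + Nh - 1)
for i in range(K + 1):
    start = (i - 1) * R                        # first sample of this frame
    n = np.arange(max(start, 0), min(start + M, L))
    xm = np.zeros(L)
    xm[n] = x[n] * w[n - start]                # windowed frame x_m
    y_ola += np.convolve(xm, h[i])

# Equivalent view: input sample l sees a linearly interpolated filter.
y_interp = np.zeros(L + Nh - 1)
for l in range(L):
    p, r = divmod(l, R)
    a = r / R                                  # cross-fade position in [0, 1)
    g = (1 - a) * h[p] + a * h[p + 1]          # blend of adjacent frame filters
    y_interp[l:l + Nh] += x[l] * g

print(np.allclose(y_ola, y_interp))
```

Here the interpolation is indexed by input sample: the filter applied to $ x(l)$ moves linearly from one frame's filter to the next as $ l$ advances through a hop.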
