Dual Views of the Short Time Fourier Transform (STFT)
In the overlap-add formulation of Chapter 8, we used a hopping window to extract time-limited signals to which we applied the DFT. Assuming for the moment that the hop size $R=1$ (the ``sliding DFT''), we have
$$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} x(n)\, w(n-m)\, e^{-j\omega_k n}. \qquad (9.1)$$
This is the usual definition of the Short-Time Fourier Transform (STFT) (§7.1). In this chapter, we will look at the STFT from two different points of view: the OverLap-Add (OLA) and Filter-Bank Summation (FBS) points of view. We will show that one is the Fourier dual of the other [9]. Next we will explore some implications of the filter-bank point of view and obtain some useful insights. Finally, some applications are considered.
Overlap-Add (OLA) Interpretation of the STFT
In the OLA interpretation of the STFT, we apply a time-shifted window $w(n-m)$ to our signal $x(n)$, selecting data near time $m$, and compute the Fourier transform to obtain the spectrum of the $m$th frame. As shown in Fig. 9.1, the STFT is viewed as a time-ordered sequence of spectra, one per frame, with the frames overlapping in time.
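The frame-by-frame view can be sketched numerically. The following is a minimal illustration rather than a reference implementation. It assumes a hop size of 1, a frequency grid $\omega_k = 2\pi k/N$, an odd-length window stored zero-centered, and a signal that is zero outside its samples. The function name `stft_ola_view` is hypothetical.

```python
import numpy as np

def stft_ola_view(x, w, N):
    """Sliding STFT, one spectrum per frame (OLA view), hop size 1.

    Computes X[m, k] = sum_n x(n) w(n - m) exp(-j 2 pi k n / N).
    Assumptions: w has odd length M and is stored zero-centered, so
    w[i] holds w(i - h) with h = (M - 1) // 2; x is zero outside 0..len(x)-1.
    """
    M = len(w)
    h = (M - 1) // 2
    L = len(x)
    k = np.arange(N)
    # Zero-pad so that every frame has the full window length.
    xp = np.concatenate([np.zeros(h), x, np.zeros(h)])
    X = np.zeros((L, N), dtype=complex)
    for m in range(L):
        n = np.arange(m - h, m + h + 1)   # absolute times under the shifted window
        frame = xp[m:m + M] * w           # x(n) w(n - m)
        # Fourier sum over the absolute time index n (time origin is not reset per frame).
        X[m] = np.exp(-2j * np.pi * np.outer(k, n) / N) @ frame
    return X
```

Row `X[m]` is the spectrum of the frame centered at time `m`, so reading the array row by row gives the time-ordered sequence of spectra described above.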
Filter-Bank Summation (FBS) Interpretation of the STFT
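We can group the terms in the STFT definition differently to obtain the filter-bank interpretation:
$$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} \left[x(n)\, e^{-j\omega_k n}\right] w(n-m) = \left[x_k * \mathrm{Flip}(w)\right](m), \qquad (9.2)$$
where $x_k(n) \triangleq x(n)\, e^{-j\omega_k n}$ and $\mathrm{Flip}(w)(n) \triangleq w(-n)$.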
As will be explained further below (and illustrated further in Figures 9.3, 9.4, and 9.5), under the filter-bank interpretation, the spectrum of $x$ is first rotated along the unit circle in the $z$ plane so as to shift frequency $\omega_k$ down to 0 (via modulation by $e^{-j\omega_k n}$ in the time domain), thus forming the heterodyned signal $x_k(n) \triangleq x(n)\, e^{-j\omega_k n}$. Next, the heterodyned signal $x_k(n)$ is lowpass-filtered to a narrow band about frequency 0 (via convolving with the time-reversed window $\mathrm{Flip}(w)$). The STFT is thus interpreted as a frequency-ordered collection of narrow-band time-domain signals, as depicted in Fig. 9.2. In other words, the STFT can be seen as a uniform filter bank in which the input signal $x(n)$ is converted to a set of $N$ time-domain output signals $X_m(\omega_k)$, $k=0,1,\ldots,N-1$, one for each channel of the $N$-channel filter bank.
Expanding on the previous paragraph, the STFT (9.2) is computed by the following operations:
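- Frequency-shift $x(n)$ by $-\omega_k$ to get $x_k(n) \triangleq e^{-j\omega_k n}\, x(n)$.
- Convolve $x_k(n)$ with $\tilde{w} \triangleq \mathrm{Flip}(w)$ to get $X_m(\omega_k)$:
$$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} x_k(n)\, \tilde{w}(m-n) = (x_k * \tilde{w})(m) \qquad (9.3)$$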
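The two operations listed above can be sketched as follows. This is only an illustration under stated assumptions: channel frequencies $\omega_k = 2\pi k/N$, hop size 1, an odd-length window stored zero-centered, and a signal that is zero outside its samples. The function name `stft_fbs_view` is hypothetical.

```python
import numpy as np

def stft_fbs_view(x, w, N):
    """Sliding STFT computed channel by channel (filter-bank view).

    For each channel k: heterodyne x by exp(-j w_k n), then convolve
    with the flipped window. Assumptions: w has odd length M and is
    stored zero-centered, w[i] = w(i - h) with h = (M - 1) // 2;
    x is zero outside 0..len(x)-1.
    """
    M = len(w)
    h = (M - 1) // 2
    L = len(x)
    n = np.arange(L)
    w_flip = w[::-1]                       # Flip(w)(n) = w(-n), same storage convention
    X = np.empty((L, N), dtype=complex)
    for k in range(N):
        xk = x * np.exp(-2j * np.pi * k * n / N)   # step 1: shift w_k down to 0
        yk = np.convolve(xk, w_flip)               # step 2: lowpass filter (full convolution)
        # The array storage delays the zero-centered filter by h samples; undo that.
        X[:, k] = yk[h:h + L]
    return X

# Usage: compare one value against the direct STFT sum for an asymmetric window.
x = np.arange(1.0, 9.0)
w = np.array([0.1, 0.4, 1.0, 0.7, 0.2])
N, m, k, h = 8, 3, 2, 2
direct = sum(x[n] * w[n - m + h] * np.exp(-2j * np.pi * k * n / N)
             for n in range(len(x)) if 0 <= n - m + h < len(w))
print(np.isclose(stft_fbs_view(x, w, N)[m, k], direct))
```

Column `X[:, k]` is the time-domain output of channel `k`, so the same array that the OLA view reads row by row is here produced column by column.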
Note that the STFT analysis window $w$ is now interpreted as (the flip of) a lowpass-filter impulse response. Since the analysis window $w$ in the STFT is typically symmetric, we usually have $\mathrm{Flip}(w) = w$. This filter is effectively frequency-shifted to provide each channel bandpass filter. If the cut-off frequency of the window transform is $\omega_c$ (typically half a main-lobe width), then each channel signal can be downsampled significantly. This downsampling factor is the FBS counterpart of the hop size $R$ in the OLA context.
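As a rough worked example of how large the downsampling factor can be, suppose (an idealization not made in the text above) that each channel signal is strictly band-limited to $|\omega| \le \omega_c$, so that side-lobe energy is ignored. Downsampling by $R$ reduces the sampling rate from $2\pi$ to $2\pi/R$ radians per original sample, and the complex channel signal occupies a band of width $2\omega_c$. The two windows below are chosen for illustration, using their standard main-lobe widths for length $M$.

```latex
\frac{2\pi}{R} \;\ge\; 2\omega_c
\quad\Longrightarrow\quad
R \;\le\; \frac{\pi}{\omega_c}

% Rectangular window: main-lobe width 4\pi/M, so \omega_c = 2\pi/M
R \;\le\; \frac{\pi}{2\pi/M} \;=\; \frac{M}{2}

% Hann window: main-lobe width 8\pi/M, so \omega_c = 4\pi/M
R \;\le\; \frac{\pi}{4\pi/M} \;=\; \frac{M}{4}
```

In practice the side lobes lie outside $\pm\omega_c$ and alias when the channel is downsampled, so these bounds are approximate.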
Figure 9.3 illustrates the filter-bank interpretation for $R=1$ (the ``sliding STFT''). The input signal is frequency-shifted by a different amount for each channel and lowpass filtered by the (flipped) window.
FBS and Perfect Reconstruction
An important property of the STFT established in Chapter 8 is that it is exactly invertible when the analysis window satisfies the constant-overlap-add constraint. That is, neglecting numerical round-off error, the inverse STFT reproduces the original input signal exactly. This is called the perfect reconstruction property of the STFT, and modern filter banks are usually designed with this property in mind [287].
In the OLA processors of Chapter 8, perfect reconstruction was assured by using FFT analysis windows having the Constant-Overlap-Add (COLA) property at the particular hop-size used (see §8.2.1).
In the Filter Bank Summation (FBS) interpretation of the STFT (Eq. (9.1)), it is the analysis filter-bank frequency responses that are constrained to be COLA. We will take a look at this more closely below.
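A small numerical check of filter-bank summation can make this concrete. The sketch below assumes hop size 1, channel frequencies $\omega_k = 2\pi k/N$, and a zero-centered window no longer than $N$ samples. Under those assumptions, remodulating each baseband channel by $e^{j\omega_k m}$ and summing over $k$ gives $N\,w(0)\,x(m)$, so dividing by $N\,w(0)$ recovers the input. The function names are hypothetical, and the resynthesis formula is the standard FBS one rather than something derived in this section.

```python
import numpy as np

def stft_hop1(x, w, N):
    """Sliding STFT X[m, k] = sum_n x(n) w(n - m) exp(-j 2 pi k n / N).

    Assumes w has odd length and is stored zero-centered (w[h] = w(0)).
    """
    h = (len(w) - 1) // 2
    n = np.arange(len(x))
    X = np.empty((len(x), N), dtype=complex)
    for k in range(N):
        xk = x * np.exp(-2j * np.pi * k * n / N)          # heterodyne channel k
        X[:, k] = np.convolve(xk, w[::-1])[h:h + len(x)]  # lowpass by flipped window
    return X

def fbs_resynthesis(X, w0):
    """Remodulate each channel back to its center frequency and sum.

    Assumes the window length does not exceed N, so that only the
    n = m term survives the sum over channels. w0 is w(0).
    """
    L, N = X.shape
    m = np.arange(L)[:, None]
    k = np.arange(N)[None, :]
    return (X * np.exp(2j * np.pi * k * m / N)).sum(axis=1) / (N * w0)

# Usage: a length-9 Hann window with N = 16 channels.
rng = np.random.default_rng(1)
x = rng.standard_normal(32)
w = np.hanning(9)
y = fbs_resynthesis(stft_hop1(x, w, 16), w[4])
print(np.allclose(y, x))
```

The frequency-domain counterpart of this check is that the shifted window transforms $W(\omega - \omega_k)$ sum to a constant over $k$, which is the sense in which the channel frequency responses are COLA.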