Filter-Bank Summation (FBS) Interpretation of the STFT
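We can group the terms in the STFT definition differently to obtain the filter-bank interpretation:

$$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} \underbrace{\left[x(n)\,e^{-j\omega_k n}\right]}_{x_k(n)} w(n-m) = \left[x_k * \mathrm{Flip}(w)\right](m), \qquad \mathrm{Flip}(w)(n) = w(-n)$$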

As will be explained further below (and illustrated in Figures 9.3, 9.4, and 9.5), under the filter-bank interpretation, the spectrum of $x$ is first rotated along the unit circle in the $z$ plane so as to shift frequency $\omega_k$ down to $0$ (via modulation by $e^{-j\omega_k n}$ in the time domain), thus forming the heterodyned signal $x_k(n) = e^{-j\omega_k n}x(n)$. Next, the heterodyned signal $x_k(n)$ is lowpass-filtered to a narrow band about frequency $0$ (via convolving with the time-reversed window $\mathrm{Flip}(w)$). The STFT is thus interpreted as a frequency-ordered collection of narrow-band time-domain signals, as depicted in Fig. 9.2. In other words, the STFT can be seen as a uniform filter bank in which the input signal $x(n)$ is converted to a set of $N$ time-domain output signals $X_m(\omega_k)$, $k=0,1,\ldots,N-1$, one for each channel of the $N$-channel filter bank.

Figure 9.2: Filter Bank Summation (FBS) view of the STFT
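
Expanding on the previous paragraph, the STFT (9.2) is computed by the following operations:
- Frequency-shift $x(n)$ by $-\omega_k$ to get $x_k(n) = e^{-j\omega_k n}x(n)$.
- Convolve $x_k(n)$ with $\tilde{w} = \mathrm{Flip}(w)$ to get $X_m(\omega_k)$:

$$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} x_k(n)\,\tilde{w}(m-n) = (x_k * \tilde{w})(m) \qquad (10.3)$$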
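
The two operations above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function names `fbs_channel` and `stft_direct` are hypothetical, the window is assumed to have odd length $2L+1$ with its center sample taken as time $0$, and the input is assumed to be at least as long as the window. The direct STFT sum is included only to check that the heterodyne-then-convolve ordering gives the same numbers.

```python
import numpy as np

def fbs_channel(x, w, k, N):
    """k-th STFT channel computed as heterodyne followed by lowpass filtering.

    Assumption: w has odd length 2*L + 1 and w[L] corresponds to time 0.
    """
    n = np.arange(len(x))
    omega_k = 2 * np.pi * k / N
    x_k = x * np.exp(-1j * omega_k * n)   # step 1: shift omega_k down to dc
    w_flip = w[::-1]                      # Flip(w)(n) = w(-n)
    # step 2: convolve; mode="same" keeps output index m aligned with input time
    return np.convolve(x_k, w_flip, mode="same")

def stft_direct(x, w, k, N, m):
    """Direct sum X_m(omega_k) = sum_n x(n) w(n - m) exp(-j omega_k n)."""
    L = (len(w) - 1) // 2
    omega_k = 2 * np.pi * k / N
    total = 0.0 + 0.0j
    for n in range(max(0, m - L), min(len(x), m + L + 1)):
        total += x[n] * w[n - m + L] * np.exp(-1j * omega_k * n)
    return total

# Small check with an asymmetric (random) window so the flip actually matters.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
w = rng.standard_normal(9)
y = fbs_channel(x, w, k=3, N=8)
max_err = max(abs(y[m] - stft_direct(x, w, 3, 8, m)) for m in (0, 5, 31, 63))
```

Using a deliberately asymmetric test window is a design choice here: with a symmetric window, forgetting the flip would go unnoticed.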

The STFT output signal $X_m(\omega_k)$ is regarded as a time-domain signal (time index $m$) coming out of the $k$th channel of an $N$-channel filter bank. The center frequency of the $k$th channel filter is $\omega_k = 2\pi k/N$, $k=0,1,\ldots,N-1$. Each channel output signal is a baseband signal; that is, it is centered about dc, with the ``carrier term'' $e^{j\omega_k m}$ taken out by ``demodulation'' (frequency-shifting). In particular, the $k$th channel signal is constant whenever the input signal happens to be a sinusoid tuned to frequency $\omega_k$ exactly.
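
The constant-output property can be checked numerically. The sketch below assumes a complex sinusoid (a real sinusoid would also contribute a negative-frequency component), a Hann window from `np.hanning`, and arbitrary illustrative values for the channel count, channel index, and window length. After heterodyning, the input becomes a dc signal, so every fully overlapped output sample should equal the sum of the window samples.

```python
import numpy as np

N = 16                      # number of channels (illustrative value)
k = 3                       # channel under test (illustrative value)
M = 33                      # window length (illustrative value)
w = np.hanning(M)           # symmetric lowpass window
n = np.arange(256)
omega_k = 2 * np.pi * k / N

x = np.exp(1j * omega_k * n)          # complex sinusoid tuned exactly to omega_k
x_k = x * np.exp(-1j * omega_k * n)   # heterodyne: the carrier is removed

# mode="valid" keeps only samples where the window fully overlaps the signal,
# which avoids the start-up and ending transients.
y = np.convolve(x_k, w[::-1], mode="valid")

is_constant = np.allclose(y, w.sum())
```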

Note that the STFT analysis window $w$ is now interpreted as (the flip of) a lowpass-filter impulse response. Since the analysis window $w$ in the STFT is typically symmetric, we usually have $\mathrm{Flip}(w) = w$. This filter is effectively frequency-shifted to provide the bandpass filter for each channel. If the cut-off frequency of the window transform is $\omega_c$ (typically half a main-lobe width), then each channel signal is approximately bandlimited to $|\omega| \le \omega_c$ and can be downsampled significantly (by a factor of up to roughly $\pi/\omega_c$ before aliasing becomes significant). This downsampling factor is the FBS counterpart of the hop size $R$ in the OLA context.
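
To illustrate the correspondence between downsampling and hop size, the following sketch computes one channel at full rate, keeps every $R$th sample, and compares the result with the STFT sum evaluated only at window positions $m = 0, R, 2R, \ldots$. The helper names `channel` and `stft_at` and the values of $N$, $k$, and $R$ are assumptions made for this example; no attempt is made here to check that $R$ is small enough to avoid aliasing.

```python
import numpy as np

def channel(x, w, k, N):
    """Full-rate k-th channel: heterodyne, then convolve with the flipped window."""
    n = np.arange(len(x))
    x_k = x * np.exp(-2j * np.pi * k * n / N)
    return np.convolve(x_k, w[::-1], mode="same")

rng = np.random.default_rng(1)
x = rng.standard_normal(128)
w = np.hanning(17)               # odd length; w[L] is treated as time 0
N, k, R = 16, 2, 4               # illustrative values
L = (len(w) - 1) // 2

y_full = channel(x, w, k, N)     # one output sample per input sample
y_down = y_full[::R]             # downsample the channel signal by R

def stft_at(m):
    """STFT sum for a window centered at time m (truncated at the signal edges)."""
    idx = np.arange(max(0, m - L), min(len(x), m + L + 1))
    return np.sum(x[idx] * w[idx - m + L] * np.exp(-2j * np.pi * k * idx / N))

hopped = np.array([stft_at(m) for m in range(0, len(x), R)])
same = np.allclose(y_down, hopped)
```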

Figure 9.3 illustrates the filter-bank interpretation for $R=1$ (the ``sliding STFT''). The input signal $x(n)$ is frequency-shifted by a different amount for each channel and lowpass filtered by the (flipped) window.
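
A sliding ($R=1$) filter bank with all $N$ channels might be sketched as below. The function name `sliding_stft_fbs` is hypothetical, and the loop over channels is written for clarity rather than speed. The test signal is a complex sinusoid placed at the center frequency of channel 5, so that channel is expected to carry the largest output magnitude away from the signal edges.

```python
import numpy as np

def sliding_stft_fbs(x, w, N):
    """Return an (N, len(x)) array whose row k is channel k of the R = 1 filter bank.

    Assumption: w has odd length and its center sample corresponds to time 0.
    """
    n = np.arange(len(x))
    k = np.arange(N)[:, None]                    # column vector of channel indices
    carriers = np.exp(-2j * np.pi * k * n / N)   # a different frequency shift per channel
    shifted = carriers * x                       # row k holds x_k(n)
    w_flip = w[::-1]                             # flipped window = lowpass impulse response
    return np.array([np.convolve(row, w_flip, mode="same") for row in shifted])

N = 32                                           # illustrative channel count
n = np.arange(256)
x = np.exp(2j * np.pi * 5 * n / N)               # sinusoid at the channel-5 center frequency
w = np.hanning(65)

X = sliding_stft_fbs(x, w, N)
peak_channel = int(np.argmax(np.abs(X[:, 128])))  # strongest channel at mid-signal
```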