Filter-Bank Summation (FBS) Interpretation of the STFT
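We can group the terms in the STFT definition differently to obtain
the filter-bank interpretation:
$$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} \left[x(n)\,e^{-j\omega_k n}\right] w(n-m) \qquad (9.2)$$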
As will be explained further below (and illustrated further in
Figures 9.3, 9.4, and 9.5), under the filter-bank interpretation, the
spectrum of $x$ is first rotated along the unit circle in the $z$
plane so as to shift frequency $\omega_k$ down to 0 (via modulation by
$e^{-j\omega_k n}$ in the time domain), thus forming the heterodyned
signal $x_k(n) \triangleq x(n)\,e^{-j\omega_k n}$. Next, the
heterodyned signal $x_k(n)$ is lowpass-filtered to a narrow band about
frequency 0 (via convolving with the time-reversed window
$\mathrm{Flip}(w)$). The STFT is thus interpreted as a
frequency-ordered collection of narrow-band time-domain signals, as
depicted in Fig. 9.2. In other words, the STFT can be seen as a
uniform filter bank in which the input signal $x(n)$ is converted to a
set of $N$ time-domain output signals $X_m(\omega_k)$,
$k=0,1,\ldots,N-1$, one for each channel of the $N$-channel filter
bank.
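The heterodyne-then-lowpass description above can be checked numerically. The following is a minimal NumPy sketch, not code from the text: the function names `fbs_channel` and `stft_direct` are hypothetical, the window is assumed to be causal (supported on $n = 0,\ldots,M-1$), and the signal is treated as zero outside its recorded samples.

```python
import numpy as np

def fbs_channel(x, w, k, N):
    """Channel k of the STFT computed as heterodyne + lowpass filter.

    Returns X_m(w_k) for m = 0, ..., len(x)-1. Assumes (for this sketch)
    that w is supported on 0..M-1 and x is zero outside 0..len(x)-1.
    """
    n = np.arange(len(x))
    wk = 2 * np.pi * k / N
    xk = x * np.exp(-1j * wk * n)        # heterodyne: shift w_k down to dc
    # sum_n xk[n] w[n-m] is a convolution of xk with the flipped window.
    full = np.convolve(xk, w[::-1])
    M = len(w)
    return full[M - 1:]                  # full-convolution index M-1+m <-> time m

def stft_direct(x, w, k, N, m):
    """Direct evaluation of sum_n x(n) w(n-m) exp(-j w_k n)."""
    n = np.arange(m, min(m + len(w), len(x)))
    return np.sum(x[n] * w[n - m] * np.exp(-1j * 2 * np.pi * k / N * n))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)              # arbitrary test signal
w = np.hanning(16)                       # arbitrary lowpass window
N, k = 16, 3
Xk = fbs_channel(x, w, k, N)
direct = np.array([stft_direct(x, w, k, N, m) for m in range(len(x))])
print(np.allclose(Xk, direct))
```

The comparison confirms that the two groupings of terms give the same numbers; only the order of operations (modulate first, then filter) differs.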
Figure 9.2: Filter Bank Summation (FBS) view of the STFT
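Expanding on the previous paragraph, the STFT (9.2) is
computed by the following operations:
- Frequency-shift $x(n)$ by $-\omega_k$ to get
  $x_k(n) \triangleq e^{-j\omega_k n}\,x(n)$.
- Convolve $x_k(n)$ with $\tilde{w} \triangleq \mathrm{Flip}(w)$,
  where $\tilde{w}(n) = w(-n)$, to get $X_m(\omega_k)$:
  $$X_m(\omega_k) = \sum_{n=-\infty}^{\infty} x_k(n)\,\tilde{w}(m-n) = (x_k \ast \tilde{w})(m)$$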
The STFT output signal $X_m(\omega_k)$ is regarded as a time-domain
signal (time index $m$) coming out of the $k$th channel of an
$N$-channel filter bank. The center frequency of the $k$th channel
filter is $\omega_k = 2\pi k/N$, $k=0,1,\ldots,N-1$. Each channel
output signal is a baseband signal; that is, it is centered about dc,
with the "carrier term" $e^{j\omega_k m}$ taken out by "demodulation"
(frequency-shifting). In particular, the $k$th channel signal is
constant whenever the input signal happens to be a sinusoid tuned to
frequency $\omega_k$ exactly.
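The constant-output property can be illustrated with a short NumPy sketch. The parameter values are arbitrary, a complex sinusoid is assumed (so there is no negative-frequency component to consider), and a length-$N$ periodic Hann window is an illustrative choice rather than anything prescribed by the text.

```python
import numpy as np

N, k = 16, 3
wk = 2 * np.pi * k / N                   # channel center frequency
n = np.arange(256)
amp, phase = 0.5, 0.7                    # arbitrary test values
x = amp * np.exp(1j * (wk * n + phase))  # complex sinusoid tuned exactly to w_k

w = np.hanning(N + 1)[:-1]               # periodic Hann window of length N
xk = x * np.exp(-1j * wk * n)            # heterodyne: the carrier is removed
# mode="valid" keeps only the times m at which the window lies fully inside x.
Xk = np.convolve(xk, w[::-1], mode="valid")

print(np.allclose(Xk, Xk[0]))                                  # constant in m
print(np.allclose(Xk[0], amp * np.exp(1j * phase) * w.sum()))  # dc gain of w
```

After demodulation the channel input is the constant $A e^{j\phi}$, so the lowpass filter simply scales it by its dc gain $\sum_n w(n)$.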
Note that the STFT analysis window $w$ is now interpreted as (the flip
of) a lowpass-filter impulse response. Since the analysis window $w$
in the STFT is typically symmetric, we usually have
$\mathrm{Flip}(w) = w$.
This filter is effectively frequency-shifted to provide each channel
bandpass filter. If the cut-off frequency of the window transform is
$\omega_c$ (typically half a main-lobe width), then each channel
signal can be downsampled significantly. This downsampling factor is
the FBS counterpart of the hop size $R$ in the OLA context.
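The correspondence between downsampling and hop size can be sketched as follows. This is an illustrative check under stated assumptions: a causal window of length $M = N$, arbitrary values for $R$ and $k$, and the STFT phase referenced to absolute time $n$ as in (9.2), which is why each FFT frame starting at time $m$ is multiplied by $e^{-j\omega_k m}$ before comparison. The sketch does not address whether a given $R$ avoids aliasing; that depends on $\omega_c$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(128)             # arbitrary test signal
M = N = 16                               # window length and channel count (assumed equal)
R = 4                                    # downsampling factor / hop size
k = 5                                    # channel under test
w = np.hanning(M)
wk = 2 * np.pi * k / N
n = np.arange(len(x))

# FBS view: run channel k at the full rate, then keep every R-th sample.
chan = np.convolve(x * np.exp(-1j * wk * n), w[::-1], mode="valid")
chan_ds = chan[::R]

# Hop-size view: window a frame every R samples and take bin k of its FFT.
starts = np.arange(0, len(x) - M + 1, R)
frames = np.array([
    np.fft.fft(x[m:m + M] * w, N)[k] * np.exp(-1j * wk * m)
    for m in starts
])

print(np.allclose(chan_ds, frames))
```

Keeping every $R$th sample of the channel signal yields the same values as advancing the analysis frame by $R$ samples between transforms.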
Figure 9.3 illustrates the filter-bank interpretation for $R=1$
(the "sliding STFT"). The input signal $x(n)$ is frequency-shifted
by a different amount for each channel and lowpass filtered by the
(flipped) window.
written by Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at
Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See
http://ccrma.stanford.edu/~jos/ for details.