Two Dual Interpretations of the STFT
The STFT $X_m(\omega_k)$ can be viewed as a function of either
frame-time $m$ or bin-frequency $k$. We will develop both points of
view in this book.

At each frame time $m$, the STFT can be regarded as producing a
Fourier transform centered around that time. As $m$ advances, a
sequence of spectral transforms is obtained. This is depicted
graphically in Fig.9.1, and it forms the basis of the overlap-add
method for Fourier analysis, modification, and resynthesis [9]. It is
also the basis for transform coders [16,284].
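
The frame-time view can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy; the function name `stft_frames`, the parameter names, and the test signal are hypothetical and not taken from the text. It also assumes a window that starts at the beginning of each frame and takes each DFT with its time origin at the frame start, which differs from an absolute-time STFT definition only by a linear phase term per frame.

```python
import numpy as np

def stft_frames(x, w, R, N):
    """Return X[m, k]: the length-N DFT of the windowed frame starting at sample m*R."""
    M = len(w)                             # window length
    num_frames = 1 + (len(x) - M) // R     # only frames that fit entirely inside x
    X = np.empty((num_frames, N), dtype=complex)
    for m in range(num_frames):
        frame = x[m * R : m * R + M] * w   # window the signal around frame time m
        X[m] = np.fft.fft(frame, N)        # one spectrum per frame (zero-padded if N > M)
    return X

# Illustrative input: a sinusoid centered on bin 8 of a 64-point DFT.
N = 64
n = np.arange(512)
x = np.cos(2 * np.pi * 8 * n / N)
w = np.hanning(N)
X = stft_frames(x, w, R=16, N=N)

# Location of the spectral peak (positive frequencies only) in each frame.
peak_bins = np.argmax(np.abs(X[:, : N // 2]), axis=1)
print(X.shape, peak_bins[:4])
```

Each row of `X` is one spectral transform, and the rows advance with the frame index. For this stationary test signal the peak stays in the same bin from frame to frame.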

In an exact Fourier duality, each bin $X_m(\omega_k)$ of the STFT can
be regarded as a sample of the complex signal at the output of a
lowpass filter whose input is $x(n)e^{-j\omega_k n}$. As discussed
in §9.1.2, this signal is obtained from $x(n)$ by frequency-shifting
it so that frequency $\omega_k$ is translated down to 0 Hz. For each
value of $k$, the time-domain signal $X_m(\omega_k)$, for
$m=\ldots,-2,-1,0,1,2,\ldots$, is the output of the $k$th "filter
bank channel," for $k=0,1,\ldots,N-1$. In this "filter bank"
interpretation, the hop size $R$ can be interpreted as the
downsampling factor applied to each bin-filter output, and the
analysis window $w(\cdot)$ is seen as the impulse response of the
anti-aliasing filter used prior to downsampling. The window transform
$W(\omega)$ is also the frequency response of each channel filter
(translated to dc). This point of view is depicted graphically in
Fig.9.2 and elaborated further in Chapter 9.
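
The equivalence of the two views can be checked numerically. The sketch below is illustrative only and rests on several assumptions not stated in the text: NumPy, a causal window of length M, the absolute-time STFT convention (the sum over n of x(n) w(n - mR) exp(-j ω_k n)), and hypothetical variable names throughout. Under that convention, the channel filter's impulse response is the time-reversed window, which is why the code flips `w` before convolving (a no-op for the symmetric Hann window used here).

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, R = 32, 32, 8                  # window length, DFT size, hop size (example values)
x = rng.standard_normal(256)         # arbitrary real test signal
w = np.hanning(M)                    # analysis window, causal: support 0..M-1
k = 5                                # bin (channel) to examine
wk = 2 * np.pi * k / N               # bin center frequency in rad/sample
n = np.arange(len(x))

num_frames = 1 + (len(x) - M) // R
m = np.arange(num_frames)

# Filter-bank view: shift frequency wk down to dc, lowpass filter with the
# flipped window, then keep every R-th output sample (downsample by R).
xk = x * np.exp(-1j * wk * n)
y = np.convolve(xk, w[::-1])         # full convolution; y[i] corresponds to time i - (M - 1)
channel = y[m * R + M - 1]           # filter output at times 0, R, 2R, ...

# Fourier-transform view: FFT of each windowed frame, with a phase factor
# that moves the time origin from the frame start to absolute time.
fft_view = np.array([np.fft.fft(x[i * R : i * R + M] * w, N)[k] for i in m])
fft_view = fft_view * np.exp(-1j * wk * m * R)

same = np.allclose(channel, fft_view)
print(same)
```

The two arrays hold the same complex sequence: bin `k` read across frames is the downsampled output of channel `k`.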