Summary of STFT Computation Using FFTs
- Read $M$ samples of the input signal $x$ into a local buffer of length $N \ge M$ which is initially zeroed:
$$\tilde{x}_m(n) \triangleq x(n + mR), \qquad n = -M_h, \ldots, M_h.$$
We call $x_m$ the $m$th frame of the input signal, and $\tilde{x}_m$ the $m$th time-normalized input frame (time-normalized by translating it to time zero). The frame length is $M \triangleq 2M_h + 1$, which we assume to be odd for reasons to be discussed later. The time advance $R$ (in samples) from one frame to the next is called the hop size or step size.
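- Multiply the data frame pointwise by a length-$M$ spectrum analysis window $w$ to obtain the $m$th windowed data frame (time-normalized):
$$\tilde{x}^w_m(n) \triangleq \tilde{x}_m(n)\, w(n), \qquad n = -\frac{M-1}{2}, \ldots, \frac{M-1}{2}.$$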
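- Extend $\tilde{x}^w_m$ with zeros on both sides to obtain a zero-padded frame:
$$\tilde{x}^{w\prime}_m(n) \triangleq \begin{cases} \tilde{x}^w_m(n), & |n| \le \frac{M-1}{2} \\ 0, & \frac{M-1}{2} < n \le \frac{N}{2} - 1 \\ 0, & -\frac{N}{2} \le n < -\frac{M-1}{2} \end{cases} \tag{8.5}$$
where $N$ is chosen to be a power of two larger than $M$. The number $N/M$ is the zero-padding factor. As discussed in §2.5.3, the zero-padding factor is the interpolation factor for the spectrum, i.e., each FFT bin is replaced by $N/M$ bins, interpolating the spectrum using ideal bandlimited interpolation [264], where the "band" in this case is the $M$-sample nonzero duration of $\tilde{x}^w_m$ in the time domain.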
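- Take a length-$N$ FFT of $\tilde{x}^{w\prime}_m$ to obtain the time-normalized, frequency-sampled STFT at time $m$:
$$\tilde{X}^{w\prime}_m(\omega_k) \triangleq \sum_{n=-N/2}^{N/2-1} \tilde{x}^{w\prime}_m(n)\, e^{-j\omega_k n T} \tag{8.6}$$
where $\omega_k \triangleq 2\pi k f_s / N$, and $f_s = 1/T$ is the sampling rate in Hz. As in any FFT, we call $k$ the bin number.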
- If needed, time normalization may be removed using a linear phase term to yield the sampled STFT:
$$X^{w\prime}_m(\omega_k) = e^{-j\omega_k m R T}\, \tilde{X}^{w\prime}_m(\omega_k) \tag{8.7}$$
The (continuous-frequency) STFT may be approached arbitrarily closely by using more zero padding and/or other interpolation methods.
Note that there is no irreversible time-aliasing when the STFT frequency axis $\omega$ is sampled to the points $\omega_k$, provided the FFT size $N$ is greater than or equal to the window length $M$.
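The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `stft_frame` and its parameters are hypothetical, samples outside the signal are assumed to be zero, and the zero-padded frame is stored with time $n$ at buffer index $n \bmod N$, which gives the same result as the sum over $n = -N/2, \ldots, N/2 - 1$ because the FFT kernel is periodic in $n$ with period $N$.

```python
import numpy as np

def stft_frame(x, m, R, w, N, remove_time_normalization=True):
    """One frame of the sampled STFT (hypothetical helper, steps 1-5).

    x: 1-D input signal, m: frame index, R: hop size in samples,
    w: analysis window of odd length M, N: FFT size with N >= M.
    """
    M = len(w)
    if M % 2 != 1 or N < M:
        raise ValueError("need odd window length M and FFT size N >= M")
    Mh = (M - 1) // 2

    # Step 1: time-normalized frame x(n + m*R) for n = -Mh..Mh.
    # Assumption: samples outside the signal are treated as zero.
    n = np.arange(-Mh, Mh + 1)
    idx = n + m * R
    valid = (idx >= 0) & (idx < len(x))
    frame = np.zeros(M, dtype=complex)
    frame[valid] = x[idx[valid]]

    # Step 2: pointwise multiplication by the analysis window.
    xw = frame * w

    # Step 3: zero-pad to length N. Time index n goes to buffer index
    # n mod N, so negative times wrap to the end of the buffer.
    buf = np.zeros(N, dtype=complex)
    buf[n % N] = xw

    # Step 4: length-N FFT gives the time-normalized STFT at bins k.
    X_tilde = np.fft.fft(buf)

    # Step 5: optional linear phase term exp(-j * 2*pi*k*m*R / N),
    # i.e. exp(-j * omega_k * m * R * T) with omega_k * T = 2*pi*k / N.
    if remove_time_normalization:
        k = np.arange(N)
        return np.exp(-2j * np.pi * k * m * R / N) * X_tilde
    return X_tilde


# Example usage with assumed parameters: M = 9, N = 16, hop R = 4.
signal = np.cos(2 * np.pi * 0.1 * np.arange(64))
X3 = stft_frame(signal, m=3, R=4, w=np.hamming(9), N=16)
```

Applying an inverse FFT to the time-normalized output and reading back the indices `n % N` returns the windowed frame, which is one way to see the absence of time-aliasing when $N \ge M$.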