Spectral Audio Signal Processing
Classic Spectrograms
The spectrogram is a basic tool in audio spectral analysis and
other applications. It has been used extensively in speech analysis
[49]. The spectrogram can be defined as an intensity plot
(usually on a log scale, such as dB) of the Short-Time Fourier
Transform (STFT) magnitude. As defined in the
previous section, the STFT is simply a sequence of FFTs of windowed
data segments, where the windows are allowed to overlap in time,
typically by at least 50%
[10]. Parameters of the spectrogram include the
- window length $M$,
- window type (Hamming, Kaiser, etc.),
- hop-size $R$, and
- FFT length $N$.
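As a minimal sketch of how these four parameters enter the computation, the following Python/NumPy code forms a dB spectrogram from overlapping windowed frames. The function name `spectrogram_db`, the symbols `M`, `R`, and `N` (window length, hop-size, and FFT length), the default values, the dB floor, and the 1 kHz test tone are all assumptions chosen for illustration; they are not taken from the text. The input is assumed to contain at least `M` samples.

```python
import numpy as np

def spectrogram_db(x, M=256, R=64, N=512, floor_db=-100.0):
    """STFT magnitude in dB relative to its peak, shape (num_frames, N//2 + 1).

    M: window length, R: hop-size, N: FFT length (N >= M), all in samples.
    Illustrative sketch only; assumes len(x) >= M.
    """
    n = np.arange(M)
    w = 0.54 - 0.46 * np.cos(2 * np.pi * n / M)    # periodic Hamming window
    num_frames = 1 + (len(x) - M) // R             # keep only complete frames
    frames = np.stack([x[m * R : m * R + M] * w for m in range(num_frames)])
    X = np.fft.rfft(frames, n=N, axis=1)           # zero-pads each frame to N
    mag = np.abs(X)
    ref = max(mag.max(), np.finfo(float).tiny)     # guard against all-zero input
    return 20 * np.log10(np.maximum(mag / ref, 10 ** (floor_db / 20)))

# Hypothetical test signal: one second of a 1 kHz sinusoid sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
S = spectrogram_db(x)   # rows are time frames, columns are frequency bins
```

With these assumed values, each row of `S` is one time frame and each column one frequency bin from 0 to `fs/2`; the 1 kHz tone falls on bin `1000 * N / fs = 64`. Displaying `S.T` as an image with time on the horizontal axis gives the usual spectrogram picture.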
As discussed in Chapter 1, the window length $M$ controls
frequency resolution, the window type controls side-lobe suppression
(at the expense of resolution when $M$ is fixed), and the FFT length
$N$ determines how much spectral oversampling (interpolation) is
provided. The new hop-size parameter $R$ determines how much
oversampling there will be along the time dimension. For $R=1$ (the
``sliding FFT''), there is no downsampling over time, so oversampling
is maximized. For a
periodic Hamming window, $R=M/4$ gives
maximum downsampling of the sliding FFT without time
aliasing.
Avoiding time aliasing corresponds to retaining ``robust perfect
reconstruction'' in the inverse STFT.
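One simple check related to reconstruction by overlap-add is whether the hop-shifted copies of the window sum to a constant. The sketch below tests this for a periodic Hamming window, taking the hop-size to be one quarter of the window length and comparing it with a larger hop. The helper names, the window length of 64, and the two hop values are assumptions for illustration. A constant overlap-add sum is only one condition associated with reconstruction; it does not by itself establish the stronger ``robust'' property mentioned above.

```python
import math

def periodic_hamming(M):
    """Periodic Hamming window of length M (hypothetical helper)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / M) for n in range(M)]

def overlap_add_profile(w, R):
    """Steady-state sum of the window shifted by every multiple of R.

    Entry r is the sum of all window samples whose index is congruent
    to r modulo R, which is what overlapping frames contribute at any
    output sample with that offset.
    """
    return [sum(w[r::R]) for r in range(R)]

def ripple(profile):
    """Peak-to-peak deviation of the overlap-add sum from a constant."""
    return max(profile) - min(profile)

M = 64                                                # assumed window length
w = periodic_hamming(M)

quarter_hop = overlap_add_profile(w, M // 4)          # 75% overlap
large_hop = overlap_add_profile(w, 3 * M // 4)        # 25% overlap

print("hop M/4 : level %.4f, ripple %.2e" % (quarter_hop[0], ripple(quarter_hop)))
print("hop 3M/4: ripple %.2e" % ripple(large_hop))
```

At a hop of `M/4` the four overlapping cosine terms cancel, leaving a constant level of `4 * 0.54 = 2.16` up to rounding error, while the larger hop leaves a clearly visible ripple in the sum.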
The spectrogram is an important representation of audio data because
human hearing is based on a kind of real-time spectrogram encoded by
the cochlea of the inner ear [183]. The spectrogram
has been used extensively in the field of computer music as a guide
during the development of sound synthesis algorithms. When working
with an appropriate synthesis model, matching the spectrogram often
corresponds to matching the sound extremely well. In fact,
spectral modeling synthesis (SMS) is based on synthesizing the
short-time spectrum directly by some means (see Chapter 7)
[276].
written by Julius Orion Smith III
Julius Smith's background is in electrical engineering (BS Rice 1975, PhD Stanford 1983). He is presently Professor of Music and Associate Professor (by courtesy) of Electrical Engineering at
Stanford's Center for Computer Research in Music and Acoustics (CCRMA), teaching courses and pursuing research related to signal processing applied to music and audio systems. See
http://ccrma.stanford.edu/~jos/ for details.