Classic spectrogram of speech sample.
An example spectrogram for recorded speech data
is shown in the accompanying figure. It was generated using the Matlab
code listed below. The function spectrogram
is listed elsewhere in this text. The spectrogram is computed
as a sequence of FFTs
of windowed data segments, and it is
plotted within spectrogram itself.
Matlab for computing a speech spectrogram.
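[y,fs] = audioread('SpeechSample.wav'); % older Matlab/Octave: wavread
soundsc(y,fs); % Let's hear it
% for classic look (white background, dark energy):
colormap('gray'); map = colormap; imap = flipud(map);
M = round(0.02*fs); % 20 ms window is typical
N = 2^nextpow2(4*M); % zero padding for interpolation
w = hamming(M); % length-M Hamming window
% The spectrogram function is called at this point with y, N, fs, and w;
% its argument list is given with its listing and is not reproduced here.
colormap(imap); % apply the inverted gray map to the plot
title('Speech Sample Spectrogram');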
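The computation described above (a sequence of FFTs of windowed data segments) can be sketched directly in a few lines. The following is a minimal illustration, not the spectrogram function referred to in the text. It assumes a synthetic 1 kHz test tone in place of the speech file, a sampling rate of 8 kHz, and a hop size of M/8 samples; the names R, X, XdB, and fpeak are hypothetical. The Hamming window is written out by formula so that no toolbox is needed.

```matlab
% Minimal STFT sketch under the assumptions stated above
fs = 8000;                        % assumed sampling rate (Hz)
t  = (0:fs-1)'/fs;                % one second of sample times
y  = cos(2*pi*1000*t);            % synthetic 1 kHz tone standing in for speech
M  = round(0.02*fs);              % 20 ms window length in samples
N  = 2^nextpow2(4*M);             % zero-padded FFT size
R  = floor(M/8);                  % assumed hop size between frames (samples)
w  = 0.54 - 0.46*cos(2*pi*(0:M-1)'/(M-1));  % Hamming window by formula
nframes = 1 + floor((length(y)-M)/R);       % number of complete frames
X  = zeros(N/2+1, nframes);       % keep nonnegative-frequency bins only
for m = 1:nframes
  seg = y((m-1)*R + (1:M)') .* w; % windowed data segment
  S   = fft(seg, N);              % zero-padded FFT of the segment
  X(:,m) = S(1:N/2+1);
end
XdB = 20*log10(abs(X) + eps);     % magnitude in dB, one column per frame
f   = (0:N/2)'*fs/N;              % bin center frequencies (Hz)
[~, kpeak] = max(abs(X(:,1)));    % strongest bin in the first frame
fpeak = f(kpeak);                 % lies near the 1 kHz test frequency
```

To display the result, XdB can be passed to imagesc with the frame times and f as axes, followed by axis xy so that frequency increases upward.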
In this example, the Hamming window
length was chosen to be 20 ms, a
common choice in speech analysis. This is short enough that any
single 20 ms frame will typically contain data from only one phoneme,
yet long enough that it will include at least two periods
of the fundamental frequency during voiced speech, assuming the lowest voiced
pitch to be around 100 Hz.
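The arithmetic behind the two-period claim can be checked with a short calculation. This is only an illustration; the 8 kHz sampling rate and the two higher example pitches are assumptions, and the variable names are hypothetical.

```matlab
fs = 8000;               % assumed sampling rate (Hz), for illustration only
Tw = 0.02;               % window duration: 20 ms
f0 = [100 200 400];      % example pitches (Hz); 100 Hz is the lowest assumed
M  = round(Tw*fs);       % window length in samples
periods = M*f0/fs;       % pitch periods spanned by one window, per pitch
```

At the lowest pitch of 100 Hz the period is 10 ms, so a 20 ms window spans exactly two periods; higher pitches fit proportionally more periods into the same window.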
More generally, for speech and the singing voice (and any periodic
tone), the STFT
analysis parameters are chosen to trade off among the
following conflicting criteria:
- The harmonics should be resolved.
- Pitch and formant variations should be closely followed.
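The conflict between these criteria can be made concrete with a rough calculation. The sketch below uses one common rule of thumb, assumed here rather than taken from the text: harmonics are considered resolved when the main lobes of adjacent harmonics do not overlap, and the main lobe of a length-M Hamming window is approximately 4 fs/M Hz wide, null to null. The sampling rate and example pitches are also assumptions.

```matlab
fs = 8000;                % assumed sampling rate (Hz)
f0 = [100 200 400];       % example pitches (Hz)
M  = round(0.02*fs);      % 20 ms Hamming window, in samples
Bw = 4*fs/M;              % approximate main-lobe width, null to null (Hz)
resolved = (Bw <= f0);    % rule of thumb: adjacent main lobes do not overlap
Mmin = ceil(4*fs./f0);    % shortest window meeting the rule, per pitch
Tmin_ms = 1000*Mmin/fs;   % the same lengths in milliseconds
```

Under this rule a 20 ms window has a 200 Hz main lobe, so harmonics 100 Hz apart overlap, while separating them fully would take about 40 ms. A window that long follows rapid pitch and formant changes less closely, which is the trade-off the criteria describe.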
Formants in speech are the low-frequency resonances in the
vocal tract. They appear as dark groups of harmonics
in the spectrogram. The first two formants largely determine the
"vowel" in voiced speech. In telephone speech, nominally between
200 and 3200 Hz, only three or four formants are usually present in
that band.