Spectral Envelope Examples

This section presents Matlab code for computing spectral envelopes by the cepstral and linear-prediction methods discussed above. The signal to be modeled is a synthetic "ah" vowel (as in "father"), synthesized using three formants driven by a bandlimited impulse train [128].

Signal Synthesis

% Specify formant resonances for an "ah" [a] vowel:
F = [700, 1220, 2600]; % Formant frequencies in Hz
B = [130,   70,  160]; % Formant bandwidths in Hz

fs = 8192;  % Sampling rate in Hz
	    % ("telephone quality" for speed)
R = exp(-pi*B/fs);     % Pole radii
theta = 2*pi*F/fs;     % Pole angles
poles = R .* exp(j*theta);
[B,A] = zp2tf(0,[poles,conj(poles)],1);

f0 = 200; % Fundamental frequency in Hz
w0T = 2*pi*f0/fs;

nharm = floor((fs/2)/f0); % number of harmonics
nsamps = fs;  % make a second's worth
sig = zeros(1,nsamps);
n = 0:(nsamps-1);
% Synthesize bandlimited impulse train:
for i=1:nharm,
    sig = sig + cos(i*w0T*n);
end;
sig = sig/max(sig);
soundsc(sig,fs); % Let's hear it

% Now compute the speech vowel:
speech = filter(1,A,sig);
soundsc([sig,speech],fs); % "buzz", "ahh"
% (it would sound much better with a little vibrato)
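
As a quick sanity check on the pole mapping used above, the following sketch rebuilds the filter denominator and locates the peaks of its magnitude response. It is only an illustration under stated assumptions: poly is used in place of zp2tf so that only core functions are needed, and the name Bw, the 1-Hz evaluation grid, and the peak-picking rule are choices made here, not part of the original listing.

```matlab
% Sketch: check that poles at radius exp(-pi*Bw/fs) and angle 2*pi*F/fs
% produce resonance peaks near the requested formant frequencies.
F  = [700, 1220, 2600];   % formant frequencies in Hz (from the text)
Bw = [130,   70,  160];   % formant bandwidths in Hz (named Bw here)
fs = 8192;                % sampling rate in Hz

R     = exp(-pi*Bw/fs);   % pole radii
theta = 2*pi*F/fs;        % pole angles in radians per sample
poles = R .* exp(1j*theta);
A     = real(poly([poles, conj(poles)]));  % denominator coefficients

% Evaluate |1/A| on the unit circle, on a 1-Hz grid from 0 to fs/2.
% (polyval uses positive powers of z, which leaves the magnitude unchanged.)
fgrid = 0:(fs/2);
z     = exp(1j*2*pi*fgrid/fs);
mag   = 1 ./ abs(polyval(A, z));

% Interior local maxima of the magnitude response:
mid    = mag(2:end-1);
ipk    = find(mid > mag(1:end-2) & mid >= mag(3:end)) + 1;
fpeaks = fgrid(ipk);
disp(fpeaks);             % peak frequencies in Hz
```

The peaks should land close to, but not exactly on, the nominal formant frequencies, since each resonance sits on the skirts of its neighbors.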

The Hamming-windowed bandlimited impulse train sig and its spectrum are plotted in Fig. 10.1.

Figure 10.1: Bandlimited impulse train.

Figure 10.2 shows the Hamming-windowed synthesized vowel speech, and its spectrum overlaid with the true formant envelope.

Figure 10.2: Synthetic vowel in time and frequency domains, with formant envelope overlaid.

Spectral Envelope by the Cepstral Windowing Method

We now compute the log-magnitude spectrum, perform an inverse FFT to obtain the real cepstrum, lowpass-window the cepstrum, and perform the FFT to obtain the smoothed log-magnitude spectrum:

Nframe = 2^nextpow2(fs/25); % next power of 2 above 40 ms: 512 samples (62.5 ms)
w = hamming(Nframe)';
winspeech = w .* speech(1:Nframe);
Nfft = 4*Nframe; % factor of 4 zero-padding
sspec = fft(winspeech,Nfft);
dbsspecfull = 20*log10(abs(sspec)); % dB magnitude (log10, not natural log)
rcep = ifft(dbsspecfull);  % real cepstrum
rcep = real(rcep); % eliminate round-off noise in imag part
period = round(fs/f0) % 41
nspec = Nfft/2+1;
aliasing = norm(rcep(nspec-10:nspec+10))/norm(rcep) % 0.02
nw = 2*period-4; % almost 1 period left and right
if floor(nw/2) == nw/2, nw=nw-1; end; % make it odd
w = boxcar(nw)'; % rectangular window
wzp = [w(((nw+1)/2):nw),zeros(1,Nfft-nw), ...
       w(1:(nw-1)/2)];  % zero-phase version
wrcep = wzp .* rcep;  % window the cepstrum ("lifter")
rcepenv = fft(wrcep); % spectral envelope
rcepenvp = real(rcepenv(1:nspec)); % should be real
rcepenvp = rcepenvp - mean(rcepenvp); % normalize to zero mean

Figure 10.3 shows the real cepstrum of the synthetic "ah" vowel (top) and the same cepstrum truncated to just under one period on each side of zero (bottom). In theory, this leaves only formant envelope information in the cepstrum. Figure 10.4 shows an overlay of the spectrum, true envelope, and cepstral envelope.

Figure 10.3: Real cepstrum (top) and windowed cepstrum (bottom).

Figure 10.4: Overlay of spectrum, true envelope, and cepstral envelope.

Instead of simply truncating the cepstrum (a rectangular windowing operation), we can window it more gracefully. Figure 10.5 shows the result of using a Hann window of the same length. The spectral envelope is smoother as a result.

Figure 10.5: Overlay of spectrum, true envelope, and cepstral envelope (Hann-windowed cepstrum).
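
The effect of the lifter shape can be sketched in a few lines. The example below is a self-contained illustration, not the code used for the figures: it re-synthesizes the vowel with core functions only, builds the Hamming window and both lifters by formula, works in dB (20*log10), and uses a "roughness" measure (energy of the circular second difference of the envelope) that is an assumption introduced here to quantify smoothness.

```matlab
% Sketch: rectangular versus Hann lifter applied to one real cepstrum.
fs = 8192; f0 = 200;                        % values from the text
F  = [700, 1220, 2600]; Bw = [130, 70, 160];
p  = exp(-pi*Bw/fs) .* exp(1j*2*pi*F/fs);   % formant poles
A  = real(poly([p, conj(p)]));              % all-pole denominator

n     = 0:(fs-1);
nharm = floor((fs/2)/f0);
sig   = zeros(1,fs);
for k = 1:nharm
  sig = sig + cos(k*2*pi*f0/fs*n);          % bandlimited impulse train
end
speech = filter(1, A, sig);

Nframe = 512; Nfft = 4*Nframe;
w    = 0.54 - 0.46*cos(2*pi*(0:Nframe-1)/(Nframe-1));  % Hamming window
X    = fft(w .* speech(1:Nframe), Nfft);
rcep = real(ifft(20*log10(abs(X))));        % real cepstrum of dB spectrum

% Zero-phase lifters of odd length nw, just under one period per side:
nw    = 2*round(fs/f0) - 5;
h     = (nw-1)/2;                           % one-sided width in samples
qd    = min(0:Nfft-1, Nfft-(0:Nfft-1));     % distance from quefrency zero
rect  = double(qd <= h);                    % rectangular lifter
hannl = rect .* (0.5 + 0.5*cos(pi*qd/(h+1)));  % Hann taper, same support

envRect = real(fft(rect  .* rcep));         % envelopes in dB
envHann = real(fft(hannl .* rcep));

% Roughness: energy of the circular second difference of the envelope.
rough = @(e) sum((e([2:end,1]) - 2*e + e([end,1:end-1])).^2);
fprintf('roughness: rect %.4g, Hann %.4g\n', rough(envRect), rough(envHann));
```

Because the Hann lifter never exceeds the rectangular one at any quefrency, it can only attenuate the higher-quefrency terms, which is why the resulting envelope comes out smoother by this measure.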

Spectral Envelope by Linear Prediction

Finally, let's compute an LPC envelope. It had better be good because the LPC (all-pole) model is exact for this example: the vowel was generated by a six-pole filter.

M = 6; % Assume three formants and no noise

% compute Mth-order autocorrelation function:
rx = zeros(1,M+1)';
for i=1:M+1,
  rx(i) = rx(i) + speech(1:nsamps-i+1) ...
                * speech(i:nsamps)';
end;

% prepare the M by M Toeplitz covariance matrix:
covmatrix = zeros(M,M);
for i=1:M,
  covmatrix(i,i:M) = rx(1:M-i+1)';
  covmatrix(i:M,i) = rx(1:M-i+1);
end;

% solve "normal equations" for prediction coeffs:

Acoeffs = - covmatrix \ rx(2:M+1)

Alp = [1,Acoeffs']; % LP polynomial A(z)

dbenvlp = 20*log10(abs(freqz(1,Alp,nspec)'));
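% NOTE: dbsspec, dbenv, and f are not defined in this listing. They are
% assumed to be the dB spectrum, a dB envelope, and the frequency axis
% (each of length nspec) left over from plotting code not shown here.
dbsspecn = dbsspec + ones(1,nspec)*(max(dbenvlp) ...
                   - max(dbsspec)); % normalize
plot(f,[max(dbsspecn,-100);dbenv;dbenvlp]); grid;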

Figure 10.6: Spectral envelope by linear prediction, overlaid on the spectrum.
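
To see in what sense the all-pole model is exact, the following sketch applies the same normal equations to the impulse response of the three-formant filter rather than to the pitched vowel. This is an illustrative variation, not part of the original example: the use of poly, toeplitz, and roots, and the names Bw, Rmat, Fest, and Best, are assumptions made here. On the vowel itself the estimates may differ somewhat, because the harmonics sample the envelope only every f0 Hz.

```matlab
% Sketch: autocorrelation-based LP on the impulse response of the
% three-formant filter, where the all-pole model holds exactly.
fs = 8192; M = 6;
F  = [700, 1220, 2600]; Bw = [130, 70, 160];  % from the text
p  = exp(-pi*Bw/fs) .* exp(1j*2*pi*F/fs);
A  = real(poly([p, conj(p)]));                % true A(z)

h  = filter(1, A, [1, zeros(1,fs-1)]);        % impulse response (fully decayed)

% Autocorrelation at lags 0..M:
rx = zeros(M+1,1);
for k = 0:M
  rx(k+1) = h(1:end-k) * h(1+k:end)';
end

Rmat = toeplitz(rx(1:M));                     % M-by-M Toeplitz matrix
Alp  = [1, (-Rmat \ rx(2:M+1))'];             % estimated LP polynomial

% Convert the roots of the estimate back to formant parameters:
r = roots(Alp);
r = r(imag(r) > 0);                           % one root per conjugate pair
[Fest, order] = sort(angle(r)' * fs/(2*pi));  % frequencies in Hz
Best = -log(abs(r(order)))' * fs/pi;          % bandwidths in Hz
disp([Fest; Best]);
```

The recovered frequencies and bandwidths should agree with F and Bw to within round-off, since the autocorrelation of an all-pole impulse response satisfies the normal equations exactly.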

Linear Prediction in Matlab and Octave

In the above example, we implemented essentially the covariance method of LP directly (the autocorrelation estimate was unbiased). The code should run in either Octave or Matlab with the Signal Processing Toolbox.

The Matlab Signal Processing Toolbox has the function lpc available. (LPC stands for ``Linear Predictive Coding.'')

The Octave-Forge lpc function (version 20071212) is a wrapper for the lattice function which implements Burg's method by default. Burg's method has the advantage of guaranteeing stability ($A(z)$ is minimum phase) while yielding accuracy comparable to the covariance method. By uncommenting lines in lpc.m, one can instead use the "geometric lattice" or classic autocorrelation method (called "Yule-Walker" in lpc.m). For details, run "type lpc".
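
For reference, the autocorrelation (Yule-Walker) normal equations are commonly solved with the Levinson-Durbin recursion, which also produces reflection coefficients as a by-product. The sketch below is a minimal illustration of that recursion and is not the code in lpc.m; the autocorrelation values in rx are hypothetical numbers chosen only to give a valid (positive-definite) example.

```matlab
% Sketch: Levinson-Durbin recursion for the autocorrelation (Yule-Walker)
% normal equations, checked against a direct Toeplitz solve.
rx = [1.0; 0.8; 0.5; 0.2];   % hypothetical autocorrelation, lags 0..3
M  = numel(rx) - 1;

a = 1;                       % order-0 prediction-error filter
E = rx(1);                   % order-0 prediction-error power
k = zeros(1,M);              % reflection coefficients
for m = 1:M
  k(m) = -(a * rx(m+1:-1:2)) / E;        % a is 1-by-m, slice is m-by-1
  a    = [a, 0] + k(m) * [0, fliplr(a)]; % order update of A(z)
  E    = E * (1 - k(m)^2);               % updated error power
end

% Direct solve of the same normal equations, for comparison:
adirect = [1, (-toeplitz(rx(1:M)) \ rx(2:M+1))'];
disp([a; adirect]);
```

When every reflection coefficient has magnitude less than one, as it does for a positive-definite autocorrelation sequence, the resulting $A(z)$ is minimum phase.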
