# Spectral envelope normalization (speech processing)

Started by February 13, 2014
Dear forum,

I created a synthetic speech signal x[n] of 256 samples, assuming that the source signal u[n] is an impulse train (not a glottal flow signal) and that there is no lip radiation. The vocal tract filter h[n] is an all-pole filter of order 10, so in the frequency domain we have X(w) = H(w)U(w).

I computed the PSD of X(w) with the modified periodogram and the PSD of H(w) with freqz, using the following MATLAB code:
N = 256; % signal length
T = 90; % pitch period
Fs = 8000; % sampling frequency
u = zeros(1,N); % source signal
u(1:T:N) = 1; % one impulse every T samples

% computing the vocal tract filter for the vowel /a/
F = [650, 1075, 2463, 3558, 4631]; % Formant frequencies (Hz)
% NOTE: 4631 Hz exceeds Fs/2 = 4000 Hz, so that pole pair aliases to 8000 - 4631 = 3369 Hz
BW = [94, 91, 107, 198, 89]; % Formant bandwidths (Hz)

poles = exp(-pi*BW/Fs) .* exp(1j*(2*pi*F/Fs)); % convert formants to poles

num = 1;
den = real(poly([poles,conj(poles)]));

% creating the speech signal x
x = filter(num,den,u);
x = x(:);
NFFT=N;

% Choose window

% win = rectwin(N);
win = hanning(N);
% win = hamming(N);
% win = blackman(N);

X = fft(win.*x,NFFT);
X = X(1:NFFT/2+1);
psdX = (1/(norm(win,2)^2)).*(abs(X).^2); % modified periodogram
psdX(2:end-1) = 2*psdX(2:end-1); % one-sided PSD: double every bin except DC and Nyquist
psdX = 10*log10(psdX); % in dB

f = linspace(0,Fs/2,NFFT/2+1);
psdH = freqz(num,den,f,Fs)./norm(win,2);
psdH = 20*log10(abs(psdH));
plot(f,psdX,'b'); % plot against frequency in Hz
hold on;
plot(f,psdH,'r');
The problem, as you can see, is that psdH (i.e. the spectral envelope) does not lie exactly on top of psdX. According to the book by T. F. Quatieri (Discrete-Time Speech Signal Processing: Principles and Practice), the spectral envelope should lie exactly on top of the periodogram of X(w).

I suspect that I have not normalized the PSD of H(w) correctly.

I have spent more than two days on this problem and cannot find the solution. Please let me know if I have made a mistake.

Kind regards,
Andreas

_____________________________________
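
One way to probe the normalization question raised in the post is sketched below. It is not a confirmed fix. It assumes the impulse train can be treated as periodic and in steady state, and that the window resolves the individual harmonics; under those assumptions the windowed spectrum at a harmonic w_k is approximately H(w_k)*sum(win)/T, so the envelope is scaled by sum(win)/T and given the same one-sided doubling as psdX. The names Hmag, envH, envdB, postdB and offset_dB are introduced here for illustration, and the Hann window is written out by hand on the assumption that it matches hanning(N).

```matlab
% Sketch: rebuild the synthetic vowel from the post and rescale the envelope.
% The scale factor sum(win)/T is an assumption (steady-state impulse train,
% harmonics resolved by the window), not a confirmed fix.
N = 256; T = 90; Fs = 8000;            % values from the post
u = zeros(N,1);
u(1:T:N) = 1;                          % impulse train, period T samples

F  = [650, 1075, 2463, 3558, 4631];    % formant frequencies from the post (Hz)
BW = [94, 91, 107, 198, 89];           % formant bandwidths from the post (Hz)
p  = exp(-pi*BW/Fs) .* exp(1j*2*pi*F/Fs);
den = real(poly([p, conj(p)]));        % all-pole denominator, order 10
x  = filter(1, den, u);

k   = (1:N).';
win = 0.5*(1 - cos(2*pi*k/(N+1)));     % Hann window written out (assumed to match hanning(N))

% One-sided modified periodogram, same scaling as in the post
X = fft(win.*x, N);
X = X(1:N/2+1);
psdX = abs(X).^2 / sum(win.^2);
psdX(2:end-1) = 2*psdX(2:end-1);

% |H| on the same frequency grid, using only base functions
f = linspace(0, Fs/2, N/2+1).';
z = exp(1j*2*pi*f/Fs);
Hmag = 1 ./ abs(polyval(den, z));      % |H(e^{jw})| = 1/|A(e^{jw})| on the unit circle

% Assumed envelope of the harmonic peaks: |X_w(w_k)| ~ |H(w_k)|*sum(win)/T
envH = (Hmag * sum(win)/T).^2 / sum(win.^2);
envH(2:end-1) = 2*envH(2:end-1);       % same one-sided doubling as psdX
envdB = 10*log10(envH);

% Envelope as scaled in the post, for comparison
postdB = 20*log10(Hmag / norm(win,2));
offset_dB = envdB(2) - postdB(2);      % constant offset between the two curves
```

With these numbers the rescaled envelope sits about 6.1 dB above the curve computed in the post: 10*log10(2) from the one-sided doubling plus 20*log10(sum(win)/T) from the impulse-train and window gain. To compare visually, plot(f, 10*log10(psdX), 'b', f, envdB, 'r') overlays the two. Even then the match is only approximate: the signal holds fewer than three pitch periods, the filter starts from rest, and the DFT bins (31.25 Hz apart) do not fall exactly on the harmonics (about 88.9 Hz apart), so the sampled peaks can land somewhat below the envelope.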