Hi,

1. According to a statement in the book "Digital Speech" by A. M. Kondoz: "The LPC spectral envelope matches the signal spectrum much better in the spectral peaks than spectral valleys. This can be expected as our model transfer function, H(z), only has poles to model the formant peaks and no zeros to model spectral valleys." My question is: why are the spectral valleys ignored? In other words, why is the LPC filter always an all-pole filter?

2. The same book also says: "Within the 4kHz spectrum (of speech signal), the maximum number of formants displayed is usually four, thus indicating that filter order needs to be at least eight." What does each of these four formants model in human speech?

Harshad.
All pole LPC
Started by ●April 29, 2002
Reply by ●April 30, 2002
Hello Harshad!

I'll first try to answer the second question. The formant frequencies are the 'preferred' frequencies of vibration of the vocal tract. If you picture the vocal tract as a tube, that tube has its own resonant frequencies (like the vibrating modes of an acoustic tube that we learn about in junior college). These resonant frequencies are the "formants" of the vocal tract. I hope that answers the significance of formant frequencies in human speech.

Your other question (i.e. the first one) is "Why is the LPC filter an all-pole filter?" Ideally the filter should contain poles as well as zeros, but to simplify the mathematical analysis it is approximated as an all-pole filter. The zeros are not ignored, mind you. As you know, the maximum number of formants is 4, and each formant is modelled by 2 poles (a complex-conjugate pair), so the filter order needs to be at least 8. Two more poles are then added to COMPENSATE for the effect of the zeros, so that the order of the all-pole LPC filter is typically 10.

What I am not aware of is how the addition of 2 poles 'compensates' for the ignored zeros. Any comments?

Sameer.
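As a rough illustration of the tube picture in the reply above, here is a minimal Python sketch. It assumes a uniform, lossless tube closed at the glottis and open at the lips, a tube length of 17 cm and a sound speed of 340 m/s. None of these figures come from the thread, and the function name `tube_resonances` is made up for the example. Under those assumptions the resonances are the odd quarter-wavelength modes F_k = (2k - 1) * c / (4 * L).

```python
def tube_resonances(length_m=0.17, speed_of_sound=340.0, max_hz=4000.0):
    """Resonances (Hz) below max_hz of a uniform lossless tube closed at one end.

    Assumed model: quarter-wave resonator, F_k = (2k - 1) * c / (4 * L).
    The default length and sound speed are illustrative assumptions.
    """
    freqs = []
    k = 1
    while True:
        f = (2 * k - 1) * speed_of_sound / (4.0 * length_m)
        if f >= max_hz:
            break
        freqs.append(f)
        k += 1
    return freqs


formants = tube_resonances()
# Each resonance needs one complex-conjugate pole pair, i.e. two poles.
min_order = 2 * len(formants)
print("resonances below 4 kHz:", [round(f) for f in formants])
print("minimum all-pole order:", min_order)
```

With these assumed values the resonances land near 500, 1500, 2500 and 3500 Hz, so four of them fall below 4 kHz and the minimum order is eight, which matches the figure quoted from the book. A real vocal tract is not uniform, so actual formants move away from these values for every speech sound.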
Reply by ●May 2, 2002
--- Harshad Warnekar <> wrote:

> My question to this is, why are the spectral valleys ignored? Or in other words, Why LPC filter is always an all-pole filter?

If H(z) is your filter, then H(z) = R(z)/A(z), where R(z) represents the zeros (useful for modelling valleys) and A(z) represents the poles (useful for modelling peaks). In general H(z) is taken to be 1/A(z), ignoring the zeros, which gives the name "all-pole model". If R(z) is also considered, the model is called ARMA (Auto Regressive Moving Average).

Now the question: why is R(z) ignored?

1. Because of computational complexity.
2. Mathematically, the all-pole model is more tractable than ARMA.
3. It can be shown that zeros can be replaced by additional poles.
4. The all-pole model is found to be sufficient for practical applications such as speech recognition and speech coding.

> What do each of these four formants model in the human speech?

Formants are the peaks in the spectrum of the speech signal. They represent the resonances of the vocal tract and thus indicate the shape of the vocal tract during production of a particular speech sound. It is found that the first two formants alone can be used to distinguish among English vowel sounds.

Pardon me if I have committed any (technical) errors while answering.

Kishore Prahallad
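To make the tractability point in the reply above concrete: for H(z) = 1/A(z), the coefficients of A(z) follow from the signal's autocorrelation by solving a set of linear equations, commonly with the Levinson-Durbin recursion. The sketch below is only an illustration, not something taken from the thread. The test filter A(z) = 1 - 1.3 z^-1 + 0.8 z^-2 and all function names are assumptions chosen for the example, and the sign convention is A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.

```python
def impulse_response(a, n_samples):
    """Impulse response of H(z) = 1/A(z), A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p."""
    h = []
    for n in range(n_samples):
        value = 1.0 if n == 0 else 0.0
        for j in range(1, len(a)):
            if n - j >= 0:
                value -= a[j] * h[n - j]
        h.append(value)
    return h


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the all-pole normal equations; returns (a, prediction_error)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


# Hypothetical stable all-pole filter (one complex-conjugate pole pair).
true_a = [1.0, -1.3, 0.8]
h = impulse_response(true_a, 500)
r = autocorrelation(h, 2)
estimated, residual = levinson_durbin(r, 2)
print("true A(z):     ", true_a)
print("estimated A(z):", [round(c, 6) for c in estimated])
```

Fitting a pole-zero (ARMA) model to the same data generally leads to a nonlinear estimation problem, which is one reason the all-pole form is preferred in practice.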
Reply by ●May 2, 2002
> what iam not aware of is how addition of 2 poles 'compensates' for the ignored zeros... any comments?

Actually the 2 extra poles, which do not correspond to formants, are usually real poles located at the frequencies 0 and Fs/2. Thus we use 10 poles: 4 pairs of complex-conjugate poles and 2 real poles. I don't think the 2 poles are there to compensate for the zeros. It's just that any zero can be represented by poles. For example, the first-order polynomial 1-x is approximately equal to 1/(1+x) for small |x|.

Hope this answers your question.

Cheers
-Hari
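The pole count described in the reply above (4 complex-conjugate pairs plus 2 real poles) can be sketched by building A(z) directly from pole positions. Everything numeric here is an assumption for illustration only: the formant frequencies, the 100 Hz bandwidths, the real-pole radius of 0.9 and the 8 kHz sampling rate are not from the thread, and the helper names are made up. The mapping from a formant at F Hz with bandwidth B Hz to a pole at radius exp(-pi*B/Fs) and angle 2*pi*F/Fs is the usual textbook approximation.

```python
import cmath
import math


def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] += pc * qc
    return out


def all_pole_denominator(formants_hz, bandwidths_hz, real_poles, fs):
    """Build A(z) from formant pole pairs plus extra real poles.

    Returns (real coefficients, largest leftover imaginary part).
    """
    poles = []
    for f, b in zip(formants_hz, bandwidths_hz):
        radius = math.exp(-math.pi * b / fs)
        angle = 2.0 * math.pi * f / fs
        pole = cmath.rect(radius, angle)
        poles += [pole, pole.conjugate()]   # one conjugate pair per formant
    poles += [complex(r) for r in real_poles]
    a = [1 + 0j]
    for p in poles:
        a = poly_mul(a, [1, -p])            # factor (1 - p z^-1)
    return [c.real for c in a], max(abs(c.imag) for c in a)


def magnitude(a, freq_hz, fs):
    """|H| = 1/|A| evaluated on the unit circle at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs)
    denom = sum(c * z_inv ** k for k, c in enumerate(a))
    return 1.0 / abs(denom)


FS = 8000.0
a_coeffs, imag_residue = all_pole_denominator(
    [500.0, 1500.0, 2500.0, 3500.0],   # assumed formant frequencies
    [100.0] * 4,                       # assumed bandwidths
    [0.9, -0.9],                       # real poles near 0 Hz and Fs/2
    FS,
)
order = len(a_coeffs) - 1
print("filter order:", order)
print("|H| at 500 Hz vs 1000 Hz:",
      magnitude(a_coeffs, 500.0, FS), magnitude(a_coeffs, 1000.0, FS))
```

A positive real pole sits at 0 Hz and a negative real pole sits at Fs/2, so these two poles shape the overall spectral tilt rather than adding a resonance peak inside the band.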
Reply by ●May 5, 2002
Hi all!

To understand whether the all-pole LPC filter model 'ignores' the zeros or not, let us look at a few points from the book "Digital Processing of Speech Signals" by Rabiner & Schafer:

1. "An all pole model is a very good representation of the vocal tract effects for a majority of speech sounds; however the acoustic theory tells us that NASALS and FRICATIVES require both resonances and antiresonances (poles and zeros)."

2. "In these cases we may include zeros in the transfer function or we may use the fact that effect of zero of the transfer function can be achieved by including more poles." (See page 99.)

3. As a proof of the above statement, the reader is referred to Problem 3.10 on page 112, which I shall repeat here: "Show that if |a| < 1, 1 - a*z^(-1) = 1/(summation(a^n * z^(-n))), where n goes from 0 to infinity, and thus that a zero can be approximated as closely as desired by multiple poles."

The above equation is easily proved using the formula for the sum of an infinite geometric series: the denominator of the RHS is a geometric series with ratio a*z^(-1), which sums to 1/(1 - a*z^(-1)) whenever |a*z^(-1)| < 1.

Hence, if I may say so again, the all-pole model does not "ignore" the zeros but "compensates" for them by adding 2 more poles. Any more comments?

Sameer.

P.S.: Examples of nasals are sounds like /m/ and /n/, where the vocal tract is totally constricted at some point along the oral cavity and air flows through the nasal tract. Similarly, for fricatives like /s/, /z/, /sh/ and /f/, a constriction forms along the vocal tract, which calls for the antiresonance effect.
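The identity from Problem 3.10 quoted above can be checked numerically. The sketch below truncates the geometric series after N terms, which gives an all-pole filter with N poles, and compares its frequency response with that of the single zero 1 - a*z^(-1) on the unit circle. The value a = 0.5, the frequency grid and the function names are assumptions picked for the example.

```python
import cmath
import math


def zero_response(a, omega):
    """Frequency response of the single zero 1 - a z^-1."""
    return 1 - a * cmath.exp(-1j * omega)


def truncated_all_pole_response(a, n_poles, omega):
    """Response of 1 / sum_{n=0}^{n_poles} a^n z^-n (an all-pole filter)."""
    z_inv = cmath.exp(-1j * omega)
    denom = sum((a * z_inv) ** n for n in range(n_poles + 1))
    return 1 / denom


def max_error(a, n_poles, n_points=256):
    """Largest response mismatch over a grid of frequencies in [0, pi]."""
    worst = 0.0
    for i in range(n_points):
        omega = math.pi * i / (n_points - 1)
        diff = zero_response(a, omega) - truncated_all_pole_response(a, n_poles, omega)
        worst = max(worst, abs(diff))
    return worst


a = 0.5  # assumed zero location, must satisfy |a| < 1
errors = [max_error(a, n) for n in (2, 4, 8, 16, 32)]
for n, e in zip((2, 4, 8, 16, 32), errors):
    print(n, "poles -> max error", e)
```

The mismatch shrinks roughly like |a|^(N+1), so a zero close to the unit circle needs many poles for a close match, while a zero near the origin is captured by only a few. How well two extra poles do in practice therefore depends on where the zeros of the actual speech sound lie.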