Technical discussions related to Speech Coding (all ITU and other vocoders, ACELP, CELP, AMR, etc.)
Hi all,

I want to ask something here: if the residual signal (the error between the original speech signal and the predicted signal) still has high intelligibility, what can we do about it?

Thanks,
Eris

--------------------------------------
DSP Laboratory
STT Telkom Bandung
www.stttelkom.ac.id/psd/
--------------------------------------
Dear Eris,

I think it is perfectly normal that the LPC residual keeps most of its intelligibility. I have observed (heard) this myself during my work, and have read it several times in different sources (books, theses). In theory the residual should be white noise, but in practice it is not, because the LPC model is not perfect. I can think of several reasons. The vocal tract filter is not all-pole, as LPC analysis assumes. The order of the model may not be high enough to be a good model. The estimation method (least squares) is not perfect either. The size of the window is a trade-off between being short enough to cope with non-stationarity and long enough to allow a reasonable spectral estimate, so it is sometimes too long (during onsets and plosives, for example). The window itself also modifies the spectral information being estimated.

But in my opinion, the most important point is that the LPC coefficients model SHORT TERM redundancy (the relationship between the present sample and, say, the last 10 samples, for an LPC order of 10). This short term redundancy is (more or less effectively) removed by the LPC analysis filter. But you still have long term redundancy (the relationship between the present sample and a sample delayed by T0, where T0 is the pitch period). This long term redundancy is not removed by the LPC analysis filter (for that you need a "pitch predictor" or "long term predictor"), and it is responsible for the periodic peaks you see in the residual, particularly in periodic portions of the speech (e.g. vowels).

I hope this helps you,

Sara

In reply to "eris ristemena", original article: http://www.egroups.com/group/speechcoding/?start=48
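The short term versus long term distinction in Sara's reply can be tried out on a toy signal. What follows is only a minimal sketch in plain Python, not anyone's actual coder: the 8 kHz sampling rate, the two resonances, the pitch period of 64 samples and all function names are assumptions made up for illustration, and the pitch lag is taken as known instead of being searched for as a real coder would do. It computes an order-10 LPC with the autocorrelation method (Levinson-Durbin), inverse-filters the signal to get the residual, checks how strongly the residual still correlates with itself one pitch period earlier, and then applies a one-tap long term predictor.

```python
import math
import random

FS = 8000   # assumed sampling rate in Hz
T0 = 64     # assumed pitch period in samples (125 Hz at 8 kHz)

def synth_vowel(n_samples=640, seed=0):
    """Toy voiced sound: pulse train plus weak noise through two resonances."""
    rng = random.Random(seed)
    den = [1.0]  # denominator A(z) of the made-up all-pole "vocal tract"
    for freq_hz, radius in ((500.0, 0.95), (1500.0, 0.95)):
        theta = 2.0 * math.pi * freq_hz / FS
        sec = [1.0, -2.0 * radius * math.cos(theta), radius * radius]
        den = [sum(den[i] * sec[k - i] for i in range(len(den)) if 0 <= k - i < 3)
               for k in range(len(den) + 2)]
    y = []
    for n in range(n_samples):
        exc = (1.0 if n % T0 == 0 else 0.0) + rng.gauss(0.0, 0.01)
        y.append(exc - sum(den[k] * y[n - k]
                           for k in range(1, min(len(den) - 1, n) + 1)))
    return y

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor a[1..order] such that x_hat[n] = sum_k a[k] * x[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err

def lpc_residual(x, a):
    """Short term prediction error e[n] = x[n] - sum_k a[k] * x[n - k]."""
    p = len(a) - 1
    return [x[n] - sum(a[k] * x[n - k] for k in range(1, min(p, n) + 1))
            for n in range(len(x))]

def energy(x):
    return sum(v * v for v in x)

def analyze(order=10):
    x = synth_vowel()
    n = len(x)
    # Hamming window for the spectral estimate only
    win = [x[i] * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
           for i in range(n)]
    a, _ = levinson_durbin(autocorr(win, order), order)
    e = lpc_residual(x, a)
    # normalized correlation between the residual and itself T0 samples earlier
    cross = sum(e[i] * e[i - T0] for i in range(T0, n))
    pitch_corr = cross / math.sqrt(energy(e[T0:]) * energy(e[:-T0]))
    # one-tap long term predictor, gain b chosen by least squares
    b = cross / energy(e[:-T0])
    e2 = [e[i] - b * e[i - T0] for i in range(T0, n)]
    return {"signal_energy": energy(x), "residual_energy": energy(e),
            "residual_tail_energy": energy(e[T0:]), "pitch_corr": pitch_corr,
            "ltp_gain": b, "ltp_energy": energy(e2)}

for name, value in analyze().items():
    print(name, round(value, 4))
```

On this toy signal the residual energy is well below the signal energy, yet the residual stays strongly correlated at lag T0, and the one-tap long term predictor removes most of what is left. Real speech is less tidy, so treat the numbers as an illustration of the mechanism only.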
OK, I see now. There are many assumptions, so the errors accumulate. I have also observed that LPC cannot model the spectral envelope of a woman's voice well.

Talking about the spectral envelope: for a man's voice, the LPC (in my coder) can model the spectral envelope almost perfectly. But when I excite the LP filter with a periodic pulse signal (with period Tp), the output is far different from the original. Is that possible? Or is it maybe caused by the filtering process?

Eris

In reply to Sara's message of Monday, November 15, 1999 6:27 PM, Subject: [speechcoding] Re: high inteligibility on residual signal
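One possible explanation for the experiment Eris describes, offered as an assumption and not as a diagnosis of that coder: a good spectral envelope match does not imply a waveform match. The LP filter only captures the envelope, so the position (and gain) of the pulses in the excitation is not fixed by it. The sketch below uses a made-up second-order A(z), a period of 64 samples and a pulse offset of 20 samples, all hypothetical values. It drives the same all-pole filter with two pulse trains that differ only in pulse position, then compares the waveform SNR with the short-lag autocorrelation, which is what an LPC analysis would see.

```python
import math

A = [1.0, -1.3, 0.9]  # hypothetical stable A(z), one resonance near 1 kHz at 8 kHz

def pulse_train(n_samples, period, offset=0):
    """Unit pulses at offset, offset + period, offset + 2 * period, ..."""
    return [1.0 if n >= offset and (n - offset) % period == 0 else 0.0
            for n in range(n_samples)]

def synthesize(a, excitation):
    """All-pole synthesis filter 1/A(z): y[n] = u[n] - sum_k a[k] * y[n - k]."""
    y = []
    for n, u in enumerate(excitation):
        y.append(u - sum(a[k] * y[n - k] for k in range(1, min(len(a) - 1, n) + 1)))
    return y

def autocorr(x, max_lag):
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def snr_db(ref, test):
    """Waveform signal-to-noise ratio of test against ref, in dB."""
    sig = sum(v * v for v in ref)
    err = sum((r - t) ** 2 for r, t in zip(ref, test))
    return 10.0 * math.log10(sig / err)

def compare(period=64, shift=20, n_samples=1024):
    original = synthesize(A, pulse_train(n_samples, period, 0))
    resynth = synthesize(A, pulse_train(n_samples, period, shift))
    skip = 2 * period  # drop the start-up transient
    ref, test = original[skip:], resynth[skip:]
    r_ref, r_test = autocorr(ref, 10), autocorr(test, 10)
    # largest autocorrelation difference over lags 0..10, relative to r_ref[0]
    max_rel_diff = max(abs(p - q) for p, q in zip(r_ref, r_test)) / r_ref[0]
    return snr_db(ref, test), max_rel_diff

snr, diff = compare()
print("waveform SNR (dB):", round(snr, 2))
print("max relative autocorrelation difference:", round(diff, 5))
```

The two outputs have almost the same autocorrelation at lags 0 to 10, so an LPC analysis would give nearly the same envelope for both, while the sample-by-sample SNR is poor. If the comparison with the original is made in the time domain, misaligned pulses or a mismatched gain would be enough to make the output look "far different".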