DSPRelated.com
Forums

LPC for Analysis of 2 Speech Signals

Started by Raeldor November 27, 2010
Hi All,

I am trying to find the best way to analyze 2 speech signals from 2
different speakers and come up with a 'match percentage' for how
close they were (i.e., did the speakers say the same thing).

I have been reading that LPC is great for encoding the basic
parameters of speech, but most of the articles are related to
compression or building a vocoder.  Does anyone have experience with,
or know of, any good resources or C/C++ libraries for extracting the
parameters of speech for comparison?  Is LPC likely the best algorithm
for this type of work?

Thanks
Ray


On Nov 28, 1:45 pm, Raeldor <rael...@gmail.com> wrote:
> [original post snipped]
Try Matlab.
On Nov 27, 7:08 pm, HardySpicer <gyansor...@gmail.com> wrote:
> Try Matlab.
Not sure that's an option, as I need to integrate the end results into a practical application. I have read about Matlab's DSP toolkit, and it looks really nice, but unless they provide source code or a C/C++ library for the functions, it will be very hard to translate it into my application.
I was working on a filterbank approach to speech recognition.

The project was on isolated word recognition. The user speaks a set
of words (for example, 1 to 10), repeating each one 8 times so that
the templates can be averaged.

Templates were computed using filter banks, low-pass filters (LPFs) and decimation.
The stored 2D templates were then compared with the incoming
word sample to find the match.

I have working C code, which was also ported to the TI C6713 DSP platform.

I am willing to sell the code if anyone is interested in buying it.

Regards
Bharat
On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
> [original post snipped]
You can use either dynamic time warping (DTW) or hidden Markov models (HMMs). As the features for either of these two matching methods, you may use the LPC coefficients directly or convert them to PARCOR (reflection) coefficients.

Clay
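
For reference, a minimal sketch of the dynamic time warping idea mentioned above, written in plain Python so that it ports easily to C/C++. It assumes each utterance has already been reduced to a sequence of per-frame feature vectors (LPC-derived or otherwise). The function names, the Euclidean frame distance and the length normalisation are illustrative choices, not something specified in this thread.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalised DTW cost between two sequences of feature vectors.

    A result of 0.0 means the sequences align perfectly after time warping;
    larger values mean a poorer match.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = cheapest alignment of seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a only, in seq_b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Dividing by n + m is one simple (assumed) length normalisation.
    return cost[n][m] / (n + m)


# The same "word" spoken at half speed still aligns with zero cost.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]
print(dtw_distance(fast, slow))
```

Turning the resulting distance into a 'match percentage' needs a mapping calibrated on real recordings; that mapping is not shown here.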

Clay wrote:

> On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
> > [original post snipped]
>
> You can use either dynamic time warping or hidden markov models. You
> may use either the LPC coefs or convert them to PARCORs in either of
> these two cost mechanisms.
The problem is that there is not much relation between perceptual similarity and the similarity of LPCs or PARCORs. That is, neither LPCs nor PARCORs are good features for pattern matching; LSPs or cepstral coefficients could be better in this regard. But these are all minor technicalities compared to the incredible problem stated by the OP.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
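
As an illustration of the cepstral coefficients suggested above, here is a small sketch of the standard recursion that converts LPC coefficients into the cepstrum of the all-pole model 1/A(z). It assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and ignores the gain term c_0. The function names and the plain Euclidean distance are assumptions made for the example.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of 1/A(z).

    `a` holds [a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k (assumed convention).
    Recursion: c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = -a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc -= (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))


# Single pole at z = 0.5: the cepstrum is 0.5**n / n.
print(lpc_to_cepstrum([-0.5], 3))
```

A per-frame distance like this can serve as the local cost inside a DTW or similar alignment.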
On Nov 29, 3:12 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> [earlier quotes snipped]
> Problem is, there is not much relation between the perceptual similarity
> and the similarity of LPCs or ParCors. I.e. neither LPCs nor parcors are
> good functions for pattern matching; LSPs or cepstrum coefficients could
> be better in this regard. But, it's all minor technicalities compared to
> the incredible problem stated by OP.
Maybe I should explain where I am at the moment. I am calculating the FFT of a 128-sample frame windowed with a Gaussian window, then converting the magnitudes to decibels. I can see the peaks of the formants visually, but there is a lot of noise on the graph of the data, which I think is probably caused by the harmonics. Is there a way to clean this noise so I can see the formant peaks as smooth peaks in the graph? The smaller (128-sample) frame size helped with this, as did the Gaussian window, but I can't help thinking there is a better approach.

Thanks
Ray
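
One common way to get a smooth curve through the formant peaks is to fit an all-pole (LPC) model to the frame and plot the model's frequency response instead of the raw FFT. The sketch below shows that idea using the autocorrelation method and the Levinson-Durbin recursion. It is an illustration only: the function names, the model order and the number of frequency bins are assumptions, and the frame is expected to be windowed beforehand.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one (already windowed) frame."""
    return [sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return ([1, a_1..a_p], residual energy) from autocorrelation r.

    Convention (assumed): A(z) = 1 + sum_k a_k z^-k.
    A silent frame (r[0] == 0) is not handled in this sketch.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection (PARCOR) coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err


def lpc_envelope_db(a, gain, n_bins):
    """10*log10(gain / |A(e^jw)|^2) at n_bins frequencies in [0, pi)."""
    env = []
    for b in range(n_bins):
        w = math.pi * b / n_bins
        resp = sum(a_k * cmath.exp(-1j * w * k) for k, a_k in enumerate(a))
        env.append(10.0 * math.log10(gain / abs(resp) ** 2))
    return env
```

Typical use would be `r = autocorrelation(frame, order)`, then `a, err = levinson_durbin(r, order)`, then `lpc_envelope_db(a, err, n_bins)`. The peaks of the returned curve approximate the formants without the harmonic ripple. A rule of thumb often quoted for the order is roughly the sampling rate in kHz plus 2, but that is a starting point rather than a rule from this thread.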

Raeldor wrote:

> [earlier quotes snipped]
> Is there a way to clean this noise so I can see the formant peaks as
> smooth peaks in the graph?
Raeldor,

This is business. You can hire my services; contact me via the web site. You can also consider the filterbank TMS C67x software offered by Bharat Pathak.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
> [original post snipped]
Google MFCC, PLP, DP, HMM, etc.
Go through about 1000 references.
Better yet, forget about the whole thing: it's called ASR (automatic speech recognition).
Google it too...

But, just for starters:
http://cslu.cse.ogi.edu/toolkit/
http://cmusphinx.sourceforge.net/
http://www.isip.piconepress.com/projects/speech/software/

This is all crap anyway...