comp.dsp | LPC for Analysis of 2 Speech Signals

Hi All,

I am trying to find the best way to analyze 2 speech signals from 2
different speakers and come up with a 'match percentage' for how
close they are (i.e., did the speakers say the same thing).

I have been reading that LPC is great for encoding the basic
parameters of speech, but most of the articles are about
compression or building a vocoder.  Does anyone have experience with,
or know of, any good resources or C/C++ libraries for extracting the
parameters of speech for comparison?  Is LPC likely the best algorithm
for this type of work?

Thanks
Ray
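
For readers wondering what "extracting the LPC parameters" of a frame involves, here is a minimal sketch in Python of the textbook autocorrelation method with the Levinson-Durbin recursion. The function names, the sign convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p, and the choice of plain Python lists are all assumptions made for illustration; nothing here comes from a particular library, and PARCOR sign conventions differ between references.

```python
def autocorrelation(frame, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one windowed frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, parcor, err):
      a      -- error-filter coefficients [1, a1, ..., ap] (assumed convention)
      parcor -- reflection coefficients, one per recursion step
      err    -- final prediction error energy
    """
    a = [1.0] + [0.0] * order
    parcor = []
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        parcor.append(k)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, parcor, err
```

A typical use would be `levinson_durbin(autocorrelation(frame, 10), 10)` on each windowed frame; the order 10 is only a common rule-of-thumb starting point, not a recommendation from this thread.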

Reply by HardySpicer ●November 27, 2010

On Nov 28, 1:45 pm, Raeldor <rael...@gmail.com> wrote:
> Hi All,
>
> I am trying to find the best way to analyze 2 speech signals from 2
> different speakers and come up with a 'match percentage' as to how
> close they were (ie, did the speakers say the same thing).
>
> I have been reading that LPC is great for encoding the basic
> parameters of speech, but most of the articles are related to
> compression or building of a vocoder.  Has anyone had experience of or
> know any good resources or c/c++ libraries for extracting the
> parameters of speech for comparison?  Is LPC likely the best algorithm
> for this type of work?
>
> Thanks
> Ray

Try Matlab.

Reply by Raeldor ●November 28, 2010

On Nov 27, 7:08 pm, HardySpicer <gyansor...@gmail.com> wrote:
> On Nov 28, 1:45 pm, Raeldor <rael...@gmail.com> wrote:
>
> > Hi All,
>
> > I am trying to find the best way to analyze 2 speech signals from 2
> > different speakers and come up with a 'match percentage' as to how
> > close they were (ie, did the speakers say the same thing).
>
> > I have been reading that LPC is great for encoding the basic
> > parameters of speech, but most of the articles are related to
> > compression or building of a vocoder.  Has anyone had experience of or
> > know any good resources or c/c++ libraries for extracting the
> > parameters of speech for comparison?  Is LPC likely the best algorithm
> > for this type of work?
>
> > Thanks
> > Ray
>
> Try Matlab.

Not sure that's an option, as I need to integrate the end results into
a practical application.  I have read about Matlab's DSP toolkit, and
it looks really nice, but unless they provide source code or a c/c++
library for the functions it'll be very hard to translate it into my
application.

Reply by bharat pathak ●November 28, 2010

I was working on a filterbank approach to speech recognition.

The project was on isolated word recognition. A user speaks a set
of words (for example, 1 to 10) and repeats each one 8 times
so that the templates can be averaged.

Templates were computed using filter banks, LPFs and decimation.
The stored 2D templates were then compared with the incoming
word sample to find the match.

I have working C code, which was also ported to the TI C6713 DSP platform.

I am willing to sell the code if anyone is interested in buying it.

Regards
Bharat
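
The template-averaging and matching steps described above can be sketched roughly as follows. This is not Bharat's code: it assumes each utterance has already been reduced (by the filter bank, low-pass filtering and decimation) to a 2D template with the same number of time slots and bands, and the names `average_templates`, `template_distance` and `best_match` are hypothetical.

```python
import math


def average_templates(examples):
    """Element-wise mean of several equally sized 2D templates."""
    rows, cols = len(examples[0]), len(examples[0][0])
    count = len(examples)
    return [[sum(e[r][c] for e in examples) / count for c in range(cols)]
            for r in range(rows)]


def template_distance(t1, t2):
    """Euclidean distance between two templates of the same shape."""
    return math.sqrt(sum((x - y) ** 2
                         for row1, row2 in zip(t1, t2)
                         for x, y in zip(row1, row2)))


def best_match(sample, templates):
    """Return the word whose stored template is closest to the sample.

    `templates` maps a word label to its averaged 2D template.
    """
    return min(templates, key=lambda word: template_distance(sample, templates[word]))
```

In this sketch the 8 repetitions of each word would be passed to `average_templates` once during training, and `best_match` would be called for every incoming word.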

Reply by Clay ●November 29, 2010

On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
> Hi All,
>
> I am trying to find the best way to analyze 2 speech signals from 2
> different speakers and come up with a 'match percentage' as to how
> close they were (ie, did the speakers say the same thing).
>
> I have been reading that LPC is great for encoding the basic
> parameters of speech, but most of the articles are related to
> compression or building of a vocoder.  Has anyone had experience of or
> know any good resources or c/c++ libraries for extracting the
> parameters of speech for comparison?  Is LPC likely the best algorithm
> for this type of work?
>
> Thanks
> Ray

You can use either dynamic time warping or hidden Markov models. You
may use either the LPC coefficients or convert them to PARCORs as the
features for either of these two matching methods.

Clay
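
As a rough illustration of the dynamic time warping option, the sketch below computes the classic DTW alignment cost between two utterances, each given as a list of per-frame feature vectors (LPC, PARCOR or any other features). The function name, the Euclidean frame distance and the unconstrained step pattern are assumptions chosen for brevity; practical systems usually add path constraints and length normalization.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # frame of b repeated
                                 cost[i][j - 1],      # frame of a repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A lower cost means the two utterances align more closely; turning that cost into a "match percentage" would need a calibration step that this sketch does not attempt.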

Reply by Vladimir Vassilevsky ●November 29, 2010


Clay wrote:

> On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
> 
>>Hi All,
>>
>>I am trying to find the best way to analyze 2 speech signals from 2
>>different speakers and come up with a 'match percentage' as to how
>>close they were (ie, did the speakers say the same thing).
>>
>>I have been reading that LPC is great for encoding the basic
>>parameters of speech, but most of the articles are related to
>>compression or building of a vocoder.  Has anyone had experience of or
>>know any good resources or c/c++ libraries for extracting the
>>parameters of speech for comparison?  Is LPC likely the best algorithm
>>for this type of work?
>>
>>Thanks
>>Ray
> 
> 
> You can use either dynamic time warping or hidden markov models. You
> may use either the LPC coefs or convert them to PARCORs in either of
> these two cost mechanisms.

The problem is that there is not much relation between perceptual
similarity and the similarity of LPCs or PARCORs. That is, neither LPCs
nor PARCORs are good features for pattern matching; LSPs or cepstrum
coefficients could be better in this regard. But these are all minor
technicalities compared to the incredible problem stated by the OP.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
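
For anyone who already has LPC coefficients and wants to try cepstrum coefficients instead, the standard recursion for the cepstrum of the all-pole model 1/A(z) can be sketched as below. It assumes the convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p with `a[0] == 1`; the function name is hypothetical, and the gain term c[0] is deliberately left out.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1/A(z).

    `a` is [1, a1, ..., ap] with A(z) = 1 + sum(a[k] * z**-k).
    """
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)  # c[0] (gain term) is not computed here
    for n in range(1, n_ceps + 1):
        # Only terms with 1 <= n - k <= p contribute to the sum.
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]
```

Two frames can then be compared with a plain Euclidean distance between their cepstral vectors, which is one common (though not the only) choice.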

Reply by Raeldor ●November 29, 2010

On Nov 29, 3:12 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> Clay wrote:
> > On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
>
> >>Hi All,
>
> >>I am trying to find the best way to analyze 2 speech signals from 2
> >>different speakers and come up with a 'match percentage' as to how
> >>close they were (ie, did the speakers say the same thing).
>
> >>I have been reading that LPC is great for encoding the basic
> >>parameters of speech, but most of the articles are related to
> >>compression or building of a vocoder.  Has anyone had experience of or
> >>know any good resources or c/c++ libraries for extracting the
> >>parameters of speech for comparison?  Is LPC likely the best algorithm
> >>for this type of work?
>
> >>Thanks
> >>Ray
>
> > You can use either dynamic time warping or hidden markov models. You
> > may use either the LPC coefs or convert them to PARCORs in either of
> > these two cost mechanisms.
>
> Problem is, there is not much relation between the perceptual similarity
> and the similarity of LPCs or ParCors. I.e. neither LPCs nor parcors are
> good functions for pattern matching; LSPs or cepstrum coefficients could
> be better in this regard. But, it's all minor technicalities compared to
> the incredible problem stated by OP.
>
> Vladimir Vassilevsky
> DSP and Mixed Signal Design Consultant
> http://www.abvolt.com

Maybe I should explain where I am at the moment.

I am calculating the FFT of a 128-sample frame windowed with a Gaussian
window, then converting the result to decibels.  I can see the peaks
of the formants visually, but there is a lot of noise on the graph (of
the data), which I think is probably caused by the harmonics.  Is there a
way to clean up this noise so I can see the formant peaks as smooth peaks
in the graph?  The smaller (128-sample) frame size helped with this, as
did the Gaussian window, but I can't help thinking there is a better
approach?

Thanks
Ray

Reply by Vladimir Vassilevsky ●November 30, 2010


Raeldor wrote:

> On Nov 29, 3:12 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> 
>>Clay wrote:
>>
>>>On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
>>
>>>>Hi All,
>>
>>>>I am trying to find the best way to analyze 2 speech signals from 2
>>>>different speakers and come up with a 'match percentage' as to how
>>>>close they were (ie, did the speakers say the same thing).
>>
>>>>I have been reading that LPC is great for encoding the basic
>>>>parameters of speech, but most of the articles are related to
>>>>compression or building of a vocoder.  Has anyone had experience of or
>>>>know any good resources or c/c++ libraries for extracting the
>>>>parameters of speech for comparison?  Is LPC likely the best algorithm
>>>>for this type of work?
>>
>>>You can use either dynamic time warping or hidden markov models. You
>>>may use either the LPC coefs or convert them to PARCORs in either of
>>>these two cost mechanisms.
>>
>>Problem is, there is not much relation between the perceptual similarity
>>and the similarity of LPCs or ParCors. I.e. neither LPCs nor parcors are
>>good functions for pattern matching; LSPs or cepstrum coefficients could
>>be better in this regard. But, it's all minor technicalities compared to
>>the incredible problem stated by OP.
>>
> Maybe I should explain where I am at the moment.
> I am calculating the FFT of a 128-sample frame windowed with a Gaussian
> window, then converting the result to decibels.  I can see the peaks
> of the formants visually, but there is a lot of noise on the graph (of
> the data), which I think is probably caused by the harmonics.  Is there a
> way to clean up this noise so I can see the formant peaks as smooth peaks
> in the graph?  The smaller (128-sample) frame size helped with this, as
> did the Gaussian window, but I can't help thinking there is a better
> approach?

Raeldor,

This is business. You can hire my services; contact at the web site. You 
can also consider filterbank TMS C67x software offered by Bharat Pathak.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by fatalist ●December 1, 2010

On Nov 27, 7:45 pm, Raeldor <rael...@gmail.com> wrote:
> Hi All,
>
> I am trying to find the best way to analyze 2 speech signals from 2
> different speakers and come up with a 'match percentage' as to how
> close they were (ie, did the speakers say the same thing).
>
> I have been reading that LPC is great for encoding the basic
> parameters of speech, but most of the articles are related to
> compression or building of a vocoder.  Has anyone had experience of or
> know any good resources or c/c++ libraries for extracting the
> parameters of speech for comparison?  Is LPC likely the best algorithm
> for this type of work?
>
> Thanks
> Ray

Google MFCC, PLP, DP, HMM etc. etc. etc

Go through about 1000 references

Better yet, forget about the whole thing: it's called ASR (automatic
speech recognition)

Google it too...

But, just for starters:

http://cslu.cse.ogi.edu/toolkit/
http://cmusphinx.sourceforge.net/
http://www.isip.piconepress.com/projects/speech/software/

This is all crap anyway...
