Reply by naveen January 19, 2002

The most popular approach to speech recognition currently
is the hidden Markov model (HMM). For a simple task
such as isolated-word, small-vocabulary recognition,
dynamic time warping (DTW) can also be used. However, the
accuracy of DTW will generally be much lower than that of an HMM.
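To make the DTW idea concrete, here is a minimal sketch of template matching for isolated words. It assumes each utterance has already been turned into a sequence of per-frame feature vectors; the toy one-dimensional features, the template words, and the function names are all hypothetical and only for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in seq_a, advance in seq_b, or both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates,
               key=lambda label: dtw_distance(utterance, templates[label]))


# Hypothetical toy templates, one stored example per vocabulary word.
templates = {
    "yes": [[0.0], [1.0], [2.0], [1.0], [0.0]],
    "no": [[2.0], [2.0], [0.0], [0.0], [2.0]],
}

# A slower rendition of the "yes" pattern (frames repeated).
utterance = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
best = recognize(utterance, templates)
```

In this toy case the stretched utterance aligns with the "yes" template at zero cost, so that label is returned. A real system would use proper spectral features per frame and usually several templates per word.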

A good reference book is
"Fundamentals of Speech Recognition" by Rabiner and Juang,
or the paper by Rabiner:
L. Rabiner. A tutorial on hidden Markov models and selected
applications in speech recognition. Proceedings of the IEEE,
77(2):257-286, 1989.

Best,
Naveen

_____________________________________
Note: If you do a simple "reply" with your email client, only the
author of this message will receive your answer. You need to do a
"reply all" if you want your answer to be distributed to the entire
group.

_____________________________________
About this discussion group:

To Join: speech-recognition-subscribe@y...
To Post: speech-recognition@y...
To Leave: speech-recognition-unsubscribe@y...
Archives: http://www.yahoogroups.com/group/speech-recognition
Other DSP-Related Groups: http://www.dsprelated.com
http://docs.yahoo.com/info/terms/


Reply by Renato D'Antonio January 18, 2002
From: Dr. Renato D'Antonio, President, DSP Global, 33 Plan Way, Bldg. 4,
Warwick, RI 02886, 401-737-9900, Toll Free: 800-437-3282,
Fax: 401-739-4197, Web: <www.dspglobal.com>

Hi all,
What is the answer to his question, "What is the simple speech recognition
method for isolated word, small vocabulary speech (not speaker)
recognition system?" Is there a simple answer to this question, or a
referral to some easy-reading paper or book?
Thanks,
Renato





Reply by simha j January 18, 2002
--- Sanjay Patil wrote:
> 1. When a raw PCM data file is considered, what is
> the usual sampling frequency for it?

If the data file contains telephone speech,
fs = 8 kHz.
If it is an audio file (such as CD audio), fs = 44.1 kHz.
So fs depends on the bandwidth of the input signal: it must be
at least twice the highest frequency you want to keep.
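One practical consequence: a raw PCM file has no header, so the sampling frequency, sample width, and byte order have to be known from elsewhere. The sketch below assumes 16-bit signed little-endian mono samples, which is a common layout but only an assumption here; the function names are made up for the example.

```python
import struct


def decode_pcm16_le(raw_bytes):
    """Interpret raw bytes as 16-bit signed little-endian samples."""
    n = len(raw_bytes) // 2
    return list(struct.unpack("<%dh" % n, raw_bytes[:2 * n]))


def pcm16_duration_seconds(raw_bytes, sample_rate_hz, channels=1):
    """Duration implied by the byte count for an assumed sampling rate."""
    bytes_per_sample = 2  # 16-bit samples
    n_frames = len(raw_bytes) // (bytes_per_sample * channels)
    return n_frames / sample_rate_hz


# Four hand-made samples packed the way a raw file would store them.
raw = struct.pack("<4h", 0, 1000, -1000, 32767)
samples = decode_pcm16_le(raw)

# 16000 bytes of silence = 8000 samples = 1 second at fs = 8 kHz.
silence = bytes(16000)
duration_8k = pcm16_duration_seconds(silence, 8000)
duration_44k = pcm16_duration_seconds(silence, 44100)
```

The same bytes give very different durations depending on the fs you assume, which is why the rate has to be fixed before any recognition front end is run on the data.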

> 2. How many bits should be used to sample and digitise
> the speech signals, 8 bits or 16 bits. Purpose is to
> have speech recognition done on the samples.

To reconstruct the speech at the receiving (Rx) end, speech
samples should be coded with at least 14 bits (or 16 bits
with zero padding) at the transmitting (Tx) end.
A first level of compression can be applied to this 16-bit data
through A-law or mu-law companding, which reduces
16 bits to 8 bits per sample. But remember that the 8-bit data must
be expanded back to 16 bits before it is played in any
player.
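As a rough illustration of the compress/expand step, here is a sketch of mu-law companding using the continuous mu-law formula with mu = 255. It is not the bit-exact segment encoding used by telephone codecs, and the function names and the 0..255 code layout are assumptions made for this example.

```python
import math

MU = 255.0  # standard mu value for 8-bit mu-law


def mulaw_compress(sample16):
    """Map a 16-bit linear sample to an 8-bit code in 0..255."""
    x = max(-1.0, min(1.0, sample16 / 32768.0))
    # Logarithmic compression: small amplitudes get finer resolution.
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255.0))


def mulaw_expand(code8):
    """Map an 8-bit code back to an approximate 16-bit linear sample."""
    y = code8 / 255.0 * 2.0 - 1.0
    # Inverse of the compression curve.
    x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
    return int(round(x * 32767.0))


originals = [0, 100, -100, 1000, -12345, 32767, -32768]
codes = [mulaw_compress(s) for s in originals]
restored = [mulaw_expand(c) for c in codes]
```

The round trip is lossy: restored values are close to the originals, with an error that grows roughly in proportion to the sample amplitude. For recognition work it is usually simpler to keep the 16-bit linear samples if you have them.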
Using GoldWave software you can do a lot of experiments on
speech signals. Try to get it from the net.

> 3. What is the simple speech recognition method for
> isolated word, small vocabulary speech (not speaker)
> recognition system?

Refer to 'Digital Processing of Speech Signals' by
Rabiner & Schafer.

> Please help and mail at the earliest!
> yashorath


Reply by Sanjay Patil January 17, 2002
1. When a raw PCM data file is considered, what is
the usual sampling frequency for it?
2. How many bits should be used to sample and digitise
the speech signals, 8 bits or 16 bits? The purpose is to
have speech recognition done on the samples.
3. What is the simple speech recognition method for an
isolated word, small vocabulary speech (not speaker)
recognition system?
Please help and mail at the earliest!
yashorath