speech-recognition | few queries!

1. For a raw PCM data file, what sampling frequency is usually used?
2. How many bits should be used to sample and digitise speech signals, 8 bits or 16 bits? The purpose is to run speech recognition on the samples.
3. What is a simple speech recognition method for an isolated-word, small-vocabulary speech (not speaker) recognition system?

Please help and mail at the earliest!
yashorath

Reply by simha j ●January 18, 2002

--- Sanjay Patil wrote:
> 1. when a raw PCM data file is considered, what is usually sampling frequency for the same?

If the data file holds telephone speech, fs = 8 kHz. If it is a general audio file (CD quality), fs is typically 44.1 kHz. So fs depends purely on the bandwidth of the input signal.

> 2. How many bits should be used to sample and digitise the speech signals, 8 bits or 16 bits. Purpose is to have speech recognition done on the samples.

To reconstruct the speech at the Rx end, the speech samples must be coded with at least 14 bits (or 16 bits with zero padding) at the Tx end. A first level of compression can also be applied to this 16-bit data through A-law or mu-law companding, which reduces 16 bits to 8 bits. But remember that the 8-bit data must be expanded back to 16 bits before it is played in any player.

You can run a lot of experiments on speech signals with the GoldWave software. Try to get it from the net.

> 3. What is the simple speech recognition method for isolated word, small vocabulary speech (not speaker) recognition system?

Refer to "Digital Processing of Speech Signals" by Rabiner & Schafer.

> Please help and mail at the earliest!
> yashorath
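To make the points about raw PCM and companding concrete, here is a minimal Python sketch, not a definitive implementation. It assumes the raw file is headerless, mono, signed 16-bit little-endian PCM. A raw file carries no header, so the sampling rate and sample format have to be known separately. The companding part uses the continuous mu-law curve with mu = 255 rather than the bit-exact G.711 encoder, and all function names are made up for this example.

```python
import math
import struct

MU = 255  # mu-law constant used in 8-bit telephony companding


def bytes_to_samples(raw):
    """Interpret headerless PCM bytes as signed 16-bit little-endian samples.

    The sample format is an assumption; a raw file does not record it.
    """
    count = len(raw) // 2
    return list(struct.unpack("<%dh" % count, raw[:count * 2]))


def duration_seconds(num_samples, fs):
    """Length of the recording in seconds; fs must be known from elsewhere."""
    return num_samples / fs


def mulaw_compress(sample):
    """Map one 16-bit sample to an 8-bit code (0..255) on the mu-law curve."""
    x = max(-1.0, min(1.0, sample / 32768.0))
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))


def mulaw_expand(code):
    """Map an 8-bit code back to an approximate 16-bit sample."""
    y = code / 255 * 2.0 - 1.0
    x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
    return int(round(x * 32767))


if __name__ == "__main__":
    # Three hypothetical samples packed the way they would sit in a raw file.
    raw = struct.pack("<3h", 0, 1000, -1000)
    samples = bytes_to_samples(raw)
    print(duration_seconds(len(samples), 8000), "seconds at fs = 8 kHz")
    for s in samples:
        code = mulaw_compress(s)
        print(s, "->", code, "->", mulaw_expand(code))
```

The round trip is lossy: the expanded value is close to the original sample but not equal to it, which is why the 8-bit codes are expanded before playback or further processing rather than used as if they were linear samples.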

Reply by Renato D'Antonio ●January 18, 2002

From: Dr. Renato D'Antonio, President, DSP Global, 33 Plan Way, Bldg. 4, Warwick, RI 02886, 401-737-9900, Toll Free: 800-437-3282, Fax: 401-739-4197, Web: <www.dspglobal.com>

Hi all,

What is the answer to his question, "What is the simple speech recognition method for isolated word, small vocabulary speech (not speaker) recognition system?" Is there a simple answer to this question, or a referral to some easy-reading paper or book?

Thanks,
Renato

_____________________________________
Note: If you do a simple "reply" with your email client, only the author of this message will receive your answer. You need to do a "reply all" if you want your answer to be distributed to the entire group.
_____________________________________
About this discussion group:
Archives: http://www.yahoogroups.com/group/speech-recognition
Other DSP-Related Groups: http://www.dsprelated.com

Reply by naveen ●January 19, 2002

The most popular schemes currently used for speech recognition are hidden Markov models (HMMs). For a simple task like isolated-word, small-vocabulary recognition, dynamic time warping (DTW) can also be used. However, the accuracy of DTW will generally be lower than that of an HMM.

A good reference book is "Fundamentals of Speech Recognition" by Rabiner and Juang, or the paper:

L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, 77(2):257-286, 1989.

Best,
Naveen
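For anyone who wants to see what the DTW approach looks like in practice, here is a minimal Python sketch under stated assumptions, not a tuned recognizer. It assumes each word has already been turned into a sequence of feature vectors by some front end that is not shown, that one stored template per word is enough, and that plain Euclidean distance between frames is acceptable. The function names and the template layout (a dict from word to sequence) are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Cumulative cost of the best monotonic alignment of sequences a and b.

    a and b are sequences of equal-dimension feature vectors (tuples or lists).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # frame-to-frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(utterance, templates):
    """Return the word whose stored template is closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


if __name__ == "__main__":
    # Toy one-dimensional "features"; a real system would use spectral features.
    templates = {
        "yes": [(0,), (1,), (2,), (1,), (0,)],
        "no": [(2,), (2,), (0,), (0,)],
    }
    slow_yes = [(0,), (0,), (1,), (2,), (2,), (1,), (0,)]
    print(recognize(slow_yes, templates))
```

The warping is what lets a slowly spoken word still match its template: repeated frames are absorbed by the alignment instead of counting as errors. The table costs O(n*m) time per comparison, which is workable for a small vocabulary.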
