Technical discussions about the implementation and research of speech recognition algorithms.
Hi all,

I'm very new to both DSP and speech recognition. Could you please let me know:

+ What types of filters are used to remove noise? My input is recorded in a noisy environment.
+ Are there documents that discuss the frequency range of the human voice?
+ What other types of filters are relevant?
+ Which algorithms, techniques, or models are used to extract and match samples/patterns?

This is everything I think I need in order to work on speech recognition. If I have missed anything, please let me know. I would really appreciate a hand getting started in this field.

Thanks in advance,
Pete.
______________________________
Hey man! To answer a few of your queries:
-----------------------------------------------------------------
1) Noise cancellation is a very hot research field, and there are numerous methods for it. Here are a few, in ascending order of difficulty: spectral subtraction (including multiband variants), frequency scaling methods, Kalman filtering, the global soft decision method, signal subspace methods, and auditory scene analysis / blind source separation. To start with, you would want to work on spectral subtraction.

References:
Sunil Devdas Kamath, "Multiband spectral subtraction for speech enhancement", MS thesis.
Nils Westerlund, Mattias Dahl and Ingvar Claesson, "Speech enhancement for personal communications using an adaptive gain equalizer".
-----------------------------------------------------------------
2) Regarding the frequency range of human speech: most of the energy that matters for intelligibility lies below about 4 kHz, so 8 kHz sampling (telephone quality) is fine to start with. Some consonants such as "s" and "f" do carry energy above 4 kHz, which is why many recognizers sample at 16 kHz instead.
-----------------------------------------------------------------
3) Other types of filters for noise cancellation: well, wavelets. Most filterbanks, though, are used to split the signal into sub-bands.
-----------------------------------------------------------------
4) Pattern matching: go through the chapter on HMMs in Rabiner, or his famous HMM tutorial for that matter. You mainly need to know how to fit Gaussians to a cluster of points in feature space. That should give you a good start for speech recognition.
-----------------------------------------------------------------
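To make the spectral subtraction suggestion in point 1 a bit more concrete, here is a minimal sketch in plain Python. It is not taken from the references above. It assumes you have a stretch of recording that contains only background noise, it uses non-overlapping rectangular frames (real systems normally use overlapping windowed frames with overlap-add), and the names `spectral_subtract`, `frame_len` and `floor` are made up for this illustration. The test signal is a synthetic tone in white noise, not real speech.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spec):
    """Inverse FFT using the conjugate trick."""
    n = len(spec)
    y = fft([v.conjugate() for v in spec])
    return [v.conjugate() / n for v in y]


def spectral_subtract(noisy, noise_only, frame_len=64, floor=0.05):
    """Magnitude spectral subtraction on non-overlapping frames.

    noise_only is assumed to contain background noise and no speech.
    floor keeps a small fraction of each bin to avoid negative magnitudes.
    """
    # Estimate the average noise magnitude per frequency bin.
    noise_mag = [0.0] * frame_len
    n_frames = len(noise_only) // frame_len
    for i in range(n_frames):
        spec = fft(noise_only[i * frame_len:(i + 1) * frame_len])
        for k in range(frame_len):
            noise_mag[k] += abs(spec[k]) / n_frames

    out = []
    for i in range(len(noisy) // frame_len):
        spec = fft(noisy[i * frame_len:(i + 1) * frame_len])
        cleaned_spec = []
        for k, v in enumerate(spec):
            mag = abs(v)
            # Subtract the noise estimate, keep the noisy phase unchanged.
            new_mag = max(mag - noise_mag[k], floor * mag)
            cleaned_spec.append(cmath.rect(new_mag, cmath.phase(v)))
        out.extend(s.real for s in ifft(cleaned_spec))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Toy demo: a pure tone buried in white noise (synthetic stand-in for speech).
random.seed(0)
n = 64 * 16
clean = [math.sin(2 * math.pi * 4 * i / 64) for i in range(n)]
noisy = [s + random.gauss(0, 0.3) for s in clean]
noise_only = [random.gauss(0, 0.3) for _ in range(n)]
cleaned = spectral_subtract(noisy, noise_only)
print("MSE before:", mse(noisy, clean))
print("MSE after: ", mse(cleaned, clean))
```

On real recordings you would typically estimate the noise spectrum from the silent gaps between utterances and tune `floor`, since subtracting too aggressively produces the "musical noise" artifacts this method is known for.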
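For the "fit Gaussians to a cluster of points" remark in point 4, a rough sketch might look like the following. It fits one diagonal-covariance Gaussian per sound class by maximum likelihood and picks the class with the highest log-likelihood, which is roughly the role a Gaussian plays as the output density of a single HMM state. The feature vectors are made-up numbers, and `fit_gaussian`, `log_likelihood` and `classify` are hypothetical names for this example. A real recognizer would use features such as MFCCs and mixtures of Gaussians inside HMM states.

```python
import math


def fit_gaussian(points):
    """Maximum-likelihood mean and per-dimension variance (diagonal covariance)."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # A small lower bound on the variance avoids division by zero later.
    var = [max(sum((p[d] - mean[d]) ** 2 for p in points) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def log_likelihood(x, model):
    """Log density of point x under a diagonal Gaussian (mean, var)."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


# Hypothetical 2-D feature vectors for two sound classes (made-up numbers).
class_a = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.1), (1.1, 2.2)]
class_b = [(4.0, 0.5), (4.2, 0.7), (3.9, 0.4), (4.1, 0.6)]
models = {"a": fit_gaussian(class_a), "b": fit_gaussian(class_b)}


def classify(x):
    """Return the name of the class whose Gaussian scores x highest."""
    return max(models, key=lambda name: log_likelihood(x, models[name]))


print(classify((1.0, 2.0)))
print(classify((4.0, 0.5)))
```

Working in log-likelihoods rather than raw probabilities is the usual choice here, because the products of many small densities that show up in HMM decoding underflow quickly.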