
Fundamentals of Speech Recognition

Rabiner, Lawrence; Juang, Biing-Hwang (1993)

Provides a theoretically sound, technically accurate, and complete description of the basic knowledge and ideas that constitute a modern system for speech recognition by machine. Covers production, perception, and acoustic-phonetic characterization of the speech signal; signal processing and analysis methods for speech recognition; pattern comparison techniques; speech recognition system design and implementation; theory and implementation of hidden Markov models; speech recognition based on connected word models; large vocabulary continuous speech recognition; and task-oriented application of automatic speech recognition. For practicing engineers, scientists, linguists, and programmers interested in speech recognition.


Why Read This Book

You will get a thorough, mathematically grounded foundation in the algorithms and system design principles behind automatic speech recognition, from acoustic modeling and feature extraction to HMM training and decoding. The book ties DSP methods to pattern-recognition and statistical estimation, giving you theory and practical insights useful for implementing and evaluating ASR systems.

Who Will Benefit

Graduate students, researchers, and engineers working on speech/audio signal processing or ASR systems who need a rigorous treatment of HMM-based recognition, feature design, and system-level issues.

Level: Advanced

Prerequisites: Undergraduate-level signals and systems / DSP, probability and statistics (stochastic processes/Bayesian basics), linear algebra, and some programming experience for implementations.

Key Takeaways

  • Apply short-time spectral analysis and compute standard speech features (cepstral coefficients, MFCC, LPC)
  • Formulate, train, and use Hidden Markov Models for speech recognition (Baum-Welch/EM, Viterbi decoding)
  • Design connected-word and large-vocabulary continuous speech recognition systems with language modeling
  • Analyze acoustic-phonetic properties of speech and translate them into effective feature representations
  • Evaluate ASR performance and implement practical system-level components (lexicons, search, scoring)
  • Address common robustness issues and speaker variability with adaptation and normalization strategies

Topics Covered

  1. Introduction and Overview of Speech Recognition
  2. Speech Production, Perception, and Acoustic-Phonetic Properties
  3. Signal Processing for Speech: Windowing, Spectra, and Preprocessing
  4. Linear Predictive Coding, Cepstral Analysis, and MFCCs
  5. Pattern Comparison and Classification Methods
  6. Hidden Markov Models: Definitions and Basic Algorithms
  7. HMM Training: Maximum Likelihood, Baum-Welch (EM)
  8. Viterbi Decoding and Search Algorithms
  9. Connected-Word Recognition and Subword Units
  10. Large-Vocabulary Continuous Speech Recognition and Language Modeling
  11. Speaker Recognition, Adaptation, and Normalization
  12. System Design, Implementation Issues, and Performance Evaluation
  13. Appendices: Mathematical Background and Implementation Notes

How It Compares

More tutorial and DSP-oriented than Jelinek's Statistical Methods for Speech Recognition (which emphasizes statistical language modeling and advanced probabilistic methods); unlike modern deep-learning ASR texts, Rabiner & Juang focus on classical HMM-based systems and signal-processing fundamentals.
