Theory and Applications of Digital Speech Processing

Lawrence Rabiner and Ronald Schafer, 2010

Theory and Applications of Digital Speech Processing is ideal for graduate students in digital signal processing, and undergraduate students in Electrical and Computer Engineering. With its clear, up-to-date, hands-on coverage of digital speech processing, this text is also suitable for practicing engineers in speech processing.


This new text presents the basic concepts and theories of speech processing with clarity and currency, while providing hands-on computer-based laboratory experiences for students. The material is organized in a manner that builds a strong foundation of basics first, and then concentrates on a range of signal processing methods for representing and processing the speech signal. 


Why Read This Book

You should read this book if you want a rigorous but practical bridge between DSP theory and real-world speech applications: it explains core models (source-filter, LPC), modern feature extraction (cepstra, MFCCs), and speech coding/recognition concepts while providing MATLAB-based labs so you can implement and experiment with the algorithms yourself. The presentation balances mathematical clarity with engineering intuition, making it easy to go from understanding to working prototypes.

Who Will Benefit

Graduate students, senior undergraduates, and practicing engineers working on speech/audio systems who need both theoretical foundations and hands-on DSP implementations.

Level: Intermediate

Prerequisites: Basic signals & systems and digital signal processing (Fourier transforms, sampling, discrete-time filtering); elementary linear algebra and probability; familiarity with MATLAB (or Octave) recommended.

Key Takeaways

  • Implement and interpret short-time Fourier analysis and windowing methods for speech signals.
  • Derive, implement, and apply linear predictive coding (LPC) for analysis, synthesis, and formant estimation.
  • Compute cepstral features and MFCCs for feature extraction in speech recognition and audio analysis.
  • Design and simulate basic speech coders and vocoders, and understand bitrate/quality tradeoffs.
  • Implement pitch (fundamental frequency) detection and basic prosody analysis algorithms.
  • Apply statistical and practical methods used in modern speech recognition pipelines (feature extraction and preprocessing).
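
As a rough illustration of the pitch-detection takeaway, the sketch below estimates the fundamental frequency of one frame from the peak of its autocorrelation. It is not taken from the book's labs, which use MATLAB; it is written in plain Python, and the function name `autocorr_pitch`, the 50 to 400 Hz search range, and the synthetic test frame are all assumptions made for this example.

```python
import math

def autocorr_pitch(frame, fs, fmin=50.0, fmax=400.0):
    """Estimate F0 in Hz for one frame from its autocorrelation peak.

    Hypothetical helper for illustration; fmin/fmax are assumed bounds.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]           # remove any DC offset
    lag_min = int(fs / fmax)                # shortest period searched
    lag_max = min(int(fs / fmin), n - 1)    # longest period searched
    best_lag, best_val = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    # Return 0.0 when no positive peak was found (treated as unvoiced)
    return fs / best_lag if best_lag else 0.0

# Synthetic voiced-like frame: 125 Hz fundamental plus a second harmonic,
# 40 ms long at an assumed 8 kHz sampling rate.
fs = 8000
f0 = 125.0
frame = [
    math.sin(2 * math.pi * f0 * t / fs)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / fs)
    for t in range(320)
]
estimate = autocorr_pitch(frame, fs)
print(f"Estimated F0: {estimate:.1f} Hz")
```

On real speech a practical detector would typically add a voicing decision, windowing, and some smoothing across frames; this sketch leaves all of that out.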

Topics Covered

  1. Introduction and digital representation of speech
  2. Time-domain and short-time analysis of speech
  3. Frequency-domain methods and spectral estimation (FFT, windows, multitaper)
  4. Sampling, quantization, and pre-processing for speech systems
  5. Linear prediction theory and LPC analysis/synthesis
  6. Formant analysis and filter-based models of the vocal tract
  7. Pitch detection, voicing, and prosodic features
  8. Cepstral analysis, MFCCs, and perceptual feature extraction
  9. Speech coding, vocoders, and low-bitrate compression
  10. Speech synthesis and concatenative/parametric techniques
  11. Introduction to statistical methods for recognition (feature pipelines)
  12. Practical MATLAB labs and algorithm implementation notes
  13. Evaluation metrics, perceptual considerations, and applications

Languages, Platforms & Tools

  • MATLAB
  • Octave
  • MATLAB Signal Processing Toolbox
  • MATLAB scripts / example code (lab exercises)

How It Compares

This book overlaps with Rabiner & Juang's Fundamentals of Speech Recognition but covers DSP more broadly and adds hands-on labs, while going less deep on HMM theory. Compared to Gold & Morgan's Speech and Audio Signal Processing, it is more focused on classical speech models (LPC, vocoders) and implementation exercises.
