Principles of Speech Coding
It is becoming increasingly apparent that all forms of communication, including voice, will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding.
Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networks
Offering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the underlying signal processing techniques used in speech coding. The authors present coding standards from various organizations, including the International Telecommunication Union (ITU). With a focus on applications such as Voice-over-IP telephony, this comprehensive text covers recent research findings on topics including:
- A general introduction to speech processing
- Digital signal processing concepts
- Sampling theory and related topics
- Principles of pulse code modulation (PCM) and adaptive differential pulse code modulation (ADPCM) standards
- Linear prediction (LP) and use of the linear predictive coding (LPC) model
- Vector quantization and its applications in speech coding
- Case studies of practical speech coders from ITU and others
- The Internet Low Bitrate Codec (iLBC)
Developed from the authors’ combined teachings, this book also illustrates its contents by providing a real-time implementation of a speech coder on a digital signal processing chip. With its balance of theory and practical coverage, it is ideal for senior-level undergraduate and graduate students in electrical and computer engineering. It is also suitable for engineers and researchers designing or using speech coding systems in their work.
Why Read This Book
You will get a practical, up-to-date grounding in the signal‑processing methods that power modern speech codecs and VoIP systems, with clear links between theory and real codec standards. The book emphasizes algorithms you can implement and tune, from LPC/CELP to adaptive filters and packet-loss mitigation, so you can design or evaluate real-world speech systems.
Who Will Benefit
Engineers and graduate students with some DSP background who design or evaluate speech codecs, VoIP systems, audio/speech applications, or telecom devices and need a practical, standards-aware treatment of speech coding and impairment mitigation.
Level: Intermediate
Prerequisites: Undergraduate signals and systems and basic digital signal processing (discrete‑time signals, z‑transform, FFT), plus basic probability/statistics and experience with MATLAB or equivalent for algorithm exploration.
Key Takeaways
- Explain the foundations of speech production, perception, and their implications for efficient coding
- Design and analyze linear predictive coding (LPC) and code‑excited linear prediction (CELP) based codecs
- Implement and compare common speech codecs and standards (e.g., G.711, G.726, G.729, AMR) and their rate‑quality tradeoffs
- Apply FFT, spectral analysis and wavelet techniques to speech analysis and transform/subband coding
- Implement adaptive filtering methods (e.g., NLMS, RLS) for echo cancellation and packet‑loss/jitter mitigation in VoIP
- Measure and evaluate speech quality using objective and subjective metrics and understand codec behavior on packet networks
Topics Covered
- 1. Introduction: Speech Coding in Packet Networks and Design Goals
- 2. Speech Production and Perception: Models and Implications for Coding
- 3. Discrete‑Time Representation, Quantization, and PCM
- 4. Linear Predictive Coding (LPC): Theory and Implementation
- 5. Code‑Excited Linear Prediction (CELP) and Residual Coding
- 6. Transform and Subband Coding; FFT and Waveform Approaches
- 7. Entropy Coding, Rate Control, and Bitstream Formatting
- 8. Adaptive Filtering: Echo Cancellation and Noise Suppression
- 9. Packet Network Impairments: Jitter, Loss, and Concealment Strategies
- 10. Standards and Practical Codecs: ITU‑T and 3GPP Implementations
- 11. Spectral Analysis, Wavelets, and Time–Frequency Methods for Speech
- 12. Objective and Subjective Quality Assessment (PESQ, POLQA, MOS)
- 13. Implementation Considerations, Optimization, and Case Studies
How It Compares
This book covers practical speech coding and VoIP impairment mitigation more directly than Quatieri's Discrete‑Time Speech Signal Processing, which is heavier on theory, and offers a more codec‑focused, implementation‑oriented alternative to Rabiner and Schafer's classic Digital Processing of Speech Signals.












