
DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement: A Survey of the State of the Art (Synthesis L

Hendriks, Richard C., Gerkmann, Timo, Jensen, Je 2013

As speech processing devices like mobile phones, voice controlled devices, and hearing aids have increased in popularity, people expect them to work anywhere and at any time without user intervention. However, the presence of acoustical disturbances limits the use of these applications, degrades their performance, or causes the user difficulties in understanding the conversation or appreciating the device. A common way to reduce the effects of such disturbances is through the use of single-microphone noise reduction algorithms for speech enhancement. The field of single-microphone noise reduction for speech enhancement comprises a history of more than 30 years of research. In this survey, we wish to demonstrate the significant advances that have been made during the last decade in the field of discrete Fourier transform domain-based single-channel noise reduction for speech enhancement. Furthermore, our goal is to provide a concise description of a state-of-the-art speech enhancement system, and demonstrate the relative importance of the various building blocks of such a system. This allows the non-expert DSP practitioner to judge the relevance of each building block and to implement a close-to-optimal enhancement system for the particular application at hand.

Table of Contents: Introduction / Single Channel Speech Enhancement: General Principles / DFT-Based Speech Enhancement Methods: Signal Model and Notation / Speech DFT Estimators / Speech Presence Probability Estimation / Noise PSD Estimation / Speech PSD Estimation / Performance Evaluation Methods / Simulation Experiments with Single-Channel Enhancement Systems / Future Directions


Why Read This Book

You will get a focused, research-level tour of single‑microphone, DFT‑domain noise reduction methods that have driven speech enhancement advances up to 2013. The survey synthesizes theoretical foundations (MMSE, Wiener, spectral subtraction), practical building blocks (noise PSD estimation, VAD, overlap‑add), and recent trends (NMF, time‑frequency masking, phase considerations) so you can both understand the literature and apply proven algorithms in real systems.

Who Will Benefit

Researchers and engineers with DSP and speech‑processing experience who are designing or evaluating single‑channel noise reduction for mobile, hearing‑aid, or voice‑interface applications.

Level: Advanced — Prerequisites: Undergraduate signals & systems and probability; familiarity with DFT/FFT, short‑time Fourier analysis (STFT), linear filtering, and basic programming in MATLAB/Python/C.

Key Takeaways

  • Implement DFT/STFT‑based noise reduction pipelines including overlap‑add, windowing, and frame processing
  • Apply and compare classical algorithms (spectral subtraction, Wiener filtering) and statistical estimators (Ephraim–Malah MMSE, log‑spectral estimators)
  • Design and implement practical noise PSD estimation and voice activity detection (VAD) methods used in single‑mic systems
  • Use time‑frequency masking, NMF and model‑based approaches for improved single‑channel separation
  • Evaluate enhancement performance with objective (PESQ, STOI, segmental SNR) and perceptual metrics and understand tradeoffs for real‑time deployment
  • Assess the role of phase, reverberation, and computational constraints when moving from simulation to embedded devices
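
To make the first two takeaways concrete, here is a minimal sketch of a DFT-domain enhancement loop: windowed frames, a Wiener-type gain per frequency bin, and overlap-add resynthesis. It is not taken from the book. The function name `enhance`, its parameters, the square-root Hann window, and the assumption that the first few frames contain noise only are all illustrative choices, and the fixed noise PSD estimate stands in for the tracking methods the book surveys.

```python
import numpy as np

def enhance(noisy, frame_len=512, hop=256, noise_frames=6, gain_floor=0.1):
    """Illustrative Wiener-type spectral gain with a fixed noise PSD estimate."""
    # Square-root periodic Hann window for analysis and synthesis. With 50%
    # overlap the squared windows sum to one, so a gain of 1 returns the input.
    n = np.arange(frame_len)
    window = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len))

    # Zero-pad so that every input sample is covered by two frames.
    padded = np.concatenate([np.zeros(hop), noisy, np.zeros(frame_len)])
    num_frames = (len(padded) - frame_len) // hop + 1
    frames = np.stack([padded[i * hop:i * hop + frame_len] * window
                       for i in range(num_frames)])
    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2

    # Assumption: frames 1..noise_frames are noise only (frame 0 is skipped
    # because half of it is zero padding).
    noise_psd = power[1:noise_frames + 1].mean(axis=0)

    # Maximum-likelihood a priori SNR estimate, then the Wiener gain
    # snr / (1 + snr), limited from below by a gain floor.
    snr = np.maximum(power / (noise_psd + 1e-12) - 1.0, 0.0)
    gain = np.maximum(snr / (1.0 + snr), gain_floor)

    # Apply the gain, return to the time domain, and overlap-add.
    out_frames = np.fft.irfft(gain * spec, n=frame_len, axis=1) * window
    out = np.zeros(len(padded))
    for i in range(num_frames):
        out[i * hop:i * hop + frame_len] += out_frames[i]
    return out[hop:hop + len(noisy)]

# Synthetic demo: 0.25 s of noise only, then a 1 kHz tone in white noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
clean = np.where(t > 0.25, np.sin(2 * np.pi * 1000 * t), 0.0)
noisy = clean + 0.1 * rng.standard_normal(fs)
enhanced = enhance(noisy)
```

The gain floor trades residual noise against musical-noise artifacts. A practical system would replace the fixed noise estimate with a tracker and the maximum-likelihood SNR estimate with a smoothed one such as the decision-directed approach.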

Topics Covered

  1. Introduction: scope, applications, and history of single‑microphone noise reduction
  2. Time‑frequency fundamentals: DFT, STFT, windows, overlap‑add and spectral analysis
  3. Signal and noise models for speech processing in the DFT domain
  4. Classical methods: spectral subtraction and early implementations
  5. Statistical estimators: Wiener filtering, Ephraim–Malah MMSE and log‑spectral estimators
  6. Noise power spectral density estimation and tracking techniques
  7. Gain functions, suppression rules, and musical noise mitigation
  8. Time‑frequency masking and binary/soft masks
  9. Model‑based approaches: NMF, HMMs and Bayesian methods for single‑channel separation
  10. Phase processing and its impact on perceptual quality
  11. Evaluation methodology: objective metrics, listening tests, and benchmark datasets
  12. Real‑time implementation issues: computational complexity, latency, and embedded constraints
  13. Applications and case studies: mobile phones, hearing aids, voice interfaces
  14. Open problems and trends (up to 2013): deep learning prospects, dereverberation, and hybrid methods
  15. Appendices: mathematical derivations, pseudocode, and reference datasets

Languages, Platforms & Tools

Languages: MATLAB, Octave, Python (NumPy/SciPy), C/C++
Platforms: Mobile phones (embedded DSP/ARM), Hearing aids, Voice‑controlled consumer devices, General purpose desktops for research
Tools: FFTW, MATLAB Signal Processing Toolbox, Voicebox, PESQ (ITU‑T), STOI implementation, BSS Eval toolbox, Audacity/sox for audio I/O

How It Compares

Compared to Philipos Loizou's 'Speech Enhancement: Theory and Practice', which is more tutorial and implementation‑oriented, this survey is more research‑centric and focuses specifically on DFT‑domain single‑microphone methods and the state of the art through 2013. It is also narrower in scope than broad edited volumes such as Benesty et al.'s 'Springer Handbook of Speech Processing', which covers multichannel and broader speech topics.

Related Books

Alan V. Oppenheim, Alan S. ...
Martin Vetterli, Jelena Kov...