DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement: A Survey of the State of the Art (Synthesis Lectures on Speech and Audio Processing)
As speech processing devices such as mobile phones, voice-controlled devices, and hearing aids have grown in popularity, people expect them to work anywhere and at any time without user intervention. However, the presence of acoustical disturbances limits the use of these applications, degrades their performance, or makes it difficult for the user to follow a conversation or tolerate the device. A common way to reduce the effects of such disturbances is through single-microphone noise reduction algorithms for speech enhancement, a field with a history of more than 30 years of research.

In this survey, we demonstrate the significant advances made during the last decade in discrete Fourier transform (DFT) domain-based single-channel noise reduction for speech enhancement. Furthermore, we provide a concise description of a state-of-the-art speech enhancement system and demonstrate the relative importance of its various building blocks. This allows the non-expert DSP practitioner to judge the relevance of each building block and to implement a close-to-optimal enhancement system for the particular application at hand.

Table of Contents: Introduction / Single Channel Speech Enhancement: General Principles / DFT-Based Speech Enhancement Methods: Signal Model and Notation / Speech DFT Estimators / Speech Presence Probability Estimation / Noise PSD Estimation / Speech PSD Estimation / Performance Evaluation Methods / Simulation Experiments with Single-Channel Enhancement Systems / Future Directions
Why Read This Book
You will get a focused, research-level tour of single‑microphone, DFT‑domain noise reduction methods that have driven speech enhancement advances up to 2013. The survey synthesizes theoretical foundations (MMSE, Wiener, spectral subtraction), practical building blocks (noise PSD estimation, VAD, overlap‑add), and recent trends (NMF, time‑frequency masking, phase considerations) so you can both understand the literature and apply proven algorithms in real systems.
Who Will Benefit
Researchers and engineers with DSP and speech‑processing experience who are designing or evaluating single‑channel noise reduction for mobile, hearing‑aid, or voice‑interface applications.
Level: Advanced — Prerequisites: Undergraduate signals & systems and probability; familiarity with DFT/FFT, short‑time Fourier analysis (STFT), linear filtering, and basic programming in MATLAB/Python/C.
Key Takeaways
- Implement DFT/STFT‑based noise reduction pipelines including overlap‑add, windowing, and frame processing
- Apply and compare classical algorithms (spectral subtraction, Wiener filtering) and statistical estimators (Ephraim–Malah MMSE, log‑spectral estimators)
- Design and implement practical noise PSD estimation and voice activity detection (VAD) methods used in single‑mic systems
- Use time‑frequency masking, NMF and model‑based approaches for improved single‑channel separation
- Evaluate enhancement performance with objective metrics (PESQ, STOI, segmental SNR) and subjective listening tests, and understand the tradeoffs involved in real-time deployment
- Assess the role of phase, reverberation, and computational constraints when moving from simulation to embedded devices
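The first takeaway above, a DFT/STFT analysis-modification-synthesis pipeline with overlap-add, can be sketched in a few lines. This is a minimal illustration rather than an algorithm from the survey itself: the function name, the power spectral subtraction rule, and the 5% spectral floor (used to limit musical noise) are illustrative assumptions.

```python
import numpy as np

def spectral_subtraction(noisy, noise_psd, frame_len=256, hop=128):
    """Minimal power spectral subtraction with windowed overlap-add.

    noisy: 1-D noisy signal; noise_psd: per-bin noise power estimate
    of length frame_len // 2 + 1 (e.g. averaged over noise-only frames).
    """
    window = np.hanning(frame_len)
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        # Analysis: window the frame and take its one-sided DFT.
        frame = noisy[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        noisy_power = np.abs(spec) ** 2
        # Modification: subtract the noise power, flooring at a small
        # fraction of the noisy power to limit musical noise.
        clean_power = np.maximum(noisy_power - noise_psd, 0.05 * noisy_power)
        gain = np.sqrt(clean_power / np.maximum(noisy_power, 1e-12))
        # Synthesis: inverse DFT, synthesis window, overlap-add.
        out[start:start + frame_len] += np.fft.irfft(gain * spec) * window
    return out
```

In practice the analysis/synthesis windows and hop size are chosen to satisfy a constant-overlap-add condition, one of the framing details the survey's building-block discussion covers.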
Topics Covered
- 1. Introduction: scope, applications, and history of single‑microphone noise reduction
- 2. Time‑frequency fundamentals: DFT, STFT, windows, overlap‑add and spectral analysis
- 3. Signal and noise models for speech processing in the DFT domain
- 4. Classical methods: spectral subtraction and early implementations
- 5. Statistical estimators: Wiener filtering, Ephraim–Malah MMSE and log‑spectral estimators
- 6. Noise power spectral density estimation and tracking techniques
- 7. Gain functions, suppression rules, and musical noise mitigation
- 8. Time‑frequency masking and binary/soft masks
- 9. Model‑based approaches: NMF, HMMs and Bayesian methods for single‑channel separation
- 10. Phase processing and its impact on perceptual quality
- 11. Evaluation methodology: objective metrics, listening tests, and benchmark datasets
- 12. Real‑time implementation issues: computational complexity, latency, and embedded constraints
- 13. Applications and case studies: mobile phones, hearing aids, voice interfaces
- 14. Open problems and trends (up to 2013): deep learning prospects, dereverberation, and hybrid methods
- Appendices: mathematical derivations, pseudocode, and reference datasets
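The statistical-estimator and gain-function chapters listed above revolve around gains driven by an a priori SNR estimate. As a concrete illustration of one such building block, here is a sketch of the decision-directed a priori SNR estimator of Ephraim and Malah driving a Wiener gain; the function name, the known-noise-PSD assumption, and the gain floor are assumptions for this example, not the book's reference implementation.

```python
import numpy as np

def wiener_gains(noisy_spectra, noise_psd, alpha=0.98, gain_floor=0.1):
    """Per-frame Wiener gains from the decision-directed a priori SNR.

    noisy_spectra: (n_frames, n_bins) complex STFT of the noisy signal.
    noise_psd: (n_bins,) noise power estimate, assumed known here.
    alpha: smoothing factor of the decision-directed rule.
    """
    n_frames, n_bins = noisy_spectra.shape
    gains = np.empty((n_frames, n_bins))
    prev_clean_power = np.zeros(n_bins)
    for t in range(n_frames):
        noisy_power = np.abs(noisy_spectra[t]) ** 2
        snr_post = noisy_power / noise_psd  # a posteriori SNR
        # Decision-directed rule: blend the previous frame's clean-speech
        # power estimate with the current maximum-likelihood estimate.
        snr_prio = (alpha * prev_clean_power / noise_psd
                    + (1 - alpha) * np.maximum(snr_post - 1.0, 0.0))
        # Wiener gain, floored to limit musical noise.
        gains[t] = np.maximum(snr_prio / (1.0 + snr_prio), gain_floor)
        prev_clean_power = (gains[t] * np.abs(noisy_spectra[t])) ** 2
    return gains
```

The smoothing factor `alpha` trades off musical-noise suppression against responsiveness to speech onsets, a tradeoff the survey analyzes in depth.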
Languages, Platforms & Tools
The survey is language-agnostic: algorithms are specified mathematically, and the stated prerequisites assume only basic programming ability in MATLAB, Python, or C for implementation and experimentation.
How It Compares
Compared to Philipos Loizou's 'Speech Enhancement: Theory and Practice' (more tutorial and implementation‑oriented), this survey is more research‑centric and focused specifically on DFT‑domain single‑microphone methods and the state of the art through 2013; it is also narrower in scope than broad edited volumes like Benesty et al.'s 'Springer Handbook of Speech Processing' which covers multichannel and broader speech topics.