DSPRelated.com
An application of neural networks to adaptive playout delay in VoIP

Ying Zhang, Damien Fay
Still Relevant, Intermediate

The statistical nature of data traffic and the dynamic routing techniques employed in IP networks result in a varying network delay (jitter) experienced by the individual IP packets that form a VoIP flow. As a result, voice packets generated at successive, periodic intervals at a source are typically buffered at the receiver prior to playback in order to smooth out the jitter. However, the additional delay introduced by the playout buffer degrades the quality of service. Thus, the ability to forecast the jitter is an integral part of selecting an appropriate buffer size. This paper compares several neural-network-based models for adaptive playout buffer selection and, in particular, proposes a novel combined wavelet transform/neural network approach. The effectiveness of these algorithms is evaluated using recorded VoIP traces by comparing the buffering delay and the packet loss ratios for each technique. In addition, an output speech signal is reconstructed based on the packet loss information for each algorithm, and the perceptual quality of the speech is then estimated using the PESQ MOS algorithm. Simulation results indicate that the proposed Haar-Wavelets-Packet MLP and Statistical-Model MLP adaptive scheduling schemes offer superior performance.


Summary

This 2007 paper evaluates neural-network models for adaptive playout buffer selection in VoIP and introduces a combined wavelet-transform plus neural-network approach for jitter forecasting. The reader will learn how wavelet pre-processing improves delay-prediction accuracy and how predicted jitter can drive buffer sizing to trade off latency and packet loss.
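To make the wavelet pre-processing idea concrete, here is a minimal sketch of a multi-level Haar decomposition applied to a short window of packet delays. It is an illustration only, not the paper's implementation: it uses a plain Haar DWT (only the approximation branch is split further) rather than the wavelet packet decomposition named in the abstract, and the function names `haar_step` and `haar_features`, the window length, and the sample delays are all assumptions made for this example.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(window, levels=2):
    """Flatten a multi-level Haar DWT into one feature vector.

    Layout: detail coefficients of level 1, then level 2, ..., followed by
    the final approximation coefficients. The window length must be
    divisible by 2**levels.
    """
    features = []
    approx = list(window)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.extend(detail)
    features.extend(approx)
    return features


# Hypothetical window of eight one-way delays in milliseconds.
delays_ms = [40.0, 42.0, 41.0, 55.0, 43.0, 41.0, 40.0, 44.0]
features = haar_features(delays_ms, levels=2)
```

A vector like `features` could then be fed to a neural-network predictor in place of the raw delay window; the detail coefficients isolate short-lived spikes while the approximation coefficients track the slower trend.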

Key Takeaways

  • Implement neural-network predictors (e.g., MLP/RBF) to forecast per-packet network delay from recent delay samples.
  • Apply a discrete wavelet transform (the abstract names a Haar wavelet packet scheme) as pre-processing to denoise and extract multi-scale delay features before NN training.
  • Select playout buffer size dynamically from predicted jitter to balance end-to-end delay against packet loss.
  • Evaluate adaptive playout algorithms on real packet traces using objective QoS metrics (packet loss, mean playout delay, MOS/PESQ).
  • Integrate the predictor into a VoIP receiver with attention to real-time constraints and model-update strategies.
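The buffer-sizing and evaluation steps in the list above can be sketched as a small trace-driven simulation. This is a hedged illustration under stated assumptions, not the paper's algorithm: an exponentially weighted mean and deviation estimate stands in for the neural-network predictor, the playout deadline is adapted per packet rather than per talkspurt, and the name `simulate_playout` and the parameters `gain` and `safety` are hypothetical.

```python
def simulate_playout(delays_ms, gain=0.1, safety=4.0):
    """Replay a delay trace through a simple adaptive playout rule.

    For each packet after the first, the playout delay is set from the
    current estimates as mean + safety * deviation *before* the packet's
    own delay is observed. A packet whose network delay exceeds that value
    misses its deadline and counts as lost.

    Returns (loss_ratio, mean_buffering_delay_ms), where the buffering
    delay is averaged over on-time packets only.
    """
    if not delays_ms:
        raise ValueError("need at least one delay sample")

    mean = delays_ms[0]  # the first packet only seeds the estimator
    dev = 0.0
    late = 0
    buffering = []

    for d in delays_ms[1:]:
        playout = mean + safety * dev
        if d > playout:
            late += 1
        else:
            buffering.append(playout - d)
        # Stand-in predictor update (assumption): exponentially weighted
        # estimates of the delay mean and its absolute deviation.
        mean += gain * (d - mean)
        dev += gain * (abs(d - mean) - dev)

    scored = len(delays_ms) - 1
    loss_ratio = late / scored if scored else 0.0
    mean_buffering = sum(buffering) / len(buffering) if buffering else 0.0
    return loss_ratio, mean_buffering


# Hypothetical trace: steady 50 ms delay with a single 200 ms spike.
loss, buffering_ms = simulate_playout([50.0, 50.0, 50.0, 200.0, 50.0])
```

Raising `safety` lowers the loss ratio at the cost of a larger buffering delay, which is the trade-off the adaptive schemes are compared on. Perceptual scoring with PESQ needs the reconstructed speech signal and is outside the scope of this sketch.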

Who Should Read This

Engineers and researchers working on VoIP, real-time communications, or audio/speech processing with experience in DSP and machine learning who want practical methods for jitter prediction and adaptive buffer control.

Topics

Audio Processing, Communications, Wavelets, Machine Learning
