Variable Delay Speech Communication over Packet-Switched Networks

PhD Student 
Sawar Muhammad Esahn
Research Area


Packet-switched networks are primarily designed for data traffic and introduce extra delay and packet losses when used for real-time interactive speech communication. In this thesis, we study the effect of delay and packet loss on end-to-end speech quality. We approach these problems from three different angles. First, we introduce a new theoretical model, the delay-distortion model, which incorporates delay as a main impairment factor. The contributions of source coding delay and network delay are included explicitly in the formulation to obtain delay-distortion curves. Second, we present a distortion-rate (DR) analysis of packet loss concealment (PLC) methods, which also includes the effect of buffering packets on the PLC method. We use the DR function to develop a new speech quality predictor that incorporates both delay and packet losses. This predictor is compared with PESQ and found to be consistent in predicting the PESQ score. Third, we introduce two methods to estimate the delay distribution at the receiver in order to control the playout buffer. These methods are based on truncated Gaussians and on the principle of maximum entropy. We demonstrate the effectiveness of the methods by running voice over Internet Protocol (VoIP) simulations using the NS-2 network simulator and by applying them to Internet delay traces.
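To illustrate how an estimated delay distribution can drive a playout buffer, the following is a minimal sketch and not the estimator developed in the thesis. It assumes that network delay follows a Gaussian truncated from below at a known minimum delay, that the parameters mu, sigma and lower are already available, and that the playout delay is chosen as the quantile that keeps the fraction of late packets at a target value. The function name playout_delay and all numeric values are hypothetical.

```python
from statistics import NormalDist


def playout_delay(mu, sigma, lower, late_loss):
    """Playout delay d with P(delay > d) == late_loss.

    Assumption (for illustration only): delay follows a normal(mu, sigma)
    distribution truncated from below at `lower`. All delays share one
    unit, for example milliseconds.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if not 0.0 < late_loss < 1.0:
        raise ValueError("late_loss must lie strictly between 0 and 1")
    base = NormalDist(mu, sigma)
    # Probability mass of the untruncated Gaussian below the truncation point.
    f_low = base.cdf(lower)
    # Quantile of the truncated distribution, mapped back to the base Gaussian.
    target = f_low + (1.0 - late_loss) * (1.0 - f_low)
    return base.inv_cdf(target)


# Hypothetical parameters: mean 50 ms, std 10 ms, minimum delay 40 ms,
# and at most 1 % of packets allowed to arrive after their playout time.
d = playout_delay(mu=50.0, sigma=10.0, lower=40.0, late_loss=0.01)
print(f"playout delay: {d:.2f} ms")
```

A smaller late_loss target yields a longer playout delay, which is the trade-off between buffering delay and late-packet loss that the abstract refers to.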


This thesis is supervised by Gernot Kubin.