Signal Processing and Speech Communication Laboratory

Variable Delay Speech Communication over Packet-Switched Networks

Status
Finished
Student
Muhammad Sarwar Ehsan
Mentor
Gernot Kubin

Packet-switched networks are designed primarily for data traffic, and they introduce additional delay and packet loss when used for real-time interactive speech communication. In this thesis, we study the effect of delay and packet loss on end-to-end speech quality from three angles. First, we introduce a new theoretical model, the delay-distortion model, which incorporates delay as a main impairment factor. The formulation explicitly includes the contributions of source coding delay and network delay, and from it we obtain delay-distortion curves. Second, we present a distortion-rate (DR) analysis of packet loss concealment (PLC) methods that also accounts for the effect of buffering packets on the PLC method. We use the DR function to develop a new speech quality predictor that incorporates both delay and packet loss. This predictor is compared with PESQ (Perceptual Evaluation of Speech Quality) and found to be consistent in predicting the PESQ score. Third, we introduce two methods for estimating the delay distribution at the receiver in order to control the playout buffer. These methods are based on truncated Gaussians and on the principle of maximum entropy. We demonstrate the effectiveness of the methods in Voice over Internet Protocol (VoIP) simulations using the NS-2 network simulator and with Internet delay traces.
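To make the idea of controlling a playout buffer from an estimated delay distribution more concrete, the following is a minimal Python sketch. It is not the estimator developed in the thesis. It assumes a Gaussian truncated from below at the smallest observed delay, and it uses the plain sample mean and standard deviation as rough plug-in parameters, which is a simplification. The function name `playout_deadline` and the parameter `late_loss_target` are hypothetical, chosen for illustration only.

```python
from statistics import NormalDist, mean, stdev


def playout_deadline(delays_ms, late_loss_target=0.01):
    """Return a playout deadline in ms for a lower-truncated Gaussian delay model.

    The deadline is the (1 - late_loss_target) quantile of the model, so that
    roughly a fraction late_loss_target of packets would arrive too late.
    Assumptions (illustrative only): truncation point = minimum observed delay;
    mu and sigma = sample mean and sample standard deviation.
    """
    if not 0.0 < late_loss_target < 1.0:
        raise ValueError("late_loss_target must lie strictly between 0 and 1")

    mu = mean(delays_ms)
    sigma = stdev(delays_ms)  # needs at least two samples
    if sigma == 0.0:
        raise ValueError("delay samples must not all be identical")

    lower = min(delays_ms)  # assumed truncation point
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1

    # Probability mass of the untruncated Gaussian lying below the truncation point.
    p_lower = std_normal.cdf((lower - mu) / sigma)

    # Map the target quantile of the truncated model back to the untruncated CDF.
    q = 1.0 - late_loss_target
    p = p_lower + q * (1.0 - p_lower)

    return mu + sigma * std_normal.inv_cdf(p)


# Hypothetical one-way delay samples in milliseconds.
sample_delays = [40, 42, 45, 50, 48, 41, 60, 43, 44, 47]
deadline_5pct = playout_deadline(sample_delays, late_loss_target=0.05)
deadline_1pct = playout_deadline(sample_delays, late_loss_target=0.01)
```

A smaller late-loss target pushes the deadline further out, which is the basic trade-off between buffering delay and late packet loss that a playout buffer has to balance.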