Perceptual-Domain Coding of Speech and Audio

The goal of the proposed research is to develop a new and efficient source coder for speech and audio signals based on coding in the perceptual domain. In this approach, the signal is transformed into an auditory representation by passing it through a model of the human peripheral auditory system. The auditory representation is quantized and encoded for efficient digital transmission or storage. Upon decoding, the auditory representation is transformed back into the acoustic domain using an inverse of the auditory model. Auditory modeling and research on perceptual-domain coding provide insight into human perception and facilitate the extraction of the signal features that are most relevant to the listener. The findings not only yield a new coding method for transmission and storage but also support the development of next-generation hearing aids and cochlear implants. The interdisciplinary nature of perceptual-domain coding calls for consultation and cooperation with experts in information theory as well as hearing physiology. In collaboration with Professor Bastiaan Kleijn and his research group in Stockholm, an optimum quantizer for the encoding of auditory representations will be designed. In cooperation with Professor Roy Patterson in Cambridge, a more accurate auditory model will be investigated and incorporated into the perceptual-domain coder.
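The processing chain described above (auditory transform, quantization, decoding, inverse transform) can be illustrated with a minimal sketch. It is not the project's coder: the auditory model is replaced by a simple invertible compressive map (mu-law companding), the quantizer is a plain uniform quantizer rather than an optimized one, and all function names and parameter values are assumptions chosen for illustration.

```python
import math

MU = 255.0  # compression strength of the stand-in model (assumed value)

def to_perceptual(x):
    """Stand-in for the auditory model: invertible compressive map on [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def from_perceptual(y):
    """Inverse of the stand-in model, back to the acoustic domain."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize(y, bits):
    """Uniform quantizer on [-1, 1]; returns an integer index."""
    step = 2.0 / (2 ** bits - 1)
    return round((y + 1.0) / step)

def dequantize(index, bits):
    """Map an integer index back to a value in [-1, 1]."""
    step = 2.0 / (2 ** bits - 1)
    return index * step - 1.0

def encode(signal, bits=8):
    """Transform each sample to the stand-in perceptual domain and quantize it."""
    return [quantize(to_perceptual(x), bits) for x in signal]

def decode(indices, bits=8):
    """Dequantize and apply the inverse model to return to the acoustic domain."""
    return [from_perceptual(dequantize(i, bits)) for i in indices]

# Toy input: 100 samples of a sinusoid with amplitude 0.5.
signal = [0.5 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
indices = encode(signal)
decoded = decode(indices)
max_error = max(abs(a - b) for a, b in zip(signal, decoded))
```

Because quantization happens after the compressive map, small-amplitude samples are reconstructed with a smaller absolute error than large ones. A full auditory model would shape the quantization error along further perceptual dimensions, such as frequency and time.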


Institut für Signalverarbeitung und Sprachkommunikation
Kungliga Tekniska Högskolan, KTH (Sweden)
University of Cambridge (United Kingdom)
Funding Program: 
Fonds zur Förderung der wissenschaftlichen Forschung, FWF (Austria)
Research Area: 
2006 - 2007