Perceptual-Domain Coding of Speech and Audio

Period

2006 — 2007

Funding

Fonds zur Förderung der wissenschaftlichen Forschung, FWF (Austria)

Partners

Kungliga Tekniska Högskolan, KTH (Sweden)
University of Cambridge (United Kingdom)
Institut für Signalverarbeitung und Sprachkommunikation

Research Areas

Speech Communication

The goal of the proposed research is to develop a new, efficient source coder for speech and audio signals based on coding in the perceptual domain. In this approach, the signal is transformed into an auditory representation by passing it through a model of the human peripheral auditory system. The auditory representation is then quantized and encoded for efficient digital transmission or storage. Upon decoding, the auditory representation is transformed back into the acoustic domain using an inverse of the auditory model.

Auditory modeling and research on perceptual-domain coding provide insight into human perception and facilitate the extraction of the signal features most relevant to the listener. The findings not only yield a new coding method for transmission and storage but also assist the development of next-generation hearing aids and cochlear implants.

The interdisciplinary nature of perceptual-domain coding calls for consultation and cooperation with experts in information theory and hearing physiology. In collaboration with Professor Bastiaan Kleijn and his research group in Stockholm, an optimum quantizer for encoding auditory representations is to be designed. In cooperation with Professor Roy Patterson in Cambridge, a more accurate auditory model is to be investigated and incorporated into the perceptual-domain coder.
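
The processing chain described in this abstract (auditory model, quantization, inverse auditory model) can be illustrated with a minimal sketch. It is not the project's coder: the sample-wise power-law compression below is only a toy, invertible stand-in for a real model of the peripheral auditory system, and the uniform scalar quantizer is a placeholder for the optimum quantizer the project aims to design. All function names and the constants `EXPONENT` and `STEP` are assumptions made for illustration.

```python
import math

# Hypothetical illustration values, not taken from the project.
EXPONENT = 0.3  # compressive exponent of the toy "auditory model"
STEP = 0.05     # quantizer step size in the perceptual domain


def to_perceptual(x):
    """Toy stand-in for an auditory model: sample-wise power-law compression."""
    return [math.copysign(abs(s) ** EXPONENT, s) for s in x]


def to_acoustic(y):
    """Inverse of the toy model: expands values back to the acoustic domain."""
    return [math.copysign(abs(v) ** (1.0 / EXPONENT), v) for v in y]


def quantize(y, step=STEP):
    """Uniform scalar quantizer: maps each value to an integer index."""
    return [round(v / step) for v in y]


def dequantize(indices, step=STEP):
    """Maps integer indices back to reconstruction levels."""
    return [i * step for i in indices]


def encode(x):
    """Acoustic signal -> perceptual representation -> quantizer indices."""
    return quantize(to_perceptual(x))


def decode(indices):
    """Quantizer indices -> perceptual representation -> acoustic signal."""
    return to_acoustic(dequantize(indices))


# A short test tone: 100 samples of a sinusoid with amplitude 0.5.
signal = [0.5 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
indices = encode(signal)
reconstructed = decode(indices)
max_error = max(abs(a - b) for a, b in zip(signal, reconstructed))
```

In this sketch the quantization error is bounded by half a step in the perceptual domain rather than in the acoustic domain, which is the basic idea behind quantizing the auditory representation instead of the waveform. A realistic auditory model is multi-channel and far harder to invert than this sample-wise nonlinearity.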