Anthropomorphic Coding of Speech and Audio

Anthropomorphic signal processing develops computational models of human communication modalities that emulate the physiological processes of their natural counterparts. Widely known examples are articulatory models for speech synthesis and hearing models for recognition. In speech and audio coding, the decoder's task is to synthesize signals that evoke the same auditory response as the original signal, regardless of the signal's source. Although much is known about human audition and the related neural code, resynthesis of audible waveforms from such a code has been achieved only recently. We develop one such auditory model inversion approach and investigate its application to speech and audio coding. The approach exhibits surprisingly low sensitivity to amplitude quantization errors and to random channel erasures, such as those encountered during transmission over heterogeneous communication networks, where a specific quality of service is hard to guarantee.

Kungliga Tekniska Högskolan, KTH (Sweden)
Funding Program: 
Technische Universität Graz, Erzherzog Johann Universität, TU Graz (Austria)