Guest Lecture: Michael Pucher


Michael Pucher from the Telecommunications Research Center Vienna (FTW) will give a guest lecture about

“Speech processing for multimodal and adaptive systems”

on Friday, May 29th, 11:00, in our seminar room IGI IC01074, Inffeldgasse 16b, first floor.

Abstract

In this talk I will present research results in multimodal speech processing achieved by my group and me at the Telecommunications Research Center Vienna (FTW).

In the field of acoustic speech synthesis, I will show how supervised and unsupervised variety interpolation can be implemented, which allows for the generation of in-between varieties. Furthermore, I will discuss how to adapt models to a speaker’s social and regional variety. In audio-visual speech synthesis, joint audio-visual modeling and speaker adaptation will be presented; with these methods, we are able to use the available training data more efficiently than with the standard method. For speaker verification, I will show how adaptive speech synthesis systems can be used to break into a state-of-the-art speaker verification system, and how this can be blocked by an adaptive speaker verification system.
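
As a rough illustration of the interpolation idea (not the talk's actual method), here is a minimal Python sketch. It assumes a statistical parametric synthesis setting in which each variety is represented by a vector of acoustic model parameters (e.g. Gaussian means), so that an in-between variety can be obtained by linearly interpolating the two parameter sets; the function name and the toy numbers are hypothetical.

import numpy as np

def interpolate_varieties(params_a, params_b, alpha):
    """Linearly interpolate acoustic model parameters of two varieties.

    Hypothetical sketch: alpha = 0.0 yields pure variety A,
    alpha = 1.0 yields pure variety B, and intermediate values
    yield "in-between" varieties.
    """
    params_a = np.asarray(params_a, dtype=float)
    params_b = np.asarray(params_b, dtype=float)
    return (1.0 - alpha) * params_a + alpha * params_b

# Toy example with made-up spectral parameter vectors for two varieties
# (e.g. a dialect model and a standard-language model).
variety_a = [1.2, -0.4, 0.8]
variety_b = [0.9, -0.1, 1.1]
print(interpolate_varieties(variety_a, variety_b, 0.5))  # halfway variety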

Biography

Michael Pucher received the Ph.D. degree from Graz University of Technology in 2007 with a thesis on semantic language modeling for speech recognition. He is a Senior Researcher and Project Manager at the Telecommunications Research Center Vienna (FTW). His research interests are speech synthesis and recognition, multimodal dialog systems, and sensor fusion. He has authored and co-authored more than 40 refereed papers in international conference proceedings and journals. In 2010, he was involved in the commercial development of Leopold, the first synthetic voice for Austrian German. In 2011, Dr. Pucher was awarded a research grant from the Austrian Science Fund (FWF) for the project “Adaptive Audio-Visual Dialect Speech Synthesis” (AVDS).




Date and Time
29 May 2015, 11:00
Location
Seminar room IGI IC001074 (Inffeldgasse 16b - first floor)