Signal Processing and Speech Communication Laboratory

Guest Lecture: Junichi Yamagishi

Start date/time
Tue Nov 18 10:00:00 2014
End date/time
Tue Nov 18 10:00:00 2014
Location
IC01074 Inffeldgasse 16b, first floor

Prof. Dr. Junichi Yamagishi from the National Institute of Informatics, Japan, and the Centre for Speech Technology Research, University of Edinburgh, will present his work

"Deep, deep, deep architecture for speech synthesis" on Tuesday, November 18, 2014, at 11:00, in our seminar room IC01074, Inffeldgasse 16b, first floor.

Abstract: Current statistical parametric speech synthesis typically uses hidden Markov models (HMMs) to represent the probability densities of speech trajectories given texts. A new approach now attracting strong attention from speech synthesis researchers is deep learning, i.e., deep neural networks (DNNs), and several DNN-based attempts are emerging, especially for acoustic and prosody modeling. In this talk, after an overview of HMM-based speech synthesis and its benefits, we introduce our latest approach using multiple DNNs, in which 1) we use a deep denoising auto-encoder for nonlinear feature extraction from spectra instead of conventional linear mel-cepstral analysis, 2) we use a DNN to learn the relationship between input texts and the extracted features instead of decision-tree-based state tying, and 3) we use another DNN to model the conditional probability of the spectral differences between natural and synthetic speech and to reconstruct the spectral fine structure.
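
To make the three-component pipeline in the abstract concrete, the following PyTorch snippet is a minimal sketch of how the pieces could be wired together: a denoising auto-encoder for spectral feature extraction, a text-to-feature DNN, and a post-filter DNN predicting the natural-minus-synthetic spectral residual. All layer sizes, variable names, and the corruption noise are illustrative assumptions, not the authors' actual implementation or training recipe.

```python
# A minimal sketch of the multi-DNN pipeline described in the abstract.
# Layer sizes, names, and the noise model are assumptions for illustration.
import torch
import torch.nn as nn

SPEC_DIM = 513   # assumed spectral frame size (e.g., FFT bins)
CODE_DIM = 64    # assumed bottleneck size of the auto-encoder
TEXT_DIM = 300   # assumed linguistic/context feature size per frame

# 1) Deep denoising auto-encoder: nonlinear feature extraction from spectra,
#    replacing linear mel-cepstral analysis.
class DenoisingAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(SPEC_DIM, 256), nn.ReLU(),
            nn.Linear(256, CODE_DIM), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Linear(CODE_DIM, 256), nn.ReLU(),
            nn.Linear(256, SPEC_DIM))

    def forward(self, spectrum):
        noisy = spectrum + 0.1 * torch.randn_like(spectrum)  # corrupt input
        code = self.encoder(noisy)
        return self.decoder(code), code

# 2) Text-to-feature DNN: maps linguistic features to the learned codes,
#    replacing decision-tree-based state tying.
text_to_code = nn.Sequential(
    nn.Linear(TEXT_DIM, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, CODE_DIM))

# 3) Post-filter DNN: predicts the spectral difference between natural and
#    synthetic speech to restore spectral fine structure.
postfilter = nn.Sequential(
    nn.Linear(SPEC_DIM, 512), nn.ReLU(),
    nn.Linear(512, SPEC_DIM))

if __name__ == "__main__":
    dae = DenoisingAutoEncoder()
    text_feats = torch.randn(8, TEXT_DIM)        # 8 frames of text features
    code = text_to_code(text_feats)              # text -> acoustic code
    synthetic = dae.decoder(code)                # code -> coarse spectrum
    refined = synthetic + postfilter(synthetic)  # add predicted residual
    print(refined.shape)                         # torch.Size([8, 513])
```

At synthesis time, only the decoder half of the auto-encoder is needed; the encoder serves to define the code space that the text-to-feature DNN is trained to predict.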