Guest lecture by Pejman Mowlaee

"Recent Advances in Machine Listening in Multisource Reverberant Environments"

 

Abstract:

Recent years have seen a growing demand for truly immersive communication, driven by the desire to make mediated communication more natural and transparent. In particular, enhancing the single-channel recording of a target speaker in a multisource reverberant environment remains a challenge, and it is highly important for designing robust systems in applications such as automatic speech and speaker recognition and speech capture for immersive communication.
In this presentation, we start from the co-channel speech separation problem and then move toward more challenging scenarios that include additive background noise as well as reverberation. I will present two contributions to co-channel speech separation from my PhD thesis [1]: a new estimator for the speech-speech interaction model [2], and the use of sinusoidal modeling for separation [3]. As the second step toward a real-life communication challenge, we turn to a multisource reverberant environment in which we want to focus on a single speaker, termed the target speaker. I will describe the ongoing challenges and present my contribution to solving the problem. Separation and enhancement results of the proposed systems are demonstrated on the Machine Listening in Multisource Environments (CHiME) challenge and the third community-based Signal Separation Evaluation Campaign (SiSEC 2011), recently introduced by the PASCAL network. In both steps, only the magnitude spectrum of the desired signal is enhanced, while the phase is left unaltered and used directly in the reconstruction stage. I will describe the two ways in which phase information affects a separation system, and present recent results on phase estimation [4] as well as a phase-based estimator [5]. Finally, we will briefly discuss how to estimate the sound quality of a speech communication system [2], [6].

 

Biography:

Pejman Mowlaee received the B.Sc. and M.Sc. degrees, both with distinction, from Guilan University, Rasht, Iran, and Iran University of Science and Technology, Tehran, in 2005 and 2007, respectively. He received his Ph.D. degree from Aalborg University, Aalborg, Denmark, in 2010, where he was supported by the European Union through a Marie Curie Fellowship (ESTSIGNAL program). He is now a Marie Curie postdoctoral fellow in the Audiology in signal processing (AUDIS) project at the Institute of Communication Acoustics, Ruhr-Universität Bochum, Germany. AUDIS is funded by the European Union under the Framework 7 People Marie-Curie Programme. His research interests include digital signal processing theory and methods with application to machine learning and speech signal processing, in particular speech separation and enhancement. Dr. Mowlaee has received several awards during his academic career, including the Young Researchers Award for the M.Sc. degree, an honored electrical engineering M.Sc. thesis in a nationwide contest among Iranian electrical engineering students, and the Young Researchers Award from Tehran Polytechnic.

 

References:

[1] P. Mowlaee, New Strategies for Single-channel Speech Separation, Ph.D. thesis, Institut for Elektroniske Systemer, Aalborg Universitet, 2010.
[2] P. Mowlaee, R. Saeidi, Z. H. Tan, M. G. Christensen, T. Kinnunen, S. H. Jensen, and P. Fränti, “A joint approach for single-channel speaker identification and speech separation,” to appear in IEEE Trans. Audio, Speech, and Language Process., 2012.
[3] P. Mowlaee, M. Christensen, and S. Jensen, “New results on single-channel speech separation using sinusoidal modeling,” IEEE Trans. Audio, Speech, and Language Process., vol. 19, no. 5, pp. 1265–1277, 2011.
[4] P. Mowlaee, R. Saeidi, and R. Martin, “Phase estimation for signal reconstruction in single-channel speech separation,” in Proceedings of the International Conference on Spoken Language Processing, 2012.
[5] P. Mowlaee and R. Martin, “On phase importance in parameter estimation for single-channel source separation,” in Proceedings of the International Workshop on Acoustic Signal Enhancement (IWAENC), 2012.
[6] P. Mowlaee, R. Saeidi, M. G. Christensen, and R. Martin, “Subjective and objective quality assessment of single-channel speech separation algorithms,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Mar. 2012, pp. 69–72.



Date and time:
11 July 2012, 11:30

Location:
Seminar room IDEG134, Inffeldgasse 16c, ground floor