Target Speaker Separation in a Multisource Environment Using Speaker-dependent Postfilter and Noise Estimation

Title: Target Speaker Separation in a Multisource Environment Using Speaker-dependent Postfilter and Noise Estimation
Publication Type: Conference Proceedings
Year of Publication: 2013
Authors: Mowlaee, P., & Saeidi, R.
Conference Name: IEEE Int. Conf. Acoustics, Speech, Signal Processing, May 2013
Pages: 7254-7258
Date Published: 2013
Conference Location: Vancouver, Canada
Abstract

In this paper, we present a novel system for enhancing target speech corrupted by non-stationary, real-life noise. The proposed system consists of a spatial beamformer, based on the time delay of arrival estimated with GCC-PHAT, followed by three postfilters applied sequentially: a Wiener filter, a minimum mean square error (MMSE) estimator of the log-amplitude, and a model-driven postfilter (MDP) that relies on speech signal statistics captured by a Gaussian mixture model of the target speaker. The beamformer accounts for directional interference, the MMSE speech enhancement suppresses stationary background noise, and the MDP helps suppress non-stationary sources in the binaural mixture. In our evaluation, multiple objective quality metrics are used to report speech enhancement and separation performance, averaged over the CHiME development set. The proposed system performs better than standard state-of-the-art techniques and shows performance comparable to other systems submitted to the CHiME challenge. In particular, it successfully suppresses non-stationary interfering sources at different SNR levels, as supported by its relatively high signal-to-interference ratio scores.
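
The abstract names GCC-PHAT as the time-delay estimator that drives the beamformer. As an illustration only, the following is a minimal Python/NumPy sketch of generic GCC-PHAT delay estimation between two channels. It is not the authors' implementation (their code is in the attached Matlab archive), and the function name `gcc_phat_delay` and the parameters `sig`, `ref`, `fs`, and `max_tau` are assumptions introduced here.

```python
import numpy as np


def gcc_phat_delay(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds (generic GCC-PHAT).

    Hypothetical helper for illustration; names and defaults are assumptions.
    A positive result means `sig` lags behind `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep the phase, drop the magnitude.
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to plausible lags (e.g. bounded by microphone spacing).
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    ref = rng.standard_normal(1024)
    delay = 5  # samples
    sig = np.concatenate((np.zeros(delay), ref[:-delay]))
    print(gcc_phat_delay(sig, ref, fs) * fs)
```

In a two-microphone (binaural) setup, such an estimate could be used to time-align the channels before summing them, which is the basic idea behind a delay-and-sum beamformer; the beamformer design actually used in the paper is not specified in this record.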

 

URL: https://ieeexplore.ieee.org/document/6639071/
Citation Key: ICASSP2013c
Refereed Designation: Refereed
Attachment (Size): ICASSP2013_Matlab.rar (42.63 KB)