SPSC Lab


In 2000, the Signal Processing and Speech Communication Laboratory (SPSC Lab) of Graz University of Technology (TU Graz) was founded as a research and education center in nonlinear signal processing and computational intelligence, algorithm engineering, as well as circuits & systems modeling and design. It covers applications in wireless communications, speech/audio communication, and telecommunications.

The research of the SPSC Lab addresses fundamental and applied problems in five scientific areas.

 

Result of the Month February 2016


Figure: System overview and GSC.

Recognizing speech under noisy conditions is an ill-posed problem. The CHiME3 challenge targets robust speech recognition in realistic environments such as streets, buses, cafés, and pedestrian areas. We study variants of beamformers used for pre-processing multi-channel speech recordings. In particular, we investigate three variants of generalized sidelobe canceller (GSC) beamformers: GSC with a sparse blocking matrix (BM), GSC with an adaptive BM (ABM), and GSC with minimum variance distortionless response (MVDR) and ABM. Furthermore, we apply several postfilters to further enhance the speech signal. We introduce MaxPower postfilters and deep neural postfilters (DPFs). DPFs significantly outperformed our baseline systems in terms of the overall perceptual score (OPS) and the perceptual evaluation of speech quality (PESQ). In particular, DPFs achieved average relative improvements of 17.54% in OPS and 18.28% in PESQ compared to the CHiME3 baseline. DPFs also achieved the best word error rate (WER) when combined with an automatic speech recognition (ASR) engine on simulated development and evaluation data, namely 8.98% and 10.82%, respectively. The proposed MaxPower beamformer achieved the best overall WER on CHiME3 real development and evaluation data, namely 14.23% and 22.12%, respectively.
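For readers unfamiliar with the WER figures quoted above, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and chosen for illustration only; this is not the scoring tool used in the CHiME3 evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Illustrative sketch only; tokenization is a plain whitespace split,
    which is an assumption and not taken from the CHiME3 scoring setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing in hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("the bus is late", "the bus late"))  # 0.25
```

Because insertions are counted as errors but do not increase the denominator, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.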