Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge

Title: Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Schrank T., Pfeifenberger L., Zöhrer M., Stahl J., Mowlaee P., & Pernkopf F.
Conference Name: CHiME 4 Workshop
Abstract

Robust automatic speech recognition in adverse environments is a challenging task. We address the multi-channel tracks of the 4th CHiME challenge [1] by proposing a deep eigenvector beamformer as the front-end. To train the acoustic models, we propose supplementing the beamformed data with the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time-scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development data and 8.98% on the real evaluation data for the 6-channel track, and 6.45% and 13.69%, respectively, for the 2-channel track.
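
The results above are reported as word error rates. As a reading aid, the following minimal sketch shows how that metric is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and are not taken from the paper or from the challenge's scoring tools, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.2%}")
```

Because insertions count as errors, this ratio can exceed 100% when the hypothesis is much longer than the reference.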

Citation Key: 3536