Measurement of the Impact of Pop Sound Distortions on Automatic Speech Recognition
- Status
- Open
- Type
- Master Project
- Announcement date
- 28 Sep 2016
- Mentors
- Research Areas
Description
Pop sounds are a common problem in audio signal acquisition. They occur when the diaphragm of the microphone is deflected too far, which introduces significant distortion into the signal. Several approaches to reducing this effect are available, but they only work at a certain distance from the diaphragm.
We are interested in pop reduction and suppression for microphones in very small enclosures, such as mobile phones or dictation devices, where there is no room for conventional pop protection. Devices of this kind are often used for automatic speech recognition, and pop sound distortions significantly increase the word error rate.
While a standard exists for measuring the pop sensitivity of microphones, there is no measurement procedure for the impact of pop distortions on speech recognition.
You will develop a measurement procedure that evaluates the impact of pop sounds on automatic speech recognition. The procedure should make it possible to test how effective countermeasures are at different stages of the speech recognition signal chain.
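The word error rate mentioned above is the natural figure of merit for such a procedure. As a minimal sketch, it can be computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; in practice the scoring scripts that ship with the ASR toolkit would be used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch only: tokens are split on whitespace and compared
    case-sensitively, which is an assumption, not a scoring standard.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Comparing this value for the same utterances with and without pop distortion gives one number per test condition, which is what a measurement procedure would ultimately report.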
Your tasks
- Review of literature
- Development of a measurement procedure using variable emphasis of plosive sounds
- Running a standard KALDI ASR recipe for TIMIT
- Documentation
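One possible reading of "variable emphasis of plosive sounds" can be sketched as follows: TIMIT provides phone-level alignments (lines of start sample, end sample, and phone label), so the plosive bursts can be located and scaled by an adjustable gain before the audio is passed to the recognizer. This is only an assumption about how the emphasis could be implemented. The helper names `plosive_segments` and `emphasize`, the chosen label set, and the use of a plain amplitude gain are hypothetical, and a gain does not model the diaphragm overload that causes a real pop.

```python
# TIMIT stop-burst labels; treating exactly these as "plosives" is an assumption.
PLOSIVES = {"b", "d", "g", "p", "t", "k"}


def plosive_segments(phn_lines):
    """Return (start, end) sample ranges of plosive bursts.

    Each input line is expected to look like "start end label",
    as in a TIMIT phone alignment file. Malformed lines are skipped.
    """
    segments = []
    for line in phn_lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        if label in PLOSIVES:
            segments.append((start, end))
    return segments


def emphasize(samples, segments, gain_db):
    """Return a copy of samples with the given segments scaled by gain_db.

    The end index of each segment is treated as exclusive, and segments
    are clipped to the length of the signal.
    """
    gain = 10.0 ** (gain_db / 20.0)
    out = list(samples)
    for start, end in segments:
        for n in range(max(start, 0), min(end, len(out))):
            out[n] *= gain
    return out
```

Sweeping `gain_db` over a range of values and decoding each variant with the same recognizer would then yield word error rate as a function of plosive emphasis.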
Your profile/prerequisites
- Audio engineering
- Speech processing
- Signal Processing
- Should be finished by the end of WS 2016/2017