Keyword spotting using resource-efficient deep learning
- In progress
- Master Thesis
- Announcement date: 12 Dec 2019
- Research Areas
Automatic speech recognition (ASR) is becoming an increasingly important technology for user interaction with everyday consumer devices. However, ASR systems are typically complex and computationally intensive, and running ASR in always-on mode would result in continuously high energy consumption. This is especially problematic for mobile devices, whose batteries would be drained at unacceptable rates, and for devices on which running an ASR system is simply not possible due to resource restrictions. A common solution to this problem is to permanently run a low-cost keyword spotting (KWS) system that listens only for a limited set of prespecified keywords. Upon detection of such a keyword, a full ASR system is triggered, which then listens for a rich set of user commands.
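The two-stage idea described above can be illustrated with a minimal sketch. It is not the system to be developed in the thesis: the function names `run_always_on` and `energy_kws_score`, the parameters `threshold` and `command_frames`, and the energy-based score standing in for a trained KWS model are all assumptions made purely for illustration.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]  # one short block of audio samples


def energy_kws_score(frame: Frame) -> float:
    """Hypothetical stand-in for a small KWS model: mean squared amplitude."""
    return sum(x * x for x in frame) / max(len(frame), 1)


def run_always_on(
    frames: Iterable[Frame],
    kws_score: Callable[[Frame], float],
    full_asr: Callable[[List[Frame]], str],
    threshold: float = 0.5,
    command_frames: int = 3,
) -> List[str]:
    """Score every frame with the cheap detector; call the full ASR only after a detection."""
    commands: List[str] = []
    stream = iter(frames)
    for frame in stream:
        if kws_score(frame) < threshold:
            continue  # cheap path: no keyword, the expensive ASR stays off
        # Keyword detected: hand the next few frames to the full ASR system.
        segment = [f for _, f in zip(range(command_frames), stream)]
        commands.append(full_asr(segment))
    return commands
```

In this sketch the expensive `full_asr` callable is invoked once per detection instead of once per frame, which is where the energy saving of the always-on setup comes from. A real system would replace the energy score with a trained keyword classifier.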
The requirements for a KWS system follow directly: (i) the system should be resource-efficient to mitigate the above-mentioned energy problem, (ii) it should run in real time, and (iii) it should be accurate to maintain a good user experience.
In this thesis, different machine learning techniques for meeting all of these requirements are to be implemented and compared. The focus lies on deep neural network models, and especially on techniques to reduce the computational overhead of these models.
This is a funded thesis that will be done in collaboration with Infineon.
- Good programming skills (MATLAB or Python; preferably with experience in NumPy and TensorFlow/PyTorch or similar frameworks)
- Strong interest in machine learning, especially in deep learning and sequential data (having completed at least CI/EW or similar lectures)
Y. Zhang, N. Suda, L. Lai, and V. Chandra, “Hello Edge: Keyword Spotting on Microcontrollers”, arXiv:1711.07128 (2017)
W. Roth, G. Schindler, M. Zöhrer, L. Pfeifenberger, R. Peharz, S. Tschiatschek, H. Fröning, F. Pernkopf, and Z. Ghahramani, “Resource-Efficient Neural Networks for Embedded Systems”, arXiv:2001.03048 (2020)