Automatic Speech Recognition
- Education level: Master
- Term: Summer
- Lecturers
General Information
This lecture course addresses automatic speech recognition, i.e., the mapping of an acoustic signal to a sequence of words. It first introduces various concepts of pattern recognition, including hidden Markov models and language models built from large corpora of (symbolic) text. The course Speech Communication I is not a prerequisite for this course.
The current lecture material can be found in the TeachCenter.
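As a rough orientation, the mapping from an acoustic signal to a word sequence described above is commonly written as a Bayes decision rule. The following is a minimal sketch of that standard textbook formulation, not necessarily the notation used in the lecture. The symbols are assumptions introduced here: X denotes the sequence of acoustic feature vectors, W a candidate word sequence, p(X | W) the acoustic model (for example, built from hidden Markov models), and P(W) the language model.

```latex
\hat{W} = \arg\max_{W} P(W \mid X)
        = \arg\max_{W} \frac{p(X \mid W)\, P(W)}{p(X)}
        = \arg\max_{W} p(X \mid W)\, P(W)
```

The denominator p(X) can be dropped because it does not depend on W. Searching for the maximizing word sequence is the task of the decoder.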
Contents
- Introduction to automatic speech recognition (ASR)
- Speech Production & Phonetics
- Feature extraction
- Classification
- Estimation of probability distributions
- Gaussian Mixture Models
- Markov models
- Hidden Markov models (HMMs)
- Grammar models - Language Models
- Decoding (Viterbi decoder)
- Deep Neural Networks for Speech Recognition
Lecture notes (old slides)
- Course overview/Introduction to ASR
- Classification
- Hidden Markov Models
- Duration modeling in HMMs
- Acoustic/phonetic Elements
- Grammar Models
- Word Sequence Decoding
- Sinusoidal Modeling/Harmonic-plus-Noise Model
References
Speech recognition
- D. Jurafsky and J. H. Martin: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Prentice Hall, 2009.
- X. Huang, A. Acero, H.-W. Hon: Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Prentice Hall PTR, 2001.
Classification
- R.O. Duda and P.E. Hart: Pattern Classification and Scene Analysis. Wiley and Sons, Inc., 1973.
- C. M. Bishop: Pattern Recognition and Machine Learning. Springer, 2006.