Automatic Speech Recognition for Air Traffic Controllers

Project Type: Master/Diploma Thesis
Student: Ackerl Mario


 The goal of this thesis is to investigate whether automatic speech recognition can be used reliably in air traffic control to support air traffic controllers in documenting their work. Despite the simple structure of the air traffic control language, earlier studies could not recommend this technology to the authorities. In this work, current solutions to the known problems, such as speaker adaptation, statistical language models, and discriminative training, are evaluated on a database of realistic air traffic control recordings. A framework for training and evaluating a triphone-based speech recognizer was implemented on top of available toolkits. This framework can be used with different speech databases and provides control over all stages of statistical modelling. It supports flat-start and bootstrapping training strategies, statistical uni-, bi-, and trigram language models, speaker adaptation, and discriminative acoustic training. The best results were achieved with a combination of bootstrapping, speaker adaptation, and a bigram language model. Averaged over 10 speakers, a word recognition rate of 95% and a sentence recognition rate of 70% were measured. This result is a significant improvement over earlier studies performed on the same database and a substantial step towards applicability in air traffic control.
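
The abstract does not define the two reported metrics, so the following is only a minimal sketch of one common reading. It assumes that the word recognition rate is the fraction of reference words recognized correctly, (N - S - D) / N, with substitutions S and deletions D taken from a minimum-edit alignment, and that the sentence recognition rate is the fraction of utterances whose hypothesis matches the reference exactly. The function names and the example call signs are hypothetical and not taken from the thesis or its database.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of a minimum-edit
    alignment between two word lists (standard Levenshtein alignment)."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] = (total_edits, subs, dels, ins) for ref[:i] vs hyp[:j]
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)   # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)   # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            diag, up, left = table[i - 1][j - 1], table[i - 1][j], table[i][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                match = diag         # correct word, no extra cost
            else:
                match = (diag[0] + 1, diag[1] + 1, diag[2], diag[3])
            deletion = (up[0] + 1, up[1], up[2] + 1, up[3])
            insertion = (left[0] + 1, left[1], left[2], left[3] + 1)
            # Tuples compare by total edit count first.
            table[i][j] = min(match, deletion, insertion)
    return table[-1][-1][1:]


def recognition_rates(pairs):
    """Return (word_rate, sentence_rate) for (reference, hypothesis) strings.

    Assumed definitions (not confirmed by the abstract):
    word_rate = (N - S - D) / N, sentence_rate = exact matches / utterances.
    """
    total_words = correct_words = correct_sentences = 0
    for ref_text, hyp_text in pairs:
        ref, hyp = ref_text.split(), hyp_text.split()
        subs, dels, _ins = align_counts(ref, hyp)
        total_words += len(ref)
        correct_words += len(ref) - subs - dels
        correct_sentences += int(ref == hyp)
    return correct_words / total_words, correct_sentences / len(pairs)


# Invented utterances, for illustration only.
example_pairs = [
    ("lufthansa one two three climb flight level one zero zero",
     "lufthansa one two three climb flight level one zero zero"),
    ("austrian four five descend flight level eight zero",
     "austrian four nine descend flight level eight zero"),
]
word_rate, sentence_rate = recognition_rates(example_pairs)
```

This sketch pools all utterances into one score. To mirror a figure averaged over speakers, `recognition_rates` could be called once per speaker and the resulting rates averaged. Under this definition, insertions do not lower the word recognition rate; a stricter accuracy measure would subtract them as well.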