Automatic disambiguation of homophones in spontaneous speech

Title: Automatic disambiguation of homophones in spontaneous speech
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Schuppler, B., & Schrank, T.
Conference Name: Accepted for presentation at SLSP Conference
Abstract

Homophones pose serious problems for automatic speech recognition (ASR). To deliver high-quality ASR output, homophones need to be disambiguated, which is usually done by analysing the context of the homophonic word. However, most homophones are not strictly homophonic but differ in phonetic detail, especially when produced in spontaneous conversations. Whereas humans use the phonetic detail present in the audio signal to disambiguate homophones, ASR systems usually ignore it. In this paper, we show that phonetic detail can be used to automatically disambiguate homophones. For our experiments, we use 3146 homophonic tokens from a corpus of spontaneous German. We collect a set of acoustic features and train a random forest model. Our results show that homophones can be disambiguated reasonably well using acoustic features (71% F1, 92% accuracy). In particular, this model outperforms a model based on lexical context (48% F1, 89% accuracy). A module that uses phonetic detail in a similar way to our model is suitable for integration into ASR systems to improve word recognition.
Index Terms: homophone disambiguation, automatic speech recognition, phonetic detail, spontaneous speech, random forests
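
The gap between F1 and accuracy reported in the abstract is typical when one class is much rarer than the other. The following is a minimal sketch of how the two metrics are computed from confusion-matrix counts, assuming a binary setup with F1 measured on the positive (minority) class; the abstract does not state how F1 was averaged. The function name `accuracy_and_f1` and all counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for the positive class from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts (NOT from the paper): 1000 tokens, 100 in the rarer class.
acc, f1 = accuracy_and_f1(tp=70, fp=30, fn=30, tn=870)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")  # accuracy=0.94, F1=0.70
```

With these made-up counts, the many correctly classified majority-class tokens raise accuracy, while F1 reflects only how well the rarer class is recovered. This is one reason two models with similar accuracy can differ widely in F1.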

Citation Key: 3464