Homophone disambiguation for conversational Austrian German
- Master Thesis
- Announcement date
- 14 Jul 2020
- Research Areas
One serious problem for Automatic Speech Recognition (ASR) is the disambiguation of homophones, i.e., words of different meaning but similar pronunciation. Homophone disambiguation is usually done via the word's context in the sentence: the word that best matches the meaning of the whole sentence is chosen. While this method achieves good results on read speech, it fails on conversational, spontaneous speech, where sentences are often short, syntactically incomplete, or even grammatically incorrect. It may seem that the problem is not solvable for spontaneous speech. In recent studies, however, we found that words which are homophonic in read speech are often pronounced differently or reduced in spontaneous speech. The word DAS, for instance, is likely to be pronounced as ’s Haus when used as an article in a sentence like “I hab mir ’s Haus schon angschaut”, but as das Haus when used as a demonstrative in a sentence like “Das Haus hab ich mir angschaut”.
The aim of this thesis is to build an automatic classification tool that disambiguates homophones on the basis of acoustic features extracted from the speech signal. Furthermore, the classifier will be incorporated into our KALDI speech recognizer in order to test whether it yields improvements for speech recognition.
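To make the idea of acoustic disambiguation concrete, the following is a minimal sketch of a nearest-centroid classifier for the two readings of DAS. It is not the method to be developed in the thesis. The two features (token duration in milliseconds and a vowel-energy ratio), the function names `fit` and `predict`, and all numeric values are hypothetical placeholders chosen for illustration; they are not measurements from any corpus.

```python
from math import dist
from statistics import mean, pstdev

# Invented toy data: (duration in ms, vowel-energy ratio) -> reading of DAS.
# Assumption for illustration only: reduced article tokens are shorter and
# have a weaker vowel than full demonstrative tokens.
TRAIN = [
    ((60.0, 0.10), "article"),
    ((75.0, 0.20), "article"),
    ((180.0, 0.90), "demonstrative"),
    ((210.0, 0.80), "demonstrative"),
]


def fit(samples):
    """Standardize each feature and compute one centroid per label."""
    columns = list(zip(*(vector for vector, _ in samples)))
    means = [mean(column) for column in columns]
    # Fall back to 1.0 so a constant feature does not cause division by zero.
    stds = [pstdev(column) or 1.0 for column in columns]

    def scale(vector):
        return tuple((x - m) / s for x, m, s in zip(vector, means, stds))

    by_label = {}
    for vector, label in samples:
        by_label.setdefault(label, []).append(scale(vector))
    centroids = {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }
    return scale, centroids


def predict(model, vector):
    """Return the label whose centroid is closest in standardized space."""
    scale, centroids = model
    scaled = scale(vector)
    return min(centroids, key=lambda label: dist(scaled, centroids[label]))


model = fit(TRAIN)
print(predict(model, (70.0, 0.15)))
```

Standardizing the features keeps the duration, which has a much larger numeric range, from dominating the distance. A real system would extract features from the speech signal and would likely use a stronger classifier; the sketch only shows the shape of the task.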
- Review of literature
- Incorporate the existing pronunciation lexicon into the pronunciation modeling component of an ASR system
- System tests on different speech styles
- Motivation and interest in the topic
- Recommended: Speech Communication 2, or
- Recommended: Spoken Language in Human and Human-Computer Dialogue,
- or taking one of the Speech Communication courses in parallel with writing the thesis.