Homophone disambiguation for conversational Austrian German

Status

Open

Type

Master Thesis

Announcement date

14 Jul 2020

Mentors

Barbara Schuppler

Research Areas

Speech Communication

Topic Description

One serious problem for Automatic Speech Recognition (ASR) is the disambiguation of homophones, i.e., words of different meaning but the same or very similar pronunciation. Homophone disambiguation is usually done via the word's context in the sentence: the word which best matches the meaning of the whole sentence is chosen. While this method achieves good results on read speech, it fails on conversational, spontaneous speech, where sentences are often short, syntactically incomplete, or even grammatically incorrect. It may seem that the problem cannot be solved for spontaneous speech. In recent studies, however, we found that words which are homophonic in read speech are often pronounced differently or reduced in spontaneous speech. The word DAS, for instance, is likely to be pronounced as ’s Haus when used as an article in a sentence like “I hab mir ’s Haus schon angschaut”, but as das Haus when used as a demonstrative in a sentence like “Das Haus hab ich mir angschaut”.

The aim of this thesis is to build an automatic classification tool which is able to disambiguate homophones on the basis of acoustic features extracted from the speech signal. Furthermore, the classifier shall be incorporated into our KALDI speech recognizer in order to test whether it yields improvements in speech recognition.
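
As a rough illustration of what classification on acoustic features could look like, the following Python sketch trains a nearest-centroid classifier that assigns a token of DAS to one of two readings. It is only a sketch under stated assumptions, not the method used in our studies and not part of the KALDI recognizer: the two features (token duration and relative energy), the labels "article" and "demonstrative", the function names and all numeric values are made up for illustration. Which features and which classifier actually work is exactly what the thesis should find out.

```python
from math import dist
from statistics import fmean


def train_centroids(samples):
    """Compute one mean feature vector (centroid) per label.

    samples: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its centroid tuple.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Made-up toy values, NOT measured data:
# (duration in ms, relative energy between 0 and 1) for tokens of DAS.
TOY_TOKENS = [
    ((60.0, 0.2), "article"),         # reduced form, as in "'s Haus"
    ((70.0, 0.3), "article"),
    ((180.0, 0.8), "demonstrative"),  # full form, as in "das Haus"
    ((200.0, 0.7), "demonstrative"),
]

centroids = train_centroids(TOY_TOKENS)
print(classify(centroids, (75.0, 0.25)))   # a short, weak token
print(classify(centroids, (170.0, 0.75)))  # a long, strong token
```

In this toy setup the duration dominates the distance because the two features are on very different scales; a real system would normalize the features and would most likely use a richer feature set and a stronger classifier.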

Your Tasks

Review of literature
Incorporate an existing pronunciation lexicon into the pronunciation modeling component of an ASR system
System tests on different speech styles
Documentation

Your Requirements

Motivation and interest in the topic
Recommended: Speech Communication 2, or
Recommended: Spoken language in human and human-computer dialogue,
or taking one of the Speech Communication courses in parallel with writing the thesis.