Signal Processing and Speech Communication Laboratory

Automatic detection of laughter in conversational speech

In work
Master Thesis
Announcement date: 06 Dec 2018
Witold Łuszcz

Automatic speech recognition (ASR) systems were originally designed to cope with carefully pronounced speech. Most real-world applications of ASR systems, however, require the recognition of spontaneous, conversational speech (e.g., dialogue systems, voice input aids for people with physical disabilities, and medical dictation systems). Compared to prepared speech, conversational speech contains utterances that might be considered ‘ungrammatical’, as well as disfluencies such as “…oh, well, I think ahm exactly …”. Moreover, in spontaneous conversation, people laugh a lot, and they often speak and laugh at the same time. It is also very likely that the conversation partner laughs at the same time. Recognizing words spoken during laughter is therefore especially challenging for ASR systems.

The aim of this thesis is to build a tool that detects laughter in natural conversations. For this purpose, different sets of acoustic features shall be compared using a given machine learning technique (e.g., Random Forests or SVMs). The resulting tool must be documented well enough that it can be incorporated, in the course of the project, into the speech recognizer currently being developed at our department.
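
As a rough illustration of what "comparing feature sets using one fixed technique" could look like, the following sketch scores several feature subsets with the same classifier under k-fold cross-validation. Everything in it is an assumption made for illustration: the feature names (`energy`, `pitch`, `noise`), the synthetic frames, and the helper functions are hypothetical, and a simple nearest-centroid classifier stands in for the Random Forest or SVM mentioned above so that the example runs with the Python standard library alone.

```python
import random
import statistics


def make_toy_frames(n, seed=0):
    """Generate synthetic placeholder frames (NOT real speech data).

    Each frame is a dict of hypothetical acoustic features plus a label
    (1 = laughter, 0 = no laughter).
    """
    rng = random.Random(seed)
    frames = []
    for _ in range(n):
        label = rng.randint(0, 1)
        frames.append({
            "energy": rng.gauss(1.0 if label else 0.0, 0.5),  # informative by construction
            "pitch": rng.gauss(0.5 if label else 0.0, 1.0),   # weakly informative
            "noise": rng.gauss(0.0, 1.0),                     # uninformative
            "label": label,
        })
    return frames


def centroid(rows, names):
    """Mean feature vector of the given rows, restricted to the chosen features."""
    return [statistics.fmean(r[n] for r in rows) for n in names]


def predict(row, names, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((row[n] - v) ** 2 for n, v in zip(names, c))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def cross_val_accuracy(frames, names, k=5):
    """Mean accuracy over k folds for one feature set."""
    folds = [frames[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        centroids = {
            lab: centroid([r for r in train if r["label"] == lab], names)
            for lab in (0, 1)
        }
        correct = sum(predict(r, names, centroids) == r["label"] for r in test)
        scores.append(correct / len(test))
    return statistics.fmean(scores)


# Hypothetical feature sets to compare; the classifier stays fixed.
FEATURE_SETS = {
    "energy_only": ["energy"],
    "energy_pitch": ["energy", "pitch"],
    "noise_only": ["noise"],
}


def compare_feature_sets(frames, feature_sets, k=5):
    """Score every feature set with the same classifier and the same folds."""
    return {name: cross_val_accuracy(frames, cols, k)
            for name, cols in feature_sets.items()}


if __name__ == "__main__":
    frames = make_toy_frames(400)
    for name, acc in compare_feature_sets(frames, FEATURE_SETS).items():
        print(f"{name}: {acc:.3f}")
```

Keeping the classifier and the folds fixed means that differences in accuracy can be attributed to the feature sets. In an actual thesis setup, the toy frames would be replaced by features extracted from annotated conversational recordings, and the stand-in classifier by the chosen technique.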

Requirements: The candidate should have a background in automatic speech recognition (e.g., have completed Speech Communication 2), be interested in speech processing, and have excellent programming skills (e.g., Python, C++, and/or R). Teams are very welcome!