Signal Processing and Speech Communication Laboratory

Integration of Prosodic Features to Automatic Speech Recognition Systems

Status
In progress
Type
Master Project
Announcement date
12 Oct 2022
Student
Pablo Melendez
Mentors
Research Areas

Abstract:

Classical automatic speech recognition (ASR) systems rely on well-established feature sets that represent phonetic units satisfactorily. Studies on prosody, however, show that long-term acoustic features carry important meaning. Building on existing Kaldi ASR recipes for conversational Austrian German, this project will compare feature extraction methods by adding prosody-related acoustic features to the given baseline systems. Teams are welcome!

Your Tasks:

  • get to know the Kaldi speech recognition toolkit
  • reproduce baseline speech recognition results for conversational Austrian German (GRASS)
  • provide new speech recognition results by adding different sets of acoustic features to the ASR pipeline
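
To illustrate the last task, here is a minimal sketch of what "adding acoustic features" can mean at the frame level: two simple prosody-related values (log energy and a crude autocorrelation-based F0 estimate) are appended to each baseline feature vector. This is not the project's method and not Kaldi code. All function names, the 13-dimensional placeholder baseline, and the frame settings are assumptions chosen for illustration; an actual experiment would rely on Kaldi's own feature extraction.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (incomplete tail is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def log_energy(frame, floor=1e-10):
    """Natural log of the frame energy, floored to avoid log(0)."""
    return math.log(max(sum(x * x for x in frame), floor))


def autocorr_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Crude F0 estimate: lag of the autocorrelation peak within [f0_min, f0_max].

    Returns 0.0 if no positive peak is found (treated here as unvoiced).
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0


def append_prosody(baseline, frames, sample_rate):
    """Append [log energy, F0] to every baseline feature vector, frame by frame."""
    if len(baseline) != len(frames):
        raise ValueError("baseline and frames must have the same number of frames")
    return [list(b) + [log_energy(f), autocorr_f0(f, sample_rate)]
            for b, f in zip(baseline, frames)]


# Toy input: a 100 Hz sine sampled at 8 kHz stands in for voiced speech.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(1600)]
frames = frame_signal(signal, frame_len=400, hop=80)

# Placeholder baseline: 13 zeros per frame (hypothetical stand-in for MFCCs).
baseline = [[0.0] * 13 for _ in frames]
augmented = append_prosody(baseline, frames, SAMPLE_RATE)
```

Each row of `augmented` holds the 13 placeholder values followed by the two prosodic values. Comparing recognition results with and without such extra dimensions is the kind of experiment the task describes.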

Your Profile:

  • good knowledge of programming (Shell/C++/Python/…) and UNIX tools
  • good knowledge of data science and machine learning tooling

Contact:

Julian Linke (linke@tugraz.at), Barbara Schuppler (b.schuppler@tugraz.at)