Signal Processing and Speech Communication Laboratory

Speaker Diarization and Recognition for RadioPlays with SpeechBrain

Status
Open
Type
Master Thesis
Announcement date
13 Aug 2021

Short description

Radio plays have long been analyzed in literature studies based solely on the author's written script. This has changed recently towards analyzing the interpretation of the play, i.e. the actual performance. Speech technology can provide tools for literary scholars to automatically transcribe the audio recording and to carry out further analysis to gain insight into the characters of the play. The aim of this thesis is to perform speaker diarization (i.e. determining who spoke when) as a basis for later recognition of the spoken text.

SpeechBrain is an open-source speech processing toolkit that was released in 2021. It is written in Python and uses PyTorch as its machine learning backend.

Your Tasks

  • Implementation of speaker diarization and recognition for radio plays with SpeechBrain (Python).
  • Detailed evaluation of the algorithm(s).
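
To give an idea of what such an evaluation could look like, the following is a minimal sketch of a frame-based diarization error measure. It is an illustration only and not a prescribed method for the thesis. The names Segment, to_frames and frame_error_rate are hypothetical and not part of SpeechBrain. The sketch assumes that there is no overlapping speech, that the speaker labels of reference and hypothesis have already been mapped onto each other, and that no forgiveness collar is applied; a standard diarization error rate would additionally handle these cases.

```python
from typing import List, Tuple

# Hypothetical representation of "who spoke when":
# (start time in seconds, end time in seconds, speaker label)
Segment = Tuple[float, float, str]


def to_frames(segments: List[Segment], duration: float, step: float = 0.01) -> List[str]:
    """Sample a segment list on a fixed time grid; "" marks non-speech."""
    n_frames = int(round(duration / step))
    frames = [""] * n_frames
    for start, end, speaker in segments:
        first = max(0, int(round(start / step)))
        last = min(n_frames, int(round(end / step)))
        for i in range(first, last):
            frames[i] = speaker
    return frames


def frame_error_rate(reference: List[Segment], hypothesis: List[Segment],
                     duration: float, step: float = 0.01) -> float:
    """Fraction of missed, falsely detected and confused frames,
    relative to the number of speech frames in the reference."""
    ref = to_frames(reference, duration, step)
    hyp = to_frames(hypothesis, duration, step)
    speech_frames = sum(1 for r in ref if r)
    if speech_frames == 0:
        raise ValueError("reference contains no speech")
    # A frame counts as an error if either side contains speech
    # and the two labels differ (miss, false alarm or confusion).
    errors = sum(1 for r, h in zip(ref, hyp) if (r or h) and r != h)
    return errors / speech_frames


# Toy example: the hypothesis places the speaker change 0.5 s too late.
reference = [(0.0, 2.0, "A"), (2.0, 4.0, "B")]
hypothesis = [(0.0, 2.5, "A"), (2.5, 4.0, "B")]
print(frame_error_rate(reference, hypothesis, duration=4.0))  # 0.125
```

In the toy example, 0.5 s of the 4 s of reference speech are attributed to the wrong speaker, which gives an error of 12.5 %.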

Your Profile/Prerequisites

  • Motivation and interest in the topic.
  • Solid background in speech processing; ideally, you have completed some of our speech communication courses.
  • Experience with Python.

Contact:

Martin Hagmüller (hagmueller@tugraz.at or 0316-873 4377)