Speaker Diarization and Recognition for RadioPlays with SpeechBrain

Status

Open

Type

Master Thesis

Announcement date

13 Aug 2021

Mentors

Martin Hagmüller

Research Areas

Speech Communication

Short description

Radio plays have long been analyzed in literary studies based solely on the author's written script. This has changed recently, with attention shifting to the performed interpretation of the play. Speech technology can provide tools that let literary scholars automatically transcribe the audio recording and carry out further analysis to gain insight into the characters of the play. The aim of this thesis is to perform speaker diarization (i.e. determining who spoke when) as a basis for later recognition of the spoken text.
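To make "who spoke when" concrete, the following minimal sketch shows one possible way to represent a diarization result as time-stamped segments and to derive simple statistics from it. The segment values, the speaker labels and the helper names `speaking_time` and `speaker_at` are illustrative assumptions. They are not part of SpeechBrain and not taken from an actual radio play.

```python
from collections import defaultdict

# Hypothetical diarization output: (start in seconds, end in seconds, speaker label).
# The values are made up for illustration only.
segments = [
    (0.0, 4.5, "speaker_1"),
    (4.5, 7.0, "speaker_2"),
    (7.0, 12.0, "speaker_1"),
]


def speaking_time(segments):
    """Sum the duration of all segments per speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in segments:
        if end < start:
            raise ValueError(f"segment ends before it starts: {(start, end)}")
        totals[speaker] += end - start
    return dict(totals)


def speaker_at(segments, t):
    """Return the labels of all speakers active at time t (in seconds)."""
    return [speaker for start, end, speaker in segments if start <= t < end]


print(speaking_time(segments))
print(speaker_at(segments, 5.0))
```

A representation like this could later be joined with the recognized text, so that each transcribed passage is attributed to a character of the play.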

SpeechBrain is an open-source speech processing toolkit that was released in 2021. It is written in Python and uses PyTorch as its machine learning backend.

Your Tasks

Implementation of speaker diarization and recognition for radio plays with SpeechBrain (Python).
Detailed evaluation of the algorithm(s).

Your Profile/Prerequisites

Motivation and interest in the topic.
Solid background in speech processing; ideally you have completed some of our speech communication courses.
Experience with Python.

Contact:

Martin Hagmüller (hagmueller@tugraz.at or 0316-873 4377)