Streaming voice conversion for the improvement of pathological speech
- Status: Open
- Type: Master Thesis
- Announcement date: 31 Oct 2024
Abstract
Voice conversion refers to the processing of speech audio such that the speaker identity is modified while the linguistic content remains the same. Recent advances in deep neural networks have revolutionized this field of research, enabling the synthesis of speech that listeners can hardly distinguish from authentic speech. What remains a challenge is the conversion of pathological speech into healthy speech in real time. The aim of this thesis is to provide high-quality substitution speech for individuals with speech impairments.
Your Tasks
- convert impaired speech to substitution speech using state-of-the-art streaming voice conversion models
- design and conduct a listening experiment investigating speech intelligibility, as well as speaker and listener preferences
- optional: create a smartphone app prototype for voice conversion during a telephone call
- document the work (thesis writing; optional: paper writing)
Your Profile
- interest in speech science and technology
- interest in health applications
- good knowledge of relevant Python frameworks for speech synthesis, voice conversion, and/or representation learning
- good communication skills
- optional: experience in mobile app development
Additional information
The thesis is conducted in cooperation with MedUni Vienna, so the work, or parts of it, can also be done in Vienna.
Contact
- Philipp Aichinger (philipp.aichinger@meduniwien.ac.at)
- Martin Hagmüller (hagmueller@tugraz.at)