Voice Conversion for Dysphonic and Electrolaryngeal Speech
- Status: In work
- Student: Benedikt Mayrhofer
- Mentors
- Research Areas
Voice plays a fundamental role in human communication: beyond its functional purpose, it shapes personal identity and social interaction. Voice disorders, such as dysphonia or conditions resulting from laryngeal cancer, can severely impair the ability to communicate and often lead to social isolation and psychological burden. After a laryngectomy, many patients rely on an electrolarynx (EL), a device that produces unnatural, robotic-sounding speech and hinders effective interaction. This research explores the potential of voice conversion (VC) models to improve speech quality for individuals with pathological voices, bridging the gap between assistive technology and natural communication. While state-of-the-art VC models exist, few are optimized for medical applications, particularly in real-time streaming scenarios. A key focus of this work is therefore the development of low-latency, high-quality VC models tailored to pathological speech, including EL voice conversion. By improving the efficiency and adaptability of VC systems, this research aims to advance speech synthesis and enable real-world applications that improve communication for individuals with voice disorders.