Voice conversion for Dysphonic and Electrolaryngeal Speech

Status

In progress

Student

Mentors

Research Areas

Voice plays a fundamental role in human communication: beyond its functional purpose, it shapes personal identity and social interaction. Voice disorders, such as dysphonia or conditions resulting from laryngeal cancer, can severely impair the ability to communicate, often leading to social isolation and psychological burden. Following a laryngectomy, patients rely on electrolarynx (EL) devices, which generate unnatural, robotic-sounding speech that hinders effective interaction. This research explores the potential of voice conversion (VC) models to improve speech quality for individuals with pathological voices, bridging the gap between assistive technology and natural communication.

While state-of-the-art VC models exist, few are optimized for medical applications, particularly for real-time streaming scenarios. A key focus of this work is developing low-latency, high-quality VC models tailored to pathological speech, including EL voice conversion. By improving the efficiency and adaptability of VC systems, this research aims to enable real-world applications that enhance communication for individuals with voice disorders.