Speech Enhancement for Disordered and Substitution Voices

Status

Finished

Date

2009-09-21

Student

Martin Hagmüller

Mentor

Gernot Kubin

Research Areas

Speech Communication

This thesis presents methods to enhance the speech of patients with voice disorders or with substitution voices. The first method enhances the speech of patients with laryngeal neoplasm. It lowers the pitch, strengthens the harmonics of voiced segments, and decreases the perceived speaking effort.

The need for reliable pitch mark determination on disordered and substitution voices led to the implementation of a state-space based algorithm. Its performance is comparable to that of a state-of-the-art pitch detection algorithm, but it does not require post-processing.

A subsequent part of the thesis deals with alaryngeal speech, with a focus on Electro-Larynx (EL) speech. We first investigate an EL speech production model that takes into account the common source of the speech signal and the directly radiated EL (DREL) sound. We then propose a method to suppress the direct sound that exploits the different temporal properties of the two propagation paths. Time-invariant signal components, which can be attributed to the DREL sound, are filtered out in the modulation frequency domain.
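The idea of removing time-invariant components in the modulation frequency domain can be illustrated with a minimal sketch. It is not the filter used in the thesis, which this abstract does not specify. The sketch assumes that the time-invariant part of each frequency band is the temporal mean of its short-time magnitude trajectory (the zero modulation frequency), and it subtracts that mean with a floor at zero. The function names, frame length, and hop size are hypothetical choices for illustration, and resynthesis by overlap-add is omitted.

```python
import numpy as np


def stft_frames(x, frame_len=256, hop=64):
    """Short-time spectrum of x, shape (n_frames, n_bins). Hypothetical helper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def suppress_time_invariant(spec):
    """Remove the zero-modulation-frequency part of each band (assumed approach).

    The magnitude trajectory of every frequency bin is treated as a signal
    over time. Its temporal mean is subtracted and negative values are
    clipped to zero; the original phase is kept.
    """
    mag = np.abs(spec)
    phase = np.angle(spec)
    stationary = mag.mean(axis=0, keepdims=True)  # one mean per frequency bin
    mag_filtered = np.maximum(mag - stationary, 0.0)
    return mag_filtered * np.exp(1j * phase)
```

In this sketch a tone with constant amplitude, standing in for the DREL sound, is almost completely removed, while a band whose amplitude varies over time keeps its fluctuating part. Simple mean subtraction also distorts the varying band, so a practical system would likely use a more careful modulation-domain filter.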

A further issue with EL speech production is also addressed, namely the flat F0 contour. Based on the observation that prosodic information is conveyed in whispered speech, we assume that formants can be used as substitute intonation cues. We therefore derive an artificial F0 contour from the speech formants and impose it on the EL speech signal. The artificial intonation contour was preferred in a subjective listening test.
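As a rough illustration of deriving an F0 contour from formants, the sketch below maps a single formant track linearly onto an F0 range. The abstract does not state which formants or which mapping the thesis uses, so the linear mapping, the function name `formant_to_f0`, and the parameters `f0_center` and `f0_span` are all assumptions made for this example.

```python
import numpy as np


def formant_to_f0(formant_hz, f0_center=100.0, f0_span=40.0):
    """Map a formant track (Hz) to an artificial F0 contour (Hz).

    Hypothetical linear mapping: the formant track is centered on its mean
    and scaled by its range, so the output contour covers f0_span Hz in
    total around f0_center. A flat formant track yields a flat contour.
    """
    formant_hz = np.asarray(formant_hz, dtype=float)
    spread = formant_hz.max() - formant_hz.min()
    if spread == 0:
        return np.full_like(formant_hz, f0_center)
    normalized = (formant_hz - formant_hz.mean()) / spread
    return f0_center + f0_span * normalized
```

For example, `formant_to_f0([500, 600, 700])` gives a contour that rises from 80 Hz through 100 Hz to 120 Hz. A contour of this kind could then be imposed on the EL speech signal with a pitch modification method, which is outside the scope of this sketch.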