Real-Time Enhancement of E-Larynx Speech Signals

Status

Finished

Type

Master Thesis

Announcement date

01 Jan 2009

Student

Noisternig Thomas

Mentors

Research Areas

Speech Communication

People who have lost their larynx (e.g. due to cancer) can no longer speak naturally, because the vocal folds are no longer available to generate the necessary sound source. One solution to this problem is the so-called electrolarynx, a mechanical device that substitutes for the missing body part. Unfortunately, electrolarynx speech suffers from high background noise caused by the sound of the device itself, and from low intelligibility because the device produces a constant, monotone frequency. This work aims to reduce these drawbacks by enhancing the quality of the electrolarynx speech signal in real time using a Texas Instruments TMS320C6713B floating-point digital signal processor (DSP). The implemented enhancement takes two approaches. First, the directly-radiated electrolarynx noise (DREL) component is reduced. To this end, the DSP uses spectrum-based modulation filtering methods to detect the signal’s DREL component and remove it from the signal. Signal and noise are separated by detecting constant spectral components, since the electrolarynx generates its vibrations at a constant frequency. The second approach adjusts this constant frequency, which otherwise results in an artificial-sounding voice that lacks prosodic information. The enhancement algorithm detects voice variations and applies them to the original, monotone speech signal: the speech’s formant contour, as detected by the implemented linear predictive coding (LPC) based formant tracker, is used to control the electrolarynx so that it generates an analogous pitch contour, making the speech sound more natural and understandable. To meet the resulting real-time requirements, the implementation was accompanied by various speed optimisation procedures. These primarily consisted of finding optimal trade-offs between accuracy and complexity, hard-coding constant coefficients, pre-calculating parameters during initialisation, re-arranging data structures and reducing redundancies.
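
The idea behind the first approach, separating speech from DREL noise by looking for spectral components that stay constant over time, can be illustrated with a minimal sketch. It is not the thesis implementation (which runs on the DSP and whose exact modulation filter is not described here). It assumes the signal has already been converted to a sequence of magnitude spectra, and it uses the simplest possible modulation filter: subtracting each frequency bin's temporal mean, which removes the zero-modulation-frequency (constant) part. The function name and the `floor` parameter are hypothetical.

```python
def suppress_constant_components(frames, floor=0.0):
    """Remove the time-constant part of each frequency bin.

    frames: list of magnitude spectra, one list of floats per time frame,
            all of equal length (assumed input format).
    floor:  lower bound applied after subtraction (hypothetical parameter),
            so that magnitudes never become negative.
    """
    n_frames = len(frames)
    n_bins = len(frames[0])

    # Temporal mean of every bin: the constant (DC) modulation component,
    # which is where a steady electrolarynx tone would concentrate.
    means = [sum(frame[k] for frame in frames) / n_frames
             for k in range(n_bins)]

    # Subtract the constant part and clamp at the floor.
    return [[max(frame[k] - means[k], floor) for k in range(n_bins)]
            for frame in frames]


# Bin 0 is perfectly constant (noise-like), bin 1 varies over time (speech-like).
spectra = [[1.0, 5.0],
           [1.0, 7.0],
           [1.0, 3.0]]
cleaned = suppress_constant_components(spectra)
```

In this toy input the constant bin is removed entirely, while the varying bin keeps only its excursions above its own mean. A practical system would more likely use a running estimate over a limited time window rather than a mean over the whole signal, so that it can operate frame by frame in real time.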