Sample Based Glottal Excitation Signal Database
- Master Thesis
- Announcement date: 06 Mar 2013
A widely used model for natural speech production is the source-filter model. In this model, the larynx with its vocal folds acts as the source of excitation, and the vocal tract acts as a linear acoustic filter. There are several approaches to modeling the glottal excitation signal, including simple impulse trains, physical models (e.g., the one-mass model), and waveform models (e.g., the Liljencrants-Fant model).
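To make the source-filter idea concrete, the following is a minimal Python sketch that uses the simplest excitation named above, an impulse train, and passes it through a cascade of two-pole resonators standing in for the vocal tract. The sampling rate, pitch, and formant values are illustrative assumptions, not values taken from this announcement, and the function names are hypothetical.

```python
import math

def impulse_train(f0_hz, fs_hz, n_samples):
    """Simplest excitation model: one unit impulse per pitch period."""
    period = round(fs_hz / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator_coeffs(freq_hz, bandwidth_hz, fs_hz):
    """Denominator coefficients [1, a1, a2] of a two-pole resonator (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle from center frequency
    return [1.0, -2.0 * r * math.cos(theta), r * r]

def all_pole_filter(a, x):
    """Apply y[n] = x[n] - sum_{k>=1} a[k] * y[n-k] (linear all-pole filter)."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

# Assumed example values: 8 kHz sampling, 100 Hz pitch, two formants.
fs = 8000
excitation = impulse_train(100, fs, 800)
speech = excitation
for formant_hz, bw_hz in [(700, 80), (1200, 90)]:
    speech = all_pole_filter(resonator_coeffs(formant_hz, bw_hz, fs), speech)
```

Swapping `excitation` for a more realistic pulse shape while keeping the filter fixed is the kind of comparison the source-filter view makes possible.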
In speech synthesis based on the source-filter theory, quality is known to suffer from an over-simplified excitation model. It can be shown that glottal source pulses computed from real speech by inverse filtering improve synthesis quality compared to the models mentioned above (Matsui et al., 1991).
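The principle of inverse filtering can be sketched in a few lines of Python. The example assumes the vocal tract is an all-pole filter 1/A(z) with known coefficients, so that applying A(z) to the speech signal returns the excitation. In practice the coefficients have to be estimated from the recorded speech (for example by linear prediction), and dedicated glottal inverse filtering methods are more elaborate. All values and function names below are illustrative assumptions.

```python
import math

def all_pole_filter(a, x):
    """Synthesis: y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def inverse_filter(a, s):
    """Analysis: e[n] = sum_{k>=0} a[k] * s[n-k], the FIR inverse of the all-pole filter."""
    e = []
    for n in range(len(s)):
        acc = 0.0
        for k in range(len(a)):
            if n - k >= 0:
                acc += a[k] * s[n - k]
        e.append(acc)
    return e

# Assumed single-formant "vocal tract": 700 Hz center, 80 Hz bandwidth, 8 kHz sampling.
fs = 8000
r = math.exp(-math.pi * 80 / fs)
theta = 2.0 * math.pi * 700 / fs
a = [1.0, -2.0 * r * math.cos(theta), r * r]

excitation = [1.0 if n % 80 == 0 else 0.0 for n in range(400)]  # 100 Hz impulse train
speech = all_pole_filter(a, excitation)
recovered = inverse_filter(a, speech)

# With exact coefficients the excitation is recovered up to rounding error.
max_error = max(abs(e - x) for e, x in zip(recovered, excitation))
```

With real speech, the recovered signal is not an impulse train but an estimate of the glottal source, and individual pulses cut from it are what a sample-based database would store.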
This suggests that electro-larynx speech, where the natural source of human speech is replaced by a hand-held, battery-driven device, could also profit from an excitation signal drawn from a database of real glottal source pulses.
- Review of literature
- Creating a glottal excitation signal database
- Comparison with other approaches
- Motivation and interest in the topic
- Background knowledge of Matlab and (speech) signal processing
K. Matsui, S. Person, K. Hata, and T. Kamai, “Improving naturalness in text-to-speech synthesis using natural glottal source,” in Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP-91), 1991, vol. 2, pp. 769-772.