Speech/Non-Speech Detection for Electro-Larynx Speech Using EMG
- Sun, Feb 01, 2015
In this work, we use electromyographic (EMG) signals to investigate speech/non-speech detection for electro-larynx (EL) speech. The muscle activity represented by the EMG signal correlates with the intention to produce speech sounds, so the short-term energy of the signal can serve as a feature for the speech/non-speech decision.
We developed data acquisition hardware to record EMG signals with surface electrodes. We then recorded a small database of parallel EMG and EL speech recordings and used different approaches to classify the EMG signal into speech and non-speech sections. We compared three envelope calculation methods (root mean square, Hilbert envelope, and low-pass filtered envelope) and three classification methods (single threshold, double threshold, and Gaussian mixture model based classification).
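To make the three envelope calculations concrete, here is a minimal sketch in Python with NumPy. It is not the code used in the study: the function names, the window length, the cutoff frequency, and the synthetic test signal are all assumptions chosen for illustration, and the low-pass variant assumes full-wave rectification followed by a simple one-pole filter.

```python
import numpy as np

def rms_envelope(x, win):
    """Root mean square over a sliding window of `win` samples."""
    kernel = np.ones(win) / win
    return np.sqrt(np.convolve(x ** 2, kernel, mode="same"))

def hilbert_envelope(x):
    """Magnitude of the analytic signal, built with the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC bin
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin
        gain[1:n // 2] = 2.0           # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))

def lowpass_envelope(x, fs, cutoff_hz):
    """Full-wave rectification followed by a one-pole low-pass filter."""
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / fs)
    env = np.empty(len(x))
    state = 0.0
    for i, sample in enumerate(np.abs(x)):
        state += alpha * (sample - state)
        env[i] = state
    return env

# Synthetic stand-in for an EMG recording (assumption, not real data):
# a 100 Hz carrier that is weak at rest and strong in samples 400..599.
fs = 1000
n = np.arange(1000)
amplitude = np.full(1000, 0.1)
amplitude[400:600] = 1.0
emg = amplitude * np.sin(2 * np.pi * 100 * n / fs)

rms = rms_envelope(emg, win=50)
hil = hilbert_envelope(emg)
lp = lowpass_envelope(emg, fs, cutoff_hz=10.0)
```

All three envelopes rise during the burst and fall back at rest, but they differ in scale and smoothness, which is why the choice of envelope interacts with the threshold settings in the classification step.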
In the figure you can see activity detection (AD) using the recorded EMG signals during speech production. The upper plot shows single threshold detection and the lower plot double threshold detection.
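The difference between the two threshold schemes can be sketched in a few lines of Python. This post does not spell out the exact double threshold rule, so the sketch assumes a hysteresis scheme: activity starts when the envelope rises above a high threshold and ends only when it falls below a lower one. Other definitions exist (for example, requiring a minimum number of above-threshold samples), and the function names and threshold values here are hypothetical.

```python
def single_threshold(envelope, threshold):
    """Mark a sample as active whenever the envelope exceeds one threshold."""
    return [value > threshold for value in envelope]

def double_threshold(envelope, high, low):
    """Hysteresis (assumed rule): switch on above `high`, off only below `low`."""
    active = False
    decisions = []
    for value in envelope:
        if not active and value > high:
            active = True
        elif active and value < low:
            active = False
        decisions.append(active)
    return decisions

# Made-up envelope values that hover around the threshold during activity.
envelope = [0.1, 0.2, 0.7, 0.45, 0.6, 0.4, 0.55, 0.2, 0.1]

single = single_threshold(envelope, threshold=0.5)
double = double_threshold(envelope, high=0.5, low=0.3)
```

On this toy input the single threshold flips between active and inactive three times, while the hysteresis version reports one continuous active segment from the third to the seventh sample.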
This study suggests that the results are speaker dependent: they depend strongly on the signal-to-noise ratio of the EMG signal.
We show that a low-pass filtered envelope combined with double threshold detection outperforms the other combinations we compared.
More information can be found in our paper!