Authors:
Anna Katharina Fuchs; Clemens Amon and Martin Hagmüller
Affiliation:
Graz University of Technology, Austria
Keyword(s):
Speech/Non-Speech Detection (SND), Electro-Larynx (EL), Electromyography (EMG).
Related Ontology Subjects/Areas/Topics:
Applications and Services; Biomedical Engineering; Biomedical Signal Processing; Computer Vision, Visualization and Computer Graphics; Detection and Identification; Devices; Health Information Systems; Human-Computer Interaction; Medical Image Detection, Acquisition, Analysis and Processing; Physiological Computing Systems; Real-Time Systems; Wearable Sensors and Systems
Abstract:
Electro-larynx (EL) speech is a way to regain speech after the larynx has been surgically removed or damaged. Because currently available devices are normally hand-held, a new generation of EL devices would benefit from a hands-free version. In this work we use electromyographic (EMG) signals to investigate speech/non-speech detection for EL speech. The muscle activity represented by the EMG signal correlates with the intention to produce speech sounds, so its short-term energy can serve as a feature for making a speech/non-speech decision. We developed data acquisition hardware to record EMG signals using surface electrodes. We then recorded a small database with parallel recordings of EMG and EL speech and used different approaches to classify the EMG signal into speech and non-speech sections. We compared the following envelope calculation methods: root mean square, Hilbert envelope, and low-pass filtered envelope; and the following classification methods: single threshold, double threshold, and Gaussian mixture model based classification. This study suggests that the results are speaker dependent, i.e. they strongly depend on the signal-to-noise ratio of the EMG signal. We show that the low-pass filtered envelope combined with double threshold detection outperforms the other combinations.
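The best-performing combination named in the abstract, a low-pass filtered envelope followed by double threshold detection, could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the filter design or the exact definition of the double threshold, so the one-pole smoother, the hysteresis interpretation (switch on above a high threshold, off below a low one), the function names, and all parameter values here are hypothetical. The input is a synthetic stand-in for an EMG recording.

```python
def lowpass_envelope(emg, alpha=0.05):
    """Rectify the signal and smooth it with a one-pole low-pass filter.

    `alpha` is an assumed smoothing factor in (0, 1]; smaller values
    give a smoother but slower-reacting envelope.
    """
    envelope = []
    y = 0.0
    for x in emg:
        y += alpha * (abs(x) - y)  # exponential smoothing of |x|
        envelope.append(y)
    return envelope


def double_threshold(envelope, high, low):
    """Hysteresis detector (one possible reading of 'double threshold').

    The output switches to speech when the envelope rises above `high`
    and back to non-speech only when it falls below `low`.
    """
    active = False
    labels = []
    for value in envelope:
        if not active and value > high:
            active = True
        elif active and value < low:
            active = False
        labels.append(active)
    return labels


# Synthetic example: rest, a burst of muscle activity, then rest again.
rest = [0.01 * (-1) ** n for n in range(200)]
burst = [1.0 * (-1) ** n for n in range(200)]
emg = rest + burst + rest

env = lowpass_envelope(emg, alpha=0.05)
labels = double_threshold(env, high=0.5, low=0.2)

onset = labels.index(True)                # first sample marked as speech
offset = len(labels) - 1 - labels[::-1].index(True)  # last speech sample
print(f"speech detected from sample {onset} to sample {offset}")
```

The gap between the two thresholds keeps the decision from flickering when the envelope hovers near a single decision level, at the cost of a short delay at onset and offset caused by the smoothing.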