Facial SEMG for Speech Recognition Inter-Subject Variation

Sridhar P. Arjunan, Dinesh K. Kumar, Wai C. Yau, Hans Weghorn

2006

Abstract

The aim of this project is to identify speech from facial muscle activity, without audio signals. The paper presents a technique that measures the relative activity of the articulatory muscles and tests the performance of the system under inter-subject variation. Three English vowels were used as recognition variables. The moving root mean square (RMS) of the surface electromyogram (SEMG) of four facial muscles was used to segment the signal and identify the start and end of each utterance. The RMS of the signal between the start and end markers was integrated and normalised to represent the relative muscle activity, and the relative activities of the four muscles were classified using a back-propagation neural network to identify the speech. The results show that this technique gives a high recognition rate when the network is trained and tested on the same subject. The results also indicate that accuracy drops when a network trained on one subject is tested on another, which suggests a large inter-subject variation in speaking style for similar sounds. The experiments also show that the system is easy to train for a new user. Such a system is therefore suggested as suitable for simple human-computer interface commands, provided it is trained for the user.
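
The feature extraction described in the abstract (moving RMS, segmentation, integration, normalisation) can be illustrated with a minimal sketch. The abstract does not give the window length, the segmentation rule, the integration rule, or the exact normalisation, so everything below is an assumption made for illustration: the function names, the 50-sample window, the threshold at a fraction `k` of the peak of the summed RMS, trapezoidal integration, and normalisation by the sum over the four channels. The back-propagation classifier is not shown.

```python
import numpy as np

def moving_rms(x, window):
    """Moving RMS of a 1-D signal over a sliding window of `window` samples."""
    kernel = np.ones(window) / window
    # Mean of the squared signal inside the window, then square root.
    return np.sqrt(np.convolve(x ** 2, kernel, mode="same"))

def find_utterance(rms, threshold):
    """Return (start, end) indices of the first and last samples above threshold."""
    above = np.flatnonzero(rms > threshold)
    if above.size == 0:
        return None
    return int(above[0]), int(above[-1])

def relative_activity(semg, window=50, k=0.3):
    """Relative muscle activity per channel.

    semg: array of shape (n_channels, n_samples), one row per muscle.
    window, k: hypothetical parameters, not taken from the paper.
    """
    rms = np.array([moving_rms(ch, window) for ch in semg])
    # Assumption: segment on the RMS summed over channels, with the
    # threshold set to a fraction k of its peak value.
    total = rms.sum(axis=0)
    span = find_utterance(total, k * total.max())
    if span is None:
        raise ValueError("no utterance detected")
    start, end = span
    segment = rms[:, start:end + 1]
    # Assumption: trapezoidal integration with unit sample spacing.
    integrated = (segment[:, :-1] + segment[:, 1:]).sum(axis=1) / 2.0
    # Assumption: normalise so the channel activities sum to 1.
    return integrated / integrated.sum()

# Synthetic example: four channels carrying the same burst at different amplitudes.
t = np.arange(200)
semg = np.zeros((4, 1000))
for i, amplitude in enumerate([1.0, 2.0, 3.0, 4.0]):
    semg[i, 400:600] = amplitude * np.sin(2 * np.pi * t / 20)
features = relative_activity(semg)
```

In this sketch `features` is a four-element vector, one value per muscle, which is the kind of input the abstract describes feeding to the neural network classifier.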



Paper Citation


in Harvard Style

Arjunan, S. P., Kumar, D. K., Yau, W. C. and Weghorn, H. (2006). Facial SEMG for Speech Recognition Inter-Subject Variation. In Proceedings of the 2nd International Workshop on Biosignal Processing and Classification - Volume 1: BPC, (ICINCO 2006) ISBN 978-972-8865-67-2, pages 3-12. DOI: 10.5220/0001208600030012


in Bibtex Style

@conference{bpc06,
author={Sridhar P. Arjunan and Dinesh K. Kumar and Wai C. Yau and Hans Weghorn},
title={Facial SEMG for Speech Recognition Inter-Subject Variation},
booktitle={Proceedings of the 2nd International Workshop on Biosignal Processing and Classification - Volume 1: BPC, (ICINCO 2006)},
year={2006},
pages={3-12},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001208600030012},
isbn={978-972-8865-67-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Workshop on Biosignal Processing and Classification - Volume 1: BPC, (ICINCO 2006)
TI - Facial SEMG for Speech Recognition Inter-Subject Variation
SN - 978-972-8865-67-2
AU - Arjunan, Sridhar P.
AU - Kumar, Dinesh K.
AU - Yau, Wai C.
AU - Weghorn, Hans
PY - 2006
SP - 3
EP - 12
DO - 10.5220/0001208600030012