SILENT BILINGUAL VOWEL RECOGNITION - Using fSEMG for HCI based Speech Commands

Sridhar Poosapadi Arjunan, Hans Weghorn, Dinesh Kant Kumar, Wai Chee Yau

2007

Abstract

This research examines the use of fSEMG (facial surface electromyogram) to recognise speech commands in English and German without evaluating any voice signals. The system is designed for applications based on speech commands for Human Computer Interaction (HCI). An effective technique is presented that uses the activity of the facial articulatory muscles and human factors for silent vowel recognition. The speed and style of speaking vary between experiments, and this variation appears to be more pronounced when people speak a language other than their native one. This investigation reports the measurement of the relative activity of the articulatory muscles for the recognition of silent vowels of German (native) and English (foreign). In this analysis, three English vowels and three German vowels were used as recognition variables. The moving root mean square (RMS) of the surface electromyogram (SEMG) of four facial muscles is used to segment the signal and to identify the start and end of a silently spoken utterance. The relative muscle activity is computed by integrating and normalising the RMS values of the signals between the detected start and end markers. The resulting feature vector is classified using a back-propagation neural network to identify the voiceless speech. Cross-validation was performed to test the reliability of the classification, and the data was also tested using the K-means clustering technique to determine its linear separability. The experimental results show that this technique yields a high recognition rate for all participants in both languages. The results also show that the system is easy to train for a new user, and they suggest that such a system works reliably for simple vowel-based commands in a human-computer interface once it is trained for a user, whether that user speaks one or more languages, and that it is suitable for people who have a speech disability.
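The processing chain described in the abstract (a moving RMS for segmentation, followed by integrated and normalised RMS values as the feature vector) can be sketched in a few lines of Python. This is a minimal illustration under assumed parameters, not the authors' implementation: the window length, the fixed segmentation threshold, and the function names (moving_rms, find_utterance, relative_muscle_activity) are hypothetical.

import numpy as np

def moving_rms(x, window=100):
    """Moving RMS of a 1-D SEMG signal over a sliding window (in samples)."""
    power = np.convolve(x ** 2, np.ones(window) / window, mode="same")
    return np.sqrt(power)

def find_utterance(rms, threshold):
    """Return (start, end) sample indices where the RMS exceeds the threshold."""
    active = np.where(rms > threshold)[0]
    if active.size == 0:
        return None
    return active[0], active[-1]

def relative_muscle_activity(channels, window=100, threshold=0.05):
    """Feature vector: integrated RMS per muscle channel between the detected
    start/end markers, normalised so the four values sum to one."""
    rms = np.array([moving_rms(ch, window) for ch in channels])
    # Detect the utterance on the summed activity of all four facial muscles
    # (an assumption; the abstract only states that the RMS is used to find
    # the start and end of a silently spoken utterance).
    markers = find_utterance(rms.sum(axis=0), threshold)
    if markers is None:
        raise ValueError("no utterance detected")
    start, end = markers
    integrated = rms[:, start:end + 1].sum(axis=1)
    return integrated / integrated.sum()

# Synthetic check: four channels of low-level noise with a burst of activity.
rng = np.random.default_rng(0)
channels = 0.005 * rng.standard_normal((4, 5000))
channels[:, 2000:3000] += 0.2 * rng.standard_normal((4, 1000))
print(relative_muscle_activity(channels))

The resulting four-element vector (one normalised value per muscle) is the kind of input the abstract describes passing to the back-propagation neural network for classification.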



Paper Citation


in Harvard Style

Poosapadi Arjunan S., Weghorn H., Kant Kumar D. and Chee Yau W. (2007). SILENT BILINGUAL VOWEL RECOGNITION - Using fSEMG for HCI based Speech Commands. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 5: ICEIS, ISBN 978-972-8865-92-4, pages 68-75. DOI: 10.5220/0002365400680075


in Bibtex Style

@conference{iceis07,
author={Sridhar Poosapadi Arjunan and Hans Weghorn and Dinesh Kant Kumar and Wai Chee Yau},
title={SILENT BILINGUAL VOWEL RECOGNITION - Using fSEMG for HCI based Speech Commands},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 5: ICEIS},
year={2007},
pages={68-75},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002365400680075},
isbn={978-972-8865-92-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 5: ICEIS
TI - SILENT BILINGUAL VOWEL RECOGNITION - Using fSEMG for HCI based Speech Commands
SN - 978-972-8865-92-4
AU - Poosapadi Arjunan S.
AU - Weghorn H.
AU - Kant Kumar D.
AU - Chee Yau W.
PY - 2007
SP - 68
EP - 75
DO - 10.5220/0002365400680075
ER -