
 
performed for each subject. To observe any changes 
in muscle activity, the recorded raw EMG signal was 
further processed.  
After the recording process was completed, the 
raw EMG was transferred to Matlab for further 
analysis. Thresholding with an averaging filter was 
applied to remove noise. The RMS (root mean 
square) value of each signal was estimated over a 
window of length s = 1.5 s. This window size 
was selected as it represented the maximum size of 
the envelope for the vowels spoken by the subjects.  
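The RMS feature described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab processing used in the study; the sampling rate `fs_hz`, the window start offset, and the function names are assumptions introduced for the example and are not taken from the recordings.

```python
import math

def window_rms(signal, fs_hz, window_s=1.5, start_s=0.0):
    """Return the RMS of `signal` over a window of `window_s` seconds.

    `fs_hz` (sampling rate) and `start_s` (window start) are assumed
    parameters; the source only states the 1.5 s window length.
    """
    start = int(round(start_s * fs_hz))
    n = int(round(window_s * fs_hz))
    segment = signal[start:start + n]
    if n <= 0 or len(segment) < n:
        raise ValueError("signal is shorter than the requested window")
    # RMS = sqrt of the mean of the squared samples.
    return math.sqrt(sum(x * x for x in segment) / n)

def rms_features(channels, fs_hz, window_s=1.5):
    """One RMS value per EMG channel, e.g. three values for three muscles."""
    return [window_rms(ch, fs_hz, window_s) for ch in channels]
```

With three recorded channels, `rms_features` would return the three-element feature vector that serves as the input for one utterance.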
4 TESTING 
Recognition of EMG-based speech features may be 
achieved by applying a supervised artificial neural 
network (ANN). An ANN can remain effective 
even when data quality is poor. Neural networks can learn 
from examples and, once trained, are extremely fast, 
making them suitable for real-time applications 
(Freeman and Skapura, 1991; Haung, 2001). 
Classification by an ANN does not require any 
statistical assumptions about the data. ANNs learn to 
recognize the characteristic features of the data in order to 
classify the data efficiently and accurately.  
A Back Propagation (BPN) type Artificial Neural 
Network has been designed and implemented. The 
Feed Forward (FF) architecture with the BPN 
learning algorithm was chosen to overcome the 
drawback of the standard ANN architecture. 
One advantage is that augmenting the input with hidden 
context units, which give feedback to the hidden layer, 
gives the network the ability to extract features of the 
data from the training events. The 
size of the hidden layer and other parameters of the 
network were chosen iteratively after 
experimentation with the back-propagation 
algorithm. There is an inherent trade-off to be made: 
more hidden units result in more time required for 
each iteration of training; fewer hidden units result 
in a faster update rate. For this study, a structure with two 
hidden layers was found to give good 
performance without being prohibitive in terms of training 
time. A sigmoid was used as the threshold 
function, and gradient descent with an adaptive learning 
rate and momentum as the training algorithm. A learning 
rate of 0.02 and the default momentum rate were 
found to be suitable for stable learning of the 
network. Training stopped when the network 
converged and the network error was less than the 
target error. The weights and biases of the network 
were saved and used for testing the network. The 
data were divided into training, validation, 
and test subsets. One quarter of the data was 
used for the validation set, one quarter for the test set, 
and one half for the training set. Three RMS values 
of EMG captured while the subject pronounced the 
vowels were defined as inputs to the ANN. The 
output of the ANN was one of the five vowels.  
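The data flow described in this section can be sketched in Python as below. It is an illustration under stated assumptions, not the Matlab network used in the study: the hidden layer sizes, the weight initialisation range, the momentum value of 0.9, and all function names are hypothetical. The update shown is the classical momentum rule; the adaptive learning rate is omitted, and toolbox implementations may differ in detail.

```python
import math
import random

VOWELS = ["a", "e", "i", "o", "u"]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def init_layer(n_in, n_out, rng):
    """Each row holds one unit's input weights followed by its bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def layer_forward(layer, inputs):
    """Sigmoid activation of every unit in one layer."""
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in layer]

def classify(network, rms_values):
    """Feed three RMS values forward and return the most active vowel."""
    activations = list(rms_values)
    for layer in network:
        activations = layer_forward(layer, activations)
    best = max(range(len(activations)), key=activations.__getitem__)
    return VOWELS[best]

def momentum_step(weight, gradient, velocity, lr=0.02, momentum=0.9):
    """Classical momentum update; momentum=0.9 is an assumed value."""
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity

def split_data(samples, seed=0):
    """Shuffle, then split into 1/2 training, 1/4 validation, 1/4 test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) // 2
    n_val = len(shuffled) // 4
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical sizes: 3 inputs, two hidden layers of 8 units, 5 outputs.
rng = random.Random(0)
network = [init_layer(3, 8, rng), init_layer(8, 8, rng), init_layer(8, 5, rng)]
```

An untrained network such as this one returns an arbitrary vowel; in practice the weights would be fitted on the training subset, with the validation subset used to monitor convergence and the test subset held back for the final accuracy figures.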
5 RESULTS AND DISCUSSION 
Table 1: Accuracy (%) of recognition of vowels from EMG. 
            /a/   /e/   /i/   /o/   /u/   Average 
Subject 1    97    94    98    93    85    93.4 
Subject 2    91    86    90    85    93    89 
Subject 3    88    89    86    97    95    91 
Table 1 shows the experimental results. The results 
of the testing show that the system described 
can classify the five vowels with an average accuracy of 
about 91% across the three subjects. The high 
classification accuracy is attributed to the 
discriminating ability of the neural network 
architecture and the use of the RMS of EMG as features. At 
the present stage, the method has been tested 
successfully with only three subjects. In order to 
evaluate the intra- and inter-subject variability of the method, 
a study on a larger experimental population is 
required. 
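As a check on the arithmetic in Table 1, the per-subject averages and the overall mean can be recomputed from the tabulated accuracies. The short Python sketch below uses only the values in the table; the variable names are illustrative.

```python
# Recognition accuracy (%) per vowel, copied from Table 1.
accuracy = {
    "Subject 1": [97, 94, 98, 93, 85],
    "Subject 2": [91, 86, 90, 85, 93],
    "Subject 3": [88, 89, 86, 97, 95],
}

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

# Average over the five vowels for each subject.
per_subject = {name: mean(scores) for name, scores in accuracy.items()}

# Mean of the three per-subject averages.
overall = mean(list(per_subject.values()))

for name, value in per_subject.items():
    print(f"{name}: {value:.1f}")
print(f"Overall: {overall:.1f}")
```

The per-subject averages reproduce the last column of the table (93.4, 89 and 91), and their mean is about 91.1.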
6 CONCLUSIONS 
This paper describes a study to recognise human 
speech based on EMG data recorded from 
three articulatory facial muscles, coupled with 
neural networks. Test results show a recognition 
accuracy of 91%. The system is accurate when 
compared to other attempts at EMG-based speech 
recognition. These preliminary results 
suggest that the approach is suitable for developing a 
real-time EMG-based speech recognition system. This 
would have a number of applications, such as voice 
control of machines and toys in noisy environments 
and assistance for people who are unable to speak. It 
could also find other applications, such as noise 
reduction for telephone conversations in noisy 
environments.  
7 FURTHER WORK 
The authors are currently working with a larger 
population of subjects to determine the inter and 
APPLICABILITY OF FACIAL EMG IN HCI AND VOICELESS COMMUNICATION