
 
cases of tests with MS or FS, which demonstrates the 
high generalization capacity of the combined system. 
For the LPC and MFC coefficients, training on the 
combined database is less effective: the best results 
are obtained when the tests are run on the same type 
of database as was used for training. 
The PRR variation follows the WRR variation, as 
expected, because nothing was done specifically to 
enhance PRR. 
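The link between the two rates can be illustrated with a minimal sketch. It assumes that WRR is the percentage of reference words recognized correctly (HTK-style percent correct, (N - D - S) / N) and that PRR is the percentage of phrases recognized without any word error; these definitions, the function names and the toy phrases are illustrative assumptions, not the scoring code used in the experiments.

```python
from typing import List, Tuple


def edit_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) of a minimum-edit alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total_errors, subs, dels, ins) for ref[:i] versus hyp[:j]
    cost = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (t, s, d, n)            # match, no new error
            else:
                best = (t + 1, s + 1, d, n)    # substitution
            t, s, d, n = cost[i - 1][j]
            best = min(best, (t + 1, s, d + 1, n))  # deletion
            t, s, d, n = cost[i][j - 1]
            best = min(best, (t + 1, s, d, n + 1))  # insertion
            cost[i][j] = best
    _, subs, dels, ins = cost[-1][-1]
    return subs, dels, ins


def recognition_rates(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (WRR, PRR) in percent for (reference, hypothesis) phrase pairs."""
    total_words = correct_words = correct_phrases = 0
    for ref, hyp in pairs:
        r, h = ref.split(), hyp.split()
        subs, dels, _ins = edit_counts(r, h)
        total_words += len(r)
        correct_words += len(r) - subs - dels   # assumed WRR numerator
        correct_phrases += int(r == h)          # phrase counts only if error-free
    wrr = 100.0 * correct_words / total_words
    prr = 100.0 * correct_phrases / len(pairs)
    return wrr, prr


# Hypothetical toy data: one phrase recognized perfectly, one with a substitution.
pairs = [("a b c", "a b c"), ("a b c d", "a x c d")]
wrr, prr = recognition_rates(pairs)
print(f"WRR = {wrr:.2f}%  PRR = {prr:.2f}%")
```

Under these assumptions a phrase is counted as correct only when all of its words are, so any change in the word-level rate carries over to the phrase-level rate unless something acts on whole phrases, for example a language model.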
The results also improve clearly when the HMMs 
model triphones rather than monophones. 
FEATURES EXTRACTION AND TRAINING STRATEGIES IN CONTINUOUS SPEECH RECOGNITION FOR
ROMANIAN LANGUAGE