BANGLA ISOLATED WORD SPEECH RECOGNITION

Adnan Firoze, M. Shamsul Arifin, Ryana Quadir, Rashedur M. Rahman

Abstract

This paper presents isolated-word Bangla speech recognition using spectral analysis and fuzzy logic. Because human speech is imprecise and ambiguous, fuzzy logic, which is itself founded on linguistic ambiguity, could serve as a more precise tool for analysing and recognizing it. Although an uttered word originates as a voiced signal, our system is built around the visual representation of that signal: the spectrogram. The spectrogram may be perceived as a “visual” entity, but underlying it are matrices that hold information about the properties of a sound, e.g., energy, frequency and time. In this research, spectral analysis of these matrices was chosen over image processing of the spectrogram for increased accuracy. The decision-making process of our system is based on fuzzy logic. Experimental results show that our system achieves 80% accuracy, whereas a commercial Hidden Markov Model (HMM) based speech recognizer achieves 73% accuracy on average.
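
To make the idea of a spectrogram as a matrix of energy over frequency and time concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes NumPy is available and uses a synthetic 1 kHz tone in place of a recorded word. The frame length, hop size, helper names (`spectrogram`, `triangular`) and the triangular membership function with its 500/1000/1500 Hz breakpoints are all hypothetical choices made for illustration; the abstract does not specify the features or fuzzy rules actually used.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Return a (frequency bins x time frames) matrix of spectral energy."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames (one row per frame).
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT of each frame; squared magnitude is the energy per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2


def triangular(x, low, peak, high):
    """Triangular fuzzy membership: 0 outside (low, high), 1 at peak."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


# Synthetic stand-in for a recorded utterance: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)

# Frequency bin with the highest average energy across all time frames.
dominant_hz = float(freqs[spec.mean(axis=1).argmax()])

# Degree to which the dominant frequency belongs to a hypothetical
# fuzzy set "around 1 kHz" (breakpoints chosen only for illustration).
membership = triangular(dominant_hz, 500.0, 1000.0, 1500.0)
print(spec.shape, dominant_hz, membership)
```

Each column of `spec` is one time frame and each row one frequency bin, so features can be read directly from the matrix rather than from a rendered image. A fuzzy decision stage would then combine several such membership degrees instead of relying on a single hard threshold.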



Paper Citation


in Harvard Style

Firoze A., Arifin M., Quadir R. and Rahman R. (2011). BANGLA ISOLATED WORD SPEECH RECOGNITION. In Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8425-54-6, pages 73-82. DOI: 10.5220/0003492700730082


in Bibtex Style

@conference{iceis11,
author={Adnan Firoze and M. Shamsul Arifin and Ryana Quadir and Rashedur M. Rahman},
title={BANGLA ISOLATED WORD SPEECH RECOGNITION},
booktitle={Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2011},
pages={73-82},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003492700730082},
isbn={978-989-8425-54-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - BANGLA ISOLATED WORD SPEECH RECOGNITION
SN - 978-989-8425-54-6
AU - Firoze A.
AU - Arifin M.
AU - Quadir R.
AU - Rahman R.
PY - 2011
SP - 73
EP - 82
DO - 10.5220/0003492700730082