Model Adaptation via MAP for Speech Recognition in Noisy Environments

Tatiane Melo Vital, Carlos Alberto Ynoguti

2014

Abstract

The accuracy of speech recognition systems degrades severely in noisy environments, mainly because of the mismatch between training and testing conditions. Training with noise-corrupted utterances has been used successfully in many works. However, because the type and intensity of the noise at operation time are unpredictable, the present work goes a step further: it uses the MAP (maximum a posteriori) method with samples of the actual audio signal being processed to adapt such systems to the real noise condition. Experimental results show an average increase of almost 2% in recognition rates compared with systems trained with noisy utterances.
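As a rough illustration of the idea, MAP adaptation can be sketched as interpolating between a prior model and statistics gathered from the adaptation frames. The abstract does not say which parameters are adapted or how the prior is weighted, so the sketch below is an assumption throughout: it adapts only the means of a diagonal-covariance Gaussian mixture, and the function names and the prior weight `tau` are hypothetical rather than taken from the paper.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, tau=10.0):
    """Return MAP-adapted means; mixture weights and variances stay fixed.

    tau is an assumed prior weight: larger values keep the adapted means
    closer to the prior (noise-trained) model.
    """
    n_mix, dim = len(means), len(means[0])
    counts = [0.0] * n_mix                      # soft frame count per mixture
    sums = [[0.0] * dim for _ in range(n_mix)]  # posterior-weighted feature sums
    for x in frames:
        # Posterior probability of each mixture given this frame.
        logp = [math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)
        post = [math.exp(lp - top) for lp in logp]
        norm = sum(post)
        for k in range(n_mix):
            g = post[k] / norm
            counts[k] += g
            for d in range(dim):
                sums[k][d] += g * x[d]
    # Interpolate between the prior mean and the adaptation data.
    return [
        [(tau * means[k][d] + sums[k][d]) / (tau + counts[k]) for d in range(dim)]
        for k in range(n_mix)
    ]


# Toy data: two mixtures, and adaptation frames shifted away from the prior means.
prior_means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
weights = [0.5, 0.5]
noisy_frames = [[1.0, 1.0]] * 10 + [[6.0, 6.0]] * 10
adapted = map_adapt_means(noisy_frames, weights, prior_means, variances, tau=10.0)
```

With few adaptation frames the adapted means stay near the prior model, and with many frames they move toward the statistics of the observed noisy signal, which is the behaviour that makes this kind of update attractive when the noise condition is unknown in advance.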



Paper Citation


in Harvard Style

Melo Vital T. and Alberto Ynoguti C. (2014). Model Adaptation via MAP for Speech Recognition in Noisy Environments. In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2014) ISBN 978-989-758-011-6, pages 91-96. DOI: 10.5220/0004718400910096


in Bibtex Style

@conference{biosignals14,
author={Tatiane Melo Vital and Carlos Alberto Ynoguti},
title={Model Adaptation via MAP for Speech Recognition in Noisy Environments},
booktitle={Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2014)},
year={2014},
pages={91-96},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004718400910096},
isbn={978-989-758-011-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2014)
TI - Model Adaptation via MAP for Speech Recognition in Noisy Environments
SN - 978-989-758-011-6
AU - Melo Vital T.
AU - Alberto Ynoguti C.
PY - 2014
SP - 91
EP - 96
DO - 10.5220/0004718400910096