LIVE TV SUBTITLING - Fast 2-pass LVCSR System for Online Subtitling

Aleš Pražák, Luděk Müller, J. V. Psutka, J. Psutka

2007

Abstract

This paper describes a fast two-pass large vocabulary continuous speech recognition (LVCSR) system for automatic online subtitling of live TV programs. The system can either recognize the audio channel of a TV program directly or recognize a shadow speaker who re-speaks the original audio. The first part of the paper focuses on preparing an adaptive language model for TV programs, in which the person names specific to each subtitling session have to be added to the recognition vocabulary. The second part outlines the design of a recognition system that performs automatic online subtitling in real time with a vocabulary of up to 150 000 words. The recognition system is based on hidden Markov models and lexical trees, with a bigram language model in the first pass and a quadgram (4-gram) language model in the second pass. Finally, experimental results from our project with Czech Television are reported and discussed.


Paper Citation


in Harvard Style

Pražák A., Müller L., Psutka J. V. and Psutka J. (2007). LIVE TV SUBTITLING - Fast 2-pass LVCSR System for Online Subtitling. In Proceedings of the Second International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2007), ISBN 978-989-8111-13-5, pages 139-142. DOI: 10.5220/0002140301390142


in Bibtex Style

@conference{sigmap07,
author={Aleš Pražák and Luděk Müller and J. V. Psutka and J. Psutka},
title={LIVE TV SUBTITLING - Fast 2-pass LVCSR System for Online Subtitling},
booktitle={Proceedings of the Second International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2007)},
year={2007},
pages={139--142},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002140301390142},
isbn={978-989-8111-13-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Second International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2007)
TI - LIVE TV SUBTITLING - Fast 2-pass LVCSR System for Online Subtitling
SN - 978-989-8111-13-5
AU - Pražák A.
AU - Müller L.
AU - Psutka J. V.
AU - Psutka J.
PY - 2007
SP - 139
EP - 142
DO - 10.5220/0002140301390142