Authors:
Aleš Pražák (1); Luděk Müller (1); J. V. Psutka (2) and J. Psutka (2)
Affiliations:
(1) SpeechTech s.r.o., Czech Republic; (2) University of West Bohemia, Czech Republic
Keyword(s):
ASR, LVCSR, HMM, real-time, class-based language model, live TV, online subtitling.
Related Ontology Subjects/Areas/Topics:
Applications; Audio and Speech Processing; Digital Signal Processing; Multimedia; Multimedia Signal Processing; Pattern Recognition; Software Engineering; Telecommunications
Abstract:
The paper describes a fast 2-pass large vocabulary continuous speech recognition (LVCSR) system for automatic online subtitling of live TV programs. The proposed implementation can be used either for direct recognition of the TV program audio channel or for recognition of a shadow speaker who re-speaks the original audio. The first part of the paper focuses on preparing an adaptive language model for TV programs, in which person names are specific to each subtitling session and have to be added to the recognition vocabulary. The second part outlines the design of the recognition system, which performs automatic online subtitling in real time with a vocabulary of up to 150 000 words. The recognition system is based on Hidden Markov Models, lexical trees, and bigram and quadgram language models in the first and second pass, respectively. Finally, experimental results from our project with Czech Television are reported and discussed.
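The idea of a class-based language model whose name class is refilled for each subtitling session can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ClassBigramLM`, the `<NAME>` class label, the toy probabilities, and the uniform distribution of words within a class are all assumptions made for illustration. The sketch uses the standard class-based bigram factorization P(w | prev) = P(c(w) | c(prev)) · P(w | c(w)).

```python
class ClassBigramLM:
    """Toy class-based bigram model (illustrative assumption, not the paper's code).

    P(w | prev) = P(class(w) | class(prev)) * P(w | class(w)),
    with P(w | class(w)) assumed uniform over the class members.
    """

    def __init__(self, class_bigram):
        # {(previous_class, current_class): probability}
        self.class_bigram = dict(class_bigram)
        # {word: class}; a word not listed here acts as its own class
        self.word_to_class = {}
        # {class: set of member words}
        self.members = {}

    def add_names(self, names, cls="<NAME>"):
        """Add session-specific words (e.g. person names) to a class."""
        for name in names:
            self.word_to_class[name] = cls
            self.members.setdefault(cls, set()).add(name)

    def _class_of(self, word):
        return self.word_to_class.get(word, word)

    def prob(self, prev, word):
        """Bigram probability of `word` following `prev`."""
        c_prev, c = self._class_of(prev), self._class_of(word)
        p_class = self.class_bigram.get((c_prev, c), 0.0)
        # A word that is its own class has exactly one member.
        class_size = len(self.members.get(c, {word}))
        return p_class / class_size


# Hypothetical session: the class-level statistics stay fixed,
# only the list of names changes between sessions.
lm = ClassBigramLM({("said", "<NAME>"): 0.2})
lm.add_names(["Novák", "Svoboda"])
p = lm.prob("said", "Novák")
```

Because the class-level bigram statistics do not depend on which names are in the class, new names can be added before a session without re-estimating the whole model; only the within-class distribution changes.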