LANGUAGE MODEL BASED ON POS TAGGER

Bartosz Ziółko, Suresh Manandhar, Richard C. Wilson, Mariusz Ziółko

2008

Abstract

Language models are necessary for any large-vocabulary speech recogniser. Two main types of information can be used to support modelling a language: syntactic and semantic. One way to apply syntactic modelling is to use part-of-speech (POS) taggers. Morphological information can be statistically analysed to provide the probability of a sequence of words from their POS tags. Results for Polish language modelling are presented.
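
To illustrate how POS tags can yield a probability for a word sequence, the following is a minimal sketch, not the authors' method: the abstract does not specify the tagger, the tagset, or the statistics used. The sketch assumes a bigram model over tags with add-one smoothing, a tiny hand-made corpus, and hypothetical tag labels (`subst`, `adj`, `fin`); the function names are likewise illustrative.

```python
import math
from collections import Counter


def train_tag_bigrams(tagged_sentences):
    """Count tag contexts and tag bigrams, padding sentences with <s> and </s>."""
    context_counts = Counter()
    bigram_counts = Counter()
    tagset = {"</s>"}  # every outcome a bigram can predict, including end-of-sentence
    for tags in tagged_sentences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        tagset.update(tags)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, tagset


def tag_sequence_logprob(tags, context_counts, bigram_counts, tagset):
    """Log-probability of a tag sequence under an add-one smoothed bigram model."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    vocab_size = len(tagset)
    logp = 0.0
    for prev, cur in zip(padded, padded[1:]):
        numerator = bigram_counts[(prev, cur)] + 1
        denominator = context_counts[prev] + vocab_size
        logp += math.log(numerator / denominator)
    return logp


# Toy training data: each sentence is already reduced to its POS tags (assumed labels).
corpus = [
    ["adj", "subst", "fin"],
    ["subst", "fin", "adj", "subst"],
    ["subst", "fin"],
]
model = train_tag_bigrams(corpus)

# Two competing recogniser hypotheses, represented by their tag sequences.
plausible = tag_sequence_logprob(["adj", "subst", "fin"], *model)
implausible = tag_sequence_logprob(["fin", "adj", "adj"], *model)
print(plausible > implausible)
```

In a recogniser, a score of this kind could be used to rerank word hypotheses: each hypothesis is tagged, and the hypothesis whose tag sequence is more probable is preferred. On the toy corpus above, the first tag sequence scores higher than the second.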



Paper Citation


in Harvard Style

Ziółko B., Manandhar S., Wilson R. C. and Ziółko M. (2008). LANGUAGE MODEL BASED ON POS TAGGER. In Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2008) ISBN 978-989-8111-60-9, pages 177-180. DOI: 10.5220/0001931701770180


in Bibtex Style

@conference{sigmap08,
author={Bartosz Ziółko and Suresh Manandhar and Richard C. Wilson and Mariusz Ziółko},
title={LANGUAGE MODEL BASED ON POS TAGGER},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2008)},
year={2008},
pages={177-180},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001931701770180},
isbn={978-989-8111-60-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2008)
TI - LANGUAGE MODEL BASED ON POS TAGGER
SN - 978-989-8111-60-9
AU - Ziółko B.
AU - Manandhar S.
AU - Wilson R. C.
AU - Ziółko M.
PY - 2008
SP - 177
EP - 180
DO - 10.5220/0001931701770180