A TRANSLITERATION ENGINE FOR ASIAN LANGUAGES

Sathiamoorthy Manoharan

2008

Abstract

Transliteration represents the characters of one script using the characters of another. It is commonly used when proper nouns of one language need to be written in the script of another language, which typically involves constructing approximately phonetically equivalent words. For instance, the phonetic equivalent of New Zealand in Japanese is “niyuu jiilando”, which in Katakana is ニユー ジーランド. This paper illustrates the design and development of a transliteration engine suited to syllabary and alphasyllabary scripts. A syllabary is a set of characters that represent syllables; the Japanese Katakana and Hiragana scripts fall into this category. An alphasyllabary is a set of characters that represent consonants, vowels, and syllables composed of consonants and vowels. Thai, Sinhala, Burmese, and most Indian scripts, such as Devanagari, Tamil, and Malayalam, fall into this category. The engine is useful in teaching and learning ethnic scripts. It is also a useful tool for internationalizing and localizing computer programs, publishing ethnic scripts over the Internet, and composing electronic documents in ethnic scripts.
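
As a rough illustration of the kind of mapping described in the abstract, the following minimal sketch converts romanized text into Katakana with a greedy longest-match table lookup. The table contents, the function name `to_katakana`, the long-vowel rule, and the longest-match strategy are all assumptions made for illustration; the abstract does not describe the engine's actual algorithm or data structures.

```python
# Hypothetical, deliberately tiny romanization-to-Katakana table.
# A real engine would cover the full syllabary; "la" is mapped to the
# same character as "ra" only to follow the abstract's spelling "jiilando".
KATAKANA = {
    "ni": "ニ", "yu": "ユ", "ji": "ジ", "la": "ラ", "ra": "ラ",
    "do": "ド", "ka": "カ", "ta": "タ", "na": "ナ", "n": "ン",
}
VOWELS = "aiueo"
LONG_MARK = "ー"  # Katakana prolonged sound mark (U+30FC)


def to_katakana(romaji: str) -> str:
    """Transliterate romanized text using greedy longest-match lookup."""
    max_len = max(len(key) for key in KATAKANA)
    out = []
    i = 0
    while i < len(romaji):
        ch = romaji[i]
        if ch == " ":
            # Preserve word boundaries as they appear in the input.
            out.append(" ")
            i += 1
            continue
        # Assumed rule: a vowel that repeats the preceding letter
        # lengthens the previous syllable and becomes the long mark.
        if out and ch in VOWELS and romaji[i - 1] == ch:
            out.append(LONG_MARK)
            i += 1
            continue
        # Try the longest table key first, then shorter ones.
        for length in range(max_len, 0, -1):
            chunk = romaji[i:i + length]
            if chunk in KATAKANA:
                out.append(KATAKANA[chunk])
                i += length
                break
        else:
            raise ValueError(f"no mapping for input at position {i}: {romaji[i:]!r}")
    return "".join(out)


print(to_katakana("niyuu jiilando"))
```

Longest-match lookup is used here so that a syllable such as "na" is preferred over the standalone "n" whenever both could apply; inputs not covered by the small table raise a `ValueError` rather than being guessed.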


Paper Citation


in Harvard Style

Manoharan S. (2008). A TRANSLITERATION ENGINE FOR ASIAN LANGUAGES. In Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-26-5, pages 376-379. DOI: 10.5220/0001519903760379


in Bibtex Style

@conference{webist08,
author={Sathiamoorthy Manoharan},
title={A TRANSLITERATION ENGINE FOR ASIAN LANGUAGES},
booktitle={Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2008},
pages={376-379},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001519903760379},
isbn={978-989-8111-26-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fourth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - A TRANSLITERATION ENGINE FOR ASIAN LANGUAGES
SN - 978-989-8111-26-5
AU - Manoharan S.
PY - 2008
SP - 376
EP - 379
DO - 10.5220/0001519903760379