Authors: Karima Meftouh 1 ; Kamel Smaili 2 and Mohamed Tayeb Laskri 1

Affiliations: 1 Badji Mokhtar University, Algeria ; 2 INRIA-LORIA, France

ISBN: 978-989-8111-66-1

ISSN: 2184-433X

Keyword(s): Statistical language modeling, Arabic, French, Smoothing technique, n-gram model, Vocabulary, Perplexity, Performance.

Related Ontology Subjects/Areas/Topics: Applications ; Artificial Intelligence ; Knowledge Engineering and Ontology Development ; Knowledge-Based Systems ; Natural Language Processing ; Pattern Recognition ; Symbolic Systems

Abstract: This paper presents a comparative study of statistical language models for Arabic and French, with the aim of understanding how best to model each language. Several experiments were carried out using different smoothing techniques. For French, trigram models are the most appropriate regardless of the smoothing technique used. For Arabic, higher-order n-gram models smoothed with the Witten-Bell method are more effective. The experiments use corpora and vocabularies of comparable size.
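
As an illustration of the terms used in the abstract, the following is a minimal sketch of an interpolated Witten-Bell bigram model and the perplexity measure typically used to compare such models. It is not the authors' experimental setup: the function names, the toy data, the sentence markers `<s>` and `</s>`, and the restriction to bigrams are assumptions made for brevity.

```python
import math
from collections import Counter, defaultdict

def train_witten_bell_bigram(sentences, vocab):
    """Return p(word | prev) under interpolated Witten-Bell smoothing.

    Assumes every training token (and "</s>") is in `vocab`.
    """
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            unigrams[word] += 1
            bigrams[prev][word] += 1
    total = sum(unigrams.values())   # number of training tokens
    types = len(unigrams)            # number of distinct tokens seen
    vocab_size = len(vocab)

    def p_unigram(word):
        # Witten-Bell interpolation with a uniform distribution over vocab.
        return (unigrams[word] + types / vocab_size) / (total + types)

    def p_bigram(word, prev):
        followers = bigrams.get(prev)
        if not followers:
            # Unseen history: fall back to the smoothed unigram estimate.
            return p_unigram(word)
        n = sum(followers.values())  # times `prev` was seen as a history
        t = len(followers)           # distinct words seen after `prev`
        return (followers[word] + t * p_unigram(word)) / (n + t)

    return p_bigram

def perplexity(p_bigram, sentences):
    """exp of the average negative log-probability per predicted token."""
    log_sum, count = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log(p_bigram(word, prev))
            count += 1
    return math.exp(-log_sum / count)

# Toy usage with hypothetical data.
vocab = ["a", "b", "c", "</s>"]
model = train_witten_bell_bigram([["a", "b"], ["a", "c"]], vocab)
print(perplexity(model, [["a", "b"]]))
```

A lower perplexity on held-out text indicates a better model, which is the sense in which n-gram orders and smoothing techniques are compared. Extending the sketch to trigrams or higher orders would apply the same interpolation recursively, one order at a time.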

CC BY-NC-ND 4.0


Paper citation in several formats:
Meftouh K.; Smaili K. and Tayeb Laskri M. (2009). COMPARATIVE STUDY OF ARABIC AND FRENCH STATISTICAL LANGUAGE MODELS. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8111-66-1, pages 156-160. DOI: 10.5220/0001537501560160

@conference{icaart09,
author={Karima Meftouh and Kamel Smaili and Mohamed {Tayeb Laskri}},
title={COMPARATIVE STUDY OF ARABIC AND FRENCH STATISTICAL LANGUAGE MODELS},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2009},
pages={156-160},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001537501560160},
isbn={978-989-8111-66-1},
}

TY - CONF

JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - COMPARATIVE STUDY OF ARABIC AND FRENCH STATISTICAL LANGUAGE MODELS
SN - 978-989-8111-66-1
AU - Meftouh, K.
AU - Smaili, K.
AU - Tayeb Laskri, M.
PY - 2009
SP - 156
EP - 160
DO - 10.5220/0001537501560160
