Authors: Alaa Alahmadi; Arash Joorabchi and Abdulhussain E. Mahdi

Affiliation: University of Limerick, Ireland

ISBN: 978-989-758-048-2

Keyword(s): Automatic Text Classification, Arabic Text, Bag-of-Words, Bag-of-Concepts, Wikipedia.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Computational Intelligence; Concept Mining; Evolutionary Computing; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Soft Computing; Symbolic Systems

Abstract: With the exponential growth of Arabic text in digital form, the need for efficient organization, navigation and browsing of large collections of Arabic documents has increased. Text Classification (TC) is one of the important subfields of data mining. The Bag-of-Words (BOW) representation model, which is the traditional way to represent text for TC, only takes into account the frequency of term occurrence within a document. It therefore ignores important semantic relationships between terms and treats synonymous words as independent features. To address this problem, this paper describes the application of a Bag-of-Concepts (BOC) text representation model to Arabic text. The proposed model uses the Arabic Wikipedia as a knowledge base for concept detection. The BOC model is used to generate a Vector Space Model, which in turn is fed into a classifier to categorize a collection of Arabic text documents. Two different machine-learning-based classifiers have been deployed to evaluate the effectiveness of the proposed model in comparison to the traditional BOW model. The results of our experiment show that the proposed BOC model improves on BOW in terms of classification accuracy.
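The contrast between the two representations can be illustrated with a small sketch. This is not the authors' implementation: the `CONCEPT_OF` table is a hypothetical, hand-written stand-in for the Wikipedia-based concept detection described in the abstract, the function names are invented for illustration, and English tokens are used only for readability.

```python
import math
from collections import Counter

# Hypothetical surface-form -> concept table (an assumption for this sketch).
# The paper derives concepts from the Arabic Wikipedia instead.
CONCEPT_OF = {
    "car": "Automobile",
    "automobile": "Automobile",
    "doctor": "Physician",
    "physician": "Physician",
}


def bag_of_words(tokens):
    """Count raw term occurrences; synonyms stay separate features."""
    return Counter(tokens)


def bag_of_concepts(tokens):
    """Count concept occurrences; tokens without a known concept are dropped."""
    return Counter(CONCEPT_OF[t] for t in tokens if t in CONCEPT_OF)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc1 = ["doctor", "bought", "car"]
doc2 = ["physician", "sold", "automobile"]

bow_sim = cosine(bag_of_words(doc1), bag_of_words(doc2))
boc_sim = cosine(bag_of_concepts(doc1), bag_of_concepts(doc2))
print(f"BOW similarity: {bow_sim:.2f}")
print(f"BOC similarity: {boc_sim:.2f}")
```

In this toy case the two documents share no terms, so their BOW similarity is zero, while under the concept mapping they share both concepts and their BOC similarity is close to one. The resulting count vectors are the kind of Vector Space Model rows that could then be passed to a classifier.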

CC BY-NC-ND 4.0

Paper citation in several formats:
Alahmadi, A.; Joorabchi, A. and Mahdi, A. E. (2014). Arabic Text Classification using Bag-of-Concepts Representation. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014) ISBN 978-989-758-048-2, pages 374-380. DOI: 10.5220/0005138103740380

@conference{kdir14,
author={Alaa Alahmadi and Arash Joorabchi and Abdulhussain E. Mahdi},
title={Arabic Text Classification using Bag-of-Concepts Representation},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014)},
year={2014},
pages={374-380},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005138103740380},
isbn={978-989-758-048-2},
}

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2014)
TI - Arabic Text Classification using Bag-of-Concepts Representation
SN - 978-989-758-048-2
AU - Alahmadi, A.
AU - Joorabchi, A.
AU - Mahdi, A. E.
PY - 2014
SP - 374
EP - 380
DO - 10.5220/0005138103740380
