Using Artificial Neural Networks in Dialect Identification in Less-resourced Languages - The Case of Kurdish Dialects Identification

Hossein Hassani, Oussama H. Hamid

Abstract

Dialect identification (classification) is an important step in many language processing tasks, particularly for multi-dialect languages. Kurdish is a multi-dialect language spoken by a large population across several countries. Some Kurdish dialects, for example Kurmanji and Sorani, differ significantly in grammar and are mutually unintelligible. In addition, Kurdish is considered a less-resourced language, whereas classification techniques based on machine learning usually require a considerable amount of data. In this research, we use approaches based on Artificial Neural Networks (ANNs) to identify the dialects of Kurdish texts without the need for a large amount of data. We also compare the outcomes of this approach with previous work on Kurdish dialect identification to assess the relative performance of the two methods. The results show no significant difference in accuracy and performance between the two approaches for long documents. However, the ANN approach performs better than the traditional approach for single-sentence classification. The accuracy of the ANN sentence classifier was 99% for Kurmanji and 96% for Sorani.


Paper Citation


in Harvard Style

Hassani H. and Hamid O. H. (2017). Using Artificial Neural Networks in Dialect Identification in Less-resourced Languages - The Case of Kurdish Dialects Identification. In Proceedings of the 9th International Joint Conference on Computational Intelligence - Volume 1: SCT, ISBN 978-989-758-274-5, pages 443-448. DOI: 10.5220/0006578004430448


in Bibtex Style

@conference{sct17,
author={Hossein Hassani and Oussama H. Hamid},
title={Using Artificial Neural Networks in Dialect Identification in Less-resourced Languages - The Case of Kurdish Dialects Identification},
booktitle={Proceedings of the 9th International Joint Conference on Computational Intelligence - Volume 1: SCT},
year={2017},
pages={443-448},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006578004430448},
isbn={978-989-758-274-5},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Joint Conference on Computational Intelligence - Volume 1: SCT
TI - Using Artificial Neural Networks in Dialect Identification in Less-resourced Languages - The Case of Kurdish Dialects Identification
SN - 978-989-758-274-5
AU - Hassani H.
AU - Hamid O. H.
PY - 2017
SP - 443
EP - 448
DO - 10.5220/0006578004430448