Authors:
Mariem Gargouri Kchaou (1); Slim Kanoun (1); Fouad Slimane (2) and Souhir Bouaziz Affes (1)
Affiliations:
(1) University of Sfax, Tunisia; (2) University of Sfax and University of Fribourg, Tunisia
Keyword(s):
Arabic Optical Character Recognition, Statistic Approach, Features Extraction, Classification, Offline Recognition.
Related Ontology Subjects/Areas/Topics:
Applications; Classification; Data Engineering; Ensemble Methods; Feature Selection and Extraction; Fuzzy Logic; Information Retrieval; Information Retrieval and Learning; Kernel Methods; Object Recognition; Ontologies and the Semantic Web; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
This paper presents a comparative study of Arabic optical character recognition techniques based on the statistical approach. We experiment with character image characterization and matching to identify the most robust and reliable techniques. For the feature extraction phase, we test invariant moments, affine moment invariants, Tsirikolias–Mertzios moments, Zernike moments, the Fourier-Mellin transform and Fourier descriptors. For the classification phase, we use k-Nearest Neighbors and Support Vector Machine. Our data collection comprises 3 datasets. The first contains 2320 multi-font and multi-scale printed samples. The second contains 9280 multi-font, multi-scale and multi-oriented printed samples. The third contains 2900 handwritten samples extracted from the IFN/ENIT database. The aim is to cover a wide spectrum of Arabic character complexity. The best recognition rates obtained on the three datasets are 99.91%, 99.26% and 66.68%, respectively.
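As a rough illustration of the kind of pipeline the abstract describes (moment-based features followed by nearest-neighbor matching), the sketch below computes the first two Hu invariant moments of a tiny binary image and classifies a query with k-Nearest Neighbors. It is not the authors' implementation: the function names `hu_moments` and `knn_predict`, the 5x5 toy images, and the choice of only two invariants are assumptions made for this example, and it uses only the Python standard library.

```python
import math

def hu_moments(img):
    """First two Hu invariant moments of an image given as a list of rows.

    Hypothetical helper for illustration; real systems typically use more
    invariants and larger, preprocessed character images.
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        raise ValueError("image has no foreground pixels")
    cx, cy = m10 / m00, m01 / m00  # centroid gives translation invariance

    def eta(p, q):
        # Central moment mu_pq, normalized by m00 for scale invariance.
        mu = sum(((x - cx) ** p) * ((y - cy) ** q) * v
                 for y, row in enumerate(img) for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return (h1, h2)

def knn_predict(train, query, k=1):
    """Majority vote among the k training samples closest to `query`.

    `train` is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy "characters": a vertical bar and a filled 3x3 square (assumed data).
vertical_bar = [[0, 0, 1, 0, 0] for _ in range(5)]
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# Query: the bar rotated by 90 degrees and shifted off-center.
horizontal_bar = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

train = [(hu_moments(vertical_bar), "bar"), (hu_moments(square), "square")]
prediction = knn_predict(train, hu_moments(horizontal_bar), k=1)
```

Because the features are invariant to translation and rotation, the rotated and shifted bar lands on the same point in feature space as the training bar, so the nearest neighbor is the "bar" sample. Handwritten data is far less regular than this toy case, which is consistent with the lower rate reported for the third dataset.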