Using Automatic Features for Text-image Classification in Amharic Documents

Birhanu Belay, Tewodros Habtegebrial, Gebeyehu Belay, Didier Stricker

Abstract

In many documents, ranging from historical to modern archived documents, handwritten and machine-printed text may coexist in the same document image, which raises significant issues in the recognition process and degrades the performance of OCR applications. It is therefore necessary to discriminate between the two types of text so that the appropriate recognition technique can be applied to each. Inspired by the recent successes of CNN-based features in pattern recognition, in this paper we propose a method that discriminates handwritten from machine-printed text-lines in Amharic document images. We also demonstrate the effect of replacing the last fully connected layer with a binary support vector machine, which minimizes a margin-based loss instead of the cross-entropy loss. Based on the results observed during experimentation, using a binary SVM gives a significant improvement in discrimination performance compared to the fully connected layers.
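
The idea of swapping a cross-entropy-trained output layer for a margin-based binary classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network architecture, so random synthetic vectors stand in for the CNN features, and the function names (`hinge_loss`, `train_linear_svm`, `predict`), the label convention (-1 for handwritten, +1 for machine-printed), and all hyperparameters are assumptions made for illustration only.

```python
import numpy as np

def hinge_loss(w, b, X, y, reg=1e-3):
    """Mean hinge loss plus L2 penalty; labels y must be in {-1, +1}."""
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins).mean() + 0.5 * reg * np.dot(w, w)

def train_linear_svm(X, y, lr=0.1, epochs=200, reg=1e-3):
    """Fit a binary linear SVM by full-batch subgradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0  # samples inside or on the wrong side of the margin
        y_act = y[active][:, None]
        grad_w = reg * w - (y_act * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, X):
    """Return +1 or -1 according to the sign of the decision function."""
    return np.where(X @ w + b >= 0.0, 1, -1)

# Synthetic stand-in for CNN features (assumption: the real features would
# come from the layer just before the fully connected output layer).
rng = np.random.default_rng(0)
n_per_class, dim = 100, 16
feats_handwritten = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, dim))
feats_printed = rng.normal(loc=1.0, scale=1.0, size=(n_per_class, dim))
X = np.vstack([feats_handwritten, feats_printed])
y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])

w, b = train_linear_svm(X, y)
accuracy = (predict(w, b, X) == y).mean()
print(f"training accuracy on synthetic features: {accuracy:.3f}")
```

Unlike cross-entropy, the hinge loss is exactly zero for any sample whose margin is at least 1, so only samples near or across the decision boundary contribute to the update. This is the margin-based behaviour the abstract contrasts with the fully connected output layer.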


Paper Citation


in Harvard Style

Belay B., Habtegebrial T., Belay G. and Stricker D. (2020). Using Automatic Features for Text-image Classification in Amharic Documents. In Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-397-1, pages 440-445. DOI: 10.5220/0008940704400445


in Bibtex Style

@conference{icpram20,
author={Birhanu Belay and Tewodros Habtegebrial and Gebeyehu Belay and Didier Stricker},
title={Using Automatic Features for Text-image Classification in Amharic Documents},
booktitle={Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2020},
pages={440-445},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008940704400445},
isbn={978-989-758-397-1},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Using Automatic Features for Text-image Classification in Amharic Documents
SN - 978-989-758-397-1
AU - Belay B.
AU - Habtegebrial T.
AU - Belay G.
AU - Stricker D.
PY - 2020
SP - 440
EP - 445
DO - 10.5220/0008940704400445