Authors: Birhanu Hailu Belay 1,2; Tewodros Habtegebrial 2; Marcus Liwicki 3; Gebeyehu Belay 1 and Didier Stricker 4,2

Affiliations: 1 Bahir Dar Institute of Technology, Bahir Dar, Ethiopia; 2 Technical University of Kaiserslautern, Kaiserslautern, Germany; 3 Lulea University of Technology, Lulea, Sweden; 4 DFKI, Augmented Vision Department, Kaiserslautern, Germany

Keyword(s): Amharic Script, Blended Attention-CTC, BLSTM, CNN, Encoder-decoder, Network Architecture, OCR, Pattern Recognition.

Abstract: In this paper, we propose a blended Attention-Connectionist Temporal Classification (CTC) network architecture for text-image recognition of a unique script, Amharic. Amharic is an indigenous Ethiopic script that uses 34 consonant characters, each with 7 vowel variants, and 50 labialized characters that are derived, with small changes, from the 34 consonant characters. The changes involve modifying the structure of a character by adding a straight line, or by shortening and/or elongating one of its main legs, including the addition of small diacritics to the right, left, top or bottom of the character. Such small changes affect the orthographic identities of characters and result in shape similarity among characters, which makes Amharic an interesting but challenging task for OCR research. Motivated by the recent success of attention mechanisms in neural machine translation tasks, we propose an attention-based CTC approach that blends the attention mechanism directly within the CTC network. The proposed model consists of an encoder module, an attention module and a transcription module in a unified framework. The efficacy of the proposed model on the Amharic language shows that the attention mechanism allows learning powerful representations by integrating information from different time steps. Our method outperforms state-of-the-art methods and achieves character error rates of 1.04% and 0.93% on the ADOCR test datasets.
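
The abstract reports results as character error rates but does not spell out the formula. A minimal sketch of the commonly used definition, assuming CER is the total Levenshtein edit distance between recognized and ground-truth strings divided by the total number of ground-truth characters, is given below. The function names `edit_distance` and `character_error_rate` are hypothetical and not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """CER in percent over a set of text lines (assumed definition, see lead-in)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of lines")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # One substituted character in a three-character Amharic word.
    print(round(character_error_rate(["ሰላም"], ["ሰለም"]), 2))
```

In the example, the second character is recognized as a different vowel variant of the same consonant, which is the kind of confusion the abstract attributes to shape similarity among characters.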

CC BY-NC-ND 4.0

Paper citation in several formats:
Belay, B.; Habtegebrial, T.; Liwicki, M.; Belay, G. and Stricker, D. (2021). A Blended Attention-CTC Network Architecture for Amharic Text-image Recognition. In Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-486-2; ISSN 2184-4313, SciTePress, pages 435-441. DOI: 10.5220/0010284204350441

@conference{icpram21,
author={Birhanu Hailu Belay and Tewodros Habtegebrial and Marcus Liwicki and Gebeyehu Belay and Didier Stricker},
title={A Blended Attention-CTC Network Architecture for Amharic Text-image Recognition},
booktitle={Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2021},
pages={435-441},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010284204350441},
isbn={978-989-758-486-2},
issn={2184-4313},
}

TY - CONF
JO - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - A Blended Attention-CTC Network Architecture for Amharic Text-image Recognition
SN - 978-989-758-486-2
IS - 2184-4313
AU - Belay, B.
AU - Habtegebrial, T.
AU - Liwicki, M.
AU - Belay, G.
AU - Stricker, D.
PY - 2021
SP - 435
EP - 441
DO - 10.5220/0010284204350441
PB - SciTePress