A High Accuracy Text Detection Model of Newly Constructing and Training Strategies

Kha Nguyen, Ryosuke Odate

Abstract

Text recognition systems typically comprise two main stages: text detection and text recognition. Text detection is a prerequisite that strongly affects recognition performance. In this paper, we propose a high-accuracy model for detecting text lines on a receipt dataset. We focus on the three components that matter most for the model's performance: anchor boxes for locating text regions, backbone networks for extracting features, and a suppression method for selecting the best-fitting bounding box for each text region. Specifically, we propose a clustering method to determine anchor boxes and apply novel convolutional neural networks for feature extraction. These two points are the new construction strategies of the model. In addition, we propose a training strategy that makes the model output the angles of text lines, and we revise the bounding boxes with these angles before applying the suppression method. This strategy enables detection of skewed and downward- or upward-curved text lines. Our model outperforms the best models submitted to the ICDAR 2019 competition, with a detection rate of 98.87% (F1 score), so it can be trusted to detect text lines automatically. These strategies can also be applied flexibly to datasets from other domains.
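
The abstract does not specify the clustering method used to determine anchor boxes. As an illustration only, the sketch below shows one widely used approach: k-means over ground-truth box sizes with 1 - IoU as the distance. The function names (`iou_wh`, `cluster_anchors`), the parameters, and the sample box sizes are all hypothetical and are not taken from the paper.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes, assumed to share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def cluster_anchors(boxes, k, iters=100, seed=0):
    """Illustrative k-means over (width, height) pairs; distance is 1 - IoU.

    This is an assumed, generic procedure, not the authors' method.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct ground-truth sizes
    for _ in range(iters):
        # Assign each box to the anchor it overlaps most (smallest 1 - IoU).
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            groups[best].append(b)
        # Move each anchor to the mean size of its group; keep it if the group is empty.
        new_anchors = []
        for i, g in enumerate(groups):
            if g:
                mean_w = sum(w for w, _ in g) / len(g)
                mean_h = sum(h for _, h in g) / len(g)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[i])
        if new_anchors == anchors:  # assignments stopped changing
            break
        anchors = new_anchors
    return sorted(anchors)


# Hypothetical text-line sizes in pixels: three short lines and three long lines.
sample_boxes = [(100, 20), (110, 22), (105, 18), (300, 24), (310, 20), (290, 22)]
anchors = cluster_anchors(sample_boxes, k=2)
print(anchors)
```

With two clearly separated size groups, as in this toy data, the two anchors settle on the mean size of each group. Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is why it is a common choice for anchor selection.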

Paper Citation


in Harvard Style

Nguyen K. and Odate R. (2021). A High Accuracy Text Detection Model of Newly Constructing and Training Strategies. In Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-486-2, pages 635-642. DOI: 10.5220/0010343406350642


in Bibtex Style

@conference{icpram21,
author={Kha Nguyen and Ryosuke Odate},
title={A High Accuracy Text Detection Model of Newly Constructing and Training Strategies},
booktitle={Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2021},
pages={635-642},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010343406350642},
isbn={978-989-758-486-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - A High Accuracy Text Detection Model of Newly Constructing and Training Strategies
SN - 978-989-758-486-2
AU - Nguyen K.
AU - Odate R.
PY - 2021
SP - 635
EP - 642
DO - 10.5220/0010343406350642