ChronSeg: Novel Dataset for Segmentation of Handwritten Historical Chronicles

Josef Baloun, Pavel Král, Ladislav Lenc

2021

Abstract

The segmentation of document images plays an important role in the process of making their content electronically accessible. This work focuses on the segmentation of historical handwritten documents, namely chronicles. We consider three classes: image, text and background. For this purpose, a new dataset is created mainly from chronicles provided by Porta fontium. In total, the dataset consists of 58 images of document pages and their precise annotations for text, image and graphic regions in PAGE format. The annotations are also provided at a pixel level. Further, we present a baseline evaluation using an approach based on a fully convolutional neural network. We also perform a series of experiments in order to identify the best method configuration. These experiments include a novel data augmentation method that creates artificial pages.
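As a rough illustration of how region annotations in PAGE format relate to a pixel-level labelling, the following sketch reads region polygons from a PAGE XML string and paints them into a small label mask. It is not the authors' tooling. The class ids, the mapping of graphic regions to the image class, and the function names `read_regions` and `rasterize` are assumptions made for this example. It also assumes the PAGE variant that stores polygons in a `points` attribute on `Coords`, with integer coordinates.

```python
import xml.etree.ElementTree as ET

# Hypothetical class ids (0 = background). Mapping GraphicRegion to the
# image class is an assumption; the dataset's own encoding may differ.
CLASS_IDS = {"TextRegion": 1, "ImageRegion": 2, "GraphicRegion": 2}


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}TextRegion' -> 'TextRegion'."""
    return tag.rsplit("}", 1)[-1]


def read_regions(xml_text):
    """Return (class_id, polygon) pairs for the regions in a PAGE XML string."""
    regions = []
    for elem in ET.fromstring(xml_text).iter():
        name = local_name(elem.tag)
        if name not in CLASS_IDS:
            continue
        # Only the region's own Coords child, not those of nested text lines.
        coords = next((c for c in elem if local_name(c.tag) == "Coords"), None)
        if coords is None or not coords.get("points"):
            continue
        polygon = [
            tuple(int(v) for v in pair.split(","))
            for pair in coords.get("points").split()
        ]
        regions.append((CLASS_IDS[name], polygon))
    return regions


def inside(x, y, polygon):
    """Even-odd ray casting test for the point (x, y)."""
    hit = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def rasterize(regions, width, height):
    """Paint each polygon into a height x width mask, testing pixel centres."""
    mask = [[0] * width for _ in range(height)]
    for class_id, polygon in regions:
        for row in range(height):
            for col in range(width):
                if inside(col + 0.5, row + 0.5, polygon):
                    mask[row][col] = class_id
    return mask
```

The pure-Python rasterization is only practical for tiny examples. For full page scans a polygon-fill routine from an image library would be the usual choice.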


Paper Citation


in Harvard Style

Baloun J., Král P. and Lenc L. (2021). ChronSeg: Novel Dataset for Segmentation of Handwritten Historical Chronicles. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 314-322. DOI: 10.5220/0010317203140322


in Bibtex Style

@conference{icaart21,
author={Josef Baloun and Pavel Král and Ladislav Lenc},
title={ChronSeg: Novel Dataset for Segmentation of Handwritten Historical Chronicles},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2021},
pages={314-322},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010317203140322},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - ChronSeg: Novel Dataset for Segmentation of Handwritten Historical Chronicles
SN - 978-989-758-484-8
AU - Baloun J.
AU - Král P.
AU - Lenc L.
PY - 2021
SP - 314
EP - 322
DO - 10.5220/0010317203140322