Paper

Authors: Syed Saqib Bukhari (1); Ashutosh Gupta (2); Anil Kumar Tiwari (3) and Andreas Dengel (4)

Affiliations: (1) German Research Center for Artificial Intelligence, Germany; (2) German Research Center for Artificial Intelligence and IITJ-Indian Institute of Technology Jodhpur, Germany; (3) IITJ-Indian Institute of Technology Jodhpur, India; (4) German Research Center for Artificial Intelligence and Technical University Kaiserslautern, Germany

ISBN: 978-989-758-276-9

Keyword(s): Document Analysis, Historical Document Analysis, Layout Analysis, Document Image Segmentation.

Related Ontology Subjects/Areas/Topics: Applications ; Computer Vision, Visualization and Computer Graphics ; Image Understanding ; Pattern Recognition

Abstract: Layout analysis, which mainly comprises binarization and page segmentation, is one of the most important performance-determining steps of an OCR system for complex medieval document images, which contain noise, distortions and irregular layouts. In this paper, we present high-performance page segmentation techniques for medieval European document images, which include a novel main-body and side-notes segregation and an improved version of OCRopus-based (OCRopus) text line extraction. To complete the high-performance layout analysis pipeline, we also present the application of percentile-based binarization (Afzal et al., 2014) and multiresolution-morphology-based text and non-text segmentation (Bukhari et al., 2011) to historical document images. The presented layout analysis techniques are applied to a collection of 15th-century Latin document images, achieving more than 90% accuracy for each of the segmentation techniques.
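To make the idea of percentile-based binarization concrete, the following is a minimal sketch under stated assumptions, not the method of Afzal et al. (2014), whose details are not given on this page. It assumes a grayscale page stored as a list of rows of 0-255 intensities, estimates the ink and paper levels as low and high percentiles of all pixels, rescales, and thresholds. The function names (`percentile`, `binarize`), the parameter defaults, and the global (non-local) estimation are all illustrative assumptions.

```python
def percentile(values, p):
    """Return an approximate p-th percentile (0-100) of values.

    Uses the element at the rounded linear position in the sorted data;
    this is a simplification chosen for illustration.
    """
    ordered = sorted(values)
    idx = round((p / 100) * (len(ordered) - 1))
    return ordered[idx]


def binarize(image, low_pct=5, high_pct=90, threshold=0.5):
    """Binarize a grayscale image given as a list of rows of intensities.

    Output convention (an assumption of this sketch): 1 = background, 0 = ink.
    low_pct and high_pct are hypothetical defaults, not values from the paper.
    """
    pixels = [v for row in image for v in row]
    black = percentile(pixels, low_pct)   # estimated ink level
    white = percentile(pixels, high_pct)  # estimated paper level
    span = white - black
    if span == 0:
        # Uniform page: treat every pixel as background.
        return [[1 for _ in row] for row in image]
    # Rescale each pixel to roughly [0, 1] and compare with the threshold.
    return [
        [1 if (v - black) / span > threshold else 0 for v in row]
        for row in image
    ]


if __name__ == "__main__":
    page = [
        [200, 210, 30],
        [205, 20, 215],
        [220, 25, 208],
    ]
    for row in binarize(page):
        print(row)
```

Estimating the levels from percentiles rather than from the minimum and maximum makes the rescaling less sensitive to a few extreme pixels, which is relevant for noisy historical scans. A practical implementation would likely estimate the paper level locally rather than once per page.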

CC BY-NC-ND 4.0

Paper citation in several formats:
Bukhari, S.; Gupta, A.; Tiwari, A. and Dengel, A. (2018). High Performance Layout Analysis of Medieval European Document Images. In Proceedings of the 7th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-276-9, pages 324-331. DOI: 10.5220/0006574603240331

@conference{icpram18,
author={Syed Saqib Bukhari and Ashutosh Gupta and Anil Kumar Tiwari and Andreas Dengel},
title={High Performance Layout Analysis of Medieval European Document Images},
booktitle={Proceedings of the 7th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2018},
pages={324-331},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006574603240331},
isbn={978-989-758-276-9},
}

TY - CONF

JO - Proceedings of the 7th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - High Performance Layout Analysis of Medieval European Document Images
SN - 978-989-758-276-9
AU - Bukhari, S.
AU - Gupta, A.
AU - Tiwari, A.
AU - Dengel, A.
PY - 2018
SP - 324
EP - 331
DO - 10.5220/0006574603240331