Removal of Historical Document Degradations using Conditional GANs

Veeru Dumpala, Sheela Kurupathi, Syed Bukhari, Andreas Dengel

2019

Abstract

Document binarization is one of the most crucial problems in document analysis and the OCR pipeline. Over the past few decades, many traditional algorithms such as Sauvola, Niblack, and Otsu have been used for binarization, but they give insufficient results on historical texts with degradations. More recently, many attempts have been made to solve binarization with deep learning approaches such as autoencoders and FCNs. However, these models do not generalize well, qualitatively, to real-world historical document images. In this paper, we propose a model based on the conditional GAN, which is well known for high-resolution image synthesis. Here, the proposed model is used for an image manipulation task: removing degradations in historical documents such as stains, bleed-through, and non-uniform shading. The proposed model outperforms recent state-of-the-art models for document image binarization. We support our claims by benchmarking the proposed model on the publicly available PHIBC 2012, DIBCO (2009-2017), and Palm Leaf datasets. The main objective of this paper is to illuminate the advantages of generative modeling and adversarial training for document image binarization in a supervised setting, which shows good generalization across inter- and intra-class domain document images.
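
For readers unfamiliar with the traditional baselines named above, the following is a minimal sketch of Otsu's global thresholding in NumPy. It is not the method proposed in the paper; it only illustrates the kind of classical algorithm the abstract contrasts against. The function names `otsu_threshold` and `binarize` are hypothetical, and the input is assumed to be an 8-bit grayscale image.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    Assumes `gray` is a uint8 array (values 0..255).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the dark class at each t
    mu = np.cumsum(prob * np.arange(256))    # cumulative first moment at each t
    mu_total = mu[-1]                        # global mean intensity
    denom = omega * (1.0 - omega)
    # Between-class variance; undefined where one class is empty.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0
    return int(np.argmax(sigma_b))

def binarize(gray: np.ndarray) -> np.ndarray:
    """Map bright pixels (background) to 1 and dark pixels (ink) to 0."""
    t = otsu_threshold(gray)
    return (gray > t).astype(np.uint8)
```

Because a single threshold is applied to the whole page, a sketch like this can be expected to struggle with stains, bleed-through, and non-uniform shading, which is the limitation the abstract attributes to traditional methods.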


Paper Citation


in Harvard Style

Dumpala V., Kurupathi S., Bukhari S. and Dengel A. (2019). Removal of Historical Document Degradations using Conditional GANs. In Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-351-3, pages 145-154. DOI: 10.5220/0007367701450154


in Bibtex Style

@conference{icpram19,
author={Veeru Dumpala and Sheela Kurupathi and Syed Bukhari and Andreas Dengel},
title={Removal of Historical Document Degradations using Conditional GANs},
booktitle={Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2019},
pages={145-154},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007367701450154},
isbn={978-989-758-351-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Removal of Historical Document Degradations using Conditional GANs
SN - 978-989-758-351-3
AU - Dumpala V.
AU - Kurupathi S.
AU - Bukhari S.
AU - Dengel A.
PY - 2019
SP - 145
EP - 154
DO - 10.5220/0007367701450154