HISTORICAL DOCUMENT IMAGE BINARIZATION

Carlos A. B. Mello, Adriano L. I. Oliveira, Ángel Sánchez

Abstract

The preservation and publication of historical documents are important issues that have attracted growing interest over the years. Digital media have been used to store digital versions of the documents as image files. However, these images require a large amount of storage space, since the documents are usually digitized at high resolution and in true colour for preservation purposes. To make the images easier to access, they can be converted into bi-level images. In this work we present a new method, composed of two algorithms, for the binarization of historical document images based on Tsallis entropy. The new method was compared with several other well-known thresholding algorithms and achieved the best qualitative and quantitative results when measured against the gold-standard images of the documents in terms of precision, recall, accuracy, specificity, peak signal-to-noise ratio and mean square error.
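
The six measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes both images are bi-level grids of equal size in which 1 marks ink and 0 marks paper, that ink is the positive class, and that the peak value used for the peak signal-to-noise ratio is 1. The function name binarization_metrics and the sample pixel grids are hypothetical.

```python
import math


def binarization_metrics(predicted, gold):
    """Compare a binarized image with its gold-standard image.

    Both arguments are equal-sized lists of rows holding 0 or 1.
    Assumption: 1 = ink (positive class), 0 = paper (negative class).
    """
    tp = fp = tn = fn = 0
    for pred_row, gold_row in zip(predicted, gold):
        for p, g in zip(pred_row, gold_row):
            if p == 1 and g == 1:
                tp += 1  # ink correctly kept
            elif p == 1 and g == 0:
                fp += 1  # paper wrongly marked as ink
            elif p == 0 and g == 0:
                tn += 1  # paper correctly removed
            else:
                fn += 1  # ink wrongly removed

    total = tp + fp + tn + fn

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    # Pixels are 0 or 1, so each squared error is 0 or 1 and the
    # mean square error equals the fraction of mismatched pixels.
    mse = ratio(fp + fn, total)
    # Assumed peak value of 1 for a bi-level image.
    psnr = math.inf if mse == 0 else 10 * math.log10(1.0 / mse)

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, total),
        "specificity": ratio(tn, tn + fp),
        "mse": mse,
        "psnr": psnr,
    }


if __name__ == "__main__":
    # Hypothetical 2x4 images, for illustration only.
    predicted = [[1, 1, 0, 0], [1, 0, 0, 0]]
    gold = [[1, 0, 0, 0], [1, 1, 0, 0]]
    for name, value in binarization_metrics(predicted, gold).items():
        print(f"{name}: {value:.4f}")
```

If the images were stored with 0 and 255 as pixel values, the mean square error and the peak value would both need rescaling, so figures from this sketch are comparable only under the 0/1 convention assumed here.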

Paper Citation


in Harvard Style

Mello C., Oliveira A. and Sánchez Á. (2008). HISTORICAL DOCUMENT IMAGE BINARIZATION. In Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008) ISBN 978-989-8111-21-0, pages 108-113. DOI: 10.5220/0001078201080113


in Bibtex Style

@conference{visapp08,
author={Carlos A. B. Mello and Adriano L. I. Oliveira and Ángel Sánchez},
title={HISTORICAL DOCUMENT IMAGE BINARIZATION},
booktitle={Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)},
year={2008},
pages={108-113},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001078201080113},
isbn={978-989-8111-21-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)
TI - HISTORICAL DOCUMENT IMAGE BINARIZATION
SN - 978-989-8111-21-0
AU - Mello C.
AU - Oliveira A.
AU - Sánchez Á.
PY - 2008
SP - 108
EP - 113
DO - 10.5220/0001078201080113