THRESHOLD CORRECTION OF DOCUMENT IMAGE BINARIZATION FOR TEXT EXTRACTION

Hiroshi Tanaka, Yusaku Fujii, Yoshinobu Hotta

2011

Abstract

This paper presents a simple threshold correction method for document image binarization aimed at text extraction. The method improves the binary image of characters, which is often degraded by neighboring strong pixels or background noise. It is based on a similar method for ruled-line extraction previously presented by the author, and is claimed to be effective for text extraction as well. The paper also reports the relationship between the effectiveness of the method and the image resolution.
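
The influence of a neighboring strong pixel can be illustrated with a small sketch. It does not reproduce the threshold correction of this paper, which the abstract does not specify. It assumes a Niblack-style local threshold T = m + k * s (local mean m, local standard deviation s), and the function name niblack_binarize, the window size, and the value of k are illustrative choices only.

```python
from statistics import mean, pstdev


def niblack_binarize(image, window=3, k=-0.2):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    Assumed Niblack-style rule: a pixel is marked as text (1) when its value
    is below T = m + k * s, where m and s are the mean and standard deviation
    of the window around it. Windows are clipped at the image border.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Gather the neighborhood, clipped to the image bounds.
            neighbors = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            threshold = mean(neighbors) + k * pstdev(neighbors)
            row.append(1 if image[y][x] < threshold else 0)
        result.append(row)
    return result


# A faint stroke (150) on a light background (200) is detected when isolated.
print(niblack_binarize([[200, 200, 150, 200, 200, 200]]))  # [[0, 0, 1, 0, 0, 0]]

# Next to a strong dark pixel (0), the same faint stroke is lost, because the
# dark neighbor pulls the local threshold far below 150.
print(niblack_binarize([[200, 200, 150, 0, 200, 200]]))    # [[0, 0, 0, 1, 0, 0]]
```

In the second call the local threshold at the faint pixel falls to roughly 100, so the pixel is classified as background. A threshold correction step would aim to recover pixels of this kind.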

References

  1. Otsu, N., A Threshold Selection Method from Gray-Level Histograms. In IEEE Trans. Systems, Man, and Cybernetics, vol.9, no.1, 1979, pp.62-66.
  2. Trier, O. D., and Jain, A. K., Goal-directed Evaluation of Binarization Methods. In IEEE Trans. PAMI, vol.17, no.12, Dec. 1995, pp.1191-1201.
  3. Niblack, W., An Introduction to Digital Image Processing. Prentice Hall, Englewood Cliffs, N. J., 1986, pp.115-116.
  4. Sauvola, J., Seppanen, T., Haapakoski, S., and Pietikainen, M., Adaptive Document Binarization. In Proc. 4th ICDAR, Ulm, Germany, Aug. 1997, pp.147-152.
  5. Gatos, B., Ntirogiannis, K., and Pratikakis, I., ICDAR 2009 Document Image Binarization Contest (DIBCO 2009). In Proc. 10th ICDAR, Barcelona, Spain, Jul. 2009, pp.1375-1382.
  6. Kamada, H., and Fujimoto, K., High-speed, High-accuracy Binarization Method for Recognizing Text in Images of Low Spatial Resolutions. In Proc. 5th ICDAR, Bangalore, India, Sep. 1999, pp.139-142.
  7. Tanaka, H., Threshold Correction of Document Image Binarization for Ruled-line Extraction. In Proc. 10th ICDAR, Barcelona, Spain, Jul. 2009, pp.541-545.
  8. Eikvil, L., Taxt, T., and Moen, K., A Fast Adaptive Method for Binarization of Document Images. In Proc. 1st ICDAR, Saint-Malo, France, Sep. 1991, pp.435-443.
  9. Bernsen, J., Dynamic Thresholding of Gray-level Images. In Proc. 8th ICPR, Paris, France, Oct. 1986, pp.1251-1255.


Paper Citation


in Harvard Style

Tanaka H., Fujii Y. and Hotta Y. (2011). THRESHOLD CORRECTION OF DOCUMENT IMAGE BINARIZATION FOR TEXT EXTRACTION. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2011) ISBN 978-989-8425-47-8, pages 387-391. DOI: 10.5220/0003396503870391


in Bibtex Style

@conference{visapp11,
author={Hiroshi Tanaka and Yusaku Fujii and Yoshinobu Hotta},
title={THRESHOLD CORRECTION OF DOCUMENT IMAGE BINARIZATION FOR TEXT EXTRACTION},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2011)},
year={2011},
pages={387-391},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003396503870391},
isbn={978-989-8425-47-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2011)
TI - THRESHOLD CORRECTION OF DOCUMENT IMAGE BINARIZATION FOR TEXT EXTRACTION
SN - 978-989-8425-47-8
AU - Tanaka H.
AU - Fujii Y.
AU - Hotta Y.
PY - 2011
SP - 387
EP - 391
DO - 10.5220/0003396503870391

