Authors:
Hiroshi Tanaka; Yusaku Fujii and Yoshinobu Hotta
Affiliation:
Fujitsu, Japan
Keyword(s):
Adaptive binarization, Text extraction, Thresholding, Otsu binarization, Threshold correction, Background noise, Niblack, Image resolution.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Data Manipulation; Health Engineering and Technology Applications; Human-Computer Interaction; Methodologies and Methods; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Soft Computing
Abstract:
This paper presents a simple threshold correction method for document image binarization aimed at text extraction. The method enhances the binary image of characters, which is often degraded by neighboring strong pixels or background noise. It is based on a similar method that the authors previously applied to ruled-line extraction, and it is claimed to be effective for text extraction as well. The authors also reveal the relationship between the effectiveness of the method and the image resolution.
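For orientation, the sketch below shows plain Otsu thresholding, one of the baselines named in the keywords, followed by a binarization step that accepts an additive adjustment to the threshold. It is only an illustration under stated assumptions: the abstract does not specify how the correction is computed, so the `offset` parameter is a hypothetical placeholder and not the authors' method. The function names, the flat list of 8-bit gray levels, and the convention that dark pixels are text are likewise assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the Otsu threshold t for integer gray levels in [0, levels).

    Pixels with value <= t form the dark class; the rest form the bright class.
    t is chosen to maximize the between-class variance.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class
    sum0 = 0  # sum of gray levels in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break             # bright class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold, offset=0):
    """Label pixels at or below (threshold + offset) as text (1), others as background (0).

    `offset` is a hypothetical stand-in for a threshold correction term;
    the abstract does not say how the authors derive theirs.
    """
    return [1 if p <= threshold + offset else 0 for p in pixels]


# Usage: a toy "image" with dark text pixels and a bright background
image = [10] * 50 + [200] * 50
t = otsu_threshold(image)
text_mask = binarize(image, t)
```

Keeping the correction as a separate additive term makes it easy to compare the uncorrected and corrected binary images on the same input, which is the kind of comparison the abstract implies.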