Authors: Taeka Awazu ; Manami Fukuo ; Masami Takata and Kazuki Joe

Affiliation: Nara Women's University, Japan

Keyword(s): Character Recognition, Character Clipping, Genetic Programming, Early-modern Japanese Printed Books.

Related Ontology Subjects/Areas/Topics: Applications ; Character Recognition ; Classification ; Evolutionary Computation ; Pattern Recognition ; Software Engineering ; Theory and Methods

Abstract: The website of the National Diet Library of Japan makes many early-modern (AD 1868-1945) Japanese printed books available to the public, but full-text search is essentially impossible. Advanced search of this historical literature requires automatic conversion of the page images to text. However, the ruby system, which is peculiar to Japanese books, poses a serious obstacle to this conversion, and existing OCR software achieves an extremely low recognition rate on early-modern Japanese printed books. To address this problem, we previously proposed a multi-font Kanji character recognition method using the PDC feature and an SVM. In this paper, we propose a ruby character removal method for early-modern Japanese printed books using genetic programming, and evaluate our multi-font Kanji character recognition method on 1,000 types of early-modern Japanese printed Kanji characters.

CC BY-NC-ND 4.0

Paper citation in several formats:
Awazu, T.; Fukuo, M.; Takata, M. and Joe, K. (2014). A Multi-fonts Kanji Character Recognition Method for Early-modern Japanese Printed Books with Ruby Characters. In Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-018-5; ISSN 2184-4313, SciTePress, pages 637-645. DOI: 10.5220/0004825306370645

@conference{icpram14,
author={Taeka Awazu and Manami Fukuo and Masami Takata and Kazuki Joe},
title={A Multi-fonts Kanji Character Recognition Method for Early-modern Japanese Printed Books with Ruby Characters},
booktitle={Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2014},
pages={637-645},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004825306370645},
isbn={978-989-758-018-5},
issn={2184-4313},
}

TY - CONF
JO - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - A Multi-fonts Kanji Character Recognition Method for Early-modern Japanese Printed Books with Ruby Characters
SN - 978-989-758-018-5
IS - 2184-4313
AU - Awazu, T.
AU - Fukuo, M.
AU - Takata, M.
AU - Joe, K.
PY - 2014
SP - 637
EP - 645
DO - 10.5220/0004825306370645
PB - SciTePress