Analysis for Advancements of Optical Character Recognition in Handwriting Recognition
Yitao Yao
2024
Abstract
In the digital age, despite the widespread use of digital documents, many handwritten documents still need to be converted into digital formats. Handwriting recognition based on Optical Character Recognition (OCR) addresses this need by converting printed or handwritten text into machine-readable form, which improves work efficiency. This paper examines key OCR technologies, including Hidden Markov Models (HMMs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. The methodology section discusses how HMMs use probabilistic models to recognize text in noisy environments, while CNNs automatically extract features from images. RNNs and LSTMs capture temporal dependencies and context in sequential data, making them effective for recognizing continuous characters and complex text structures. The paper also explores the combination of CNNs with LSTMs for end-to-end text recognition, which further enhances OCR capabilities. The discussion highlights the strengths and limitations of these technologies: HMMs are efficient but limited in expressive power, CNNs excel at feature extraction but require large datasets, and LSTMs handle long-term dependencies well but are computationally intensive. Despite these advances, OCR still faces open challenges. This paper offers a comprehensive overview of key models in OCR technology, intended to guide future research in selecting suitable models for specific tasks and in improving accuracy and efficiency.
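As a rough illustration of the probabilistic decoding that HMM-based recognizers rely on, the sketch below implements Viterbi decoding in plain Python. It is a minimal sketch under stated assumptions, not the method used in the paper: the hidden states (the characters "a" and "o"), the observed stroke-feature symbols ("loop" and "loop_tail"), and every probability are hypothetical values invented for this example.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # Work in log space so long sequences do not underflow.
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor state for landing in state s at this step.
            best_prev = max(
                states, key=lambda p: score[p] + math.log(trans_p[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(trans_p[best_prev][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: two characters, two noisy stroke features.
states = ["a", "o"]
start_p = {"a": 0.5, "o": 0.5}
trans_p = {"a": {"a": 0.3, "o": 0.7}, "o": {"a": 0.6, "o": 0.4}}
emit_p = {
    "a": {"loop_tail": 0.8, "loop": 0.2},
    "o": {"loop_tail": 0.3, "loop": 0.7},
}

decoded = viterbi(["loop_tail", "loop", "loop_tail"], states, start_p, trans_p, emit_p)
print(decoded)
```

Because emissions are probabilistic, a feature that is misread in isolation can still be decoded correctly when the transition probabilities favor the surrounding characters, which is the sense in which HMMs tolerate noise.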
Paper Citation
in Harvard Style
Yao Y. (2024). Analysis for Advancements of Optical Character Recognition in Handwriting Recognition. In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML; ISBN 978-989-758-754-2, SciTePress, pages 70-74. DOI: 10.5220/0013487600004619
in Bibtex Style
@conference{daml24,
author={Yitao Yao},
title={Analysis for Advancements of Optical Character Recognition in Handwriting Recognition},
booktitle={Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML},
year={2024},
pages={70-74},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013487600004619},
isbn={978-989-758-754-2},
}