the common presence of distortions like noise,
smudges, and skewed writing create significant
hurdles. Traditional Optical Character Recognition
(OCR) systems often stumble when faced with
handwritten text, particularly when dealing with
cursive script, multiple languages, or aged historical
documents.
Steve Britton's article is a useful starting point for
the history of how computers learned to read text in
images. It traces the evolution of OCR from its
limited early versions to the capable systems
available today. It also shows how widely OCR is
used, from offices digitizing paperwork to libraries
scanning old books, and how it has changed the way
we handle information.
Gomez and Karatzas tackled a harder problem:
recognizing text in cluttered, real-world images, such
as a street sign in a blurry photograph. Their
findings, presented at a major conference, are useful
for anyone trying to make OCR work outside
controlled laboratory conditions.
Zelic and Sable give an up-to-date overview of OCR
with a focus on Tesseract, the engine behind many
text recognition applications. Their article explains
how these systems work and what they can do in
everyday situations.
Lefèvre and Piantanida introduced Pytesseract, a
bridge between the Python programming language
and Tesseract. It makes Tesseract much easier to use
in custom projects, which is a significant benefit for
developers.
Dale and Kilgarriff discuss how important it is for
the language technology community to work
together and share resources. Their work is not
directly about OCR, but it shows how collaboration
can push the technology forward. Finally, the book by
Bird, Klein, and Loper gives a solid foundation in
how computers process language. It is not specific to
OCR, but it provides the background in language
processing that is needed to improve OCR further.
The book "Digital Image Processing" by Rafael C.
Gonzalez and Richard E. Woods (4th edition, 2018)
gives a basic understanding of image processing, an
essential part of Optical Character Recognition
(OCR). It adds to the body of literature by
describing the digital image processing methods that
support OCR technologies.
3 METHODOLOGY
3.1 Preprocessing the Input Image
First, we take the image of the handwritten text and
clean it up using OpenCV, a widely used open-source
computer vision library. This preprocessing makes the
image easier to read and helps the OCR engine
recognize the text. It involves three key steps:
• We start by converting the image to grayscale,
which reduces it to a single intensity channel
and makes it simpler to analyze.
• Then, we apply a Gaussian blur to reduce
noise, which gives a cleaner input for the
next step.
• Thresholding is a crucial step where we
convert the image to black and white. This
makes the text stand out more from the
background.
With these cleaning steps (S. Hochreiter and J.
Schmidhuber, 1997), we help the system recognize
and extract the handwritten text accurately, and we set
the stage for the next steps.
3.2 OCR with Pytesseract
After preparing the image, the next step is to read
the text. For this, we use Pytesseract, a powerful and
easy-to-use OCR tool. This step converts the text in
the image into machine-readable characters.
Pytesseract relies on Tesseract's trained models to
identify the characters in the image and return them
for further use. Pytesseract (A. Vaswani et al., 2017)
has settings that we can change to fit different kinds
of handwriting, so the system can be adapted to the
input.
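As a hedged sketch of how such settings can be passed, the example below builds a Tesseract configuration string and hands it to `pytesseract.image_to_string`. The helper names `build_tesseract_config` and `read_text` are hypothetical, and the default values (page segmentation mode 6, engine mode 3, language `eng`) are assumptions for illustration rather than the settings used in this work.

```python
def build_tesseract_config(psm: int = 6, oem: int = 3) -> str:
    """Build the option string passed to Tesseract (hypothetical helper).

    --psm selects the page segmentation mode (6 = one uniform block of text,
    7 = a single text line); --oem selects the OCR engine mode (3 = default).
    """
    return f"--oem {oem} --psm {psm}"


def read_text(image, lang: str = "eng", psm: int = 6) -> str:
    """Run OCR on a preprocessed image (a NumPy array or PIL image)."""
    # Imported here so the config helper can be used and tested on machines
    # where the Tesseract binary is not installed.
    import pytesseract

    return pytesseract.image_to_string(
        image, lang=lang, config=build_tesseract_config(psm=psm)
    )
```

Several languages can be combined by joining their codes with `+`, for example `lang="eng+fra"`, provided the corresponding Tesseract language data is installed. Which segmentation mode works best depends on the layout of the handwritten page, so it is worth trying more than one.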
3.2.1 Pytesseract
Pytesseract (A. Vaswani et al., 2017) is a Python
wrapper for Google's Tesseract-OCR engine, which
we use here to recognize handwritten text. It makes it
easier to use OCR in Python programs and lets you
choose options such as the recognition language, and
it combines well with image cleanup done
beforehand. Because it builds on Tesseract,
Pytesseract can read text from images in many
different languages and writing styles for various
tasks.