Optical Character Recognition Overview

This document discusses optical character recognition (OCR) and a proposed solution for an OCR system. It provides background on OCR, including that it can recognize both handwritten and printed characters and convert them to a digital format. The proposed solution involves preprocessing the image through noise removal, segmentation of text into lines, words and characters, and using a neural network for character recognition trained on generated character samples. It discusses performing image acquisition, noise removal, normalization, tilt detection, line detection, word detection, and character detection. Limitations include issues with text separation and requiring high contrast between text and background.

Uploaded by

nancy Poonia
Copyright
© All Rights Reserved

Optical Character Recognizer

(A broader aspect of handwritten digit recognizer)

CS-16
Project Supervisor: Dr. Krishna K. Mishra
Team Members:
Payal Gupta (20188062)
Prerna Agarwal (20184066)
Nancy (20184191)
Harshit Meena (20164065)
Introduction
Text is everywhere! It is present in PDFs, documents, and
images. Many applications rely on extracting this text for
analysis, including receipt recognition, number plate
detection, and extracting LaTeX formulas from images.
Combined with a voice-operated program, OCR also lets blind
people scan books, magazines, and incoming faxes into
word processing programs with ease.
OCR stands for optical character recognition. The
technology deals with the problem of recognizing all kinds of
different characters. Both handwritten and printed characters
can be recognized and converted into a machine-readable,
digital data format.
Think of any kind of serial number or code consisting of
numbers and letters that you need digitized. By using OCR you
can transform these codes into a digital output. The technology
makes use of many different techniques. Put simply, the
captured image is processed, the characters are extracted, and
they are then recognized.
What OCR does not do is consider the actual nature of the
object that you want to scan. It simply “takes a look” at the
characters that you aim to transform into a digital format. For
example, if you scan a word, the system will learn and recognize
the letters, but not the meaning of the word.
LITERATURE REVIEW
Character recognition is not a new problem; its roots can
be traced back to systems that predate the invention of computers.
The earliest OCR systems were not computers but mechanical
devices that were able to recognize characters, though only at
low speed and with low accuracy. These early OCR systems were
criticized for their errors and slow recognition speed. Hence,
little research effort was put into the topic during the 1960s
and 1970s. The only developments took place at government
agencies and large corporations such as banks, newspapers, and
airlines. At that stage, OCR worked efficiently only with
printed text and not with handwritten text.
Documents generated on high-quality paper with modern
printing technologies allow the systems to exceed 99%
recognition accuracy. However, the recognition rate of
commercially available products depends on the age of the
documents and the quality of the paper and ink, which may
result in significant data acquisition noise. Documents with
coloured or patterned backgrounds, marked with pens, or crooked
when scanned can yield poor OCR results. Some improvement can
be made by either adjusting the scanner settings and
rescanning the document or manually correcting the electronic
data.
PROPOSED SOLUTION
The OCR is performed in the following phases:
• Image retrieval: The image should be cropped so that
only text is present. Also, the background
should be much lighter than the text. The ideal image is
black text on a white paper background.
• Preprocessing: Noise is removed by blurring. The image is
then converted to a binary image and inverted. For this
we used OpenCV methods such as Gaussian blur and
threshold.
• Segmentation: Segmentation is divided into three parts.
First, we segment the image into lines. Then the
lines are separated into words. Lastly, the words are
separated into characters. OpenCV methods such as
projections and contour detection are used. The
characters are then fed into the neural network.
• Neural network: There are two parts to the neural network.
The first is training the neural network. For training, we
first generate our own samples for each
character. We then convert those images into
NumPy arrays and combine all samples with the
corresponding labels required by the neural network.
The second is recognizing characters.
• In addition, we check each word against an
English dictionary to fix spelling errors.
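The dictionary check in the last step above can be sketched with Python's standard difflib module. This is only an illustration under stated assumptions: the report does not say which dictionary or matching method was used, so the function name correct_word, the tiny WORDS list, and the 0.75 similarity cutoff are all hypothetical choices.

```python
import difflib

# Hypothetical miniature dictionary; a real system would load a full word list.
WORDS = ["character", "optical", "recognition", "text"]


def correct_word(word, dictionary, cutoff=0.75):
    """Return the closest dictionary word, or the input unchanged if none is close.

    `cutoff` is an assumed similarity threshold between 0 and 1.
    """
    lowered = word.lower()
    if lowered in dictionary:
        return word  # already a known word, keep it as recognized
    matches = difflib.get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# A typical misrecognition: the letter "o" read as the digit "0".
print(correct_word("rec0gnition", WORDS))
```

Leaving unmatched words unchanged is a deliberate choice in this sketch, so that names and codes that are not in the dictionary are not forced onto an unrelated word.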
SIMULATIONS AND CONCLUSION
Image Processing-
1. Image Acquisition- Retrieve the image from where it is
saved on the computer.
2. Noise Removal- Use blur/smoothing.
3. Normalization- Convert each image pixel to one of two
values, either black or white.
4. Tilt Detection- Detect text pixels.
5. Line Detection- Calculate horizontal projections of the
image.
6. Word Detection- Calculate vertical projections of the image.
7. Character Detection- Create separate character images based
on contours.
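The binarization and projection steps listed above can be sketched as follows. This is a dependency-free illustration on plain Python lists, not the project's OpenCV code: the function names, the threshold of 128, and the word_gap of 3 columns are assumptions made for the example. Character detection by contours is not shown.

```python
def binarize_inverted(gray, threshold=128):
    """Normalization: dark (text) pixels become 1, light background becomes 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def horizontal_projection(binary):
    """Number of text pixels in each row."""
    return [sum(row) for row in binary]


def vertical_projection(binary):
    """Number of text pixels in each column."""
    return [sum(column) for column in zip(*binary)]


def find_runs(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, of non-empty stretches.

    Empty stretches shorter than `min_gap` are bridged, so small gaps
    between characters do not split a word.
    """
    runs, start, gap = [], None, 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def segment_lines_and_words(gray, threshold=128, word_gap=3):
    """Split a grayscale image into line bands, then each band into words."""
    binary = binarize_inverted(gray, threshold)
    result = []
    for top, bottom in find_runs(horizontal_projection(binary)):
        band = binary[top:bottom]
        words = find_runs(vertical_projection(band), min_gap=word_gap)
        result.append(((top, bottom), words))
    return result


# Tiny synthetic page: 255 is white background, 0 is black text.
W = 255
blank = [W] * 12
row_a = [W, 0, 0, W, 0, W, W, W, 0, 0, W, W]  # two words on the first line
row_b = [W, W, 0, 0, 0, W, W, W, W, W, W, W]  # one word on the second line
page = [blank, row_a, row_a, blank, row_b, blank]
print(segment_lines_and_words(page))
# [((1, 3), [(1, 5), (8, 10)]), ((4, 5), [(2, 5)])]
```

Each result entry holds the row range of a text line followed by the column ranges of the words found in it. The one-column gap inside the first word is bridged because it is narrower than word_gap, which is how the same vertical projection can separate words rather than individual characters.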
Limitations-
Text separation.
• The system cannot work with an input image consisting of
only a small amount of text and a large amount of scenery.
• The text should be darker and the background should be
brighter.

