Pattern Recognition
CHAPTER 8: Applications of Pattern Recognition
Prepared By: Prof. Manisha C. Chandramaully
What is speech recognition?
• Speech recognition, also known as automatic speech
recognition (ASR), computer speech recognition, or
speech-to-text, is a capability which enables a program
to process human speech into a written format.
• Although the two are commonly confused, speech recognition
translates speech from a verbal format into text, whereas
voice recognition only seeks to identify an individual
user’s voice.
Key features of effective speech recognition:
• Language weighting: Improve precision by weighting specific
words that are spoken frequently (such as product names or
industry jargon), beyond terms already in the base vocabulary.
• Speaker labeling: Output a transcription that cites or tags
each speaker’s contributions to a multi-participant
conversation.
• Acoustics training: Train the system to adapt to an acoustic
environment (like the ambient noise in a call center) and to
speaker styles (like voice pitch, volume and pace).
• Profanity filtering: Use filters to identify certain words or
phrases and sanitize speech output.
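Two of the features above, language weighting and profanity filtering, can be illustrated with a minimal sketch. It assumes the recognizer has already produced candidate transcripts with scores; the function names, the boost values, and the product name "acmecloud" are hypothetical and not taken from any particular ASR product.

```python
import re

def pick_transcript(candidates, boosts):
    """Language weighting: choose the best candidate after boosting key terms.

    candidates: list of (text, score) pairs from a recognizer (assumed input).
    boosts: dict mapping a lowercase word to a bonus added to the score.
    """
    def boosted_score(item):
        text, score = item
        return score + sum(boosts.get(w, 0.0) for w in text.lower().split())
    return max(candidates, key=boosted_score)[0]

def mask_profanity(transcript, blocked_words):
    """Profanity filtering: replace each blocked whole word with asterisks."""
    if not blocked_words:
        return transcript
    # Whole-word, case-insensitive match, so longer words are left alone.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in blocked_words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "*" * len(m.group(0)), transcript)

candidates = [("open the acme cloud console", 0.50),
              ("open the acmecloud console", 0.45)]
print(pick_transcript(candidates, {"acmecloud": 0.2}))
# prints "open the acmecloud console"
print(mask_profanity("That darn printer jammed again", ["darn"]))
# prints "That **** printer jammed again"
```

Without the boost, the first candidate would win on raw score; the weighting only shifts the choice toward the domain term.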
Speech recognition algorithms:
• Natural language processing (NLP)
• Hidden Markov models (HMM)
• N-grams
• Neural networks
• Speaker Diarization (SD)
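As an illustration of the N-gram idea from the list above, here is a minimal sketch of a bigram (N = 2) language model. It estimates how likely a word is to follow the previous word by counting word pairs in a toy corpus; the corpus and function names are assumptions made for this example, and real systems use far larger corpora and smoothing.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count how often each word follows the previous one."""
    counts = {}
    for sentence in sentences:
        # "<s>" marks the start of a sentence.
        words = ["<s>"] + sentence.lower().split()
        for prev, cur in zip(words, words[1:]):
            counts.setdefault(prev, Counter())[cur] += 1
    return counts

def bigram_prob(counts, prev, cur):
    """Estimate P(cur | prev) as count(prev, cur) / count(prev, anything)."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[cur] / total if total else 0.0

corpus = ["i recognize speech", "i recognize patterns", "i like speech"]
model = train_bigrams(corpus)
print(bigram_prob(model, "recognize", "speech"))  # prints 0.5
print(bigram_prob(model, "recognize", "beach"))   # prints 0.0
```

A recognizer can use such probabilities to prefer word sequences that are plausible in the language when the audio alone is ambiguous.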
Speech recognition use cases:
• Automotive
• Technology
• Healthcare
• Security
Character recognition:
• Characters are identified using one of two
algorithms: pattern recognition or feature recognition.
In pattern recognition, the OCR program is fed examples
of text in various fonts and formats, which it compares
against the scanned document or image file to recognize
its characters.
Optical Character Recognition
• Optical Character Recognition (OCR) is the process that
converts an image of text into a machine-readable text
format.
• For example, if you scan a form or a receipt, your
computer saves the scan as an image file. You cannot
use a text editor to edit, search, or count the words in
the image file.
Why is OCR important?
• Most business workflows involve receiving information from
print media.
• Paper forms, invoices, scanned legal documents, and printed
contracts are all part of business processes.
• These large volumes of paperwork take a lot of time and
space to store and manage.
• The process requires manual intervention and can be
tedious and slow.
• OCR technology solves the problem by converting text
images into text data that can be analyzed by other
business software.
How does OCR work?
• Image acquisition: A scanner reads documents and converts
them to binary data. The OCR software analyzes the scanned
image and classifies the light areas as background and the dark
areas as text.
• Preprocessing: The OCR software first cleans the image and
removes errors to prepare it for reading. Cleaning techniques are:
1. Deskewing, or tilting the scanned document slightly, to fix
alignment issues introduced during the scan.
2. Despeckling, or removing digital image spots and smoothing
the edges of text images.
3. Cleaning up boxes and lines in the image.
4. Script recognition for multi-language OCR technology.
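The acquisition and preprocessing steps above can be sketched in a few lines. This is a minimal illustration, assuming the scan is a small grayscale grid (0 = black, 255 = white) held in nested lists; the fixed threshold of 128 and the function names are assumptions, and production OCR software uses more robust methods.

```python
def binarize(gray, threshold=128):
    """Classify dark pixels as text (1) and light pixels as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def despeckle(binary):
    """Remove isolated text pixels that have no text pixel among their 8 neighbours."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    cleaned = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0  # a lone dot is treated as a speck
    return cleaned

scan = [
    [255, 255, 255, 255, 255],
    [255,  20,  30, 255, 255],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255,  10],  # lone dark pixel in the corner
]
for row in despeckle(binarize(scan)):
    print(row)
# prints:
# [0, 0, 0, 0, 0]
# [0, 1, 1, 0, 0]
# [0, 0, 0, 0, 0]
# [0, 0, 0, 0, 0]
```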
• Text recognition: OCR software uses two main types of algorithms for
text recognition: pattern matching and feature extraction.
• Pattern matching: Pattern matching works by isolating a character image,
called a glyph, and comparing it with a stored glyph. Pattern matching
works only if the stored glyph has a similar font and scale to the
input glyph. This method works well with scanned images of documents that
have been typed in a known font.
• Feature extraction: Feature extraction breaks down or decomposes the
glyphs into features such as lines, closed loops, line direction, and line
intersections. It then uses these features to find the best match or the nearest
neighbor among its various stored glyphs.
• Postprocessing: After analysis, the system converts the extracted text data
into a computerized file. Some OCR systems can create annotated PDF files
that include both the before and after versions of the scanned document.
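The two text recognition approaches above can be compared on toy data. The sketch below assumes glyphs are tiny 5x5 binary grids with three stored templates; the chosen features (closed-loop count plus row and column ink totals) are a simplified stand-in for the line and loop features described above, and all names are hypothetical.

```python
def parse(rows):
    """Turn strings of '0'/'1' into a grid of ints."""
    return [[int(c) for c in row] for row in rows]

def match_by_pattern(glyph, templates):
    """Pattern matching: pick the template with the most identical pixels."""
    def agreement(name):
        t = templates[name]
        return sum(a == b for gr, tr in zip(glyph, t) for a, b in zip(gr, tr))
    return max(templates, key=agreement)

def count_holes(glyph):
    """Count closed loops: background regions that never reach the border."""
    h, w = len(glyph), len(glyph[0])
    seen = [[False] * w for _ in range(h)]
    holes = 0
    for sy in range(h):
        for sx in range(w):
            if glyph[sy][sx] != 0 or seen[sy][sx]:
                continue
            stack, touches_border = [(sy, sx)], False
            seen[sy][sx] = True
            while stack:  # flood fill one background region
                y, x = stack.pop()
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_border = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and glyph[ny][nx] == 0):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if not touches_border:
                holes += 1
    return holes

def features(glyph):
    """Feature vector: loop count, then ink per row, then ink per column."""
    rows = [sum(r) for r in glyph]
    cols = [sum(c) for c in zip(*glyph)]
    return [count_holes(glyph)] + rows + cols

def match_by_features(glyph, templates):
    """Feature extraction: nearest neighbour by squared Euclidean distance."""
    f = features(glyph)
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(f, features(templates[name])))
    return min(templates, key=distance)

templates = {
    "O": parse(["01110", "10001", "10001", "10001", "01110"]),
    "I": parse(["00100", "00100", "00100", "00100", "00100"]),
    "L": parse(["10000", "10000", "10000", "10000", "11111"]),
}
# An "O" with one extra ink pixel in the top-left corner.
noisy = parse(["11110", "10001", "10001", "10001", "01110"])
print(match_by_pattern(noisy, templates))   # prints "O"
print(match_by_features(noisy, templates))  # prints "O"
```

Pixel-by-pixel comparison needs the input and template grids to share size and font, which mirrors the limitation noted above; the feature vector is less tied to exact pixel positions.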
Types of OCR
• Simple optical character recognition software
• Intelligent character recognition software
• Intelligent word recognition
• Optical mark recognition
OCR Applications:
• Banking
• Healthcare
• Logistics
Scene Analysis
• How to Analyze a Scene?
• While you can analyze an entire film, you can also
choose a scene from the movie and break it down even
further. Before you choose a scene you want to analyze,
watch the entire film first so you can understand what’s
happening. Go over the scene you want to analyze
multiple times so you can pick out the details and take
notes on it. Once you have your notes, you can write a
formal analysis essay about the scene.