Computer Vision R22
INTRODUCTION
Image Processing Foundations
Digital image processing means processing a digital image by means of a digital computer. We
can also say that it is the use of computer algorithms either to obtain an enhanced image or to
extract some useful information from it.
Digital image processing is the use of algorithms and mathematical models to process and
analyze digital images. The goal of digital image processing is to enhance the quality of images,
extract meaningful information from images, and automate image-based tasks.
The basic steps involved in digital image processing are:
1. Image acquisition: This involves capturing an image using a digital camera or scanner,
or importing an existing image into a computer.
2. Image enhancement: This involves improving the visual quality of an image, such as
increasing contrast, reducing noise, and removing artifacts.
3. Image restoration: This involves removing degradation from an image, such as blurring,
noise, and distortion.
4. Image segmentation: This involves dividing an image into regions or segments, each of
which corresponds to a specific object or feature in the image.
5. Image representation and description: This involves representing an image in a way that
can be analyzed and manipulated by a computer, and describing the features of an image
in a compact and meaningful way.
6. Image analysis: This involves using algorithms and mathematical models to extract
information from an image, such as recognizing objects, detecting patterns, and
quantifying features.
7. Image synthesis and compression: This involves generating new images or compressing
existing images to reduce storage and transmission requirements.
Digital image processing is widely used in a variety of applications, including medical
imaging, remote sensing, computer vision, and multimedia.
Image processing mainly includes the following steps:
1. Importing the image via image acquisition tools;
2. Analysing and manipulating the image;
3. Output, in which the result can be an altered image or a report based on the analysis of
that image.
What is an image?
An image is defined as a two-dimensional function, F(x,y), where x and y are spatial
coordinates, and the amplitude of F at any pair of coordinates (x,y) is called the intensity of
the image at that point. When x, y, and the amplitude values of F are all finite, discrete
quantities, we call it a digital image.
In other words, an image can be defined by a two-dimensional array specifically arranged in
rows and columns.
A digital image is composed of a finite number of elements, each of which has a
particular value at a particular location. These elements are referred to as picture
elements, image elements, and pixels. Pixel is the term most widely used to denote the
elements of a digital image.
Sampling: Digitizing the coordinate values (the spatial resolution) to create a grid of
discrete locations.
Quantization: Digitizing the amplitude (intensity or color values) at each location into
a finite number of discrete levels, typically represented by a certain number of bits (e.g.,
8-bit grayscale offers 256 shades).
Each element in the resulting grid is called a pixel (picture element).
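The two digitization steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the unit-square domain, and the `ramp` test image are assumptions made for the example.

```python
def quantize(value, bits):
    """Map an intensity in [0.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    value = min(max(value, 0.0), 1.0)            # clamp to the valid range
    return min(int(value * levels), levels - 1)  # 1.0 maps to the top level


def sample_and_quantize(f, rows, cols, bits):
    """Sample f(x, y) at the pixel centers of a rows x cols grid, then quantize."""
    return [
        [quantize(f((r + 0.5) / rows, (c + 0.5) / cols), bits) for c in range(cols)]
        for r in range(rows)
    ]


def ramp(x, y):
    """Hypothetical continuous image: brightness grows from left to right."""
    return y


# Sampling fixes the grid (2 x 4); quantization fixes the levels (8 bits -> 0..255).
image = sample_and_quantize(ramp, rows=2, cols=4, bits=8)
print(image)  # [[32, 96, 160, 224], [32, 96, 160, 224]]
```

Raising `rows` and `cols` increases spatial resolution, while raising `bits` increases the number of distinguishable intensity levels.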
Types of an image
1. BINARY IMAGE- A binary image, as its name suggests, contains only two pixel
values, 0 and 1, where 0 refers to black and 1 refers to white. This image is also
known as Monochrome.
2. BLACK AND WHITE IMAGE- An image that consists of only black and white
color is called a BLACK AND WHITE IMAGE.
3. 8 bit COLOR FORMAT- It is the most famous image format. It has 256 different
shades in it and is commonly known as a Grayscale Image. In this format, 0 stands
for black, 255 stands for white, and 127 stands for gray.
4. 16 bit COLOR FORMAT- It is a color image format. It has 65,536 different colors in
it. It is also known as High Color Format. In this format the distribution of color is not
the same as in a Grayscale image.
A 16 bit format is actually divided into three channels, Red, Green and Blue: the
familiar RGB format.
Image as a Matrix
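As we know, images are represented in rows and columns, so an image with M rows and N
columns can be written as the following matrix:
$$f(x,y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & \vdots & \ddots & \vdots \\ f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1) \end{bmatrix}$$
The right side of this equation is a digital image by definition. Every element of this matrix is
called an image element, picture element, or pixel.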
DIGITAL IMAGE REPRESENTATION IN MATLAB:
In MATLAB, indexing starts from 1 instead of 0. Therefore, f(1,1) in MATLAB corresponds to
f(0,0) in the zero-based notation. The two representations of an image are thus identical,
except for the shift in origin.
In MATLAB, matrices are stored in variables such as X, x, input_image, and so on. As in other
programming languages, a variable name must begin with a letter.
PHASES OF IMAGE PROCESSING:
1.ACQUISITION- It could be as simple as being given an image which is already in digital form.
The main work involves:
a) Scaling
b) Color conversion (RGB to Gray or vice-versa)
2.IMAGE ENHANCEMENT- It is amongst the simplest and most appealing areas of
Image Processing. It is also used to extract hidden details from an image, and it is
subjective.
3.IMAGE RESTORATION- It also deals with improving the appearance of an image, but it is
objective (restoration is based on a mathematical or probabilistic model of image degradation).
4.COLOR IMAGE PROCESSING- It deals with pseudocolor and full color image
processing; color models are applicable to digital image processing.
5.WAVELETS AND MULTI-RESOLUTION PROCESSING- It is the foundation for
representing images at various degrees of resolution.
6.IMAGE COMPRESSION- It involves developing functions to perform this
operation. It mainly deals with image size or resolution.
7.MORPHOLOGICAL PROCESSING- It deals with tools for extracting image
components that are useful in the representation and description of shape.
8.SEGMENTATION PROCEDURE- It includes partitioning an image into its constituent
parts or objects. Autonomous segmentation is the most difficult task in Image Processing.
9.REPRESENTATION & DESCRIPTION- It follows the output of the segmentation stage;
choosing a representation is only part of the solution for transforming raw data into processed
data.
10.OBJECT DETECTION AND RECOGNITION- It is a process that assigns a label to an
object based on its descriptor.
OVERLAPPING FIELDS WITH IMAGE PROCESSING
According to block 1, if the input is an image and we get an image as output, it is termed
Digital Image Processing.
According to block 2, if the input is an image and we get some kind of information or
description as output, it is termed Computer Vision.
According to block 3, if the input is some description or code and we get an image as output,
it is termed Computer Graphics.
According to block 4, if the input is a description, some keywords, or some code and we get a
description or some keywords as output, it is termed Artificial Intelligence.
Advantages of Digital Image Processing:
1. Improved image quality: Digital image processing algorithms can improve the visual
quality of images, making them clearer, sharper, and more informative.
2. Automated image-based tasks: Digital image processing can automate many image-
based tasks, such as object recognition, pattern detection, and measurement.
3. Increased efficiency: Digital image processing algorithms can process images much
faster than humans, making it possible to analyze large amounts of data in a short
amount of time.
4. Increased accuracy: Digital image processing algorithms can provide more accurate
results than humans, especially for tasks that require precise measurements or
quantitative analysis.
Disadvantages of Digital Image Processing:
1. High computational cost: Some digital image processing algorithms are
computationally intensive and require significant computational resources.
2. Limited interpretability: Some digital image processing algorithms may produce results
that are difficult for humans to interpret, especially for complex or sophisticated
algorithms.
3. Dependence on quality of input: The quality of the output of digital image processing
algorithms is highly dependent on the quality of the input images. Poor quality input
images can result in poor quality output.
4. Limitations of algorithms: Digital image processing algorithms have limitations, such
as the difficulty of recognizing objects in cluttered or poorly lit scenes, or the inability
to recognize objects with significant deformations or occlusions.
5. Dependence on good training data: The performance of many digital image processing
algorithms is dependent on the quality of the training data used to develop the
algorithms. Poor quality training data can result in poor performance of the algorithm.
Applications
Image processing fundamentals are applied across diverse fields, including:
Medical Imaging: Analyzing X-rays, MRIs, and CT scans for diagnostics.
Remote Sensing: Monitoring the environment and mapping land use via satellite
imagery.
Security and Surveillance: Object recognition and monitoring of live video feeds.
Agriculture: Identifying plant diseases, sorting fruits, and precision farming.
Review of image processing techniques
Purpose of Image processing
The purpose of image processing is divided into 5 groups. They are :
Visualization - Observe the objects that are not visible.
Image sharpening and restoration - To create a better image.
Image retrieval - Search for the image of interest.
Measurement of pattern - Measures various objects in an image.
Image Recognition - Distinguish the objects in an image.
1. What is the role of thresholding in image processing?
Thresholding is a fundamental image processing technique used for image segmentation. Its primary
role is to convert a grayscale image into a binary image by separating pixels into two groups (typically
foreground and background) based on their intensity levels relative to a predefined constant.
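As a minimal sketch of global thresholding, assuming an image stored as a list of rows of integer intensities (the function name and the threshold value 127 are illustrative choices, not part of the notes):

```python
def threshold(image, t):
    """Return a binary image: 1 where intensity > t (foreground), else 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


gray = [
    [12, 40, 200],
    [90, 130, 250],
]
binary = threshold(gray, t=127)
print(binary)  # [[0, 0, 1], [0, 1, 1]]
```

In practice the constant is often chosen from the image histogram rather than fixed in advance.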
2. How does noise affect edge detection?
Noise consists of random variations in image intensity that can be mistaken for actual edges, leading
to false edge detection. Because edge detection relies on calculating intensity gradients (derivatives),
noise can significantly distort these calculations, making it difficult to distinguish true boundaries
from background interference.
3. What is the role of Fourier descriptors in object recognition?
Fourier descriptors are used as boundary descriptors to represent the shape of an object in the
frequency domain. Their role in object recognition is to provide a way to describe shapes that is
often invariant to rotation, scale, and translation, allowing the system to identify objects regardless
of their orientation or size.
4. Why is occlusion handling important in shape analysis?
Occlusion occurs when an object is partially hidden by another element in the scene. Handling
occlusion is critical because it allows a computer vision system to recognize or track objects even
when their full shape is not visible, preventing the system from failing when a boundary is
incomplete.
5. What is the basic concept of the Hough Transform?
The basic concept of the Hough Transform is to detect geometric primitives (such as lines, circles, or
ellipses) by mapping edge points from the image space into a parameter space. It uses a voting
procedure where each point "votes" for all possible shapes it could belong to; the parameters with
the highest votes indicate the detected shapes.
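The voting idea can be sketched for straight lines using the normal parameterization rho = x cos(theta) + y sin(theta). This is an illustrative sketch only: the one-degree angle step, the rounding of rho to whole pixels, and the function name are assumptions.

```python
import math
from collections import Counter


def hough_lines(edge_points, theta_steps=180):
    """Accumulate votes in (rho, theta-index) space for a list of (x, y) edge points."""
    votes = Counter()
    for x, y in edge_points:
        for k in range(theta_steps):
            theta = math.pi * k / theta_steps
            # Each point votes for every line (rho, theta) that passes through it.
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, k)] += 1
    return votes


# Ten collinear edge points on the horizontal line y = 5.
points = [(x, 5) for x in range(10)]
votes = hough_lines(points)
best = max(votes.values())
print(best, votes[(5, 90)])  # the cell rho = 5, theta = 90 degrees collects all ten votes
```

Because the accumulator is coarse, neighboring cells can tie with the true peak, so practical implementations usually apply some form of peak smoothing or non-maximum suppression.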
6. Edge detection techniques and their applications in image processing.
Edge Detection: Techniques and Applications
Edge detection is a fundamental tool in image processing, machine vision, and computer
vision. It refers to the process of identifying and locating sharp discontinuities in an image,
which typically reflect important events and changes in properties of the world, such as
boundaries between objects.
1. Common Edge Detection Techniques
Edge detection involves several mathematical approaches to identify pixels where image
brightness changes sharply.
Classical Filtering (Gradient-Based): These methods calculate the first-order
derivative (gradient) of the image intensity.
o Sobel Operator: Uses two $3 \times 3$ kernels to calculate horizontal and
vertical gradients, providing a degree of smoothing to reduce noise.
o Prewitt Operator: Similar to Sobel but uses uniform weights in its kernels,
so it provides less smoothing.
Laplacian (Second-Order Derivative): This technique searches for zero-crossings in
the second derivative of the image intensity to find the exact location of an edge.
Canny Edge Detector: Widely regarded as a standard edge detector, it involves a
multi-stage process:
1. Noise Reduction: Smoothing the image with a Gaussian filter.
2. Gradient Calculation: Finding intensity gradients.
3. Non-Maximum Suppression: Thinning the edges to 1-pixel width.
4. Hysteresis Thresholding: Using two thresholds to connect "strong" and
"weak" edges while discarding noise.
2. Role of Morphology in Edge Detection
Mathematical morphology is often used alongside edge detection to refine results.
Dilation and Erosion: These operations can bridge gaps in detected edges or remove
small noise artifacts that were incorrectly labeled as edges.
Skeletons and Thinning: These techniques reduce a detected edge or shape to its
one-pixel thick representation, which is essential for shape analysis and object
recognition.
3. Applications in Image Processing
Edge detection serves as a building block for more complex computer vision tasks.
Shape Analysis and Object Recognition: By extracting the boundaries of objects,
systems can perform shape modeling and recognition. For example, Fourier
descriptors or moments can be applied to the detected edges to identify specific
shapes.
Feature Extraction (Hough Transform): Edge detection is the first step for the Hough
Transform. Once edges are found, the transform can detect specific geometric shapes
like lines, circles (e.g., human iris location), or ellipses.
In-Vehicle Vision Systems: Edge detection is used to locate the roadway and road markings
and to identify road signs by isolating their distinct shapes against the background.
Medical Imaging: Used for hole detection or identifying the boundaries of organs
and tumors in 2D or 3D reconstructions.
Surveillance and Tracking: Edge-based features help in foreground-background
separation and tracking pedestrians or moving objects in surveillance feeds.
7. Explain corner and interest point detection.
Corner and Interest Point Detection
Corner and interest point detection is a fundamental step in computer vision used to identify
specific, repeatable points in an image that can be robustly tracked or matched. These points
represent locations where the image signal varies significantly in multiple directions.
1. Core Concept of Interest Points
An interest point (or feature point) is a point in an image that has a well-defined position
and is stable under local and global perturbations, such as changes in illumination or
brightness.
Edge vs. Corner: While an edge point only shows a change in intensity in one
direction (perpendicular to the edge), a corner point shows significant intensity
changes in all directions.
Invariance: A good interest point detector should be invariant to rotation, scale, and
translation, meaning the same point should be detected even if the camera moves or
the object rotates.
2. Mathematical Foundations
The detection of corners often relies on mathematical derivatives of the image intensity.
Second Moment Matrix: Most detectors analyze the local distribution of gradients. If
the eigenvalues of the local gradient matrix are both large, the point is classified as a
corner.
Harris Corner Detector: One of the most common classical techniques, it computes a
"cornerness" score from the second moment matrix, which measures how much the
intensity changes when a small window is shifted.
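A small sketch of the Harris cornerness score R = det(M) - k * trace(M)^2, where M is the second moment matrix summed over a window. The unweighted 3x3 window, the central-difference gradients, and k = 0.04 are simplifying assumptions for illustration; practical detectors typically use Gaussian weighting.

```python
def harris_response(image, r, c, k=0.04):
    """Harris score at (r, c); the pixel must lie at least 2 pixels from the border."""
    sxx = syy = sxy = 0.0
    for i in range(r - 1, r + 2):
        for j in range(c - 1, c + 2):
            ix = (image[i][j + 1] - image[i][j - 1]) / 2.0  # horizontal gradient
            iy = (image[i + 1][j] - image[i - 1][j]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy   # product of the two eigenvalues
    trace = sxx + syy             # sum of the two eigenvalues
    return det - k * trace * trace


# 9x9 test image: a bright square occupying rows >= 4 and columns >= 4.
img = [[1 if (i >= 4 and j >= 4) else 0 for j in range(9)] for i in range(9)]

corner = harris_response(img, 4, 4)  # corner of the square: positive score
edge = harris_response(img, 6, 4)    # straight edge: negative score
flat = harris_response(img, 2, 2)    # uniform region: zero score
print(corner, edge, flat)
```

The sign pattern matches the eigenvalue rule above: two large eigenvalues give a positive score, one large eigenvalue gives a negative score, and a flat region gives zero.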
3. Role in the Computer Vision Pipeline
Interest points serve as the "anchor points" for more complex algorithms:
Feature Collation: Once interest points are detected, they are grouped or "collated"
to identify larger structures or objects.
GHT (Generalized Hough Transform): Interest points can be used as input for the
GHT to locate arbitrary shapes by looking at the spatial relationship between
detected points.
Point-based Representation: In 3D vision, objects are often represented as a
collection of 3D interest points derived from 2D images.
4. Key Applications
Detecting these points is essential for several practical computer vision tasks:
3D Object Recognition: Identifying a specific 3D object in a 2D image by matching its
interest points to a known model.
Motion Analysis & Triangulation: Tracking interest points across multiple video
frames allows for the calculation of motion and the reconstruction of 3D scenes
through triangulation.
Image Alignment (Photo Albums): Used in applications like automated photo albums
to align and stitch images by matching common interest points.
Face Recognition: Interest points on the face (eyes, nose, mouth corners) are used to
build Eigenfaces or Active Appearance Models for identification.
8. Describe boundary tracking procedures and their applications.
Boundary Tracking Procedures and Their Applications
Boundary tracking (also known as contour following) is a technique used in digital image
processing to extract the outer contour of an object from a binary or segmented image.
Unlike simple edge detection which identifies all local intensity changes, boundary tracking
follows a sequential path along the edge of a specific region to describe its shape.
1. The Basic Procedure of Boundary Tracking
Boundary tracking algorithms generally follow a step-by-step logical flow to trace the
perimeter of a shape:
Starting Point Identification: The algorithm scans the image (usually from top-to-
bottom, left-to-right) until it hits the first pixel belonging to an object (a foreground
pixel).
Neighbor Searching: Once the first pixel is found, the algorithm examines its
neighboring pixels (using 4-connectivity or 8-connectivity) to find the next pixel that
lies on the boundary.
Directional Movement: The algorithm moves to the next boundary pixel and updates
its "current direction." It continues searching for neighbors in a clockwise or counter-
clockwise manner to ensure the entire outer perimeter is traversed.
Termination Criterion: The process repeats until the algorithm returns to the initial
starting pixel, effectively closing the loop and defining the complete boundary.
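The four steps above can be sketched with Moore-neighbor tracing on a binary image stored as a list of rows. This is one common way to realize the procedure, assumed here for illustration; the simple return-to-start stopping rule is the one described above and can stop early on some unusual shapes, so more robust variants use a stricter criterion.

```python
# Moore neighborhood in clockwise order (rows grow downward), starting at west.
RING = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_boundary(image):
    """Return the outer boundary of the first object found, as (row, col) pixels."""
    rows, cols = len(image), len(image[0])

    def is_foreground(r, c):
        return 0 <= r < rows and 0 <= c < cols and image[r][c] == 1

    # Starting point: first foreground pixel in a top-to-bottom, left-to-right scan.
    start = next(
        ((r, c) for r in range(rows) for c in range(cols) if image[r][c] == 1), None
    )
    if start is None:
        return []

    boundary = [start]
    current = start
    backtrack = (start[0], start[1] - 1)  # west neighbor is background by scan order
    for _ in range(8 * rows * cols):      # safety bound on the number of moves
        idx = RING.index((backtrack[0] - current[0], backtrack[1] - current[1]))
        # Neighbor search: sweep clockwise from the backtrack pixel.
        for step in range(1, 9):
            dr, dc = RING[(idx + step) % 8]
            candidate = (current[0] + dr, current[1] + dc)
            if is_foreground(*candidate):
                pr, pc = RING[(idx + step - 1) % 8]
                backtrack = (current[0] + pr, current[1] + pc)  # last background pixel seen
                current = candidate
                break
        else:
            break  # isolated pixel: no foreground neighbor at all
        if current == start:  # termination: back at the starting pixel
            break
        boundary.append(current)
    return boundary


block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(trace_boundary(block))
```

For the 3x3 block, the trace visits the eight perimeter pixels clockwise and never enters the interior pixel (2, 2).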
2. Advanced Techniques in Shape Analysis
Boundary tracking often serves as the foundation for more sophisticated shape modeling
methods:
Chain Codes: As the boundary is tracked, the sequence of movements (e.g., 0 for
East, 2 for North in the 8-direction code) is recorded as a chain code. This provides a
compact, numerical representation of the shape.
Active Contours (Snakes): These are "energy-minimizing" splines that deform
iteratively to lock onto object edges, even if the edges are slightly noisy or blurry.
Deformable Shape Analysis: This involves tracking boundaries that may change over
time or vary between different instances of the same object class.
3. Applications of Boundary Tracking
Tracking the boundary is essential for high-level recognition and measurement tasks:
Object Labeling and Counting: By tracking and closing boundaries, a system can
distinguish individual objects in a crowded scene, allowing for accurate size filtering
and counting.
Feature Extraction for Recognition: The tracked boundary is used to calculate
Fourier descriptors or centroidal profiles, which allow the system to recognize
objects regardless of their orientation.
Medical and Biological Analysis: Boundary tracking is used for human iris location
and hole detection within biological structures, where precise perimeter
measurements are required.
Surveillance and Motion: In surveillance, tracking the boundary of a moving person
allows for human gait analysis, that is, analyzing the way a person walks to identify
them or their intent.
Handling Occlusion: Sophisticated tracking procedures can help estimate the "true"
boundary of an object even when part of it is hidden by another object.
4. Summary for Shape Analysis
Boundary tracking is a bridge between low-level pixel processing and high-level 3D object
recognition. By converting a collection of pixels into a structured boundary, computer vision
systems can perform complex tasks like 3D shape modeling and active appearance
modeling of faces.
9. How do active contours help in shape recognition?
Active Contours in Shape Recognition
Active contours, often referred to as "Snakes," are energy-minimizing splines used in
computer vision to delineate object boundaries from noisy 2D images. They play a vital role
in shape recognition by providing a flexible and accurate way to describe complex
geometries.
1. Basic Mechanism (The "Snake" Concept)
The fundamental idea behind an active contour is to place a deformable curve near an
object and let it "evolve" or shrink until it fits the object's boundary. This evolution is driven
by an energy minimization process:
Internal Energy: Controls the smoothness and continuity of the curve, preventing it
from developing sharp corners or breaking.
External (Image) Energy: Pulls the curve toward image features like edges, lines, or
boundaries.
Constraint Energy: Allows for user interaction or high-level information to guide the
contour.
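The three energy terms are commonly combined into a single functional over the curve. The form below is the usual textbook formulation, given here as an illustration; the symbols $v(s)$, $\alpha(s)$, and $\beta(s)$ are notation introduced for this sketch and do not appear in the notes.

```latex
% v(s) = (x(s), y(s)) is the contour, parameterized by s in [0, 1].
E_{\text{snake}} = \int_0^1 \Big[ E_{\text{int}}\big(v(s)\big)
                 + E_{\text{image}}\big(v(s)\big)
                 + E_{\text{con}}\big(v(s)\big) \Big] \, ds,
\qquad
E_{\text{int}} = \tfrac{1}{2} \Big( \alpha(s)\, \lvert v'(s) \rvert^{2}
               + \beta(s)\, \lvert v''(s) \rvert^{2} \Big)
```

Here $\alpha(s)$ penalizes stretching (continuity) and $\beta(s)$ penalizes bending (smoothness), which is how the internal energy discourages sharp corners and breaks.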
2. How They Facilitate Shape Recognition
Active contours are particularly powerful for shape recognition because they bridge the gap
between low-level pixels and high-level shape models:
Handling Incomplete Data: Unlike simple edge detection, active contours can
"bridge" gaps in boundaries caused by noise or low contrast, creating a continuous
shape model that is easier for a computer to recognize.
Deformable Shape Analysis: They are essential for deformable shape analysis,
where the object's shape may vary (e.g., a hand in different positions). The contour
adapts to these variations while maintaining the overall topology of the object.
Generating Shape Descriptors: Once the active contour has settled on a boundary,
the resulting curve can be used to generate boundary descriptors, such as Fourier
descriptors or centroidal profiles, which are then matched against a database for
recognition.
Active Appearance Models: In advanced applications like face recognition, active
contours help build active appearance and 3D shape models of faces, allowing the
system to identify individuals despite changes in expression or pose.
3. Key Applications
Active contours are used in various specialized fields mentioned in the course:
Medical Imaging: Identifying the boundaries of organs or tumors where edges might
be fuzzy.
Face Recognition: Specifically through active appearance models to map facial
features.
Surveillance: Tracking the boundary of a moving person for human gait analysis.
Shape from Shading/Texture: Assisting in identifying surface representations in 3D
vision.
10. Describe the foot-of-normal method for line localization.
11. Explain Hough Transform-based circular object detection.
1. Define edge detection.
Edge detection is a mathematical technique used to identify and locate points in an image
where the image brightness changes sharply. It typically identifies discontinuities in
intensity, such as those caused by object boundaries or surface texture.
2. What is the role of image thresholding in object segmentation?
The role of thresholding is to simplify an image by converting it into a binary form,
separating the foreground (objects of interest) from the background. It is a foundational step
in object labeling and counting by isolating regions based on intensity values.
3. What is the purpose of skeletonization in shape analysis?
Skeletonization is a thinning process used to reduce a shape to a one-pixel thick central
spine (skeleton) while preserving the original object's topology and connectivity. This
simplifies the shape, making it easier to analyze the structure and connectivity of the object.
4. Explain how chain codes represent object boundaries.
Chain codes represent boundaries by recording a sequence of unit-length segments in
specific directions (e.g., 4-connectivity or 8-connectivity) starting from a specific pixel. This
provides a compact, coordinate-independent numerical description of the object's perimeter.
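A short sketch of 8-direction chain coding for an ordered, closed boundary. The direction numbering (0 = East, counting counter-clockwise, so 2 = North) is the common Freeman convention, assumed here; rows are taken to grow downward, so North means a decreasing row index.

```python
# (row step, column step) -> 8-direction code; 0 = East, 2 = North, 4 = West, 6 = South.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(boundary):
    """Encode a closed boundary (ordered list of adjacent (row, col) pixels)."""
    closed = boundary[1:] + boundary[:1]  # pair each pixel with its successor
    return [DIRECTIONS[(r1 - r0, c1 - c0)] for (r0, c0), (r1, c1) in zip(boundary, closed)]


# A 2x2 square traversed clockwise from its top-left pixel.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(chain_code(square))  # [0, 6, 4, 2]: East, South, West, North
```

Only the starting pixel needs absolute coordinates; the codes themselves depend on the moves alone, which is what makes the description independent of where the object sits in the image.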
5. Define the foot-of-normal method in line detection.
6. Explain edge detection methods and their advantages and
disadvantages.
Edge Detection Methods: Techniques, Advantages, and Disadvantages
Edge detection is a primary tool in image processing used to identify areas in a digital image
where the brightness changes sharply. These discontinuities often represent object
boundaries, surface markings, or changes in scene depth.
1. Classical Gradient-Based Operators (Sobel and Prewitt)
These methods use first-order derivatives to find the maximum change in intensity.
Mechanism: They apply $3 \times 3$ kernels to the image to calculate horizontal and
vertical gradients.
Advantages:
o Simple to implement and computationally inexpensive.
o Effective for images with high contrast and low noise.
Disadvantages:
o Very sensitive to noise.
o Produces thick edges that may require further processing to thin.
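The Sobel mechanism can be sketched as follows, assuming a grayscale image stored as a list of rows. Border pixels are simply left at zero here, which is one of several possible border policies, and the kernels are applied by correlation (no kernel flip), which does not change the gradient magnitude.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical intensity change


def sobel_magnitude(image):
    """Gradient magnitude at each interior pixel; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * pixel
                    gy += SOBEL_Y[dr + 1][dc + 1] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(step)
print(edges[1])  # [0.0, 40.0, 40.0, 0.0]
```

Both interior columns respond to the single step, which illustrates the "thick edges" disadvantage listed above.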
2. Laplacian (Second-Order Derivative)
The Laplacian operator finds edges by locating zero-crossings in the second derivative of the
image intensity.
Mechanism: It uses a single mask to calculate the rate of change of the gradient.
Advantages:
o Can detect the exact center of an edge (zero-crossing).
o Isotropic, meaning it responds equally to edges in any direction.
Disadvantages:
o Extremely sensitive to noise, often producing "ghost edges."
o Does not provide information about the direction of the edge.
3. Canny Edge Detector
Often cited as the optimal edge detector, it follows a multi-stage mathematical approach.
Mechanism:
1. Gaussian Filtering: Smooths the image to remove noise.
2. Gradient Calculation: Finds the edge strength and direction.
3. Non-Maximum Suppression: Thins the edges down to 1-pixel width.
4. Hysteresis Thresholding: Uses two thresholds to keep strong edges, keep
weak edges that connect to them, and discard the rest.
Advantages:
o High signal-to-noise ratio.
o Excellent localization (edges are marked exactly where they exist).
o Minimal response: it marks each edge only once.
Disadvantages:
o Computationally intensive and slower than simple operators.
o Complexity in choosing the correct low and high threshold values.
4. Mathematical Morphology and Interest Points
Beyond simple filtering, mathematical morphology and interest point detection help refine
edge data.
Mechanism: Operations like dilation and erosion are used to bridge gaps or remove
isolated noise points. Interest point detection (like Harris Corner Detection) identifies
where edges change direction sharply.
Advantages:
o Improves the connectivity of boundaries for boundary tracking.
o Essential for complex tasks like shape recognition and feature collation.
Disadvantages:
o Requires more memory and sophisticated algorithms compared to
basic filtering.
Applications in Computer Vision
Edge detection serves as the foundation for several advanced units in your course:
Line and Circle Detection: Acts as the prerequisite for the Hough Transform to detect
lines, circles, and ellipses.
In-Vehicle Vision: Used for locating the roadway and road markings and identifying road
signs.
3D Vision: Helps in shape from focus and surface representations.
7. Explain the various classical filtering operations used in image processing.
Classical Filtering Operations in Image Processing
Classical filtering operations are fundamental image processing techniques used to enhance
images, reduce noise, or extract specific features like edges. These operations typically
involve a convolution process where a small matrix (kernel or mask) is moved over the
image pixels to compute a new value for each point.
1. Linear Smoothing Filters (Mean/Average Filtering)
The primary goal of smoothing filters is to reduce noise and blur sharp details.
Mechanism: It replaces each pixel value with the average of its neighbors.
Application: Used for noise reduction before more complex tasks like interest point
detection.
Advantage: Very simple and effective for removing small, random noise.
Disadvantage: It blurs important image features like edges.
2. Order-Statistic Filters (Median Filtering)
These are non-linear filters based on the ordering (ranking) of the pixels contained in the
filter mask.
Mechanism: The median filter replaces the value of a pixel with the median of the
intensity levels in the neighborhood.
Application: Highly effective at removing Salt-and-Pepper noise while preserving
edges better than mean filters.
Advantage: Excellent at handling "outlier" noise without significant blurring.
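The difference between the two filters is easy to see on a single 3x3 neighborhood containing one "salt" pixel. This is a minimal sketch using Python's `statistics` module; the helper name and the sample values are illustrative.

```python
from statistics import mean, median


def neighborhood(image, r, c):
    """Return the nine values in the 3x3 window centered on an interior pixel (r, c)."""
    return [image[i][j] for i in range(r - 1, r + 2) for j in range(c - 1, c + 2)]


# A uniform patch of intensity 10 with one salt-noise pixel (255) in the center.
patch = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
window = neighborhood(patch, 1, 1)
mean_value = mean(window)      # the outlier is averaged in, so it still shows (about 37.2)
median_value = median(window)  # the outlier is ranked last and ignored
print(mean_value, median_value)
```

The mean filter only dilutes the outlier, while the median filter restores the original value of 10.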
3. Edge Detection Filtering (Gradient Operators)
These filters emphasize areas of the image with high intensity changes.
Sobel and Prewitt Operators: Use specific kernels to calculate the first-order
derivative (gradient) of the image in horizontal and vertical directions.
Laplacian Filter: A second-order derivative filter used to find the exact location of
edges through zero-crossings.
Application: Essential for boundary tracking, shape recognition, and applications like
locating road markings.
4. Thresholding Operations
While often categorized as a segmentation tool, thresholding is a foundational filtering
operation that converts grayscale images into binary form.
Mechanism: It maps pixels to either "0" or "1" based on whether their intensity is
above or below a chosen threshold value.
Application: Used for object labeling and counting, as well as separating foreground
objects from the background.
5. Morphological Filtering
Mathematical morphology uses the shape of a "structuring element" to process images.
Dilation and Erosion: Dilation adds pixels to the boundaries of objects, while erosion
removes them.
Application: Used for size filtering, skeletons and thinning, and handling occlusion in
shape analysis.
Benefit: Helps in cleaning up noisy binary images after thresholding or edge
detection.
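Binary dilation and erosion can be sketched with a 3x3 square structuring element, assumed here for simplicity. Pixels outside the image are treated as background, which is one possible border convention.

```python
def _apply(image, combine):
    """Apply min (erosion) or max (dilation) over each pixel's 3x3 neighborhood."""
    rows, cols = len(image), len(image[0])

    def value(r, c):
        return image[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [
        [combine(value(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1))
         for c in range(cols)]
        for r in range(rows)
    ]


def erode(image):
    """A pixel stays 1 only if its whole 3x3 neighborhood is 1."""
    return _apply(image, min)


def dilate(image):
    """A pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    return _apply(image, max)


noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],   # isolated noise pixel at column 5
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
opened = dilate(erode(noisy))  # "opening": erosion followed by dilation
for row in opened:
    print(row)
```

Erosion deletes the isolated pixel and shrinks the block to its center; the following dilation grows the block back to its original size, so only the noise is removed.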
8. Discuss the concept of connectedness and its role in object labeling and
coun ng.
Connectedness, Object Labeling, and Coun ng
In binary shape analysis, the concept of connectedness is the mathema cal
founda on used to define which pixels belong to the same object. It is essen al
for high-level tasks like object labeling and coun ng, which allow a computer
vision system to dis nguish between individual items in a scene.
1. The Concept of Connectedness
Connectedness defines the rela onship between a pixel and its neighbors
based on their spa al proximity and intensity values (usually 1 for foreground
in binary images).
4-Connectedness: A pixel is connected to its neighbors only if they share
an edge (North, South, East, West).
8-Connectedness: A pixel is connected to its neighbors if they share an
edge or a corner (includes diagonals).
Connec vity Paradox: Choosing the right connec vity is crucial; for
example, using 8-connec vity for foreground and 4-connec vity for
background (or vice versa) helps avoid topological inconsistencies, such
as a thin line not appearing to "separate" two regions.
2. Object Labeling (Connected Component Labeling)
Object labeling is the process of assigning a unique iden fica on (a "label") to
every pixel belonging to the same connected component.
Scan Phase: The algorithm scans the image, typically from top-to-bo om
and le -to-right.
Label Assignment: When it encounters a foreground pixel, it checks the
labels of its already-scanned neighbors (based on the chosen
connec vity).
o If no neighbors have labels, a new label is created.
o If one neighbor has a label, the current pixel takes that label.
o If neighbors have different labels, the algorithm records that these
labels are "equivalent."
Equivalence Resolu on: A second pass is made through the image to
replace all equivalent labels with a single unique iden fier, ensuring the
en re object has one label.
3. Object Coun ng and Size Filtering
Once pixels are labeled, the system can perform sta s cal analysis on the
iden fied regions:
Coun ng: The total number of unique labels generated (a er
equivalence resolu on) represents the total number of objects in the
image.
Size Filtering: By coun ng the number of pixels assigned to a specific
label, the system calculates the area of that object. Size filtering can
then be applied to remove "noise" (very small objects) or to focus only
on objects within a specific size range.
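Counting and size filtering on an already-labeled image can be sketched as below. The label image is a made-up example, and the helper names component_areas and size_filter are hypothetical.

```python
from collections import Counter

def component_areas(labels):
    """Map each non-zero label to its pixel count (the area of that object)."""
    return Counter(v for row in labels for v in row if v != 0)

def size_filter(labels, min_area, max_area=None):
    """Set to 0 every object whose area lies outside [min_area, max_area]."""
    areas = component_areas(labels)
    keep = {lab for lab, area in areas.items()
            if area >= min_area and (max_area is None or area <= max_area)}
    return [[v if v in keep else 0 for v in row] for row in labels]

labels = [[1, 1, 0, 0, 2],
          [0, 1, 0, 0, 2],
          [3, 0, 0, 2, 2]]
areas = component_areas(labels)
print(len(areas))                       # 3 objects
print(areas[1], areas[2], areas[3])     # 3 4 1
print(size_filter(labels, min_area=2))  # the single-pixel object 3 is removed
```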
4. Applications in Computer Vision
The ability to label and count objects is critical for several applications listed in
the syllabus:
Surveillance: Distinguishing between multiple pedestrians or vehicles in
a scene for tracking.
Automated Photo Albums: Identifying and separating different faces or
objects within a single image.
In-vehicle Vision: Counting and identifying road signs or locating
pedestrians to trigger safety alerts.
Medical Imaging: Identifying and counting specific cells or "holes" in
biological structures.
9 Differentiate between region descriptors and boundary descriptors.
In computer vision and shape analysis, descriptors are mathematical features used
to represent and recognize objects. These are generally categorized into Boundary
Descriptors, which focus on the external perimeter, and Region Descriptors,
which focus on the interior content of a shape.
1. Boundary Descriptors (External)
Boundary descriptors characterize the external shape of an object by analyzing its
perimeter or contour.
Definition: These techniques represent the shape based on the sequence of
pixels that form the outer edge of an object.
Key Techniques:
o Chain Codes: A sequence of unit-length segments in specific
directions (e.g., 4 or 8-connectivity) that represent a boundary.
o Boundary Length: A simple measure of the total length of the
tracked perimeter.
o Fourier Descriptors: Mathematical representations that use the
Fourier transform of the boundary coordinates to provide rotation and
scale invariance.
o Centroidal Profiles: Measuring the distance from the centroid to
every point on the boundary.
Best Use Case: They are ideal for applications where the shape's
silhouette is the primary distinguishing feature.
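Two of the techniques above, centroidal profiles and Fourier descriptors, can be sketched in a few lines. Several simplifications are assumed here: the boundary is an ordered list of (x, y) points, the centroid is taken as the mean of the boundary points rather than of the full region, and the Fourier descriptors use one common normalization (drop the zero-frequency term, keep magnitudes only, divide by the first magnitude). The function names are hypothetical.

```python
import cmath
import math

def centroidal_profile(boundary):
    """Distance from the centroid of the boundary points to each boundary point."""
    n = len(boundary)
    cx = sum(x for x, _ in boundary) / n
    cy = sum(y for _, y in boundary) / n
    return [math.hypot(x - cx, y - cy) for x, y in boundary]

def fourier_descriptors(boundary, n_coeffs=4):
    """Normalized DFT magnitudes of the boundary treated as complex numbers x + iy."""
    n = len(boundary)
    z = [complex(x, y) for x, y in boundary]
    mags = []
    for k in range(1, n_coeffs + 1):   # skipping k = 0 removes the translation
        coeff = sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        mags.append(abs(coeff))        # magnitude discards rotation and start point
    return [m / mags[0] for m in mags]  # dividing by the first term removes scale

# A circle of radius 5 centred at (3, 4): its centroidal profile is constant.
circle = [(3 + 5 * math.cos(2 * math.pi * k / 16),
           4 + 5 * math.sin(2 * math.pi * k / 16)) for k in range(16)]
print([round(r, 6) for r in centroidal_profile(circle)])

# A three-lobed shape and a rotated, scaled, shifted copy of it.
angles = [2 * math.pi * k / 32 for k in range(32)]
shape = [((5 + math.cos(3 * t)) * math.cos(t),
          (5 + math.cos(3 * t)) * math.sin(t)) for t in angles]
moved = []
for x, y in shape:
    w = 3 * cmath.exp(0.5j) * complex(x, y) + complex(10, -4)
    moved.append((w.real, w.imag))
print([round(d, 6) for d in fourier_descriptors(shape)])
print([round(d, 6) for d in fourier_descriptors(moved)])
```

The last two printed lists agree, which illustrates the invariance to rotation and scale mentioned above (translation is removed as well by dropping the zero-frequency term).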
2. Region Descriptors (Internal)
Region descriptors characterize the object by looking at the entire set of pixels that
make up the interior of the shape.
Definition: These techniques consider the "mass" or the area occupied by
the object rather than just its outline.
Key Techniques:
o Moments: Statistical measures (like area, centroid, and Hu moments)
that describe the distribution of pixels within a region.
o Texture: Analyzing the patterns, roughness, or regularity of the pixels
within the object's boundaries.
o Connectedness and Labeling: Identifying how interior pixels are
grouped together to form a solid object.
o Skeletons and Thinning: Reducing the region to a central spine that
represents its structural "bones".
Best Use Case: They are preferred when the internal content or structural
density of the object is more important than its outline, such as in medical
imaging or texture recognition.
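As a small illustration of moments as region descriptors, the sketch below computes the area and centroid of a binary region from its raw moments. It assumes the region is a list of lists with 1 for object pixels, x as the column index and y as the row index; the function names are hypothetical.

```python
def raw_moment(image, p, q):
    """m_pq = sum over all pixels of x^p * y^q * I(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

def area_and_centroid(image):
    """Area is m_00; the centroid is (m_10 / m_00, m_01 / m_00)."""
    m00 = raw_moment(image, 0, 0)
    return m00, (raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00)

region = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
area, (cx, cy) = area_and_centroid(region)
print(area, cx, cy)  # 6 2.0 1.5
```

Every interior pixel contributes to the result, which is why small changes on the boundary have little effect on these values.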
3. Key Differences Summary
Feature        | Boundary Descriptors                       | Region Descriptors
Focus          | Perimeter and contour                      | Internal pixels and area
Sensitivity    | Highly sensitive to noise on the edges     | Robust against small boundary noise
Data Amount    | Requires less data (only perimeter pixels) | Requires more data (all interior pixels)
Representation | 1D sequence or signal                      | 2D distribution of intensity/mass
Examples       | Chain codes, Fourier descriptors           | Moments, texture, skeletons
10 Explain the working principle of the Hough Transform for line detection.
Applications and Refinement
RANSAC for Straight Line Detection: This is often used alongside HT to improve
line fitting and handle outliers.
In-Vehicle Vision: HT is critical for locating road markings and detecting
lane boundaries.
Generalized Hough Transform (GHT): The principle is extended to locate arbitrary
shapes or collate features.
5. Advantages
Robustness to Noise: Since it is a voting-based system, isolated noise pixels do not
significantly impact the final result.
Handling Occlusion: HT can detect lines even if they are broken or partially hidden
by other objects.
Simultaneous Detection: It can detect multiple lines in a single pass of the
accumulator.
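The voting idea behind these advantages can be sketched as follows. This is a simplified illustration, assuming edge points are given as (x, y) pairs, lines are written in normal form x cos(theta) + y sin(theta) = rho, and the accumulator is a dictionary with 1-degree and 1-pixel bins. The function names and bin sizes are illustrative choices.

```python
import math

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Each point votes for every (rho, theta) bin of a line passing through it."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (round(rho / rho_step), t)
            acc[key] = acc.get(key, 0) + 1
    return acc

def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (rho, theta in degrees, votes) for the accumulator peak."""
    acc = hough_lines(points, n_theta, rho_step)
    (rho_bin, t), votes = max(acc.items(), key=lambda kv: kv[1])
    return rho_bin * rho_step, 180.0 * t / n_theta, votes

# A horizontal line y = 5 with a gap (x = 40..59 missing) plus three noise points.
line_points = [(x, 5) for x in range(100) if not 40 <= x < 60]
noise = [(3, 50), (70, 20), (15, 80)]
rho, theta_deg, votes = strongest_line(line_points + noise)
print(rho, theta_deg, votes)  # 5.0 90.0 80
```

The gap in the line and the isolated noise points do not move the peak: all 80 line pixels still vote for the same bin, while each noise pixel spreads its votes over bins that receive little other support.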
11 Illustrate how Hough Transform is used in ellipse detection with an example.
3. Example: Human Iris Location
A primary case study for ellipse detection is Human Iris Location:
1. Input: A close-up image of a human eye is captured.
2. Edge Detection: The boundaries of the iris and pupil are identified using edge
detection techniques.
3. Applying HT: Since the iris is often viewed at an angle, it appears as an ellipse
rather than a perfect circle. The Hough Transform is used to find the best-fit ellipse
that matches the iris boundary.
4. Result: The system accurately locates the center and boundaries of the iris, which is
essential for biometric surveillance and tracking.
4. Generalized Hough Transform (GHT) for Ellipses
The Generalized Hough Transform (GHT) is frequently used for ellipse detection to
improve speed and accuracy:
Spatial Matched Filtering: GHT uses spatial matched filtering to locate features by
correlating image data with a predefined ellipse model.
Feature Collation: It allows for the collation of various edge features to identify the
global structure of the ellipse even when parts of it are occluded.
5. Advantages and Applications
Handling Occlusion: Ellipse detection via HT can identify shapes even if part of the
boundary is missing or hidden.
Robustness: It is highly resistant to image noise and background clutter.
Applications: Beyond iris location, it is used in In-vehicle vision systems for
identifying circular/elliptical road signs and in 3D object recognition.
1. Define Thresholding Techniques.
Thresholding is a foundational image processing technique used for segmentation. It
involves converting a grayscale image into a binary image by separating pixels into two
groups (typically foreground and background) based on whether their intensity levels are
above or below a chosen threshold value.
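A minimal sketch of fixed (global) thresholding, assuming an 8-bit grayscale image stored as a list of lists and an arbitrarily chosen threshold of 128:

```python
def threshold(image, t):
    """Return a binary image: 1 where the intensity exceeds t, otherwise 0."""
    return [[1 if v > t else 0 for v in row] for row in image]

gray = [[10, 200],
        [130, 90]]
print(threshold(gray, 128))  # [[0, 1], [1, 0]]
```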
2. What are Interest Point Operators?
Interest point operators (also known as corner detectors) are mathematical algorithms
used to identify specific points in an image that have a well-defined position and are stable
across different views. These operators detect locations where image brightness varies
significantly in multiple directions, making them essential for tracking and matching.
3. Define Boundary Descriptor.
A boundary descriptor is a mathematical representation used to characterize the external
contour or perimeter of an object. These descriptors, such as boundary length or Fourier
descriptors, provide a numerical way to identify and recognize shapes based on their
silhouette.
4. What are Chain Codes?
Chain codes are a type of boundary descriptor used to represent a shape's perimeter as a
sequence of unit-length segments in specified directions. By following a path from a
starting pixel using 4-connectivity or 8-connectivity, the boundary is stored as a compact
numerical string.
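A short sketch of an 8-direction chain code follows. It assumes the boundary has already been traced into an ordered, closed list of (row, col) pixels, and it uses one common numbering convention (0 = east, increasing counter-clockwise as seen on screen); other conventions exist.

```python
# (d_row, d_col) -> direction code; 0 = east, 2 = north, 4 = west, 6 = south.
DIRECTIONS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
              (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(boundary):
    """Chain code of a closed boundary given as an ordered list of (row, col) pixels."""
    codes = []
    for i, (r, c) in enumerate(boundary):
        nr, nc = boundary[(i + 1) % len(boundary)]  # wrap around to close the loop
        codes.append(DIRECTIONS[(nr - r, nc - c)])
    return codes

# Boundary of a 3x3 square, traced clockwise from the top-left pixel.
square = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
print(chain_code(square))  # [0, 0, 6, 6, 4, 4, 2, 2]
```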
5. What is Line Detection?
Line detection is the process of identifying and localizing straight-line segments within an
image. It is commonly achieved using the Hough Transform (HT), which maps edge pixels
into a parameter space to find the most likely orientation and position of lines.
6. Discuss how corner and interest point detection methods contribute to
image analysis?
Corner and Interest Point Detection in Image Analysis
Corner and interest point detection are fundamental mathematical operations in the initial
stages of computer vision. These methods identify specific, repeatable locations in an
image that stay consistent even when the image is rotated, scaled, or changed in
brightness.
1. Theoretical Contribution
Interest points are locations where the image signal varies significantly in multiple
directions. Unlike edges, which only provide information perpendicular to the boundary,
corners provide a fixed 2D coordinate.
Mathematical Foundations: These methods rely on analyzing intensity gradients or
local auto-correlation to distinguish between flat regions, edges, and corners.
Stability: Interest points are considered "robust" features because they can be
reliably detected across different views of the same scene.
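One widely used operator of this kind is the Harris corner measure, sketched below. The notes do not name a specific detector, so the choice of Harris, the 3x3 summation window, and the constant k = 0.04 are all assumptions made for illustration.

```python
def harris_response(image, k=0.04):
    """Corner response R = det(M) - k * trace(M)^2 from summed gradient products."""
    rows, cols = len(image), len(image[0])
    # Central-difference gradients (left at zero on the image border).
    ix = [[0.0] * cols for _ in range(rows)]
    iy = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            ix[r][c] = (image[r][c + 1] - image[r][c - 1]) / 2.0
            iy[r][c] = (image[r + 1][c] - image[r - 1][c]) / 2.0
    response = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            sxx = sxy = syy = 0.0
            for dr in (-1, 0, 1):          # sum gradient products over a 3x3 window
                for dc in (-1, 0, 1):
                    gx, gy = ix[r + dr][c + dc], iy[r + dr][c + dc]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            response[r][c] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return response

# A bright square occupying the lower-right quadrant of a 10x10 image.
image = [[1.0 if r >= 5 and c >= 5 else 0.0 for c in range(10)] for r in range(10)]
R = harris_response(image)
print(R[2][2])  # flat region: response is zero
print(R[7][5])  # straight edge: response is negative
print(R[5][5])  # corner of the square: response is positive
```

The sign of the response separates the three cases named above: gradients in no direction (flat), in one direction (edge), or in two directions (corner).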
2. Contributions to Shape and Region Analysis
In the study of shapes and regions, interest points serve as the "anchor" for more complex
descriptors:
Feature Collation: Detected points are grouped together using techniques like the
Generalized Hough Transform (GHT) to locate specific objects.
Shape Recognition: Interest points help define the geometry of a shape,
contributing to centroidal profiles and boundary descriptors.
Handling Occlusion: Because interest points are local, a system can still recognize
an object if some interest points are hidden, as long as others remain visible.
3. Contributions to 3D Vision and Motion
Interest point detection is the prerequisite for understanding depth and movement:
3D Object Recognition: Objects are often represented as a collection of 3D points
derived from 2D interest points.
Triangulation and Reconstruction: By matching interest points between two
different camera views, the system can calculate the 3D position of those points
(triangulation) to reconstruct a scene.
Motion Analysis: Tracking interest points over a sequence of frames allows the
system to calculate optical flow and parametric motion.
4. Practical Applications
These methods enable several high-level computer vision applications listed in the
curriculum:
Face Detection and Recognition: Interest points on the face (such as the corners of
the eyes and mouth) are used to align images for Eigenfaces and Active Appearance
Models.
Surveillance: They assist in foreground-background separation and tracking human
gait by identifying key points on the moving body.
In-vehicle Vision Systems: Detecting interest points helps in identifying road signs
and locating pedestrians by isolating their unique features from the background.
7. Explain briefly about digital image processing.
Digital Image Processing (DIP)
Digital Image Processing (DIP) refers to the use of a digital computer to process digital
images through an algorithm. As a subfield of signals and systems, it focuses specifically on
images, aiming to improve pictorial information for human interpretation or to process
image data for autonomous machine perception.
1. Core Concept and Structure
A digital image is composed of a finite number of elements, each having a particular
location and value. These elements are known as pixels (picture elements). Processing
these pixels involves several fundamental stages:
Image Foundations: This involves classical filtering operations to enhance quality
and prepare the image for further analysis.
Segmentation: Using techniques like thresholding to separate objects of interest
from the background.
Feature Extraction: Identifying specific structures such as edges, corners, and
interest points.
2. Fundamental Operations
The syllabus outlines several critical operations used to manipulate and analyze digital data:
Filtering and Edge Detection: Applying mathematical operators to find boundaries
and reduce noise in the image.
Morphology: Using operations for shape refinement, such as skeletons and
thinning.
Shape and Region Analysis: Understanding geometric properties through
connectedness, object labeling, and counting.
Transformations: Using the Hough Transform to map pixel data into a parameter
space for detecting regular shapes like lines, circles, and ellipses.
3. Advanced Analysis and 3D Vision
Modern digital image processing extends beyond simple 2D manipulation into spatial and
motion analysis:
3D Vision: Utilizing projection schemes and surface representations to reconstruct
three-dimensional scenes.
Motion Analysis: Techniques like optical flow, triangulation, and bundle
adjustment are used to track moving objects and analyze camera movement.
4. Real-World Applications
The ultimate goal of processing images is to develop functional, real-world applications:
Biometrics: Performing face detection, face recognition, and human iris location.
Surveillance: Implementing foreground-background separation, tracking, and
human gait analysis.
Automotive Safety: Developing in-vehicle vision systems to locate road markings,
identify signs, and detect pedestrians.
According to block 1, if the input is an image and the output is also an image, the process is
termed Digital Image Processing.
According to block 2, if the input is an image and the output is some kind of information or
description, it is termed Computer Vision.
According to block 3, if the input is a description or code and the output is an image,
it is termed Computer Graphics.
According to block 4, if the input is a description, keywords, or code and the output is also a
description or keywords, it is termed Artificial Intelligence.
8. Write short notes on the following:
a) Fourier descriptors.
b) Region descriptors
9 Describe the centroidal profile approach to shape analysis. Obtain a general formula expressing
the shape a straight line presents in the centroidal profile.
10 Discuss RANSAC for straight line detection.
RANSAC for Straight Line Detection
RANSAC (Random Sample Consensus) is an iterative mathematical method used to
estimate the parameters of a mathematical model, such as a straight line, from a set of
observed data that contains a significant number of outliers. Unlike standard least-squares
regression, which can be heavily skewed by noisy data, RANSAC is designed to ignore "bad"
data points.
1. The Core Working Principle
The fundamental idea of RANSAC is to find the model that is consistent with the largest
number of data points. For straight line detection, the process follows these steps:
1. Random Sampling: The algorithm randomly selects the minimum number of points
required to define a line (which is two points).
2. Model Fitting: A candidate line is generated passing through these two points, often
using the foot-of-normal method $(\rho, \theta)$ to avoid infinite slopes.
3. Consensus Set (Inliers): Every other edge point in the image is tested against this
candidate line. If a point lies within a predefined distance threshold from the line, it
is considered an inlier.
4. Iteration: This process is repeated for a fixed number of iterations.
5. Selection: The line that results in the highest number of inliers (the largest
consensus set) is selected as the best fit.
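The five steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: it assumes edge points are (x, y) pairs, writes the line in normal form x cos(theta) + y sin(theta) = rho, and uses an arbitrary iteration count, distance threshold, and fixed random seed. The function names and the test data are made up for the example.

```python
import math
import random

def line_through(p, q):
    """Line through p and q in normal form: x*cos(theta) + y*sin(theta) = rho."""
    (x1, y1), (x2, y2) = p, q
    theta = math.atan2(x2 - x1, -(y2 - y1))  # direction of the line's normal
    rho = x1 * math.cos(theta) + y1 * math.sin(theta)
    return rho, theta

def ransac_line(points, n_iters=200, threshold=1.0, seed=0):
    """Return the (rho, theta) model with the largest consensus set, and that set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):                      # step 4: iterate
        p, q = rng.sample(points, 2)              # step 1: minimal random sample
        if p == q:
            continue                              # duplicate points define no line
        rho, theta = line_through(p, q)           # step 2: candidate model
        c, s = math.cos(theta), math.sin(theta)
        inliers = [(x, y) for x, y in points      # step 3: consensus set
                   if abs(x * c + y * s - rho) <= threshold]
        if len(inliers) > len(best_inliers):      # step 5: keep the best so far
            best_model, best_inliers = (rho, theta), inliers
    return best_model, best_inliers

# 30 points on y = 2x + 1 plus 10 scattered outliers.
on_line = [(x, 2 * x + 1) for x in range(30)]
outliers = [(2, 40), (5, 55), (8, 3), (12, 60), (15, 5),
            (20, 10), (22, 70), (25, 20), (27, 90), (29, 30)]
model, inliers = ransac_line(on_line + outliers)
print(len(inliers))  # 30
```

A least-squares fit to all 40 points would be pulled towards the outliers, whereas the consensus set here contains only the 30 points on the line.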
2. Role in the Computer Vision Pipeline
RANSAC is rarely used in isolation; it typically follows or enhances other techniques
mentioned in the syllabus:
Post-Hough Transform: While the Hough Transform (HT) is excellent for finding
potential lines, RANSAC is often used to refine those lines and accurately localize
them by filtering out background noise.
Edge Detection: RANSAC operates on the output of edge detection techniques (like
Canny or Sobel), which provide the "raw" points for the sampling process.
Line Fitting: It provides a much more robust alternative to simple linear regression
when the image contains multiple objects or complex textures.
3. Advantages of RANSAC
Robustness: It is highly effective at finding a "correct" line even when up to $50\%$
(or more) of the data points are outliers or noise.
Simplicity: The algorithm is easy to implement and does not require complex
optimization beyond simple distance calculations.
Multiple Line Detection: By removing the inliers of the first detected line and
running the algorithm again, RANSAC can sequentially detect multiple lines in an
image.
4. Practical Applications
RANSAC is critical for the development of real-world computer vision applications:
In-Vehicle Vision Systems: It is used for locating road markings and
detecting lanes where road debris or shadows might create "noisy" edge points.
3D Vision and Motion: In bundle adjustment and triangulation, RANSAC is used to
match interest points between frames while discarding false matches.
Surveillance: It helps in foreground-background separation by fitting models to the
background to better isolate moving objects.
11 Explain the line detection by using Hough transform (HT)?
3. Line Localization and Fitting
Once the peaks are identified, the system performs line localization to find the exact
position of the lines. Advanced methods like RANSAC (Random Sample Consensus) are
often used for straight-line detection to refine the results and eliminate outliers that don't
fit the model.
4. Real-World Applications
In the curriculum, line detection via HT is applied to:
In-vehicle vision systems: Specifically for locating road markings and
identifying lane boundaries.
Feature Collation: Grouping detected line segments to identify complex objects.
3D Vision: Assisting in projection schemes and surface representations.