AI in Computer Vision: Comprehensive Notes

The document provides detailed notes on AI for Computer Vision, covering various units including image formation, feature detection, motion estimation, 3D reconstruction, and deep learning applications. Each unit includes theoretical concepts, key techniques, and practical examples, particularly focusing on methods like CNNs, SIFT, and stereo vision. Additionally, the document highlights important exam questions and code examples relevant to the topics discussed.

Uploaded by

dp9476825

AI for Computer Vision - Detailed Notes

(RGPV 7th Sem)

Unit I: Introduction to Image Formation and Processing

Unit II: Feature Detection, Matching and Segmentation

Unit III: Feature-based Alignment & Motion Estimation (2D/3D)

Unit IV: 3D Reconstruction Techniques

Unit V: Image-based Rendering and Recognition

✅ Unit I: Introduction to Image Formation and Processing

--- THEORY NOTES ---

• Image Formation Basics: An image is a 2D projection of a 3D scene captured by a camera using a lens and light.

• Camera Model: Describes how light rays map from the world to the image plane.

- Pinhole Camera Model: Simplest geometry-based projection model.

- Perspective Projection: Objects farther away appear smaller.

• Radiometry & Photometry: Measurement of light energy affecting pixel intensity values.

• Image Types: Binary, Grayscale, Color (stored in a color space such as RGB or HSV).
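
As a minimal sketch of the pinhole model and perspective projection described above, the following projects a 3D point given in camera coordinates onto the image plane. The focal length f and principal point (cx, cy) are illustrative assumptions, not values from these notes.

```python
def project_point(X, Y, Z, f=500.0, cx=320.0, cy=240.0):
    """Project a 3D camera-frame point to pixel coordinates (pinhole model).

    f, cx, cy are hypothetical intrinsics chosen only for illustration.
    """
    if Z <= 0:
        raise ValueError("Point must be in front of the camera (Z > 0)")
    u = f * X / Z + cx  # horizontal pixel coordinate
    v = f * Y / Z + cy  # vertical pixel coordinate
    return u, v

# The same lateral offset at twice the depth lands closer to the image centre,
# which is why distant objects appear smaller.
near = project_point(1.0, 0.5, 2.0)
far = project_point(1.0, 0.5, 4.0)
print(near, far)
```

Dividing by Z is what produces the perspective effect: doubling the depth halves the offset from the principal point.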

--- IMAGE PROCESSING PIPELINE ---

Image Acquisition ➝ Preprocessing ➝ Feature Extraction ➝ Analysis ➝ Recognition/Interpretation

--- IMAGE FILTERING TECHNIQUES ---

1️⃣ Spatial Domain Filters: Operate directly on pixels

• Smoothing Filters → Noise Reduction (Mean, Gaussian)

• Sharpening Filters → Highlight Edges (Laplacian, High-Boost)

2️⃣ Frequency Domain Processing: Apply Fourier Transform

• Low-Pass Filters → Remove noise

• High-Pass Filters → Detect edges
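
To make the smoothing idea above concrete, here is a small sketch of a 3x3 mean filter in the spatial domain, written in plain Python on a nested list. Leaving the border pixels unchanged is a simplifying assumption; real libraries usually pad the image instead.

```python
def mean_filter_3x3(img):
    """Apply a 3x3 mean filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(img[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total / 9.0
    return out

# A flat image with one noisy spike in the centre (hypothetical data)
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 100, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
smoothed = mean_filter_3x3(noisy)
print(smoothed[2][2])
```

The spike of 100 is averaged with its eight neighbours, so its value drops sharply while the surrounding pixels rise slightly. This is the noise-reduction (and blurring) effect of a smoothing filter.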

--- EDGE DETECTION ---

• Purpose: Detect boundaries of objects

• Operators: Sobel, Prewitt, Roberts, Canny

• Canny Edge Detector Steps: Smoothing → Gradient → Non-max suppression → Hysteresis thresholding
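
The gradient step shared by the Sobel operator and the Canny detector can be sketched as follows. This covers only the gradient-magnitude computation at one pixel, not smoothing, non-max suppression, or hysteresis; the test image is a hypothetical vertical step edge.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical intensity change

def sobel_magnitude(img, y, x):
    """Gradient magnitude at interior pixel (y, x) using the 3x3 Sobel kernels."""
    gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
             for j in range(3) for i in range(3))
    return math.hypot(gx, gy)

# Dark region on the left, bright region on the right
img = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
flat_response = sobel_magnitude(img, 2, 1)   # window lies entirely in the dark region
edge_response = sobel_magnitude(img, 2, 2)   # window straddles the step
print(flat_response, edge_response)
```

The response is zero in the uniform region and large where the window crosses the boundary, which is what the later Canny stages thin and threshold.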

--- IMAGE TRANSFORMS ---

• Fourier Transform → represents image in frequency domain

• Wavelets → Multi-resolution image analysis

--- PYRAMIDS ---

• Gaussian Pyramid: Repeated smoothing + downsampling

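• Laplacian Pyramid: Each level stores the difference between a Gaussian level and the upsampled next-coarser level, i.e. band-pass detail such as edges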
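
A minimal 1D sketch of both pyramids: each Gaussian level is produced by smoothing and keeping every second sample, and each Laplacian level is the difference between a Gaussian level and the upsampled coarser level. The [1, 2, 1]/4 kernel and nearest-neighbour upsampling are simplifying assumptions made for this illustration.

```python
def smooth(signal):
    """Blur with the [1, 2, 1] / 4 kernel, replicating the border samples."""
    n = len(signal)
    return [(signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]

def downsample(signal):
    """One Gaussian pyramid step: smooth, then keep every second sample."""
    return smooth(signal)[::2]

def upsample(signal, n):
    """Nearest-neighbour upsampling back to length n."""
    return [signal[min(i // 2, len(signal) - 1)] for i in range(n)]

def build_pyramids(signal, levels):
    gaussian = [list(signal)]
    for _ in range(levels - 1):
        gaussian.append(downsample(gaussian[-1]))
    laplacian = [
        [g - u for g, u in zip(gaussian[i], upsample(gaussian[i + 1], len(gaussian[i])))]
        for i in range(levels - 1)
    ]
    laplacian.append(gaussian[-1])  # the coarsest level is stored as-is
    return gaussian, laplacian

gaussian, laplacian = build_pyramids([0, 0, 0, 0, 8, 8, 8, 8], levels=3)
print([len(level) for level in gaussian])
print(laplacian[0])
```

The finest Laplacian level is non-zero only around the step in the signal, which is why the Laplacian pyramid is often described as holding edge information.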

--- OPTIMIZATION IN VISION ---

Used in feature matching, energy minimization, segmentation etc.

--- SHORT EXAM QUESTIONS ---

Q1: Define Pinhole Camera Model.

Q2: Differentiate between spatial and frequency domain filters.

Q3: Write steps of Canny edge detection.

--- LONG EXAM QUESTIONS ---

Q1: Explain image formation process with camera model and geometry.

Q2: Explain image filtering with suitable examples and diagrams.

✅ Unit II: Feature Detection, Matching & Segmentation

--- THEORY NOTES ---

⭐ IMP: Feature Detection identifies key points in an image that are invariant to changes in
scale, rotation or lighting.

Common detectors: Harris Corner, SIFT, SURF, FAST, ORB

➡ Harris Corner Detector

• Based on detecting corners: points where shifting a small window in any direction causes a large change in intensity

• Uses auto-correlation matrix
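
As a sketch of how the auto-correlation matrix is used, the Harris response for one window can be computed from the image gradients inside it as R = det(M) - k * trace(M)^2. The value k = 0.04 is a commonly used choice and is an assumption here, as are the hand-made gradient lists.

```python
def harris_response(gradients, k=0.04):
    """Harris corner response for one window.

    gradients: list of (Ix, Iy) pairs for the pixels in the window.
    """
    sxx = sum(ix * ix for ix, _ in gradients)
    syy = sum(iy * iy for _, iy in gradients)
    sxy = sum(ix * iy for ix, iy in gradients)
    det = sxx * syy - sxy * sxy      # determinant of M = [[sxx, sxy], [sxy, syy]]
    trace = sxx + syy
    return det - k * trace * trace

flat = [(0, 0)] * 9                      # no gradient anywhere
edge = [(10, 0)] * 9                     # gradient in one direction only
corner = [(10, 0)] * 5 + [(0, 10)] * 4   # strong gradients in two directions

print(harris_response(flat), harris_response(edge), harris_response(corner))
```

A flat window gives a response of zero, an edge gives a negative response, and a corner gives a large positive response, so thresholding R separates corners from the other two cases.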

➡ SIFT (Scale Invariant Feature Transform) ⭐ IMP

• Detects scale and rotation invariant features

• Steps: Scale-space → Keypoint localization → Orientation assignment → Descriptor generation

➡ SURF (Speeded-Up Robust Features)

• Faster than SIFT using box filters and integral images

➡ FAST & ORB

• FAST is an extremely fast corner detector

• ORB combines the FAST detector with rotation-aware BRIEF descriptors for real-time apps (e.g., robotics)

--- FEATURE MATCHING ---

⭐ IMP: Used to find correspondences between images

Methods: SSD (Sum of Squared Differences), NCC (Normalized Cross Correlation), Hamming
distance for ORB
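
Matching binary descriptors with the Hamming distance can be sketched as a brute-force nearest-neighbour search. The 8-bit descriptors below are hypothetical stand-ins; real ORB descriptors are 256 bits long.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded binary descriptors."""
    return bin(a ^ b).count("1")

def match(desc_a, desc_b):
    """For each descriptor in desc_a, return (index_a, index_b, distance) of its nearest neighbour."""
    matches = []
    for i, da in enumerate(desc_a):
        j = min(range(len(desc_b)), key=lambda idx: hamming(da, desc_b[idx]))
        matches.append((i, j, hamming(da, desc_b[j])))
    return matches

desc_a = [0b10110010, 0b01001101]
desc_b = [0b01001111, 0b10110000, 0b11111111]
matches = match(desc_a, desc_b)
print(matches)
```

SSD and NCC follow the same pattern for real-valued descriptors or patches: only the distance function changes.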

➡ RANSAC (Random Sample Consensus) ⭐ Most Asked

• Removes false matches by estimating the best model through random sampling
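
The random-sampling idea can be illustrated on a simpler problem than match filtering: fitting a line to points that contain gross outliers. This is a sketch under stated assumptions (a line model y = m*x + c, a fixed iteration count, and a hand-picked inlier threshold), not a production implementation.

```python
import random

def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Fit y = m*x + c with RANSAC; returns (m, c, inliers)."""
    rng = random.Random(seed)
    best = (0.0, 0.0, [])
    for _ in range(iterations):
        # 1. Sample the minimal set needed to define the model (two points)
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical pair cannot be written as y = m*x + c
        # 2. Fit the model to the sample
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        # 3. Count the points that agree with the model
        inliers = [(x, y) for x, y in points if abs(y - (m * x + c)) < threshold]
        # 4. Keep the model with the largest consensus set
        if len(inliers) > len(best[2]):
            best = (m, c, inliers)
    return best

# Ten points on y = 2x + 1 plus three gross outliers (hypothetical data)
points = [(x, 2 * x + 1) for x in range(10)] + [(2, 30), (5, -12), (8, 40)]
m, c, inliers = ransac_line(points)
print(m, c, len(inliers))
```

In feature matching the same loop is typically run with a homography or fundamental matrix as the model, and the matches outside the final consensus set are discarded as false.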

--- IMAGE SEGMENTATION ---

⭐ IMP: Process of dividing image into meaningful regions

➡ Thresholding-based Segmentation

• Otsu Method: Finds optimal threshold by maximizing variance between classes
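
A minimal sketch of Otsu's method: try every threshold, split the histogram into two classes, and keep the threshold with the largest between-class variance. The pixel values are hypothetical, and the sketch assumes integer intensities in the range 0 to 255.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (classes: <= t and > t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Two dark groups and two bright groups of pixels
pixels = [20] * 50 + [30] * 50 + [200] * 40 + [210] * 60
t = otsu_threshold(pixels)
print(t)
```

Any threshold between the dark and bright groups separates them equally well; this sketch returns the first such value it encounters.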

➡ Region-based Segmentation

• Region growing, and region splitting and merging

➡ Clustering-based Segmentation ⭐ IMP

• K-Means clustering: groups pixels based on similarity
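
K-means segmentation can be sketched on grayscale intensities alone. The example below clusters scalar pixel values into two groups; the initial centres and the fixed number of iterations are assumptions chosen for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster scalar intensities; centers is the list of initial cluster centres."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre
        labels = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                  for v in values]
        # Update step: each centre moves to the mean of its members
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, labels

intensities = [12, 15, 10, 200, 210, 205, 14, 198]
centers, labels = kmeans_1d(intensities, centers=[0, 255])
print(centers, labels)
```

For color images the same two steps are applied to feature vectors such as (R, G, B), optionally extended with pixel coordinates.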

➡ Graph-based Segmentation ⭐ IMP

• Graph Cut: minimizes cut cost to separate foreground & background

➡ Watershed Segmentation ⭐ Important for diagrams

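• Treats the gradient magnitude of the image as a topographic surface; flooding it from local minima (or markers) leaves watershed lines that act as region boundaries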

--- SHORT EXAM QUESTIONS ---

Q1: Define feature detection (IMP)

Q2: Difference between SIFT and SURF

Q3: What is RANSAC? Why used? ⭐

--- LONG EXAM QUESTIONS (Repeated in RGPV) ---

Q1: Explain SIFT algorithm with steps ⭐⭐

Q2: Explain image segmentation techniques with examples ⭐⭐

✅ Unit III: Feature-based Alignment & Motion Estimation (2D/3D)

--- THEORY NOTES ---

➡ ⭐ IMP: Pose Estimation

• Determines camera location + orientation relative to the object

• Uses feature correspondences and geometric constraints

➡ Triangulation

• Uses 2D projections from multiple views to recover 3D points

➡ ⭐ IMP: Structure from Motion (SfM)

• Recovers 3D scene + camera motion from multiple images

• Used in 3D mapping, AR, drones

➡ ⭐ Most Asked: Optical Flow

• Estimates per-pixel motion from intensity changes between consecutive frames

• Assumption (brightness constancy): a point keeps the same intensity as it moves between frames

Popular Methods:

• Lucas-Kanade → Sparse estimation

• Horn-Schunck → Dense estimation
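
For reference, the constant-intensity assumption above leads to the standard optical flow constraint, and the two methods differ in how they resolve it. The notation is an assumption of this sketch: I_x, I_y, I_t are the image derivatives, (u, v) is the flow, sums run over a local window, and alpha is a smoothness weight.

```latex
% Brightness constancy and its first-order (Taylor) approximation
I(x + u,\, y + v,\, t + 1) = I(x, y, t)
\quad\Longrightarrow\quad
I_x u + I_y v + I_t = 0

% Lucas-Kanade: least-squares solution over a local window (sparse)
\begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
= -\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}

% Horn-Schunck: global energy with a smoothness term (dense)
E(u, v) = \iint \left[ (I_x u + I_y v + I_t)^2
        + \alpha^2 \left( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \right) \right] dx\, dy
```

One constraint equation cannot determine two unknowns per pixel (the aperture problem), which is why Lucas-Kanade pools a window and Horn-Schunck adds a smoothness prior.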

➡ Bundle Adjustment ⭐⭐ Highly Asked in Exams

• Optimization technique in SfM to minimize reprojection error
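
The reprojection error minimized by bundle adjustment is commonly written as below. The symbols are notation assumed for this sketch: P_i are the camera parameters, X_j the 3D points, x_ij the observed image position of point j in image i, pi the projection function, and v_ij = 1 if point j is visible in image i (0 otherwise).

```latex
\min_{\{P_i\},\, \{X_j\}} \;
\sum_{i} \sum_{j} v_{ij} \,
\bigl\lVert x_{ij} - \pi(P_i, X_j) \bigr\rVert^2
```

Both the cameras and the 3D points are refined jointly, which is what distinguishes bundle adjustment from optimizing pose or structure alone.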

➡ ⭐ Most Repeated: Camera Calibration

• Estimation of intrinsic + extrinsic parameters of camera

➡ Layered Motion Estimation

• Separates motion into different layers for better tracking

--- ✅ OpenCV Code Examples (IMP for Viva) ---

➡ Lucas-Kanade Optical Flow

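import cv2

cap = cv2.VideoCapture(0)  # open the default camera
lk_params = dict(winSize=(15, 15))

ret, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect corners in the previous frame (illustrative parameter values)
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100, qualityLevel=0.3, minDistance=7)
    if p0 is not None:
        # Track the corners into the current frame with pyramidal Lucas-Kanade
        p1, st, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None, **lk_params)
        for new, old in zip(p1[st == 1], p0[st == 1]):
            cv2.line(frame, tuple(map(int, old)), tuple(map(int, new)), (0, 255, 0), 2)
    cv2.imshow("Optical Flow", frame)
    prev_gray = gray
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()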

➡ Camera Calibration (Pseudo Example)

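import cv2
import numpy as np

# Termination criteria for iterative sub-pixel corner refinement
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# 3D coordinates of the 7x6 inner chessboard corners on the z = 0 plane
objp = np.zeros((6*7, 3), np.float32)
objp[:, :2] = np.mgrid[0:7, 0:6].T.reshape(-1, 2)
# objp is paired with the 2D corners detected in each chessboard image
# (cv2.findChessboardCorners) and passed to cv2.calibrateCamera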

--- SHORT EXAM QUESTIONS ---

Q1: What is optical flow? ⭐

Q2: Define structure from motion (SfM).

Q3: What is camera calibration? ⭐

--- LONG EXAM QUESTIONS (Repeated in RGPV) ---

Q1: Explain optical flow with its different methods ⭐⭐

Q2: Explain Structure from Motion with example diagrams ⭐⭐

✅ Unit IV: 3D Reconstruction Techniques

--- THEORY NOTES ---

➡ ⭐ Stereo Vision (IMP + Diagrams)

• Uses two or more camera views to estimate depth

• Works like human eyes → Binocular disparity enables depth estimation

• Steps: Feature Matching → Disparity Map → Depth Map Computation

📌 Depth Formula:

Depth (Z) = (f × B) / Disparity (d)

where f = focal length, B = baseline distance
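
A short worked example of the depth formula, assuming a rectified stereo pair with the focal length in pixels and the baseline in metres. The numeric values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Z = f * B / d, with f and d in pixels and B in metres (result in metres)."""
    if disparity_px <= 0:
        raise ValueError("Disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

depths = [depth_from_disparity(d, focal_px=800, baseline_m=0.5) for d in (40, 20, 10)]
print(depths)
```

Depth is inversely proportional to disparity: halving the disparity doubles the estimated depth, so distant objects, which have small disparities, are measured less precisely.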

📝 Diagram (Concept - Text Representation):

[Camera-L] ---- Object ---- [Camera-R]

Matching points shift → Disparity → Depth

➡ ⭐ Shape-from-X Methods (Mostly Asked)

• Recover 3D shape using different cues:

1️⃣ Shape-from-Shading

• Uses variations in brightness to estimate surface orientation

• Assumes: Single light source, uniform material

2️⃣ Shape-from-Silhouette

• Combined outlines from multiple views create the visual hull

3️⃣ Shape-from-Motion (SfM) ⭐ Repeated in Exams

• Uses motion between frames to build 3D structure

➡ 3D Point Cloud Representation

• Collection of 3D points representing object shape

• Used in robotics, AR, autonomous navigation

➡ Volumetric Reconstruction

• Represents shape using voxels (3D pixels)

• Example: CT scan volume model

➡ Surface Reconstruction ⭐ Viva Question

• Creates surface mesh from point cloud

• Output: Triangular mesh

--- APPLICATIONS ---

• AR/VR, Robotics, Medical Imaging, Gaming, Drone Mapping

--- SHORT EXAM QUESTIONS ---

Q1: Define disparity in stereo vision. ⭐

Q2: What is point cloud?

Q3: Define volumetric reconstruction.

--- LONG EXAM QUESTIONS (Frequently Asked in RGPV) ---

Q1: Explain stereo vision with depth estimation formula + diagram ⭐⭐

Q2: Describe different Shape-from-X methods ⭐⭐

✅ Unit V: Deep Learning for Computer Vision

1️⃣ Introduction to Deep Learning for Vision

Deep learning uses neural networks with multiple layers to learn image patterns. It
automatically extracts features such as edges, textures, and objects, unlike traditional computer
vision, where features are manually designed.

2️⃣ Convolutional Neural Networks (CNN)

CNNs are the backbone of modern computer vision. They consist of layers that learn
hierarchical features:
• Low-level: edges, corners
• Mid-level: shapes
• High-level: full objects

Main Components of CNN:

1️⃣ Convolution Layer – extracts features using filters/kernels

2️⃣ Activation Function (ReLU) – introduces non-linearity

3️⃣ Pooling Layer – reduces spatial size (Max Pooling is common)

4️⃣ Fully Connected Layer – final classification

CNN Block Diagram

Input Image → Convolution → ReLU → Pooling → Flatten → Fully Connected → Softmax Output

Important CNN Terms

• Stride – step size of filter movement
• Padding – adds border pixels (usually zeros) around the input; with "same" padding the output keeps the input's spatial size
• Feature Map – output of convolution layer
• Kernel – small filter matrix
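
The effect of kernel size, stride, and padding on the feature-map size can be sketched with the usual output-size formula. The 28x28 input is an assumption, chosen because it yields the 32 * 13 * 13 flatten size used in the PyTorch example later in these notes.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer:
    floor((size + 2*padding - kernel) / stride) + 1
    """
    return (size + 2 * padding - kernel) // stride + 1

after_conv = conv_output_size(28, kernel=3)                     # 3x3 convolution, no padding
after_pool = conv_output_size(after_conv, kernel=2, stride=2)   # 2x2 max pooling
same_size = conv_output_size(28, kernel=3, padding=1)           # padding of 1 preserves size
print(after_conv, after_pool, same_size)
```

With 32 output channels, the pooled feature map holds 32 * 13 * 13 values, which becomes the input size of the fully connected layer.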

3️⃣ Popular CNN Architectures
✅ LeNet-5 – early CNN for handwritten digit recognition (MNIST)
✅ AlexNet – deeper CNN using ReLU & dropout
✅ VGGNet – stacks 3×3 convolutions repeatedly
✅ ResNet – introduces skip connections to mitigate the vanishing gradient problem in very deep networks

ResNet Block Structure:

Input → Convolution Layers → Output + Input (Skip Connection) → ReLU

4️⃣ Transfer Learning
Pre-trained models like VGG16, ResNet50 are trained on large datasets like ImageNet.
We reuse their learned features and train only final layers for new tasks.

Advantages:

• Less data required

• Faster training
• Often higher accuracy when labelled data is limited

5️⃣ CNN Code Example (PyTorch)

A simple CNN image classifier example:

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Data Preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Dataset name was lost in extraction; MNIST is assumed because the model
# expects 1-channel 28x28 images (28 -> 26 after conv -> 13 after pooling)
train_data = datasets.MNIST(root='./data', train=True,
                            download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_data,
                                           batch_size=64, shuffle=True)

# CNN Model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(32 * 13 * 13, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = self.pool(x)
        x = x.view(-1, 32*13*13)
        x = self.fc1(x)
        return x

model = CNN()
criterion = nn.CrossEntropyLoss()
# Optimizer name was lost in extraction; Adam is assumed here
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training Loop
for epoch in range(1):
    for images, labels in train_loader:
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

print("Training Completed!")

6️⃣ Applications of Deep Learning in Vision

• Face Recognition
• Self-driving Cars
• Medical Imaging
• Object Detection (YOLO)
• Image Segmentation (U-Net)

✅ Unit-5 Completed ✅
