YOLO: Object Detection Overview


You Only Look Once

path to design a detector

Feng Wang

AIRD, Coretronic Co.

Apr 17, 2019
The slides and a list of references can be found at
[Link]
Outline

• Concepts in object detection
• A brief history of object detection
• YOLO
  • design
  • loss function
  • training
  • weaknesses
Classification vs detection/recognition
Common tasks on images

https://medium.com/@nikasa1889/the-modern-history-of-object-recognition-infographic-aea18517c318
Bounding box proposal
Region of interest, region proposal, box proposal

[Figure: a ground-truth box and a proposed bounding box]

5 parameters
• w, h
• x, y
• confidence score: how likely the box contains an object & the accuracy of the box
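How good: Intersection over Union (IOU)

$$\mathrm{IOU} = \frac{\text{Overlap Area}}{\text{Union Area}}$$

Examples
• IOU = 0: the two boxes do not overlap
• IOU = 1: the two boxes coincide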
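The IOU definition can be sketched in a few lines of Python. This is only an illustration: it assumes boxes are given by their corners as (x_min, y_min, x_max, y_max), whereas the slides describe a box by x, y, w, h. The function name `iou` is made up for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Assumption for this sketch: each box is (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    overlap = inter_w * inter_h

    # Union = area A + area B - overlap (the overlap would be counted twice otherwise).
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - overlap
    return overlap / union if union > 0 else 0.0


# Two 2x2 boxes shifted by one unit: overlap area 1, union area 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Identical boxes give 1 and disjoint boxes give 0, matching the two examples on the slide.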
A brief history of object detection

https://stats385.github.io
A brief history of object detection

• Before CNNs, people used handcrafted features to locate and classify objects (the results were not too bad).
• CNNs boosted the accuracy of classification.

[Figure: ImageNet]
A brief history of object detection

Region proposal -> classification
• e.g. RCNN
• accurate
• slow

Single shot: region proposal + classification
• e.g. YOLO, SSD
• fast
• less accurate
YOLO: You Only Look Once

Look once

Results
• x, y, w, h
• confidence score: contains an object & box accuracy
• class score: belongs to a class

Let's use a CNN, since it's good.
Why not regress? They are just numbers.
Let's go to CNN

YOLO v1's CNN: GoogLeNet variant, 24 convolutional layers

YOLO v2's CNN: darknet-19, 19 convolutional layers

YOLO v3's CNN: darknet-53, 53 convolutional layers

Let's do regression
-- wait, wait, how many bounding boxes? Where are they initially?

Results for one box
• x, y, w, h
• confidence score: contains an object & box accuracy
• class score: belongs to a class

• Maybe set N as a large number?
• Maybe initially put them randomly?

Better solution: using grids

Note: N is large, but much smaller than the number of R-CNN's region proposals.
Let's do regression with non-maximal suppression

Grid | Proposed box 1               | Proposed box 2               | Class scores
1    | x, y, w, h, confidence score | x, y, w, h, confidence score | class 1, class 2, ..., class 20
...  | ...                          | ...                          | ...
SxS  | x, y, w, h, confidence score | x, y, w, h, confidence score | class 1, class 2, ..., class 20

vector size: SxSx(5x2+20)

We can use a CNN to extract features, and finally perform a regression to detect objects.
• YOLO v1: fully connected layers
• v2 & v3: convolutional layers

arXiv: 1506.02640, 1612.08242, 1804.02767
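To make the vector size concrete, here is a small sketch of the arithmetic. The slide leaves S symbolic; S = 7 is the value used in the YOLO v1 paper (arXiv:1506.02640) and is plugged in here as an assumption. The helper `split_cell` and its box-first layout follow the table on the slide and are purely illustrative, not the memory layout of any particular implementation.

```python
S = 7    # grid size per side (assumed here: the YOLO v1 paper's value)
B = 2    # proposed boxes per grid cell
C = 20   # classes (PASCAL VOC)

per_cell = B * 5 + C           # each box carries x, y, w, h, confidence
output_size = S * S * per_cell


def split_cell(cell_vector, num_boxes=B, num_classes=C):
    """Split one grid cell's flat vector into its boxes and class scores.

    Hypothetical layout for illustration: boxes first, then class scores.
    """
    boxes = [tuple(cell_vector[5 * k: 5 * k + 5]) for k in range(num_boxes)]
    class_scores = list(cell_vector[5 * num_boxes:])
    assert len(class_scores) == num_classes
    return boxes, class_scores


print(per_cell, output_size)   # numbers per cell, numbers per image
boxes, class_scores = split_cell(list(range(per_cell)))
print(boxes)
```

With these values each cell predicts 30 numbers, so the network regresses 7 x 7 x 30 = 1470 numbers per image.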
Loss function
Problems
• One object is partially/fully covered by several boxes.
• Most boxes have no objects.
• Multi-task training problem: location & class
• Small objects need more accurate location & box size.

Solution
Oh, no math please. Let's speak human language

Problem 1:
One object is partially/fully covered by several boxes.

• Each true object has one proposed box “responsible” for it.
  Rule: the one with the highest overlap (IOU) with the ground-truth box.
• At inference, we use non-maximal suppression to select the best among the proposals.
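Non-maximal suppression at inference can be sketched as a greedy loop: take the proposals in order of decreasing score and drop any proposal that overlaps an already kept one too much. The code below is a minimal illustration, not the implementation of any YOLO release. It assumes corner-format boxes (x_min, y_min, x_max, y_max) and an IOU threshold of 0.5, and both function names are made up for this example.

```python
def iou(a, b):
    """IOU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    overlap = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - overlap
    return overlap / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))
```

In the example, the second box overlaps the first with IOU of about 0.68 and is suppressed, while the distant third box is kept. In practice the procedure is usually run separately for each class.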
Human language

Problem 2:
Most boxes have no objects.

0.5: the confidence error of boxes that contain no object is down-weighted by a factor of 0.5.
Human language

Problem 3:
Multi-task training problem: location & class.

Weighted sum: here the problem is left untouched.
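Human language

Problem 4:
Small objects need more accurate location & box size.

sqrt: the loss compares the square roots of the box width and height, so the same absolute error counts more for a small box than for a large one.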
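For readers who do want the math, the four points above map onto the terms of the sum-squared-error loss in the YOLO v1 paper (arXiv:1506.02640). The formula below is reproduced from that paper as a reference, using the paper's notation rather than anything defined on these slides: hats mark ground-truth values, $C_i$ is the confidence score, $p_i(c)$ the class score, and the paper sets $\lambda_{\text{coord}} = 5$ and $\lambda_{\text{noobj}} = 0.5$.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
 &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
 &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left(C_i - \hat{C}_i\right)^2 \\
 &+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left(C_i - \hat{C}_i\right)^2 \\
 &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left(p_i(c) - \hat{p}_i(c)\right)^2
\end{aligned}
```

The indicator $\mathbb{1}_{ij}^{\text{obj}}$ selects the one "responsible" box (Problem 1), $\lambda_{\text{noobj}}$ down-weights the many empty boxes (Problem 2), the weighted sum combines localization and classification (Problem 3), and the square roots address small boxes (Problem 4).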
Other problems
• x, y can fall outside the grid cell
• smaller objects can be localized worse than larger ones
• the probability can fall outside [0, 1]

Fix them in YOLO v2

Pre-defined box size

Pre-defined box: anchor
• Naturally, objects have typical aspect ratios and sizes.
  • This can be a good starting point.
  • We don't need randomly initialized box shapes.
• Handcrafted box sizes vs clustering algorithms
• Boxes can reshape during training.
• The number of pre-defined boxes is a hyperparameter
  • v2 uses 5
  • v3 uses 9

[Figure: anchors used in YOLO v2]

Anchor-free detection is a research topic, see [Link] for an instance.
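How an anchor turns into a predicted box can be sketched with the parameterisation from the YOLO v2 paper (arXiv:1612.08242): a sigmoid keeps the box centre inside its grid cell and the confidence inside [0, 1], and the anchor's width and height are rescaled exponentially. The function and argument names below are made up for this sketch, and all lengths are measured in grid cells.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t_x, t_y, t_w, t_h, t_o, cell_x, cell_y, anchor_w, anchor_h):
    """Turn raw network outputs into a box, in units of grid cells.

    (cell_x, cell_y) is the top-left corner of the grid cell and
    (anchor_w, anchor_h) is the pre-defined anchor size.
    """
    b_x = cell_x + sigmoid(t_x)    # centre stays inside the cell
    b_y = cell_y + sigmoid(t_y)
    b_w = anchor_w * math.exp(t_w) # anchor is rescaled, never negative
    b_h = anchor_h * math.exp(t_h)
    confidence = sigmoid(t_o)      # stays inside [0, 1]
    return b_x, b_y, b_w, b_h, confidence


# All-zero outputs give the cell centre, the unchanged anchor, and confidence 0.5.
print(decode_box(0.0, 0.0, 0.0, 0.0, 0.0, cell_x=3, cell_y=4, anchor_w=2.0, anchor_h=1.5))
```

With this parameterisation the network only learns offsets from a sensible starting shape, which is the point of pre-defining the boxes.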
Improvements (in v2)
• Multi-scale training: resize the input images randomly during training, with sizes in {320, 352, ..., 608}
  • The CNN only downsamples an image by a constant factor (here 32), hence it is robust to the input image size
  • a new size is chosen every 10 batches
• Passthrough layer
  • reshapes the feature map with no loss of information
• Odd number of grid cells

[Figure: feature map]
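The multi-scale schedule can be illustrated with a few lines of arithmetic. This sketch only enumerates the candidate input sizes and the grid each one produces. The names `SIZES`, `grid_cells` and `pick_size` are hypothetical, and the random choice stands in for whatever sampling a real training loop uses.

```python
import random

STRIDE = 32                                  # total downsampling factor of the CNN
SIZES = list(range(320, 608 + 1, STRIDE))    # {320, 352, ..., 608}


def grid_cells(image_size, stride=STRIDE):
    """Number of grid cells per side for a square input image."""
    return image_size // stride


def pick_size(rng):
    """Pick the input size for the next stretch of training."""
    return rng.choice(SIZES)


rng = random.Random(0)
size = pick_size(rng)
print(size, grid_cells(size))
print([(s, grid_cells(s)) for s in SIZES])
```

The default 416 x 416 input gives a 13 x 13 grid, an odd number of cells, so a single cell sits at the centre of the image.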
Training
• ImageNet: classification dataset
• COCO/PASCAL VOC: detection dataset

YOLO is trained in two steps.

Step 1 (ImageNet):
• train classification backbone

Step 2 (transfer learning, COCO/PASCAL VOC):
• remove head layers
• add regression as new head
• fine-tune backbone & train head

Training tricks
• decaying learning rate
• batch normalization
• data augmentation
Performance
Generalizability

Picasso & People-Art dataset

But ... no free lunch
• YOLO is not as accurate as RCNN-series models
  • multi-task problem: YOLO wins with fewer background errors, but loses on localization error.
• YOLO is poor at detecting small objects
  • CNN: training on ImageNet may not generalize well to small objects (classification)
  • the loss function equalizes location weights for small & large objects (localization)
• YOLO is not good at crowded objects
  • non-maximal suppression. See an improvement: Adaptive NMS (arXiv:1904.03629)
• YOLO is bad when encountering unusual aspect ratios
  • pre-defined anchors, or anchors learned from data. Go anchor-free (arXiv:1904.01355).
Security
CNNs (classification) can be fooled, and so can YOLO; the issues can be even worse.

Non-maximal suppression is fooled.

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples. arXiv:1902.02067
Is there anything helpful to improve?
Darwin's evolution

arXiv: 1807.05511
