Introduction to Robot Vision Concepts

The document provides an introduction to robot vision, covering various aspects of robotics including applications, trends, and types of robots. It discusses advancements in sensor technologies and artificial intelligence that are expanding the use of robots beyond factory settings into areas like personal services and medical applications. Additionally, it outlines key concepts in robotic manipulation and navigation, including SLAM and visual odometry.

Uploaded by

vu.nv100803
Copyright
© All Rights Reserved
Introduction to Robot Vision

Assoc. Prof. Nguyen Xuan Ha


Dept. Mechatronics, SME, HUST, C7-712M
[Link]@[Link]

1
Introduce yourself
• Name

• Major program

• Which year in the program?

• Why are you interested in robotics?

2
Outline
• Robotics applications and trends

• Robotic systems

• Robotic manipulator

• Robotic navigation
• SLAM
• Visual odometry

• About this course: content, grading, toolboxes

3
Robotics Yesterday

Kuka Robots

4
Current Trends in Robotics (1)
• Robots are moving away from factory floors to
• Entertainment, toys
• Personal services
• Medical, surgery
• Industrial automation (mining, harvesting, …)
• Hazardous environments (space, underwater)
• Why?
→ Advancement in
• Sensor technologies
• Computing methods/tools: hardware and software
• Artificial Intelligence

5
Current Trends in Robotics (2)

6
Robotics Today (1)

Collaborative Robot (CoBot) from Rethink Robotics

7
Robotics Today (2)

Self-driving Cars and AI-based Robots


8
Robotics Today (3)

Microrobotics (Uni Oldenburg, Germany)


9
Robotics Today (4)

Autonomous Intelligent Robots (Obelix from Uni Freiburg, Germany)

10


Applications of Autonomous Intelligent Robots

[Figure panels: Indoors, Undersea, Space, Underground]

11
Robots in Factories and Warehouses

Welding and Assembling Material Handling Delivering

12
Robots in Human Environments

Cleaning Robots Telepresence Robots Smart Speakers


How can we have more powerful robots assisting people at homes or offices?
• Mobile manipulators
• Humanoids

13
14
Amazon Astro

15
Google Everyday Robots

16
Tesla Bot

17
Future Intelligent Robots in Human Environments

[Figure panels: Assisting, Senior Care, Serving, Cooking, Cleaning, Dish washing]

18
Robot Types

19
Humanoid Robots
• A humanoid robot is a robot with its body shape built to resemble the
human body

Honda P series iCub robot

20
Robot Manipulators
• A device used to manipulate materials without direct physical contact by the operator

Franka Emika

21
Wheeled Robots
• Use wheels for locomotion
• Self-driving cars

Starship Technologies Amazon Astro Robot Perseverance Rover

22
Walking Robots
• Legged robots, use articulated limbs to provide locomotion

Boston Dynamics Robot Cassie

23
Boston Dynamics

24
Other Robots
• Flying robots
• Drones

• Swimming robots
• Underwater gliders Robotic Fish: iSplash-II

• Snake robots

Two robot snakes. The left one has 64 motors (with 2 degrees of freedom per segment), the right one 10.

25
Robots vs. Humans
• Sensing
• Robots: cameras, Inertial Measurement Units (IMUs), joint encoders
• Humans: vision, vestibular, proprioceptive senses

• Control
• Robots: motors
• Humans: muscles

• Computation
• Robots: robot brain, AI?
• Humans: human brain

26
What is a Robot?

27
What is a Robot?
• A robot is a machine capable of carrying out a complex series of actions
automatically (Wikipedia)

• A goal-oriented machine that can sense, plan and act

• A robot senses its environment and uses that information, together with a goal, to plan
some action

• The action might be to move the tool of an arm-robot to grasp an object, or it might be
to drive a mobile robot to some place

28
Robotic Systems

Tasks

Perception Planning Control

Learning

Sensing Action
World

29
Our Focus in this Course

• Vision for

• Robot Manipulation

• Robot Navigation

30
Robot Manipulation
• The ways robots interact with objects

• Examples
• Grasping an object
• Placing an object
• Pushing an object
• Opening a door
• Folding laundry
• Etc.

[Link]

31
Robot Manipulation

• Perception: robust and accurate (figure: sensed image)
• Planning: high degree of freedom (figure: planning scene)
• Control: contact with objects (figure: real-world execution)
• Multi-modal grasping

32
6D Object Pose Estimation for Robot Manipulation

2X

33
Robot Navigation
• Go from A to B without hitting anything

• Perception: simultaneous localization and mapping (SLAM)
• Planning: path planning
• Control: path following

[Figure: laser-based SLAM, 2D occupancy grid map]

34
The SLAM
36
Representations

• Grid maps or scans

[Lu & Milios, 97; Gutmann, 98; Thrun, 98; Burgard, 99; Konolige & Gutmann, 00; Thrun, 00; Arras, 99; Haehnel, 01; …]

• Landmark-based

[Leonard et al., 98; Castellanos et al., 99; Dissanayake et al., 2001; Montemerlo et al., 2002; …]

37
Structure of the Landmark-based SLAM-Problem

38
39
Example

40
SLAM is a hard problem
Reasons for Motion Errors

[Figure panels: ideal case, different wheel diameters, bump, carpet, and many more …]

42
Odometry
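• Odometry is a method of estimating a robot's position and orientation by integrating motion sensor data.
• Common sources are wheel encoders (wheel odometry), inertial sensors (IMUs), and cameras (visual odometry).
• In wheel odometry, the pose is estimated by integrating the wheel velocities measured by the wheel encoders over each time step $\Delta t$:
$$x_t = x_{t-1} + \Delta s \cos(\theta_{t-1})$$
$$y_t = y_{t-1} + \Delta s \sin(\theta_{t-1})$$
$$\theta_t = \theta_{t-1} + \frac{(v_r - v_l)\,\Delta t}{L}$$
where $v_r$, $v_l$ are the right/left wheel velocities, $L$ is the distance between the wheels, and $\Delta s = \frac{(v_r + v_l)\,\Delta t}{2}$ is the distance travelled by the midpoint of the wheel axis.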
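The wheel-odometry update can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: it assumes a differential-drive robot, takes Δs as the distance travelled by the midpoint of the wheel axis during one time step, and scales the heading change by the time step `dt`. The function name and parameters are hypothetical.

```python
import math

def wheel_odometry_step(x, y, theta, v_r, v_l, wheel_base, dt):
    """One dead-reckoning update for a differential-drive robot (illustrative)."""
    ds = 0.5 * (v_r + v_l) * dt              # distance travelled by the axis midpoint
    dtheta = (v_r - v_l) * dt / wheel_base   # heading change over the time step
    x_new = x + ds * math.cos(theta)
    y_new = y + ds * math.sin(theta)
    return x_new, y_new, theta + dtheta

# Drive straight for ten steps of 0.1 s with both wheels at 1 m/s.
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = wheel_odometry_step(*pose, v_r=1.0, v_l=1.0, wheel_base=0.5, dt=0.1)
print(pose)
```

Because each step adds to the previous estimate, any error in the wheel velocities accumulates over time, which is the drift that the later slides on motion errors illustrate.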

43
Example Wheel Encoders
These modules require +5V and GND
to power them, and provide a 0 to 5V
output. They provide +5V output when
they "see" white, and a 0V output when
they "see" black.

These disks are manufactured out of high


quality laminated color plastic to offer a
very crisp black to white transition. This
enables a wheel encoder sensor to easily
see the transitions.

Source: [Link]
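As a rough illustration of how such an encoder signal turns into a distance, the sketch below counts black/white transitions in a sampled voltage trace and converts them to travelled distance. The 2.5 V threshold, the number of transitions per revolution, and the wheel radius are assumptions chosen for the example, not values from the module's data sheet.

```python
import math

def count_transitions(voltages, threshold=2.5):
    """Count black/white edges in a sampled encoder signal (0 V = black, 5 V = white)."""
    levels = [v > threshold for v in voltages]
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)

def distance_from_transitions(n_transitions, transitions_per_rev, wheel_radius):
    """Convert a transition count to distance rolled by the wheel."""
    return 2.0 * math.pi * wheel_radius * n_transitions / transitions_per_rev

# Hypothetical samples: black, black, white, white, black, white.
samples = [0.0, 0.1, 4.9, 5.0, 0.2, 4.8]
n = count_transitions(samples)
print(n, distance_from_transitions(n, transitions_per_rev=32, wheel_radius=0.03))
```

A single sensor of this kind cannot tell forward from backward rotation, so the sketch only measures distance, not direction.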
44
Coordinate Systems

• In general, the configuration of a robot can be described by six parameters.
• Three-dimensional Cartesian coordinates plus three Euler angles: roll, pitch, and yaw.
• Throughout this section, we consider robots operating on a planar surface.
• The state space of such systems is three-dimensional: $(x, y, \theta)$.

45
Odometry Model

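• Robot moves from $\langle x, y, \theta \rangle$ to $\langle x', y', \theta' \rangle$.

• Odometry information $u = \langle \delta_{rot1}, \delta_{rot2}, \delta_{trans} \rangle$

$$\delta_{trans} = \sqrt{(x' - x)^2 + (y' - y)^2}$$
$$\delta_{rot1} = \operatorname{atan2}(y' - y,\; x' - x) - \theta$$
$$\delta_{rot2} = \theta' - \theta - \delta_{rot1}$$

[Figure: the motion from $\langle x, y, \theta \rangle$ to $\langle x', y', \theta' \rangle$ decomposed into $\delta_{rot1}$, $\delta_{trans}$, and $\delta_{rot2}$]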
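The decomposition of a planar motion into a first rotation, a translation, and a second rotation can be written directly from the three formulas. The sketch below is illustrative only; the function name is hypothetical, and angles are not wrapped to (-π, π], which a real implementation would usually do.

```python
import math

def odometry_from_poses(pose, pose_new):
    """Return (d_rot1, d_trans, d_rot2) for a move between two planar poses."""
    x, y, theta = pose
    x2, y2, theta2 = pose_new
    d_trans = math.hypot(x2 - x, y2 - y)
    d_rot1 = math.atan2(y2 - y, x2 - x) - theta
    d_rot2 = theta2 - theta - d_rot1
    return d_rot1, d_trans, d_rot2

# Move from the origin (facing +x) to (1, 1), ending up facing +y.
u = odometry_from_poses((0.0, 0.0, 0.0), (1.0, 1.0, math.pi / 2))
print(u)
```

In this example the robot first turns 45 degrees towards the target, drives the diagonal, and then turns another 45 degrees to reach its final heading.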
Noise Model for Odometry
• The measured motion is given by the true motion corrupted with noise.

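$$\hat{\delta}_{rot1} = \delta_{rot1} + \varepsilon_{\alpha_1 |\delta_{rot1}| + \alpha_2 |\delta_{trans}|}$$

$$\hat{\delta}_{trans} = \delta_{trans} + \varepsilon_{\alpha_3 |\delta_{trans}| + \alpha_4 |\delta_{rot1} + \delta_{rot2}|}$$

$$\hat{\delta}_{rot2} = \delta_{rot2} + \varepsilon_{\alpha_1 |\delta_{rot2}| + \alpha_2 |\delta_{trans}|}$$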
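A minimal sketch of this noise model follows. It assumes that the subscript of each noise term is the variance of a zero-mean Gaussian, and that the four α values are tuning parameters supplied by the user; both are assumptions made for illustration, and the function name is hypothetical.

```python
import math
import random

def sample_noisy_odometry(u, alphas, rng=random):
    """Corrupt (d_rot1, d_trans, d_rot2) with motion-dependent Gaussian noise."""
    d_rot1, d_trans, d_rot2 = u
    a1, a2, a3, a4 = alphas
    # Noise variances grow with the size of the commanded motion (assumed reading).
    var_rot1 = a1 * abs(d_rot1) + a2 * abs(d_trans)
    var_trans = a3 * abs(d_trans) + a4 * abs(d_rot1 + d_rot2)
    var_rot2 = a1 * abs(d_rot2) + a2 * abs(d_trans)
    return (
        d_rot1 + rng.gauss(0.0, math.sqrt(var_rot1)),
        d_trans + rng.gauss(0.0, math.sqrt(var_trans)),
        d_rot2 + rng.gauss(0.0, math.sqrt(var_rot2)),
    )

rng = random.Random(0)  # fixed seed so the example is repeatable
print(sample_noisy_odometry((0.1, 1.0, -0.2), (0.01, 0.01, 0.01, 0.01), rng))
```

Drawing many such samples for the same motion gives a cloud of plausible poses, which is how particle-filter-based methods represent motion uncertainty.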


Typical Distributions for Probabilistic Motion Models

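Normal distribution:
$$\varepsilon_{\sigma^2}(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\frac{x^2}{\sigma^2}}$$

Triangular distribution:
$$\varepsilon_{\sigma^2}(x) = \begin{cases} 0 & \text{if } |x| > \sqrt{6\sigma^2} \\ \dfrac{\sqrt{6\sigma^2} - |x|}{6\sigma^2} & \text{otherwise} \end{cases}$$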
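The two densities can be evaluated with a few lines of Python. This is a sketch under the assumption that both are zero-mean and parameterised by their variance; the function names are hypothetical. The numerical check at the end confirms that the triangular density integrates to approximately one.

```python
import math

def normal_pdf(x, var):
    """Zero-mean normal density with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def triangular_pdf(x, var):
    """Zero-mean triangular density with variance var."""
    half_width = math.sqrt(6.0 * var)
    if abs(x) > half_width:
        return 0.0
    return (half_width - abs(x)) / (6.0 * var)

# Riemann-sum check of the triangular density over [-3, 3] for var = 1.
step = 0.001
area = sum(triangular_pdf(-3.0 + i * step, 1.0) * step for i in range(6001))
print(normal_pdf(0.0, 1.0), triangular_pdf(0.0, 1.0), area)
```

The triangular distribution is sometimes preferred because it has bounded support and is cheap to sample, while matching the variance of the normal distribution.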
Examples (Odometry-Based)
Mapping with Raw Odometry

Exact map of the building Map derived from raw odometry

50
Typical Motion Models

• In practice, one often finds two types of motion models:


• Odometry-based
• Velocity-based (dead reckoning)
• Odometry-based models are used when systems are equipped
with wheel encoders.
• Velocity-based models have to be applied when no wheel
encoders are given.
• They calculate the new pose based on the velocities and the
time elapsed.
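A noise-free velocity-based update can be sketched as follows, assuming the robot moves with constant translational velocity `v` and rotational velocity `omega` during the interval `dt`, so that it travels along a circular arc (or a straight line when `omega` is close to zero). The function name and the small-`omega` threshold are illustrative choices.

```python
import math

def velocity_motion_step(x, y, theta, v, omega, dt):
    """New planar pose after moving with constant (v, omega) for dt seconds."""
    if abs(omega) < 1e-9:
        # Straight-line motion: no change in heading.
        return x + v * dt * math.cos(theta), y + v * dt * math.sin(theta), theta
    r = v / omega  # radius of the circular arc
    x_new = x - r * math.sin(theta) + r * math.sin(theta + omega * dt)
    y_new = y + r * math.cos(theta) - r * math.cos(theta + omega * dt)
    return x_new, y_new, theta + omega * dt

# Quarter circle: v = 1 m/s, omega = pi/2 rad/s, for one second.
print(velocity_motion_step(0.0, 0.0, 0.0, 1.0, math.pi / 2, 1.0))
```

Because commanded velocities are usually less accurate than encoder readings, this model tends to drift faster than the odometry-based one.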

51
Typical Measurement Errors of Range Measurements

1. Beams reflected by obstacles
2. Beams reflected by persons / caused by crosstalk
3. Random measurements
4. Maximum range measurements

52
Proximity Measurement

• Measurement can be caused by …


• a known obstacle.
• cross-talk.
• an unexpected obstacle (people, furniture, …).
• missing all obstacles (total reflection, glass, …).
• Noise is due to uncertainty …
• in measuring distance to known obstacle.
• in position of known obstacles.
• in position of additional obstacles.
• whether obstacle is missed.

53
Raw Sensor Data

Measured distances for expected distance of 300 cm.

Sonar Laser

54
Beam-based Proximity Model

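Measurement noise:
$$P_{hit}(z \mid x, m) = \eta\, \frac{1}{\sqrt{2\pi b}}\, e^{-\frac{1}{2}\frac{(z - z_{exp})^2}{b}}$$

Unexpected obstacles:
$$P_{unexp}(z \mid x, m) = \begin{cases} \eta\, \lambda\, e^{-\lambda z} & z < z_{exp} \\ 0 & \text{otherwise} \end{cases}$$

[Figure: both densities plotted over the range from 0 to $z_{max}$, with $z_{exp}$ marked]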

55
Beam-based Proximity Model

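Random measurement:
$$P_{rand}(z \mid x, m) = \eta\, \frac{1}{z_{max}}$$

Max range:
$$P_{max}(z \mid x, m) = \eta\, \frac{1}{z_{small}}$$

[Figure: both densities plotted over the range from 0 to $z_{max}$, with $z_{exp}$ marked]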

56
Approximation Results

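$$P(z \mid x, m) = \begin{pmatrix} \alpha_{hit} \\ \alpha_{unexp} \\ \alpha_{max} \\ \alpha_{rand} \end{pmatrix}^{T} \cdot \begin{pmatrix} P_{hit}(z \mid x, m) \\ P_{unexp}(z \mid x, m) \\ P_{max}(z \mid x, m) \\ P_{rand}(z \mid x, m) \end{pmatrix}$$

[Figure: approximation results for laser and sonar at 300 cm and 400 cm]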
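The beam-based model combines the four components as a weighted sum. The sketch below is a simplified illustration: it drops the normalisers η, treats the max-range component as a narrow uniform density of width `z_small` just below `z_max`, and uses made-up parameter values. These choices and all names are assumptions, not values fitted to real sensor data.

```python
import math

def beam_mixture(z, z_exp, z_max, weights, b, lam, z_small):
    """Unnormalised mixture likelihood of a range reading z (illustrative)."""
    a_hit, a_unexp, a_max, a_rand = weights
    # Gaussian around the expected distance (measurement noise, variance b).
    p_hit = math.exp(-0.5 * (z - z_exp) ** 2 / b) / math.sqrt(2.0 * math.pi * b)
    # Exponential decay for unexpected obstacles in front of the expected one.
    p_unexp = lam * math.exp(-lam * z) if z < z_exp else 0.0
    # Narrow uniform bump at the maximum range (assumed shape).
    p_max = 1.0 / z_small if z_max - z_small <= z <= z_max else 0.0
    # Uniform density over the whole measurement range.
    p_rand = 1.0 / z_max if 0.0 <= z <= z_max else 0.0
    return a_hit * p_hit + a_unexp * p_unexp + a_max * p_max + a_rand * p_rand

weights = (0.7, 0.2, 0.05, 0.05)  # hypothetical mixing weights, summing to 1
for z in (100.0, 300.0, 500.0):
    print(z, beam_mixture(z, z_exp=300.0, z_max=500.0, weights=weights,
                          b=25.0, lam=0.01, z_small=1.0))
```

In practice the mixing weights and the parameters `b` and `lam` are learned from recorded sensor data rather than set by hand.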

57
Why is SLAM a hard problem?

SLAM: robot path and map are both unknown

Robot path error correlates errors in the map


58
Why is SLAM a hard problem?

Robot pose
uncertainty

• In the real world, the mapping between observations and landmarks is unknown
• Picking wrong data associations can have catastrophic consequences
• Pose error correlates data associations

59
60
SLAM Procedure

61
Back-end
• The back-end performs inference on the abstracted data produced by the
front-end to estimate the map
• Probabilistic formulations: Extended Kalman Filters, Rao-Blackwellised
Particle Filters, and maximum likelihood estimation
• If the front-end is assumed to perform correct data association
→ this works well in many indoor SLAM applications
• What about dynamically changing environments?
→ Need a robust front-end/data association

62
Front-end: Data association

• Which observation belongs to which landmark?


• A robust SLAM solution must consider possible data associations
• Potential data associations depend also on the pose of the robot
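One simple way to picture data association is nearest-neighbour matching with a distance gate. The sketch below is a deliberately simplified, hypothetical example: it uses plain Euclidean distance instead of a statistical (Mahalanobis) gate, and observations are points in the robot frame. It shows how the chosen landmark depends on the assumed robot pose.

```python
import math

def associate(observation, pose, landmarks, gate):
    """Return the index of the nearest landmark within the gate, or None."""
    x, y, theta = pose
    ox, oy = observation  # observed point in the robot frame
    # Transform the observation into the world frame using the robot pose.
    wx = x + ox * math.cos(theta) - oy * math.sin(theta)
    wy = y + ox * math.sin(theta) + oy * math.cos(theta)
    best_index, best_dist = None, gate
    for index, (lx, ly) in enumerate(landmarks):
        dist = math.hypot(lx - wx, ly - wy)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

landmarks = [(2.0, 0.0), (0.0, 2.0)]
obs = (2.0, 0.0)  # a point seen 2 m straight ahead
print(associate(obs, (0.0, 0.0, 0.0), landmarks, gate=0.5))
print(associate(obs, (0.0, 0.0, math.pi / 2), landmarks, gate=0.5))
```

The same observation is matched to different landmarks under the two poses, which is why an error in the pose estimate can lead directly to a wrong association.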

63
Front-end: Loop Closure (1)
• Recognizing an already mapped area, typically after a long exploration
path (the robot “closes a loop”)

64
Front-end: Loop Closure (2)

65
Front-end: Loop Closure (3)

• Structurally identical to data association, but


• High levels of ambiguity
• Possibly useless validation gates
• Environment symmetries
• Uncertainties collapse after a loop closure (whether the closure
was correct or not)

66
Front-end: Loop Closure (4)

Before After
• By revisiting previously seen places, uncertainties in robot and landmark estimates can be reduced
• Loop closure supports the long-term robustness and convergence of SLAM
→ Need a robust data association method that takes loop closure into account

67
SLAM Example: Tennis Court (1)

[courtesy of J. Leonard]
68
SLAM Example: Tennis Court (2)

[courtesy of J. Leonard]
69
EKF SLAM Victoria Park Dataset

[courtesy of E. Nebot]
70
Victoria Park: Landmarks and Ground Truth

[courtesy of E. Nebot]
71
Victoria Park: Estimated Trajectory

[courtesy of E. Nebot]
72
Advantages and Disadvantages of Odometry
• Visual Odometry (VO)
• Based on cameras tracking feature point changes between frames
• Uses SLAM algorithms to determine position
• Lidar Odometry
• Uses Lidar sensors to scan the environment and determine position changes
• Combined with ICP (Iterative Closest Point) or LOAM (Lidar Odometry and Mapping)
algorithms
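The core of ICP can be illustrated in 2D: match each point of one scan to its nearest neighbour in the other, then compute the rigid transform that best aligns the matched pairs. The sketch below performs a single such iteration with brute-force matching. It is a simplified illustration (real ICP repeats these two steps until convergence, and LOAM works on 3D features), and all names are hypothetical.

```python
import math

def fit_rigid_2d(source, target):
    """Least-squares rotation and translation mapping source[i] onto target[i]."""
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy   # centred source point
        bx, by = qx - tx, qy - ty   # centred target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    return theta, tx - (c * sx - s * sy), ty - (s * sx + c * sy)

def icp_step(source, target):
    """One ICP iteration: nearest-neighbour matching, then rigid alignment."""
    matched = [min(target, key=lambda q: math.dist(p, q)) for p in source]
    return fit_rigid_2d(source, matched)

# Second scan = first scan rotated by 0.05 rad and shifted by (0.1, -0.05).
scan_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
c, s = math.cos(0.05), math.sin(0.05)
scan_b = [(c * x - s * y + 0.1, s * x + c * y - 0.05) for x, y in scan_a]
print(icp_step(scan_a, scan_b))
```

With a small motion between scans the nearest-neighbour matches are correct and one step recovers the transform; larger motions are where ICP needs a good initial guess and several iterations.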

73
What is Visual Odometry (VO) ?
• VO is the process of incrementally estimating the pose of the vehicle by
examining the changes that motion induces on the images of its onboard
cameras

74
Why Visual Odometry (VO) ?
• VO is crucial for flying, walking, and
underwater robots
• Contrary to wheel odometry, VO is not
affected by wheel slippage (e.g., on sand or
wet floor)
• Very accurate: relative position error is 0.1%
− 2% of the travelled distance
• VO can be used as a complement to
• wheel encoders (wheel odometry)
• GPS (when GPS is degraded)
• Inertial Measurement Units (IMUs)
• laser odometry

75
Assumptions
• Sufficient illumination in the environment
• Dominance of static scene over moving
objects
• Enough texture to allow apparent motion to
be extracted
• Sufficient scene overlap between
consecutive frames

76
A Brief history of VO
• 1980: First known real-time VO implementation on a robot, in Hans Moravec's PhD thesis (NASA/JPL), for Mars rovers using one sliding camera (sliding stereo).
• 1980 to 2000: The VO research was dominated
by NASA/JPL in preparation of the 2004 mission
to Mars
• 2004: VO was used on a robot on another planet:
Mars rovers Spirit and Opportunity
• 2004: VO was revived in the academic
environment by David Nister’s «Visual
Odometry» paper. The term VO became popular.
• 2015-today: VO becomes a fundamental tool of
several products: VR/AR, drones, smartphones
• 2021: VO is used on the Mars helicopter

77
VO vs VSLAM vs SFM

78
Structure from Motion (SFM)
• SFM is more general than VO and tackles the problem of 3D
reconstruction and 6DOF pose estimation from unordered image sets

Reconstruction from 3 million images from [Link] on a cluster of 250 computers, 24 hours of computation
Paper: “Building Rome in a Day”, ICCV’09. State of the art software: COLMAP

79
VO vs SFM
• VO is a particular case of SFM
• VO focuses on estimating the 6DoF motion of the camera sequentially
(as a new frame arrives) and in real time
• Terminology: sometimes SFM is used as a synonym of VO

80
VO vs. Visual SLAM
• Visual Odometry
• Focuses on incremental estimation
• Guarantees local consistency (i.e., the estimated trajectory is locally correct, but not globally, i.e., from the start to the end)
• Visual SLAM (Simultaneous Localization And Mapping)
• SLAM = visual odometry + loop detection & closure
• Guarantees global consistency (the estimated trajectory is globally correct, i.e., from the start to the end)

81
VSLAM ⊆ VO ⊆ SFM
Why?
• because every VSLAM and every VO is an SFM, but not every SFM is a VO or a VSLAM
• because every VSLAM is a VO, but not every VO is a VSLAM
• VSLAM applies more stringent requirements,
such as loop detection and closure, than VO,
making it a particular case. Every VSLAM
functions as a VO, given that VSLAM, like VO,
incrementally estimates poses (although Bundle
Adjustment may further refine these
estimations). Moreover, if VSLAM achieves
global consistency, it inherently ensures local
consistency as well.

82
VO Flow Chart
VO computes the camera path incrementally (pose after pose)
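The incremental nature of VO can be sketched as chaining relative motions: each new frame yields a transform relative to the previous pose, and the current pose is the product of all transforms so far. For brevity the example below works in the plane with 3x3 homogeneous matrices rather than full 6DoF poses, and it assumes the relative motions have already been estimated from the images. All names are hypothetical.

```python
import math

def se2(dx, dy, dtheta):
    """Homogeneous 3x3 matrix for a planar motion expressed in the current frame."""
    c, s = math.cos(dtheta), math.sin(dtheta)
    return [[c, -s, dx], [s, c, dy], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def accumulate_path(relative_motions):
    """Chain frame-to-frame motions into a list of absolute poses."""
    pose = se2(0.0, 0.0, 0.0)  # start at the origin
    path = [pose]
    for motion in relative_motions:
        pose = matmul(pose, motion)  # pose after pose
        path.append(pose)
    return path

# Four identical steps: move 1 m forward, then turn left by 90 degrees.
path = accumulate_path([se2(1.0, 0.0, math.pi / 2)] * 4)
for pose in path:
    print(round(pose[0][2], 6), round(pose[1][2], 6))
```

Since every pose is built on the previous one, a small error in any relative motion propagates to all later poses, which is why VO alone is only locally consistent.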

86
Course Topics
Week: Content
Week 1: Introduction to robot vision. Applications and trends
Week 2: Basic image processing techniques 1: image formation, camera model, perspective projection, image interpolation
Week 3: Basic image processing techniques 2: noise filtering, edge detection
• Exercise 1: Camera notation
Week 4: Camera calibration
• Exercise 2: PnP
Week 5: Feature extraction and image matching 1
• Exercise 3: Harris operator, descriptors, and image matching
Week 6: Feature extraction and image matching 2
• Exercise 4: SIFT, descriptors, and image matching
Week 7: Motion and object tracking
• Exercise 5: Kalman tracker, LKT
Week 8: Stereo/3D vision
• Exercise 6: Stereo vision

87
Course Topics
Week: Content
Week 9:
Week 10: Applying vision to autonomous mobile robot navigation
• Structure from motion
• Visual Odometry
• Visual SLAM
Week 11: Exercise 7: 8-point algorithm
Exercise 8: P3P algorithm and RANSAC
Exercise 9: Bundle Adjustment
• Visual Inertial Fusion
Week 12: Exercise 10: Visual Inertial Fusion

Week 13: Machine learning for robot vision: SVM, KNN, and decision tree models
Week 14: Artificial neural networks in vision: CNN, ResNet, MobileNet
Week 15: Object detection in real environments: YOLO, Faster R-CNN
Week 16: Object recognition and localization, and vision-based control of a robot arm
Week 17: Reports, assessment, and course wrap-up

89
Prerequisites

• Linear algebra
• Matrix calculus: matrix multiplication, inversion, singular value
decomposition
• Check out this Linear Algebra Primer from Stanford University
[Link]
• Check out this Immersive Linear Algebra interactive tool by Ström, Åström,
and Akenine-Möller
[Link]

• No prior knowledge of computer vision and image processing is required

90
Exercises (1)
• Learning Goal of the exercises: Implement
a full visual odometry pipeline (like the one
running on Mars rovers).
• Each week you will learn how to implement
a building block of visual odometry.
• Two exercises will be dedicated to system
integration.

91
Exercises (2)
• Bring your own laptop
• Exercises in Python or Matlab. You will need to have Matlab or Python installed on your machine before the exercises.
• For Matlab, please install all the toolboxes included in the license. If you don't have enough space on your PC, install at least the Image Processing, Computer Vision, and Optimization toolboxes.

92
References

[1] Richard Szeliski (2022). Computer Vision: Algorithms and Applications. Springer Cham.
[2] Sebastian Thrun, Wolfram Burgard, Dieter Fox (2005). Probabilistic Robotics. MIT Press.
[3] Roland Siegwart, Illah Reza Nourbakhsh, and Davide Scaramuzza (2011). Introduction to Autonomous Mobile Robots, Second Edition. MIT Press.
[4] Peter Corke (2023). Robotics, Vision and Control: Fundamental Algorithms, 2nd edition. Springer Nature.
[5] Adrian Rosebrock (2017). Deep Learning for Computer Vision with Python: Starter Bundle. PyImageSearch.

93
Grading Policy
Columns: [1] Grade component, [2] Specific assessment method, [3] Description, [4] Learning outcomes assessed, [5] Weight

• A1. Continuous assessment score (*): continuous assessment; weight 40%
• A1.1. In-class discussion: presentation; learning outcomes assessed: M1, M2, M3; weight 10%
• A1.2. Midterm test: multiple-choice exercise, written exercise, or essay; learning outcomes assessed: M1, M2; weight 30%

• A2. Final score: weight 60%
• A2.1. Final exam: written exam, project with report, multiple-choice exam, or oral exam; learning outcomes assessed: M2, M3; weight 60%

94
Questions?

95
The Pin-hole approximation

97
