Lecture3 Booksection
Lecture3 Booksection
dimensional space. Such generalities are avoided in this book because their current
applications are limited. Unless otherwise stated, the world generally contains two
kinds of entities:
1. Obstacles: Portions of the world that are “permanently” occupied, for
example, as in the walls of a building.
Chapter 3 2. Robots: Bodies that are modeled geometrically and are controllable via a
motion plan.
Geometric Representations and Based on the terminology, one obvious application is to model a robot that moves
around in a building; however, many other possibilities exist. For example, the
Transformations robot could be a flexible molecule, and the obstacles could be a folded protein.
As another example, the robot could be a virtual human in a graphical simulation
that involves obstacles (imagine the family of Doom-like video games).
This section presents a method for systematically constructing representations
This chapter provides important background material that will be needed for Part of obstacles and robots using a collection of primitives. Both obstacles and robots
II. Formulating and solving motion planning problems require defining and manip- will be considered as (closed) subsets of W. Let the obstacle region O denote the
ulating complicated geometric models of a system of bodies in space. Section 3.1 set of all points in W that lie in one or more obstacles; hence, O ⊆ W. The
introduces geometric modeling, which focuses mainly on semi-algebraic modeling next step is to define a systematic way of representing O that has great expressive
because it is an important part of Chapter 6. If your interest is mainly in Chapter power while being computationally efficient. Robots will be defined in a similar
5, then understanding semi-algebraic models is not critical. Sections 3.2 and 3.3 way; however, this will be deferred until Section 3.2, where transformations of
describe how to transform a single body and a chain of bodies, respectively. This geometric bodies are defined.
will enable the robot to “move.” These sections are essential for understanding
all of Part II and many sections beyond. It is expected that many readers will al-
3.1.1 Polygonal and Polyhedral Models
ready have some or all of this background (especially Section 3.2, but it is included
for completeness). Section 3.4 extends the framework for transforming chains of In this and the next subsection, a solid representation of O will be developed in
bodies to transforming trees of bodies, which allows modeling of complicated sys- terms of a combination of primitives. Each primitive Hi represents a subset of W
tems, such as humanoid robots and flexible organic molecules. Finally, Section 3.5 that is easy to represent and manipulate in a computer. A complicated obstacle
briefly covers transformations that do not assume each body is rigid. region will be represented by taking finite, Boolean combinations of primitives.
Using set theory, this implies that O can also be defined in terms of a finite
number of unions, intersections, and set differences of primitives.
3.1 Geometric Modeling
Convex polygons First consider O for the case in which the obstacle region is
A wide variety of approaches and techniques for geometric modeling exist, and
a convex, polygonal subset of a 2D world, W = R2 . A subset X ⊂ Rn is called
the particular choice usually depends on the application and the difficulty of the
convex if and only if, for any pair of points in X, all points along the line segment
problem. In most cases, there are generally two alternatives: 1) a boundary repre-
that connects them are contained in X. More precisely, this means that for any
sentation, and 2) a solid representation. Suppose we would like to define a model
x1 , x2 ∈ X and λ ∈ [0, 1],
of a planet. Using a boundary representation, we might write the equation of a
λx1 + (1 − λ)x2 ∈ X. (3.1)
sphere that roughly coincides with the planet’s surface. Using a solid representa-
tion, we would describe the set of all points that are contained in the sphere. Both Thus, interpolation between x1 and x2 always yields points in X. Intuitively, X
alternatives will be considered in this section. contains no pockets or indentations. A set that is not convex is called nonconvex
The first step is to define the world W for which there are two possible choices: (as opposed to concave, which seems better suited for lenses).
1) a 2D world, in which W = R2 , and 2) a 3D world, in which W = R3 . These A boundary representation of O is an m-sided polygon, which can be described
choices should be sufficient for most problems; however, one might also want to using two kinds of features: vertices and edges. Every vertex corresponds to a
allow more complicated worlds, such as the surface of a sphere or even a higher “corner” of the polygon, and every edge corresponds to a line segment between a
3.1. GEOMETRIC MODELING 83 84 S. M. LaValle: Planning Algorithms
+ +
+
− + +
− − +
− +
− +
− −
−
− −
Figure 3.2: The sign of the f (x, y) partitions R2 into three regions: two half-planes
given by f (x, y) < 0 and f (x, y) > 0, and the line f (x, y) = 0.
line fi (x, y) = 0 (including the points on the line). A convex, m-sided, polygonal
obstacle region O is expressed as
O = H1 ∩ H2 ∩ · · · ∩ Hm . (3.3)
The right part may be alternatively represented as −f (x, y, z) ≤ 0, and −f may Figure 3.5: A polygon with holes can be expressed by using different orientations:
be considered as a new polynomial function of x, y, and z. For an example that counterclockwise for the outer boundary and clockwise for the hole boundaries.
involves the = relation, consider the primitive Note that the shaded part is always to the left when following the arrows.
H = {(x, y, z) ∈ W | f (x, y, z) = 0}. (3.15) Nonconvex polygons and polyhedra The method in Section 3.1.1 required
nonconvex polygons to be represented as a union of convex polygons. Instead, a
It can instead be constructed as H = H1 ∩ H2 , in which boundary representation of a nonconvex polygon may be directly encoded by list-
ing vertices in a specific order; assume that counterclockwise order is used. Each
H1 = {(x, y, z) ∈ W | f (x, y, z) ≤ 0} (3.16) polygon of m vertices may be encoded by a list of the form (x1 , y1 ), (x2 , y2 ), . . .,
(xm , ym ). It is assumed that there is an edge between each (xi , yi ) and (xi+1 , yi+1 )
for each i from 1 to m−1, and also an edge between (xm , ym ) and (x1 , y1 ). Ordinar-
and
ily, the vertices should be chosen in a way that makes the polygon simple, meaning
H2 = {(x, y, z) ∈ W | − f (x, y, z) ≤ 0}. (3.17) that no edges intersect. In this case, there is a well-defined interior of the polygon,
which is to the left of every edge, if the vertices are listed in counterclockwise
The relation < does add some expressive power if it is used to construct primitives.2 order.
It is needed to construct models that do not include the outer boundary (for What if a polygon has a hole in it? In this case, the boundary of the hole
example, the set of all points inside of a sphere, which does not include points on can be expressed as a polygon, but with its vertices appearing in the clockwise
the sphere). These are generally called open sets and are defined Chapter 4. direction. To the left of each edge is the interior of the outer polygon, and to the
right is the hole, as shown in Figure 3.5
Although the data structures are a little more complicated for three dimen-
3.1.3 Other Models sions, boundary representations of nonconvex polyhedra may be expressed in a
The choice of a model often depends on the types of operations that will be per- similar manner. In this case, instead of an edge list, one must specify faces, edges,
formed by the planning algorithm. For combinatorial motion planning methods, and vertices, with pointers that indicate their incidence relations. Consistent ori-
to be covered in Chapter 6, the particular representation is critical. On the other entations must also be chosen, and holes may be modeled once again by selecting
hand, for sampling-based planning methods, to be covered in Chapter 5, the par- opposite orientations.
ticular representation is important only to the collision detection algorithm, which
is treated as a “black box” as far as planning is concerned. Therefore, the models 3D triangles Suppose W = R3 . One of the most convenient geometric models
given in the remainder of this section are more likely to appear in sampling-based to express is a set of triangles, each of which is specified by three points, (x1 , y1 , z1 ),
approaches and may be invisible to the designer of a planning algorithm (although (x2 , y2 , z2 ), (x3 , y3 , z3 ). This model has been popular in computer graphics because
it is never wise to forget completely about the representation). graphics acceleration hardware primarily uses triangle primitives. It is assumed
that the interior of the triangle is part of the model. Thus, two triangles are
2
An alternative that yields the same expressive power is to still use ≤, but allow set comple- considered as “colliding” if one pokes into the interior of another. This model offers
ments, in addition to unions and intersections. great flexibility because there are no constraints on the way in which triangles must
3.1. GEOMETRIC MODELING 91 92 S. M. LaValle: Planning Algorithms
each “1” in the image corresponds to a rectangular region that contains at least
one point of O, and “0” represents those that do not contain any of O. Although
bitmaps do not have the elegance of the other models, they often arise in applica-
Figure 3.6: Triangle strips and triangle fans can reduce the number of redundant tions. One example is a digital map constructed by a mobile robot that explores
points. an environment with its sensors. One generalization of bitmaps is a gray-scale
map or occupancy grid. In this case, a numerical value may be assigned to each
be expressed; however, this is also one of the drawbacks. There is no coherency cell, indicating quantities such as “the probability that an obstacle exists” or the
that can be exploited to easily declare whether a point is “inside” or “outside” of “expected difficulty of traversing the cell.” The latter interpretation is often used
a 3D obstacle. If there is at least some coherency, then it is sometimes preferable in terrain maps for navigating planetary rovers.
to reduce redundancy in the specification of triangle coordinates (many triangles
will share the same corners). Representations that remove this redundancy are Superquadrics Instead of using polynomials to define fi , many generalizations
called a triangle strip, which is a sequence of triangles such that each adjacent can be constructed. One popular primitive is a superquadric, which generalizes
pair shares a common edge, and a triangle fan, which is a triangle strip in which quadric surfaces. One example is a superellipsoid, which is given for W = R3 by
all triangles share a common vertex. See Figure 3.6.
n1 /n2
|x/a|n1 + |y/b|n2 + |z/c|n1 − 1 ≤ 0, (3.20)
Nonuniform rational B-splines (NURBS) These are used in many engi-
neering design systems to allow convenient design and adjustment of curved sur- in which n1 ≥ 2 and n2 ≥ 2. If n1 = n2 = 2, an ellipse is generated. As n1 and n2
faces, in applications such as aircraft or automobile body design. In contrast to increase, the superellipsoid becomes shaped like a box with rounded corners.
semi-algebraic models, which are implicit equations, NURBS and other splines are
parametric equations. This makes computations such as rendering easier; however, Generalized cylinders A generalized cylinder is a generalization of an ordinary
others, such as collision detection, become more difficult. These models may be cylinder. Instead of being limited to a line, the center axis is a continuous spine
defined in any dimension. A brief 2D formulation is given here. curve, (x(s), y(s), z(s)), for some parameter s ∈ [0, 1]. Instead of a constant radius,
A curve can be expressed as a radius function r(s) is defined along the spine. The value r(s) is the radius of
n
X the circle obtained as the cross section of the generalized cylinder at the point
wi Pi Ni,k (u) (x(s), y(s), z(s)). The normal to the cross-section plane is the tangent to the spine
i=0 curve at s.
C(u) = n , (3.18)
X
wi Ni,k (u)
i=0 3.2 Rigid-Body Transformations
in which wi ∈ R are weights and Pi are control points. The Ni,k are normalized
Any of the techniques from Section 3.1 can be used to define both the obstacle
basis functions of degree k, which can be expressed recursively as
region and the robot. Let O refer to the obstacle region, which is a subset of W.
u − ti ti+k+1 − u Let A refer to the robot, which is a subset of R2 or R3 , matching the dimension
Ni,k (u) = Ni,k−1 (u) + Ni+1,k−1 (u). (3.19) of W. Although O remains fixed in the world, W, motion planning problems will
ti+k − ti ti+k+1 − ti+1
require “moving” the robot, A.
The basis of the recursion is Ni,0 (u) = 1 if ti ≤ u < ti+1 , and Ni,0 (u) = 0 otherwise.
A knot vector is a nondecreasing sequence of real values, {t0 , t1 , . . . , tm }, that
controls the intervals over which certain basic functions take effect. 3.2.1 General Concepts
Before giving specific transformations, it will be helpful to define them in general to
Bitmaps For either W = R2 or W = R3 , it is possible to discretize a bounded avoid confusion in later parts when intuitive notions might fall apart. Suppose that
portion of the world into rectangular cells that may or may not be occupied. a rigid robot, A, is defined as a subset of R2 or R3 . A rigid-body transformation is
The resulting model looks very similar to Example 2.1. The resolution of this a function, h : A → W, that maps every point of A into W with two requirements:
discretization determines the number of cells per axis and the quality of the ap- 1) The distance between any pair of points of A must be preserved, and 2) the
proximation. The representation may be considered as a binary image in which orientation of A must be preserved (no “mirror images”).
3.2. RIGID-BODY TRANSFORMATIONS 93 94 S. M. LaValle: Planning Algorithms
Using standard function notation, h(a) for some a ∈ A refers to the point in Defining frames It was assumed so far that A is defined in R2 or R3 , but before
W that is “occupied” by a. Let it is transformed, it is not considered to be a subset of W. The transformation h
places the robot in W. In the coming material, it will be convenient to indicate
h(A) = {h(a) ∈ W | a ∈ A}, (3.21)
this distinction using coordinate frames. The origin and coordinate basis vectors
which is the image of h and indicates all points in W occupied by the transformed of W will be referred to as the world frame.3 Thus, any point w ∈ W is expressed
robot. in terms of the world frame.
The coordinates used to define A are initially expressed in the body frame,
Transforming the robot model Consider transforming a robot model. If A which represents the origin and coordinate basis vectors of R2 or R3 . In the case
is expressed by naming specific points in R2 , as in a boundary representation of a of A ⊂ R2 , it can be imagined that the body frame is painted on the robot.
polygon, then each point is simply transformed from a to h(a) ∈ W. In this case, Transforming the robot is equivalent to converting its model from the body frame
it is straightforward to transform the entire model using h. However, there is a to the world frame. This has the effect of placing4 A into W at some position
slight complication if the robot model is expressed using primitives, such as and orientation. When multiple bodies are covered in Section 3.3, each body will
have its own body frame, and transformations require expressing all bodies with
Hi = {a ∈ R2 | fi (a) ≤ 0}. (3.22) respect to the world frame.
2
This differs slightly from (3.2) because the robot is defined in R (which is not
necessarily W), and also a is used to denote a point (x, y) ∈ A. Under a transfor- 3.2.2 2D Transformations
mation h, the primitive is transformed as
Translation A rigid robot A ⊂ R2 is translated by using two parameters, xt , yt ∈
h(Hi ) = {h(a) ∈ W | fi (a) ≤ 0}. (3.23) R. Using definitions from Section 3.2.1, q = (xt , yt ), and h is defined as
To transform the primitive completely, however, it is better to directly name points h(x, y) = (x + xt , y + yt ). (3.25)
in w ∈ W, as opposed to h(a) ∈ W. Using the fact that a = h−1 (w), this becomes
A boundary representation of A can be translated by transforming each vertex in
h(Hi ) = {w ∈ W | fi (h−1 (w)) ≤ 0}, (3.24) the sequence of polygon vertices using (3.25). Each point, (xi , yi ), in the sequence
in which the inverse of h appears in the right side because the original point a ∈ A is replaced by (xi + xt , yi + yt ).
needs to be recovered to evaluate fi . Therefore, it is important to be careful Now consider a solid representation of A, defined in terms of primitives. Each
because either h or h−1 may be required to transform the model. This will be primitive of the form
observed in more specific contexts in some coming examples. Hi = {(x, y) ∈ R2 | f (x, y) ≤ 0} (3.26)
Using the notation of Section 3.2.1, R(θ) becomes h(q), for which q = θ. For
(a) Translation of the robot (b) Translation of the frame linear transformations, such as the one defined by (3.32), recall that the column
vectors represent the basis vectors of the new coordinate frame. The column
vectors of R(θ) are unit vectors, and their inner product (or dot product) is zero,
Figure 3.7: Every transformation has two interpretations. indicating that they are orthogonal. Suppose that the x and y coordinate axes,
which represent the body frame, are “painted” on A. The columns of R(θ) can be
derived by considering the resulting directions of the x- and y-axes, respectively,
which is the familiar equation for a disc centered at (xt , yt ). In this example, the after performing a counterclockwise rotation by the angle θ. This interpretation
inverse, h−1 is used, as described in Section 3.2.1. generalizes nicely for higher dimensional rotation matrices.
Note that the rotation is performed about the origin. Thus, when defining the
model of A, the origin should be placed at the intended axis of rotation. Using
The translated robot is denoted as A(xt , yt ). Translation by (0, 0) is the iden- the semi-algebraic model, the entire robot model can be rotated by transforming
tity transformation, which results in A(0, 0) = A, if it is assumed that A ⊂ W each primitive, yielding A(θ). The inverse rotation, R(−θ), must be applied to
(recall that A does not necessarily have to be initially embedded in W). It will be each primitive.
convenient to use the term degrees of freedom to refer to the maximum number of
independent parameters that are needed to completely characterize the transfor- Combining translation and rotation Suppose a rotation by θ is performed,
mation applied to the robot. If the set of allowable values for xt and yt forms a followed by a translation by xt , yt . This can be used to place the robot in any
two-dimensional subset of R2 , then the degrees of freedom is two. desired position and orientation. Note that translations and rotations do not
Suppose that A is defined directly in W with translation. As shown in Figure commute! If the operations are applied successively, each (x, y) ∈ A is transformed
3.7, there are two interpretations of a rigid-body transformation applied to A: 1) to
x cos θ − y sin θ + xt
The world frame remains fixed and the robot is transformed; 2) the robot remains . (3.33)
x sin θ + y cos θ + yt
fixed and the world frame is translated. The first one characterizes the effect of
the transformation from a fixed world frame, and the second one indicates how The following matrix multiplication yields the same result for the first two vector
the transformation appears from the robot’s perspective. Unless stated otherwise, components:
the first interpretation will be used when we refer to motion planning problems
cos θ − sin θ xt x x cos θ − y sin θ + xt
because it often models a robot moving in a physical world. Numerous books cover sin θ cos θ yt y = x sin θ + y cos θ + yt . (3.34)
coordinate transformations under the second interpretation. This has been known
0 0 1 1 1
to cause confusion because the transformations may sometimes appear “backward”
from what is desired in motion planning. This implies that the 3 × 3 matrix,
cos θ − sin θ xt
Rotation The robot, A, can be rotated counterclockwise by some angle θ ∈ T = sin θ cos θ yt , (3.35)
[0, 2π) by mapping every (x, y) ∈ A as 0 0 1
represents a rotation followed by a translation. The matrix T will be referred to
(x, y) 7→ (x cos θ − y sin θ, x sin θ + y cos θ). (3.30) as a homogeneous transformation matrix. It is important to remember that T
3.2. RIGID-BODY TRANSFORMATIONS 97 98 S. M. LaValle: Planning Algorithms
z Yaw, pitch, and roll rotations A 3D body can be rotated about three orthog-
α onal axes, as shown in Figure 3.8. Borrowing aviation terminology, these rotations
Yaw will be referred to as yaw, pitch, and roll:
matrix would result. Be careful when interpreting the rotations. Consider the The homogeneous transformation matrix for 3D bodies As in the 2D
final rotation, a yaw by α. Imagine sitting inside of a robot A that looks like case, a homogeneous transformation matrix can be defined. For the 3D case, a
an aircraft. If β = γ = 0, then the yaw turns the plane in a way that feels 4 × 4 matrix is obtained that performs the rotation given by R(α, β, γ), followed
like turning a car to the left. However, for arbitrary values of β and γ, the final by a translation given by xt , yt , zt . The result is
rotation axis will not be vertically aligned with the aircraft because the aircraft is
left in an unusual orientation before α is applied. The yaw rotation occurs about cos α cos β cos α sin β sin γ − sin α cos γ cos α sin β cos γ + sin α sin γ xt
the z-axis of the world frame, not the body frame of A. Each time a new rotation sin α cos β sin α sin β sin γ + cos α cos γ sin α sin β cos γ − cos α sin γ yt
T = .
matrix is introduced from the left, it has no concern for original body frame of − sin β cos β sin γ cos β cos γ z t
0 0 0 1
A. It simply rotates every point in R3 in terms of the world frame. Note that 3D
(3.50)
rotations depend on three parameters, α, β, and γ, whereas 2D rotations depend
Once again, the order of operations is critical. The matrix T in (3.50) represents
only on a single parameter, θ. The primitives of the model can be transformed
the following sequence of transformations:
using R(α, β, γ), resulting in A(α, β, γ).
1. Roll by γ 3. Yaw by α
Determining yaw, pitch, and roll from a rotation matrix It is often con- 2. Pitch by β 4. Translate by (xt , yt , zt ).
venient to determine the α, β, and γ parameters directly from a given rotation
matrix. Suppose an arbitrary rotation matrix The robot primitives can be transformed to yield A(xt , yt , zt , α, β, γ). A 3D rigid
body that is capable of translation and rotation therefore has six degrees of free-
r11 r12 r13 dom.
r21 r22 r23 (3.43)
r31 r32 r33
is given. By setting each entry equal to its corresponding entry in (3.42), equations 3.3 Transforming Kinematic Chains of Bodies
are obtained that must be solved for α, β, p and γ. Note that r21 /r11 = tan α and
2 2
The transformations become more complicated for a chain of attached rigid bodies.
r32 /r33 = tan γ. Also, r31 = − sin β and r32 + r33 = cos β. Solving for each For convenience, each rigid body is referred to as a link. Let A1 , A2 , . . . , Am denote
angle yields a set of m links. For each i such that 1 ≤ i < m, link Ai is “attached” to link Ai+1
α = tan−1 (r21 /r11 ), (3.44) in a way that allows Ai+1 some constrained motion with respect to Ai . The motion
−1
q 2 2
constraint must be explicitly given, and will be discussed shortly. As an example,
β = tan − r31 r32 + r33 , (3.45)
imagine a trailer that is attached to the back of a car by a hitch that allows the
and trailer to rotate with respect to the car. In general, a set of attached bodies will
γ = tan−1 (r32 /r33 ). (3.46) be referred to as a linkage. This section considers bodies that are attached in a
There is a choice of four quadrants for the inverse tangent functions. How can single chain. This leads to a particular linkage called a kinematic chain.
the correct quadrant be determined? Each quadrant should be chosen by using
the signs of the numerator and denominator of the argument. The numerator sign 3.3.1 A 2D Kinematic Chain
selects whether the direction will be above or below the x-axis, and the denomi-
nator selects whether the direction will be to the left or right of the y-axis. This Before considering a kinematic chain, suppose A1 and A2 are unattached rigid
is the same as the atan2 function in the C programming language, which nicely bodies, each of which is capable of translating and rotating in W = R2 . Since
expands the range of the arctangent to [0, 2π). This can be applied to express each body has three degrees of freedom, there is a combined total of six degrees
(3.44), (3.45), and (3.46) as of freedom; the independent parameters are x1 , y1 , θ1 , x2 , y2 , and θ2 .
α = atan2(r21 , r11 ), (3.47)
q Attaching bodies When bodies are attached in a kinematic chain, degrees of
β = atan2 − r31 , r32 2 2
+ r33 , (3.48) freedom are removed. Figure 3.9 shows two different ways in which a pair of 2D
links can be attached. The place at which the links are attached is called a joint.
and For a revolute joint, one link is capable only of rotation with respect to the other.
γ = atan2(r32 , r33 ). (3.49) For a prismatic joint is shown, one link slides along the other. Each type of joint
Note that this method assumes r11 6= 0 and r33 6= 0. removes two degrees of freedom from the pair of bodies. For example, consider a