0% found this document useful (0 votes)
6 views10 pages

Lecture3 Booksection

This chapter discusses geometric representations and transformations necessary for motion planning in robotics. It introduces obstacles and robots as geometric entities and outlines methods for modeling them using primitives, focusing on both convex and nonconvex polygonal and polyhedral representations. The chapter emphasizes the importance of efficient geometric modeling for formulating and solving motion planning problems.

Uploaded by

mkhedr417
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views10 pages

Lecture3 Booksection

This chapter discusses geometric representations and transformations necessary for motion planning in robotics. It introduces obstacles and robots as geometric entities and outlines methods for modeling them using primitives, focusing on both convex and nonconvex polygonal and polyhedral representations. The chapter emphasizes the importance of efficient geometric modeling for formulating and solving motion planning problems.

Uploaded by

mkhedr417
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

82 S. M.

LaValle: Planning Algorithms

dimensional space. Such generalities are avoided in this book because their current
applications are limited. Unless otherwise stated, the world generally contains two
kinds of entities:
1. Obstacles: Portions of the world that are “permanently” occupied, for
example, as in the walls of a building.
Chapter 3 2. Robots: Bodies that are modeled geometrically and are controllable via a
motion plan.

Geometric Representations and Based on the terminology, one obvious application is to model a robot that moves
around in a building; however, many other possibilities exist. For example, the
Transformations robot could be a flexible molecule, and the obstacles could be a folded protein.
As another example, the robot could be a virtual human in a graphical simulation
that involves obstacles (imagine the family of Doom-like video games).
This section presents a method for systematically constructing representations
This chapter provides important background material that will be needed for Part of obstacles and robots using a collection of primitives. Both obstacles and robots
II. Formulating and solving motion planning problems require defining and manip- will be considered as (closed) subsets of W. Let the obstacle region O denote the
ulating complicated geometric models of a system of bodies in space. Section 3.1 set of all points in W that lie in one or more obstacles; hence, O ⊆ W. The
introduces geometric modeling, which focuses mainly on semi-algebraic modeling next step is to define a systematic way of representing O that has great expressive
because it is an important part of Chapter 6. If your interest is mainly in Chapter power while being computationally efficient. Robots will be defined in a similar
5, then understanding semi-algebraic models is not critical. Sections 3.2 and 3.3 way; however, this will be deferred until Section 3.2, where transformations of
describe how to transform a single body and a chain of bodies, respectively. This geometric bodies are defined.
will enable the robot to “move.” These sections are essential for understanding
all of Part II and many sections beyond. It is expected that many readers will al-
3.1.1 Polygonal and Polyhedral Models
ready have some or all of this background (especially Section 3.2, but it is included
for completeness). Section 3.4 extends the framework for transforming chains of In this and the next subsection, a solid representation of O will be developed in
bodies to transforming trees of bodies, which allows modeling of complicated sys- terms of a combination of primitives. Each primitive Hi represents a subset of W
tems, such as humanoid robots and flexible organic molecules. Finally, Section 3.5 that is easy to represent and manipulate in a computer. A complicated obstacle
briefly covers transformations that do not assume each body is rigid. region will be represented by taking finite, Boolean combinations of primitives.
Using set theory, this implies that O can also be defined in terms of a finite
number of unions, intersections, and set differences of primitives.
3.1 Geometric Modeling
Convex polygons First consider O for the case in which the obstacle region is
A wide variety of approaches and techniques for geometric modeling exist, and
a convex, polygonal subset of a 2D world, W = R2 . A subset X ⊂ Rn is called
the particular choice usually depends on the application and the difficulty of the
convex if and only if, for any pair of points in X, all points along the line segment
problem. In most cases, there are generally two alternatives: 1) a boundary repre-
that connects them are contained in X. More precisely, this means that for any
sentation, and 2) a solid representation. Suppose we would like to define a model
x1 , x2 ∈ X and λ ∈ [0, 1],
of a planet. Using a boundary representation, we might write the equation of a
λx1 + (1 − λ)x2 ∈ X. (3.1)
sphere that roughly coincides with the planet’s surface. Using a solid representa-
tion, we would describe the set of all points that are contained in the sphere. Both Thus, interpolation between x1 and x2 always yields points in X. Intuitively, X
alternatives will be considered in this section. contains no pockets or indentations. A set that is not convex is called nonconvex
The first step is to define the world W for which there are two possible choices: (as opposed to concave, which seems better suited for lenses).
1) a 2D world, in which W = R2 , and 2) a 3D world, in which W = R3 . These A boundary representation of O is an m-sided polygon, which can be described
choices should be sufficient for most problems; however, one might also want to using two kinds of features: vertices and edges. Every vertex corresponds to a
allow more complicated worlds, such as the surface of a sphere or even a higher “corner” of the polygon, and every edge corresponds to a line segment between a
3.1. GEOMETRIC MODELING 83 84 S. M. LaValle: Planning Algorithms

+ +
+
− + +
− − +
− +
− +
− −

− −

Figure 3.2: The sign of the f (x, y) partitions R2 into three regions: two half-planes
given by f (x, y) < 0 and f (x, y) > 0, and the line f (x, y) = 0.

line fi (x, y) = 0 (including the points on the line). A convex, m-sided, polygonal
obstacle region O is expressed as

O = H1 ∩ H2 ∩ · · · ∩ Hm . (3.3)

Figure 3.1: A convex polygonal region can be identified by the intersection of


half-planes. Nonconvex polygons The assumption that O is convex is too limited for most
applications. Now suppose that O is a nonconvex, polygonal subset of W. In this
pair of vertices. The polygon can be specified by a sequence, (x1 , y1 ), (x2 , y2 ), . . ., case O can be expressed as
(xm , ym ), of m points in R2 , given in counterclockwise order.
A solid representation of O can be expressed as the intersection of m half- O = O1 ∪ O2 ∪ · · · ∪ On , (3.4)
planes. Each half-plane corresponds to the set of all points that lie to one side
of a line that is common to a polygon edge. Figure 3.1 shows an example of an in which each Oi is a convex, polygonal set that is expressed in terms of half-
octagon that is represented as the intersection of eight half-planes. planes using (3.3). Note that Oi and Oj for i 6= j need not be disjoint. Using this
An edge of the polygon is specified by two points, such as (x1 , y1 ) and (x2 , y2 ). representation, very complicated obstacle regions in W can be defined. Although
Consider the equation of a line that passes through (x1 , y1 ) and (x2 , y2 ). An these regions may contain multiple components and holes, if O is bounded (i.e., O
equation can be determined of the form ax + by + c = 0, in which a, b, c ∈ R will fit inside of a big enough rectangular box), its boundary will consist of linear
are constants that are determined from x1 , y1 , x2 , and y2 . Let f : R2 → R be segments.
the function given by f (x, y) = ax + by + c. Note that f (x, y) < 0 on one side In general, more complicated representations of O can be defined in terms of
of the line, and f (x, y) > 0 on the other. (In fact, f may be interpreted as a any finite combination of unions, intersections, and set differences of primitives;
signed Euclidean distance from (x, y) to the line.) The sign of f (x, y) indicates a however, it is always possible to simplify the representation into the form given
half-plane that is bounded by the line, as depicted in Figure 3.2. Without loss of by (3.3) and (3.4). A set difference can be avoided by redefining the primitive.
generality, assume that f (x, y) is defined so that f (x, y) < 0 for all points to the Suppose the model requires removing a set defined by a primitive Hi that contains1
left of the edge from (x1 , y1 ) to (x2 , y2 ) (if it is not, then multiply f (x, y) by −1). fi (x, y) < 0. This is equivalent to keeping all points such that fi (x, y) ≥ 0, which is
Let fi (x, y) denote the f function derived from the line that corresponds to equivalent to −fi (x, y) ≤ 0. This can be used to define a new primitive Hi′ , which
the edge from (xi , yi ) to (xi+1 , yi+1 ) for 1 ≤ i < m. Let fm (x, y) denote the line when taken in union with other sets, is equivalent to the removal of Hi . Given
equation that corresponds to the edge from (xm , ym ) to (x1 , y1 ). Let a half-plane a complicated combination of primitives, once set differences are removed, the
Hi for 1 ≤ i ≤ m be defined as a subset of W: expression can be simplified into a finite union of finite intersections by applying
Boolean algebra laws.
Hi = {(x, y) ∈ W | fi (x, y) ≤ 0}. (3.2)
1
In this section, we want the resulting set to include all of the points along the boundary.
Above, Hi is a primitive that describes the set of all points on one side of the Therefore, < is used to model a set for removal, as opposed to ≤.
3.1. GEOMETRIC MODELING 85 86 S. M. LaValle: Planning Algorithms

Note that the representation of a nonconvex polygon is not unique. There


are many ways to decompose O into convex components. The decomposition
should be carefully selected to optimize computational performance in whatever
algorithms that model will be used. In most cases, the components may even be
allowed to overlap. Ideally, it seems that it would be nice to represent O with the
minimum number of primitives, but automating such a decomposition may lead to
an NP-hard problem (see Section 6.5.1 for a brief overview of NP-hardness). One
efficient, practical way to decompose O is to apply the vertical cell decomposition
algorithm, which will be presented in Section 6.2.2

Defining a logical predicate What is the value of the previous representation?


As a simple example, we can define a logical predicate that serves as a collision
detector. Recall from Section 2.4.1 that a predicate is a Boolean-valued function. (a) (b)
Let φ be a predicate defined as φ : W → {true, false}, which returns true for
a point in W that lies in O, and false otherwise. For a line given by f (x, y) = Figure 3.3: (a) A polyhedron can be described in terms of faces, edges, and vertices.
0, let e(x, y) denote a logical predicate that returns true if f (x, y) ≤ 0, and (b) The edges of each face can be stored in a circular list that is traversed in
false otherwise. counterclockwise order with respect to the outward normal vector of the face.
A predicate that corresponds to a convex polygonal region is represented by a
logical conjunction,
Several data structures have been proposed that allow one to conveniently
α(x, y) = e1 (x, y) ∧ e2 (x, y) ∧ · · · ∧ em (x, y). (3.5) “walk” around the polyhedral features. For example, the doubly connected edge
list [264] data structure contains three types of records: faces, half-edges, and
The predicate α(x, y) returns true if the point (x, y) lies in the convex polygonal
vertices. Intuitively, a half-edge is a directed edge. Each vertex record holds the
region, and false otherwise. An obstacle region that consists of n convex polygons
point coordinates and a pointer to an arbitrary half-edge that touches the vertex.
is represented by a logical disjunction of conjuncts,
Each face record contains a pointer to an arbitrary half-edge on its boundary. Each
φ(x, y) = α1 (x, y) ∨ α2 (x, y) ∨ · · · ∨ αn (x, y). (3.6) face is bounded by a circular list of half-edges. There is a pair of directed half-edge
records for each edge of the polyhedon. Each half-edge is shown as an arrow in
Although more efficient methods exist, φ can check whether a point (x, y) lies Figure 3.3b. Each half-edge record contains pointers to five other records: 1) the
in O in time O(n), in which n is the number of primitives that appear in the vertex from which the half-edge originates; 2) the “twin” half-edge, which bounds
representation of O (each primitive is evaluated in constant time). the neighboring face, and has the opposite direction; 3) the face that is bounded by
Note the convenient connection between a logical predicate representation and the half-edge; 4) the next element in the circular list of edges that bound the face;
a set-theoretic representation. Using the logical predicate, the unions and inter- and 5) the previous element in the circular list of edges that bound the face. Once
sections of the set-theoretic representation are replaced by logical ORs and ANDs. all of these records have been defined, one can conveniently traverse the structure
It is well known from Boolean algebra that any complicated logical sentence can of the polyhedron.
be reduced to a logical disjunction of conjunctions (this is often called “sum of Now consider a solid representation of a polyhedron. Suppose that O is a con-
products” in computer engineering). This is equivalent to our previous statement vex polyhedron, as shown in Figure 3.3. A solid representation can be constructed
that O can always be represented as a union of intersections of primitives. from the vertices. Each face of O has at least three vertices along its boundary.
Assuming these vertices are not collinear, an equation of the plane that passes
Polyhedral models For a 3D world, W = R3 , and the previous concepts can through them can be determined of the form
be nicely generalized from the 2D case by replacing polygons with polyhedra and ax + by + cz + d = 0, (3.7)
replacing half-plane primitives with half-space primitives. A boundary represen-
in which a, b, c, d ∈ R are constants.
tation can be defined in terms of three features: vertices, edges, and faces. Every
Once again, f can be constructed, except now f : R3 → R and
face is a “flat” polygon embedded in R3 . Every edge forms a boundary between
two faces. Every vertex forms a boundary between three or more edges. f (x, y, z) = ax + by + cz + d. (3.8)
3.1. GEOMETRIC MODELING 87 88 S. M. LaValle: Planning Algorithms

Let m be the number of faces. For each face of O, a half-space Hi is defined as a +


subset of W: + + +
+ +
Hi = {(x, y, z) ∈ W | fi (x, y, z) ≤ 0}. (3.9)
+ + − +
It is important to choose fi so that it takes on negative values inside of the poly- − − +
hedron. In the case of a polygonal model, it was possible to consistently define
+ − − −
fi by proceeding in counterclockwise order around the boundary. In the case of − − +
+ +
a polyhedron, the half-edge data structure can be used to obtain for each face
the list of edges that form its boundary in counterclockwise order. Figure 3.3b ++ +
++
shows the edge ordering for each face. For every edge, the arrows point in opposite
+ +
directions, as required by the half-edge data structure. The equation for each face
can be consistently determined as follows. Choose three consecutive vertices, p1 , (a) (b)
p2 , p3 (they must not be collinear) in counterclockwise order on the boundary of
the face. Let v12 denote the vector from p1 to p2 , and let v23 denote the vector
from p2 to p3 . The cross product v = v12 × v23 always yields a vector that points Figure 3.4: (a) Once again, f is used to partition R2 into two regions. In this case,
out of the polyhedron and is normal to the face. Recall that the vector [a b c] the algebraic primitive represents a disc-shaped region. (b) The shaded “face” can
is parallel to the normal to the plane. If its components are chosen as a = v[1], be exactly modeled using only four algebraic primitives.
b = v[2], and c = v[3], then f (x, y, z) ≤ 0 for all points in the half-space that
contains the polyhedron. be centered at the origin. Suppose that the “eyes” have radius r2 and r3 and are
As in the case of a polygonal model, a convex polyhedron can be defined as centered at (x2 , y2 ) and (x3 , y3 ), respectively. Let the “mouth” be an ellipse with
the intersection of a finite number of half-spaces, one for each face. A nonconvex major axis a and minor axis b and is centered at (0, y4 ). The functions are defined
polyhedron can be defined as the union of a finite number of convex polyhedra. as
The predicate φ(x, y, z) can be defined in a similar manner, in this case yielding f1 = x2 + y 2 − r12 ,
true if (x, y, z) ∈ O, and false otherwise. f2 = − (x − x2 )2 + (y − y2 )2 − r22 ,

 (3.11)
f3 = − (x − x3 )2 + (y − y3 )2 − r32 ,
3.1.2 Semi-Algebraic Models 
f4 = − x2 /a2 + (y − y4 )2 /b2 − 1 .
In both the polygonal and polyhedral models, f was a linear function. In the
For f2 , f3 , and f4 , the familiar circle and ellipse equations were multiplied by −1 to
case of a semi-algebraic model for a 2D world, f can be any polynomial with real-
yield algebraic primitives for all points outside of the circle or ellipse. The shaded
valued coefficients and variables x and y. For a 3D world, f is a polynomial with
region O is represented as
variables x, y, and z. The class of semi-algebraic models includes both polygonal
and polyhedral models, which use first-degree polynomials. A point set determined O = H1 ∩ H2 ∩ H3 ∩ H4 . (3.12)
by a single polynomial primitive is called an algebraic set; a point set that can be 
obtained by a finite number of unions and intersections of algebraic sets is called
a semi-algebraic set.
Consider the case of a 2D world. A solid representation can be defined using In the case of semi-algebraic models, the intersection of primitives does not
algebraic primitives of the form necessarily result in a convex subset of W. In general, however, it might be
necessary to form O by taking unions and intersections of algebraic primitives.
H = {(x, y) ∈ W | f (x, y) ≤ 0}. (3.10) A logical predicate, φ(x, y), can once again be formed, and collision checking
is still performed in time that is linear in the number of primitives. Note that
As an example, let f = x2 + y 2 − 4. In this case, H represents a disc of radius it is still very efficient to evaluate every primitive; f is just a polynomial that is
2 that is centered at the origin. This corresponds to the set of points (x, y) for evaluated on the point (x, y, z).
which f (x, y) ≤ 0, as depicted in Figure 3.4a. The semi-algebraic formulation generalizes easily to the case of a 3D world.
This results in algebraic primitives of the form
Example 3.1 (Gingerbread Face) Consider constructing a model of the shaded
region shown in Figure 3.4b. Let the center of the outer circle have radius r1 and H = {(x, y, z) ∈ W | f (x, y, z) ≤ 0}, (3.13)
3.1. GEOMETRIC MODELING 89 90 S. M. LaValle: Planning Algorithms

which can be used to define a solid representation of a 3D obstacle O and a logical


predicate φ.
Equations (3.10) and (3.13) are sufficient to express any model of interest. One
may define many other primitives based on different relations, such as f (x, y, z) ≥
0, f (x, y, z) = 0, f (x, y, z) < 0, f (x, y, z) = 0, and f (x, y, z) 6= 0; however, most
of them do not enhance the set of models that can be expressed. They might,
however, be more convenient in certain contexts. To see that some primitives do
not allow new models to be expressed, consider the primitive

H = {(x, y, z) ∈ W | f (x, y, z) ≥ 0}. (3.14)

The right part may be alternatively represented as −f (x, y, z) ≤ 0, and −f may Figure 3.5: A polygon with holes can be expressed by using different orientations:
be considered as a new polynomial function of x, y, and z. For an example that counterclockwise for the outer boundary and clockwise for the hole boundaries.
involves the = relation, consider the primitive Note that the shaded part is always to the left when following the arrows.

H = {(x, y, z) ∈ W | f (x, y, z) = 0}. (3.15) Nonconvex polygons and polyhedra The method in Section 3.1.1 required
nonconvex polygons to be represented as a union of convex polygons. Instead, a
It can instead be constructed as H = H1 ∩ H2 , in which boundary representation of a nonconvex polygon may be directly encoded by list-
ing vertices in a specific order; assume that counterclockwise order is used. Each
H1 = {(x, y, z) ∈ W | f (x, y, z) ≤ 0} (3.16) polygon of m vertices may be encoded by a list of the form (x1 , y1 ), (x2 , y2 ), . . .,
(xm , ym ). It is assumed that there is an edge between each (xi , yi ) and (xi+1 , yi+1 )
for each i from 1 to m−1, and also an edge between (xm , ym ) and (x1 , y1 ). Ordinar-
and
ily, the vertices should be chosen in a way that makes the polygon simple, meaning
H2 = {(x, y, z) ∈ W | − f (x, y, z) ≤ 0}. (3.17) that no edges intersect. In this case, there is a well-defined interior of the polygon,
which is to the left of every edge, if the vertices are listed in counterclockwise
The relation < does add some expressive power if it is used to construct primitives.2 order.
It is needed to construct models that do not include the outer boundary (for What if a polygon has a hole in it? In this case, the boundary of the hole
example, the set of all points inside of a sphere, which does not include points on can be expressed as a polygon, but with its vertices appearing in the clockwise
the sphere). These are generally called open sets and are defined Chapter 4. direction. To the left of each edge is the interior of the outer polygon, and to the
right is the hole, as shown in Figure 3.5
Although the data structures are a little more complicated for three dimen-
3.1.3 Other Models sions, boundary representations of nonconvex polyhedra may be expressed in a
The choice of a model often depends on the types of operations that will be per- similar manner. In this case, instead of an edge list, one must specify faces, edges,
formed by the planning algorithm. For combinatorial motion planning methods, and vertices, with pointers that indicate their incidence relations. Consistent ori-
to be covered in Chapter 6, the particular representation is critical. On the other entations must also be chosen, and holes may be modeled once again by selecting
hand, for sampling-based planning methods, to be covered in Chapter 5, the par- opposite orientations.
ticular representation is important only to the collision detection algorithm, which
is treated as a “black box” as far as planning is concerned. Therefore, the models 3D triangles Suppose W = R3 . One of the most convenient geometric models
given in the remainder of this section are more likely to appear in sampling-based to express is a set of triangles, each of which is specified by three points, (x1 , y1 , z1 ),
approaches and may be invisible to the designer of a planning algorithm (although (x2 , y2 , z2 ), (x3 , y3 , z3 ). This model has been popular in computer graphics because
it is never wise to forget completely about the representation). graphics acceleration hardware primarily uses triangle primitives. It is assumed
that the interior of the triangle is part of the model. Thus, two triangles are
2
An alternative that yields the same expressive power is to still use ≤, but allow set comple- considered as “colliding” if one pokes into the interior of another. This model offers
ments, in addition to unions and intersections. great flexibility because there are no constraints on the way in which triangles must
3.1. GEOMETRIC MODELING 91 92 S. M. LaValle: Planning Algorithms

each “1” in the image corresponds to a rectangular region that contains at least
one point of O, and “0” represents those that do not contain any of O. Although
bitmaps do not have the elegance of the other models, they often arise in applica-
Figure 3.6: Triangle strips and triangle fans can reduce the number of redundant tions. One example is a digital map constructed by a mobile robot that explores
points. an environment with its sensors. One generalization of bitmaps is a gray-scale
map or occupancy grid. In this case, a numerical value may be assigned to each
be expressed; however, this is also one of the drawbacks. There is no coherency cell, indicating quantities such as “the probability that an obstacle exists” or the
that can be exploited to easily declare whether a point is “inside” or “outside” of “expected difficulty of traversing the cell.” The latter interpretation is often used
a 3D obstacle. If there is at least some coherency, then it is sometimes preferable in terrain maps for navigating planetary rovers.
to reduce redundancy in the specification of triangle coordinates (many triangles
will share the same corners). Representations that remove this redundancy are Superquadrics Instead of using polynomials to define fi , many generalizations
called a triangle strip, which is a sequence of triangles such that each adjacent can be constructed. One popular primitive is a superquadric, which generalizes
pair shares a common edge, and a triangle fan, which is a triangle strip in which quadric surfaces. One example is a superellipsoid, which is given for W = R3 by
all triangles share a common vertex. See Figure 3.6.
n1 /n2
|x/a|n1 + |y/b|n2 + |z/c|n1 − 1 ≤ 0, (3.20)
Nonuniform rational B-splines (NURBS) These are used in many engi-
neering design systems to allow convenient design and adjustment of curved sur- in which n1 ≥ 2 and n2 ≥ 2. If n1 = n2 = 2, an ellipse is generated. As n1 and n2
faces, in applications such as aircraft or automobile body design. In contrast to increase, the superellipsoid becomes shaped like a box with rounded corners.
semi-algebraic models, which are implicit equations, NURBS and other splines are
parametric equations. This makes computations such as rendering easier; however, Generalized cylinders A generalized cylinder is a generalization of an ordinary
others, such as collision detection, become more difficult. These models may be cylinder. Instead of being limited to a line, the center axis is a continuous spine
defined in any dimension. A brief 2D formulation is given here. curve, (x(s), y(s), z(s)), for some parameter s ∈ [0, 1]. Instead of a constant radius,
A curve can be expressed as a radius function r(s) is defined along the spine. The value r(s) is the radius of
n
X the circle obtained as the cross section of the generalized cylinder at the point
wi Pi Ni,k (u) (x(s), y(s), z(s)). The normal to the cross-section plane is the tangent to the spine
i=0 curve at s.
C(u) = n , (3.18)
X
wi Ni,k (u)
i=0 3.2 Rigid-Body Transformations
in which wi ∈ R are weights and Pi are control points. The Ni,k are normalized
Any of the techniques from Section 3.1 can be used to define both the obstacle
basis functions of degree k, which can be expressed recursively as
region and the robot. Let O refer to the obstacle region, which is a subset of W.
   
u − ti ti+k+1 − u Let A refer to the robot, which is a subset of R2 or R3 , matching the dimension
Ni,k (u) = Ni,k−1 (u) + Ni+1,k−1 (u). (3.19) of W. Although O remains fixed in the world, W, motion planning problems will
ti+k − ti ti+k+1 − ti+1
require “moving” the robot, A.
The basis of the recursion is Ni,0 (u) = 1 if ti ≤ u < ti+1 , and Ni,0 (u) = 0 otherwise.
A knot vector is a nondecreasing sequence of real values, {t0 , t1 , . . . , tm }, that
controls the intervals over which certain basic functions take effect. 3.2.1 General Concepts
Before giving specific transformations, it will be helpful to define them in general to
Bitmaps For either W = R2 or W = R3 , it is possible to discretize a bounded avoid confusion in later parts when intuitive notions might fall apart. Suppose that
portion of the world into rectangular cells that may or may not be occupied. a rigid robot, A, is defined as a subset of R2 or R3 . A rigid-body transformation is
The resulting model looks very similar to Example 2.1. The resolution of this a function, h : A → W, that maps every point of A into W with two requirements:
discretization determines the number of cells per axis and the quality of the ap- 1) The distance between any pair of points of A must be preserved, and 2) the
proximation. The representation may be considered as a binary image in which orientation of A must be preserved (no “mirror images”).
3.2. RIGID-BODY TRANSFORMATIONS 93 94 S. M. LaValle: Planning Algorithms

Using standard function notation, h(a) for some a ∈ A refers to the point in Defining frames It was assumed so far that A is defined in R2 or R3 , but before
W that is “occupied” by a. Let it is transformed, it is not considered to be a subset of W. The transformation h
places the robot in W. In the coming material, it will be convenient to indicate
h(A) = {h(a) ∈ W | a ∈ A}, (3.21)
this distinction using coordinate frames. The origin and coordinate basis vectors
which is the image of h and indicates all points in W occupied by the transformed of W will be referred to as the world frame.3 Thus, any point w ∈ W is expressed
robot. in terms of the world frame.
The coordinates used to define A are initially expressed in the body frame,
Transforming the robot model Consider transforming a robot model. If A which represents the origin and coordinate basis vectors of R2 or R3 . In the case
is expressed by naming specific points in R2 , as in a boundary representation of a of A ⊂ R2 , it can be imagined that the body frame is painted on the robot.
polygon, then each point is simply transformed from a to h(a) ∈ W. In this case, Transforming the robot is equivalent to converting its model from the body frame
it is straightforward to transform the entire model using h. However, there is a to the world frame. This has the effect of placing4 A into W at some position
slight complication if the robot model is expressed using primitives, such as and orientation. When multiple bodies are covered in Section 3.3, each body will
have its own body frame, and transformations require expressing all bodies with
Hi = {a ∈ R2 | fi (a) ≤ 0}. (3.22) respect to the world frame.
2
This differs slightly from (3.2) because the robot is defined in R (which is not
necessarily W), and also a is used to denote a point (x, y) ∈ A. Under a transfor- 3.2.2 2D Transformations
mation h, the primitive is transformed as
Translation A rigid robot A ⊂ R2 is translated by using two parameters, xt , yt ∈
h(Hi ) = {h(a) ∈ W | fi (a) ≤ 0}. (3.23) R. Using definitions from Section 3.2.1, q = (xt , yt ), and h is defined as
To transform the primitive completely, however, it is better to directly name points h(x, y) = (x + xt , y + yt ). (3.25)
in w ∈ W, as opposed to h(a) ∈ W. Using the fact that a = h−1 (w), this becomes
A boundary representation of A can be translated by transforming each vertex in
h(Hi ) = {w ∈ W | fi (h−1 (w)) ≤ 0}, (3.24) the sequence of polygon vertices using (3.25). Each point, (xi , yi ), in the sequence
in which the inverse of h appears in the right side because the original point a ∈ A is replaced by (xi + xt , yi + yt ).
needs to be recovered to evaluate fi . Therefore, it is important to be careful Now consider a solid representation of A, defined in terms of primitives. Each
because either h or h−1 may be required to transform the model. This will be primitive of the form
observed in more specific contexts in some coming examples. Hi = {(x, y) ∈ R2 | f (x, y) ≤ 0} (3.26)

A parameterized family of transformations It will become important to is transformed to


study families of transformations, in which some parameters are used to select
h(Hi ) = {(x, y) ∈ W | f (x − xt , y − yt ) ≤ 0}. (3.27)
the particular transformation. Therefore, it makes sense to generalize h to accept
two variables: a parameter vector, q ∈ Rn , along with a ∈ A. The resulting Example 3.2 (Translating a Disc) For example, suppose the robot is a disc of
transformed point a is denoted by h(q, a), and the entire robot is transformed to unit radius, centered at the origin. It is modeled by a single primitive,
h(q, A) ⊂ W.
The coming material will use the following shorthand notation, which requires Hi = {(x, y) ∈ R2 | x2 + y 2 − 1 ≤ 0}. (3.28)
the specific h to be inferred from the context. Let h(q, a) be shortened to a(q), and
Suppose A = Hi is translated xt units in the x direction and yt units in the y
let h(q, A) be shortened to A(q). This notation makes it appear that by adjusting
direction. The transformed primitive is
the parameter q, the robot A travels around in W as different transformations are
selected from the predetermined family. This is slightly abusive notation, but it is h(Hi ) = {(x, y) ∈ W | (x − xt )2 + (y − yt )2 − 1 ≤ 0}, (3.29)
convenient. The expression A(q) can be considered as a set-valued function that 3
The world frame serves the same purpose as an inertial frame in Newtonian mechanics.
yields the set of points in W that are occupied by A when it is transformed by
Intuitively, it is a frame that remains fixed and from which all measurements are taken. See
q. Most of the time the notation does not cause trouble, but when it does, it is Section 13.3.1.
helpful to remember the definitions from this section, especially when trying to 4
Technically, this placement is a function called an orientation-preserving isometric embed-
determine whether h or h−1 is needed. ding.
3.2. RIGID-BODY TRANSFORMATIONS 95 96 S. M. LaValle: Planning Algorithms

Using a 2 × 2 rotation matrix,


Moving  
the Robot cos θ − sin θ
R(θ) = , (3.31)
Moving the sin θ cos θ
Coordinate
Frame
the transformation can be written as
   
x cos θ − y sin θ x
= R(θ) . (3.32)
x sin θ + y cos θ y

Using the notation of Section 3.2.1, R(θ) becomes h(q), for which q = θ. For
(a) Translation of the robot (b) Translation of the frame linear transformations, such as the one defined by (3.32), recall that the column
vectors represent the basis vectors of the new coordinate frame. The column
vectors of R(θ) are unit vectors, and their inner product (or dot product) is zero,
Figure 3.7: Every transformation has two interpretations. indicating that they are orthogonal. Suppose that the x and y coordinate axes,
which represent the body frame, are “painted” on A. The columns of R(θ) can be
derived by considering the resulting directions of the x- and y-axes, respectively,
which is the familiar equation for a disc centered at (xt , yt ). In this example, the after performing a counterclockwise rotation by the angle θ. This interpretation
inverse, h−1 is used, as described in Section 3.2.1.  generalizes nicely for higher dimensional rotation matrices.
Note that the rotation is performed about the origin. Thus, when defining the
model of A, the origin should be placed at the intended axis of rotation. Using
The translated robot is denoted as A(xt , yt ). Translation by (0, 0) is the iden- the semi-algebraic model, the entire robot model can be rotated by transforming
tity transformation, which results in A(0, 0) = A, if it is assumed that A ⊂ W each primitive, yielding A(θ). The inverse rotation, R(−θ), must be applied to
(recall that A does not necessarily have to be initially embedded in W). It will be each primitive.
convenient to use the term degrees of freedom to refer to the maximum number of
independent parameters that are needed to completely characterize the transfor- Combining translation and rotation Suppose a rotation by θ is performed,
mation applied to the robot. If the set of allowable values for xt and yt forms a followed by a translation by xt , yt . This can be used to place the robot in any
two-dimensional subset of R2 , then the degrees of freedom is two. desired position and orientation. Note that translations and rotations do not
Suppose that A is defined directly in W with translation. As shown in Figure commute! If the operations are applied successively, each (x, y) ∈ A is transformed
3.7, there are two interpretations of a rigid-body transformation applied to A: 1) to  
x cos θ − y sin θ + xt
The world frame remains fixed and the robot is transformed; 2) the robot remains . (3.33)
x sin θ + y cos θ + yt
fixed and the world frame is translated. The first one characterizes the effect of
the transformation from a fixed world frame, and the second one indicates how The following matrix multiplication yields the same result for the first two vector
the transformation appears from the robot’s perspective. Unless stated otherwise, components:
the first interpretation will be used when we refer to motion planning problems     
cos θ − sin θ xt x x cos θ − y sin θ + xt
because it often models a robot moving in a physical world. Numerous books cover  sin θ cos θ yt  y  =  x sin θ + y cos θ + yt  . (3.34)
coordinate transformations under the second interpretation. This has been known
0 0 1 1 1
to cause confusion because the transformations may sometimes appear “backward”
from what is desired in motion planning. This implies that the 3 × 3 matrix,
 
cos θ − sin θ xt
Rotation The robot, A, can be rotated counterclockwise by some angle θ ∈ T =  sin θ cos θ yt  , (3.35)
[0, 2π) by mapping every (x, y) ∈ A as 0 0 1
represents a rotation followed by a translation. The matrix T will be referred to
(x, y) 7→ (x cos θ − y sin θ, x sin θ + y cos θ). (3.30) as a homogeneous transformation matrix. It is important to remember that T
3.2. RIGID-BODY TRANSFORMATIONS 97 98 S. M. LaValle: Planning Algorithms

z Yaw, pitch, and roll rotations A 3D body can be rotated about three orthog-
α onal axes, as shown in Figure 3.8. Borrowing aviation terminology, these rotations
Yaw will be referred to as yaw, pitch, and roll:

1. A yaw is a counterclockwise rotation of α about the z-axis. The rotation


β matrix is given by
Pitch  
Roll y cos α − sin α 0
γ Rz (α) =  sin α cos α 0 . (3.39)
0 0 1
x
Note that the upper left entries of Rz (α) form a 2D rotation applied to the
Figure 3.8: Any three-dimensional rotation can be described as a sequence of yaw,
x and y coordinates, whereas the z coordinate remains constant.
pitch, and roll rotations.
2. A pitch is a counterclockwise rotation of β about the y-axis. The rotation
represents a rotation followed by a translation (not the other way around). Each matrix is given by
primitive can be transformed using the inverse of T , resulting in a transformed  
cos β 0 sin β
solid model of the robot. The transformed robot is denoted by A(xt , yt , θ), and
Ry (β) =  0 1 0 . (3.40)
in this case there are three degrees of freedom. The homogeneous transformation
− sin β 0 cos β
matrix is a convenient representation of the combined transformations; therefore,
it is frequently used in robotics, mechanics, computer graphics, and elsewhere.
It is called homogeneous because over R3 it is just a linear transformation with- 3. A roll is a counterclockwise rotation of γ about the x-axis. The rotation
out any translation. The trick of increasing the dimension by one to absorb the matrix is given by
translational part is common in projective geometry [804].  
1 0 0
Rx (γ) = 0 cos γ − sin γ  . (3.41)
3.2.3 3D Transformations 0 sin γ cos γ
Rigid-body transformations for the 3D case are conceptually similar to the 2D case;
however, the 3D case appears more difficult because rotations are significantly more Each rotation matrix is a simple extension of the 2D rotation matrix, (3.31). For
complicated. example, the yaw matrix, Rz (α), essentially performs a 2D rotation with respect
to the x and y coordinates while leaving the z coordinate unchanged. Thus, the
third row and third column of Rz (α) look like part of the identity matrix, while
3D translation The robot, A, is translated by some xt , yt , zt ∈ R using the upper right portion of Rz (α) looks like the 2D rotation matrix.
The yaw, pitch, and roll rotations can be used to place a 3D body in any
(x, y, z) 7→ (x + xt , y + yt , z + zt ). (3.36) orientation. A single rotation matrix can be formed by multiplying the yaw, pitch,
and roll rotation matrices to obtain
A primitive of the form
R(α,β, γ) = Rz (α) Ry (β) Rx (γ) =
 
Hi = {(x, y, z) ∈ W | fi (x, y, z) ≤ 0} (3.37) cos α cos β cos α sin β sin γ − sin α cos γ cos α sin β cos γ + sin α sin γ
 sin α cos β sin α sin β sin γ + cos α cos γ sin α sin β cos γ − cos α sin γ  .
is transformed to − sin β cos β sin γ cos β cos γ
(3.42)
{(x, y, z) ∈ W | fi (x − xt , y − yt , z − zt ) ≤ 0}. (3.38)
It is important to note that R(α, β, γ) performs the roll first, then the pitch, and
The translated robot is denoted as A(xt , yt , zt ). finally the yaw. If the order of these operations is changed, a different rotation
3.2. RIGID-BODY TRANSFORMATIONS 99 100 S. M. LaValle: Planning Algorithms

matrix would result. Be careful when interpreting the rotations. Consider the The homogeneous transformation matrix for 3D bodies As in the 2D
final rotation, a yaw by α. Imagine sitting inside of a robot A that looks like case, a homogeneous transformation matrix can be defined. For the 3D case, a
an aircraft. If β = γ = 0, then the yaw turns the plane in a way that feels 4 × 4 matrix is obtained that performs the rotation given by R(α, β, γ), followed
like turning a car to the left. However, for arbitrary values of β and γ, the final by a translation given by xt , yt , zt . The result is
rotation axis will not be vertically aligned with the aircraft because the aircraft is  
left in an unusual orientation before α is applied. The yaw rotation occurs about cos α cos β cos α sin β sin γ − sin α cos γ cos α sin β cos γ + sin α sin γ xt
the z-axis of the world frame, not the body frame of A. Each time a new rotation  sin α cos β sin α sin β sin γ + cos α cos γ sin α sin β cos γ − cos α sin γ yt 
T = .
matrix is introduced from the left, it has no concern for original body frame of − sin β cos β sin γ cos β cos γ z t
0 0 0 1
A. It simply rotates every point in R3 in terms of the world frame. Note that 3D
(3.50)
rotations depend on three parameters, α, β, and γ, whereas 2D rotations depend
Once again, the order of operations is critical. The matrix T in (3.50) represents
only on a single parameter, θ. The primitives of the model can be transformed
the following sequence of transformations:
using R(α, β, γ), resulting in A(α, β, γ).
1. Roll by γ 3. Yaw by α
Determining yaw, pitch, and roll from a rotation matrix It is often con- 2. Pitch by β 4. Translate by (xt , yt , zt ).
venient to determine the α, β, and γ parameters directly from a given rotation
matrix. Suppose an arbitrary rotation matrix The robot primitives can be transformed to yield A(xt , yt , zt , α, β, γ). A 3D rigid
  body that is capable of translation and rotation therefore has six degrees of free-
r11 r12 r13 dom.
r21 r22 r23  (3.43)
r31 r32 r33
is given. By setting each entry equal to its corresponding entry in (3.42), equations 3.3 Transforming Kinematic Chains of Bodies
are obtained that must be solved for α, β, p and γ. Note that r21 /r11 = tan α and
2 2
The transformations become more complicated for a chain of attached rigid bodies.
r32 /r33 = tan γ. Also, r31 = − sin β and r32 + r33 = cos β. Solving for each For convenience, each rigid body is referred to as a link. Let A1 , A2 , . . . , Am denote
angle yields a set of m links. For each i such that 1 ≤ i < m, link Ai is “attached” to link Ai+1
α = tan−1 (r21 /r11 ), (3.44) in a way that allows Ai+1 some constrained motion with respect to Ai . The motion
−1
 q 2 2

constraint must be explicitly given, and will be discussed shortly. As an example,
β = tan − r31 r32 + r33 , (3.45)
imagine a trailer that is attached to the back of a car by a hitch that allows the
and trailer to rotate with respect to the car. In general, a set of attached bodies will
γ = tan−1 (r32 /r33 ). (3.46) be referred to as a linkage. This section considers bodies that are attached in a
There is a choice of four quadrants for the inverse tangent functions. How can single chain. This leads to a particular linkage called a kinematic chain.
the correct quadrant be determined? Each quadrant should be chosen by using
the signs of the numerator and denominator of the argument. The numerator sign 3.3.1 A 2D Kinematic Chain
selects whether the direction will be above or below the x-axis, and the denomi-
nator selects whether the direction will be to the left or right of the y-axis. This Before considering a kinematic chain, suppose A1 and A2 are unattached rigid
is the same as the atan2 function in the C programming language, which nicely bodies, each of which is capable of translating and rotating in W = R2 . Since
expands the range of the arctangent to [0, 2π). This can be applied to express each body has three degrees of freedom, there is a combined total of six degrees
(3.44), (3.45), and (3.46) as of freedom; the independent parameters are x1 , y1 , θ1 , x2 , y2 , and θ2 .
α = atan2(r21 , r11 ), (3.47)
 q  Attaching bodies When bodies are attached in a kinematic chain, degrees of
β = atan2 − r31 , r32 2 2
+ r33 , (3.48) freedom are removed. Figure 3.9 shows two different ways in which a pair of 2D
links can be attached. The place at which the links are attached is called a joint.
and For a revolute joint, one link is capable only of rotation with respect to the other.
γ = atan2(r32 , r33 ). (3.49) For a prismatic joint is shown, one link slides along the other. Each type of joint
Note that this method assumes r11 6= 0 and r33 6= 0. removes two degrees of freedom from the pair of bodies. For example, consider a

You might also like