Stabilization of Rotational Double Pendulum

ROTATIONAL DOUBLE INVERTED PENDULUM

Thesis

Submitted to

The School of Engineering of the

UNIVERSITY OF DAYTON

In Partial Fulfillment of the Requirements for

The Degree of

Master of Science in Electrical Engineering

By

Bo Li

UNIVERSITY OF DAYTON

Dayton, Ohio

August, 2013
ROTATIONAL DOUBLE INVERTED PENDULUM

Name: Li, Bo

APPROVED BY:

Raúl Ordóñez, Ph.D.
Advisor Committee Chairman
Associate Professor, Department of Electrical and Computer Engineering

Vijayan K. Asari, Ph.D.
Committee Member
Professor, Department of Electrical and Computer Engineering

Ralph Barrera, Ph.D.
Committee Member
Adjunct Professor, Department of Electrical and Computer Engineering

John G. Weber, Ph.D.
Associate Dean
School of Engineering

Tony E. Saliba, Ph.D.
Dean, School of Engineering & Wilke Distinguished Professor
© Copyright by

Bo Li

All rights reserved

2013
ABSTRACT

ROTATIONAL DOUBLE INVERTED PENDULUM

Name: Li, Bo
University of Dayton

Advisor: Dr. Raúl Ordóñez

This thesis deals with the stabilization control of the Rotational Double Inverted Pendulum (RDIP) system. The RDIP is a highly nonlinear, unstable, underactuated system of high order. A mathematical model of the RDIP is built with the Euler-Lagrange (E-L) equation. A Linear Quadratic Regulator (LQR) controller is designed for this system, and its stability is analyzed with the Lyapunov method. We re-develop the Direct Adaptive Fuzzy Control (DAFC) method for our case in order to explore whether it can improve the performance of the LQR control of the system. The simulation results of these two control schemes and their comparative analysis show that the DAFC is able to enhance the LQR controller by increasing its robustness in RDIP control.
For my family

ACKNOWLEDGMENTS

I would like to express my special gratitude to my advisor, Dr. Raúl Ordóñez, for his tremendous support and help throughout the learning process of this master's thesis. Without his guidance and persistent help, this project would not have been possible within the limited time frame. I would also like to thank my committee members, Dr. Asari and Dr. Barrera, who willingly shared their precious time and provided useful comments, remarks and engagement on the thesis. In addition, I thank all the members of the lab KL302, whose companionship and suggestions supported me during the process of this thesis and helped me a lot. Last but not least, I would like to thank my loved ones for their endless love and support. I will be grateful forever for your love.
TABLE OF CONTENTS

ABSTRACT

DEDICATION

ACKNOWLEDGMENTS

LIST OF FIGURES

LIST OF TABLES

I. INTRODUCTION

II. SYSTEM DESCRIPTION
    2.1 Rotational Double Inverted Pendulum Configurations
    2.2 Euler-Lagrange Equation
    2.3 Modeling

III. CONTROL USING LINEAR QUADRATIC REGULATOR
    3.1 Introduction
    3.2 Implementation
    3.3 LQR Tuning
    3.4 Stability Analysis

IV. ADAPTIVE CONTROL
    4.1 Introduction
    4.2 Takagi-Sugeno Fuzzy System
    4.3 Theory
        4.3.1 Bounding Control
        4.3.2 Adaptation Algorithm
        4.3.3 Sliding-mode Control
    4.4 Implementation

V. SIMULATION RESULTS
    5.1 Open-loop Model Verification
    5.2 Linear Quadratic Regulator Results
    5.3 Direct Adaptive Fuzzy Control Results

VI. CONCLUSION

BIBLIOGRAPHY
LIST OF FIGURES

2.1 Rotational double inverted pendulum schematic.
2.2 Velocity analysis for Link 2.
4.1 Membership function with c = 1 and σ = 0.25.
5.1 Open-loop simulation in the initial states θ1 = 57.2958°, θ2 = 5.7296° and θ3 = 5.7296°.
5.2 LQR simulation result with $Q = I_6$ and $R = I$.
5.3 Tuned LQR simulation result.
5.4 Control signals of the LQR and its tuned version.
5.5 DAFC simulation result.
5.6 DAFC control signal.
5.7 LQR vs. DAFC.
LIST OF TABLES

3.1 R.O.A searching result with $q_1 = \pi$.
3.2 Largest R.O.A for different $q_1$.
5.1 Membership function settings for the DAFC.
CHAPTER I

INTRODUCTION

An inverted pendulum is a pendulum whose links rotate above its pivot point. It is usually implemented either with the pivot point connected to a base arm that can rotate horizontally (described in [1]) or mounted on a cart that can move along a fixed horizontal line (introduced in [2]). The links of the pendulum are usually limited to one degree of freedom by affixing them to an axis of rotation. Whereas a normal pendulum is stable when hanging downwards, an inverted pendulum is inherently unstable and must be actively balanced in order to remain upright. This can be done by applying a torque at the pivot point, as for the rotational inverted pendulum considered in this thesis, or by moving the pivot point horizontally, as for an inverted pendulum on a cart. A simple demonstration of moving the pivot point to control the pendulum is balancing an upturned broomstick on the end of one's finger. Inverted pendulum control is a classic problem in dynamics and control theory and is used to verify the performance and demonstrate the effectiveness of control algorithms.

The Rotational Double Inverted Pendulum (RDIP) takes the classic rotational single pendulum problem to the next level of complexity. The RDIP is composed of a rotary arm attached to a servo system, which provides a torque to the base arm to control the whole system, a short bottom rod connected to the arm, and a long top rod. It is an underactuated (i.e., it has fewer inputs than degrees of freedom) and extremely nonlinear unstable system due to the gravitational forces and the coupling arising from the Coriolis and centripetal forces. Since the RDIP presents considerable control-design challenges, it is an attractive tool for developing different control techniques and testing their performance. Related applications include stabilizing the take-off of a multi-stage rocket, as well as modeling the human posture system.

Nearly all works on pendulum control concentrate on two problems: stabilization of the inverted pendulum and swing-up control design. The first is concerned with designing a controller that maintains the pendulum in the upright position. In the RDIP case, controllers are designed to balance the two vertical rods by manipulating the angle of the base arm. The second refers to an adequate algorithm to swing the pendulum up from its stable equilibrium, the downward position, to the upright position [2]. In this thesis we concentrate on the balancing control of the pendulum without investigating swing-up details.

A variety of works are devoted to the control design of the RDIP. A simple mathematical model for the RDIP has been built in [3], which takes the angles and angular velocities of the base arm and the two pendulums as the system outputs and ignores all the friction terms for the rotational joints and the DC motor. It also presents an alternative of the least squares theory to come up with a controller providing a domain of convergence for the pendulum. A more precise system model is developed in this paper using the modeling method of [4] with the help of the E-L equation. Another control structure, proposed in [5], compensates the multiple loop delays individually and is suitable for a networked control system environment. In this paper, we seek to balance the RDIP with the LQR and the DAFC. We discuss these two methods in detail in the following chapters.

This paper mainly serves two purposes. Firstly, the simulation and experiment results in [6] indicate that, even with a non-minimum-phase plant, the adaptive fuzzy controller is still able to achieve good control performance for the single inverted pendulum: the angle of the pendulum converges to the origin, although the base arm tends to rotate with a constant angular velocity, which, in control terms, is another stable state. This paper explores whether this finding carries over to a more complicated case, the RDIP. Furthermore, assuming that it does, we try to improve the control performance so that all the system states converge to zero. Secondly, this paper provides a theoretical basis for the control experiment with the SV02+DBIP double inverted pendulum kit from the Quanser Company in the lab KL302.

This paper is organized as follows. In Chapter II, we present a description of the RDIP and develop a mathematical model for the system. In Chapters III and IV, the LQR and the DAFC, respectively, are introduced and developed in detail for the RDIP. Chapter V shows the simulation results of these two controllers and presents an analysis of them. In Chapter VI, the concluding remarks, we summarize the overall results, provide a broad assessment of the apparent advantages and disadvantages of the LQR and the DAFC control techniques, and provide some future research directions which help to identify limitations of the scope and content of this paper.
CHAPTER II

SYSTEM DESCRIPTION

In this chapter we focus on building the mathematical model of the RDIP. The more we know about a dynamic system, the more accurate a mathematical model can be obtained. With accurate mathematical models, faster, more accurate and more effective controllers can be designed, since mathematical models allow controllers to be designed, tested and developed with the help of powerful engineering software such as MATLAB [MathWorks].

2.1 Rotational Double Inverted Pendulum Configurations

The RDIP experiment platform that we simulate later consists of a horizontal base arm (denoted as Link 1) driven by a servo motor and two vertical pendulums (denoted as Link 2 and Link 3) that move freely in the plane perpendicular to Link 1, as shown in Figure 2.1, which we discuss in detail later. Since we focus on the stabilization of the pendulums, it is convenient to set the coordinate system as in Figure 2.1. In this paper, the mathematical model of the RDIP is developed using the Euler-Lagrange (E-L) equation. A simple mathematical model has been presented in [3], which assumes that the acceleration of the base arm can be manipulated directly and therefore chooses it as the system control input. In this thesis, a more practical assumption is made, under which the torque applied by the motor to the base arm is the control signal. Moreover, the friction at the pivots is taken into account in order to build a more precise model and simulate the real system we have in the lab.

Figure 2.1: Rotational double inverted pendulum schematic.

We use some additional basic assumptions about the system attributes, similar to those in [3]:

• All the link angles and angular velocities are accessible at each time step, since these data can be read from the encoders on the links and our experiment platform supports a high data acquisition rate.

• The viscous friction of the arm and of the two pendulums is considered, while static friction, backlash and plane slackness are ignored.

• The apparatus is lightweight and has low inertia, resulting in a structure with low stiffness and a tendency to vibrate.

• The system dynamics are slow enough to be controlled.

Figure 2.1 shows the basic configuration of the RDIP. The arrows on the arcs show the positive direction of the rotary movement of the links. The straight dashed lines denote the origin of the displacement of the link angles. For example, when the horizontal Link 1 is centered and the vertical Links 2 and 3 are in the upright position, all of the position variables are zero. The variables describing the motion of the links are:

$\theta_1$    Angle of Link 1 in the horizontal plane.

$\dot{\theta}_1$    Angular velocity of Link 1 in the horizontal plane.

$\ddot{\theta}_1$    Angular acceleration of Link 1.

$\theta_2$    Angle of Link 2 in the vertical plane.

$\dot{\theta}_2$    Angular velocity of Link 2 in the vertical plane.

$\ddot{\theta}_2$    Angular acceleration of Link 2.

$\theta_3$    Angle of Link 3 in the vertical plane.

$\dot{\theta}_3$    Angular velocity of Link 3 in the vertical plane.

$\ddot{\theta}_3$    Angular acceleration of Link 3.

Some additional definitions are needed:

$J_i$    Moment of inertia of Link $i$. $J_1$ is about its pivot, while $J_i$ is about its center of mass $P_{ci}$ for $i = 2, 3$.

$l_i$    Distance from the center of rotation of Link $i$ to its center of mass, $i = 1, 2, 3$.

$m_i$    Mass of Link $i$, $i = 1, 2, 3$.

$g$    Gravitational acceleration, $g = 9.81\ \mathrm{m/s^2}$, directed towards the center of the earth.

$L_i$    Length of Link $i$, $i = 1, 2, 3$.

$b_i$    Viscous damping coefficient of the bearing on which Link $i$ rotates, $i = 1, 2, 3$.
2.2 Euler-Lagrange Equation

The E-L method, introduced in detail in [7], is applied to derive the equations of motion for the RDIP dynamics, since directly applying Newton's laws of motion is highly complicated in this case. According to Hamilton's principle of stationary action in Lagrangian mechanics, the solutions of the E-L equation for the action of a system describe the evolution of the physical system. In classical mechanics, the E-L equation is equivalent to Newton's laws of motion, but it has the advantage that it takes the same form in any system of generalized coordinates, and it is better suited to generalizations.

The E-L equation is an equation satisfied by a function $q$ of a real argument $t$ that is a stationary point of the functional

\[
S(q) = \int_a^b L\bigl(t, q(t), \dot{q}(t)\bigr)\, dt, \tag{2.1}
\]

where $q$ is the function to be found, $q : [a, b] \subset \mathbb{R} \to X$, $t \mapsto x = q(t)$, such that $q$ is differentiable, $q(a) = x_a$ and $q(b) = x_b$; $\dot{q}$ is the derivative of $q$, satisfying $\dot{q} : [a, b] \to T_{q(t)}X$, $t \mapsto \upsilon = \dot{q}(t)$, where $T_{q(t)}X$ denotes the tangent space of $X$ at $q(t)$; and $L$ is a real-valued function with continuous first partial derivatives, $L : [a, b] \times TX \to \mathbb{R}$, $(t, x, \upsilon) \mapsto L(t, x, \upsilon)$, with $TX$ being the tangent bundle of $X$ defined by $TX = \bigcup_{x \in X} \{x\} \times T_x X$. The E-L equation, then, is given by

\[
L_x\bigl(t, q(t), \dot{q}(t)\bigr) - \frac{d}{dt} L_\upsilon\bigl(t, q(t), \dot{q}(t)\bigr) = 0, \tag{2.2}
\]

where $L_x$ and $L_\upsilon$ denote the partial derivatives of $L$ with respect to $x$ and $\upsilon$, respectively.

To determine the equations of motion for the system dynamics, we follow these steps:

1. Determine the kinetic energy $K$ and the potential energy $P$.

2. Compute the Lagrangian

\[
L = K - P. \tag{2.3}
\]

3. Compute $\dfrac{\partial L}{\partial q}$.

4. Compute $\dfrac{\partial L}{\partial \dot{q}}$ and, from it, $\dfrac{d}{dt}\dfrac{\partial L}{\partial \dot{q}}$. It is important that $\dot{q}$ be treated as a complete variable rather than a derivative.

5. Solve the revised E-L equation for the system with the generalized forces

\[
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = Q_q, \tag{2.4}
\]

where $Q_q$ are the generalized forces and $q$ are the generalized coordinates.
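
As an illustration of the five steps above, consider a single inverted pendulum, which is not part of the RDIP model. The sketch assumes a point mass $m$ on a massless rod of length $l$, an angle $\theta$ measured from the upright position, and a viscous damping coefficient $b$ at the pivot. These symbols and the point-mass simplification are assumptions made only for this example.

```latex
\begin{aligned}
&\text{Step 1:} & K &= \tfrac{1}{2} m l^2 \dot{\theta}^2, \qquad P = m g l \cos\theta, \\
&\text{Step 2:} & L &= K - P = \tfrac{1}{2} m l^2 \dot{\theta}^2 - m g l \cos\theta, \\
&\text{Step 3:} & \frac{\partial L}{\partial \theta} &= m g l \sin\theta, \\
&\text{Step 4:} & \frac{\partial L}{\partial \dot{\theta}} &= m l^2 \dot{\theta}, \qquad
                  \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} = m l^2 \ddot{\theta}, \\
&\text{Step 5:} & m l^2 \ddot{\theta} - m g l \sin\theta &= -b\,\dot{\theta}.
\end{aligned}
```

The last line is (2.4) with the generalized force $Q_\theta = -b\dot{\theta}$. Near the upright position the gravity term $m g l \sin\theta$ drives $\theta$ away from zero, which is the instability that the controller has to counteract.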

2.3 Modeling

The RDIP works as follows. The movement of the arm on the base, Link 1, is constrained to the x-o-z plane, and the arm rotates around the y axis. The movements of the other two links are constrained to a vertical plane perpendicular to Link 1. Link 1 is driven by a DC motor, described in [8], which generates a torque to control the system. We do not discuss the servo system here; the control input of the RDIP is therefore the torque applied to Link 1. The control objective is to maintain the pendulums, Link 2 and Link 3, in the upright position with Link 1 in the origin position.

Figure 2.2: Velocity analysis for Link 2.

The total kinetic energy of each link in our system is the sum of its translational kinetic term $K_m$ and its rotational kinetic term $K_r$,

\[
K_m = \frac{1}{2} m v^2, \qquad K_r = \frac{1}{2} J \dot{\theta}_i^2, \tag{2.5}
\]

where $v$ and $\dot{\theta}_i$ are the translational velocity and the rotational angular velocity, respectively. We provide the analysis of the total kinetic energy of Link 2 in Figure 2.2; the other two links can be analyzed with the same method. The potential energy is easy to obtain, so we do not discuss it further. The total kinetic energy of the whole system is

\[
\begin{aligned}
K ={}& \frac{1}{2} J_1 \dot{\theta}_1^2 + \frac{1}{2} J_2 \dot{\theta}_2^2 + \frac{1}{2} J_3 \dot{\theta}_3^2
 + \frac{1}{2} m_2 \left[ \left( L_1 \dot{\theta}_1 + l_2 \dot{\theta}_2 \cos\theta_2 \right)^2 + \left( -l_2 \dot{\theta}_2 \sin\theta_2 \right)^2 \right] \\
&+ \frac{1}{2} m_3 \left[ \left( L_1 \dot{\theta}_1 + L_2 \dot{\theta}_2 \cos\theta_2 + l_3 \dot{\theta}_3 \cos\theta_3 \right)^2 + \left( -L_2 \dot{\theta}_2 \sin\theta_2 - l_3 \dot{\theta}_3 \sin\theta_3 \right)^2 \right],
\end{aligned} \tag{2.6}
\]

and the total potential energy is

\[
P = m_2 g l_2 \cos\theta_2 + m_3 g \left( L_2 \cos\theta_2 + l_3 \cos\theta_3 \right). \tag{2.7}
\]

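Now we can obtain the Lagrangian by applying (2.6) and (2.7) to (2.3),

\[
L = K - P.
\]

Applying the E-L equation (2.4) to (2.3) results in three coupled non-linear equations.

\[
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_1} - \frac{\partial L}{\partial \theta_1} = \tau - b_1 \dot{\theta}_1 \tag{2.8}
\]

becomes

\[
\begin{aligned}
\tau ={}& \left( J_1 + L_1^2 (m_2 + m_3) \right) \ddot{\theta}_1 + L_1 (m_2 l_2 + m_3 L_2) \cos\theta_2\, \ddot{\theta}_2 + L_1 m_3 l_3 \cos\theta_3\, \ddot{\theta}_3 \\
&+ b_1 \dot{\theta}_1 - L_1 (m_2 l_2 + m_3 L_2)\, \dot{\theta}_1^2 \sin\theta_2 - L_1 m_3 l_3\, \dot{\theta}_3^2 \sin\theta_3 .
\end{aligned} \tag{2.9}
\]

\[
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_2} - \frac{\partial L}{\partial \theta_2} = -b_2 \dot{\theta}_2 \tag{2.10}
\]

becomes

\[
\begin{aligned}
0 ={}& -L_1 (m_2 l_2 + m_3 L_2) \cos\theta_2\, \ddot{\theta}_1 - \left( J_2 + L_2^2 m_3 + l_2^2 m_2 \right) \ddot{\theta}_2 - L_2 m_3 l_3 \cos(\theta_2 - \theta_3)\, \ddot{\theta}_3 \\
&- b_2 \dot{\theta}_2 - L_2 m_3 l_3\, \dot{\theta}_3^2 \sin(\theta_2 - \theta_3) + (m_2 l_2 + m_3 L_2)\, g \sin\theta_2 .
\end{aligned} \tag{2.11}
\]

And

\[
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_3} - \frac{\partial L}{\partial \theta_3} = -b_3 \dot{\theta}_3 \tag{2.12}
\]

becomes

\[
\begin{aligned}
0 ={}& -L_1 m_3 l_3 \cos\theta_3\, \ddot{\theta}_1 - L_2 m_3 l_3 \cos(\theta_2 - \theta_3)\, \ddot{\theta}_2 - \left( J_3 + l_3^2 m_3 \right) \ddot{\theta}_3 \\
&- b_3 \dot{\theta}_3 + L_2 m_3 l_3\, \dot{\theta}_2^2 \sin(\theta_2 - \theta_3) + m_3 l_3 g \sin\theta_3 .
\end{aligned} \tag{2.13}
\]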

If the equations are parameterized they reduce to a more manageable form. Define $h_1$, $h_2$, $h_3$, $h_4$, $h_5$, $h_6$, $h_7$ and $h_8$ as

\[
\begin{aligned}
h_1 &= J_1 + L_1^2 (m_2 + m_3), \\
h_2 &= L_1 (m_2 l_2 + m_3 L_2), \\
h_3 &= L_1 m_3 l_3, \\
h_4 &= J_2 + L_2^2 m_3 + l_2^2 m_2, \\
h_5 &= L_2 m_3 l_3, \\
h_6 &= J_3 + l_3^2 m_3, \\
h_7 &= (m_2 l_2 + m_3 L_2)\, g, \\
h_8 &= m_3 l_3 g.
\end{aligned} \tag{2.14}
\]
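The dynamic equations are reduced to the form

\[
\begin{aligned}
\tau &= h_1 \ddot{\theta}_1 + h_2 \cos\theta_2\, \ddot{\theta}_2 + h_3 \cos\theta_3\, \ddot{\theta}_3 + b_1 \dot{\theta}_1 - h_2 \dot{\theta}_1^2 \sin\theta_2 - h_3 \dot{\theta}_3^2 \sin\theta_3, \\
0 &= -h_2 \cos\theta_2\, \ddot{\theta}_1 - h_4 \ddot{\theta}_2 - h_5 \cos(\theta_2 - \theta_3)\, \ddot{\theta}_3 - b_2 \dot{\theta}_2 - h_5 \dot{\theta}_3^2 \sin(\theta_2 - \theta_3) + h_7 \sin\theta_2, \\
0 &= -h_3 \cos\theta_3\, \ddot{\theta}_1 - h_5 \cos(\theta_2 - \theta_3)\, \ddot{\theta}_2 - h_6 \ddot{\theta}_3 - b_3 \dot{\theta}_3 + h_5 \dot{\theta}_2^2 \sin(\theta_2 - \theta_3) + h_8 \sin\theta_3.
\end{aligned} \tag{2.15}
\]

To make the system dynamics more accessible, we define $\theta = [\theta_1, \theta_2, \theta_3]^\top$, $\dot{\theta} = [\dot{\theta}_1, \dot{\theta}_2, \dot{\theta}_3]^\top$ and $\ddot{\theta} = [\ddot{\theta}_1, \ddot{\theta}_2, \ddot{\theta}_3]^\top$. Then (2.15) can be rewritten in the same form as the equations in [3],

\[
F(\theta)\ddot{\theta} + G(\theta, \dot{\theta}) + V(\theta) = u\, b(\theta), \tag{2.16}
\]

where

\[
F(\theta) = \begin{bmatrix}
h_1 & h_2 \cos\theta_2 & h_3 \cos\theta_3 \\
-h_2 \cos\theta_2 & -h_4 & -h_5 \cos(\theta_2 - \theta_3) \\
-h_3 \cos\theta_3 & -h_5 \cos(\theta_2 - \theta_3) & -h_6
\end{bmatrix}, \tag{2.17}
\]

\[
G(\theta, \dot{\theta}) = \begin{bmatrix}
b_1 \dot{\theta}_1 - h_2 \dot{\theta}_1^2 \sin\theta_2 - h_3 \dot{\theta}_3^2 \sin\theta_3 \\
-b_2 \dot{\theta}_2 - h_5 \dot{\theta}_3^2 \sin(\theta_2 - \theta_3) \\
-b_3 \dot{\theta}_3 + h_5 \dot{\theta}_2^2 \sin(\theta_2 - \theta_3)
\end{bmatrix}, \tag{2.18}
\]

\[
V(\theta) = \begin{bmatrix} 0 \\ h_7 \sin\theta_2 \\ h_8 \sin\theta_3 \end{bmatrix}, \tag{2.19}
\]

\[
b(\theta) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \tag{2.20}
\]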
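
To show how the form (2.16) can be used in a simulation, the following minimal sketch evaluates $F$, $G$, $V$ and $b$ as printed in (2.17) to (2.20) and solves the 3x3 linear system for the angular accelerations. All numerical parameter values, the dictionary `PARAMS` and the function names are hypothetical and chosen only for illustration; they are not the parameters used in this thesis.

```python
import math

# Hypothetical link parameters in SI units, chosen only for illustration.
# They are NOT the parameters of the thesis or of any real pendulum kit.
PARAMS = {
    "J1": 0.01, "J2": 0.001, "J3": 0.002,   # moments of inertia
    "L1": 0.2, "L2": 0.2,                   # link lengths
    "l2": 0.1, "l3": 0.15,                  # pivot-to-center-of-mass distances
    "m2": 0.1, "m3": 0.15,                  # link masses
    "b1": 0.01, "b2": 0.001, "b3": 0.001,   # viscous damping coefficients
    "g": 9.81,
}


def h_params(p):
    """Lumped parameters h1..h8 of (2.14); index 0 is unused."""
    return [
        None,
        p["J1"] + p["L1"] ** 2 * (p["m2"] + p["m3"]),
        p["L1"] * (p["m2"] * p["l2"] + p["m3"] * p["L2"]),
        p["L1"] * p["m3"] * p["l3"],
        p["J2"] + p["L2"] ** 2 * p["m3"] + p["l2"] ** 2 * p["m2"],
        p["L2"] * p["m3"] * p["l3"],
        p["J3"] + p["l3"] ** 2 * p["m3"],
        (p["m2"] * p["l2"] + p["m3"] * p["L2"]) * p["g"],
        p["m3"] * p["l3"] * p["g"],
    ]


def model_terms(theta, theta_dot, p=PARAMS):
    """F, G, V and b of (2.16), transcribed term by term from (2.17)-(2.20)."""
    h = h_params(p)
    _, t2, t3 = theta
    w1, w2, w3 = theta_dot
    c2, c3, c23 = math.cos(t2), math.cos(t3), math.cos(t2 - t3)
    s2, s3, s23 = math.sin(t2), math.sin(t3), math.sin(t2 - t3)
    F = [[h[1], h[2] * c2, h[3] * c3],
         [-h[2] * c2, -h[4], -h[5] * c23],
         [-h[3] * c3, -h[5] * c23, -h[6]]]
    G = [p["b1"] * w1 - h[2] * w1 ** 2 * s2 - h[3] * w3 ** 2 * s3,
         -p["b2"] * w2 - h[5] * w3 ** 2 * s23,
         -p["b3"] * w3 + h[5] * w2 ** 2 * s23]
    V = [0.0, h[7] * s2, h[8] * s3]
    b = [1.0, 0.0, 0.0]
    return F, G, V, b


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 system m x = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-15:
        raise ValueError("matrix is numerically singular")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def accelerations(theta, theta_dot, u, p=PARAMS):
    """Angular accelerations from F(theta) * acc = u * b - G - V."""
    F, G, V, b = model_terms(theta, theta_dot, p)
    rhs = [u * b[i] - G[i] - V[i] for i in range(3)]
    return solve3(F, rhs)


# At the upright equilibrium with zero input the accelerations vanish.
print(accelerations([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 0.0))
```

Cramer's rule is used only to keep the sketch free of external libraries; a numerical linear algebra routine would normally be preferred inside an integration loop.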
From the mathematical model, we can draw conclusions about the natural characteristics of the RDIP, similar to those in [9]:

• Open-loop Instability: As mentioned before, the upper equilibrium is an unstable equilibrium of the system. A small disturbance causes the open-loop system to leave this equilibrium and fall to the downward position, which is the stable equilibrium of the system. This characteristic can be seen in the MATLAB simulation result in Figure 5.1, where the model is started from some initial states without a control signal.

• Coupling Characteristic: The mathematical model shows strong coupling between the state variables of the pendulum. This can be observed by writing the dynamics in state differential equation form. We show this in Section 3.2.

So far, a nonlinear mathematical model has been built for the RDIP. In the following two chapters, we design controllers for the RDIP based on this model. In Chapter V, we validate this model by simulation and analyze its behavior from some initial states without control.

In attempting to further develop the mathematical model for the RDIP, several challenges present

themselves. The principal ones are:

• The emergence of vibrational modes associated with unmodeled dynamics related to the elas-

ticity of the structure.

• The model of the motor system and its incorporation with the RDIP model.

CHAPTER III

CONTROL USING LINEAR QUADRATIC REGULATOR

3.1 Introduction

The Linear Quadratic Regulator (LQR) is one of the main results of the theory of optimal control, which is concerned with operating a system at minimum cost. In this theory, the system dynamics are usually described by a set of linear differential equations and the cost of the control process is represented as a quadratic functional.

Suppose we have a dynamic process characterized by the vector-matrix differential equation

\[
\dot{x} = Ax + Bu, \tag{3.1}
\]

where $x$ is the vector of state variables, $u$ is the control input, and $A$ and $B$ are known matrices. The goal is to seek a feedback gain $K$, applied in the linear control law

\[
u = -Kx, \tag{3.2}
\]

so as to minimize the cost function $V$, expressed as the integral of a quadratic form in the state $x$ plus a second quadratic form in the control $u$,

\[
V = \frac{1}{2} \int_0^\infty \left( x^\top Q x + u^\top R u \right) dt, \tag{3.3}
\]

where $Q$ is a positive semi-definite symmetric matrix and $R$ is a positive definite symmetric matrix. With this assumption, the first integrand term $x^\top Q x$ is always positive or zero and the second term $u^\top R u$ is always positive at each time $t$ for all values of $x$ and all nonzero $u$. This guarantees that $V$ is well-defined. In terms of eigenvalues, the eigenvalues of $Q$ should be non-negative and those of $R$ should be positive. Usually, $Q$ and $R$ are selected to be diagonal for convenience; the diagonal entries of $Q$ are then positive with some possible zeros, while all the diagonal entries of $R$ must be positive. Note that $R$ is invertible.

The cost function $V$ is a performance index of the cost of the whole control process, and it can be interpreted as an energy function. The magnitude of the control action itself is included in the cost function so that the cost due to the control action remains limited. Since both the state $x$ and the control input $u$ are weighted in $V$, if $V$ is small, then both $x$ and $u$ are kept small. Furthermore, if $V$ is minimized, then it is certainly finite, and since it is an integral of a quadratic form in $x$ over an infinite horizon, this implies that $x$ goes to zero as $t$ goes to infinity, which guarantees that the closed-loop system is stable.

The plant is linear and the cost function $V$ is quadratic. For this reason, the problem of determining the state feedback control which regulates the states to zero while minimizing $V$ is called the Linear Quadratic Regulator (LQR).
By solving the algebraic Riccati equation (ARE) for $P$,

\[
A^\top P + P A + Q - P B R^{-1} B^\top P = 0, \tag{3.4}
\]

we obtain the optimal LQR gain $K$ as

\[
K = R^{-1} B^\top P. \tag{3.5}
\]

The minimal value of the performance criterion $V$ in (3.3) using this gain is given by

\[
V(x_0) = \frac{1}{2} x_0^\top P x_0, \tag{3.6}
\]

which depends only on the initial condition $x_0$. This means that the cost of using the LQR gain can be computed from the initial conditions before the control is ever applied to the system.
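
For a scalar plant the ARE (3.4) can be solved by hand, which gives a quick sanity check of the procedure. The sketch below is an illustration under assumptions, not the MATLAB computation used in this thesis: the plant $\dot{x} = ax + bu$, the weights `q` and `r`, and the function name `scalar_lqr` are all hypothetical.

```python
import math


def scalar_lqr(a, b, q, r):
    """LQR for the scalar plant x' = a*x + b*u with weights q >= 0 and r > 0.

    The scalar form of the ARE (3.4) is 2*a*P + q - (b*P)**2 / r = 0.
    Its non-negative root P and the gain K = b*P/r from (3.5) are returned.
    """
    if b == 0 or r <= 0 or q < 0:
        raise ValueError("need b != 0, r > 0 and q >= 0")
    P = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    K = b * P / r
    return P, K


# Hypothetical unstable plant x' = x + u with q = 3 and r = 1.
a, b, q, r = 1.0, 1.0, 3.0, 1.0
P, K = scalar_lqr(a, b, q, r)
are_residual = 2 * a * P + q - (b * P) ** 2 / r   # should be (numerically) zero
closed_loop_pole = a - b * K                      # negative means stable
print(P, K, are_residual, closed_loop_pole)
```

In this example the open-loop pole at $+1$ is moved into the left half-plane by the feedback $u = -Kx$, which is the scalar counterpart of the stabilization sought for the RDIP.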

Note that $Q$ and $R$ are set by the control engineer. In effect, the LQR algorithm takes over the tedious work of optimizing the controller. However, the engineer still needs to specify the weighting matrices $Q$ and $R$ and compare the results with the specified design goals. Often this makes the design an iterative process in which the engineer judges the resulting "optimal" controllers through simulation and then adjusts the weighting matrices to obtain a controller better suited to the specified design goals. A clear link between the adjusted matrices and the resulting changes in control behavior is hard to find, which limits the application of LQR-based controller synthesis.

The design procedure for finding the LQR feedback $K$ is:

1. Select LQR design matrices $Q$ and $R$.

2. Solve the ARE for $P$.

3. Find the LQR gain using $K = R^{-1} B^\top P$.

The MATLAB routine “lqr(A,B,Q,R)” is used here to perform the numerical procedure for solving the ARE.

3.2 Implementation

For the LQR control design, we need to linearize the RDIP dynamics at the upright equilibrium. By defining $x$ as

\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}
  = \begin{bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \dot{\theta}_1 \\ \dot{\theta}_2 \\ \dot{\theta}_3 \end{bmatrix}, \tag{3.7}
\]

we can rewrite (2.15) as

\[
\dot{x} = f(x) + g(x)u, \tag{3.8}
\]

where $f(x) \in \mathbb{R}^6$ and $g(x) \in \mathbb{R}^6$ (note that vectors in this paper are column vectors by default),

\[
f(x) = \begin{bmatrix} x_4 \\ x_5 \\ x_6 \\ f_4(x) \\ f_5(x) \\ f_6(x) \end{bmatrix}, \tag{3.9}
\]

\[
g(x) = \begin{bmatrix} 0 \\ 0 \\ 0 \\ g_4(x) \\ g_5(x) \\ g_6(x) \end{bmatrix}. \tag{3.10}
\]
We do not expand the terms $f_i(x)$ and $g_i(x)$ for $i = 4, 5, 6$ because their complete forms are too long to be written out here. They can be obtained from the dynamics (2.15) with the help of a computer. From the expanded form of (3.8), we find that all the terms $f_i(x)$ and $g_i(x)$ depend on the states $x_2$ to $x_6$. This indicates the strong coupling between the three links that we mentioned in the previous chapter.

Then we linearize the system with the method in [10],

\[
A = \frac{\partial f}{\partial x} =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_6} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_6} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_6}{\partial x_1} & \frac{\partial f_6}{\partial x_2} & \cdots & \frac{\partial f_6}{\partial x_6}
\end{bmatrix}, \tag{3.11}
\]

\[
B = \frac{\partial (g u)}{\partial u} =
\begin{bmatrix}
\frac{\partial (g_1 u)}{\partial u} \\
\frac{\partial (g_2 u)}{\partial u} \\
\vdots \\
\frac{\partial (g_6 u)}{\partial u}
\end{bmatrix}. \tag{3.12}
\]

Substituting the origin $x_0 = 0$ and $u = 0$ into $A$ and $B$, we have the linearized system

\[
A = \frac{1}{T}
\begin{bmatrix}
0 & 0 & 0 & T & 0 & 0 \\
0 & 0 & 0 & 0 & T & 0 \\
0 & 0 & 0 & 0 & 0 & T \\
0 & h_3 h_5 h_7 - h_2 h_6 h_7 & h_2 h_5 h_8 - h_3 h_4 h_8 & b_1 h_4 h_6 - b_1 h_5^2 & b_2 h_2 h_6 - b_2 h_3 h_5 & b_3 h_3 h_4 - b_3 h_2 h_5 \\
0 & h_7 h_3^2 - h_1 h_6 h_7 & h_1 h_5 h_8 - h_2 h_3 h_8 & b_1 h_2 h_6 - b_1 h_3 h_5 & b_2 h_1 h_6 - b_2 h_3^2 & b_3 h_2 h_3 - b_3 h_1 h_5 \\
0 & h_1 h_5 h_7 - h_2 h_3 h_7 & h_8 h_2^2 - h_1 h_4 h_8 & b_1 h_3 h_4 - b_1 h_2 h_5 & b_2 h_2 h_3 - b_2 h_1 h_5 & b_3 h_1 h_4 - b_3 h_2^2
\end{bmatrix}, \tag{3.13}
\]

\[
B = \frac{1}{T}
\begin{bmatrix}
0 \\ 0 \\ 0 \\ h_4 h_6 - h_5^2 \\ h_2 h_6 - h_3 h_5 \\ h_3 h_4 - h_2 h_5
\end{bmatrix}, \tag{3.14}
\]

where

\[
T = h_6 h_2^2 - 2 h_2 h_3 h_5 + h_4 h_3^2 + h_1 h_5^2 - h_1 h_4 h_6. \tag{3.15}
\]
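
The Jacobian evaluation in (3.11) and (3.12) can also be approximated numerically, which is a convenient cross-check on symbolic results. The sketch below is an illustration under assumptions, not the procedure used in this thesis: it applies central finite differences to a hypothetical single inverted pendulum written in the form (3.8), and the constants `g_over_l`, `damping` and `gain` as well as the function names are made up for the example.

```python
import math


def f_total(x, u, g_over_l=32.7, damping=0.1, gain=2.0):
    """Hypothetical single inverted pendulum, x = [angle, angular rate].

    Right-hand side f(x) + g(x)*u in the sense of (3.8); the angle is
    measured from the upright position.
    """
    theta, omega = x
    return [omega, g_over_l * math.sin(theta) - damping * omega + gain * u]


def jacobians(func, x0, u0, eps=1e-6):
    """Central-difference estimates of A = d(func)/dx and B = d(func)/du."""
    n = len(x0)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x_plus, x_minus = list(x0), list(x0)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = func(x_plus, u0), func(x_minus, u0)
        for i in range(n):
            A[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    f_plus, f_minus = func(x0, u0 + eps), func(x0, u0 - eps)
    B = [(f_plus[i] - f_minus[i]) / (2 * eps) for i in range(n)]
    return A, B


# Linearize about the upright equilibrium x0 = 0, u0 = 0.
A, B = jacobians(f_total, [0.0, 0.0], 0.0)
print(A, B)
```

For this toy model the exact linearization is $A = \begin{bmatrix} 0 & 1 \\ g/l & -\text{damping} \end{bmatrix}$ and $B = [0, \text{gain}]^\top$, so the numerical result can be compared against it directly.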
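Another simple approach in [2] is also good for linearizing the system model by using the approximation

\[
\sin\theta_i \approx \theta_i, \qquad \cos\theta_i \approx 1, \tag{3.16}
\]

under the condition that $\theta_i \approx 0$, $i = 1, 2, 3$.

We will use this method to linearize our second system model expressed in the form of (2.16). Thus we have

\[
\bar{F}(\theta)\ddot{\theta} + \bar{G}(\theta, \dot{\theta}) + \bar{V}(\theta) = u\, \bar{b}(\theta), \tag{3.17}
\]

where

\[
\bar{F}(\theta) = \begin{bmatrix} h_1 & h_2 & h_3 \\ -h_2 & -h_4 & -h_5 \\ -h_3 & -h_5 & -h_6 \end{bmatrix}, \tag{3.18}
\]

\[
\bar{G}(\theta, \dot{\theta}) = \begin{bmatrix} b_1 \dot{\theta}_1 \\ -b_2 \dot{\theta}_2 \\ -b_3 \dot{\theta}_3 \end{bmatrix}, \tag{3.19}
\]

\[
\bar{V}(\theta) = \begin{bmatrix} 0 \\ h_7 \theta_2 \\ h_8 \theta_3 \end{bmatrix}, \tag{3.20}
\]

\[
\bar{b}(\theta) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \tag{3.21}
\]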

Here we ignore all the high-order terms of θi in both of these two methods. High-order terms contain

at least quadratic quantities of θi . Since if θi are small, their squares are even smaller, the high-order

terms can be neglected.

With the linear state matrices $A$ and $B$, we are able to determine the controllability of the system. The controllability matrix can be computed with the help of MATLAB:

\[
C = \begin{bmatrix} B & AB & A^2 B & A^3 B & A^4 B & A^5 B \end{bmatrix}
= \begin{bmatrix}
0 & -137.033 & 717.598 & -10067.0 & 157330.0 & -2565733.0 \\
0 & -14.5856 & -18.2875 & 936.228 & -18645.9 & 318164.0 \\
0 & -133.981 & 1752.88 & -30799.6 & 492875.0 & -8090099.0 \\
-137.033 & 717.598 & -10067.0 & 157330.0 & -2565733.0 & 41787900 \\
-14.5856 & -18.2875 & 936.228 & -18645.9 & 318164.0 & -5232977.0 \\
-133.981 & 1752.88 & -30799.6 & 492875.0 & -8090099.0 & 131834000
\end{bmatrix}. \tag{3.22}
\]

The rank of the matrix $C$ is 6, therefore the linearized system is controllable. We simply choose $Q = I_6$ and $R = I$ to obtain a first gain $K$ that stabilizes the system by using the MATLAB command “lqr(A,B,Q,R)”, and then adjust the entries of these two matrices to optimize the performance of the LQR controller in the following section.
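
The rank test behind (3.22) can be sketched for a single-input system as follows. This is an illustration under assumptions rather than the MATLAB computation used in this thesis: the helper names are hypothetical, the two small example systems are made up, and the fixed absolute tolerance in `rank` is a simplification that would need scaling for matrices with entries as large as those in (3.22).

```python
def mat_vec(A, v):
    """Product of a square matrix (nested lists) and a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def controllability_matrix(A, B):
    """Matrix with columns B, AB, ..., A^(n-1)B for a single-input system."""
    n = len(A)
    columns = [list(B)]
    for _ in range(n - 1):
        columns.append(mat_vec(A, columns[-1]))
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def rank(M, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        pivot = max(range(r, n_rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue  # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            for k in range(c, n_cols):
                M[i][k] -= factor * M[r][k]
        r += 1
    return r


# Hypothetical double integrator: controllable, rank equals the state dimension.
C_ok = controllability_matrix([[0.0, 1.0], [0.0, 0.0]], [0.0, 1.0])
# Hypothetical decoupled system whose second state is never driven by the input.
C_bad = controllability_matrix([[1.0, 0.0], [0.0, 2.0]], [1.0, 0.0])
print(rank(C_ok), rank(C_bad))
```

A system of order $n$ is controllable when this rank equals $n$, which is the criterion applied to the sixth-order linearized RDIP model above.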

3.3 LQR Tuning

As we discussed in Section 3.1, $Q$ and $R$ are selected by the design engineer. Different choices of these design parameters lead to different control performance of the closed-loop system. Generally speaking, a large $Q$ means that, to keep $V$ small, the state $x$ must be smaller, which places the poles of the closed-loop system matrix $A_c = A - BK$ further left in the s-plane so that the states converge to zero faster. On the other hand, selecting $R$ large means that the control input $u$ must be smaller to keep $V$ small, which implies less control effort. In this case the poles are slower, resulting in larger values of the state $x$.
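
The influence of the weights on the pole location can be made explicit in the scalar case, used here purely as an illustrative sketch and not as part of the RDIP design. For a hypothetical scalar plant $\dot{x} = ax + bu$ with $b \neq 0$ and scalar weights $q \geq 0$ and $r > 0$, substituting the non-negative solution of the scalar form of the ARE (3.4) into $A_c = A - BK$ gives:

```latex
a_c = a - bK = -\sqrt{a^2 + \frac{b^2 q}{r}} .
```

The closed-loop pole depends on the weights only through the ratio $q/r$. Increasing $q$ moves the pole further left, while increasing $r$ moves it toward $-|a|$, which corresponds to a slower response obtained with less control effort.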

A reasonably simple choice for the matrices $Q$ and $R$ is given by Bryson's rule [11]. Select $Q$ and $R$ diagonal with

\[
Q_{ii} = \frac{1}{\text{maximum acceptable value of } x_i^2}, \quad i = 1, \ldots, l, \tag{3.23}
\]

\[
R_{jj} = \frac{1}{\text{maximum acceptable value of } u_j^2}, \quad j = 1, \ldots, k. \tag{3.24}
\]

In this way, Bryson's rule scales the variables that appear in $V$ so that the maximum acceptable value for each term is one. This is especially important when the units used for the different components of $u$ and $x$ make the values of the variables numerically very different from each other.

Although Bryson's rule usually gives good results, it is just the starting point of a trial-and-error iterative design procedure aimed at obtaining a controller more in line with the desirable properties of the closed-loop system.

Applying different Q and R with different initial states to construct the LQR controller in the MATLAB simulation, we find that the magnitudes of the three angular velocities are around 3, the magnitudes of the three angles are around π/3, and the magnitude of the control input is around 1. According to Bryson's rule, we therefore choose the following Q and R to start our tuning:

\[
Q = \begin{bmatrix}
0.9119 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.9119 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.9119 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.1111 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.1111 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.1111
\end{bmatrix}, \qquad R = 1, \tag{3.25}
\]

and, after some fine tuning, finally obtain the optimized result

\[
Q = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.1667 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.1667 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.1667
\end{bmatrix}, \qquad R = 0.8. \tag{3.26}
\]
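The starting weights in (3.25) follow directly from Bryson's rule and can be reproduced with a few lines. The sketch below is in Python rather than MATLAB, and the function name `bryson_weights` is introduced here only for illustration; the magnitudes (π/3 for the angles, 3 for the angular velocities, 1 for the input) are the ones quoted in the text.

```python
import math

def bryson_weights(max_values):
    """Diagonal weights 1 / (maximum acceptable value)^2, as in (3.23)-(3.24)."""
    return [1.0 / (m * m) for m in max_values]

# Magnitudes quoted in the text: three angles, three angular velocities, one input.
state_max = [math.pi / 3] * 3 + [3.0] * 3
input_max = [1.0]

q_diag = bryson_weights(state_max)   # diagonal of Q
r_diag = bryson_weights(input_max)   # diagonal of R (a scalar here)

print([round(q, 4) for q in q_diag])  # [0.9119, 0.9119, 0.9119, 0.1111, 0.1111, 0.1111]
print(r_diag)                         # [1.0]
```

The fine-tuned values in (3.26) are not produced by the rule; they come from the trial-and-error step described in the text.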

In fact, the entries of Q and R can be adjusted separately, and doing so may provide an even better result. Here we use only two parameters, q1 and q2, to adjust all the entries, since we find that the changes of the angles are similar in magnitude and the same holds for the angular velocities. In addition, using two parameters makes it more convenient to show how to adjust them and to compare the tuning results of different combinations. We find that this setting is already good enough to show a distinct improvement.

The tuning effect is shown in Section 5.2 by comparing the control results of the LQR with identity weight matrices and the tuned LQR. We can see that tuning improves the RDIP control system. However, the effect of this optimization method is limited. For instance, for the initial state $x(0) = [0 \;\; {-0.1} \;\; 0.1 \;\; 0 \;\; 0 \;\; 0]^\top$, which we will discuss in detail in Chapters IV and V, it is impossible to control the system with the LQR no matter how we tune the weight matrices. Therefore, we need another way to optimize the LQR. The DAFC scheme is adopted for this purpose.

3.4 Stability Analysis

Although it is possible to optimize the LQR controller by adjusting the matrices Q and R following Bryson's rule, we still know nothing about the stability of the RDIP system near the upright position. Stability plays a central role in the analysis of any control method. Here we will try to determine the Region of Attraction (R.O.A) of the closed-loop RDIP system to analyze the system stability (we will define the notion of the R.O.A later in this section, in the introduction of Lyapunov stability). Once the R.O.A of the system is available, a standard can be established to compare the performance of different control methods: a method providing a larger R.O.A makes the system stable over a larger set of initial states.

We will check the stability near the upright position, one of the two equilibrium points. An equilibrium point is stable if all solutions starting at nearby points stay nearby; otherwise it is unstable. It is asymptotically stable if all solutions starting at nearby points not only stay nearby, but also tend to the equilibrium point as time approaches infinity. In the case of the RDIP, the downward equilibrium is asymptotically stable while the upright equilibrium is unstable. Stability of equilibrium points is usually characterized in the sense of Lyapunov. Lyapunov's method helps to prove stability without requiring knowledge of the true physical energy, provided a Lyapunov function can be found that satisfies the constraints given below.

One thing we need to mention here is that the R.O.A might not be the only concern when we decide which control method to select for a given system. For example, we may need a quick-response controller that forces the system to converge to the origin as fast as possible. Also, there is a limit to the Lyapunov stability analysis: we cannot directly estimate the performance of a controller that updates itself online, such as the DAFC method we will adopt in the next chapter.

Here Lyapunov's second method for stability is used. For more details about this method, readers can refer to [2]. Suppose we have a system with an equilibrium point at x = 0. A Lyapunov candidate function is a continuously differentiable function $V : D \to \mathbb{R}$, defined on a domain $D \subset \mathbb{R}^n$ that contains the origin, such that

\[
V(0) = 0 \quad \text{and} \quad V(x) > 0 \ \text{in} \ D \setminus \{0\}. \tag{3.27}
\]

Then the system is stable in the sense of Lyapunov if

\[
\dot V(x) = \frac{dV(x)}{dt} \le 0 \ \text{in} \ D. \tag{3.28}
\]

In that case, the trajectory of the system states is confined to a region from which it can never escape.

Moreover, if

\[
\dot V(x) = \frac{dV(x)}{dt} < 0 \ \text{in} \ D \setminus \{0\}, \tag{3.29}
\]

the system is asymptotically stable, since the trajectory of the states approaches the origin as time progresses. An additional condition called “properness” or “radial unboundedness” is required in order to conclude global asymptotic stability. In general, the origin is stable if there is a continuously differentiable positive definite function V(x) such that $\dot V(x)$ is negative semidefinite, and it is asymptotically stable if $\dot V(x)$ is negative definite.

To understand this method of stability analysis, we can visualize the Lyapunov function as the energy of a physical system. If no energy is supplied to it, the system will lose energy over time (due to vibration, friction or other factors) and finally stop in some final resting state. This final state is called the attractor. For a system to be controlled, there may be many Lyapunov functions that could be applied, but finding an appropriate Lyapunov function to support the stability analysis of a system is difficult.

Usually we will use a class of scalar functions of the quadratic form, for which sign definiteness can be easily checked,

\[
V(x) = x^\top P x = \sum_{i=1}^{n} \sum_{j=1}^{n} p_{ij} x_i x_j, \tag{3.30}
\]

where P is a real symmetric positive definite matrix, in which case V(x) is guaranteed to be a valid Lyapunov candidate. Here we reuse the Q matrix that we selected for the LQR control to construct the Lyapunov candidate function for our system,

\[
V(x) = x^\top Q x, \tag{3.31}
\]

where $x = [x_1, x_2, x_3, x_4, x_5, x_6]^\top$.

Therefore we can easily get

\[
\dot V(x) = x^\top Q \dot x + \dot x^\top Q x, \tag{3.32}
\]

where $\dot x = [\dot x_1, \dot x_2, \dot x_3, \dot x_4, \dot x_5, \dot x_6]^\top$ and Q is already selected as a positive definite diagonal matrix.

Substituting u = −Kx into (3.8) we have

\[
\dot x = f(x) - g(x) K x, \tag{3.33}
\]

and then

\[
\dot V(x) = x^\top Q \left[ f(x) - g(x) K x \right] + \left[ f(x) - g(x) K x \right]^\top Q x. \tag{3.34}
\]

At this point, we are able to estimate the R.O.A of the closed-loop system with the LQR controller using MATLAB. The procedure is:

1. Sample a sufficiently large region around the origin at small intervals.

2. Evaluate $\dot V(x)$ at these sample points according to (3.34).

3. Pick the sample point nearest to the origin, $x_p$, such that

\[
\dot V(x_p) \ge 0. \tag{3.35}
\]

4. If no such point exists, enlarge the sample set and repeat Steps 1 to 3 until a point $x_p$ is found.

5. The R.O.A can be estimated as

\[
\text{R.O.A} \approx \|x_p\|. \tag{3.36}
\]

Note that the approximation sign is used here because the R.O.A should be a region that excludes $x_p$. Since $x_p$ is the nearest point satisfying (3.35), it lies on the boundary between the R.O.A and the area outside. Any point closer to the origin than $x_p$ makes $\dot V(x)$ negative. If there were another point $x_p'$ within the boundary for which $\dot V(x_p')$ is nonnegative, then the R.O.A would be smaller than the given one. In this way, we can make sure that the statement stands. To make the identification more precise, we could use smaller intervals to sample the test points. We also note that we use the Euclidean norm to express the distance between a point and the origin. Therefore,

\[
\|x\| = \|x\|_2 = \sqrt{x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 + x_6^2}. \tag{3.37}
\]
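The sampling procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: it is written in Python rather than MATLAB, the RDIP closed-loop dynamics f(x) − g(x)Kx are not reproduced here, and `toy_closed_loop` is a hypothetical two-state stand-in chosen so that the answer is known in advance (for it, $\dot V$ changes sign on the unit circle). The origin itself is skipped, since $\dot V(0) = 0$ holds trivially and would otherwise always be returned as the nearest point.

```python
import itertools
import math

def v_dot(x, q_diag, closed_loop):
    """V_dot = x'Q x_dot + x_dot'Q x = 2 x'Q x_dot for a diagonal Q, as in (3.32)."""
    x_dot = closed_loop(x)
    return 2.0 * sum(q * xi * dxi for q, xi, dxi in zip(q_diag, x, x_dot))

def estimate_roa(closed_loop, q_diag, half_width, steps):
    """Steps 1-3: grid the box [-half_width, half_width]^n and return the norm of the
    nearest nonzero sample with V_dot >= 0, or None if no such sample exists
    (Step 4 would then enlarge the box and try again)."""
    axis = [half_width * i / steps for i in range(-steps, steps + 1)]
    best = None
    for x in itertools.product(axis, repeat=len(q_diag)):
        r = math.sqrt(sum(xi * xi for xi in x))
        if r == 0.0:
            continue  # skip the equilibrium itself
        if v_dot(x, q_diag, closed_loop) >= 0.0 and (best is None or r < best):
            best = r
    return best

def toy_closed_loop(x):
    """Hypothetical stand-in for f(x) - g(x)Kx: x_dot = -x + |x|^2 x."""
    r2 = sum(xi * xi for xi in x)
    return [-xi + r2 * xi for xi in x]

roa = estimate_roa(toy_closed_loop, q_diag=[1.0, 1.0], half_width=2.0, steps=40)
print(roa)  # close to 1.0 for this toy system
```

For the six-state RDIP the grid grows as the sixth power of the number of points per axis, so the interval size has to be traded off against computation time.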

q1 = π
q2 \ R      100       10        5        4        3        2        1
1        0.0361   0.0490   0.0447   0.0436   0.0447   0.0480   0.0458
10       0.0387   0.0490   0.0566   0.0648   0.0632   0.0616   0.0592
100      0.0387   0.0490   0.0566   0.0648   0.0632   0.0616   0.0592
1000     0.0387   0.0490   0.0566   0.0648   0.0632   0.0616   0.0592

Table 3.1: R.O.A searching result with q1 = π (rows: q2; columns: R).

q1       Largest R.O.A    q2     R
π        0.0648           10     4
π/2      0.0721           10     1
π/4      0.0755           10     5
π/8      0.0700           10     10

Table 3.2: Largest R.O.A for different q1.

For the convenience of the simulation, we simply set the matrix Q in the same way as in Section 3.3,

\[
Q = \begin{bmatrix}
q_1 & 0 & 0 & 0 & 0 & 0 \\
0 & q_1 & 0 & 0 & 0 & 0 \\
0 & 0 & q_1 & 0 & 0 & 0 \\
0 & 0 & 0 & q_2 & 0 & 0 \\
0 & 0 & 0 & 0 & q_2 & 0 \\
0 & 0 & 0 & 0 & 0 & q_2
\end{bmatrix}, \tag{3.38}
\]

where q1 and q2 are selected for the angles and the angular velocities of the links, respectively. Note that we adjust only two parameters, q1 and q2, for the same reason as in Section 3.3, although the six states can in fact be adjusted separately. Table 3.1 shows one of the search results in simulation. Here, we fix q1 at π and vary q2 and R over some representative values. We find that, for this fixed q1, the system attains its largest R.O.A when q2 = 10 and R = 4. Table 3.2 shows the largest R.O.A of the system for different q1. For each choice of q1, the table gives not only the largest R.O.A but also the corresponding q2 and R values.

From Table 3.2, we can see that when q1 = π/4, q2 = 10 and R = 5, the LQR attains its largest R.O.A of 0.0755, which means that if ||x|| < 0.0755, the derivative of the Lyapunov function is negative and therefore the system is asymptotically stable. Beyond the R.O.A, the behavior of the RDIP system cannot be predicted by this analysis. There are still some points from which the states of the system converge to the origin under control, although convergence is not guaranteed there. Interestingly, although the R.O.A is very small, the LQR can work in a region well beyond it. A possible explanation is that other Lyapunov candidates exist that provide a better estimate of the R.O.A than the Lyapunov function we defined here. We leave this topic for the RDIP system with LQR to future study.

CHAPTER IV

ADAPTIVE CONTROL

4.1 Introduction

Since the LQR can only work in a small region, we intend to improve the performance of the LQR controller. We will develop the LQR controller into a Direct Adaptive Fuzzy Control (DAFC) system as described in [6] and [12]. From the conclusion of [9], we know that a fuzzy controller can be used to control the RDIP. A theoretical analysis of the stability and design of a fuzzy control system is introduced in [13] using the Takagi-Sugeno (T-S) fuzzy model. However, this approach has some problems. While non-adaptive fuzzy control has proven its value in applications, it is difficult to specify the rule base for some plants, and the need could arise to tune the rule base parameters if the plant changes. In the RDIP system, it is very hard to gather heuristic knowledge about how to control the RDIP to make it stand upright. Since heuristics do not provide enough information to specify all the parameters of the fuzzy controller a priori, adaptive schemes that use data gathered during the on-line operation of the controller can be used to improve the fuzzy system by making it automatically learn the parameters, so as to ensure that the performance objectives are met.

Some adaptive control schemes have already been applied in system control. The first adaptive fuzzy controller, the linguistic self-organizing controller, is introduced in [14]. Another successful method, the so-called “fuzzy model reference learning controller”, is introduced in [15]-[16]. However, the problem with them is that, while they appear to be practical heuristic approaches to adaptive fuzzy control, there is no proof that these methods will result in a stable closed-loop system. Here, we optimize the LQR controller with the DAFC, for which a stability analysis is provided in [6] and experimental validation in [12]. Therefore the stability requirement can be met for a safety-critical system such as the RDIP experimental plant in the lab. We give a detailed introduction to this control scheme in Section 4.3.

The DAFC attempts to directly adjust the parameters of a fuzzy or neural controller to achieve asymptotic tracking of a reference input. The DAFC has several advantages:

• The stability analysis of this controller applies to systems with a state-dependent input gain, such as systems with the LQR gain.

• The DAFC method is designed for plants whose zero dynamics are minimum phase; however, it also appears to work for some plants with non-minimum-phase zero dynamics.

• The direct adaptive controller allows for T-S fuzzy systems, standard fuzzy systems, or neural networks.

• The direct adaptive technique presented here allows for the inclusion of a known controller $u_k$, so that it may be used either to enhance the performance of some pre-specified controller or to stand alone as a stable adaptive controller.

For what follows in this chapter, the notation from [12] will be used. In the next section, we first introduce the T-S fuzzy system. Readers can consult [17] and [18] for a full treatment of this kind of fuzzy system. Then we describe the DAFC and specify the DAFC scheme for the RDIP, with the LQR controller we have obtained serving as the “known part” of the controller.

4.2 Takagi-Sugeno Fuzzy System

This section largely follows [6] to provide an introduction to the Takagi-Sugeno (T-S) fuzzy system. Readers can refer to [19] for more details about the T-S fuzzy system. To briefly present the notation, take a fuzzy system denoted by $\tilde f(x)$. Then, $\tilde f(x) = \sum_{i=1}^{p} c_i \mu_i / \sum_{i=1}^{p} \mu_i$. Here, singleton fuzzification of the input $x = [x_1 \; x_2 \; \ldots \; x_n]^\top$ is assumed; the fuzzy system has p rules, and $\mu_i$ is the value of the membership function for the antecedent of the ith rule given the input x. It is assumed that the fuzzy system is constructed in such a way that $\sum_{i=1}^{p} \mu_i \neq 0$ for all $x \in \mathbb{R}^n$. The parameter $c_i$ is the consequent of the ith rule which, in this paper, will be taken as a linear combination of Lipschitz continuous functions $z_k(x) \in \mathbb{R}$, $k = 1, \ldots, m-1$, so that $c_i = a_{i,0} + a_{i,1} z_1(x) + \ldots + a_{i,m-1} z_{m-1}(x)$, $i = 1, \ldots, p$. Define

\[
z = \begin{bmatrix} 1 \\ z_1(x) \\ \vdots \\ z_{m-1}(x) \end{bmatrix}, \qquad
\zeta^\top = \frac{[\mu_1 \; \ldots \; \mu_p]}{\sum_{i=1}^{p} \mu_i}, \qquad
A^\top = \begin{bmatrix}
a_{1,0} & a_{1,1} & \ldots & a_{1,m-1} \\
a_{2,0} & a_{2,1} & \ldots & a_{2,m-1} \\
\vdots & \vdots & \ddots & \vdots \\
a_{p,0} & a_{p,1} & \ldots & a_{p,m-1}
\end{bmatrix}. \tag{4.1}
\]

Then, the nonlinear equation that describes the fuzzy system can be written as $\tilde f(x) = z^\top A \zeta$. One of the membership functions we use in the DAFC design for our system is shown in Figure 4.1.

Figure 4.1: Membership function with c = 1 and σ = 0.25.
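To make the notation concrete, the following sketch evaluates $\tilde f(x) = z^\top A \zeta$ for a very small system and checks that it equals the weighted average $\sum c_i \mu_i / \sum \mu_i$. It is written in Python for illustration; the function name `ts_fuzzy_output` and all numbers (m = 2, p = 3, the entries of A and the membership values) are hypothetical and are not taken from the RDIP design.

```python
def ts_fuzzy_output(z, A, mu):
    """Evaluate f(x) = z' A zeta with zeta = mu / sum(mu).

    z  : list of m values [1, z_1(x), ..., z_{m-1}(x)]
    A  : m x p nested list with A[k][i] = a_{i,k}
    mu : list of p antecedent membership values
    """
    total = sum(mu)
    if total == 0.0:
        raise ValueError("the membership values must not all be zero")
    zeta = [m_i / total for m_i in mu]
    # Rule consequents c_i = sum_k a_{i,k} z_k, i.e. the entries of the row vector z' A.
    c = [sum(z[k] * A[k][i] for k in range(len(z))) for i in range(len(mu))]
    return sum(c_i * zeta_i for c_i, zeta_i in zip(c, zeta))

# Hypothetical example: z = [1, x] with x = 0.5 and three rules.
x = 0.5
z = [1.0, x]
A = [[0.0, 1.0, 2.0],     # a_{i,0} for i = 1, 2, 3
     [1.0, 0.0, -1.0]]    # a_{i,1} for i = 1, 2, 3
mu = [0.2, 0.5, 0.3]

f_tilde = ts_fuzzy_output(z, A, mu)
print(f_tilde)  # approximately 1.05 (consequents 0.5, 1.0, 1.5 weighted by 0.2, 0.5, 0.3)
```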

4.3 Theory

A DAFC controller directly adjusts the parameters of a controller to meet some performance specifications. In [12], the author developed the adaptive control method by assuming that $0 < \beta_0 \le \beta(x) \le \beta_1 < \infty$. For the RDIP, the corresponding assumption holds instead as $-\infty < \beta_1 \le \beta(x) \le \beta_0 < 0$, which can be seen from the simulation results of the RDIP dynamics in MATLAB. Here we re-develop the direct adaptive scheme for our case and at the same time provide readers with a description of the method.

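Suppose we have the dynamics of the plant as

\[
\begin{aligned}
\dot x &= f(x) + g(x) u, \\
y &= h(x),
\end{aligned} \tag{4.2}
\]

where $x \in \mathbb{R}^n$ is the state vector, $u \in \mathbb{R}$ is the input, $y \in \mathbb{R}$ is the output of the plant, and the functions $f(x), g(x) \in \mathbb{R}^n$ and $h(x) \in \mathbb{R}$ are smooth. The system has a “strong relative degree” r as

\[
\begin{aligned}
\dot y &= L_f h(x), \\
\ddot y &= L_f^2 h(x), \\
&\ \, \vdots \\
y^{(r)} &= L_f^r h(x) + L_g L_f^{r-1} h(x) u,
\end{aligned} \tag{4.3}
\]

where $L_f^r h(x)$ is the rth Lie derivative of h(x) with respect to f and $L_g^r h(x)$ is the rth Lie derivative of h(x) with respect to g,

\[
\begin{aligned}
L_f h(x) &= \frac{\partial h}{\partial x} f(x), \\
L_g h(x) &= \frac{\partial h}{\partial x} g(x), \\
L_f^2 h(x) &= L_f [L_f h(x)],
\end{aligned} \tag{4.4}
\]

and so on. Then we have

\[
y^{(r)} = \alpha(x) + \alpha_k(t) + [\beta(x) + \beta_k(t)] u. \tag{4.5}
\]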

We assume that $\alpha_k(t)$ and $\beta_k(t)$ are known components of the dynamics of the plant (that may depend on the states) or known exogenous time-dependent signals, and that $\alpha(x)$ and $\beta(x)$ represent nonlinear dynamics of the plant that are unknown. It is also assumed that if x is a bounded state vector, then $\alpha_k(t)$ and $\beta_k(t)$ are bounded signals. Throughout the analysis to follow, both $\alpha_k(t)$ and $\beta_k(t)$ may be set to zero for all t ≥ 0.

We make the following plant assumptions.

1. The plant is of relative degree $1 \le r < n$ with the zero dynamics exponentially attractive, and there exist $\beta_0$ and $\beta_1$ such that $-\infty < \beta_1 \le \beta(x) \le \beta_0 < 0$.

2. $x_1, \ldots, x_n$ and $y, \ldots, y^{(r-1)}$ are measurable. Furthermore, by the use of Lipschitz properties of ψ(ξ, π), the plants satisfying this assumption have bounded states [20].

3. We require that $\beta_k(t) = 0$ for t ≥ 0, and that there exists some function $B(x) \ge 0$ such that $|\dot\beta(x)| = |(\partial\beta/\partial x)\dot x| \le B(x)$.

4. There exists some $\alpha_1(x) \ge |\alpha(x)|$.

Although Assumption 1 is not met in our case, since the zero dynamics of our system are undetermined, we will derive the control scheme for our system despite this condition. From [6] we know that this method works for some simple nonlinear systems that are non-minimum-phase, such as the rotational single inverted pendulum. We try to implement this method in a more complicated case and verify whether it still works. Assumption 1 also introduces the requirement that the controller gain β(x) be bounded by a constant $\beta_0$ from above and a constant $\beta_1$ from below. The third restriction requires that $|\dot\beta(x)| \le B(x)$ for some B(x) > 0. If $\|\partial\beta/\partial x\|$ and $\|\dot x\|$ are bounded, then some B(x) may be found. If the controller gain of the system is finite, $\|\partial\beta/\partial x\|$ is bounded. If $y^{(i)}$ is bounded for $i = 0, \ldots, r$, then for plants with no zero dynamics $\|\dot x\|$ is guaranteed to be bounded, since the states can be represented in terms of the outputs $y^{(i)}$. For a plant that has zero dynamics, if β(x) does not depend on the zero dynamics, then once again $|\dot\beta(x)|$ is bounded. In [6], a function of x is found as $\alpha_1(x)$ to meet Assumption 4. In this paper, we use a constant as a global bound for α(x) to simplify the choice of this function. The constant is obtained by observation of the RDIP system simulation results.

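Using feedback linearization theory in [2], we assume that there exists some ideal controller

\[
u^* = \frac{1}{\beta(x)} \left[ -\alpha(x) + \nu(t) \right], \tag{4.6}
\]

where ν(t) is a free parameter. We may express $u^*$ in terms of a T-S fuzzy model, so that

\[
u^* = z_u^\top A_u^* \zeta_u + u_k + d_u(x), \tag{4.7}
\]

where $z_u \in \mathbb{R}^{m_u}$, $\zeta_u \in \mathbb{R}^{p_u}$, and $A_u^* \in \mathbb{R}^{m_u \times p_u}$ is the matrix of ideal direct control parameters

\[
A_u^* := \arg\min_{A_u \in \Omega_u} \left[ \sup_{x \in S_x,\, \nu \in S_m} \left| z_u^\top A_u \zeta_u - (u^* - u_k) \right| \right]. \tag{4.8}
\]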

Here $d_u(x)$ is an approximation error which arises when $u^*$ is represented by a fuzzy system. We assume that $|d_u(x)| \le D_u(x)$, where $D_u(x)$ is a known bound on the error in representing the ideal controller with a fuzzy system. If $|d_u(x)|$ is to be small, then our fuzzy controller will require x and ν to be available, either through the input membership functions or through $z_u$. The term $u_k$ is a known part of the controller. The DAFC attempts to directly determine a controller, so within this chapter we allow for a known part of the controller that is perhaps specified via heuristics or past experience with the application of conventional direct control (in our case, LQR). The approximation of the desired control is

\[
\hat u = z_u^\top A_u \zeta_u + u_k, \tag{4.9}
\]

where the matrix $A_u$ is updated online. The parameter error matrix

\[
\phi_u(t) = A_u(t) - A_u^* \tag{4.10}
\]

is used to define the difference between the parameters of the current controller and the desired controller. The control law is given by

\[
u = \hat u + u_{sd} + u_{bd}. \tag{4.11}
\]

In general, the DAFC is comprised of a bounding control term $u_{bd}$, a sliding-mode control term $u_{sd}$, and an adaptive control term $\hat u$. Here we define $\nu := y_m^{(r)} + \eta e_s + \bar e_s - \alpha_k(t)$, with $e_0$, $e_s$ and $\bar e_s$ defined as

\[
\begin{aligned}
e_0 &= y_m - y_p, \\
e_s &= [e_0 \; \ldots \; e_0^{(r-1)}] \, [k_0 \; \ldots \; k_{r-2}, \; 1]^\top, \\
\bar e_s &= \dot e_s - e_0^{(r)},
\end{aligned} \tag{4.12}
\]

where $\hat L(s) := s^{r-1} + k_{r-2} s^{r-2} + \ldots + k_1 s + k_0$ has its roots in the open left-half plane.

Combining the above equations, we obtain

\[
\dot e_s + \eta e_s = -\beta(x)(\hat u - u^*) - \beta(x)(u_{sd} + u_{bd}). \tag{4.13}
\]

4.3.1 Bounding Control

We now define the bounding control term $u_{bd}$ of the DAFC. The bounding control term is determined by considering

\[
v_{bd} = \frac{1}{2} e_s^2. \tag{4.14}
\]

We differentiate (4.14) and use (4.13) to obtain

\[
\begin{aligned}
\dot v_{bd} &= -\eta e_s^2 - e_s \left[ \beta(x)(\hat u - u^*) + \beta(x)(u_{sd} + u_{bd}) \right] \\
&\le -\eta e_s^2 - |e_s| \left[ \beta(x)(|\hat u| + |u^*|) + \beta(x)|u_{sd}| \right] - \beta(x) u_{bd} e_s.
\end{aligned} \tag{4.15}
\]

We do not explicitly know $u^*$; however, the bounding controller can be implemented using $\alpha_1(x) \ge |\alpha(x)|$ as

\[
u_{bd} =
\begin{cases}
\left( -|\hat u| - |u_{sd}| + \dfrac{\alpha_1(x) + |\nu|}{\beta_0} \right) \operatorname{sgn}(e_s) & \text{if } |e_s| > M_e, \\
0 & \text{otherwise.}
\end{cases} \tag{4.16}
\]

Using (4.15) and (4.16), we have

\[
\dot v_{bd} \le -\eta e_s^2, \quad \text{if } |e_s| \ge M_e. \tag{4.17}
\]

Thus we are ensured that if there exists a time $t_0$ such that $|e_s(t_0)| > M_e$, then for $t > t_0$, $|e_s(t)|$ will decrease exponentially until $|e_s(t)| \le M_e$.
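As an illustration of (4.16), the bounding term can be written as a small function. This is a sketch in Python, not the thesis code; the default values α1 = 20, β0 = −100 and Me = 8 are the ones quoted in Section 4.4, while the function name and the sample inputs are hypothetical.

```python
def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)

def bounding_control(e_s, u_hat, u_sd, nu, alpha1=20.0, beta0=-100.0, M_e=8.0):
    """Bounding term u_bd of (4.16) for a negative gain bound beta0 < 0.

    The term is switched on only when |e_s| exceeds M_e; inside the band it is zero.
    """
    if abs(e_s) <= M_e:
        return 0.0
    return (-abs(u_hat) - abs(u_sd) + (alpha1 + abs(nu)) / beta0) * sgn(e_s)

# Hypothetical sample values, chosen only to exercise both branches.
u_bd_inside = bounding_control(e_s=2.0, u_hat=0.5, u_sd=0.1, nu=5.0)
u_bd_outside = bounding_control(e_s=10.0, u_hat=0.5, u_sd=0.1, nu=5.0)
print(u_bd_inside, u_bd_outside)  # 0.0 inside the band, about -0.85 outside
```

Because β0 is negative, every term inside the parentheses is non-positive, so $u_{bd}$ always has the opposite sign to $e_s$ when it is active.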

4.3.2 Adaptation Algorithm

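Consider the following Lyapunov candidate function

\[
V_d = -\frac{1}{2\beta(x)} e_s^2 + \frac{1}{2} \operatorname{tr}(\phi_u^\top Q_u \phi_u), \tag{4.18}
\]

where $Q_u \in \mathbb{R}^{m_u \times m_u}$ is positive definite and diagonal, $\phi_u = A_u - A_u^*$, and tr(·) is the trace operator. Since $-\infty < \beta_1 \le \beta(x) \le \beta_0 < 0$, $V_d$ is radially unbounded. The Lyapunov candidate $V_d$ is used to describe the error in tracking and the error between the desired controller and the current controller. If $V_d \to 0$, then both the tracking and learning objectives have been fulfilled. Taking the derivative of (4.18) yields

\[
\dot V_d = -\frac{e_s}{\beta(x)} [\dot e_s] + \operatorname{tr}(\phi_u^\top Q_u \dot\phi_u) + \frac{\dot\beta(x) e_s^2}{2\beta^2(x)}. \tag{4.19}
\]

Substituting $\dot e_s$, as defined in (4.13), yields

\[
\dot V_d = -\frac{e_s}{\beta(x)} \left[ -\eta e_s - \beta(x)(\hat u - u^*) - \beta(x)(u_{sd} + u_{bd}) \right] + \operatorname{tr}(\phi_u^\top Q_u \dot\phi_u) + \frac{\dot\beta(x) e_s^2}{2\beta^2(x)}. \tag{4.20}
\]

Using the following fuzzy controller update law

\[
\dot A_u(t) = -Q_u^{-1} z_u \zeta_u^\top e_s, \tag{4.21}
\]

since $\dot\phi_u = \dot A_u$, we have

\[
\dot V_d = \frac{\eta}{\beta(x)} e_s^2 + \left[ z_u^\top \phi_u \zeta_u - d_u + u_{sd} + u_{bd} \right] e_s - \operatorname{tr}(z_u^\top \phi_u \zeta_u) e_s + \frac{\dot\beta(x) e_s^2}{2\beta^2(x)}. \tag{4.22}
\]

The projection algorithm mentioned in [12] is used to ensure that the parameter matrix $A_u$ remains in the constraint set $\Omega_u$. Thus we have

\[
\dot V_d \le \frac{\eta}{\beta(x)} e_s^2 + \left[ z_u^\top \phi_u \zeta_u - d_u + u_{sd} + u_{bd} \right] e_s - \operatorname{tr}(z_u^\top \phi_u \zeta_u) e_s + \frac{\dot\beta(x) e_s^2}{2\beta^2(x)}. \tag{4.23}
\]

Since $z_u^\top \phi_u \zeta_u$ is a scalar, it equals its own trace, and the inequality above may equivalently be expressed as

\[
\dot V_d \le \frac{\eta}{\beta(x)} e_s^2 + \left[ \frac{\dot\beta(x) e_s}{2\beta^2(x)} - d_u \right] e_s + (u_{sd} + u_{bd}) e_s. \tag{4.24}
\]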

4.3.3 Sliding-mode Control

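With $u_{bd}$ defined as in (4.16), the product $u_{bd} e_s$ is never positive, so it can be dropped from the bound and we have

\[
\begin{aligned}
\dot V_d &\le \frac{\eta}{\beta(x)} e_s^2 + \left[ \frac{\dot\beta(x) e_s}{2\beta^2(x)} - d_u \right] e_s + u_{sd} e_s \\
&\le \frac{\eta}{\beta_1} e_s^2 + \left[ \frac{|\dot\beta(x)| |e_s|}{2\beta^2(x)} + |d_u| \right] |e_s| + u_{sd} e_s.
\end{aligned} \tag{4.25}
\]

We now define the sliding-mode control term for the direct adaptive controller as

\[
u_{sd} = -\left( \frac{B(x)|e_s|}{2\beta_0^2} + D_u(x) \right) \operatorname{sgn}(e_s), \tag{4.26}
\]

which ensures that $\dot V_d \le \eta e_s^2 / \beta_1$.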
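The sliding-mode term (4.26) is equally short to write down. The sketch below is in Python for illustration only; the defaults B = 250, β0 = −100 and Du = 0.01 are the constants quoted in Section 4.4, and the function name and test inputs are hypothetical.

```python
def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)

def sliding_mode_control(e_s, B=250.0, beta0=-100.0, D_u=0.01):
    """Sliding-mode term u_sd of (4.26): -(B |e_s| / (2 beta0^2) + D_u) sgn(e_s)."""
    gain = B * abs(e_s) / (2.0 * beta0 ** 2) + D_u
    return -gain * sgn(e_s)

u_sd_pos = sliding_mode_control(2.0)    # e_s > 0 gives a negative correction
u_sd_neg = sliding_mode_control(-2.0)   # e_s < 0 gives a positive correction
print(u_sd_pos, u_sd_neg)               # about -0.035 and 0.035
```

With these constants the gain is 0.0125 |e_s| + 0.01, so the term stays small as long as the error does. The sgn function makes the signal discontinuous at $e_s = 0$, which is a general property of this kind of term.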

4.4 Implementation

Although the theoretical analysis in [12] uses the assumption that the unknown control law u∗ which

the DAFC tries to identify is a feedback linearizing law, it was found experimentally in [6] that it

is not necessarily the case. If the adaptation mechanism is initialized appropriately in accord with

the known controller such as the LQR, the adaptation algorithm will converge to a controller that

might behave in a very different manner because this mechanism seems to try to find the local

optimum controller closest to its starting point in the search space and, in our case, an optimized

LQR controller is found.

This finding is very important when the control design involves a non-minimum-phase plant or a system with internal dynamics that are hard to identify. If a non-adaptive controller is available that can control the system regardless of whether the system is minimum-phase, then it is possible that the desirable boundedness characteristics of this controller can be incorporated into the DAFC design and enhanced by the robustness that the adaptive method provides.

In our case, we take the angle of Link 3 as the output,

\[
y = x_3. \tag{4.27}
\]

Then we have

\[
\begin{aligned}
\dot y &= x_6, \\
\ddot y &= f_6(x) + g_6(x) u.
\end{aligned} \tag{4.28}
\]

At this point, we find that the zero dynamics of the system are very hard to identify. But we already know that the LQR controller works for the RDIP system, so it is possible to implement a DAFC controller together with the LQR controller.

First we present the conditions mentioned in the previous section for the DAFC. We set the known bound for the approximation error $D_u(x)$ to 0.01. In practice it is often hard to have a concrete idea about the magnitude of $D_u(x)$, because the relation between $u^*$ and its fuzzy representation might be difficult to characterize; however, it is much easier to begin with a rough, intuitive idea about this bound, and then iterate the design process and adjust it until the performance of the controller indicates that one is close to the right value. For the simulation, we found that $D_u(x) = 0.01$ gives good results. A small $D_u(x)$ indicates that the fuzzy system can represent the ideal controller very accurately.

We are going to search for $u^*$ using (4.9), where $\zeta_u \in \mathbb{R}^{2187}$, with membership functions of the type shown in Figure 4.1. The mathematical description of the membership functions is provided in Section 5.3. We choose the number of rules $p = 3^7 = 2187$, and the matrix $A_u(t) \in \mathbb{R}^{7 \times 2187}$ is adaptively updated online. The function vector z is taken as

\[
z = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \\ z_5 \\ z_6 \\ z_7 \end{bmatrix}
= \begin{bmatrix} 1 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}. \tag{4.29}
\]

The fuzzy system uses $3^7$ rules, and each $c_i(x)$ is an entry of the row vector $z^\top A_u(t)$. We initialize the fuzzy system approximation by letting $A_u = 0$, since we know nothing about the optimal controller.

The DAFC control law is given by $u_d = \hat u + u_{sd} + u_{bd}$, as we have discussed before. The sliding term is given by (4.26). In simulation we find that β(x) is between −135 and −128; thus we choose $\beta_0$ as −100. We also choose B(x) as 250 for safety. The bounding term needs the assumption that α(x) is bounded, with $|\alpha(x)| \le \alpha_1(x)$. We find that the magnitude of α(x) is always less than 14.6; therefore we safely choose $\alpha_1(x)$ as 20. Then we have $u_{bd}$ as defined in (4.16). For simulation we use $M_e = 8$, because calculation from the simulation results shows that $|e_s|$ is always less than 4.6 when the system works. The parameter $M_e$ defines a bounded, closed subset of the $e_s$ error-state space within which the error is guaranteed to stay. ν is defined as in the section above. Here we select η = 1 and $k_0 = 5$ for $e_s = k_0 e_0 + \dot e_0$. With this choice, the poles of the error transfer function are at s = −1 and s = −5, which produce a small error settling time.
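The quoted pole locations can be checked with a short calculation. As a sketch, assume r = 2, so that $e_s = k_0 e_0 + \dot e_0$, and assume the idealized situation in which the right-hand side of (4.13) vanishes, that is, $\hat u = u^*$ and $u_{sd} + u_{bd} = 0$ (this idealization is made here only for illustration). Then

\[
\dot e_s + \eta e_s = \ddot e_0 + (k_0 + \eta)\dot e_0 + \eta k_0 e_0 = 0
\quad \Longrightarrow \quad
s^2 + (k_0 + \eta) s + \eta k_0 = (s + \eta)(s + k_0) = (s + 1)(s + 5) = 0,
\]

so with η = 1 and $k_0 = 5$ the error dynamics have their poles at s = −1 and s = −5.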

The last part of the DAFC mechanism is the adaptation law, which is chosen in such a way that the output error converges asymptotically to zero and the parameter error remains at least bounded. For this law we choose $Q_u = 1.2 I_7$, with which the algorithm is able to adapt and estimate the control law $\hat u$ fast enough to perform well and compensate for disturbances, but without inducing the oscillations typical of too high an adaptation rate.
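To show what the adaptation law (4.21) does to the parameter matrix, the sketch below performs a single forward-Euler step of it with $Q_u = q_u I$. Several things here are assumptions made for illustration: the language (Python rather than MATLAB), the Euler discretization and step size `dt` (the simulations in this thesis use ode45), the tiny dimensions, and the sample values of $z_u$, $\zeta_u$ and $e_s$. The projection step mentioned in the text is not included.

```python
def adaptation_step(A_u, z_u, zeta_u, e_s, q_u=1.2, dt=1e-3):
    """One forward-Euler step of (4.21) with Q_u = q_u * I:
    A_u <- A_u - dt * (1 / q_u) * z_u zeta_u' e_s.
    A_u is an m_u x p_u nested list; a new matrix is returned.
    """
    return [[A_u[k][i] - dt * z_u[k] * zeta_u[i] * e_s / q_u
             for i in range(len(zeta_u))]
            for k in range(len(z_u))]

# Tiny hypothetical case (m_u = 2, p_u = 3), starting from A_u = 0 as in the text.
A_u = [[0.0] * 3 for _ in range(2)]
A_u = adaptation_step(A_u, z_u=[1.0, 0.1], zeta_u=[0.2, 0.5, 0.3], e_s=0.6)
print(A_u[0][1])  # about -0.00025
```

A larger $q_u$ divides the update by a larger number and therefore slows adaptation, which is the trade-off between adaptation speed and oscillation described above.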

CHAPTER V

SIMULATION RESULTS

5.1 Open-loop Model Verification

We use the MATLAB environment for all simulations throughout this paper. The solver “ode45” in MATLAB is used in all cases to solve initial value problems for ordinary differential equations. It is important to notice that both the LQR and DAFC controllers are continuous-time techniques; to implement them we use a digital computer, and are thus forced to implicitly use a discrete-time approximation of the controller. It is reasonable to think that a proof of stability is still applicable when a continuous-time technique is discretized, but such a study is outside the scope of the present work.

The initial conditions are set to $x(0) = [1 \;\; 0.1 \;\; 0.1 \;\; 0 \;\; 0 \;\; 0]^\top$, which means the pendulums are nearly in the upright position at the starting point. Note that the initial states are expressed in rad or rad/s, while all the figures show the states in deg and deg/s for easy observation. The performance of the RDIP model without control is shown in Figure 5.1. All the simulation results of the link angles of the RDIP are plotted on a scale from −100° to 250° for ease of comparison. The simulation time is from 0 s to 10 s.

Figure 5.1: Open-loop simulation in the initial states θ1 = 57.2958°, θ2 = 5.7296° and θ3 = 5.7296°.

As we can see from Figure 5.1, when there is no control input to the system, the pendulums fall directly to their stable equilibrium in the downward position, where the RDIP stays in its lowest total energy state. We can also see that the angle of Link 1 converges to a position near its initial value. Since there is no input torque and only the small viscous friction acts on the base arm, the base arm can be seen as a conservative system in the horizontal plane. Therefore the base arm finally goes back to its starting point. So far, the model we have built seems acceptable for simulating the behavior of the RDIP system. During the simulation, a 3-D dynamical model is also built to provide direct insight into what the pendulum is doing, using Euclidean geometry to calculate the relative position from the pivot to the end of each link.

5.2 Linear Quadratic Regulator Results

In MATLAB, the LQR gain can be computed directly once we select the values of the parameters Q and R. The MATLAB command “lqr” speeds up the calculation. We initially set the parameters of the LQR as

\[
Q = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}, \qquad R = 1. \tag{5.1}
\]

With these settings, we have the LQR gain

\[
K = \begin{bmatrix} -1.0000 & 211.7112 & -120.5868 & -2.2287 & 56.0199 & -5.3027 \end{bmatrix}. \tag{5.2}
\]

The same initial states as in Figure 5.1 are used. The simulation result of the link angles is shown in Figure 5.2. The control input is shown together with that of the tuned system in Figure 5.4. We can see that the LQR controls the system well. All the links converge to zero, and the control input is relatively small compared with the control signal we will get later from the DAFC simulation. That means the LQR controller saves energy used to control the system. The system converges to the origin at nearly 6 s.

Figure 5.2: LQR simulation result with Q = $I_6$ and R = I.

Then we adjust the parameters of the LQR to optimize the control performance. Running many simulation trials from different initial states with different Q and R, we get an intuitive understanding of the scale of the system states. The maximum absolute values of the six states and the input are around 1.5, 0.6, 0.5, 5, 1, 5 and 1, respectively. Thus, according to Bryson's rule, we can set Q and R with this information to start tuning. After some fine tuning, we finally obtain the optimized LQR parameters

\[
Q = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.167 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.167 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.167
\end{bmatrix}, \qquad R = 0.8. \tag{5.3}
\]

With these Q and R, we have

\[
K = \begin{bmatrix} -1.1180 & 135.6454 & -70.5044 & -1.8397 & 36.1255 & -2.7697 \end{bmatrix}. \tag{5.4}
\]

The simulation result is in Figure 5.3. Clearly, the performance of the LQR is improved by tuning.

The negative peak value of the position variation of Link 1 is nearly 50◦ in the improved system

45
Figure 5.3: Tuned LQR simulation result.

while it is 25◦ larger in the original system. Besides, at around 3.5s the tuned system has already

converged to zero while the original system is still on its way.

As mentioned above, the control signals for the two cases are shown in Figure 5.4. Here we limit the plot of the control signals to the range from -2 to 2, truncating the large initial negative peaks of both the original LQR and its tuned version. For the original LQR, the initial peak is -5.3961, and for the tuned one it changes to -8.1124. We can see that the tuned control signal reacts more quickly than the original one. The positive peak of the tuned LQR appears higher than that of the original one; however, since it falls faster, the total energy used for control by the tuned system is not large compared with the original one. In fact, using (3.6) to calculate the total energy of the control process for these two cases, we find that the control energy cost for the tuned LQR is 5.8730, which is much smaller than the total cost of the original LQR, 19.7358. Note that there is an abrupt change in both control signals at the beginning of the control process. We will see the same phenomenon in the DAFC simulation.

Figure 5.4: Control signals of the LQR and its tuned version.
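The energy comparison above can be reproduced numerically from a sampled control signal. Equation (3.6) is not restated in this chapter, so the sketch below assumes, purely for illustration, that the energy measure is the time integral of the squared control signal; the measure actually used in (3.6) may differ. The function name `control_energy` and the test signal are hypothetical and are not data from the simulations.

```python
import math


def control_energy(u, dt):
    """Trapezoidal approximation of the integral of u(t)^2 over uniformly spaced samples."""
    total = 0.0
    for a, b in zip(u, u[1:]):
        total += 0.5 * (a * a + b * b) * dt
    return total


# Hypothetical test signal u(t) = exp(-t) on [0, 20] s, sampled every 1 ms.
dt = 0.001
u = [math.exp(-k * dt) for k in range(20001)]

energy = control_energy(u, dt)
print(f"approximate control energy: {energy:.4f}")
```

For the exponential test signal the exact integral is very close to 0.5, which gives a quick check of the integration step before applying the function to logged simulation data.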

5.3 Direct Adaptive Fuzzy Control Results

First, we present a case in which the LQR fails to control the system, even with its parameters tuned. For the initial state $x(0) = [0 \;\; {-0.1} \;\; 0.1 \;\; 0 \;\; 0 \;\; 0]^\top$, the system states diverge over time. We will see that the RDIP system can nevertheless be controlled from this initial state with the adaptive LQR control scheme.

z_i    c      σ
1      1      0.25
x1     1      0.25
x2     0.1    0.025
x3     0.2    0.05
x4     4      1
x5     0.6    2
x6     3      0.75

Table 5.1: Membership function settings for the DAFC.

A Gaussian membership function with the following form is used here, which is shown in Figure

4.1 in the previous Chapter.

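Left:

$$
\mu(x) = \begin{cases}
1 & \text{if } x \le -c, \\
\exp\!\left(-\left(\dfrac{x+c}{2\sigma}\right)^{2}\right) & \text{if } x > -c.
\end{cases} \tag{5.5}
$$

Center:

$$
\mu(x) = \exp\!\left(-\left(\dfrac{x}{2\sigma}\right)^{2}\right). \tag{5.6}
$$

Right:

$$
\mu(x) = \begin{cases}
\exp\!\left(-\left(\dfrac{x-c}{2\sigma}\right)^{2}\right) & \text{if } x < c, \\
1 & \text{if } x \ge c.
\end{cases} \tag{5.7}
$$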
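The three membership functions (5.5) to (5.7) can be written down directly. The following is a minimal sketch, assuming the exponent has the form $-((x \pm c)/(2\sigma))^2$ as read from the equations above; the function names `mu_left`, `mu_center` and `mu_right` are illustrative and do not come from the thesis code.

```python
import math


def mu_left(x, c, sigma):
    """Left membership function (5.5): saturates at 1 for x <= -c."""
    if x <= -c:
        return 1.0
    return math.exp(-(((x + c) / (2.0 * sigma)) ** 2))


def mu_center(x, sigma):
    """Center membership function (5.6): Gaussian bump centred at zero."""
    return math.exp(-((x / (2.0 * sigma)) ** 2))


def mu_right(x, c, sigma):
    """Right membership function (5.7): saturates at 1 for x >= c."""
    if x >= c:
        return 1.0
    return math.exp(-(((x - c) / (2.0 * sigma)) ** 2))


# Settings for x1 taken from Table 5.1: c = 1, sigma = 0.25.
c, sigma = 1.0, 0.25
for x in (-1.5, -0.5, 0.0, 0.5, 1.5):
    print(x,
          round(mu_left(x, c, sigma), 4),
          round(mu_center(x, sigma), 4),
          round(mu_right(x, c, sigma), 4))
```

The left and right functions are mirror images of each other, so a value of the left function at $-x$ equals the value of the right function at $x$ for the same $c$ and $\sigma$.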

With this function, we set the membership function for each term of the z matrix as in Table 5.1. We also set the other parameters in the bounding term, the sliding term and the central equivalent term of the control scheme using the options selected in Chapter IV. Note that we need to take care of not only the parameters of the adaptive fuzzy system, but also the LQR parameters Q and R, which must match the adaptive law. Here Q and R are chosen as

$$
Q = \begin{bmatrix}
0.5 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.5 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.5 & 0 & 0 & 0 \\
0 & 0 & 0 & 1.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 1.5 & 0 \\
0 & 0 & 0 & 0 & 0 & 1.5
\end{bmatrix},
\qquad R = 1. \tag{5.8}
$$

Figure 5.5: DAFC simulation result.

The resulting DAFC response is shown in Figure 5.5 and the DAFC control signal in Figure 5.6, where the control signal axis is limited to the range from -20 to 20 for readability, truncating the initial positive peak of 58.2286.

In Figure 5.5 we find that the adaptive LQR does work for the given initial states: the system eventually converges to zero. In [6], the state x1 in the case of a single inverted pendulum converges to a stable state of rotation at constant speed. If the RDIP can be shown to have non-minimum-phase zero dynamics, then the finding of that paper together with ours would suggest that the DAFC can work on a non-minimum-phase system and that the system states will converge to a stable state. Further analysis is still needed to provide theoretical evidence for the effectiveness of the DAFC across a wide range of non-minimum-phase systems.

Figure 5.6: DAFC control signal.

From Figure 5.6 we can see that the control signal of the DAFC method oscillates more intensely than that of the conventional LQR. There is a sudden change at the beginning of the control process, which also happens in the original LQR case. As mentioned before, the DAFC tries to follow the known controller part and optimize it, so it may inherit some characteristics of the known controller. The magnitude of the DAFC control signal is much larger than that of the LQR, but the DAFC does improve on the LQR controller by stabilizing the system from the given initial states, for which the LQR fails. Since high feedback gains may lead to torque saturation, noise amplification, and other problems in the experiment, we will need to find a way to avoid them in the future.

From the characteristics of the DC motor for the SV02+DBIP experiment given in the “Quanser Systems and Procedures” technical document in the lab KL302, we find that the motor can provide the system with a torque of no less than 83.8856 N·m, which is large enough for our DAFC control scheme, since the largest control signal peak we have obtained is 58.2286.

In Figure 5.7, we compare the R.O.A of the original LQR and the DAFC. Here we sample the initial states of the system in the range $\theta_i \in [-0.1, 0.1]$ for $i = 1, 2, 3$ and assume $\theta_i = 0$ for $i = 4, 5, 6$ to simplify the comparison and make it possible to show the result on a 3-D plot. We sample θ1 and θ2 at an interval of 0.02, and θ3 at an interval of 0.01. The blank region within the cube is where both the LQR and the DAFC work. The blue points show the initial states for which the DAFC still works while the original LQR can no longer make the system converge to the zero point. We can see clearly that the DAFC enlarges the set of initial states that can be stabilized, compared with the LQR controller alone.
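The sampling scheme described above can be sketched as follows. The grid construction follows the intervals stated in the text; the convergence checks are passed in as hypothetical callables (`lqr_converges`, `dafc_converges`), since the closed-loop simulations themselves are not reproduced here.

```python
from itertools import product


def axis(lo, hi, step):
    """Evenly spaced samples from lo to hi inclusive, built from an integer count to avoid drift."""
    n = round((hi - lo) / step)
    return [round(lo + k * step, 10) for k in range(n + 1)]


theta12 = axis(-0.1, 0.1, 0.02)  # samples for theta_1 and theta_2
theta3 = axis(-0.1, 0.1, 0.01)   # samples for theta_3

# Initial states: the first three components are sampled, the last three are held at zero.
initial_states = [(t1, t2, t3, 0.0, 0.0, 0.0)
                  for t1, t2, t3 in product(theta12, theta12, theta3)]


def dafc_only(states, lqr_converges, dafc_converges):
    """Initial states where the DAFC converges but the LQR does not (the plotted points)."""
    return [x0 for x0 in states if dafc_converges(x0) and not lqr_converges(x0)]


print(len(theta12), len(theta3), len(initial_states))
```

Each predicate would wrap one closed-loop simulation from the given initial state and report whether the states reach a small neighborhood of the origin within the simulation horizon; the threshold and horizon are design choices left open here.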

Figure 5.7: LQR vs. DAFC.

CHAPTER VI

CONCLUSION

We have studied two control approaches for the RDIP. First of all, a mathematical model is built

with the E-L method. The rotary frictions of the links are considered in model building while we

ignore the static friction of the system.

In the next step, we developed an LQR controller for the system after linearizing the system with two methods that are equivalent to each other. The LQR method presents adequate behavior on the plant in terms of our basic control objective of balancing the pendulum. We introduce Bryson’s rule as a starting point for tuning the LQR controller, and then improve the performance of the LQR with some fine tuning. To explore the stability of the LQR, we discuss the Lyapunov stability of the LQR controller and then obtain the R.O.A for the system. We find that beyond the R.O.A there is still a large area in which the LQR controller works, such as the initial states we used for testing the original LQR and its tuned version. But we cannot claim that there exists a neighborhood outside the R.O.A where the LQR control always works. For example, from the initial state $[0 \;\; {-0.05} \;\; 0.1 \;\; 0 \;\; 0 \;\; 0]^\top$ the LQR controller fails, even though this point is much closer to the origin than the point $[1 \;\; 0.1 \;\; 0.1 \;\; 0 \;\; 0 \;\; 0]^\top$, which has been shown to be stable. The R.O.A of the RDIP system is still limited; one can continue to adjust the parameters of the conventional LQR and the adaptive LQR controllers to enlarge it.

Since the performance of the LQR is limited, we tried to improve on the LQR with the DAFC. This method directly approximates the ideal controller by using the T-S fuzzy set. We are able to increase robustness using this method compared with the LQR. We find that with this method the good characteristics of the LQR can be retained while the benefits of adaptation are added. Following the simulation result in [6], we applied the adaptive fuzzy control scheme to a more complex system and obtained a better control result, in which all the states of the system converge to zero, while the base arm in [13] converges to another stable state, a constant-speed rotational movement. This indicates that the DAFC is able to improve the LQR by improving its robustness adaptively. It is possible to design a DAFC that gives us bounded states in spite of the marginal stability of the zero dynamics. However, we have provided no theoretical justification of why this design works as it does.

The DAFC can also be improved in several ways. One applicable approach is to incorporate heuristics about the inverse plant dynamics to speed up the adaptation. An inverse plant is a fuzzy system that is heuristically designed to roughly approximate the plant’s inverse dynamics. The details are discussed in [12].

One must be careful in evaluating these results. It is probably not fair to say that the LQR failed and the DAFC succeeded, recalling that the pendulum does not satisfy the zero-dynamics assumption of the DAFC method. However, our experience indicates that, at least in some cases, the adaptive fuzzy method we have investigated has an advantage over the conventional LQR method: it allows for more design flexibility. This is clearly illustrated by our adaptive design. The DAFC using an LQR as the known part of the controller displays improved behavior in comparison with the conventional LQR technique. Apparently, using our knowledge of what the control law should be helps to increase the robustness of the algorithm. We thus manage to obtain an improvement with the adaptive technique.

Although the result we have obtained seems to indicate that the adaptive LQR can improve the original LQR and even work with systems with non-minimum-phase zero dynamics, it is still necessary to evaluate the performance of the DAFC under a greater variety of conditions. It remains to be investigated how robust the controllers are against different types of disturbances; for instance, we did not study how the adaptive fuzzy controllers react to a “white noise” disturbance in the control input.

It should also be mentioned that, as Figure 5.6 shows, there is a large peak in the control signal at the beginning of the control process. It might be due to the zero initial states of $A_u$. One can try to reset the initial states of the fuzzy set to reduce the magnitude of the control input.

As mentioned before, this thesis is a theoretical preparation for the RDIP experiment in the lab KL302. One can implement these control schemes experimentally to verify the simulation results, which will be a great challenge since the experiment may involve disturbances, unknown dynamics or other factors.

BIBLIOGRAPHY

[1] V. Sukonatanakarn and M. Parnichkun, “Real-time optimal control for rotary inverted pendulum,” American Journal of Applied Sciences, 2009.

[2] H. K. Khalil, Nonlinear Systems. Upper Saddle River, NJ: Prentice Hall, 2002.

[3] R. W. Brockett and H. Li, “A light weight rotary double pendulum: Maximizing the domain of
attraction,” in IEEE Decision and Control Conference, (Maui, Hawaii), pp. 3299–3304, Dec
2003.

[4] J. Driver and D. Thorpe, “Design, build and control of a single/ double rotational inverted
pendulum,” tech. rep., School of Mechanical Engineering, The University of Adelaide, 2004.

[5] V. Casanova, J. Salt, R. Piza, and A. Cuenca, “Controlling the double rotary inverted pendulum with multiple feedback delays,” International Journal of Computers Communications and Control, vol. 7, pp. 20–38, Mar. 2012.

[6] R. Ordonez, J. Zumberge, J. T. Spooner, and K. M. Passino, “Adaptive fuzzy control: Experiments and comparative analysis,” IEEE Transactions on Fuzzy Systems, vol. 5, pp. 167–188, May 1997.

[7] C. Fox, An Introduction to the Calculus of Variations. New York, NY: Dover Publications,
2010.

[8] J. R. Movellan, “DC motors.” [Link] Mar. 2010.

[9] Y. Wang, “Rotation double inverted pendulum,” tech. rep., School of Electrical and Computer
Engineering, The University of Dayton, Apr. 2012.

[10] Z. Gajic, Linear Dynamic Systems and Signals. Upper Saddle River, NJ: Prentice Hall, 2002.

[11] J. P. Hespanha, “Lecture notes on LQR/LQG controller design.” [Link] pl/˜wpaszke/materialy/kss/[Link], Feb. 2005.

[12] J. T. Spooner and K. M. Passino, “Stable adaptive control using fuzzy system and neural
networks,” IEEE Transactions on Fuzzy Systems, vol. 4, pp. 339–359, Aug. 1996.

[13] K. Tanaka and M. Sugeno, “Stability analysis and design of fuzzy control systems,” Fuzzy Sets
and Systems, vol. 45, pp. 135–156, Jan. 1992.

[14] T. Procyk and E. Mamdani, “A linguistic self-organizing process controller,” Automatica, vol. 15, no. 1, pp. 15–30, 1979.

[15] J. R. Layne and K. M. Passino, “Fuzzy model reference learning control,” J. Intell. Fuzzy Syst.,
vol. 4, no. 1, pp. 33–47, 1996.

[16] J. R. Layne and K. M. Passino, “Fuzzy model reference learning control for cargo ship steering,” in IEEE Contr. Syst. Mag., pp. 23–24, Dec 1993.

[17] T. Takagi and M. Sugeno, “Fuzzy identification of systems and its applications to modeling and control,” TSMC, 1985.

[18] J. T. Spooner, M. Maggiore, R. Ordonez, and K. M. Passino, Stable Adaptive Control and
Estimation for Nonlinear Systems. New York, NY: John Wiley and Sons, 2002.

[19] T. Takagi and M. Sugeno, “Fuzzy identification of systems and its applications to modeling
and control,” TSMC, vol. 15, pp. 116–132, Jan. 1985.

[20] S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness. Englewood Cliffs, NJ: Prentice Hall, 1989.
