0% found this document useful (0 votes)
5 views32 pages

EEE 537 Power System Optimization

EEE 537 Power System Optimization covers optimization problem formulation, calculus-based techniques, and various optimization methods including linear and non-linear programming. Students will learn to apply optimization techniques to power systems, including economic dispatch and load shedding. The course emphasizes the importance of proper problem formulation and the application of both derivative-based and non-derivative methods.

Uploaded by

Joachim Anigbata
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views32 pages

EEE 537 Power System Optimization

EEE 537 Power System Optimization covers optimization problem formulation, calculus-based techniques, and various optimization methods including linear and non-linear programming. Students will learn to apply optimization techniques to power systems, including economic dispatch and load shedding. The course emphasizes the importance of proper problem formulation and the application of both derivative-based and non-derivative methods.

Uploaded by

Joachim Anigbata
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

EEE 537 Power System Optimization (2 units)

Introduction to the formulation of Optimization problems: Problem variables, problem constraint, the
objective function. Calculus based techniques: Unconstrained Minimization: Powell’s method of conjugate
directions, Gradient methods, second order methods.

Constrained Minimization Problems: Indirect methods by unconstrained minimization, penalty function


approach. Direct methods for constrained optimization; Lagrange multipliers, Kuhn Tucker conditions;
methods of feasible directions. Linear (LP) and Non-linear programming. Application of LP to power systems
problems economic dispatch, automatic load shedding generation expansion studies. Non-Calculus Based
Methods: Guided random search techniques: Evolutionary algorithms, simulated annealing, genetic
algorithms;
Expected Learning Outcomes:
At the end of studying this course, students will be able to:
(i) Understand the basic techniques of performing derivative-based optimization of multi-variable
functions using any of the methods suitable for the problems. Identification of a suitable methods
will be an expected skill to be developed by each student.
(ii) Formulate and solve linear programming problems using a suitable technique as may be demanded
by the problem.
(iii) Learn how to apply other tools even not discussed here as may become relevant in the future.
(iv) Differentiated between constrained and unconstrained optimization problems.
(v) Understand the techniques involved in solving both indirect and direct methods for unconstrained
and constrained optimization respectively.
(vi) Apply Linear Programming in solving Economic dispatch problems of power systems.
(vii) Finally, appreciated the application of evolutionary algorithms in Power systems economic
Dispatch etc.

INTODUCTION
Optimization is the process of finding the best possible solution to a problem from a set of feasible
alternatives, according to a clearly defined criterion. In engineering, science, economics, and management,
optimization helps in making decisions that maximize performance or minimize cost, error, energy, or
time, subject to real-world limitations.
Optimization is everywhere, from routine business transactions to important decisions of any sort, from
engineering design to industrial manufacturing, and from choosing a career path to planning our holidays.
Business organizations have to maximize their profit and minimize the cost. Engineering design has to
maximize the performance of the designed product while of course minimizing the cost at the same time.
Even when we plan holidays, we want to maximize the enjoyment and minimize the cost. In all these activities,
there are always some things (objectives) we are trying to optimize and these objectives could be cost, profit,
performance, quality, enjoyment, customer-rating and others. Therefore, the studies of optimization are of
both scientific interest and practical implications and subsequently the methodology will have many
applications. The formal approach to these optimization problems forms the major part of the mathematical
optimization or mathematical programming.
Optimization
Whatever the real-world problem is, it is usually possible to formulate the optimization problem in a generic
form.
Optimization is a best process of achieving:

1
(a) maximum utilization/minimum losses
(b) best possible solution
(c) feasible solution
(d) most economical solution/method.
Thus, optimization provides a logical method for the selection of the best choice from among all possible
design that are available such that all limitations and restrictions(constraints) placed on it are satisfied. Simply
put, it is a process or rationale to achieve an improved solution. It may apply to a physical design or an
operational procedure.
The approach to this study is numerical and, in most cases, practical if programmed for computer solutions.
The methods and/ algorithms developed will allow the developer to optimize the selected design concepts
against pre-selected design limitations.
Optimization in Power Systems
Power-system optimization is about operating and planning the grid at minimum cost while satisfying
physical and security constraints.
Typical objectives:
• Minimize generation cost
• Minimize power losses
• Minimize emissions
• Maximize reliability / security
Typical constraints:
• Power balance (generation = demand + losses)
• Generator limits
• Transmission limits
• Voltage and stability limits
An optimization problem consists of three essential elements:
1. Decision (design) variables
These are the unknowns we want to determine.
𝐱 = [𝑥1 , 𝑥2 , … , 𝑥𝑛 ]𝑇
2. Objective function
A mathematical function that measures the quality of a solution. It is either:
o Minimized, e.g., cost, error, weight
o Maximized, e.g., efficiency, profit, strength
Minimize or Maximize 𝑓(𝐱)
3. Constraints
Restrictions that the solution must satisfy, representing physical, economic, or operational limits.
2. General Mathematical Form of an Optimization Problem
A standard optimization problem is written as:
Minimize / Maximize 𝑓(𝐱)
subject to 𝑔𝑖 (𝐱) ≤ 0, 𝑖 = 1,2, … , 𝑚(inequality constraints)
ℎ𝑗 (𝐱) = 0, 𝑗 = 1,2, … , 𝑝(equality constraints)
𝐱 ∈ ℝ𝑛
Where:

2
• 𝑓(𝐱)is the objective function
• 𝑔𝑖 (𝐱)are inequality constraints
• ℎ𝑗 (𝐱)are equality constraints
• 𝐱is the vector of decision variables

3. Types of Optimization Problems


(a) Unconstrained Optimization
No constraints on the variables:
Minimize 𝑓(𝐱)

Example: Minimizing a quadratic error function.

(b) Constrained Optimization


Includes equality and/or inequality constraints.
Example: Structural design with stress and deflection limits.

(c) Linear vs Nonlinear Optimization


• Linear optimization
Objective and constraints are linear functions.
𝑓(𝐱) = 𝑐 𝑇 𝐱

• Nonlinear optimization
At least one function is nonlinear (very common in engineering).

(d) Continuous vs Discrete Optimization


• Continuous: variables take real values
• Discrete / Integer: variables take integer values only

4. Steps in Formulating an Optimization Problem


Formulation is often the most critical step in optimization.
Step 1: Understand the physical or practical system
• Identify the goal (what does “best” mean?)
• Define system boundaries
Step 2: Identify decision variables
• Choose variables that fully describe the system
• Avoid unnecessary variables
Step 3: Define the objective function
• Express performance quantitatively
• Ensure it reflects the true goal
Step 4: Identify constraints
• Physical laws (e.g., equilibrium, kinematics)
• Resource limits (cost, materials, power)
• Safety and performance limits
Step 5: Define variable bounds
𝑥𝑖min ≤ 𝑥𝑖 ≤ 𝑥𝑖max

5. Example of Optimization Problem Formulation


Example: Minimizing Production Cost
A manufacturer produces a product using two machines.
Let:
• 𝑥1 : hours of machine 1
• 𝑥2 : hours of machine 2

3
Objective function:
Minimize 𝑓 = 50𝑥1 + 40𝑥2
Constraints:
2𝑥1 + 𝑥2 ≥ 100(production requirement)
𝑥1 ≤ 40(machine 1 limit)
𝑥2 ≤ 60(machine 2 limit)
𝑥1 , 𝑥2 ≥0

7. Importance of Proper Formulation


• A poorly formulated problem leads to meaningless results
• Correct formulation simplifies solution methods
• Optimization algorithms cannot fix a wrong model
“Formulation is the bridge between the real-world problem and mathematical optimization.”

PROBLEM 1: Unconstrained Optimization (Single Variable)


Problem
Find the minimum value of
𝑓(𝑥) = 𝑥 2 − 8𝑥 + 15
Solution
Step 1: First derivative
𝑑𝑓
= 2𝑥 − 8
𝑑𝑥
Step 2: Set derivative to zero
2𝑥 − 8 = 0 ⇒ 𝑥 = 4

Step 3: Second derivative test


𝑑2𝑓
=2>0
𝑑𝑥 2
✔ Hence, the function has a minimum at 𝑥 = 4
Step 4: Minimum value
𝑓(4) = 16 − 32 + 15 = −1
Final Answer
𝑓min = −1 at 𝑥 = 4

PROBLEM 2: Unconstrained Optimization (Two Variables)


Problem
Minimize:
𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2 − 4𝑥 − 2𝑦 + 5

Solution
Step 1: Compute partial derivatives
∂𝑓
= 2𝑥 − 4
∂𝑥
∂𝑓
= 2𝑦 − 2
∂𝑦
Step 2: Set gradients to zero
2𝑥 − 4 = 0 ⇒ 𝑥 = 2
2𝑦 − 2 = 0 ⇒ 𝑦 = 1
Step 3: Hessian matrix
2 0
𝐇=[ ]
0 2
Since 𝐇is positive definite, the point is a minimum.

4
Step 4: Minimum value
𝑓(2,1) = 4 + 1 − 8 − 2 + 5 = 0

Final Answer
𝑓min = 0 at (𝑥, 𝑦) = (2,1)

PROBLEM 3: Equality Constrained Optimization (Lagrange Multipliers)


Problem
Minimize:
𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2

subject to:
𝑥 + 𝑦 = 10
Solution
Step 1: Form the Lagrangian
ℒ = 𝑥 2 + 𝑦 2 + 𝜆(𝑥 + 𝑦 − 10)
Step 2: Partial derivatives
∂ℒ
= 2𝑥 + 𝜆 = 0
∂𝑥
∂ℒ
= 2𝑦 + 𝜆 = 0
∂𝑦
∂ℒ
= 𝑥 + 𝑦 − 10 = 0
∂𝜆
Step 3: Solve equations
2𝑥 = 2𝑦 ⇒ 𝑥 = 𝑦
𝑥 + 𝑥 = 10 ⇒ 𝑥 = 𝑦 = 5
Step 4: Minimum value
𝑓(5,5) = 25 + 25 = 50
Final Answer
𝑓min = 50 at (5,5)

PROBLEM 4: Inequality Constrained Optimization (KKT Conditions)


Problem
Minimize:
𝑓(𝑥) = 𝑥 2 − 4𝑥 + 5
subject to:
𝑥≥1

Solution
Step 1: Unconstrained minimum
𝑑𝑓
= 2𝑥 − 4 = 0 ⇒ 𝑥 = 2
𝑑𝑥

Step 2: Check feasibility


𝑥 = 2 ≥ 1 ⇒ constraint satisfied

Thus, constraint is inactive.


Step 3: Minimum value
𝑓(2) = 4 − 8 + 5 = 1

Final Answer

5
𝑓min = 1 at 𝑥 = 2

PROBLEM 5: Inequality Constrained Optimization (Active Constraint)


Problem
Minimize:
𝑓(𝑥) = (𝑥 − 3)2
subject to:
𝑥≤1
Solution
Step 1: Unconstrained minimum
𝑥=3
But:
3 ≰ 1 ⇒ not feasible
Step 2: Minimum occurs at boundary
𝑥=1
Step 3: Function value
𝑓(1) = (1 − 3)2 = 4
Final Answer
𝑓min = 4 at 𝑥 = 1

PROBLEM 6: Engineering-Oriented Example (Energy Minimization)


Problem
A system has energy:
𝐸(𝑥, 𝑦) = 3𝑥 2 + 2𝑦 2 − 12𝑥 − 8𝑦
Find the minimum energy configuration.
Solution
Step 1: Partial derivatives
∂𝐸
= 6𝑥 − 12 = 0 ⇒ 𝑥 = 2
∂𝑥
∂𝐸
= 4𝑦 − 8 = 0 ⇒ 𝑦 = 2
∂𝑦
Step 2: Hessian matrix
6 0
𝐇=[ ]
0 4
Positive definite → minimum.
Step 3: Minimum energy
𝐸(2,2) = 12 + 8 − 24 − 16 = −20
Final Answer
𝐸min = −20 at (2,2)
Summary Table
Technique Problem Type Key Tool
Single-variable Unconstrained First & second derivative
Multivariable Unconstrained Gradient & Hessian
Equality constraint Constrained Lagrange multipliers
Inequality constraint Constrained KKT / boundary check

The Hessian matrix plays a central role in calculus-based optimization, especially for multivariable
functions, because it determines the nature of stationary points (minimum, maximum, or saddle point).

Definition of the Hessian Matrix


For a scalar function

6
𝑓(𝑥1 , 𝑥2 , … , 𝑥𝑛 )
The Hessian matrix is the square matrix of second-order partial derivatives:
∂2 𝑓 ∂2 𝑓 ∂2 𝑓

∂𝑥12 ∂𝑥1 ∂𝑥2 ∂𝑥1 ∂𝑥𝑛
2 2
∂ 𝑓 ∂ 𝑓 ∂2 𝑓
𝐇(𝑓) = ∂𝑥2 ∂𝑥1 ⋯
∂𝑥22 ∂𝑥2 ∂𝑥𝑛
⋮ ⋮ ⋱ ⋮
2 2
∂ 𝑓 ∂ 𝑓 ∂2 𝑓

[∂𝑥𝑛 ∂𝑥1 ∂𝑥𝑛 ∂𝑥2 ∂𝑥𝑛2 ]
If second derivatives are continuous, the Hessian is symmetric:
∂2 𝑓 ∂2 𝑓
=
∂𝑥𝑖 ∂𝑥𝑗 ∂𝑥𝑗 ∂𝑥𝑖
Role of the Hessian in Optimization
At a stationary point 𝐱 ∗ where:
∇𝑓(𝐱 ∗ ) = 𝟎
the Hessian determines the curvature of the function near that point.
Unconstrained Minimization: Powell’s Method of Conjugate Directions

Powell’s method is a derivative-free, unconstrained optimization technique used to minimize a


multivariable function. It is especially useful when gradient information is unavailable or expensive to
compute, yet the objective function is smooth.

1. Motivation and Idea


• Gradient-based methods (steepest descent, Newton) require derivatives
• Powell’s method uses only function evaluations
• It accelerates convergence by performing line searches along conjugate directions
The method is most effective for quadratic or near-quadratic functions

Powell’s Method of Conjugate Directions — Recap (Very Brief)

For minimizing 𝑓(𝐱)without derivatives:

1. Start at 𝐱 0

2. Choose initial directions (usually unit vectors)

3. Perform 1-D line minimization along each direction

4. Form a new direction = net displacement


5. Replace one direction

6. Repeat until convergence

3. Basic Structure of Powell’s Method


Let:
• 𝐱 0 : initial guess
• {𝐝1 , 𝐝2 , … , 𝐝𝑛 }: initial direction set (usually unit vectors)

Algorithm Steps
Step 1: Initialization
𝐱 (0) = 𝐱 0
𝐝1 = [1,0, … ,0]𝑇 , … , 𝐝𝑛 = [0, … ,1]𝑇

Step 2: Sequential Line Minimization

7
For 𝑖 = 1to 𝑛:
𝐱 (𝑖) = 𝐱 (𝑖 −1) + 𝛼𝑖 𝐝𝑖

where 𝛼𝑖 minimizes:
𝑓(𝐱 (𝑖 −1) + 𝛼𝐝𝑖 )

Step 3: Direction Update


Compute:
𝐝𝑛+1 = 𝐱 (𝑛) − 𝐱 (0)

Discard 𝐝1 , shift others, and include 𝐝𝑛+1 .

Step 4: Line Search in New Direction


Perform minimization along 𝐝𝑛+1 .

Step 5: Repeat
Continue until convergence:
∥ 𝐱 (𝑘+1) − 𝐱 (𝑘) ∥< 𝜀

4. Fully Worked Numerical Example (2-Variable Case)


Problem
Minimize:
𝑓(𝑥, 𝑦) = (𝑥 − 1)2 + 2(𝑦 − 2)2

Step 1: Initial Guess and Directions


𝐱0 = (0,0)
𝐝1 = (1,0), 𝐝2 = (0,1)

Step 2: Line Minimization Along 𝐝𝟏


𝑓(𝑥, 0) = (𝑥 − 1)2 + 8

Minimize w.r.t. 𝑥:
𝑑𝑓
= 2(𝑥 − 1) = 0 ⇒ 𝑥 = 1
𝑑𝑥
𝐱1 = (1,0)

Step 3: Line Minimization Along 𝐝𝟐


𝑓(1, 𝑦) = 2(𝑦 − 2)2
𝑑𝑓
= 4(𝑦 − 2) = 0 ⇒ 𝑦 = 2
𝑑𝑦
𝐱2 = (1,2)

Step 4: New Conjugate Direction


𝐝3 = 𝐱2 − 𝐱 0 = (1,2)

Step 5: Line Minimization Along 𝐝𝟑


𝐱(𝛼) = (1,2) + 𝛼(1,2)
Where x= 1, y =2
𝑓(𝛼) = (𝛼 + 1 − 1)2 + 2(2𝛼 + 2 − 2)2
𝑓 = (𝛼)2 + 2(2𝛼)2 = 𝛼 2 + 8𝛼 2 = 9𝛼 2
⇒𝛼=0

Thus:
𝐱 ∗ = (1,2)

✔ Global minimum found in one cycle

8
Worked 2-D Numerical Example

Minimize:

𝑓(𝑥, 𝑦) = (𝑥 − 1)2 + 2(𝑦 − 2)2

True minimum:

(𝑥 ∗ , 𝑦 ∗ ) = (1,2)

Step 1: Initialization

𝐱0 = (0,0)

Initial directions:

1 0
𝐝1 = [ ] , 𝐝2 = [ ]
0 1
Step 2: Line Search Along 𝐝𝟏 (x-direction)

𝑓(𝑥, 0) = (𝑥 − 1)2 + 8

Minimize:

𝑑
[(𝑥 − 1)2 ] = 2(𝑥 − 1) = 0
𝑑𝑥
𝑥=1
𝐱 (1) = (1,0)

Step 3: Line Search Along 𝐝𝟐 (y-direction)

𝑓(1, 𝑦) = 2(𝑦 − 2)2


𝑑
= 4(𝑦 − 2) = 0
𝑑𝑦
𝑦=2
𝐱 (2) = (1,2)

Step 4: New Direction

Net displacement:

1
𝐝𝑛𝑒𝑤 = 𝐱 (2) − 𝐱0 = [ ]
2
Replace 𝐝1 with 𝐝𝑛𝑒𝑤

Now directions:

0 1
𝐝1 = [ ] , 𝐝2 = [ ]
1 2
Since we already reached the exact minimum, further steps give no change.

❖ Converged in one cycle.

𝑨𝒍𝒕𝒆𝒓𝒏𝒂𝒕𝒊𝒗𝒆𝒍𝒚
If our starting point is 𝑆𝑞 and we take a step of length 𝛼𝑞 in the direction 𝑆𝑞 , then 𝑋𝑞+1 = 𝑋𝑞 + 𝛼𝑞 𝑆𝑞 ,

where 𝑋𝑞 is the old position and 𝑋𝑞+1 is the new position.

9
For minimization, 𝐹(𝑋𝑞+1 ) ≤ 𝐹(𝑋𝑞 )

For each variable,


(𝑖) (𝑖) (𝑖)
𝑋1𝑖+1 = 𝑋1 + 𝛼1 𝑆1
(𝑖) (𝑖) (𝑖)
𝑋2𝑖+1 = 𝑋2 + 𝛼2 𝑆2
(𝑖) (𝑖) (𝑖)
𝑋𝑛𝑖+1 = 𝑋𝑛 + 𝛼𝑛 𝑆𝑛
For each co-ordinate change of variable, determine 𝛼 by minimizing,

𝐹(𝑥 (𝑖+1) ) = 𝐹(𝑥𝑗𝑖 + 𝛼 (𝑖) 𝑆 (𝑖) )

Use the value of 𝛼 to determine the new co-ordinate point. At the end of the cycle, the new direction is given
by 𝑆 = 𝑋 − 𝑌, where X is the current (new) co-ordinate point and Y is the old co-ordinate point. The problem
is to actually minimize

𝐹(𝑋 (𝑖) + 𝛼 (𝑖) 𝑆𝑗 ) = 𝐹(𝛼), i.e., find the step size.

Example 1: Minimize 𝐹(𝑥) = 3𝑥12 + 𝑥22 − 12𝑥1 − 8𝑥2 , using Powel’s method. Start with F(0,0).
Solution:
Set Y = X = [0,0]
F(0, 0) = 0
Cycle 1:
Step1:
Change
(0) (0) (0) (0) (0)
𝑥𝑖1 = 𝑥𝑖 + 𝛼𝑖 𝑆𝑗 = 𝑥1 𝛼1

Since 𝑆1 is a unit vector of the cycle,


(0) (0) (0) (0) (0) 2 (0) 2 (0) (0) (0)
𝐹(𝑥1 + 𝛼1 , 𝑥2 ) = 3(𝑥1 + 𝛼1 ) + (𝑥2 ) − 12(𝑥1 + 𝛼1 ) − 8𝑥2

Substituting the penultimate value x = (0, 0), we have


(0) 2 (0)
𝐹(𝛼1 ) = 3(𝛼1 ) − 12(𝛼1 )
(0) (0)
Solving for 𝛼1∗ : 𝐹 ′ (𝛼1 ) = 6𝛼1 − 12 = 0, 𝑤ℎ𝑖𝑐ℎ 𝑦𝑖𝑒𝑙𝑑𝑠 𝛼1 = 2.
Hence, the new value of 𝑥1 𝑖𝑠 𝑥1 = 0 + 𝛼1∗ = 2
Test if x = (2, 0) minimizes F(x) : 𝐹(2,0) = 12 − (12 × 2) = −12
Therefore, 𝐹(𝑋𝑞+1 ) < 𝐹(𝑋𝑞 ), so we are moving in the right direction.

Step 2:
(1) (0) (0) (0) (0) (0)
Change 𝑥2 𝑡𝑜 𝑥2 + 𝛼2 𝑆2 = 𝑥2 + 𝛼2
(1) (1)
Substituting the new 𝑥1 and the present 𝑥2 in the objective function, we obtain:
(1) (0) (0) (1) 2 (0) (0) 2 (1) (0) (0)
𝐹(𝑥1 , 𝑥2 + 𝛼2 ) = 3(𝑥1 ) + (𝑥2 +𝛼2 ) − 12(𝑥1 ) − 8(𝑥2 + 𝛼2 )

10
(0) 2 (0) (0) (1) 2 (1)
𝐹(𝛼2 ) = 3(2)2 + (𝛼2 ) − 12(2) − 8(𝑥2 + 𝛼2 ) = (𝛼2 ) − 8𝛼2 − 12
(1)
Therefore, 𝐹 ′ (𝛼2 ) = 2𝛼2 − 8 = 0
(1)
Given 𝛼2 = 4
Test whether 𝑥 = (2,4) 𝑚𝑖𝑛𝑖𝑚𝑖𝑧𝑒𝑠 𝐹(𝑥)
𝐹(2,4) = 3(2)2 + (4)2 − 12(2) − 8(4) = −28
Hence, 𝐹(𝑋𝑞+1 ) < 𝐹(𝑋𝑞 ), so we are moving in the right direction.

𝑆 = 𝑋 − 𝑌 = (2,4) − (0,0) = (2,4)


When do we stop this iteration?
Test when |𝐹(𝑥)(𝑖 + 1) − 𝐹(𝑥)𝑖| ≤ 𝜀 (𝑤ℎ𝑒𝑟𝑒 𝜀 is a specified tolerance value)
Cycle 2:
Step 1:
(2) (1) (1) (1) (1)
Change 𝑥𝑖 𝑡𝑜 𝑥1 + 𝛼1 = 𝑥1 + 2𝛼1

(1) 2 (1)
Hence, 𝐹(𝑥1 + 2𝛼1 , 𝑥2 ) = 3(𝑥1 + 2𝛼1 ) + 𝑥22 − 12(𝑥1 + 2𝛼1 ) − 8𝑥 2
(2)
Thus,𝐹 ′ (𝛼1 ) = 6(2 + 2𝛼1 )2 − 24 = 0 which yields 𝛼1 = 0 𝑖. 𝑒. , 𝑥1 = 2
There is no need to test for convergence since there is no change in 𝑥1 .
Step 2:
(2) (1) (1) (1) (1)
Change 𝑥2 𝑡𝑜 𝑥2 + 𝛼2 = 𝑥2 + 4𝛼1

𝐹(𝑥1 , 𝑥2 + 4𝛼2 ) = 3𝑥12 + ( 𝑥2 + 4𝛼2 )2 − 12𝑥1 − 8(𝑥2 + 𝛼2 )


where X = (2, 4).
𝐹(2,4) = 3(22 ) + (𝑥2 + 4𝛼2 )2 − 12(2) − 8(4 + 4𝛼2 )
Therefore, 𝐹 ′ (𝛼2 ) = 2(4 + 4𝛼2 )4 − 8(4) = 0
(2) (2)
This gave 𝛼2 = 0 while 𝑥2 = 4
Therefore, F(X) is minimum at X = (2, 4).
The optimal value of the objective function F(X) = - 28.
Thus, a quadratic function of two variables is minimized in two cycles. In conclusion, for a quadratic equation
or function, the number of variables is equal to the number of cycles needed to get convergence.
Powell showed that, for a quadratic form, k iterations of the above basic procedure produce a set of directions
µ𝑖 whose last k members are mutually conjugate. Therefore, N iterations of the basic procedure, amounting to
N (N + 1) line minimizations in all, will exactly minimize a quadratic form.

Alternative Solution to the same question.


𝑭(𝒙) = 𝟑𝒙𝟐𝟏 + 𝒙𝟐𝟐 − 𝟏𝟐𝒙𝟏 − 𝟖𝒙𝟐
Given that 𝒙 = (𝟎, 𝟎) 𝒂𝒏𝒅 𝑺 = (𝟏, 𝟏)

11
Applying this 𝐹(𝑥1 , 𝑥1 ) ⟹ 𝑥1 + 𝑆𝛼 𝑖. 𝑒. , 𝑥1 𝑐ℎ𝑎𝑛𝑔𝑒𝑠 𝑤ℎ𝑖𝑙𝑒 𝑥2 𝑟𝑒𝑚𝑎𝑖𝑛𝑠 𝑐𝑜𝑛𝑠𝑡𝑎𝑛𝑡
Then substitute the values of 𝒙𝟏 into the given equation yields
𝑭(𝒙) = 𝟑(𝑥1 + 𝑆𝛼 )𝟐 + 𝒙𝟐𝟐 − 𝟏𝟐(𝑥1 + 𝑆𝛼 ) − 𝟖𝒙𝟐
Substituting the initial values of 𝑥1 𝑎𝑛𝑑 𝑆
𝑭(𝜶) = 𝟑(𝛼 )𝟐 − 𝟏𝟐𝜶
𝜕𝑓
At min. point 𝜕𝛼 = 0
𝜕𝑓
= 6𝛼 − 12
𝜕𝛼
∴𝜶=𝟐
Substituting, 𝑥1 = 2 𝑎𝑛𝑑 𝑥2 = 0 in the initial equation, 𝑓(0) = 0 𝑎𝑛𝑑 𝑓(2, 0) = 12 + 0 − 24 − 0 = −12
. with in new value it is approaching minimum point.
Next step is to set 𝑥1 = 2 𝑎𝑛𝑑 𝑥2 = 𝑥2 + 𝑆𝛼, substitute into the main equation gives
𝑭(𝒙) = 𝟑(𝑥1 )𝟐 + (𝑥2 + 𝑆𝛼 )𝟐 − 𝟏𝟐𝑥1 − 𝟖(𝑥2 + 𝑆𝛼 )
𝑭(𝟐, 𝟎) = 𝒇(𝜶) = 𝟏𝟐 + 𝜶𝟐 − 𝟐𝟒 − 𝟖𝜶
𝜕𝑓
= 2𝛼 − 8
𝜕𝛼
∴ 𝜶 = 𝟒 hence the new value for 𝑥2 = 4
Checking the direction from the original equation
𝑭(𝟐, 𝟒) = 𝟏𝟐 + 𝟏𝟔 − 𝟐𝟒 − 𝟑𝟐 = −𝟐𝟖
This showed that we are still on course, going down towards the minimum value.
Having completed one cycle and a new value of S will be obtained via
𝑆𝑛𝑒𝑤 = 𝑋 − 𝑌
where X = new value for x and Y = old value for x.
𝑆𝑛𝑒𝑤 = (2 − 0; 4 − 0) = (2, 4)
Repeat the cycle with this new value of S
Now 𝑥1 → 𝑥1 + 𝑆𝛼 = 2 + 2𝛼
𝑭(𝒙) = 𝟑(2 + 2𝛼 )𝟐 + 𝟏𝟔 − 𝟏𝟐(2 + 2𝛼 ) − 𝟑𝟐
𝜕𝑓
= 2 × 3(2 + 2𝛼) × 2 − 24
𝜕𝛼
∴𝛼=0
Secondly 𝑥2 = 𝑥2 = 4
Since 𝛼 = 0 it means that 𝑥1 does not need any change and therefore 𝑥1 𝑟𝑒𝑚𝑎𝑖𝑛𝑠 2
Hence, 𝑥2 → 𝑥2 + 𝑆𝛼 = 4 + 4𝛼 𝑤ℎ𝑖𝑙𝑒 𝑥1 = 𝑥1 = 2
𝑭(𝑥) = 3𝑥12 + 𝑥22 − 12𝑥1 − 8𝑥2
𝐹(𝑥) = 12 + (4 + 4𝛼)2 − 24 − 8(4 + 4𝛼)
𝜕𝑓
= 2(4 + 4𝛼) × 4 − 32 = 0
𝜕𝛼
∴𝛼=0
Also, since 𝛼 = 0, then 𝒙𝟐 does not any further change remain at 𝑥2 = 4
∴ Optimum point =F(x) = -28 and the variable that produces this point are 𝑥 = 2, 4.
𝑂𝑝𝑡𝑖𝑚𝑢𝑚 𝑝𝑜𝑖𝑛𝑡 = −28
𝑉𝑎𝑟𝑖𝑎𝑏𝑙𝑒 𝑎𝑟𝑒 𝑥 = 2, 4.
Example 1 — Simple Quadratic (Diagonal Hessian)

Minimize:

𝑓(𝑥, 𝑦) = (𝑥 − 4)2 + 2(𝑦 − 1)2

True minimum:

(𝑥 ∗ , 𝑦 ∗ ) = (4,1)

Start:

𝐱0 = (0,0)

12
Initial directions:

𝐝1 = (1,0), 𝐝2 = (0,1)

Step 1 — Line search along 𝒅𝟏

𝐱 = (0,0) + 𝛼(1,0) = (𝛼, 0)


𝑓(𝛼, 0) = (𝛼 − 4)2 + 2(0 − 1)2
= (𝛼 − 4)2 + 2

Minimize:

𝑑
= 2(𝛼 − 4) = 0
𝑑𝛼
𝛼=4
𝐱1 = (4,0)

Step 2 — Line search along 𝒅𝟐

𝐱 = (4,0) + 𝛽(0,1) = (4, 𝛽)


𝑓(4, 𝛽) = 0 + 2(𝛽 − 1)2

Minimize:

𝛽=1
𝐱2 = (4,1)

❖ Reached minimum in one cycle.

Example 2 — Coupled Quadratic (Non-Diagonal Hessian)

This shows why conjugate directions matter.

Minimize:

𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑥𝑦 + 𝑦 2

Hessian:

2 1
𝐻=[ ]
1 2
True minimum:

(0, 0)

Start:

𝐱0 = (1,2)

Directions:

𝑑1 = (1,0), 𝑑2 = (0,1)

Step 1 — Along 𝒅𝟏

𝑥 = 1 + 𝛼, 𝑦 = 2
𝑓 = (1 + 𝛼)2 + (1 + 𝛼)(2) + 4
= 𝛼 2 + 4𝛼 + 7

13
𝑑
= 2𝛼 + 4 = 0
𝑑𝛼
𝛼 = −2
𝐱1 = (−1,2)

Step 2 — Along 𝒅𝟐

𝑥 = −1, 𝑦 = 2 + 𝛽
𝑓 = 1 + (−1)(2 + 𝛽) + (2 + 𝛽)2
= 𝛽 2 + 3𝛽 + 3
𝑑
= 2𝛽 + 3 = 0
𝑑𝛽
𝛽 = −1.5
𝐱2 = (−1,0.5)

New conjugate direction

𝑑3 = 𝑥2 − 𝑥0 = (−1,0.5) − (1,2)
𝑑3 = (−2, −1.5)

Step 3 — Along 𝒅𝟑

𝑥 = −1 − 2𝛾
𝑦 = 0.5 − 1.5𝛾

Substitute into 𝑓, simplify:

𝑓(𝛾) = 8.5𝛾 2

Minimize:

𝛾=0

➡ Already near minimum; next cycle rapidly converges to (0,0)

Worked 3-Variable Example: Powell’s Method of Conjugate Directions

Minimize:

𝑓(𝑥, 𝑦, 𝑧) = (𝑥 − 1)2 + 2(𝑦 − 2)2 + 3(𝑧 + 1)2

This is a strictly convex quadratic.

Step 0 — True Solution (for checking)

∇𝑓 = 0
𝑥 ∗ = 1, 𝑦 ∗ = 2, 𝑧 ∗ = −1

Step 1 — Initialization

Starting point:

𝐱0 = (0,0,0)

Initial directions (unit coordinate directions):

𝑑1 = (1,0,0)
𝑑2 = (0,1,0)
𝑑3 = (0,0,1)

14
Cycle 1

Line Search 1 — Along 𝒅𝟏

𝐱 = (0,0,0) + 𝛼(1,0,0) = (𝛼, 0,0)


𝑓(𝛼, 0,0) = (𝛼 − 1)2 + 8 + 3
= (𝛼 − 1)2 + 11

Minimize:

𝑑
= 2(𝛼 − 1) = 0
𝑑𝛼
𝛼=1
𝐱1 = (1,0,0)

Line Search 2 — Along 𝒅𝟐

𝐱 = (1,0,0) + 𝛽(0,1,0) = (1, 𝛽, 0)


𝑓(1, 𝛽, 0) = 0 + 2(𝛽 − 2)2 + 3
= 2(𝛽 − 2)2 + 3

Minimize:

𝑑
= 4(𝛽 − 2) = 0
𝑑𝛽
𝛽=2
𝐱2 = (1,2,0)

Line Search 3 — Along 𝒅𝟑

𝐱 = (1,2,0) + 𝛾(0,0,1) = (1,2, 𝛾)


𝑓(1,2, 𝛾) = 0 + 0 + 3(𝛾 + 1)2

Minimize:

𝑑
= 6(𝛾 + 1) = 0
𝑑𝛾
𝛾 = −1
𝐱3 = (1,2, −1)

New Conjugate Direction

Net displacement over the cycle:

𝑑4 = 𝐱3 − 𝐱0
𝑑4 = (1,2, −1)

GRADIENT METHODS
Unlike Powell’s method, all the variables are changed simultaneously for one cycle of the gradient method.
There are many types of gradient methods but we are going to treat the steepest decent/ascent method.
A large number of multi-dimensional optimization algorithms depend in some way in gradient information.
In general, we have an n-dimensional vector 𝑦 = (𝑥1 , 𝑥2 ,. . . , 𝑥𝑛 ). In order to find the optimum of a
function, f, we consider a vector 𝑚𝑗 such that 𝑚1 is the direction of move or change of 𝑥1 𝑎𝑛𝑑 𝑚2 𝑖𝑠 the
direction of change or move of 𝑥2 𝑎𝑛𝑑 𝑚𝑛 is the direction of change or move of 𝑥𝑛 .
𝑚1 is the direction of change or move of 𝑥𝑛
𝑚2 is the direction of change or move of 𝑥2
𝑚3 is the direction of change or move of 𝑥3

15
𝑚𝑛 is the direction of change or move of 𝑥𝑛
(𝑖)
That is, from position 𝑥𝑗 we move to a new position
(𝑖+𝑗) (𝑖) (𝑖)
𝑥𝑗 = 𝑥𝑗 + 𝑆𝑚𝑗 (1)
where S is the magnitude of the step size and for a 2-dimensional case.
For n-dimensional case (n-decision variables case), we have;
𝑛

𝑑𝑆 = √𝑑𝑥12 + 𝑑𝑥22 +. . . +𝑑𝑥𝑛2 𝑂𝑅 𝑑𝑆 = √∑ 𝑑𝑥𝑗2 (2)


𝑗=1

If we divide both sides by ds, then it gives


𝑛 2 𝑛 2
𝑑𝑥𝑗 𝑑𝑥𝑗
1 = √∑ ( ) = ∑ ( ) (3)
𝑑𝑆 𝑑𝑆
𝑗 𝑗

This change of size will cause a change in the objective function f given by:
𝜕𝑓 𝜕𝑓 𝜕𝑓 𝜕𝑓
𝑑𝑓 = 𝑑𝑥1 + 𝑑𝑥2 + 𝑑𝑥3 + . . . + 𝑑𝑥
𝜕𝑥1 𝜕𝑥2 𝜕𝑥3 𝜕𝑥𝑛 𝑛
Dividing by dS throughout we have;
𝑛
𝑑𝑓 𝜕𝑓 𝑑𝑥1 𝜕𝑓 𝑑𝑥2 𝜕𝑓 𝑑𝑥3 𝜕𝑓 𝑑𝑥𝑛 𝜕𝑓 𝑑𝑥𝑗
= + + + . . . + =∑ (4)
𝑑𝑆 𝜕𝑥1 𝑑𝑆 𝜕𝑥2 𝑑𝑆 𝜕𝑥3 𝑑𝑆 𝜕𝑥𝑛 𝑑𝑆 𝜕𝑥𝑗 𝑑𝑆
𝑗=1
𝑑𝑓
A particular setoff displacement 𝑑𝑥𝑗 will make 𝜕𝑥 as large or as small as possible. This is the direction of
𝑗
the steepest ascent or descent. The problem may be stated as follows:
𝑑𝑓 𝜕𝑓 𝑑𝑥𝑗
Min/Max 𝑑𝑆 = ∑𝑛𝑗=1 𝜕𝑥 𝑑𝑆
𝑗
𝑑𝑥𝑗 2
s.t. ∑𝑛𝑗=1 ( 𝑑𝑆 ) = 1
This is a constrained optimization problem, which must be converted to an unconstrained optimization
problem.

Using the Lagrangian function, we have:

Min/Max:
𝑛 𝑛 2
𝜕𝑓 𝑑𝑥𝑗 𝑑𝑥𝑗
𝐿=∑ − 𝜆 {∑ ( ) − 1}
𝜕𝑥𝑗 𝑑𝑆 𝑑𝑆
𝑗=1 𝑗=1
Solution Steps:
𝑑𝑥
Step 1: Differentiate L with respect to 𝑑𝑆𝑗 and set to zero.
𝑑𝐿 𝑑𝑓 𝑑𝑥𝑗
𝑑𝑥𝑗 = − 2𝜆 =0 𝑓𝑜𝑟 𝑗 = 1, 2, 3, . . . , 𝑛 (5)
⁄ 𝑑𝑥𝑗 𝑑𝑆
𝑑𝑆
Step 2:
Differentiate L w.r.t 𝜆 and setto zero, then it gives;
𝑛 2
𝑑𝑥𝑗
∑( ) = 1 (6)
𝑑𝑆
𝑗=1
𝑑𝑥𝑗 1 𝑑𝑓
From Eqn. (5) = 2𝜆 . 𝑑𝑥 (7)
𝑑𝑆 𝑗
Substituting eqn. (7) into (6) gives;
𝑛 2
1 𝑑𝑓
2
∑( ) = 1 (8)
4𝜆 𝑑𝑥𝑗
𝑗=1

16
2
𝑑𝑓
and 2𝜆 ± √∑𝑛𝑗=1 (𝑑𝑥 ) =1 (9)
𝑗

Note that from eqn. (8) the change in xj as we make a move is given by:
𝑑𝑥𝑗 1 𝑑𝑓
= 2𝜆 . 𝑑𝑥 𝑗 = 1, 2, . . . 𝑛 and that our new position is:
𝑑𝑆 𝑗
𝑑𝑥𝑗 (𝑖)
𝑥𝑗𝑖+1 = 𝑥𝑗𝑖 , 𝑆 = 𝑥𝑗 + 𝑚𝑗
𝑑𝑆
𝑑𝑓
𝑑𝑥𝑗 1 𝑑𝑓 𝑑𝑥𝑗
where 𝑚𝑗 = = 2𝜆 . 𝑑𝑥 = ± (10)
𝑑𝑆 𝑗 2
𝑑𝑓
√∑𝑛
𝑗=1(𝑑𝑥 )
𝑗

The positive sign increases f and therefore it is ascent while the negative sign decreases f and is for descent.
The numerator is the gradient and the denominator is the normalizing factor. To find the step S, substitute
𝑥𝑗𝑖+1 in f and minimize w.r.t. S in order to find the optimum step size. We can test if the optimum obtained is
a minimum or maximum by checking the Hessian matrix. For minimum, H is positive definite at x.
test for convergence |𝑓(𝑥𝑗𝑖+1 ) − 𝑓(𝑥 𝑖 )| ≤ 𝜀

Steepest Descent/Ascent method


Given F(x), choose initial values of 𝑥 (0) = 𝑥10 , 𝑥20 , . . . , 𝑥 0𝑛 , i = with these values we calculate the gradient
given by;
𝜕𝑓
± ⁄𝜕𝑥
𝑗
𝐺𝑟𝑎𝑑𝑖𝑒𝑛𝑡, 𝑀𝑗 = 2
𝜕𝑓
√∑𝑛
𝑗=1( )
𝜕𝑥𝑗

Next, we look for value of 𝑥 𝑖. 𝑒. , 𝑥 ′ and given by 𝑥 ′ = 𝑥 0 + 𝑀0 𝑆 which is similar to 𝑥 𝑖+1 = 𝑥 2 + 𝛼 2 𝑆 𝑖 .


𝜕𝑓(𝑠)
We put the value of 𝑥 ′ 𝑖𝑛𝑡𝑜 the function to get the function dependent on S, we get 𝜕𝑆 = 0.
Summary of the solution steps
1. Let the decision variable be 𝑥 = 𝑥1 , 𝑥2 , . . . 𝑥𝑛
2. Starting point 𝑥 0 = 𝑥10 , 𝑥20 , . . . , 𝑥𝑛0
𝛿𝑓
3. Calculate the gradient ∀ 𝑗 = 1, 2, 3, . . . , 𝑛 𝑁𝐵: ∀ 𝑚𝑒𝑎𝑛𝑠 𝑓𝑜𝑟 𝑎𝑙𝑙
𝛿𝑥𝑗
𝜕𝑓
± ⁄𝜕𝑥
𝑗
4. Define 𝐺𝑟𝑎𝑑𝑖𝑒𝑛𝑡, 𝑀𝑗 = 2
𝜕𝑓
√∑𝑛
𝑗=1( )
𝜕𝑥𝑗

5. Set new values of x as 𝑥 = 𝑥 𝑖 + 𝑚(𝑖) 𝑆, 𝑤ℎ𝑒𝑟𝑒 𝑚 = [𝑚1 , 𝑚2 , . . ., 𝑚𝑛 ]


𝑖+1

6. Substitute 𝑥 𝑖+1 in the objective function, thus yielding a function F(s).


7. Determine the value of S that optimize F(s).
8. Calculate the new value of 𝑥 𝑖+1
9. Test for convergence |𝑓(𝑥 𝑖+1 ) − 𝑓(𝑥 𝑖 )| ≤ 𝜀

Example; 𝑀𝑖𝑛𝑖𝑚𝑖𝑧𝑒 𝑭(𝑥) = 3𝑥12 + 𝑥22 − 12𝑥1 − 8𝑥2

Starting at 𝑥 0 = (0, 0), chose 𝜀 = 0.1.

Solution
𝜕𝑓
± ⁄𝜕𝑥
𝑗
Applying 𝑀𝑗 = 2
𝜕𝑓
√∑𝑛
𝑗=1( )
𝜕𝑥𝑗
𝜕𝑓 𝜕𝑓
= 6𝑥1 − 12 and = 2𝑥2 − 8
𝜕𝑥1 𝜕𝑥2
Substituting the initial conditions of 𝑥 0 = (0, 0)

17
𝜕𝑓 𝜕𝑓
| = −12 𝑎𝑛𝑑 | = −8
𝜕𝑥1 𝑥 0 𝜕𝑥2 𝑥 0
1 2
−12
𝑇ℎ𝑒𝑛, 𝑀10 = = −0.8321
√(−12)2 +(−8)2
−8
𝑀20 = = −0.5547
√(−12)2 +(−8)2
Step 5: New values are:
Thus, going to our equation
(1) (0)
𝑥1 = 𝑥1 + 𝑚1 𝑆 and
(1) (0)
𝑥2 = 𝑥2 + 𝑚2 𝑆
(1)
∴ 𝑥1 = 0 − 0.8321𝑆 = −0.8321𝑆
(1)
𝑥2 = 0 − 0.5547𝑆 = −0.5547𝑆
Step 6:
Substituting these values into the objective function to obtain;
Thus, 𝐹(𝑆) = 3(−0.8321𝑆)2 + (−0.5547𝑆)2 − 12(−0.8321𝑆) − 8(−0.5547𝑆)
𝜕𝑓(𝑠)
= 0;
𝜕𝑆
𝜕𝑓(𝑠)
= 6(−0.8321𝑆)(−0.8321) + 2(−0.5547𝑆)(−0.5547) + 12(0.8321) + 8(0.5547
𝜕𝑆
= 2 × 2.3849𝑆 + 14.4728 = 0
−14.4728
𝑆= = −3.0238
4.47698
Hence, substituting the value of S = -3.0238 into 𝑥1′ 𝑎𝑛𝑑 𝑥2′
(1)
𝑥1 = −0.8321 × −3.0238 = 2.5161
(1)
𝑥2 = −0.5547 × −3.0238 = 1.6773
⟹ 𝑓(2.5161, 1.6773) = −21.806
Thus, from our result we are minimizing because we got a negative value which is far smaller than zero
which was the initial value we started with;
Iteration 2: Then find
𝜕𝑓
± ⁄𝜕𝑥 ⌋ ′
(1) 1 𝑥
1
𝑀1 = 2 2
𝜕𝑓 𝜕𝑓
√( ⌋ ) +( ⌋ )
𝜕𝑥1 𝑥′ 𝜕𝑥2 𝑥′
1 2

𝜕𝑓 𝜕𝑓
= 6𝑥1 − 12 and = 2𝑥2 − 8
𝜕𝑥1 𝜕𝑥2

𝜕𝑓
Hence, 𝜕𝑥 ⌋ (1)
= 6 × 2. 5161 − 12 = 3. 0966
1 𝑥1
𝜕𝑓
⌋ = 2 × 1.6773 − 8 = −4.6454
𝜕𝑥2 𝑥 (1)
2
(1) ±3.0966 3.0966
𝑀1 = = 5.5829 = 0.5547
√(3.0966)2 +(−4.6454)2
(1) −4.6454 −4.6454
𝑀2 = = = −0.8321
√(3.0966)2 +(−4.6454)2 5.5829
(1)
𝑥12 = 𝑥11 + 𝑀1 𝑆 = 2.5161 + 0.5547𝑆
(1)
𝑥22 = 𝑥21 + 𝑀2 𝑆 = 1.6773 − 0.8321𝑆
To find the value of S, plug back into the original equation and differentiate and equate to zero.
∴ 𝐹(𝑆) = 3(2.5161 + 0.5547𝑆)2 + (1.6773 − 0.8321𝑆)2 − 12(2.5161 + 0.5547𝑆) − 8(1.6773
− 0.8321𝑆)
𝜕𝑓(𝑠)
= 0;
𝜕𝑆

18
6(2.5161 + 0.5547𝑆)(0.5547) + 2(1.6773 − 0.8321𝑆)(−0.8321) − 12(0 + 0.5547) −
8(0.8321)=0
∴ 𝑆 = −1.7280
𝑥12 = 𝑥11 + 𝑀1′ 𝑆 = 2.5161 + 0.5547 × −1.728 = 1.5576
𝑥2′ = 𝑥21 + 𝑀2′ 𝑆 = 1.6773 − 0.8321 × −1.728 = 3.115
𝐻𝑒𝑛𝑐𝑒, 𝑓(1.5576, 3.115) = −26.63
We are still going towards the minimum point since −26.63 < −21.806
Checking, −21.806 − (−26.63) ≤ 𝐸𝑟𝑟𝑜𝑟 𝑚𝑎𝑟𝑔𝑖𝑛
4.824 > 𝐸𝑟𝑟𝑜𝑟 𝑚𝑎𝑟𝑔𝑖𝑛 𝜀 ≤ 0.0001

Step 3: Iteration continues


𝜕𝑓 𝜕𝑓
= 6𝑥1 − 12 and = 2𝑥2 − 8
𝜕𝑥1 𝜕𝑥2

𝜕𝑓
Hence, ⌋ = 6 × 1.5576 − 12 = −2.6544
𝜕𝑥1 𝑥 2
1
𝜕𝑓
⌋ = 2 × 3.115 − 8 = −1.77
𝜕𝑥2 𝑥 2
2
𝜕𝑓
± ⁄𝜕𝑥 ⌋ 3
1 𝑥
𝑀12 = 2
1
2
𝜕𝑓 𝜕𝑓
√( ⌋ ) +( ⌋ )
𝜕𝑥1 𝑥3 𝜕𝑥2 𝑥3
1 2
−2.6544 −2.6544
𝑀12 = √−2.65442 = = −0.832
+(−1.77)2 3.1904
−1.77 −1.77
𝑀22 = = = −0.5548
√(−2.6544)2 +(−1.77)2 3.1904
𝑥13 = 2 2
𝑥1 + 𝑀1 𝑆 = 1.5576 + 0.832𝑆
𝑥22 = 𝑥21 + 𝑀2′ 𝑆 = 3.115 − 0.5548𝑆
∴ 𝐹(𝑆) = 3(1.5576 + 0.832𝑆 )2 + (3.115 − 0.5548𝑆)2 − 12(1.5576 + 0.832𝑆 ) − 8(3.115
− 0.5548𝑆)
𝜕𝑓(𝑠)
= 0; 6(1.5576 − 0.832𝑆 )(−0.832) + 2(3.115 − 0.5548𝑆 )(−0.5548) − 12(−0.832)
𝜕𝑆
− 8(−0.5548) = 0
−3.1905
∴𝑆= = −0.669
4.7689
𝑥13 = 𝑥11 + 𝑀1′ 𝑆 = 1.5576 − 0.832(−0.669) = 2.1142
𝑥23 = 𝑥21 + 𝑀2′ 𝑆3.115 − 0.5548𝑆(−0.669) = 3.4862
𝐻𝑒𝑛𝑐𝑒, 𝑓(1.5576, 3.115) = −27.904
Continue the iteration till the error is less than 0.0001
Classification of the methods
Indirect methods: The constrained problem is converted into a sequence of unconstrained problems whose
solutions will approach to the solution of the constrained problem, the intermediate solutions need not to be
feasible

Examples of Indirect methods

❖ Penalty function methods- for inequality constraints


❖ Lagrangian methods – for equality constraints

Direct methods: The constraints are taking into account explicitly; intermediate solutions are feasible

19
Penalty Function Approach
Many real engineering problems are constrained, but most powerful numerical algorithms are designed
for unconstrained minimization.
Indirect methods handle this by:
• Transforming a constrained problem
• Into a sequence of unconstrained problems
This is achieved by adding penalty term to the objective function for any violation of constraints.
The penalty function method is the most common of these.
Original constrained optimization problem
General nonlinear constrained problem:
Minimize 𝑓(𝑥)
subject to 𝑔𝑖 (𝑥) ≤ 0, 𝑖 = 1,2, … , 𝑚
ℎ𝑗 (𝑥) = 0, 𝑗 = 1,2, … , 𝑝
Where:
• 𝑓(𝑥): objective function
• 𝑔𝑖 (𝑥): inequality constraints
• ℎ𝑗 (𝑥): equality constraints
Core idea of the Penalty Function Method
Instead of enforcing constraints directly, penalize constraint violation by modifying the objective function.
𝐹(𝑥, 𝑟) = 𝑓(𝑥) + 𝑟 𝑃(𝑥)
Where:
• 𝐹(𝑥, 𝑟): penalized objective function
• 𝑟 > 0: penalty parameter
• 𝑃(𝑥): penalty term (zero when constraints are satisfied)
• As 𝑟 → ∞, the solution of the unconstrained problem approaches the constrained optimum.
Common penalty functions
(a) Equality constraints
For:
ℎ(𝑥) = 0
Penalty term:
𝑃(𝑥) = ℎ(𝑥)2
(b) Inequality constraints
For:
𝑔(𝑥) ≤ 0
Penalty term:
𝑃(𝑥) = max (0, 𝑔(𝑥))2
This ensures no penalty when the constraint is satisfied.
(c) Combined penalty function
𝑚 𝑝

𝑃(𝑥) = ∑ max (0, 𝑔𝑖 (𝑥)) + ∑ ℎ𝑗 (𝑥)2


2

𝑖=1 𝑗=1
Exterior Penalty Function Method (most common)
The exterior penalty allows constraint violation during intermediate iterations.
𝑚 𝑝

𝐹(𝑥, 𝑟) = 𝑓(𝑥) + 𝑟 [∑ max (0, 𝑔𝑖 (𝑥)) + ∑ ℎ𝑗 (𝑥)2 ]


2

𝑖=1 𝑗=1

Algorithm:
1. Choose initial 𝑥0 (may be infeasible)
2. Choose small 𝑟1

20
3. Minimize 𝐹(𝑥, 𝑟1 )(unconstrained)
4. Increase penalty parameter:
𝑟𝑘+1 = 𝑐 𝑟𝑘 , 𝑐 > 1

5. Repeat until convergence


6. Interior Penalty (Barrier) Method (contrast)
Used when:
𝑔𝑖 (𝑥) ≤ 0
Barrier function:
𝑚
1
𝐹(𝑥, 𝜇) = 𝑓(𝑥) − 𝜇 ∑
𝑔𝑖 (𝑥)
𝑖=1
Key features:
• Solution never leaves feasible region
• Requires feasible initial point
• 𝜇 → 0as iterations proceed
7. Example (simple numerical illustration)
Problem:
Minimize 𝑓(𝑥) = (𝑥 − 2)2
Subject to:
𝑥 ≥ 0 ⇒ 𝑔(𝑥) = −𝑥 ≤ 0
Penalized function:
𝐹(𝑥, 𝑟) = (𝑥 − 2)2 + 𝑟 max (0, −𝑥)2
• If 𝑥 ≥ 0: penalty = 0
• If 𝑥 < 0: penalty activates
Solve unconstrained for increasing 𝑟→ solution converges to 𝑥 = 2.

NUMERICAL EXAMPLES

Example 1: Inequality constraint (single variable)


Problem
Minimize 𝑓(𝑥) = (𝑥 − 1)2
Subject to:
𝑥 ≥ 0 ⇒ 𝑔(𝑥) = −𝑥 ≤ 0
Solution
Step 1: Construct penalty function
𝐹(𝑥, 𝑟) = (𝑥 − 1)2 + 𝑟 max (0, −𝑥)2
Step 2: Try different values of 𝒓
• If 𝑥 ≥ 0: penalty = 0
→ Minimum of (𝑥−1)2is at 𝑥 = 1
• Constraint satisfied, so:
𝑥∗ = 1
Interpretation
As 𝑟increases, any solution with 𝑥 < 0becomes very costly, pushing the solution into the feasible region.
Example 2: Equality constraint (single variable)
Problem
Minimize 𝑓(𝑥) = 𝑥 2
Subject to:
ℎ(𝑥) = 𝑥 − 1 = 0
Step 1: Penalty function
𝐹(𝑥, 𝑟) = 𝑥 2 + 𝑟(𝑥 − 1)2
Step 2: Minimize unconstrained

21
𝑑𝐹
= 2𝑥 + 2𝑟(𝑥 − 1) = 0
𝑑𝑥
(1 + 𝑟)𝑥 = 𝑟
𝑟
𝑥=
1+𝑟
Step 3: Increase penalty parameter
𝑟
𝑟 𝑥 = 1+𝑟
1 0.5
10 0.91
100 0.99
∞ 1.00
𝑥∗ → 1
Example 3: Two-variable inequality constraint
Problem
Minimize 𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2
Subject to:
𝑥+𝑦−1≤0
Step 1: Penalty function
𝐹(𝑥, 𝑦, 𝑟) = 𝑥 2 + 𝑦 2 + 𝑟 max (0, 𝑥 + 𝑦 − 1)2
Step 2: Optimal solution
The unconstrained minimum is at (0, 0), which satisfies the constraint:
0 + 0 − 1 = −1 ≤ 0
(𝑥 ∗ , 𝑦 ∗ ) = (0,0)
Step 3: Geometric insight
Penalty term becomes active only if 𝑥 + 𝑦 > 1, forcing the solution back into the feasible region.
Example 4: Two-variable equality constraint
Problem
Minimize 𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2
Subject to:
𝑥+𝑦−1=0
Step 1: Penalty function
𝐹(𝑥, 𝑦, 𝑟) = 𝑥 2 + 𝑦 2 + 𝑟(𝑥 + 𝑦 − 1)2
Step 2: Partial derivatives
∂𝐹
= 2𝑥 + 2𝑟(𝑥 + 𝑦 − 1)
∂𝑥
∂𝐹
= 2𝑦 + 2𝑟(𝑥 + 𝑦 − 1)
∂𝑦
Step 3: Solve
Subtract equations:
𝑥=𝑦
Substitute into constraint:
2𝑥 = 1 ⇒ 𝑥 = 𝑦 = 0.5
As 𝑟 → ∞:
(𝑥 ∗ , 𝑦 ∗ ) = (0.5,0.5)

Direct methods handle constraints explicitly, without converting the problem into an unconstrained one.
Main direct methods:
1. Lagrange Multiplier Method
2. Kuhn–Tucker (KKT) Conditions
3. Methods of Feasible Directions

22
1. Lagrange Multiplier Method
Applicable to:
• Equality constraints only
Problem formulation
Minimize 𝑓(𝑥)
subject to ℎ𝑗 (𝑥) = 0, 𝑗 = 1, … , 𝑝
Lagrangian function
𝑝

ℒ(𝑥, 𝜆) = 𝑓(𝑥) + ∑ 𝜆𝑗 ℎ𝑗 (𝑥)


𝑗=1
Where:
• 𝜆𝑗 : Lagrange multipliers
Necessary conditions for optimality
∇𝑥 ℒ = 0
ℎ𝑗 (𝑥) = 0
Numerical Example (Equality Constraint)
Minimize
𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2
Subject to
𝑥+𝑦−1=0
Lagrangian
𝓛 = 𝒙𝟐 + 𝒚𝟐 + 𝝀(𝒙 + 𝒚 − 𝟏)
First-order conditions
∂ℒ
= 2𝑥 + 𝜆 = 0
∂𝑥
∂ℒ
= 2𝑦 + 𝜆 = 0
∂𝑦
𝑥+𝑦 =1
From first two:
𝑥=𝑦
Substitute:
2𝑥 = 1 ⇒ 𝑥 = 𝑦 = 0.5
(𝑥 ∗ , 𝑦 ∗ ) = (0.5,0.5)
2. Kuhn–Tucker (Karush–Kuhn–Tucker) Conditions
Applicable to:
• Inequality constraints
• Equality constraints
• General nonlinear problems
Problem formulation
Minimize 𝑓(𝑥)
subject to 𝑔𝑖 (𝑥) ≤ 0, 𝑖 = 1, … , 𝑚
ℎ𝑗 (𝑥) = 0, 𝑗 = 1, … , 𝑝
Lagrangian
𝑝 𝑚

ℒ(𝑥, 𝜆, 𝜇) = 𝑓(𝑥) + ∑ 𝜆𝑗 ℎ𝑗 (𝑥) + ∑ 𝜇𝑖 𝑔𝑖 (𝑥)


𝑗=1 𝑖=1
Where:
• 𝜇𝑖 ≥ 0: KKT multipliers

KKT conditions
1. Stationarity

23
∇𝑓(𝑥) + ∑𝜆𝑗 ∇ℎ𝑗 (𝑥) + ∑𝜇𝑖 ∇𝑔𝑖 (𝑥) = 0
2. Primal feasibility
ℎ𝑗 (𝑥) = 0, 𝑔𝑖 (𝑥) ≤ 0
3. Dual feasibility
𝜇𝑖 ≥ 0
4. Complementary slackness
𝜇𝑖 𝑔𝑖 (𝑥) = 0

Numerical Example (Inequality Constraint)


Minimize
𝑓(𝑥) = 𝑥 2
Subject to
𝑥 − 1 ≥ 0 ⇒ 𝑔(𝑥) = 1 − 𝑥 ≤ 0

Solution:
Lagrangian
ℒ = 𝑥 2 + 𝜇(1 − 𝑥)
KKT conditions
𝜕ℒ
= 2𝑥 − 𝜇 = 0
𝜕𝑥
Stationarity:
2𝑥 − 𝜇 = 0
Complementary slackness:
𝜇(1 − 𝑥) = 0
Case 1: 𝜇 = 0
2𝑥 = 0 ⇒ 𝑥 = 0(violates constraint)
Case 2: 1 − 𝑥 = 0 ⇒ 𝑥 = 1
𝜇=2
𝑥∗ = 1
3. Methods of Feasible Directions (MFD)
Core idea
Start from a feasible point and move only in directions that maintain feasibility and reduce the objective
function.
Problem formulation
Minimize 𝑓(𝑥)subject to 𝑔𝑖 (𝑥) ≤ 0
Feasible direction 𝒅
At point 𝑥, direction 𝑑is feasible if:
∇𝑔𝑖 (𝑥)𝑇 𝑑 ≤ 0for all active constraints
Descent condition
∇𝑓(𝑥)𝑇 𝑑 < 0
Algorithm outline
1. Start with feasible 𝑥0
2. Identify active constraints
3. Compute feasible descent direction 𝑑
4. Perform line search:
𝑥𝑘+1 = 𝑥𝑘 + 𝛼𝑑
5. Repeat until convergence
Simple Numerical Illustration
Minimize
𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2

Subject to

24
𝑥+𝑦−1≤0

Start at feasible point (0.2, 0.2)


Gradient:
∇𝑓 = (0.4,0.4)
Choose direction:
𝑑 = (−1, −1)
Check feasibility:
∇𝑔 = (1,1) ⇒ ∇𝑔𝑇 𝑑 = −2 ≤ 0
Direction is feasible and descent.

Problem 2
Minimize 𝑓(𝑥, 𝑦, 𝑧) = 𝑥 2 + 𝑦 2 + 𝑧 2
Subject to:
ℎ(𝑥, 𝑦, 𝑧) = 𝑥 + 𝑦 + 𝑧 − 3 = 0

Solution
Step 1: Form the Lagrangian
ℒ(𝑥, 𝑦, 𝑧, 𝜆) = 𝑥 2 + 𝑦 2 + 𝑧 2 + 𝜆(𝑥 + 𝑦 + 𝑧 − 3)
Step 2: First-order conditions
∂ℒ
= 2𝑥 + 𝜆 = 0
∂𝑥
∂ℒ
= 2𝑦 + 𝜆 = 0
∂𝑦
∂ℒ
= 2𝑧 + 𝜆 = 0
∂𝑧
𝑥+𝑦+𝑧 = 3
Step 3: Solve
From the first three equations:
𝑥=𝑦=𝑧
Substitute into constraint:
3𝑥 = 3 ⇒ 𝑥 = 1
Final Answer
𝑥∗ = 𝑦∗ = 𝑧∗ = 1

Problem 3
Minimize 𝑓(𝑥, 𝑦, 𝑧) = 𝑥 2 + 𝑦 2 + 𝑧 2

Subject to:
𝑔(𝑥, 𝑦, 𝑧) = 𝑥 + 𝑦 + 𝑧 − 1 ≤ 0

Step 1: Form the Lagrangian


ℒ = 𝑥 2 + 𝑦 2 + 𝑧 2 + 𝜇(𝑥 + 𝑦 + 𝑧 − 1)
Step 2: Stationarity
∂ℒ
= 2𝑥 + 𝜇 = 0
∂𝑥
∂ℒ
= 2𝑦 + 𝜇 = 0
∂𝑦
∂ℒ
= 2𝑧 + 𝜇 = 0
∂𝑧
So:
𝑥=𝑦=𝑧

25
Step 3: Complementary slackness
𝜇(𝑥 + 𝑦 + 𝑧 − 1) = 0
Step 4: Case analysis

Case 1: Constraint inactive (𝝁 = 𝟎)


𝑥=𝑦=𝑧=0
Check constraint:
0 + 0 + 0 − 1 = −1 ≤ 0(valid)
Case 2: Constraint active (𝒙 + 𝒚 + 𝒛 = 𝟏)
1
𝑥=𝑦=𝑧=
3
Compare objective values:
• At (0,0,0): 𝑓 = 0
• At (1/3,1/3,1/3): 𝑓 = 1/3

Final Answer
(𝑥 ∗ , 𝑦 ∗ , 𝑧 ∗ ) = (0,0,0)
Constraint is inactive at optimum.

3. Methods of Feasible Directions (MFD)


Core idea
• Start from a feasible point
• Move in a direction that:
o Reduces the objective function
o Maintains feasibility
Feasible direction conditions

At feasible point 𝑥, direction 𝑑must satisfy:


• Descent condition
∇𝑓(𝑥)𝑇 𝑑 < 0
• Feasibility condition
∇𝑔𝑖 (𝑥)𝑇 𝑑 ≤ 0for all active constraints

3-Variable Numerical Example (Feasible Directions)


Problem
Minimize 𝑓(𝑥, 𝑦, 𝑧) = 𝑥 2 + 𝑦 2 + 𝑧 2
Subject to:
𝑔(𝑥, 𝑦, 𝑧) = 𝑥 + 𝑦 + 𝑧 − 1 ≤ 0

Step 1: Choose a feasible starting point


Let:
𝑥0 = (0.2,0.3,0.4)
Check feasibility:
0.2 + 0.3 + 0.4 − 1 = −0.1 ≤ 0
Step 2: Compute gradients

Objective gradient:
∇𝑓 = (0.4,0.6,0.8)
Constraint gradient:
∇𝑔 = (1,1,1)
Step 3: Choose a feasible descent direction
Try:

26
𝑑 = (−1, −1, −1)
Check feasibility:
∇𝑔𝑇 𝑑 = −3 ≤ 0
Check descent:
∇𝑓 𝑇 𝑑 = −1.8 < 0
Direction is feasible and descent.

Step 4: Update
𝑥1 = 𝑥0 + 𝛼𝑑

Choose small 𝛼 > 0to maintain feasibility.

Given Optimization Problem


Minimize 𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2
subject to 𝑔1 (𝑥, 𝑦) = 𝑥 + 𝑦 − 1 ≤ 0
𝑔2 (𝑥, 𝑦) = −𝑥 ≤ 0(𝑥 ≥ 0)
𝑔3 (𝑥, 𝑦) = −𝑦 ≤ 0(𝑦 ≥ 0)

This is a convex problem with three inequality constraints.

1. Kuhn–Tucker (KKT) Conditions


(Primary direct method for multiple inequalities)

Step 1: Form the Lagrangian


ℒ = 𝑥 2 + 𝑦 2 + 𝜇1 (𝑥 + 𝑦 − 1) + 𝜇2 (−𝑥) + 𝜇3 (−𝑦)

where:
𝜇1 , 𝜇2 , 𝜇3 ≥ 0
Step 2: Stationarity conditions
∂ℒ
= 2𝑥 + 𝜇1 − 𝜇2 = 0
∂𝑥
∂ℒ
= 2𝑦 + 𝜇1 − 𝜇3 = 0
∂𝑦
Step 3: Complementary slackness
𝜇1 (𝑥 + 𝑦 − 1) = 0
𝜇2 𝑥 = 0
𝜇3 𝑦 = 0
Step 4: Primal feasibility
𝑥 + 𝑦 ≤ 1, 𝑥 ≥ 0, 𝑦 ≥ 0
Step 5: Case analysis

Case 1: All constraints inactive


𝜇1 = 𝜇2 = 𝜇3 = 0
Stationarity:
𝑥=𝑦=0
Check feasibility:
0 + 0 − 1 = −1 ≤ 0
Case 2: Constraint 𝒈𝟏 active only
𝑥 + 𝑦 = 1, 𝜇1 > 0
From stationarity:
1
𝑥=𝑦=
2

27
Check feasibility:
𝑥, 𝑦 > 0 ⇒ 𝜇2 = 𝜇3 = 0
Objective:
𝑓 = 0.5
Case 3: Constraint 𝒈𝟐 active (𝒙 = 𝟎)

From feasibility:
𝑦 ≤ 1 ⇒ 𝑓 = 𝑦2
Minimum at:
𝑦=0⇒𝑓=0
Final KKT Solution
(𝑥 ∗ , 𝑦 ∗ ) = (0,0)
2. Lagrange Multipliers (Active Constraint View)
At the optimum:
• Only non-negativity constraints inactive
• No active equality constraint
If instead the minimum lay on 𝑥 + 𝑦 = 1, we would solve:
min 𝑥 2 + 𝑦 2 s.t. 𝑥 + 𝑦 = 1
which gives 𝑥 = 𝑦 = 0.5, confirming the KKT result.

Lagrange multipliers appear as a special case of KKT when constraints are active and treated as
equalities.

3. Method of Feasible Directions (MFD)

Step 1: Choose a feasible starting point


𝑥0 = (0.3,0.4)
Check:
0.3 + 0.4 − 1 = −0.3 ≤ 0
Step 2: Identify active constraints
None active (strictly inside feasible region).

Step 3: Compute gradient of objective


∇𝑓 = (2𝑥, 2𝑦) = (0.6,0.8)
Step 4: Choose descent direction
𝑑 = −∇𝑓 = (−0.6, −0.8)

Check feasibility:
∇𝑔1𝑇 𝑑 = (1,1) ⋅ (−0.6, −0.8) = −1.4 ≤ 0

All constraints satisfied → direction feasible.


Step 5: Line search
𝑥𝑘+1 = 𝑥𝑘 + 𝛼𝑑
Move until one constraint becomes active.

Convergence
The method drives the point toward:
(0, 0)

Direct method for Constrained Optimization Problems


(i) Lagrange multipliers
Given the objective function

28
𝑀𝑖𝑛 𝑓(𝑥1 , 𝑥2 ,. . . , 𝑥𝑛 )
Subject to equality constraint
𝑔𝑖 (𝑥1 , 𝑥2 ,. . . , 𝑥𝑛 ) = 0, 𝑖 = 1, 2, . . . , 𝑘
Use the Lagrange multiplier method to convert the constrained equality problem to unconstrained problem
by augmenting the cost function by introducing the k-vector 𝜆 of undetermined quantities.
The unconstrained function becomes
𝑘

𝐿 = 𝑓(𝑥) + ∑ 𝜆𝑖 𝑔𝑖
𝑖=1
The resulting conditions for constrained local minima of ℓ are the following
𝑘
𝜕𝐿 𝜕𝑓 𝜕𝑔𝑖
= + ∑ 𝜆𝑖 =0 (1)
𝜕𝑥𝑖 𝜕𝑥𝑖 𝜕𝑥𝑖
𝑖=1
𝜕𝐿
= 𝑔𝑖 = 0 (2)
𝜕𝑥𝑖
Then solving Eqns. (1) & (2) gives the parameter values for the minimum cost.

Example 3.
Apply the Lagrange method to solve constrained optimization to determine the minimum distance from the
origin of the 𝑥𝑦 plane to a circle described by
(𝑥 − 8)2 + (𝑦 − 6)2 = 25
The minimum distance is obtained by minimization of the distance square given by
𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2

Solution
The function can be rewritten in this form
𝑀𝑖𝑛 𝑓(𝑥, 𝑦)
s.t 𝑔𝑖 (𝑥) = (𝑥 − 8)2 + (𝑦 − 6)2 = 25
So,
ℓ = 𝑥 2 + 𝑦 2 + 𝜆{(𝑥 − 8)2 + (𝑦 − 6)2 − 25}
The necessary condition for extrema then becomes
𝜕𝐿
= 2𝑥 + 𝜆(2𝑥 − 16) = 0
𝜕𝑥
This equation gives
2𝑥(𝜆 + 1) = 16𝜆

𝜕𝐿
= 2𝑦 + 𝜆(2𝑦 − 12) = 0
𝜕𝑦
2𝑦(𝜆 + 1) = 12𝜆
Finally,
𝜕𝐿
= (𝑥 − 8)2 + (𝑦 − 6)2 − 25 = 0
𝜕𝜆
Solving the above three equations simultaneously gives
x = 4 and x = 12
y = 3 and y = 9
At 𝜆 = 1, (𝑥, 𝑦) = 4, 3
𝜆 = −3, (𝑥, 𝑦) = 12, 9
The minimum distance then becomes (𝑥, 𝑦) = 4, 3

For Inequality constraints of constrained optimization


Situations do arise when the optimization problem may contain both equality and inequality constraint such
as:
𝑀𝑖𝑛 𝑓(𝑥1 , 𝑥2 ,. . . , 𝑥𝑛 )
Subject to equality constraint
𝑔𝑖 (𝑥1 , 𝑥2 ,. . . , 𝑥𝑛 ) = 0, 𝑖 = 1, 2, . . . , 𝑘
and inequality constraint

29
𝑈𝑖 (𝑥1 , 𝑥2 ,. . . , 𝑥𝑛 ) ≤ 0, 𝑖 = 1, 2, . . . , 𝑚
To convert this problem to unconstrained one, we use extended Lagrange multiplier method or the Kuhn
Tucker conditions to include the inequality constraints by introducing M – vector µ of underdetermined
quantities. The unconstrained cost function becomes
𝑘 𝑚

𝐿 = 𝑓 + ∑ 𝜆𝑖 𝑔𝑖 + ∑ 𝜇𝑗 𝑈𝑗
𝑖=1 𝑗=1
the resulting necessary conditions becomes as usual, the necessary conditions for constrained local minima
of L are the following;
𝜕𝐿
= 0, 𝑖 = 1, . . ., 𝑛
𝜕𝑥𝑖
𝜕𝐿
= 𝑔𝑖 , 𝑖 = 1, . . ., 𝑘
𝜕𝜆𝑖
𝜕𝐿
= 𝑈𝑗 , 𝑗 = 1, . . ., 𝑚
𝜕𝜇𝑗

𝜇𝑗 𝑈𝑗 = 0 𝑎𝑛𝑑 𝜇𝑗 > 0 , 𝑗 = 1, . . . , 𝑚
Example 4:
Find the minimum value of the function 𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2
Subject to 𝑔(𝑥, 𝑦) = 𝑥 2 − 5𝑥 − 𝑦 2 + 20 = 0
𝑈(𝑥, 𝑦) = 2𝑥 + 𝑦 ≥ 6

Solution
𝑘 𝑚

ℓ = 𝑓 + ∑ 𝜆𝑖 𝑔𝑖 + ∑ 𝜇𝑗 𝑈𝑗
𝑖=1 𝑗=1
substituting in the above equation gives
ℓ = x 2 + y 2 + λ{(𝑥 2 − 5x − 𝑦 2 + 20) + μ(2x + y − 6)}
𝜕ℓ
= 2𝑥 + λ(2𝑥 − 5) + 2𝜇 = 0 (1)
𝜕𝑥
𝜕ℓ
= 2𝑦 + λ(−2𝑦) + 𝜇 = 0
𝜕𝑦
= 2𝑦 − 2𝑦λ + 𝜇 = 0 (2)
𝜕ℓ
= 𝑥 2 − 5x − 𝑦 2 + 20 = 0 (3)
𝜕λ
𝜕ℓ
= 2x + y − 6 = 0 (4)
𝜕μ
Eliminating μ from Eqns. (1) & (2)
2𝑥 + λ(2𝑥 − 5) + 2𝜇 = 0 × 1
2𝑦 − 2𝑦λ + 𝜇 =0 × 2
2𝑥 + λ(2𝑥 − 5) + 2𝜇 = 0
4𝑦 − 4𝑦λ + 2𝜇 = 0
-
2𝑥 − 4𝑦 + λ(2𝑥 − 5) − −4𝑦λ = 0
(2𝑥 − 4𝑦) + λ(2𝑥 + 4𝑦) − 5λ = 0 (5)
From eqn. (4)
𝑦 = 6 − 2𝑥 (6)
Substituting the value of y in Eqn. (5)
[2𝑥 − 4(6 − 2𝑥 )] + λ[2𝑥 + 4(6 − 2𝑥)] − 5λ = 0
Expanding the above equation gives
10𝑥 − 6𝑥λ + 19λ − 24 = 0
(10 − 6λ)𝑥 = 24 − 19λ
24 − 19λ
∴𝑥=
10 − 6λ
Substituting x into Eqn. (6)
2(24−19λ) 60−36λ−48+38λ 𝟏𝟐+𝟐𝛌
𝑦 = 6 − 10−6λ = =
10−6λ 𝟏𝟎−𝟔𝛌

30
Substituting x and y into (3)
24−19λ 2 24−19λ 12+2λ 2
( 10−6λ ) − 5 ( 10−6λ ) − (10−6λ) + 20 = 0
[576−9+2λ+361λ2 ] [120−95λ] [144+48λ+4λ2 ]
− + + 20 = 0
(10−6λ)2 10−6λ (10−6λ)2
{(361λ2 − 912λ + 576)} − [(120 − 95λ)(10 − 6λ)] + [4λ2 + 48λ + 144] + 20[(10 − 6λ)(10 − 6λ)]
=0
(10 − 6λ)2
507λ2 −1690λ+1232
(10−6λ)2
= 0
Cross multiplying, we obtain
507λ2 − 1690λ + 1232 = 0
−𝑏±√𝑏2 −4𝑎𝑐
Applying this almighty formula of λ = 2𝑎
1690±598 2288 1092
λ = 1014 = 1014 𝑜𝑟 = 𝟐. 𝟐𝟔 𝒐𝒓 𝟏. 𝟎𝟖
1014
∴ 𝑾𝒉𝒆𝒏 λ = 2.26
12+2(2.26)
𝑦 = 10−6(2.26) = −4.64
24−19(2.26)
𝑥= = 5.32
10−6(2.26)
When 𝛌 = 𝟏. 𝟎𝟖
12+2(1.08)
𝑦 = 10−6(1.08) = 4.02
24−19(1.08)
𝑥= = 0.99
10−6(1.08)
From Eqn. (2)
When 𝑥 = 5.32 & 𝑦 = −4.64, 𝑡ℎ𝑒𝑛 λ = 2.26
𝜇 = 2𝑦λ − 2y = (2 × −4.64 × 2.26) − 2(−4.64)
𝜇 = −20.97 + 9.28
∴ 𝝁 = −𝟏𝟏. 𝟔𝟗
𝑩𝒖𝒕 𝑾𝒉𝒆𝒏 𝒙 = 𝟎. 𝟗𝟗; 𝒚 = 𝟒. 𝟎𝟐 & λ = 1.08
𝜇 = (2 × 4.02 × 1.08) − 2(−4.64)
∴ 𝝁 = 𝟎. 𝟔𝟒.
Therefore, solutions are as follows
(𝒙, 𝒚) = (𝟓. 𝟑𝟐, −𝟒. 𝟔𝟒)
For λ = 2.26 , μ = −11.69
(𝒙, 𝒚) = (𝟎. 𝟗𝟗, 𝟒. 𝟎𝟐)
For λ = 1.08 , μ = 0.64

Exercises
No 1:
Find the optimal solution to the problem
𝑀𝑖𝑛 𝑓(𝑥) = 𝑥12 + 2𝑥22 + 10𝑥32
Subject to
𝑔1 (𝑥) = 𝑥1 + 𝑥22 + 𝑥3 − 5 = 0
𝑔2 (𝑥) = 𝑥1 + 5𝑥2 + 𝑥3 − 7 = 0
Suppose that 𝑔1 (𝑥) = 0.01 & 𝑔2 (𝑥) = 0.02. Find the corresponding change in the optimal value of f(x)

No2: An electrical company manufactures two pole models, each on separate production line. The daily
capacity of the first line is 60 poles and that of the second is 75 poles. Each unit of the first model uses 10
pieces of materials, whereas each unit of the second model requires 8 pieces of the same materials. The
maximum daily availability of the special material is 800 pieces. The profit per unit of models 1 and 2 is
N30 and N20, respectively. Determine the optimum daily production of each model.

No 3: Find the minimum value of the function


min 𝑓(𝑥, 𝑦) = 𝑥 2 + 𝑦 2
Subject to

31
𝑔(𝑥, 𝑦) = (𝑥 − 8)2 + (𝑦 − 6)2 = 25
𝜇(𝑥, 𝑦) = 2𝑥 + 𝑦 ≥ 12

32

You might also like