Convex Optimization Concepts Explained
Convex Optimization Concepts Explained
Convexity
Definition. f : Rn → R is convex if and only if
ANDREW TULLOCH f ((1 − τ )x + τ y) ≤ (1 − τ )f (x) + τ f (y) (2.1)
for all x, y ∈Rn , τ ∈ (0, 1).
A set C ⊆ Rn is convex if and only if δC is convex if and only if
(1 − τ )x + τ y ∈ C for all x, y ∈ C, τ ∈ (0, 1).
f : Rn → R is strictly convex if and only if f is convex and the in-
1. Existence
equality is strict for all x 6= y and τ ∈ (0, 1).
Definition. For C ⊆ Rn , define δC as Pm
( Definition. Let x0 , . . .P, xm ∈ Rn and λ0 , . . . , λm ≥ 0, i=0 λi = 1.
m
0 x∈C The linear combination i=0 λi xi is a convex combination of the points
δC (x) = (1.1)
∞ x∈/C x0 , . . . , xm .
Note x0 minimizes f over C if and only if x0 minimizes f + δC over Theorem. (i) f : Rn → R is convex if and only if
Rn . Xm m
X
f( λ i xi ) ≤ λi f (xi ) (2.2)
Definition. (i) dom f = {x ∈ Rn |f (x) < ∞}, i=0 i=0
(ii) for all m ≥ 0, xi ∈ Rn , λ 0, m
P
( i ≥ λi = 1.
i=0
∅ f ≡∞ (ii) C ⊆ Rn is convex if and only if C contains all convex combinations
arg min f = (1.2)
{x ∈ Rn |f (x) = inf f } f <∞ of its elements.
(iii) f is proper if and only if dom f 6= ∅ and f (x) > −∞ for all Theorem. f : Rn → R is convex implies dom f is convex.
x ∈ Rn .
Theorem. (i) f : Rn → R is convex if and only if epi f is convex in
Definition. For f : Rn → R, denote Rn × R.
(ii) f : Rn → R is strictly convex if and only if {(x, α) ∈ Rn ×R|f (x) <
epi f = {(x, α) ∈ Rn × R|f (x) ≤ α} (1.3)
α} is convex in Rn × R.
Theorem. A set C is an epigraph if and only if for every x there is an (iii) f : Rn → R is convex implies lev≤α f is convex for all α ∈ R.
α ∈ R such that C ∩ (x × R) = [α, ∞] - so all vertical one-dimensional
Theorem. Assume f : Rn → R is convex. Then
sections must be closed upper half-lines. If f is proper then epi f is not
empty and does not include a complete vertical line. (i) arg min f is convex.
(ii) x is a local minimizer of f implies x is a global minimize of f .
Definition. For f : Rn → R, define (iii) f is strictly convex and proper implies f has at most one global
lim inf f (x) = lim inf f (x) = lim inf f (x) (1.4) minimizer.
x→x0 δ↓0 kx−x0 k2 ≤δ k→∞ kx−x0 k2 ≤ 1
k
Theorem. Let I be an arbitrary index set. Then
f is lower semicontinuous at x0 if and only if f (x0 ) ≤ lim inf x→x0 f (x). (i) fi , i ∈ I convex implies supi∈I fi (x) is convex.
f is lower semicontinuous if f is lower semicontinuous at every x0 ∈ Rn . (ii) fi , i ∈ I strictly convex, I finite implies supi∈I fi (x) is strictly
Theorem. convex.
(iii) Ci , i ∈ I is convex implies ∩i∈I Ci is convex.
lim inf f (x) = min{α ∈ R|∃(xk ) → x0 : f (xk ) → α} (1.5) (iv) fk , k ∈ N is convex implies lim supk→∞ fk (x) is convex.
x→x0
In particular, f is lower semi-continuous at x0 if and only if f (x0 ) ≤ Theorem. Assume C ⊆ Rn is open and convex, and f : C → R is
lim inf k→∞ f (xk ) for all convergence sequences xk → x0 . differentiable. Then the following are equivalent:
(i) f is [strictly] convex
Theorem. Let f : Rn → R. Then the following are equivalent: (ii) hy − x, ∇ f (y) − ∇ f (x)i ≥ 0 for all x, y ∈ C [and > 0 if x 6= y]
(i) f is lsc on Rn (iii) f (x) + hy − x, ∇ f (x)i ≤ f (y) for all x, y ∈ C [and < f (y) if x 6= y]
(ii) epi f is closed in Rn × R (iv) If f is additionally twice differentiable, then ∇2 f (x) is positive
(iii) The sub level sets lev≤α f = {x ∈ Rn |f (x) ≤ α} are closed in Rn semidefinite for all x ∈ C.
for all α ∈ R.
Proof. Reduce to one-dimensional sections g(t) = f (x + t(x − y)). Take
g(y)−g(x)
Proof. epi f can only be not closed along vertical lines. g(x + h), use convexity to show g 0 (x) ≤ y−x
.
Definition. f : Rn → R is level bounded if and only if lev≤α f is bounded Use g(x) = supy∈dom g g(y) + (y − x)g 0 (y) to show the supremum over
for all α ∈ R. convex functions is convex.
Theorem. f : Rn → R is level-bounded if and only if f (xk ) → ∞ for all Theorem. (i) Assume f1 , . . . , fm : Rn → R are convex, λ1 , . . . , λm ≥
0. Then f = m
P
sequences (xk ) satisfying kxk k2 → ∞. i=1 i fi is convex. If at least one of the fi with
λ
λi > 0 is strictly convex, then f is strictly convex.
Proof. K(α) such that xk ∈ / lev≤α f for k ≥ K(α) by boundedness. In (ii) Assume fi : Rni → R are convex. Then
reverse, find α with lev≤α f unbounded and choose xk in this set with m
X
f (xk ) ≤ α. f : Rn1 × . . . Rnm → Rf (x1 , . . . , xm ) = fi (xi ) (2.3)
i=1
Theorem. Assume f : Rn
→ R is lsc, level-bounded, and proper. Then
is convex. If all fi are strictly convex, then f is strictly convex.
inf f (x) ∈ (−∞, +∞) and arg min f is nonempty and compact.
(iii) If f : Rm → R is convex, A ∈ Rm×n , b ∈ Rm . Then g(x) =
Proof. Consider ∩α∈R,a>∈f = arg min f , then this is countable intersec- f (Ax + b) is convex.
tion of nonempty, compact sets, so intersection is nonempty. Theorem. (i) C1 , . . . , Cm convex implies C1 × · · · × Cm is convex.
Finiteness of inf is shown as for any x ∈ arg min f 6= ∅, must have (ii) C ⊆ Rn convex, A ∈ Rm×n , b ∈ Rm , L(x) = Ax + b implies L(C)
f (x) = −∞ contradicting properness. is convex.
Theorem. (i) f, g lsc, proper implies f + g is lsc. (iii) C ⊆ Rm convex, A ∈ Rm×n , b ∈ Rm , L(x) = Ax + b implies
(ii) f lsc, λ ≥ 0 implies λf is lsc L−1 (C) is convex.
(iii) f : Rn → R is lsc and g : Rm → Rn is continuous implies f ◦ g is (iv) C1 , C2 is convex implies C1 + C2 is convex.
lsc. (v) C convex, λ ∈ R implies λC is convex.
1
CONVEX OPTIMIZATION SUMMARY 2
Proof. x = m
P Pm
Pai x ∈ K, which is a rint C = {x ∈ Rn |∃ open neighborhood N of x with N ∩ aff C ⊆ C}.
i=0 αi xi ∈ K, ai ≥ 0 ⇐⇒ i=0 α j
j (4.6)
convex combination.
Theorem. (i) Assume f : Rn → R is convex. Then
Theorem. Assume K is a convex cone. Then K is pointed if and only if
K ∩ −K = {0}. g(x) = f (x + y) ⇒ ∂g(x) = ∂f (x + y) (4.7)
g(x) = f (λx) ⇒ ∂g(x) = λ∂f (λx), λ 6= 0 (4.8)
Proof. x1 +x2 +· · ·+xm ∈ K, x1 6= 0, x2 +· · ·+xm = −x1 , x2 +· · ·+xm ∈
K, so x1 ∈ K ∩ −K. g(x) = λf (x) ⇒ ∂g(x) = λ∂f (x), λ > 0 (4.9)
(4.10)
Theorem. For a closed convex cone K ⊆ Rn we define the generalized
inequality (ii) Assume f : Rn → R is proper and convex, and A ∈ Rn×m is such
that
x ≤K y ⇐⇒ x − y ∈ K (3.1)
{Ay|y ∈ Rm } ∩ rint dom f (4.11)
Then If x ∈ dom(f ◦ A) = {y ∈ Rm |Ay ∈ dom f }, then ∂(f ◦ A)(x) =
(i) x ≤K x AT ∂f (Ax).
(ii) x ≤K y, y ≤K z ⇒ x ≤K z (iii) Assume f0 , . . . , fm : Rn → R are proper and convex,
Pand rint dom f0 ∩
(iii) x ≤K y ⇒ −y ≤K −x ·P· · ∩ rint dom fm 6= ∅. If x ∈ dom f , then ∂( m i=0 fi )(x) =
m
(iv) x ≤K y, λ ≥ 0 ⇒ λx ≥K λy i=0 ∂fi (x).
(v) x ≥K y, x0 ≥K y 0 ⇒ x + x0 ≥K y + y 0
5. Conjugate Functions
(vi) xk → x, y k → y with xk ≥K y K for all k ∈ N, then x ≥K y.
(vii) x ≥K y, y ≥K ⇒ x = y (antisymmetry) holds if and only if K is Definition. For f : Rn → R, con f (x) = supg ≤ f , g convex is the convex
pointed. hull of f . con f is the greatest convex function majorized by f .
Definition (Kn LP ). For any pointed, closed, convex cone K ⊆ Rm , a Definition. For f : Rn → R, the lower closure cl f is defined as (cl f )(x) =
matrix A ∈ R m×n , and vectors c ∈ Rn , b ∈ Rm , define the conic problem lim inf y→x f (y). Alternatively, (cl f )(x) = supg ≤ f , g lsc g(x).
inf x cT [Link] ≥K b.
Theorem. For f : Rn → R, we have epi(cl f ) = cl(epi f ). Moreover, if f
SOCP ).
Definition (Kn is convex then cl f is convex.
q Proof. (x, α) ∈ cl(epi f ) ⇐⇒ ∃xk → x, αk → α, f (xk ≤ αk ) ⇐⇒
SOCP
Kn = {x ∈ Rn }|xn ≥ x21 + · · · + x2n−1 . (3.2) lim inf y→x f (y) ≤ α ⇐⇒ (x, α) ∈ epi(cl f ).
CONVEX OPTIMIZATION SUMMARY 3
Theorem. Assume C ⊆ Rn is closed and convex. Then For cones, show g is positively homogeneous lsc convex proper indicator
\ function ⇐⇒ g = δK for Ka convex closed cone.
C= Hb,β (5.1) ? is an indicator function,
Taking any such indicator function δK , then δK
(b,β),C⊆Hb,β ? ?
with δK (v) < ∞ ⇐⇒ v ∈ K .
where Bb,β = {x ∈ Rn |hx, bi − β ≤ 0}
Definition. For any set S ⊆ Rn define the support function suppS (v) =
Proof. Separating hyperplane theorem - x ∈ C then x is the intersection. ? )(v)
supx∈S hv, xi = (δS
If x ∈
/ C consider the separating hyperplane of C and {x}.
Definition. A function f : Rn → R is said to be positively homogeneous
Theorem. Assume f : Rn → R is proper, lsc, and convex. Then f (x) = if and only if 0 ∈ dom f and f (λx) = λf (x) for all x ∈ Rn and λ > 0.
supg affine, f ≤ f g(x).
Theorem. The set of positively homogeneous proper lsc convex functions
Proof. f lsc implies epi f is closed, f is convex implies epi f is convex, so and the set of closed convex nonempty sets are in one-to-one correspon-
epi f is the intersection of all half spaces containing it. Show g affine ⇐⇒ dence through the Legendre-Fenchel transform:
there exists (b, c), β with c < −, epi g = H(b,c),β .
δC ↔? supp (5.11)
Then as epi(supg≤f g(x)) = ∩g≤f epi g where g is affine and g ≤ C
f ⇐⇒ epi f ⊆ epi g, we need only show I1 = ∩(b,c),β∈S H(b,c),β = x ∈ ∂ supp(v) ⇐⇒ x ∈ C (5.12)
∩(b,c),β∈S,c<0 H(b,c),β = I2 . C
supp(v) = hv, xi ⇐⇒ v ∈ NC (x). (5.13)
Definition. Let f : Rn → R, then C
Proof. f (x) + f ? (v) = hv, xi iff x ∈ arg maxx0 hv, x0 i − f (x0 ) ⇐⇒ v ∈ inf φ(x) < ∞ ⇐⇒ 0 ∈ dom p (6.6)
x
∂f (x). sup ψ(y) > −∞ ⇐⇒ 0 ∈ dom q (6.7)
y
Theorem. For a proper, lsc, convex f : Rn → R, we have
Proof. f proper lsc convex implies f ? is proper lsc convex implies π, ψ lsc
(f (·) − ha, ·i)? = f ? (· + a) (5.6) convex.
(f (· + b))? = f ? (·) − h·, bi, (5.7) p, q are convex from the strict epigraph set, with E = {(u, a) ∈ Rm ×
? ?
(f (·) + c) = f (·) − c (5.8) R|p(u) = inf x∈Rn f (x, u) < α} = A(E 0 ), where A is the linear projection
· mapping A(x, u, α) = (u, α), and E 0 is the strict epigraph of f and thus
(λf (·))? = λf star ( ), λ > 0 (5.9) convex, so A(E 0 ) is convex.
λ
· p? (y) = −ψ(y), so p?? (0) = supy ψ(y) .
(λf ( ))? = λf ? (·), λ > 0 (5.10) 0 ∈ dom p ⇐⇒ p(0) < ∞ ⇐⇒ inf x f (x, 0) < ∞ ⇐⇒ inf ψ < ∞.
λ
Theorem. Let fi : Rni → R, i = 0, . .P . , m be proper and f (x0 , . . . , xm ) = Theorem. Assume f : Rn × Rm → R is proper, lsc, and convex. Then
Pm ? m ?
i=0 fi (xi ). Then f (v1 , . . . , vm ) = i=0 fi (vi ).
weak duality always holds,
Proof. For support functions, f ? (x) = σC (x), so C nonempty closed con- inf φ(x) ≥ sup ψ(y), (6.8)
x y
vex implies f ? is proper lsc convex with f ? (0) = 0 and f ? (λv) = λf ? (v),
so f ? = σC is positively homogeneous. and under certain conditions the infimum and supremum are equal and
If g is positively homogeneous lower semicontinuous, g ? (x) ∈ {0, ∞}, finite - strong duality
so g ? is an indicator function, which must be convex and nonempty by p(0) ∈ R, p lsc in 0 ⇐⇒ inf φ(x) = sup ψ(y) ∈ R. (6.9)
x y
properness lsc convexity of g ? .
One to one follows from δC ?? = δ .
C The difference inf φ − sup ψ is the duality gap.
CONVEX OPTIMIZATION SUMMARY 4
Proof. inf x φ(x) = p(0) ≥ p?? (0) = supy ψ(y) so must show p(0) ∈ R and Proof. Consider g(x, y, u) = f (x, u) − hy, ui, which is proper lsc convex.
p lsc in 0 if and only if p(0) = p?? (0) ∈ R. Then l(·, y) = inf u g(·, y, u) is [Link] fx (y) = f (x, y), then −l(x, ·) =
(⇒) follows as p?? (0) ≤ cl p(0) ≤ p(0), so lim inf y→0 p(y) = cl p(0) = fx? (·) so fx is either +∞ or proper lsc convex, and −l(x, ·) is either −∞ or
p(0) ∈ R, so p is lsc in 0. proper lsc convex, but always lsc convex.
(⇐) follows from the claim that if it holds then cl p is proper lsc con- By subgradient definition, (v, y) ∈ ∂f (x, u) ⇐⇒ inf u0 f (x0 , u0 ) −
vex. Convexity, lsc is clear, and an improper convex lsc function is always hy, u0 i ≥ f (x, u) − hy, ui + hv, x0 − xi∀x0 , which evaluated at x0 = x implies
constant ∞ or −∞, contradiction p(0) ∈ R. So (p? )? (0) = ((cl p)? )? (0) = inf u0 f (x, u0 ) − hy, u0 i = f (x, u) − hy, ui.
cl p(0) = p(0). Continuing, we obtain (v, y) ∈ ∂f (x, u) ⇐⇒ y ∈ ∂u f (x, u), ∈ ∂x l(x, y),
and the first condition is equivalent to u ∈ ∂fx? (y), or u ∈ ∂(−l(x, ·))?? =
Theorem. Assume f : Rn × Rm → R is proper, lsc, and convex. Then ∂y (−l)(x, y) as required.
we have the primal-dual optimality conditions,
Definition. For any function l : Rn × Rm → R we say that (x0 , y 0 ) is a
(0, y 0 ) ∈ ∂f (x0 , 0) (6.10) saddle point of l if l(x, y 0 ) ≥ l(x0 , y 0 ) ≥ l(x0 , y) for all x, y. The set of all
⇐⇒ {x0 ∈ arg min φ(x), y 0 ∈ arg max ψ(y), inf ψ(x) = sup ψ(y)} saddle points is denoted by sp l.
x y x y Evquivalently, (x0 , y 0 ) ∈ sp l if inf x l(x, y 0 ) = l(x0 , y 0 ) = supy l(x0 , y).
(6.11)
Theorem. Assume f is proper, lsc, and convex with associated Lagrangian
⇐⇒ (x0 , 0) ∈ ∂f ? (0, y 0 ). (6.12) l. Then φ(x) = supy l(x, y), and ψ(y) = inf x l(x, y), and the primal prob-
The set of primal-dual optimal points (x0 , y 0 ) satisfying this equa- lem is inf x φ(x) = inf x supy l(x, y), and the dual problem is supy ψ(y) =
tion is either empty or equal to (arg min φ) × (arg max ψ). supy inf x l(x, y). Moreover, the optimality condition
{x0 ∈ arg min φ(x), y 0 ∈ arg max ψ(y), inf φ(x) = sup ψ(y)} (6.20)
Proof. Follows from invertibility of subgradient in terms of conjugate func- x y x y
tions, showing f (x0 , 0) = −f ? (0, y 0 ) ⇐⇒ φ(x0 ) = ψ(y 0 ) ∈ R, and equality ⇐⇒ (x0 , y 0 ) ∈ sp l (6.21)
with infinite value is explicitly excluded by definition of arg min. 0 0 0 0
⇐⇒ {0 ∈ ∂x l(x , y ), 0 ∈ ∂y (−l)(x , y )} (6.22)
Theorem. Assume f : Rn × Rm → R is proper, lsc, and convex. Then
Theorem. Assume X ⊆ Rn , Y ⊆ Rm are nonempty, closed, convex, and
(i) 0 ∈ int dom p or 0 ∈ int dom q implies inf x φ(x) = supy ψ(y). (S 0 ) L : X × Y → R is a continuous function with L(·, y) is convex for every y
R
(ii) 0 ∈ int dom p and 0 int dom q implies inf x φ(x) = supy ψ(y) ∈ R. and −L(x, ·) convex for every x. Then l(x, y) = L(x, y) + δX (x) − δY (y)
(S) R with the convention +∞ − ∞ = +∞ on the right, is the Lagrangian to
(iii) 0 ∈ dom p and inf x φ(x) ∈ R if and only if arg maxy ψ(y) is f (x, u) = supy l(x, y) + hu, yi = (−l(x, ·))? (u).
nonempty
R and bounded. (P ) f is proper, lsc, and convex, so the previous result applies with primal
(iv) 0 ∈ dom q and supy ψ(y) ∈ R if and only if arg minx φ(x) is and dual problems φ(x) = δX (x) + supy∈Y L(x, y), ψ(y) = −δY (y) +
R
nonempty and bounded. (D)
x∈X L(x, y). Moreover, if X and Y are bounded, then sp l is nonempty
In particular, if any of S, P, D hold, then strong duality holds - inf φ = and bounded.
sup ψ ∈ R. If S, or (P and D) hold, then there exists x0 , y 0 satisfying the Proof. For a fixed y, inf x l(x, y) = −f ? (0, y) = ψ(y), and for a fixed x,
primal-dual optimality conditions. Also, P implies ∂p(0) = arg maxy ψ(y), supy l(x, y) = f (x, 0) = fx?? (0) = f (x, 0) = φ(x) as fx is either proper lsc
and D implies ∂q(0) = arg minx φ(x). convex or +∞.
The optimality condition is equivalent to (0, y 0 ) ∈ f (x0 , 0) ⇐⇒ 0 ∈
Proof. (i): If p(0) = −∞, then p?? (0) ≤ p(0) = −∞ Use cl p = p on
R ∂x l(x0 , y 0 ), 0 ∈ ∂y (−l)(x0 , y 0 ), which is the saddle point condition.
dom p, so sup ψ = inf ψ = −∞. Otherwise, p(0) ∈ R so as cl p = p on
dom p, we have p is lsc in 0 and so p?? (0) = p(0) ∈ R.
R
7. Numerical Optimality
This followsRsymmetrically 0 ?
R on f (x, y) = f (y, x).
If both 0 ∈ dom p, 0 ∈ dom q, then +∞ > p(0) ≥ p?? (0) = sup ψ = Definition. For φ : Rn → R, a point x is an -optimal solution if φ(x) −
−q(0) > −∞, which is finite. inf φ ≤ .
R
Nonemptyness and boundedness follows from 0 ∈ dom p if and only Definition. Assume (xk , y k ) is a primal-dual feasible pair - so xk ∈
if ψ is proper lsc convex and level bounded. dom φ and y k ∈ dom ψ. Then φ(xk ) ≥ ψ(y k ), and 0 ≤ φ(xk ) − inf φ ≤
Subdifferential: if (iii) then p(0) ∈ R, so cl p(0) = p(0) ∈ R, cl p is then k
φ(xk ) − ψ y = γ(xk , y k ) := γ. γ is the numerical primal-dual gap. If
proper and lsc convex, so ∂(cl p)(0) = arg max ψ, but ∂p(0) = ∂(cl p)(0).
γ < then xk is an -optimal solution with optimality certificate y k .
φ(xk )−ψ(y k )
The normalized gap is γ = γ(xk , y k ) = ψ(y k )
.
Theorem. Assume k : Rn → R and h : Rm → R are both proper, lsc, P np
convex, and A ∈ Rm×n , b ∈ Rm , c ∈ Rn . For f (x, u) = hc, xi + k(x) + Definition. Assume φ(x) = φ0 (x) =
P nd i=1 δgi (x)≤0 , ψ(y) = ψ0 (y) −
δ where dom φ = dom ψ = n and g : Rn → R, h : Rm →
h(Ax − b + u), the primal and dual problems are of the form i=1 hi (x)≤0 0 0 R i i
R are suitable continuous real-valued convex functions, so the primal and
inf φ(x), φ(x) = hc, xi + k(x) + h(Ax − b) (6.13) dual constraints are of the form gi (x) ≤ 0, hi (y) ≤ 0. Then the primal
? ?
sup ψ(y), ψ(y) = −hb, yi − h (y) − k (−A y − c) T
(6.14) and dual infeasibilities are defined as ηp = max{0, g1 (xk ), . . . , gnp (xk )}
y
and ηd = max{0, h1 (y k ), . . . , hnd (y k )}.
with
Z Z 8. First-Order Methods
dom p = (dom h − A dom k) + b (6.15) Definition. For f : Rn → R, we define
Z Z (i) The forward step, Fτk f (xk ) = (I − τk ∂f )xk
dom q = (dom k? − (−AT ) dom h? ) + c (6.16) (ii) The backward step, Bτk f (xk ) = (I + τk ∂f )−1 xk .
and optimality conditions Theorem. If f : Rn → R is proper lsc convex with τ > 0, then the
backward step is Bτ f (x) = arg miny { 21 ky − xk22 + τ f (y)} and is therefore
{−AT y 0 − c ∈ ∂k(x0 ), y 0 ∈ ∂h(Ax0 − b)} (6.17) unique.
⇐⇒ {x0 ∈ arg min φ(x), y 0 ∈ arg max ψ(y), inf φ(x) = sup ψ(y)} (6.18) Proof. y ∈ Bτ f (x) ⇐⇒ 0 ∈ y − x + τ ∂f (y) ⇐⇒ y ∈ arg miny0 { 12 ky 0 −
x y x y
xk22 + τ f (y 0 )}
⇐⇒ {Ax0 − b ∈ ∂h? (y 0 ), x0 ∈ ∂k? (−AT y 0 − c)} (6.19)
Theorem. Assume f is proper lsc convex and arg min f 6= ∅. The for-
Theorem. Assume f : Rn ×Rm → R is proper, lsc, convex. Define the as- ward step is xk+1 ∈ Fτk f (xk ). The sequence is not unique, can get stuck
sociated Lagrangian as l(x, y) = −f (x, ·)? (y), so l(x, y) = inf u (f (x, u) − if xk is infeasible.
hy, ui). Then l(·, y) is convex for every y, −l(x, ·) is lsc and convex for The backward step is xk+1 = Bτk f (xk ) - which is a unique sequence,
every x, y and f (x, ·) = (−l(x, ·))? , and (v, y) ∈ ∂f (x, u) ⇐⇒ v ∈ and cannot get stuck. Sub-steps are as hard as the original problem (but
∂x l(x, y) and u ∈ ∂y (−l)(x, y). strictly convex).
CONVEX OPTIMIZATION SUMMARY 5
(i) Forward stepping: xk+1 ∈ Fτk f (xk ): Proof. If we linearize ∇ F (Axk+1 −b) = ∇ F (Axk −b+A∆x) ≈ ∇ F (Axk −
(ii) Backward stepping: xk+1 = Bτk f (xk ). b)+∇2 F (Axk −b)A∆x, and solve the constraints AT (y) = c, ty+∇ F (Ax−
If f = g + h, ∂f = ∂g + ∂h with f, g, h proper lsc convex, arg min f 6= ∅, b) = 0.
we can do: Theorem. Assume 0 < ρ ≤ κ < 10 1
, tk > 0 fixed, and z k = (xk , y k )
(i) Backward-Backward stepping: xk+1 = Bτk h Bτk g (xk ). strictly feasible, so Ax −b ∈ dom F , y ∈ dom f , such that dist(z k , z(tk )) <
k k
References