
Convex Optimization

(EE227A: UC Berkeley)

Lecture 6
(Conic optimization)
07 Feb, 2013

Suvrit Sra
Organizational Info
• Quiz coming up on 19th Feb.
• Project teams due by 19th Feb.
• Good if you can mix your research with class projects
• More info in a few days

Mini Challenge
Kummer's confluent hypergeometric function

$$M(a, c, x) := \sum_{j \ge 0} \frac{(a)_j}{(c)_j} \frac{x^j}{j!}, \qquad a, c, x \in \mathbb{R},$$

where $(a)_0 = 1$ and $(a)_j = a(a+1)\cdots(a+j-1)$ is the rising factorial.

Claim: Let $c > a > 0$ and $x \ge 0$. Then the function

$$h_{a,c}(\mu; x) := \mu \mapsto \frac{\Gamma(a+\mu)}{\Gamma(c+\mu)} \, M(a+\mu, c+\mu, x)$$

is strictly log-convex on $[0, \infty)$ (note that $h$ is a function of $\mu$).

Recall: $\Gamma(x) := \int_0^\infty t^{x-1} e^{-t} \, dt$ is the Gamma function (which is known to be log-convex for $x \ge 1$; see also Exercise 3.52 of BV).
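
A quick numerical sanity check of the claim can be sketched as follows. This is only an illustration, not a proof: the parameter values, the truncation at 200 series terms, and the helper names (`kummer_m`, `log_h`, `midpoint_gap`) are all assumptions made for this example. It tests midpoint convexity of $\log h$ at a few values of $\mu$.

```python
import math

def kummer_m(a, c, x, terms=200):
    """Partial sum of Kummer's series M(a, c, x) = sum_j (a)_j x^j / ((c)_j j!)."""
    total, term = 0.0, 1.0          # term holds (a)_j x^j / ((c)_j j!), starting at j = 0
    for j in range(terms):
        total += term
        term *= (a + j) / (c + j) * x / (j + 1)
    return total

def log_h(mu, a, c, x):
    """log of h_{a,c}(mu; x) = Gamma(a+mu) / Gamma(c+mu) * M(a+mu, c+mu, x)."""
    return (math.lgamma(a + mu) - math.lgamma(c + mu)
            + math.log(kummer_m(a + mu, c + mu, x)))

def midpoint_gap(mu, d, a, c, x):
    """Average of log h at mu-d and mu+d, minus log h at mu (positive if strictly log-convex)."""
    return 0.5 * (log_h(mu - d, a, c, x) + log_h(mu + d, a, c, x)) - log_h(mu, a, c, x)

a, c, x = 1.0, 2.5, 2.0   # illustrative values with c > a > 0 and x >= 0
gaps = [midpoint_gap(mu, 0.5, a, c, x) for mu in (0.5, 1.0, 2.0, 5.0)]
print(gaps)
```

Positive gaps are consistent with strict log-convexity at the sampled points; they say nothing about other parameter choices.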

LP formulation
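Write $\min \|Ax - b\|_1$ as a linear program.

$$\min_{x \in \mathbb{R}^n} \|Ax - b\|_1$$

$$\min_{x} \sum_i |a_i^T x - b_i|$$

$$\min_{x,t} \sum_i t_i \quad \text{s.t.} \quad |a_i^T x - b_i| \le t_i, \quad i = 1, \dots, m.$$

$$\min_{x,t} \mathbf{1}^T t \quad \text{s.t.} \quad -t_i \le a_i^T x - b_i \le t_i, \quad i = 1, \dots, m.$$

Exercise: Recast $\|Ax - b\|_2^2 + \lambda \|Bx\|_1$ as a QP.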
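
The last reformulation can be written out as explicit LP data. The sketch below assembles `cost`, `G`, `h` for the stacked variable $z = (x, t)$ in the form $\min \text{cost}^T z$ s.t. $Gz \le h$, and checks that for a fixed $x$ the tightest feasible $t$ gives exactly $\|Ax - b\|_1$. The function name `l1_as_lp` and the data are assumptions chosen for illustration; no LP solver is called.

```python
def l1_as_lp(A, b):
    """Return (cost, G, h) for  min cost^T z  s.t.  G z <= h,  with z = (x, t)."""
    m, n = len(A), len(A[0])
    cost = [0.0] * n + [1.0] * m                        # objective 1^T t
    G, h = [], []
    for i in range(m):
        e = [1.0 if k == i else 0.0 for k in range(m)]
        G.append(list(A[i]) + [-v for v in e])          #  a_i^T x - t_i <=  b_i
        h.append(b[i])
        G.append([-v for v in A[i]] + [-v for v in e])  # -a_i^T x - t_i <= -b_i
        h.append(-b[i])
    return cost, G, h

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

A = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # illustrative data
b = [1.0, 2.0, -1.0]
x = [0.5, -1.0]

cost, G, h = l1_as_lp(A, b)
residual = [dot(row, x) - bi for row, bi in zip(A, b)]
t = [abs(r) for r in residual]               # tightest feasible t for this x
z = x + t
feasible = all(dot(row, z) <= hi + 1e-12 for row, hi in zip(G, h))
objective = dot(cost, z)
l1_norm = sum(abs(r) for r in residual)
print(feasible, objective, l1_norm)
```

Any LP solver that accepts inequality-form data could be given `cost`, `G`, `h` directly.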

Cone programs – overview
• Last time we briefly saw LP, QP, SOCP, SDP

LP (standard form)
$$\min f^T x \quad \text{s.t.} \quad Ax = b, \ x \ge 0.$$
Feasible set $X = \{x \mid Ax = b\} \cap \mathbb{R}^n_+$ (nonnegative orthant)
Input data: $(A, b, f)$
Structural constraints: $x \ge 0$.

How should we generalize this model?

Cone programs – overview
• Replace linear map $x \mapsto Ax$ by a nonlinear map?
• Quickly becomes nonconvex, potentially intractable

Generalize structural constraint $\mathbb{R}^n_+$

♣ Replace nonnegative orthant by a convex cone $K$;
♣ Replace $\ge$ by conic inequality $\succeq$
♣ Nesterov and Nemirovski developed nice theory in late 80s
♣ Rich class of cones for which cone programs are tractable

Conic inequalities
• We are looking for "good" vector inequalities $\succeq$ on $\mathbb{R}^n$
• Characterized by the set

$$K := \{x \in \mathbb{R}^n \mid x \succeq 0\}$$

of vectors nonnegative w.r.t. $\succeq$:

$$x \succeq y \iff x - y \succeq 0 \iff x - y \in K.$$

• Necessary and sufficient condition for a set $K \subset \mathbb{R}^n$ to define a useful vector inequality $\succeq$ is: it should be a nonempty, pointed cone.

Cone programs – inequalities
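• K is nonempty: $K \ne \emptyset$
• K is closed wrt addition: $x, y \in K \implies x + y \in K$
• K is closed wrt nonnegative scaling: $x \in K, \ \alpha \ge 0 \implies \alpha x \in K$
• K is pointed: $x, -x \in K \implies x = 0$

Cone inequality
$$x \succeq_K y \iff x - y \in K$$
$$x \succ_K y \iff x - y \in \operatorname{int}(K).$$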
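
As a small illustration, the cone inequality reduces to a membership test on the difference $x - y$. The sketch below does this for two cones, the nonnegative orthant and the second-order cone; the function names (`in_orthant`, `in_soc`, `cone_geq`) are hypothetical helpers written for this example.

```python
import math

def in_orthant(v, strict=False):
    """Membership in the nonnegative orthant (its interior if strict)."""
    return all(vi > 0 for vi in v) if strict else all(vi >= 0 for vi in v)

def in_soc(v, strict=False):
    """Membership in the second-order cone {(x, t) : ||x||_2 <= t}; the last entry is t."""
    norm_x, t = math.sqrt(sum(vi * vi for vi in v[:-1])), v[-1]
    return norm_x < t if strict else norm_x <= t

def cone_geq(x, y, in_cone, strict=False):
    """x >=_K y  iff  x - y in K  (x >_K y  iff  x - y in int K when strict)."""
    return in_cone([xi - yi for xi, yi in zip(x, y)], strict)

x, y = [3.0, 1.0, 4.0], [2.0, 1.0, 1.0]            # x - y = (1, 0, 3)
print(cone_geq(x, y, in_orthant))                  # True
print(cone_geq(x, y, in_orthant, strict=True))     # False: one coordinate of x - y is zero
print(cone_geq(x, y, in_soc, strict=True))         # True: ||(1, 0)||_2 = 1 < 3
```

Unlike the usual order on the reals, a conic inequality is only a partial order: for $u = (1, 0)$ and $v = (0, 1)$, neither $u \ge v$ nor $v \ge u$ holds in the orthant.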

Conic inequalities
• Cone underlying standard coordinatewise vector inequalities:

$$x \ge y \iff x_i \ge y_i \iff x_i - y_i \ge 0,$$

is the nonnegative orthant $\mathbb{R}^n_+$.

• Two more important properties that $\mathbb{R}^n_+$ has as a cone:
  - It is closed: $\{x^i\} \subset \mathbb{R}^n_+, \ x^i \to x \implies x \in \mathbb{R}^n_+$
  - It has nonempty interior (contains a Euclidean ball of positive radius)
• We'll require our cones to also satisfy these two properties.

Conic optimization problems
Standard form cone program
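$$\min f^T x \quad \text{s.t.} \quad Ax = b, \ x \in K$$
$$\min f^T x \quad \text{s.t.} \quad Ax \preceq_K b.$$
♣ The nonnegative orthant $\mathbb{R}^n_+$
♣ The second order cone $Q^n := \{(x, t) \in \mathbb{R}^n \mid \|x\|_2 \le t\}$
♣ The semidefinite cone: $S^n_+ := \{X = X^T \succeq 0\}$.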
♣ Other cones K given by Cartesian products of these
♣ These cones are “nice”:
♣ LP, QP, SOCP, SDP: all are cone programs
♣ Can treat them theoretically in a uniform way (roughly)
♣ Not all cones are nice!

Cone programs – tough case
Copositive cone

Def. Let $CP_n := \{A \in S^{n \times n} \mid x^T A x \ge 0, \ \forall x \ge 0\}$.

Exercise: Verify that $CP_n$ is a convex cone.

If someone told you convex is "easy" ... they lied!

• Testing membership in $CP_n$ is co-NP complete.
  (Deciding whether a given matrix is not copositive is NP-complete.)
• Copositive cone programming: NP-Hard
Exercise: Verify that the following matrix is copositive:
$$A := \begin{bmatrix} 1 & -1 & 1 & 1 & -1 \\ -1 & 1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 & 1 \\ 1 & 1 & -1 & 1 & -1 \\ -1 & 1 & 1 & -1 & 1 \end{bmatrix}.$$
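
Before attempting a proof, it can help to probe the matrix numerically. The sketch below samples random nonnegative vectors and evaluates the quadratic form, then evaluates it at one mixed-sign vector to show the matrix is not positive semidefinite. Random sampling is only evidence, not a verification; the sample count, the seed, and the test vector $v_j = \cos(2\pi j/5)$ are choices made for this example.

```python
import math
import random

H = [[ 1, -1,  1,  1, -1],
     [-1,  1, -1,  1,  1],
     [ 1, -1,  1, -1,  1],
     [ 1,  1, -1,  1, -1],
     [-1,  1,  1, -1,  1]]

def quad_form(M, x):
    """Evaluate x^T M x."""
    return sum(M[i][j] * x[i] * x[j] for i in range(len(x)) for j in range(len(x)))

# Sample the nonnegative orthant: every value should be >= 0 if H is copositive.
rng = random.Random(0)
samples = [[rng.random() for _ in range(5)] for _ in range(20000)]
min_on_orthant = min(quad_form(H, x) for x in samples)

# A vector with mixed signs where the quadratic form is negative, so H is not PSD.
v = [math.cos(2 * math.pi * j / 5) for j in range(5)]
value_at_v = quad_form(H, v)
print(min_on_orthant, value_at_v)
```

This illustrates why copositivity is harder than positive semidefiniteness: the sign constraint on $x$ matters, and an eigenvalue test alone cannot decide membership.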

SOCP in conic form

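$$\min f^T x \quad \text{s.t.} \quad \|A_i x + b_i\|_2 \le c_i^T x + d_i, \quad i = 1, \dots, m$$

Let $A_i \in \mathbb{R}^{n_i \times n}$; so $A_i x + b_i \in \mathbb{R}^{n_i}$.

$$K = Q^{n_1} \times Q^{n_2} \times \cdots \times Q^{n_m}, \qquad
A = \begin{bmatrix} -A_1 \\ -c_1^T \\ -A_2 \\ -c_2^T \\ \vdots \\ -A_m \\ -c_m^T \end{bmatrix}, \qquad
b = \begin{bmatrix} b_1 \\ d_1 \\ b_2 \\ d_2 \\ \vdots \\ b_m \\ d_m \end{bmatrix}.$$

SOCP in conic form

$$\min f^T x \quad \text{s.t.} \quad Ax \preceq_K b$$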
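
The stacking can be checked on a toy instance: with $A$ and $b$ assembled as above, the slack $b - Ax$ splits into blocks $(A_i x + b_i, \ c_i^T x + d_i)$, one per cone. The sketch below assumes two small constraints invented for illustration; `stack_socp` and `in_soc` are hypothetical helper names.

```python
import math

def stack_socp(blocks):
    """Stack (A_i, b_i, c_i, d_i) so that b - A x = (A_i x + b_i, c_i^T x + d_i) blockwise."""
    A_rows, b_vec = [], []
    for A_i, b_i, c_i, d_i in blocks:
        A_rows += [[-v for v in row] for row in A_i] + [[-v for v in c_i]]
        b_vec += list(b_i) + [d_i]
    return A_rows, b_vec

def in_soc(v):
    """Membership in {(u, t) : ||u||_2 <= t}; the last entry is t."""
    return math.sqrt(sum(t * t for t in v[:-1])) <= v[-1]

blocks = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [0.0, 0.0], 2.0),   # ||x||_2 <= 2
          ([[1.0, -1.0]], [1.0], [1.0, 1.0], 0.0)]                   # |x1 - x2 + 1| <= x1 + x2
x = [1.0, 0.5]

A, b = stack_socp(blocks)
slack = [bi - sum(a * xi for a, xi in zip(row, x)) for row, bi in zip(A, b)]
pieces = [slack[0:3], slack[3:5]]            # block sizes n_i + 1 = 3 and 2
print(pieces, [in_soc(p) for p in pieces])
```

Each block's cone membership is exactly the corresponding original constraint evaluated at $x$.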

SOCP representation
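Exercise: Let $0 \prec Q = LL^T$, then show that
$$x^T Q x + 2 b^T x + c \le 0 \iff \|L^T x + L^{-1} b\|_2 \le \sqrt{b^T Q^{-1} b - c}$$
(the factor 2 on the linear term is what makes the square complete: $\|L^T x + L^{-1} b\|_2^2 = x^T Q x + 2 b^T x + b^T Q^{-1} b$).

Rotated second-order cone

$$Q^n_r := \{(x, y, z) \in \mathbb{R}^{n+1} \mid \|x\|_2 \le \sqrt{yz}, \ y \ge 0, \ z \ge 0\}.$$

Convert into standard SOC (verify!)

$$\left\| \begin{bmatrix} 2x \\ y - z \end{bmatrix} \right\|_2 \le y + z \iff \|x\|_2 \le \sqrt{yz}, \ y \ge 0, \ z \ge 0.$$

Exercise: Rewrite the constraint $x^T Q x \le t$, where both $x$ and $t$ are variables, using the rotated second order cone.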
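
The conversion between the rotated and the standard second-order cone can be probed numerically before proving it. The sketch below reads the rotated cone as $\|x\|_2^2 \le yz$ with $y, z \ge 0$ and compares the two membership tests on random points; the sampling ranges and the seed are arbitrary choices for this example.

```python
import math
import random

def in_rotated_cone(x, y, z):
    """(x, y, z) in Q_r:  ||x||_2^2 <= y z,  y >= 0,  z >= 0."""
    return y >= 0 and z >= 0 and sum(v * v for v in x) <= y * z

def in_standard_soc(x, y, z):
    """|| (2x, y - z) ||_2 <= y + z."""
    return math.sqrt(4 * sum(v * v for v in x) + (y - z) ** 2) <= y + z

rng = random.Random(1)
trials = [([rng.uniform(-1, 1) for _ in range(3)], rng.uniform(-1, 2), rng.uniform(-1, 2))
          for _ in range(10000)]
agree = sum(in_rotated_cone(x, y, z) == in_standard_soc(x, y, z) for x, y, z in trials)
print(agree, len(trials))
```

The algebra behind the agreement is one line: squaring the left side gives $4\|x\|_2^2 + (y - z)^2 \le (y + z)^2$, which simplifies to $\|x\|_2^2 \le yz$.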

Convex QP as SOCP
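$$\min x^T Q x + c^T x \quad \text{s.t.} \quad Ax = b.$$

$$\min_{x,t} \ c^T x + t \quad \text{s.t.} \quad Ax = b, \ x^T Q x \le t.$$

$$\min_{x,t} \ c^T x + t \quad \text{s.t.} \quad Ax = b, \ (L^T x, t, 1) \in Q^n_r.$$

Since $x^T Q x = x^T L L^T x = \|L^T x\|_2^2$, the constraint $x^T Q x \le t$ reads $\|L^T x\|_2 \le \sqrt{t \cdot 1}$. (Equivalently, in standard SOC form: $\|(2L^T x, \ t - 1)\|_2 \le t + 1$.)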

Convex QCQPs as SOCP
Quadratically Constrained QP

$$\min q_0(x) \quad \text{s.t.} \quad q_i(x) \le 0, \quad i = 1, \dots, m$$

where each $q_i(x) = x^T P_i x + b_i^T x + c_i$ is a convex quadratic.
Exercise: Show how QCQPs can be cast as SOCPs using $Q^n_r$
Hint: See Lecture 5!
Exercise: Explain why we cannot cast SOCPs as QCQPs. That is, why can we not simply use the equivalence

$$\|Ax + b\|_2 \le c^T x + d \iff \|Ax + b\|_2^2 \le (c^T x + d)^2, \ c^T x + d \ge 0.$$

Hint: Look carefully at the inequality!

Robust LP
$$\min c^T x \quad \text{s.t.} \quad a_i^T x \le b_i \quad \forall a_i \in E_i,$$
where $E_i := \{\bar{a}_i + P_i u \mid \|u\|_2 \le 1\}$.
Robust half-space constraint:
• Wish to ensure $a_i^T x \le b_i$ holds irrespective of which $a_i$ we pick from the uncertainty set $E_i$. This happens if $b_i \ge \sup_{a_i \in E_i} a_i^T x$.
$$\sup_{\|u\|_2 \le 1} (\bar{a}_i + P_i u)^T x = \bar{a}_i^T x + \|P_i^T x\|_2.$$
• We used the fact that $\sup_{\|u\|_2 \le 1} u^T v = \|v\|_2$ (recall dual norms)

SOCP formulation
$$\min c^T x \quad \text{s.t.} \quad \bar{a}_i^T x + \|P_i^T x\|_2 \le b_i, \quad i = 1, \dots, m.$$
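
The closed form for the worst case over the ellipsoid can be checked numerically for a single constraint. The sketch below uses made-up values for $\bar{a}_i$, $P_i$ and $x$; it confirms that the supremum is attained at $u^\star = P_i^T x / \|P_i^T x\|_2$ and that random points of the unit sphere never exceed it.

```python
import math
import random

def mat_t_vec(P, x):
    """Return P^T x for P given as a list of rows."""
    return [sum(P[i][j] * x[i] for i in range(len(P))) for j in range(len(P[0]))]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

a_bar = [1.0, -2.0, 0.5]                       # illustrative nominal vector
P = [[0.3, 0.0], [0.1, 0.2], [0.0, 0.4]]       # illustrative 3 x 2 shape matrix
x = [2.0, 1.0, -1.0]

v = mat_t_vec(P, x)                            # P^T x
norm_v = math.sqrt(dot(v, v))
closed_form = dot(a_bar, x) + norm_v           # a_bar^T x + ||P^T x||_2

# The supremum is attained at u* = P^T x / ||P^T x||_2.
u_star = [vi / norm_v for vi in v]
a_worst = [a_bar[i] + dot(P[i], u_star) for i in range(3)]
attained = dot(a_worst, x)

# No sampled member of the uncertainty set should exceed the closed form.
rng = random.Random(2)
worst_sampled = -math.inf
for _ in range(5000):
    g = [rng.gauss(0, 1) for _ in range(2)]
    r = math.sqrt(dot(g, g))
    u = [gi / r for gi in g]                   # random unit vector
    a = [a_bar[i] + dot(P[i], u) for i in range(3)]
    worst_sampled = max(worst_sampled, dot(a, x))
print(closed_form, attained, worst_sampled)
```

Sampling only the boundary $\|u\|_2 = 1$ is enough here because a linear function of $u$ attains its maximum over the ball on the sphere.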

Semidefinite Program (SDP)
Cone program (semidefinite)
$$\min c^T x \quad \text{s.t.} \quad Ax = b, \ x \in K,$$
where K is a product of semidefinite cones.

Standard form
• Think of x as a matrix variable X
• Wlog we may assume $K = S^n_+$ (Why?)
• Say $K = S^{n_1}_+ \times S^{n_2}_+$
• The condition $(X_1, X_2) \in K \iff X := \operatorname{Diag}(X_1, X_2) \in S^{n_1 + n_2}_+$
• Thus, by constraining the off-diagonal blocks to be zero, we reduce to the case where K is the semidefinite cone itself (of suitable dimension).
• So, in matrix notation:
  $c^T x \to \operatorname{Tr}(CX)$;
  $a_i^T x = b_i \to \operatorname{Tr}(A_i X) = b_i$; and
  $x \in K$ as $X \succeq 0$.

SDP
SDP (conic form)

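$$\min_{y \in \mathbb{R}^n} \ c^T y \quad \text{s.t.} \quad A(y) := A_0 + y_1 A_1 + y_2 A_2 + \dots + y_n A_n \succeq 0.$$

Standard form SDP

$$\min \operatorname{Tr}(CX) \quad \text{s.t.} \quad \operatorname{Tr}(A_i X) = b_i, \ i = 1, \dots, m, \quad X \succeq 0.$$

Each form can be converted into the other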

SDP – CVX form

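    cvx_begin
        % use 'variable' (singular): the plural 'variables' cannot take structure keywords
        variable X(n, n) symmetric;
        minimize( trace(C * X) )
        subject to
            for i = 1:m
                trace(A{i} * X) == b(i);   % A is a cell array of symmetric matrices
            end
            X == semidefinite(n);          % X is positive semidefinite
    cvx_end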

Note: remember symmetric and semidefinite

SDP representation – LP
LP as SDP

$$\min f^T x \quad \text{s.t.} \quad Ax \le b.$$

SDP formulation

$$\min f^T x \quad \text{s.t.} \quad A(x) := \operatorname{diag}(b_1 - a_1^T x, \dots, b_m - a_m^T x) \succeq 0.$$

SDP representation – SOCP
SOCP as SDP

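$$\min f^T x \quad \text{s.t.} \quad \|A_i^T x + b_i\| \le c_i^T x + d_i, \quad i = 1, \dots, m.$$

SDP formulation

$$\|x\|_2 \le t \iff \begin{bmatrix} t & x^T \\ x & tI \end{bmatrix} \succeq 0$$

Schur complements: for $C \succ 0$,
$$\begin{bmatrix} A & B^T \\ B & C \end{bmatrix} \succeq 0 \iff A - B^T C^{-1} B \succeq 0.$$

$$\|A_i^T x + b_i\| \le c_i^T x + d_i \iff \begin{bmatrix} c_i^T x + d_i & (A_i^T x + b_i)^T \\ A_i^T x + b_i & (c_i^T x + d_i) I \end{bmatrix} \succeq 0.$$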

SDP / LMI representation

Def. A set $S \subset \mathbb{R}^n$ is called linear matrix inequality (LMI) representable if there exist symmetric matrices $A_0, \dots, A_n$ such that

$$S = \{x \in \mathbb{R}^n \mid A_0 + x_1 A_1 + \dots + x_n A_n \succeq 0\}.$$

S is called SDP representable if it equals the projection of some higher dimensional LMI representable set.

♠ Linear inequalities: $Ax \le b$ iff

$$\begin{bmatrix} b_1 - a_1^T x & & \\ & \ddots & \\ & & b_m - a_m^T x \end{bmatrix} \succeq 0.$$

SDP / LMI representation
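♠ Convex quadratics: $x^T L L^T x + b^T x \le c$ iff

$$\begin{bmatrix} I & L^T x \\ x^T L & c - b^T x \end{bmatrix} \succeq 0$$

♠ Eigenvalue inequalities:
$\lambda_{\max}(X) \le t$ iff $tI - X \succeq 0$
$\lambda_{\min}(X) \ge t$ iff $X - tI \succeq 0$
$\lambda_{\max}$ is convex, $\lambda_{\min}$ is concave.

♠ Matrix norm: $X \in \mathbb{R}^{m \times n}$, $\|X\|_2 \le t$ (i.e., $\sigma_{\max}(X) \le t$) iff

$$\begin{bmatrix} t I_m & X \\ X^T & t I_n \end{bmatrix} \succeq 0.$$

Proof. $t^2 I \succeq X X^T \implies t^2 \ge \lambda_{\max}(X X^T) = \sigma_{\max}^2(X)$.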

SDP / LMI representation
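♠ Sum of top eigenvalues: For $X \in S^n$, $\sum_{i=1}^k \lambda_i(X) \le t$ iff there exist $s \in \mathbb{R}$ and $Z \in S^n$ such that

$$t - ks - \operatorname{Tr}(Z) \ge 0, \qquad Z \succeq 0, \qquad Z - X + sI \succeq 0.$$

Proof: Suppose $\sum_{i=1}^k \lambda_i(X) \le t$. Then, choosing $s = \lambda_k$ and $Z = \operatorname{Diag}(\lambda_1 - s, \dots, \lambda_k - s, 0, \dots, 0)$ (with $\lambda_i = \lambda_i(X)$, and Diag taken in the eigenbasis of $X$), the above LMIs hold.
Conversely, if the above LMIs hold, then (since $Z \succeq 0$)

$$X \preceq Z + sI \implies \sum_{i=1}^k \lambda_i(X) \le \sum_{i=1}^k (\lambda_i(Z) + s) \le \sum_{i=1}^n \lambda_i(Z) + ks \le t \quad \text{(from first ineq.)}.$$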

SDP / LMI Representation
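♠ Nuclear norm: $X \in \mathbb{R}^{m \times n}$; $\|X\|_{tr} := \sum_{i=1}^n \sigma_i(X) \le t$ iff

$$t - ns - \operatorname{Tr}(Z) \ge 0, \qquad Z \succeq 0, \qquad Z - \begin{bmatrix} 0 & X \\ X^T & 0 \end{bmatrix} + s I_{m+n} \succeq 0.$$

Follows from: $\lambda\left( \begin{bmatrix} 0 & X \\ X^T & 0 \end{bmatrix} \right) = (\pm\sigma(X), 0, \dots, 0)$.

Alternatively, we may SDP-represent nuclear norm as

$$\|X\|_{tr} \le t \iff \exists U, V : \begin{bmatrix} U & X \\ X^T & V \end{bmatrix} \succeq 0, \ \operatorname{Tr}(U + V) \le 2t.$$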

Proof is slightly more involved (see lecture notes).

SDP example
Logarithmic Chebyshev approximation

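$$\min_x \ \max_{1 \le i \le m} |\log(a_i^T x) - \log b_i|$$

$$|\log(a_i^T x) - \log b_i| = \log \max(a_i^T x / b_i, \ b_i / a_i^T x)$$

Reformulation

$$\min_{x,t} \ t \quad \text{s.t.} \quad 1/t \le a_i^T x / b_i \le t, \quad i = 1, \dots, m.$$

The upper bounds $a_i^T x / b_i \le t$ are linear; the lower bounds $1/t \le a_i^T x / b_i$ are the LMIs

$$\begin{bmatrix} a_i^T x / b_i & 1 \\ 1 & t \end{bmatrix} \succeq 0, \quad i = 1, \dots, m.$$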

Least-squares SDP
$$\min \|X - Y\|_2^2 \quad \text{s.t.} \quad X \succeq 0.$$
Exercise 1: Try solving using CVX (assume $Y^T = Y$); note $\|\cdot\|_2$ above is the operator 2-norm, not the Frobenius norm.
Exercise 2: Recast as SDP. Hint: Begin with $\min_{X,t} t$ s.t. . . .
Exercise 3: Solve the two questions also with $\|X - Y\|_F^2$
Exercise 4: Verify against analytic solution: $X = U \Lambda_+ U^T$, where $Y = U \Lambda U^T$, and $\Lambda_+ = \operatorname{Diag}(\max(0, \lambda_1), \dots, \max(0, \lambda_n))$.

SDP relaxation
Binary Least-squares
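$$\min \|Ax - b\|^2 \quad \text{s.t.} \quad x_i \in \{-1, +1\}, \quad i = 1, \dots, n.$$
• Fundamental problem (engineering, computer science)
• Nonconvex; $x_i \in \{-1, +1\}$ gives $2^n$ possible solutions
• Very hard in general (even to approximate)

$$\min x^T A^T A x - 2 x^T A^T b + b^T b \quad \text{s.t.} \quad x_i^2 = 1$$

$$\min \operatorname{Tr}(A^T A x x^T) - 2 b^T A x \quad \text{s.t.} \quad x_i^2 = 1$$

$$\min \operatorname{Tr}(A^T A Y) - 2 b^T A x \quad \text{s.t.} \quad Y = x x^T, \ \operatorname{diag}(Y) = 1.$$

• Still hard: $Y = xx^T$ is a nonconvex constraint.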
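
For tiny $n$ the binary problem can simply be enumerated, which gives a reference value to compare any relaxation against. The sketch below, on made-up data, brute-forces all $2^n$ sign vectors and also checks that the lifted objective $\operatorname{Tr}(A^T A Y) - 2 b^T A x + b^T b$ with $Y = xx^T$ matches $\|Ax - b\|^2$. The helper names are hypothetical.

```python
from itertools import product

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sq_residual(A, b, x):
    """||Ax - b||_2^2."""
    return sum((r - bi) ** 2 for r, bi in zip(matvec(A, x), b))

def lifted_objective(A, b, x):
    """Tr(A^T A Y) - 2 b^T A x + b^T b  with Y = x x^T."""
    n = len(x)
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    trace_term = sum(AtA[i][j] * x[j] * x[i] for i in range(n) for j in range(n))
    linear_term = 2 * sum(bi * r for bi, r in zip(b, matvec(A, x)))
    return trace_term - linear_term + sum(bi * bi for bi in b)

A = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0], [2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]   # illustrative data
b = [1.0, -2.0, 0.5, 3.0]

candidates = list(product((-1.0, 1.0), repeat=3))       # all 2^n sign vectors
best = min(candidates, key=lambda x: sq_residual(A, b, x))
max_gap = max(abs(sq_residual(A, b, x) - lifted_objective(A, b, x)) for x in candidates)
print(best, sq_residual(A, b, best), max_gap)
```

The enumeration cost doubles with every added variable, which is the practical motivation for replacing $Y = xx^T$ with a convex constraint.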

SDP relaxation
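Replace $Y = xx^T$ by $Y \succeq xx^T$. Thus, we obtain

$$\min \operatorname{Tr}(A^T A Y) - 2 b^T A x \quad \text{s.t.} \quad Y \succeq xx^T, \ \operatorname{diag}(Y) = 1.$$

This is an SDP, since

$$Y \succeq xx^T \iff \begin{bmatrix} Y & x \\ x^T & 1 \end{bmatrix} \succeq 0$$

(using Schur complements).

• Optimal value gives lower bound on binary LS
• Recover binary x by randomized rounding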
Exercise: Try the above problem in CVX.

Nonconvex quadratic optimization

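$$\min x^T A x + b^T x \quad \text{s.t.} \quad x^T P_i x + b_i^T x + c \le 0, \quad i = 1, \dots, m.$$

Exercise: Show that $x^T Q x = \operatorname{Tr}(Q x x^T)$ (where Q is symmetric).

$$\min_{X,x} \ \operatorname{Tr}(AX) + b^T x \quad \text{s.t.} \quad \operatorname{Tr}(P_i X) + b_i^T x + c \le 0, \ i = 1, \dots, m, \quad X \succeq 0, \ \operatorname{rank}(X) = 1.$$

• Relax nonconvex $\operatorname{rank}(X) = 1$ to $X \succeq xx^T$.
• The relaxation can be quite loose, but is sometimes quite tight.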

References
1. L. Vandenberghe. MLSS 2012 lecture slides; EE236B slides.
2. A. Nemirovski. Lecture slides on modern convex optimization.
