Cheatsheet
Notation
∇f(x) gradient, ∇²f(x) Hessian
A ⪰ B means A − B is positive semidefinite
A ≻ B means A − B is positive definite
x ≥ 0 means that all entries of the vector x are ≥ 0.
L-smooth function: ∇f is Lipschitz continuous with constant L, i.e. ‖∇f(x) − ∇f(y)‖ ≤ L‖x − y‖ for all x, y
Unconstrained optimization
min f(x) over x ∈ ℝ^d
Theorems:
• f cont. diff., gradient method with Armijo step size ⇒ every accumulation point of (x^(k))_{k∈ℕ₀} is a stationary point of f
• f cont. diff., µ-strongly convex, L-smooth, gradient method with α_k = 1/L ⇒ (x^(k))_{k∈ℕ₀} and (f(x^(k)))_{k∈ℕ₀} converge q-linearly
• f twice cont. diff., ∇f(x*) = 0, ∇²f(x*) invertible ⇒ for x^(0) ≈ x*, α_k = 1 and Newton directions p^(k), the convergence x^(k) → x* is q-superlinear (q-quadratic when ∇²f is Lipschitz)
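The gradient method with Armijo (backtracking) step size from the first theorem can be sketched as follows; the test function, tolerances, and the parameter values σ and β are illustrative choices, not prescribed by the cheat sheet.

```python
import numpy as np

def gradient_method_armijo(f, grad, x0, sigma=1e-4, beta=0.5,
                           tol=1e-8, max_iter=1000):
    """Gradient descent; the step size is halved until the Armijo condition holds."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # (approximately) stationary point reached
            break
        alpha = 1.0
        # Armijo condition: f(x - alpha*g) <= f(x) - sigma * alpha * ||g||^2
        while f(x - alpha * g) > f(x) - sigma * alpha * (g @ g):
            alpha *= beta            # backtrack
        x = x - alpha * g
    return x

# Smooth, strongly convex test function with minimizer (1, 2)
f = lambda x: (x[0] - 1.0)**2 + 2.0 * (x[1] - 2.0)**2
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] - 2.0)])
x_star = gradient_method_armijo(f, grad, np.zeros(2))
```

Every accumulation point of the iterates is stationary; here the function is strongly convex, so the iterates converge to the unique minimizer.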
Linear optimization min c⊤x over x ∈ ℝ^d s.t. constraints of the form a⊤x ≤ b, a⊤x ≥ b, or a⊤x = b for given a, c ∈ ℝ^d and b ∈ ℝ.
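A small instance of this problem class can be solved, for example, with SciPy's `linprog` (an assumed-available library; the particular LP below is made up for illustration):

```python
from scipy.optimize import linprog

# min -x1 - 2*x2  s.t.  x1 + x2 <= 4,  x1 <= 2,  x >= 0
res = linprog(c=[-1, -2],
              A_ub=[[1, 1], [1, 0]], b_ub=[4, 2],
              bounds=[(0, None), (0, None)])  # x >= 0 componentwise
# Optimal solution: x = (0, 4) with objective value -8
```

Inequalities of the form a⊤x ≥ b are passed by negating them into −a⊤x ≤ −b, and equalities go into `A_eq`/`b_eq`.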
Nonlinear constrained optimization min f(x) over x ∈ ℝ^d s.t. g_j(x) ≤ 0, h_i(x) = 0 with f, g_j, h_i continuously differentiable; let C be the feasible set
• tangent cone T_C(x) = { y ∈ ℝ^d : ∃ (x^(k)) ∈ C^ℕ, (α_k) ∈ (0, ∞)^ℕ with x^(k) → x, α_k → 0, (x^(k) − x)/α_k → y }
• first-order necessary optimality condition: x* is a locally optimal solution ⇒ ∇f(x*)⊤y ≤ 0 for all y ∈ T_C(x*)
• polar cone: K◦ = { y : y⊤x ≤ 0 ∀ x ∈ K }
• first-order necessary optimality condition again: x* is a locally optimal solution ⇒ −∇f(x*) ∈ T_C(x*)◦
• linearized tangent cone F(x) = { y ∈ ℝ^d : ∇h_i(x)⊤y = 0 ∀ i, ∇g_j(x)⊤y ≤ 0 ∀ j with g_j(x) = 0 }
• always T_C(x) ⊆ F(x) and F(x)◦ ⊆ T_C(x)◦
• F(x) depends on the constraint functions, T_C(x) only on the feasible set
• F(x)◦ = { Σ_{j: g_j(x)=0} µ_j ∇g_j(x) + Σ_i λ_i ∇h_i(x) : µ_j ≥ 0, λ_i ∈ ℝ }
• F(x)◦ = T_C(x)◦ holds when constraint qualifications are satisfied
• LICQ is satisfied at x: the vectors ∇g_j(x) (only those j for which g_j(x) = 0) and ∇h_i(x) (all i) are linearly independent
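The polar-cone condition −∇f(x*) ∈ F(x*)◦ can be checked numerically on a concrete problem (my own example, not from the cheat sheet): min x₁ + x₂ s.t. g(x) = x₁² + x₂² − 1 ≤ 0. The minimizer is x* = (−1/√2, −1/√2) with the single constraint active, so F(x*)◦ = { µ ∇g(x*) : µ ≥ 0 } and the multiplier µ can be read off directly.

```python
import numpy as np

x_star = np.array([-1.0, -1.0]) / np.sqrt(2)  # minimizer on the unit circle
grad_f = np.array([1.0, 1.0])                 # ∇f is constant for f(x) = x1 + x2
grad_g = 2 * x_star                           # ∇g(x*) = 2 x*

# Solve -∇f(x*) = mu * ∇g(x*) componentwise (both components give the same mu)
mu = (-grad_f / grad_g)[0]                    # mu = 1/sqrt(2) >= 0
```

Since µ ≥ 0, −∇f(x*) lies in F(x*)◦, confirming the first-order necessary condition at x*. (LICQ holds trivially here: there is only one active gradient and it is nonzero.)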