
Lecture 1: Introduction

DSA3102 Essential Data Analytics Tools: Convex Optimisation

Lam Xin Yee

[email protected]
S17-08-11

Past contributors of the material: Prof Tan Geok Choo, Prof Toh Kim Chuan, Prof Pang Chin How Jeffrey.
Pre-requisites

Pre-requisites: linear algebra, multivariate calculus, some programming knowledge (preferably MATLAB or Python).

Python will be used to present algorithms in the lectures most of the time, so it is a good idea to install it on your computer. Refer to https://www.python.org/ for an installation guide.
A handy interactive web tool for running Python: Jupyter Notebook (https://jupyter.org/)
For some algorithms, MATLAB code will be provided as supplementary material. Refer to https://nusit.nus.edu.sg/services/software_and_os/software/software-student/ for the MATLAB installation guide.
Optimization model
Optimization models express the goal of solving a problem in the "best" way in mathematical terms.

Examples:
Optimal time management
Optimal allocation of resources
Optimal design of manufacturing processes and instruments
Techniques in machine learning

Types of optimization models:
1 Linear objective function over linear constraints (MA3252 Linear and Network Optimization)
2 Nonlinear objective function over convex sets (this course)
3 Linear/nonlinear objective function over discrete sets (MA4254 Discrete Optimization)
4 ...
Simple examples

Example I. If K units of capital and L units of labor are used, a company can produce KL units of a product.
Capital can be purchased at $4 per unit and labor can be purchased at $1 per unit.
A total of $8000 is available to purchase capital and labor.
How can the firm maximize the quantity of the product manufactured?

Solution. Let K = units of capital purchased and L = units of labor purchased. The problem to solve is:

maximize KL
s.t. 4K + L ≤ 8000, K, L ≥ 0.
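As a quick numerical check, here is a minimal sketch using scipy.optimize (SLSQP is a local method, but for this problem it reaches the optimum K = 1000, L = 4000 from the starting point below):

    from scipy.optimize import minimize

    # Maximize K*L  <=>  minimize -K*L, subject to 4K + L <= 8000 and K, L >= 0.
    objective = lambda z: -(z[0] * z[1])                                # z = [K, L]
    budget = {"type": "ineq", "fun": lambda z: 8000 - 4 * z[0] - z[1]}  # ">= 0" form

    res = minimize(objective, x0=[1.0, 1.0], method="SLSQP",
                   constraints=[budget], bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)   # approximately [1000, 4000] and 4,000,000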
Simple examples

Example II. It costs a company $c to produce a unit of a product. If the company charges $p per unit, and the customers demand D(p) units, what price should the company charge to maximize its profit?
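The slide leaves D(p) general; purely as an illustration, here is a sketch with an assumed linear demand model D(p) = a − bp (the symbols a and b are our assumption, not part of the example), solved symbolically with sympy:

    import sympy as sp

    p, c, a, b = sp.symbols("p c a b", positive=True)
    D = a - b * p                          # assumed linear demand model
    profit = (p - c) * D                   # (price - unit cost) * units demanded
    p_star = sp.solve(sp.diff(profit, p), p)[0]
    print(p_star)                          # (a + b*c)/(2*b)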
Simple examples
Example III. Two products A, B are produced on the same machine using
the same raw material, of which 200kg are available.
Product A uses 2kg and product B uses 3kg of the material per unit
produced.
The machine is available for 50 hours.
Product A requires 30 minutes and Product B requires 20 minutes of
machine time for each unit produced.
If the profit for one unit of product A and B are $150 and $300 respectively, and the manufacturing cost is 3x1^2 for x1 units of Product A and 5x2^2 for x2 units of Product B, determine how many units of A and B should be produced to maximize the total net profit.
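A sketch of how this exercise could be set up numerically. The formulation below (decision variables x1, x2; material constraint 2x1 + 3x2 ≤ 200; machine-time constraint 0.5x1 + x2/3 ≤ 50, converting minutes to hours) is our reading of the problem statement:

    from scipy.optimize import minimize

    # Net profit 150*x1 + 300*x2 - 3*x1^2 - 5*x2^2; maximize by minimizing the negative.
    neg_profit = lambda x: -(150 * x[0] + 300 * x[1] - 3 * x[0]**2 - 5 * x[1]**2)

    cons = [
        {"type": "ineq", "fun": lambda x: 200 - 2 * x[0] - 3 * x[1]},      # raw material (kg)
        {"type": "ineq", "fun": lambda x: 50 - 0.5 * x[0] - x[1] / 3.0},   # machine time (h)
    ]

    res = minimize(neg_profit, x0=[10.0, 10.0], method="SLSQP",
                   constraints=cons, bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)   # production plan and the corresponding net profit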
Example: portfolio selection

Consider an investor who has a certain amount of money to be invested in a number of different securities (stocks, bonds, etc.) with random returns.
For each security i = 1, . . . , n, estimates of its expected return µi and variance σi^2 are given.
For any two securities i and j, their correlation coefficient ρij is also assumed to be known.
Let the proportion of the total funds invested in security i be xi. The vector x = [x1; . . . ; xn] is called a portfolio vector.

expected return of x = E[x] = x1 µ1 + · · · + xn µn = µ^T x
variance of x = Var[x] = Σ_{i,j} ρij σi σj xi xj = x^T Q x

where Qij = ρij σi σj and µ = [µ1; . . . ; µn].

Also, Σ_i xi = 1, xi ≥ 0 ∀ i = 1, . . . , n.
Example: portfolio selection (cont’d)

For a given target expected return R, a valid portfolio vector x is called efficient if it has the minimum variance among all portfolios that have expected return at least R.

Markowitz's efficient portfolio (also called mean-variance) optimization problem:

min_x  x^T Q x
s.t.   Σ_{i=1}^n xi = 1,
       µ^T x ≥ R,
       xi ≥ 0, i = 1, . . . , n.
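A minimal sketch of solving this model with scipy.optimize; the three-security data (µ, σ, ρ) below are made-up numbers for illustration, not course data:

    import numpy as np
    from scipy.optimize import minimize

    mu = np.array([0.08, 0.12, 0.15])            # assumed expected returns
    sigma = np.array([0.10, 0.20, 0.30])         # assumed standard deviations
    rho = np.array([[1.0, 0.3, 0.1],
                    [0.3, 1.0, 0.4],
                    [0.1, 0.4, 1.0]])            # assumed correlation coefficients
    Q = rho * np.outer(sigma, sigma)             # Q_ij = rho_ij * sigma_i * sigma_j
    R = 0.10                                     # target expected return

    cons = [{"type": "eq",   "fun": lambda x: x.sum() - 1},   # sum_i x_i = 1
            {"type": "ineq", "fun": lambda x: mu @ x - R}]    # mu^T x >= R

    res = minimize(lambda x: x @ Q @ x, np.full(3, 1 / 3), method="SLSQP",
                   constraints=cons, bounds=[(0, None)] * 3)
    print(res.x, res.fun)                        # efficient portfolio and its variance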
Example: linear regression
Application: to estimate a quantity of interest from several observed
variables
E.g. given floor area, location, built year, house type etc., predict the price of a house
Data: (a1, b1), ..., (am, bm)
Input: ai ∈ Rp
Output: bi ∈ R
For a linear model, assume:

bi = x̄1 ai1 + x̄2 ai2 + · · · + x̄p aip + α + εi,

where x̄ ∈ Rp and α ∈ R are the unknown coefficient vector and offset, and εi is stochastic noise that satisfies various assumptions (e.g. independent and identically distributed (i.i.d.), normally distributed).
We wish to learn the parameters x̄ and α from the data.
Example: linear regression (cont’d)

Figure: data points (ai, bi), a fitted line with y-intercept α = 0.5, and the errors εi.

This goal can be formulated as the optimization problem

min { (1/2) Σ_{i=1}^m (bi − ai^T x̄ − α)^2 | x̄ ∈ Rp, α ∈ R }
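This least-squares problem can be solved directly; a minimal sketch with NumPy on synthetic data (appending a column of ones to absorb the offset α):

    import numpy as np

    rng = np.random.default_rng(0)
    m, p = 50, 3
    A = rng.standard_normal((m, p))                   # rows are the inputs a_i
    x_true, alpha_true = np.array([1.0, -2.0, 0.5]), 0.5
    b = A @ x_true + alpha_true + 0.1 * rng.standard_normal(m)   # noisy outputs b_i

    A1 = np.hstack([A, np.ones((m, 1))])              # column of ones for the offset
    sol, *_ = np.linalg.lstsq(A1, b, rcond=None)      # minimizes ||A1 z - b||^2
    x_hat, alpha_hat = sol[:p], sol[p]
    print(x_hat, alpha_hat)                           # close to x_true and alpha_true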
Example: sensor network localization and molecular
conformation
Aim: find the positions of atoms in a molecule (typically a protein
molecule) given distances estimated from Nuclear Magnetic Resonance
spectroscopy.

Figure: A sensor network with 4 anchors (filled squares). An edge in the graph
means that the sensor-sensor or sensor-anchor pairs are within the radio range R.
Example: sensor network localization and molecular
conformation (cont’d)
[Unknown] Coordinates of n sensors: xj ∈ Rd, j = 1, . . . , n
[Known]
Coordinates of m anchors: ai ∈ Rd, i = 1, . . . , m
Pairwise distances of the sensors and anchors within the radio range R:
||ai − xj|| ≈ fij ∀ (i, j) ∈ M, M := {(i, j) : ||ai − xj|| ≤ R}
||xi − xj|| ≈ dij ∀ (i, j) ∈ N, N := {(i, j) : ||xi − xj|| ≤ R}
|| · || is the Euclidean norm (length) defined by
||y|| = √(y1^2 + y2^2 + · · · + yn^2), for a vector y ∈ Rn.
Because of noise, the distances fij and dij are not estimated exactly but only approximately.
The problem to solve is the following:

min_{x1,...,xn ∈ Rd}  Σ_{(i,j)∈N} (||xi − xj||^2 − dij^2)^2 + Σ_{(i,j)∈M} (||ai − xj||^2 − fij^2)^2.
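A minimal sketch of attacking this (nonconvex) problem with a general-purpose local solver on a tiny synthetic instance in d = 2; for simplicity we assume every pair is within radio range, so M and N contain all pairs:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    d, n = 2, 5
    anchors = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)  # m = 4 anchors
    X_true = rng.uniform(0, 1, size=(n, d))                            # unknown sensors

    f_ij = np.linalg.norm(anchors[:, None] - X_true[None, :], axis=2)  # anchor-sensor dists
    d_ij = np.linalg.norm(X_true[:, None] - X_true[None, :], axis=2)   # sensor-sensor dists

    def objective(z):
        X = z.reshape(n, d)
        e_ss = (np.linalg.norm(X[:, None] - X[None, :], axis=2)**2 - d_ij**2)**2
        e_as = (np.linalg.norm(anchors[:, None] - X[None, :], axis=2)**2 - f_ij**2)**2
        return 0.5 * e_ss.sum() + e_as.sum()   # 0.5: each sensor pair is counted twice

    z0 = X_true.ravel() + 0.1 * rng.standard_normal(n * d)   # start near the truth
    res = minimize(objective, z0)
    print(res.fun)   # near zero when the configuration is recovered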
General nonlinear programming (NLP) problems

Minimize (or Maximize)  f(x)
Subject to  x ∈ S ⊆ Rn

The variable x = (x1, x2, · · · , xn)^T is a column vector in Rn.
The function f : Rn → R which we wish to minimize (or maximize) is known as the objective function.
S is known as the feasible set.
A point in the feasible set is called a feasible solution or a feasible point; otherwise, it is an infeasible solution or infeasible point.
Terminology and notation

Minimize (or Maximize)  f(x)
Subject to  x ∈ S ⊆ Rn

(a) For a minimization problem, a feasible solution x∗ for which f(x∗) ≤ f(x) for all feasible solutions x ∈ S is called an optimal solution to the NLP. We can write

x∗ = argmin_{x∈S} f(x)

(b) For a maximization problem, a feasible solution x∗ for which f(x∗) ≥ f(x) for all feasible solutions x ∈ S is called an optimal solution to the NLP. We can write

x∗ = argmax_{x∈S} f(x)

The value of f(x∗) is then called the optimal value.


Unboundedness

Minimize (or Maximize)  f(x)
Subject to  x ∈ S ⊆ Rn

(a) For a minimization problem, the objective value is said to be unbounded (the optimal value is −∞) if ∀ K, ∃ x ∈ S such that f(x) < K.
(b) For a maximization problem, the objective value is said to be unbounded (the optimal value is +∞) if ∀ K, ∃ x ∈ S such that f(x) > K.
In these cases, we say the NLP is unbounded.
Symmetry

The following optimization problems are equivalent:

Maximize f(x)                Minimize −f(x)
Subject to x ∈ S             Subject to x ∈ S

x∗ ∈ S is an optimal solution for the maximization problem with optimal objective value f(x∗) if and only if it is also an optimal solution for the minimization problem with optimal objective value −f(x∗).
Otherwise, both problems are infeasible or both problems have unbounded objective value.
We will mostly focus on the discussion of minimization problems.
Topics
An unconstrained nonlinear programme

Minimize (or Maximize)  f(x)
Subject to  x ∈ X

The objective function f : Rn → R is a nonlinear function of x.
The feasible set X is an open subset of Rn.

Definition 1.1 (Open set)
A subset S ⊆ Rn is open if for every x ∈ S there exists ε > 0 such that the open ball B(x; ε) ⊆ S. Here, the open ball centered at x having radius ε is defined by
B(x; ε) := {y ∈ Rn : ||y − x|| < ε}.

Examples of unconstrained NLPs: linear regression, the molecular conformation problem.
A constrained nonlinear programme

Minimize (or Maximize)  f(x)
Subject to  gi(x) = 0, i = 1, 2, 3, · · · , m,
            hj(x) ≤ 0, j = 1, 2, 3, · · · , p,

f : Rn → R is the objective function.
Each gi : Rn → R defines an equality constraint.
Each hj : Rn → R defines an inequality constraint.
Some of the functions f, gi, hj are nonlinear. In this course, we assume that f, gi, hj are continuous functions.
The feasible set
S := {x ∈ Rn | gi(x) = 0, i = 1, 2, ..., m, hj(x) ≤ 0, j = 1, 2, ..., p}
is a closed subset of Rn.
Example of a constrained NLP: the portfolio selection model.
Closed set
Remark. A set S is closed if its complement is open.
Example 1.2
Determine whether the following sets are open or closed.
1 {x ∈ R | a < x < b}
2 {x = [x1; . . . ; xn] ∈ Rn | ai ≤ xi ≤ bi, i = 1, 2, · · · , n}
3 Rn
4 ∅
By default, an NLP where the feasible set is Rn is classified as an unconstrained NLP.
Closed set
The following result is useful for showing that a set is closed.
Proposition 1
Let g : Rn → R be a continuous function. Then
(a) The set S = {x ∈ Rn | g(x) ≤ 0} is closed.
(b) The set S = {x ∈ Rn | g(x) ≥ 0} is closed.
(c) The set S = {x ∈ Rn | g(x) = 0} is closed.

Example 1.3
Verify that the set C = {[x; x^2] : x ∈ R} ⊆ R2 is closed.
Solution. Note that
C = {[x; x^2] : x ∈ R} = {[x1; x2] ∈ R2 | g(x) := x2 − x1^2 = 0}.
Since g is continuous, C is closed by Proposition 1(c).
Closed set

Remark. Intersections and finite unions of closed sets are closed.

Corollary 1.4
Suppose gi, hj : Rn → R are continuous. The set

S = {x ∈ Rn | gi(x) = 0, i = 1, 2, · · · , m; hj(x) ≤ 0, j = 1, 2, · · · , p}

is closed.
Note that S is the feasible set of the constrained NLP.

Example 1.5
The set {[x1; x2] ∈ R2 | x1^2 + x2^2 ≤ 3, x1 − 2 sin x2 ≥ 0} is closed.
Example 1.6
Change the following into the standard formulation:

max_{x1,x2}  x1 x2
s.t.  x1 ≥ 0, x2 ≥ 0,
      x1 + x2 = 24.

Is this an unconstrained or constrained NLP?


In the remaining section, we will learn
how to solve simple optimization models (in R2) using graphical methods,
the notion of local vs global minimizers, and
the special case where global optimizers are guaranteed to exist.
Example 1.7 (An NLP (in R2) with linear constraints but a nonlinear objective function.)
minimize f(x) = (x1 − 4)^2 + (x2 − 6)^2
subject to x1 ≤ 4
x2 ≤ 6
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0.

• (x1 − 4)^2 + (x2 − 6)^2 = r^2 (r > 0) describes a circle with center (4, 6)^T and radius r.
• Graphically, we want to find the shortest distance of feasible points from the point (4, 6)^T.
• We see that the minimizer x∗ must occur on the boundary defined by the line 3x1 + 2x2 = 18.

Figure: A 2-variable minimization problem with the minimizer occurring on the boundary of the feasible region. Contour values are 0, 1, 2, 3, 4.
Example 1.7 (cont’d)

minimize f(x) = (x1 − 4)^2 + (x2 − 6)^2
subject to x1 ≤ 4
x2 ≤ 6
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0.

We can reduce the 2-variable problem to a single-variable optimization problem.
The optimal solution occurs on the line 3x1 + 2x2 = 18.
Substituting x2 = 9 − (3/2)x1 into f(x), we obtain a function of x1:

f(x) = (x1 − 4)^2 + (3 − 1.5x1)^2 =: g(x1), 2 ≤ x1 ≤ 4.

Use 1-variable calculus to determine a global minimizer of g(x1):
g'(x1) = 2(x1 − 4) − 3(3 − 1.5x1) = (13/2)x1 − 17 = 0 yields x1 = 34/13.
The optimal solution of the above NLP is x∗ = [34/13; 66/13], with optimal value f(x∗) = 468/169.
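A quick symbolic check of this calculation (a sketch using sympy):

    import sympy as sp

    x1 = sp.symbols("x1")
    g = (x1 - 4)**2 + (3 - sp.Rational(3, 2) * x1)**2   # f restricted to 3x1 + 2x2 = 18
    x1_star = sp.solve(sp.diff(g, x1), x1)[0]           # solves g'(x1) = 0
    print(x1_star, sp.simplify(g.subs(x1, x1_star)))    # 34/13 and 468/169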
Example 1.7 (cont’d)

minimize f(x) = (x1 − 4)^2 + (x2 − 6)^2
subject to x1 ≤ 4
x2 ≤ 6
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0.

Question. Can we always use the above technique to reduce the number of variables?

Note: No! In the above, we assumed that we know the optimal solution is on the boundary defined by the constraint 3x1 + 2x2 = 18. Such information is in general not known a priori.
Example 1.7 (cont’d)

minimize f(x) = (x1 − 4)^2 + (x2 − 6)^2
subject to x1 ≤ 4
x2 ≤ 6
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0.

Programming practice: write Python code to generate the contour figure for this problem; a possible sketch follows.
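One possible sketch for this exercise using matplotlib (the grid range and shading choices below are ours):

    import numpy as np
    import matplotlib.pyplot as plt

    x1, x2 = np.meshgrid(np.linspace(0, 9, 400), np.linspace(0, 9, 400))
    f = (x1 - 4)**2 + (x2 - 6)**2

    # Feasible region: x1 <= 4, x2 <= 6, 3x1 + 2x2 <= 18, x1, x2 >= 0.
    feasible = (x1 <= 4) & (x2 <= 6) & (3 * x1 + 2 * x2 <= 18)

    plt.contour(x1, x2, f, levels=[0, 1, 2, 3, 4], colors="gray")
    plt.imshow(feasible, extent=(0, 9, 0, 9), origin="lower",
               cmap="Blues", alpha=0.3, aspect="auto")
    plt.plot(34 / 13, 66 / 13, "r*", markersize=12)   # minimizer on the boundary
    plt.xlabel("x1"); plt.ylabel("x2")
    plt.show()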


Example 1.8 (Same as Example 1.7 but with a different objective function.)
minimize f(x) = (x1 − 2)^2 + (x2 − 2)^2
subject to x1 ≤ 4
x2 ≤ 6
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0.

• Minimizer x∗ = [2; 2] lies in the interior of the feasible region.

Figure: A 2-variable minimization problem with the minimizer occurring in the interior of the feasible region. Contour values are 0, 1, 2, 3, 4.
Figure: 3D plot of f(x) = (x1 − 2)^2 + (x2 − 2)^2.
Discussion of Examples 1.7 and 1.8

How do we know whether the minimizer will occur on the boundary or in the interior of the feasible region? Is it very important to know this piece of information before we can find the minimizer?
What necessary conditions must the minimizer satisfy?
How do we know a given point x∗ is a minimizer if we cannot visualize the graph of f(x)?

We will discuss these questions in Topic 2 and Topic 4.


Example 1.9 (A 2-variable NLP with a nonlinear objective function and a nonlinear constraint.)

minimize x1^2 − x2
subject to 2x1 − x2 ≤ 4
9x1^2 + 25x2^2 ≤ 225
x1, x2 ≥ 0.

The feasible region is bounded, but not a polygon.
The equation α(x1 − a)^2 + β(x2 − b)^2 = r^2 (α > 0, β > 0) describes an ellipse.

Figure: A 2-variable minimization problem with a non-polygonal feasible region.
Example 1.10 (Same as Example 1.9 but with a different feasible region.)

minimize x1^2 − x2
subject to 2x1 − x2 ≤ 4
9x1^2 + 25x2^2 ≥ 225
x1, x2 ≥ 0.

Its feasible region is unbounded and non-convex.
In this example, is the optimal objective value unbounded as well?

Can we have an optimization problem with an unbounded feasible region, but a finite optimal objective value?
How good is the graphical method?

Pros:
Good for intuition
Provides a rough solution

Cons:
Possible only for problems with 1 or 2 variables
Requires computing function values on a dense grid of points, which is a costly task in general
May need to rely on a graph-plotting toolbox (e.g. Python/MATLAB)

We need to devise other, algebraic methods to solve higher-dimensional problems!
Local vs global minimizers
First, we study the notion of local vs global minimizers.

Figure: plot of f(x) = sin(πx)^2 e^x − x^2 on [0, 3].

What is the minimizer of f over the entire region [0, 3]?
What is the minimizer of f over the small region [1.8, 2.3]?
Inner product/ Dot product

Definition 1.11 (Inner product/dot product)
The inner product of vectors x = [x1; x2; . . . ; xn] and y = [y1; y2; . . . ; yn] in Rn is defined as

⟨x, y⟩ = x^T y = Σ_{i=1}^n xi yi = x1 y1 + x2 y2 + · · · + xn yn.

Note that we also have ⟨x, y⟩ = ||x|| ||y|| cos(θ), where θ is the angle between x and y.
Euclidean norm

Definition 1.12 (Euclidean norm)
Suppose x = (x1, x2, · · · , xn)^T ∈ Rn. The Euclidean norm of x is defined as follows:

||x|| = √(x1^2 + x2^2 + · · · + xn^2).

Note that ||x||^2 = x^T x.

Properties
(a) x ∈ Rn =⇒ ||x|| ≥ 0;
(b) ||x|| = 0 ⇐⇒ x = 0.
x ∈ Rn, λ ∈ R =⇒ ||λx|| = |λ| ||x||.
Triangle inequality: x, y ∈ Rn =⇒ ||x + y|| ≤ ||x|| + ||y||.
Cauchy-Schwarz inequality: x, y ∈ Rn =⇒ |x^T y| ≤ ||x|| · ||y||. Equality holds if and only if x and y are parallel (i.e. x = λy or y = λx, for some λ ∈ R).
Euclidean norm

||x|| = √(x1^2 + x2^2 + · · · + xn^2).

Properties (cont'd):
(a) x^T y = ||x|| · ||y|| ⇐⇒ x = λy or y = λx, for some λ ≥ 0 (i.e. the maximum value of x^T y occurs whenever x and y are vectors in the same direction).
(b) x^T y = −||x|| · ||y|| ⇐⇒ x = λy or y = λx, for some λ ≤ 0 (i.e. the minimum value of x^T y occurs whenever x and y are vectors in the opposite direction).
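A small numerical illustration of these identities and inequalities with NumPy (the vectors are arbitrary):

    import numpy as np

    x, y = np.array([3.0, 4.0]), np.array([1.0, 2.0])

    print(x @ y, np.linalg.norm(x))                                   # inner product, norm
    print(abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y))        # Cauchy-Schwarz: True
    print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))  # triangle: True

    theta = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
    print(theta)   # angle between x and y (radians), from <x, y> = ||x|| ||y|| cos(theta)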
Local minimizer and global minimizer

Definition 1.13 (Local minimizer and global minimizer)
Let S be a subset of Rn. Define B_ε(y) = {x ∈ Rn | ||x − y|| < ε} to be the open ball with center y and radius ε.
1 A point x∗ ∈ S is said to be a local minimizer of f(x) if there exists an ε > 0 such that f(x) ≥ f(x∗) for all x ∈ S ∩ B_ε(x∗).
If f(x) > f(x∗) for all x ∈ S ∩ B_ε(x∗) − {x∗}, then x∗ is said to be a strict local minimizer of f(x).
2 A point x∗ ∈ S is said to be a global minimizer of f(x) if f(x) ≥ f(x∗) for all x ∈ S.
If f(x) > f(x∗) ∀ x ∈ S − {x∗}, then x∗ is said to be a strict global minimizer of f(x).
3 Similarly, for a (strict) local or global maximizer, replace the inequality by f(x) ≤ f(x∗) or f(x) < f(x∗) as appropriate.

By definition, a global minimizer is a local minimizer. However, the converse is not true in general.
Example 1.14
Consider the following 1-dimensional problem:
minimize f(x) := sin(πx)^2 exp(x) − x^2
subject to 0 ≤ x ≤ 3

Figure: plot of f(x) = sin(πx)^2 e^x − x^2 on [0, 3].
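A sketch that locates the two minimizers numerically (a coarse grid search refined by scipy's bounded scalar minimizer; the interval [1.8, 2.3] matches the question on the earlier slide):

    import numpy as np
    from scipy.optimize import minimize_scalar

    f = lambda x: np.sin(np.pi * x)**2 * np.exp(x) - x**2

    for a, b in [(0.0, 3.0), (1.8, 2.3)]:
        grid = np.linspace(a, b, 1001)
        x0 = grid[np.argmin(f(grid))]                     # coarse global search
        res = minimize_scalar(f, bounds=(max(a, x0 - 0.1), min(b, x0 + 0.1)),
                              method="bounded")           # local refinement
        print(f"minimizer on [{a}, {b}]: x = {res.x:.4f}, f = {res.fun:.4f}")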
Example 1.15
Consider the following NLP in R2 .
minimize f (x) := x2
subject to 10 − (x1 − 3)(x1 − 1)^2 − x2 ≤ 0
0 ≤ x1 ≤ 4

1 x∗ = [4; 1] is a global minimizer.
2 x̂ = [1; 10] is a local minimizer. It is a feasible point and it gives a minimum objective value in the local vicinity of x̂ = [1; 10].
3 x̂ is a local minimizer but not a global minimizer.

Figure: the feasible region, with the global minimizer x∗ and the local minimizer x̂.
Continuous function on a closed and bounded set
When are we guaranteed to have a global maximizer/minimizer?

We study the special case of a continuous function on a closed and bounded set!

Definition 1.16 (Bounded)
Let S ⊆ Rn be a nonempty set. The set S is said to be bounded if there is a positive number M such that ||x|| ≤ M ∀ x ∈ S.

Note: A set S is bounded if and only if there is a positive number M̂ such that |xi| ≤ M̂ ∀ i = 1, 2, · · · , n, ∀ x ∈ S.

Example 1.17
(a) The intervals (a, b), [a, b), [a, b] and (a, b] are bounded in R.
(b) The set {x = (x1, x2, · · · , xn)^T ∈ Rn | ai ≤ xi ≤ bi, i = 1, 2, · · · , n} is bounded.
(c) The closed n-ball B̄(0, M) = {x ∈ Rn : ||x|| ≤ M} and the open n-ball B(0, M) = {x ∈ Rn : ||x|| < M} are bounded.
Continuous function on a compact set

A closed set may not be a bounded set, e.g. S = {x ∈ R | x ≥ 0}.
A bounded set may not be a closed set, e.g. S = {x ∈ R | 0 < x < 1}.

We define a set with both nice properties:

Definition 1.18 (Compact)
A set S in Rn is said to be compact if it is closed and bounded.

It turns out that we then have a guarantee on the existence of global optimizers!

Theorem 1.19 (Weierstrass Theorem)
A continuous function on a nonempty compact set S ⊂ Rn has a global maximum point and a global minimum point in S.
[Weierstrass Theorem]
A continuous function on a nonempty compact set S ⊂ Rn has a global maximum
point and a global minimum point in S.

Example 1.20

minimize f(x) := x1^2 − x2^2
subject to g(x) = x1^2 + x2^2 − 3 = 0.

The feasible set S = {x ∈ R2 | g(x) = x1^2 + x2^2 − 3 = 0} is closed and bounded. The function f is continuous. By the Weierstrass Theorem, f has a global minimum and a global maximum on S.
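A quick numerical confirmation: parametrizing the circle as x = (√3 cos θ, √3 sin θ) gives f = 3 cos 2θ, so the global minimum and maximum should be −3 and 3 (a sketch):

    import numpy as np

    theta = np.linspace(0, 2 * np.pi, 100001)
    x1, x2 = np.sqrt(3) * np.cos(theta), np.sqrt(3) * np.sin(theta)
    f = x1**2 - x2**2                     # equals 3*cos(2*theta) on the circle
    print(f.min(), f.max())               # approximately -3 and 3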
Example 1.21

minimize f(x) := x1^2 − x2^2
subject to g(x) = 1 − x1 − x2 = 0.

The function f is continuous, but the feasible set S = {x ∈ R2 | g(x) = 1 − x1 − x2 = 0} is closed but unbounded. Thus, the Weierstrass Theorem cannot be used directly to deduce whether there is a global minimum.

In fact, the problem has no global minimum. For any α ∈ R, the point [α; 1 − α] ∈ S, and

lim_{α→−∞} f([α; 1 − α]) = lim_{α→−∞} α^2 − (1 − α)^2 = lim_{α→−∞} (2α − 1) = −∞.
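A two-line numerical illustration of this unboundedness (on the line S, f([α; 1 − α]) = 2α − 1):

    f = lambda x1, x2: x1**2 - x2**2

    for alpha in [-10, -100, -1000]:
        print(alpha, f(alpha, 1 - alpha))   # prints 2*alpha - 1: -21, -201, -2001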
