Lecture 10: descent methods
Outline:
- Generic descent algorithm
- Generalization to multiple dimensions
- Problems of descent methods, possible improvements
- Fixes
- Local minima
Gradient descent (reminder)
The minimum of a function is found by following the slope of the function downhill.
[Figure: a 1D function f(x), an initial guess, and the minimum f(m) at x = m]
Gradient descent (illustration)

[Figure sequence: starting from the guess, each step moves along the negative slope of f(x); the gradient is re-evaluated after every step, and the iterates stop once they are near the minimum f(m)]
Gradient descent: algorithm

Start with a point (guess): guess = x
Repeat:
- Determine a descent direction (downhill): direction = -f'(x)
- Choose a step: step = h > 0
- Update: x := x - h f'(x)
Until the stopping criterion is satisfied: f'(x) ~ 0 (stop when close to the minimum)
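To make the update rule concrete, here is a minimal sketch of the 1D algorithm in Python (not part of the original slides); the test function, step size h, and tolerance are illustrative choices.

```python
def gradient_descent_1d(df, x, h=0.1, tol=1e-6, max_iter=1000):
    """Minimize a 1D function given its derivative df, starting from guess x."""
    for _ in range(max_iter):
        slope = df(x)
        if abs(slope) < tol:      # stopping criterion: f'(x) ~ 0
            break
        x = x - h * slope         # update: x := x - h f'(x)
    return x

# Example: f(x) = (x - 3)**2 has derivative f'(x) = 2*(x - 3) and minimum at x = 3.
print(gradient_descent_1d(lambda x: 2 * (x - 3), x=0.0))  # ~3.0
```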
Example of 2D gradient: MATLAB demo

[Figure: illustration of the gradient in 2D]

Definition of the gradient in 2D: this is just a generalization of the derivative to two dimensions, and it can be generalized to any dimension.

Gradient descent works in 2D.
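As a concrete stand-in for the MATLAB demo (which is not reproduced here), the sketch below evaluates the 2D gradient of a sample quadratic bowl; the function choice is illustrative, not the demo's.

```python
import numpy as np

def f(p):
    x, y = p
    return (x - 1) ** 2 + 2 * (y + 2) ** 2       # bowl with minimum at (1, -2)

def grad_f(p):
    x, y = p
    return np.array([2 * (x - 1), 4 * (y + 2)])  # (df/dx, df/dy)

p = np.array([0.0, 0.0])
print(grad_f(p))   # the gradient points uphill; -grad_f(p) is the descent direction
```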
Generalization to multiple dimensions

Start with a point (guess): guess = x
Repeat:
- Determine a descent direction (downhill): direction = -∇f(x)
- Choose a step: step = h > 0
- Update: x := x - h ∇f(x)
Until the stopping criterion is satisfied: ∇f(x) ~ 0 (stop when close to the minimum)
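A minimal NumPy sketch of the multi-dimensional version (an illustration under assumed step size and tolerance, not the course's reference implementation):

```python
import numpy as np

def gradient_descent(grad, x0, h=0.1, tol=1e-6, max_iter=10_000):
    """Minimize f given its gradient, starting from the vector guess x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stopping criterion: ||grad f(x)|| ~ 0
            break
        x = x - h * g                 # update: x := x - h grad f(x)
    return x

# Example: f(x) = ||x - c||^2 has gradient 2*(x - c) and minimum at c.
c = np.array([1.0, -2.0, 3.0])
print(gradient_descent(lambda x: 2 * (x - c), np.zeros(3)))  # ~[1, -2, 3]
```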
Multiple dimensions

Everything that you have seen with derivatives can be generalized with the gradient.

For the descent method, f'(x) can be replaced by

∇f(x) = (∂f/∂x_1, ∂f/∂x_2)

in two dimensions, and by

∇f(x) = (∂f/∂x_1, ..., ∂f/∂x_N)

in N dimensions.
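When an analytic gradient is not available, each partial derivative ∂f/∂x_i can be approximated numerically; the sketch below (an addition of this write-up, not from the slides) uses central differences in N dimensions.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of grad f at x, one coordinate at a time."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)   # ~ df/dx_i
    return g

f = lambda x: x[0] ** 2 + 3 * x[1] ** 2
print(numerical_gradient(f, [1.0, 1.0]))   # ~[2, 6], matching (2*x1, 6*x2)
```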
Example of 2D gradient: MATLAB demo

The cost to buy a portfolio of N stocks is

cost(x) = p_1 x_1 + p_2 x_2 + ... + p_i x_i + ... + p_N x_N,

where x_i is the quantity of stock i and p_i its price.

If you want to minimize the price to buy your portfolio, you need to compute the gradient of its price:

∇cost(x) = (p_1, p_2, ..., p_N).
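For a linear cost like this, the gradient is simply the vector of prices; the numbers below are made up for illustration.

```python
import numpy as np

prices = np.array([10.0, 25.0, 4.0])      # hypothetical price p_i of each stock
quantities = np.array([3.0, 1.0, 10.0])   # hypothetical quantity x_i of each stock

cost = prices @ quantities                 # cost(x) = sum_i p_i * x_i
grad = prices                              # d(cost)/dx_i = p_i, so grad = (p_1, ..., p_N)
print(cost, grad)
```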
Problem 1: choice of the step

When updating the current point:
- small steps: inefficient; too many steps are needed, and convergence takes too long
- large steps: potentially bad results; the next point can overshoot and land past the minimum

[Figures: with a small step, the iterates inch along f(x) toward f(m); with a large step, the next point went too far, past the minimum]
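Both failure modes are easy to reproduce (a sketch with assumed values; the thresholds are specific to this function): on f(x) = x**2 the update is x := (1 - 2h)x, so a tiny h crawls, a moderate h converges quickly, and h > 1 diverges.

```python
def steps_to_converge(h, x=10.0, tol=1e-6, max_iter=100_000):
    """Count iterations of x := x - h * f'(x) on f(x) = x**2 (f'(x) = 2x)."""
    for k in range(max_iter):
        if abs(x) < tol:
            return k
        x = x - h * 2 * x
    return None   # did not converge within max_iter (diverged or too slow)

print(steps_to_converge(0.001))  # small step: thousands of iterations
print(steps_to_converge(0.4))    # moderate step: converges in a handful of steps
print(steps_to_converge(1.1))    # large step: overshoots and diverges -> None
```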
Problem 2: ping pong effect

[Figure: the iterates zig-zag back and forth across the level sets instead of heading straight for the minimum]

[S. Boyd, L. Vandenberghe, Convex Optimization lecture notes, Stanford Univ., 2004]
Problem 2: other norm-dependent issues

[S. Boyd, L. Vandenberghe, Convex Optimization lecture notes, Stanford Univ., 2004]
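The ping pong effect can be reproduced on an ill-conditioned quadratic; this sketch (illustrative, not taken from Boyd & Vandenberghe) prints iterates whose y-coordinate bounces across the narrow valley of f(x, y) = x**2 + 10*y**2.

```python
import numpy as np

grad = lambda p: np.array([2 * p[0], 20 * p[1]])   # gradient of f(x, y) = x**2 + 10*y**2

p, h = np.array([10.0, 1.0]), 0.09
for k in range(8):
    print(k, p)        # the y-coordinate flips sign each step: the "ping pong"
    p = p - h * grad(p)
```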
Problem 3: stopping criterion

Intuitive criterion: f'(x) ~ 0.

In multiple dimensions: ∇f(x) ~ 0, or equivalently ‖∇f(x)‖ ~ 0.

Rarely used in practice. More about this in EE227A (convex optimization, Prof. L. El Ghaoui).
Fixes

Several methods exist to address these problems:
- Line search methods, in particular:
  - Backtracking line search (sketched below)
  - Exact line search
- Normalized steepest descent
- Newton steps
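Among these fixes, backtracking line search is simple to sketch: shrink the step until the sufficient-decrease (Armijo) condition holds. The constants alpha and beta below are conventional illustrative choices, not values prescribed by the lecture.

```python
import numpy as np

def backtracking_step(f, grad_fx, x, h0=1.0, alpha=0.3, beta=0.5):
    """Return a step h satisfying f(x - h*g) <= f(x) - alpha*h*||g||^2."""
    g = grad_fx
    h = h0
    while f(x - h * g) > f(x) - alpha * h * np.dot(g, g):
        h *= beta          # step too large: shrink it and try again
    return h

f = lambda x: x[0] ** 2 + 10 * x[1] ** 2
grad = lambda x: np.array([2 * x[0], 20 * x[1]])

x = np.array([10.0, 1.0])
for _ in range(50):
    h = backtracking_step(f, grad(x), x)   # step size adapts at every iteration
    x = x - h * grad(x)
print(x)   # close to the minimum at the origin
```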
Fundamental problem of the method: local minima
Local minima: pic of the MATLAB demo

The iterations of the algorithm converge to a local minimum, which need not be the global one.

The view of the algorithm is myopic: it only sees the local slope, so whichever valley the initial guess sits in determines the answer (see the sketch below).
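To see the myopia concretely, run the same descent from two different guesses on a function with two valleys; this sketch (an illustration, not the MATLAB demo) converges to a different local minimum depending on the start.

```python
def descend(df, x, h=0.05, steps=500):
    """Plain gradient descent on a 1D function given its derivative df."""
    for _ in range(steps):
        x = x - h * df(x)
    return x

df = lambda x: 4 * x ** 3 - 4 * x   # derivative of f(x) = x**4 - 2*x**2, minima at +/-1

print(descend(df, x=-0.5))   # -> -1.0: the algorithm only follows the local slope...
print(descend(df, x=+0.5))   # -> +1.0: ...so the starting valley decides the answer
```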