Lecture 10: descent methods
Outline:
- Generic descent algorithm
- Generalization to multiple dimensions
- Problems of descent methods, possible improvements
- Fixes
- Local minima
Gradient descent (reminder)
The minimum of a function is found by following the slope of the function downhill.
[Figure: a 1D function f(x), an initial guess, and the minimum f(m) at x = m]
Gradient descent (illustration)

[Figure sequence: starting from the guess, each step moves along the negative slope of f(x); the gradient is re-evaluated after every step, and the iterates stop once they are near the minimum f(m)]
Gradient descent: algorithm

Start with a point (guess): guess = x
Repeat:
- Determine a descent direction (downhill): direction = -f'(x)
- Choose a step: step = h > 0
- Update: x := x - h f'(x)
Until the stopping criterion is satisfied: f'(x) ~ 0 (stop when close to the minimum)
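To make the update rule concrete, here is a minimal sketch of the 1D algorithm in Python (not part of the original slides); the test function, step size h, and tolerance are illustrative choices.

```python
def gradient_descent_1d(df, x, h=0.1, tol=1e-6, max_iter=1000):
    """Minimize a 1D function given its derivative df, starting from guess x."""
    for _ in range(max_iter):
        slope = df(x)
        if abs(slope) < tol:      # stopping criterion: f'(x) ~ 0
            break
        x = x - h * slope         # update: x := x - h f'(x)
    return x

# Example: f(x) = (x - 3)**2 has derivative f'(x) = 2*(x - 3) and minimum at x = 3.
print(gradient_descent_1d(lambda x: 2 * (x - 3), x=0.0))  # ~3.0
```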
Example of 2D gradient: MATLAB demo

[Figure: illustration of the gradient in 2D]

Definition of the gradient in 2D: this is just a generalization of the derivative to two dimensions, and it can be generalized to any dimension.

Gradient descent works in 2D.
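As a concrete stand-in for the MATLAB demo (which is not reproduced here), the sketch below evaluates the 2D gradient of a sample quadratic bowl; the function choice is illustrative, not the demo's.

```python
import numpy as np

def f(p):
    x, y = p
    return (x - 1) ** 2 + 2 * (y + 2) ** 2       # bowl with minimum at (1, -2)

def grad_f(p):
    x, y = p
    return np.array([2 * (x - 1), 4 * (y + 2)])  # (df/dx, df/dy)

p = np.array([0.0, 0.0])
print(grad_f(p))   # the gradient points uphill; -grad_f(p) is the descent direction
```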
Generalization to multiple dimensions

Start with a point (guess): guess = x
Repeat:
- Determine a descent direction (downhill): direction = -∇f(x)
- Choose a step: step = h > 0
- Update: x := x - h ∇f(x)
Until the stopping criterion is satisfied: ∇f(x) ~ 0 (stop when close to the minimum)
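A minimal NumPy sketch of the multi-dimensional version (an illustration under assumed step size and tolerance, not the course's reference implementation):

```python
import numpy as np

def gradient_descent(grad, x0, h=0.1, tol=1e-6, max_iter=10_000):
    """Minimize f given its gradient, starting from the vector guess x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stopping criterion: ||grad f(x)|| ~ 0
            break
        x = x - h * g                 # update: x := x - h grad f(x)
    return x

# Example: f(x) = ||x - c||^2 has gradient 2*(x - c) and minimum at c.
c = np.array([1.0, -2.0, 3.0])
print(gradient_descent(lambda x: 2 * (x - c), np.zeros(3)))  # ~[1, -2, 3]
```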
Multiple dimensions

Everything that you have seen with derivatives can be generalized with the gradient.

For the descent method, f'(x) can be replaced by

∇f(x) = (∂f/∂x_1, ∂f/∂x_2)

in two dimensions, and by

∇f(x) = (∂f/∂x_1, ..., ∂f/∂x_N)

in N dimensions.
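When an analytic gradient is not available, each partial derivative ∂f/∂x_i can be approximated numerically; the sketch below (an addition of this write-up, not from the slides) uses central differences in N dimensions.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of grad f at x, one coordinate at a time."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)   # ~ df/dx_i
    return g

f = lambda x: x[0] ** 2 + 3 * x[1] ** 2
print(numerical_gradient(f, [1.0, 1.0]))   # ~[2, 6], matching (2*x1, 6*x2)
```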
Example of 2D gradient: MATLAB demo

The cost to buy a portfolio of N stocks is

cost(x) = p_1 x_1 + p_2 x_2 + ... + p_i x_i + ... + p_N x_N,

where x_i is the quantity of stock i and p_i its price.

If you want to minimize the price to buy your portfolio, you need to compute the gradient of its price:

∇cost(x) = (p_1, p_2, ..., p_N).
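For a linear cost like this, the gradient is simply the vector of prices; the numbers below are made up for illustration.

```python
import numpy as np

prices = np.array([10.0, 25.0, 4.0])      # hypothetical price p_i of each stock
quantities = np.array([3.0, 1.0, 10.0])   # hypothetical quantity x_i of each stock

cost = prices @ quantities                 # cost(x) = sum_i p_i * x_i
grad = prices                              # d(cost)/dx_i = p_i, so grad = (p_1, ..., p_N)
print(cost, grad)
```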
Problem 1: choice of the step

When updating the current point:
- small steps: inefficient; too many steps are needed, and convergence takes too long
- large steps: potentially bad results; the next point can overshoot and land past the minimum

[Figures: with a small step, the iterates inch along f(x) toward f(m); with a large step, the next point went too far, past the minimum]
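Both failure modes are easy to reproduce (a sketch with assumed values; the thresholds are specific to this function): on f(x) = x**2 the update is x := (1 - 2h)x, so a tiny h crawls, a moderate h converges quickly, and h > 1 diverges.

```python
def steps_to_converge(h, x=10.0, tol=1e-6, max_iter=100_000):
    """Count iterations of x := x - h * f'(x) on f(x) = x**2 (f'(x) = 2x)."""
    for k in range(max_iter):
        if abs(x) < tol:
            return k
        x = x - h * 2 * x
    return None   # did not converge within max_iter (diverged or too slow)

print(steps_to_converge(0.001))  # small step: thousands of iterations
print(steps_to_converge(0.4))    # moderate step: converges in a handful of steps
print(steps_to_converge(1.1))    # large step: overshoots and diverges -> None
```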
Problem 2: ping pong effect

[Figure: the iterates zig-zag back and forth across the level sets instead of heading straight for the minimum]

[S. Boyd, L. Vandenberghe, Convex Optimization lecture notes, Stanford Univ., 2004]
Problem 2: other norm-dependent issues

[S. Boyd, L. Vandenberghe, Convex Optimization lecture notes, Stanford Univ., 2004]
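The ping pong effect can be reproduced on an ill-conditioned quadratic; this sketch (illustrative, not taken from Boyd & Vandenberghe) prints iterates whose y-coordinate bounces across the narrow valley of f(x, y) = x**2 + 10*y**2.

```python
import numpy as np

grad = lambda p: np.array([2 * p[0], 20 * p[1]])   # gradient of f(x, y) = x**2 + 10*y**2

p, h = np.array([10.0, 1.0]), 0.09
for k in range(8):
    print(k, p)        # the y-coordinate flips sign each step: the "ping pong"
    p = p - h * grad(p)
```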
Problem 3: stopping criterion

Intuitive criterion: f'(x) ~ 0.

In multiple dimensions: ∇f(x) ~ 0, or equivalently ‖∇f(x)‖ ~ 0.

Rarely used in practice. More about this in EE227A (convex optimization, Prof. L. El Ghaoui).
Fixes

Several methods exist to address these problems:
- Line search methods, in particular:
  - Backtracking line search (sketched below)
  - Exact line search
- Normalized steepest descent
- Newton steps
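Among these fixes, backtracking line search is simple to sketch: shrink the step until the sufficient-decrease (Armijo) condition holds. The constants alpha and beta below are conventional illustrative choices, not values prescribed by the lecture.

```python
import numpy as np

def backtracking_step(f, grad_fx, x, h0=1.0, alpha=0.3, beta=0.5):
    """Return a step h satisfying f(x - h*g) <= f(x) - alpha*h*||g||^2."""
    g = grad_fx
    h = h0
    while f(x - h * g) > f(x) - alpha * h * np.dot(g, g):
        h *= beta          # step too large: shrink it and try again
    return h

f = lambda x: x[0] ** 2 + 10 * x[1] ** 2
grad = lambda x: np.array([2 * x[0], 20 * x[1]])

x = np.array([10.0, 1.0])
for _ in range(50):
    h = backtracking_step(f, grad(x), x)   # step size adapts at every iteration
    x = x - h * grad(x)
print(x)   # close to the minimum at the origin
```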
Fundamental problem of the method: local minima
Local minima: pic of the MATLAB demo

The iterations of the algorithm converge to a local minimum, which need not be the global one.

The view of the algorithm is myopic: it only sees the local slope, so whichever valley the initial guess sits in determines the answer (see the sketch below).
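To see the myopia concretely, run the same descent from two different guesses on a function with two valleys; this sketch (an illustration, not the MATLAB demo) converges to a different local minimum depending on the start.

```python
def descend(df, x, h=0.05, steps=500):
    """Plain gradient descent on a 1D function given its derivative df."""
    for _ in range(steps):
        x = x - h * df(x)
    return x

df = lambda x: 4 * x ** 3 - 4 * x   # derivative of f(x) = x**4 - 2*x**2, minima at +/-1

print(descend(df, x=-0.5))   # -> -1.0: the algorithm only follows the local slope...
print(descend(df, x=+0.5))   # -> +1.0: ...so the starting valley decides the answer
```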