
Introduction to Deep Learning

Review of Linear Regression


Example
• Problem: house selling price prediction
• Input: data about the areas and selling prices of 30 houses, as in Table 1 below:

Area (m²)    Selling price (million VND)
30           448.524
32.4138      509.248
34.8276      535.104
37.2414      551.432
39.6552      623.418
…            …

Table 1: Dataset for house selling price prediction

Visualization of Table 1 data

[Figure: relationship between house selling price (million VND) and house area (m²)]
Example (continued)
• Requirement: estimate the selling price of a 50 m² house
• Output: the estimated price?
Solution: Idea
• Draw a line closest to the data points and read off the house price at 50 m² on that line

[Figure: fitted line through the data, marking the estimated price of a 50 m² house]
Solution: Programming
• Step 1 - Training: find the line closest to the data points (called the model) → using the gradient descent algorithm
• Step 2 - Prediction: predict how much a 50 m² house will cost, based on the trained model
Formulating the model
• Model formula: y = w₁·x + w₀ → a linear model
• The problem becomes: find w₁ and w₀
• Represent the input data points as {(xᵢ, yᵢ), i = 1…30}, where yᵢ is the true selling price of house i
• Represent the estimated price at each point as ŷᵢ = w₁·xᵢ + w₀
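For concreteness, here is a minimal Python sketch of this model formula; the function name and the example parameter values are illustrative assumptions, not values from the slides:

```python
def predict(x, w1, w0):
    """Linear model: estimated price y_hat = w1 * x + w0."""
    return w1 * x + w0

# Hypothetical parameters, for illustration only
print(predict(50, 10.0, 150.0))  # estimated price of a 50 m² house
```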
Model training
• Random initialization of parameters: w₁ = 1, w₀ = 0 → the model becomes y = x
• Model fine-tuning

[Figure: difference between the true price and the estimated price at the data point x = 42 for the linear model y = x]
Model training
• Problem: the estimated price is too far from the true price, for example at the point x = 42 (see the figure above)
Model training
• We need a metric to evaluate the linear model with the parameter set (w₀, w₁) = (0, 1)
Loss Function
• For each data point (xᵢ, yᵢ), the difference between the actual price and the predicted price is:
  (1/2) · (ŷᵢ − yᵢ)²
• The difference across the entire dataset is the sum of the differences over all data points:
  J = (1/2) · (1/N) · Σᵢ₌₁ᴺ (ŷᵢ − yᵢ)²
  where N is the number of data points
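A minimal sketch of this loss in Python, assuming the data are held in NumPy arrays (the variable names are placeholders):

```python
import numpy as np

def loss(w1, w0, x, y):
    """J = (1/2) * (1/N) * sum((y_hat - y)^2)."""
    y_hat = w1 * x + w0               # model predictions
    return np.mean((y_hat - y) ** 2) / 2

# First five data points from Table 1, for illustration
x = np.array([30, 32.4138, 34.8276, 37.2414, 39.6552])
y = np.array([448.524, 509.248, 535.104, 551.432, 623.418])
print(loss(1.0, 0.0, x, y))  # loss of the initial model y = x
```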
Loss Function
  J = (1/2) · (1/N) · Σᵢ₌₁ᴺ (ŷᵢ − yᵢ)²
• J ≥ 0
• The smaller J is, the closer the model is to the actual data points
• If J = 0, the model passes through all the data points

→ J is called the loss function
Loss Function
  J = (1/2) · (1/N) · Σᵢ₌₁ᴺ (ŷᵢ − yᵢ)²
• The problem changes from: finding the linear model y = w₁·x + w₀ closest to the data points
• → to: finding the parameters (w₀, w₁) for which J reaches its minimum value
→ Use the gradient descent algorithm to find the minimum value of J (a sketch follows)
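Differentiating J with respect to each parameter gives ∂J/∂w₁ = (1/N)·Σ(ŷᵢ − yᵢ)·xᵢ and ∂J/∂w₀ = (1/N)·Σ(ŷᵢ − yᵢ). A minimal sketch of gradient descent on J under those formulas; the gradient descent algorithm itself is reviewed on the next slide, and the learning rate and epoch count here are illustrative assumptions:

```python
import numpy as np

def train(x, y, learning_rate=1e-4, epochs=1000):
    """Fit y = w1*x + w0 by gradient descent on J."""
    w1, w0 = 1.0, 0.0                 # initial model y = x, as on the slides
    for _ in range(epochs):
        y_hat = w1 * x + w0
        error = y_hat - y             # (y_hat_i - y_i) for every point
        w1 -= learning_rate * np.mean(error * x)  # dJ/dw1
        w0 -= learning_rate * np.mean(error)      # dJ/dw0
    return w1, w0
```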
Gradient Descent Algorithm
• Idea: use the derivative to find the minimum value of a function f(x)
• Algorithm:
  (1) Random initialization: x = x₀
  (2) Update: x = x − learning_rate · f′(x)
  (3) Recompute f(x); stop if f(x) is small enough, otherwise repeat step (2)
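The three steps transcribe directly into Python. This generic sketch takes f and f′ as inputs; the tolerance and step-count defaults are assumptions. Note that stopping when f(x) is small only makes sense when the minimum value is near 0, as for f(x) = x²:

```python
def gradient_descent(f, f_prime, x0, learning_rate=0.1, tol=1e-6, max_steps=1000):
    """Minimize f(x) by repeatedly stepping against the derivative f'(x)."""
    x = x0                                  # step (1): initialization
    for _ in range(max_steps):
        x = x - learning_rate * f_prime(x)  # step (2): update
        if f(x) < tol:                      # step (3): stop if f(x) is small enough
            break
    return x

# Example: minimum of f(x) = x**2 starting from x = -2 (point A on a later slide)
print(gradient_descent(lambda x: x**2, lambda x: 2*x, x0=-2.0))
```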
Gradient Descent Algorithm
Note:
• learning_rate is non-negative constant
• step 2 will be repeated until a large enough
number of times or f(x) is small enough

19
Gradient Descent: example
Problem: find the minimum value of f(x) = x² using the gradient descent algorithm
• Solution 1: solve analytically by setting f′(x) = 0
• Solution 2: use the gradient descent algorithm
Gradient Descent: example
• Step 1: random initialization x = −2 (point A in the figure)
• Step 2: compute f′(x), then update x = x_A − learning_rate · f′(x_A)
• Step 3: compute f(x) → still large → move to point C, and repeat step 2
Gradient Descent: example
In detail: if we choose the initial value x = 10 and learning_rate = 0.1, the values computed by steps 2 and 3 are as in the following table:

Iteration    x       f(x)
1            8.00    64.00
2            6.40    40.96
3            5.12    26.21
4            4.10    16.78
5            3.28    10.74
6            2.62    6.87
7            2.10    4.40
8            1.68    2.81
9            1.34    1.80
10           1.07    1.15

Table 2: values of x and f(x) after 10 repetitions of step 2
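These numbers can be reproduced directly from the slide's settings: f(x) = x², so f′(x) = 2x, with initial x = 10 and learning_rate = 0.1:

```python
f = lambda x: x ** 2
f_prime = lambda x: 2 * x

x = 10.0
for step in range(1, 11):
    x = x - 0.1 * f_prime(x)                  # step (2) of the algorithm
    print(step, round(x, 2), round(f(x), 2))  # matches Table 2 row by row
```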
Gradient Descent: example

[Figure: visualization of Table 2]
Effect of Learning Rate Selection

[Figure: three cases — learning rate too small (step 2 must be calculated many times), learning rate chosen properly, and learning rate too large (overshoot problem)]
Effect of Learning Rate Selection
• Epoch: the number of times step 2 is run
• Loss: the function whose minimum value we are finding

[Figure: loss versus epoch for different learning rates]
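To make the three cases concrete, here is a small sketch on f(x) = x²; the specific rate values are illustrative assumptions:

```python
f_prime = lambda x: 2 * x

for lr in (0.1, 0.5, 1.1):        # small, well chosen, too large
    x = 10.0
    for _ in range(10):
        x = x - lr * f_prime(x)   # step (2)
    print(f"learning_rate={lr}: x after 10 updates = {x:.2f}")
# 0.1 shrinks x gradually, 0.5 jumps to the minimum at once, 1.1 diverges (overshoot)
```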
Practical Work 01
• Set up a Python environment
  – Anaconda
  – or virtualenv
  – …
• Write Python code to plot the data from the gradient descent example, repeated in the table below (a plotting sketch follows the table)
Practical Work 01

Iteration    x       f(x)
1            8.00    64.00
2            6.40    40.96
3            5.12    26.21
4            4.10    16.78
5            3.28    10.74
6            2.62    6.87
7            2.10    4.40
8            1.68    2.81
9            1.34    1.80
10           1.07    1.15
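A minimal matplotlib sketch for this exercise; the axis labels and styling are assumptions:

```python
import matplotlib.pyplot as plt

iterations = list(range(1, 11))
f_values = [64.00, 40.96, 26.21, 16.78, 10.74, 6.87, 4.40, 2.81, 1.80, 1.15]

plt.plot(iterations, f_values, marker="o")
plt.xlabel("Iteration of step 2")
plt.ylabel("f(x)")
plt.title("Gradient descent on f(x) = x^2")
plt.show()
```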