Lecture 05

The document discusses the application of linear regression in mechanical engineering, highlighting the relationships between dependent and independent variables, as well as the concepts of bias and variance in model fitting. It covers tools and libraries for machine learning, such as Anaconda, Pandas, and scikit-learn, and emphasizes the importance of regularization techniques like Ridge and Lasso to prevent overfitting and underfitting. Additionally, it outlines the process of data splitting for training, validation, and testing in machine learning models.


AI for Mechanical Engineering

Dr. Arsalan Arif

Textbook: Artificial Intelligence: A Modern Approach, Stuart J. Russell and Peter Norvig

Spring 2024
Linear Regression

Regression relates a dependent variable to an independent variable.
Weight gain vs. intake of food: the dependent variable increases as the independent variable increases (positive relationship).
Weight gain vs. expenditure: the dependent variable decreases as the independent variable increases (negative relationship).

Regression line: the line that minimizes the difference (error) between the estimated and actual values.

(Figure: scatter plots of a positive and a negative relationship, with the dependent variable on the y-axis and the independent variable on the x-axis.)

The linear regression model has the same form as the straight line y = mx + C:

$$\hat{y} = b_0 + b_1 x_1$$

$$\hat{y} = b_0 + b_1 x_1$$

where b₁, the slope of the line, is

$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{6}{10} = 0.6$$

The fitted line passes through the point of means (x̄, ȳ) = (3, 4), so

$$4 = b_0 + 0.6 \times 3 \quad\Rightarrow\quad b_0 = 2.2$$
x   y   x − x̄      y − ȳ      (x − x̄)²   (x − x̄)(y − ȳ)
1   2   1−3 = −2   2−4 = −2   4           4
2   4   2−3 = −1   4−4 = 0    1           0
3   5   3−3 = 0    5−4 = 1    0           0
4   4   4−3 = 1    4−4 = 0    1           0
5   5   5−3 = 2    5−4 = 1    4           2

x̄ = 3, ȳ = 4, Σ(x − x̄)² = 10, Σ(x − x̄)(y − ȳ) = 6

(Figure: scatter plot of the five (x, y) points.)
Linear Regression

The goodness of fit is measured by the coefficient of determination:

$$R^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{3.6}{6} = 0.6$$

The range of R² is from 0 to 1; R² = 0.6 means it's a good fit.

$$\hat{y} = 2.2 + 0.6\,x$$

(Figure: the fitted line ŷ = 2.2 + 0.6x drawn through the five data points.)

x   y   x − x̄   y − ȳ   (y − ȳ)²   ŷ     ŷ − ȳ   (ŷ − ȳ)²   (x − x̄)²   (x − x̄)(y − ȳ)
1   2   −2      −2      4          2.8   −1.2    1.44       4          4
2   4   −1       0      0          3.4   −0.6    0.36       1          0
3   5    0       1      1          4.0    0      0          0          0
4   4    1       0      0          4.6    0.6    0.36       1          0
5   5    2       1      1          5.2    1.2    1.44       4          2

x̄ = 3, ȳ = 4, Σ(y − ȳ)² = 6, Σ(ŷ − ȳ)² = 3.6, Σ(x − x̄)² = 10, Σ(x − x̄)(y − ȳ) = 6
Bias: Underfitting
Bias is the gap between the actual and estimated values.
High bias means the estimated value is far from the actual value, and vice versa.
It occurs when the algorithm has limited flexibility to learn: it pays less attention to the training data and oversimplifies the model.
Such models always lead to high error on both training and test data.

Variance: Overfitting
Variance describes how scattered the estimated values are.
A model with high variance pays too much attention to the training data and doesn't generalize to unseen data.
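To make the contrast concrete, here is a minimal sketch (not from the slides; the synthetic data and polynomial degrees 1 and 15 are illustrative assumptions) comparing an underfit and an overfit model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, 40).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.2, 40)  # noisy nonlinear target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=0)

for degree in (1, 15):  # degree 1: high bias; degree 15: high variance
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    train_err = mean_squared_error(y_tr, model.predict(X_tr))
    test_err = mean_squared_error(y_te, model.predict(X_te))
    print(f"degree {degree}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

The degree-1 model has high error on both sets (high bias), while the degree-15 model has near-zero training error but much larger test error (high variance).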
Anaconda

Anaconda is an open-source (free) Python distribution for machine learning, with tools such as Spyder.
Jupyter Notebook is a platform for interactive coding.
Google Colab provides free GPU access online; paid tiers are faster, with more cores for parallel computation.
Key libraries: the Pandas Application Programming Interface (API) and Numerical Python (NumPy).

Pandas is an open-source Python library for data analysis, providing high performance.

Read a "csv" file with read_csv().

head() shows the first 5 samples by default; the number of rows can be chosen by passing the required count.

describe() gives data visibility through summary statistics, for example:
the 25% entry means 25% of your data is less than or equal to 7;
the 50% entry means 50% of your data is less than or equal to 7.5.
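A minimal pandas sketch of these steps (the file name "data.csv" is a placeholder, not from the slides):

```python
import pandas as pd

df = pd.read_csv("data.csv")  # read a CSV file into a DataFrame

print(df.head())      # first 5 rows by default
print(df.head(10))    # or choose the required number of rows

print(df.describe())  # count, mean, std, min, 25%/50%/75% percentiles, max
```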

Split the Data

Training, Validation and Testing
Test size = 40% of the data.
Call the scikit-learn (sklearn) library for data splitting.
Train the system and validate it.
Separate some percentage of the data (unseen by the system) for testing, then report the results on the test data, as in the sketch below.
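A minimal scikit-learn sketch of this split (the data here is a placeholder): 60% for training, with the held-out 40% divided equally into validation and test sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples, 3 features (illustrative only)
X = np.random.rand(100, 3)
y = np.random.rand(100)

# First split: hold out 40% of the data (test_size = 0.4)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.4, random_state=42)

# Second split: divide the held-out 40% in half,
# giving 20% validation and 20% test
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, random_state=42)

# Train on X_train, tune on X_val, and report final
# results only on the unseen X_test.
```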
Regularization

Calibrate regression models to prevent under- or overfitting, minimizing an adjusted loss function.

Ridge regularization: uses the L2 norm.
Lasso regularization: uses the L1 norm.

$$\text{Cost Fun} = \text{Loss} + \lambda w^2 \;(\text{Ridge}), \qquad \text{Cost Fun} = \text{Loss} + \lambda \lvert w \rvert \;(\text{Lasso})$$

Loss = sum of the squared errors
λ = penalty for errors
w = slope of the curve

Worked example (Ridge, λ = 1), comparing two candidate lines over the same points:

Steeper line: Loss = 0, w = 1.4, so Cost Fun = 0 + 1 × (1.4)² = 1.96
Flatter line: Loss = 0.12, w = 0.7, so Cost Fun = 0.12 + 1 × (0.7)² = 0.61

The regularized cost prefers the flatter line (0.61 < 1.96) even though its training loss is higher, which is how regularization counteracts overfitting.

(Figure: the two candidate regression lines plotted over the data points.)
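A minimal scikit-learn sketch (not from the slides) fitting both penalized models to the earlier five points; note that scikit-learn names the penalty weight alpha rather than λ:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

X = np.array([[1], [2], [3], [4], [5]], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty on the slope
lasso = Lasso(alpha=1.0).fit(X, y)  # L1 penalty on the slope

print("ridge slope:", ridge.coef_[0], "intercept:", ridge.intercept_)
print("lasso slope:", lasso.coef_[0], "intercept:", lasso.intercept_)
# Both penalized slopes are shrunk toward zero relative to the plain
# least-squares slope of 0.6; Lasso can drive slopes exactly to zero.
```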