Lecture 3

Machine learning
Linear regression
Dr. Darkhan Zholtayev
Assistant professor at Department of Computational and Data Science
[email protected]
Topics to cover
• What is regression
• Linear regression
• Least squares error
General graph

AI map

Joseph, B. (2020, June 17). Linear Regression Made Easy: How Does It Work and How to Use It in Python. Towards Da
Science. https://towardsdatascience.com/linear-regression-made-easy-how-does-it-work-and-how-to-use-it-in-pytho
Linear Regression
• Technique used for the modeling and analysis of
numerical data
• Exploits the relationship between two or more variables
so that we can gain information about one of them
through knowing values of the other
• Regression can be used for prediction, estimation,
hypothesis testing, and modeling causal relationships
Problem
Data

Xie, Y. (2013). Lecture 11: Simple Linear Regression. H. Milton Stewart School of Industrial
and Systems Engineering, Georgia Institute of Technology. Retrieved from
Data

Data
Linear Regression
Linear regression
Linear regression

Linear regression: different forms
Linear regression
Linear regression
Linear regression
Estimate regression parameters
Method of least squares
Least squares estimates
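The formulas on this slide did not survive extraction. As a minimal sketch, assuming the standard textbook closed-form estimates for simple linear regression (slope = S_xy / S_xx, intercept = mean of y minus slope times mean of x), the least squares fit can be computed as below. The function name `least_squares_fit` and the sample data are illustrative assumptions, not taken from the lecture.

```python
def least_squares_fit(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # S_xy: sum of cross-deviations; S_xx: sum of squared deviations of x
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical data scattered around y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
intercept, slope = least_squares_fit(x, y)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

For this made-up data the fitted slope is close to 2 and the intercept close to 1, as expected from how the points were chosen.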

Alternative notation
Example: oxygen and hydrocarbon level
Calculation 2
Calculation
Interpretation of regression model
Estimation of variance
Summary

Example
• import pandas as pd # for data manipulation
import numpy as np # for data manipulation
from sklearn.linear_model import LinearRegression # for creating a model
import plotly.graph_objects as go # for visualizations
import plotly.express as px # for visualizations

• # Read data into a Pandas DataFrame

df = pd.read_csv('Real estate.csv', encoding='utf-8')

# Print DataFrame
df
Joseph, B. (2020, June 17). Linear Regression Made Easy: How Does It Work and How to Use It in Python. Towards Data Science.
https://towardsdatascience.com/linear-regression-made-easy-how-does-it-work-and-how-to-use-it-in-python-be0799d2f159
Data

Code 1
• # Create a scatter plot
fig = px.scatter(df, x=df['X3 distance to the nearest MRT station'], y=df['Y house price of unit area'],
opacity=0.8, color_discrete_sequence=['black'])

# Change chart background color

fig.update_layout(dict(plot_bgcolor = 'white'))

# Update axes lines

fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey',
zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey',
showline=True, linewidth=1, linecolor='black')

fig.update_yaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey',

zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey',
showline=True, linewidth=1, linecolor='black')

# Set figure title

fig.update_layout(title_text="Scatter Plot")

# Update marker size

fig.update_traces(marker=dict(size=3))

fig.show()
Scatter plot

Training
• # Select variables that we want to use in a model
# Note, we need X to be a 2D array, hence reshape
X=df['X3 distance to the nearest MRT station'].values.reshape(-1,1)
y=df['Y house price of unit area'].values

# Fit linear regression model

model = LinearRegression()
reg = model.fit(X, y)

# Print the slope and intercept of the best-fit line

print(reg.coef_)
print(reg.intercept_)

Code 2
• # We will use below to draw a best-fit line on a chart
# Create 20 evenly spaced points from smallest X to largest X
x_range = np.linspace(X.min(), X.max(), 20)

# Predict y values for our set of X values

y_range = model.predict(x_range.reshape(-1, 1))

# Create a scatter plot

fig = px.scatter(df, x=df['X3 distance to the nearest MRT station'], y=df['Y house price of unit area'],
opacity=0.8, color_discrete_sequence=['black'])

# Add a best-fit line

fig.add_traces(go.Scatter(x=x_range, y=y_range, name='Regression Fit'))

# Change chart background color

fig.update_layout(dict(plot_bgcolor = 'white'))

# Update axes lines

fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey',
zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey',
showline=True, linewidth=1, linecolor='black')

fig.update_yaxes(showgrid=True, gridwidth=1, gridcolor='lightgrey',

zeroline=True, zerolinewidth=1, zerolinecolor='lightgrey',
showline=True, linewidth=1, linecolor='black')

# Set figure title

fig.update_layout(title_text="Scatter Plot with Linear Regression Line")

# Update marker size

fig.update_traces(marker=dict(size=3))

fig.show()

Prediction line
Multiple linear regression
• # Select variables that we want to use in a model
# Note, X in this case is already a 2D array, hence no reshape
X=df[['X3 distance to the nearest MRT station','X2 house age']]
y=df['Y house price of unit area'].values

# Fit linear regression model
model = LinearRegression()
reg = model.fit(X, y)

# Print slope(s) and intercept

print(reg.coef_)
print(reg.intercept_)
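As a rough illustration of what the fit computes with two predictors, the sketch below solves the same least squares problem directly with NumPy's `np.linalg.lstsq` instead of scikit-learn. The data are hypothetical (generated from a known plane, not read from the real-estate file), and the variable names `distance` and `age` are assumptions chosen to echo the two columns used above.

```python
import numpy as np

# Hypothetical data: the target follows
# y = 40 - 0.005 * distance - 0.2 * age exactly (not the real-estate file)
distance = np.array([100.0, 500.0, 1000.0, 2000.0, 4000.0, 300.0])
age = np.array([5.0, 10.0, 20.0, 15.0, 30.0, 2.0])
y = 40.0 - 0.005 * distance - 0.2 * age

# Design matrix: a column of ones for the intercept plus one column per feature
A = np.column_stack([np.ones_like(distance), distance, age])

# Solve the least squares problem: minimize ||A b - y||^2 over b
b, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slopes = b[0], b[1:]
print(intercept, slopes)
```

Because the synthetic target has no noise, the recovered intercept and the two slopes match the generating values up to floating-point error; with real data they would be estimates.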
Multiple linear regression: Python example

Basic statistics
• The sample mean, $\bar{X}$, is the sum of all the observations ($\sum X_i$) divided by the number of observations (n):
$\bar{X} = \frac{\sum X_i}{n}$, where $\sum X_i = X_1 + X_2 + X_3 + X_4 + \dots + X_n$

• Example. 1, 2, 2, 4, 5, 10. Calculate the mean. Note: n = 6 (six observations)
$\sum X_i = 1 + 2 + 2 + 4 + 5 + 10 = 24$
$\bar{X} = 24 / 6 = 4.0$
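The same calculation can be checked in a few lines of Python. This is a small sketch that uses the slide's six observations; the variable names are illustrative.

```python
from statistics import mean

data = [1, 2, 2, 4, 5, 10]
total = sum(data)   # sum of all observations
n = len(data)       # number of observations
x_bar = total / n   # sample mean
print(total, n, x_bar)  # 24 6 4.0

# The standard-library helper gives the same value
assert x_bar == mean(data)
```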
The median
The median is the middle value of the ordered data.

To get the median, we must first rearrange the data into an ordered array (in ascending or descending order). Generally, we order the data from the lowest value to the highest value.
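A minimal sketch of the procedure: sort first, then take the middle value. The slide does not say what to do with an even number of observations; the code assumes the usual convention of averaging the two middle values. The helper name `median_by_sorting` and both data sets are illustrative.

```python
from statistics import median


def median_by_sorting(values):
    ordered = sorted(values)  # ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count (assumed convention): average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd = [5, 1, 3, 1, 2, 4, 1]
even = [10, 2, 1, 5, 4, 2]
print(median_by_sorting(odd))   # 2
print(median_by_sorting(even))  # 3.0
```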
The mode
• The mode is the value of the data that occurs with the
greatest frequency.

Example. 1, 1, 1, 2, 3, 4, 5
Answer. The mode is 1 since it occurs three times. The other values
each appear only once in the data set.

Example. 5, 5, 5, 6, 8, 10, 10, 10.

Answer. The mode is: 5, 10.
There are two modes. This is a bi-modal dataset.
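Both examples can be reproduced by counting occurrences. The sketch below uses `collections.Counter`; the helper name `modes` is an illustrative choice, and it returns every value that ties for the highest count so that a bimodal data set yields two modes.

```python
from collections import Counter


def modes(values):
    counts = Counter(values)    # value -> number of occurrences
    top = max(counts.values())  # the greatest frequency
    # Keep every value that reaches the greatest frequency
    return sorted(v for v, c in counts.items() if c == top)


print(modes([1, 1, 1, 2, 3, 4, 5]))        # [1]
print(modes([5, 5, 5, 6, 8, 10, 10, 10]))  # [5, 10]
```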
Standard deviation

• The standard deviation, s, measures a kind of “average” deviation about the mean. It is not really the “average” deviation, even though we may think of it that way.

• Why can’t we simply compute the average deviation about the mean, if that’s what we want?

• If you take a simple mean and then add up the deviations about the mean, $\sum (X_i - \bar{X})$, this sum will be equal to 0. Therefore, a measure of “average deviation” will not work.
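A quick numerical check of this claim, reusing the six observations from the mean example (the variable names are illustrative):

```python
data = [1, 2, 2, 4, 5, 10]
x_bar = sum(data) / len(data)  # 4.0

# Deviation of each observation from the mean
deviations = [x - x_bar for x in data]
print(deviations)       # [-3.0, -2.0, -2.0, 0.0, 1.0, 6.0]
print(sum(deviations))  # 0.0
```

The negative and positive deviations cancel exactly, which is why the plain average of the deviations carries no information about spread.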
Standard Deviation
• Instead, we use:

$$ s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} $$

• This is the “definitional formula” for standard deviation.

• The standard deviation has lots of nice properties, including:
• By squaring the deviations, we eliminate the problem of the deviations summing to zero.
• In addition, this sum is a minimum. No other value subtracted from X and squared will result in a smaller sum of squared deviations. This is called the “least squares property.”
• Note that we divide by (n-1), not n. This will be referred to as a loss of one degree of freedom.
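The sketch below computes s with the (n-1) divisor for the six observations used in the mean example and illustrates the least squares property by comparing the sum of squared deviations about the mean with the sums about two other values. The helper name `sum_sq_about` and the comparison values 3 and 5 are illustrative assumptions.

```python
import math
from statistics import stdev

data = [1, 2, 2, 4, 5, 10]
n = len(data)
x_bar = sum(data) / n  # 4.0


def sum_sq_about(c):
    """Sum of squared deviations of the data about the value c."""
    return sum((x - c) ** 2 for x in data)


ss = sum_sq_about(x_bar)     # sum of squared deviations about the mean
s = math.sqrt(ss / (n - 1))  # divide by n - 1, not n
print(ss, round(s, 4))       # 54.0 3.2863

# Least squares property: any other centre gives a larger sum
print(sum_sq_about(3), sum_sq_about(5))  # 60 60
```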
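Variance
The variance, $s^2$, is the standard deviation (s) squared.
Conversely, $s = \sqrt{s^2}$.

Definitional formula: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
Computational formula: $s^2 = \frac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}$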
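The slide's two formulas were lost in extraction. As a sketch, assuming the standard sample-variance forms (squared deviations about the mean versus the shortcut based on the sum of squares and the squared sum), both can be checked to agree on the six observations from the mean example:

```python
import math
from statistics import variance

data = [1, 2, 2, 4, 5, 10]
n = len(data)
x_bar = sum(data) / n

# Definitional form: squared deviations about the mean, divided by n - 1
definitional = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Computational form: uses only the sum of x and the sum of x squared
computational = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

print(definitional, computational)  # 10.8 10.8
```

The computational form avoids computing each deviation, which made it convenient for hand calculation; both give the square of the standard deviation.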
Thank you for your attention
