Machine Learning: Pradyumn Sharma Pragati Software Pvt. LTD

The document provides an overview of a 5-day machine learning training program. It includes an agenda that covers topics like linear regression, logistic regression, decision trees, clustering, and model deployment. It also discusses prerequisites like statistics and linear algebra. Finally, it introduces Google Colab as the development environment and libraries like Pandas and scikit-learn that will be used in the training.

Uploaded by

Surabhi Kulkarni

Machine Learning

Pradyumn Sharma
Pragati Software Pvt. Ltd.
www.pragatisoftware.com

Pragati Software Pvt. Ltd., 312, Lok Center, Marol-Maroshi Road, Marol, Andheri (East), Mumbai 400 059. www.pragatisoftware.com 1
Program Contents
1. Day one
   1. Essential concepts and terminology
   2. Understanding linear regression
   3. Hypothesis, cost function
   4. Gradient Descent, learning rate
   5. Essentials of the numpy, pandas, matplotlib libraries
   6. Training and test dataset split
2. Day two
   1. Predictive modelling
   2. Using the Stochastic Gradient Descent (SGD) regressor
   3. Tweaking the SGD regressor
   4. R-square: coefficient of determination
   5. Making predictions
   6. Feature scaling
3. Day three
   1. Polynomial regression
   2. Logistic regression (classification)
   3. Classification report
4. Day four
   1. Decision tree
   2. K-nearest neighbors
   3. Ensemble techniques
5. Day five
   1. Unsupervised learning: Clustering
   2. K-means
   3. Anomaly detection
   4. Deploying ML Models
Machine Learning Prerequisites

• Machine Learning is more about mathematics than about Python or any libraries. More specifically, it calls for familiarity with at least some basics of:
   - Statistics, including probability
   - Linear algebra
   - Calculus
• https://www.youtube.com/watch?v=tGyfmzuR4d4&t=5s

What is Machine Learning?

• Giving computers the ability to learn without being explicitly programmed with some knowledge. –Arthur Samuel, 1959.
• Computer algorithms that autonomously learn from data.

Conventional Programming vs Machine Learning

[Diagram: Rules + Input → Conventional Programming → Result]

Rule: F = (9 * C / 5) + 32
Input: 20
Output: 68

Conventional Programming vs Machine Learning

[Diagram: Rules + Input → Conventional Programming → Result]
[Diagram: Input + Result → Machine Learning → Rules]

Rule: F = (9 * C / 5) + 32
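
The contrast can be made concrete with a small sketch. Here the rule F = (9 * C / 5) + 32 is never written into the program; scikit-learn's LinearRegression estimates it from a handful of input/result pairs. The sample temperatures and variable names are illustrative assumptions, not part of the original slides.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Inputs (Celsius) and known results (Fahrenheit); illustrative sample values.
celsius = np.array([[-40.0], [0.0], [20.0], [37.0], [100.0]])
fahrenheit = np.array([-40.0, 32.0, 68.0, 98.6, 212.0])

# The learning algorithm recovers the rule from the data.
rule = LinearRegression().fit(celsius, fahrenheit)
slope = rule.coef_[0]          # close to 9 / 5 = 1.8
intercept = rule.intercept_    # close to 32

print(f"Learned rule: F = {slope:.2f} * C + {intercept:.2f}")
print("Prediction for 20 C:", rule.predict(np.array([[20.0]]))[0])
```

Because the sample data follows the rule exactly, the learned slope and intercept match it up to floating-point error; with noisy real-world data the learned rule would only approximate the underlying relationship.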

Machine Learning: Some More Examples

• Making predictions: consumer behavior, market behavior

• Medical diagnostics
• Spam filters
• Face recognition in Facebook
• Troll detection system
• Robots that learn by observing human actions
• Program to play chess
• Chatbots
• Autonomous vehicles
• Machine translation
• Natural language processing

Machine Learning Technologies

• Python with scikit-learn
• Python with TensorFlow
• Python with PyTorch
• R programming language
• MATLAB, Octave
• Lex (Amazon), LUIS (Microsoft), Watson (IBM), Wit.ai (Facebook)
• and many more
• Development and deployment environments
   - Jupyter Notebook
   - Google Colaboratory
   - Google AutoML, Amazon SageMaker, Azure Machine Learning

Technologies Used in This Program

• Python
• General libraries: numpy, pandas, matplotlib
• ML library: scikit-learn
• Development environment: Google Colab

Introduction to Google Colaboratory

• Google Colaboratory, or "Colab", allows you to write and run Python programs in the browser. Benefits:
   - Zero configuration required
   - Free access to GPUs
   - Easy sharing
• Limitation: memory is limited in the free version; the paid versions have higher limits.

Setting Up Google Colab

Pandas

• Pandas (pandas.pydata.org) is an open-source library that provides data structures and data analysis tools.
• 10 minutes to Pandas: http://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html

scikit-learn

• An open-source machine learning library for Python.
• https://scikit-learn.org/stable/index.html

Loading and Examining Data

Rentals in Andheri East, Mumbai

Source: www.housing.com
27-Oct-2017

Rentals in Andheri East, Mumbai

Loading the Data from Google Drive

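import pandas as pd
from google.colab import drive

# Mount Google Drive so that its files are visible under /content/drive
drive.mount('/content/drive')

# Read the rentals CSV file into a DataFrame
full_data = pd.read_csv("/content/drive/MyDrive/rentals.csv")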

Viewing Data

• print (full_data)
• full_data

Some Statistics About Data

• full_data.describe ()

Standard Deviation of a Sample

• It is a statistical measure of the dispersion of a group of values around their mean.
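
As a sketch of the calculation, the sample standard deviation is the square root of the sum of squared deviations from the mean divided by (n - 1). The data values below are illustrative assumptions. The example also shows a practical detail: pandas uses the sample formula by default, while NumPy needs ddof=1 to match it.

```python
import numpy as np
import pandas as pd

values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative data

mean = values.mean()
# Sample standard deviation: divide the sum of squared deviations by (n - 1).
sample_sd = np.sqrt(((values - mean) ** 2).sum() / (len(values) - 1))

# pandas uses the sample formula (ddof=1) by default;
# NumPy defaults to the population formula (ddof=0) unless ddof=1 is passed.
pandas_sd = pd.Series(values).std()
numpy_sd = values.std(ddof=1)

print(sample_sd, pandas_sd, numpy_sd)
```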

Percentiles

• 90th percentile = 90% of the data elements have lower values

• 99th percentile = 99% of the data elements have lower values
• 75th percentile = 75% of the data elements have lower values;
also called the 3rd quartile
• 50th percentile = 50% of the data elements have lower values;
also called the 2nd quartile, or median
• 25th percentile = 25% of the data elements have lower values;
also called the 1st quartile
• Interquartile range: 3rd quartile – 1st quartile. This indicates
the spread of data.
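
A minimal sketch of these definitions using NumPy, on illustrative data (the integers 1 to 101, chosen so that the quartiles fall on whole numbers). NumPy interpolates linearly between data points by default, so results on other data may fall between two observed values.

```python
import numpy as np

values = np.arange(1, 102)  # illustrative data: the integers 1 to 101

# 1st quartile, median (2nd quartile) and 3rd quartile
q1, median, q3 = np.percentile(values, [25, 50, 75])

# Interquartile range: spread of the middle half of the data
iqr = q3 - q1

p90 = np.percentile(values, 90)
print(q1, median, q3, iqr, p90)
```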

The Normal Distribution

The Normal (Distribution) Curve

Many Distributions Follow the Normal Curve

• Examples : Height, weight, IQ levels.

• But not: income, house prices.

Normal Distribution Table

• About 68% of values fall within 1 standard deviation of the mean.
• About 95% of values fall within 2 standard deviations of the mean (to be more precise, 1.96 standard deviations).
• About 99.7% of values fall within 3 standard deviations of the mean.

Normal Distribution

• Suppose that, in a random sample from a population, the statistics for the height of men are found to be:
   - mean: 67.8 inches
   - standard deviation: 1.6 inches
• Then, assuming that heights are normally distributed, we can say that
   - about 68% of men have a height between 66.2 and 69.4 inches
   - about 95% of men have a height between 64.6 and 71 inches
   - about 99.7% of men have a height between 63 and 72.6 inches
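
The intervals above can be reproduced with a few lines of arithmetic. This sketch also computes the coverage of each interval from the normal distribution, using the identity P(|X - mean| < k * sd) = erf(k / sqrt(2)); the variable names are illustrative.

```python
import math

mean, sd = 67.8, 1.6  # mean and standard deviation from the example (inches)

results = []
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    # For a normal distribution, P(|X - mean| < k * sd) = erf(k / sqrt(2)).
    coverage = math.erf(k / math.sqrt(2))
    results.append((k, low, high, coverage))
    print(f"within {k} sd: {low:.1f} to {high:.1f} inches, about {coverage:.1%}")
```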

Distinct Values and Their Count

full_data['bedrooms'].value_counts()

Correlations Among Variables

• full_data.corr ()
• Provides a measure of correlation between various variables.
• Value close to 1 => strong positive correlation
• Value close to -1 => strong negative correlation
• Value close to 0 => no linear correlation
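
A short sketch of reading a correlation matrix, using a small hypothetical DataFrame in place of the rentals file (the values are invented for illustration; only the column names follow this program). The numeric_only argument is available in pandas 1.5 and later and skips any text columns.

```python
import pandas as pd

# Hypothetical rental records; the column names mirror those used in this program.
sample = pd.DataFrame({
    "area": [400, 550, 650, 800, 1000, 1200],
    "bedrooms": [1, 1, 2, 2, 3, 3],
    "rent": [18000, 24000, 30000, 38000, 52000, 60000],
})

# Pearson correlation between every pair of numeric columns.
corr = sample.corr(numeric_only=True)
print(corr.round(2))

# Correlation of each predictor with the target, strongest first.
print(corr["rent"].drop("rent").sort_values(ascending=False))
```

Sorting one column of the matrix, as in the last line, is a quick way to see which input variables move most closely with the target.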

Box Plot

full_data.area.plot(kind='box')

Box Plot

full_data.plot(kind='box', subplots=True, layout=(2, 2))

Box Plot

full_data[['area', 'rent']].plot(kind='box', subplots=True,
                                 layout=(2, 2), figsize=(10, 6))

Outliers

• Upper value outliers: values greater than 75th percentile + 1.5 * IQR
• Lower value outliers: values less than 25th percentile - 1.5 * IQR
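
The two rules can be applied directly with pandas. The sketch below uses a hypothetical series of areas in which one value is deliberately extreme; the names lower_fence and upper_fence are illustrative.

```python
import pandas as pd

# Hypothetical areas in square feet; 5000 is deliberately extreme.
area = pd.Series([450, 500, 550, 600, 650, 700, 750, 800, 5000])

q1 = area.quantile(0.25)   # 25th percentile
q3 = area.quantile(0.75)   # 75th percentile
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Keep only the values that fall outside the fences.
outliers = area[(area < lower_fence) | (area > upper_fence)]
print(lower_fence, upper_fence, outliers.tolist())
```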

Histogram

full_data.area.hist()

Histogram

full_data['area'].hist(bins=25)

Scatter Plot

full_data.plot(kind='scatter',
x = 'area', y = 'rent')

Scatter Matrix

from pandas.plotting import scatter_matrix

scatter_matrix(full_data)

Saving a Diagram

import matplotlib.pyplot as plt

full_data.area.plot(kind='box')
# Save the current figure to Google Drive
plt.savefig("/content/drive/MyDrive/output.png")

Understanding the Key Concepts

An Introduction to Linear Regression

• Suppose we consider linear regression with just one input variable...

Area x Rent

Learning Algorithm

[Diagram: Training set → Learning Algorithm → Hypothesis (h). Size of house (x) → Hypothesis (h) → Expected rent (y)]

Hypothesis

• Given the value of an input variable (size of a house), estimate the value of the output variable (expected rent).
• Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
• Example:
   - f = 32 + 1.8c
   - rent = 5000 + 40 x area
• How do we choose the values of $\theta_0$ and $\theta_1$?

Generalized Hypothesis

Or...

Hypothesis as a Matrix Operation

Cost Function: Least Squared Error (L2)

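• Cost for a single training example $(x^{(i)}, y^{(i)})$ is taken as $\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

• Cost function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$, where $m$ is the number of training examples

• Minimize J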
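
A minimal sketch of a squared-error cost for a one-variable hypothesis, assuming the common convention in which J is the sum of squared errors divided by 2m (m being the number of training examples). The function name cost and the data are illustrative assumptions; the data is generated from the example rule rent = 5000 + 40 x area.

```python
import numpy as np

def cost(theta0, theta1, x, y):
    """Squared-error cost J for the hypothesis theta0 + theta1 * x (1 / 2m convention)."""
    errors = (theta0 + theta1 * x) - y
    return (errors ** 2).sum() / (2 * len(x))

# Hypothetical data generated from rent = 5000 + 40 * area (no noise).
area = np.array([500.0, 750.0, 1000.0, 1250.0])
rent = 5000 + 40 * area

perfect = cost(5000, 40, area, rent)   # parameters that fit the data exactly
worse = cost(5000, 30, area, rent)     # slope too small, so the cost is higher
print(perfect, worse)
```

Minimizing J means searching for the parameter values that bring this number as close to zero as the data allows.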

The LinearRegression Class

Separating the Predictors and the Target

dataX = full_data.drop(columns=['rent'])
dataY = pd.DataFrame ({'rent':full_data.rent})

Training and Test Data Split

Training and Test Data Split

from sklearn.model_selection import train_test_split

trainX, testX, trainY, testY = train_test_split(dataX,
dataY, test_size = 0.20)

Different Random Splits Every Time

Ensuring Same Split Every Time

trainX, testX, trainY, testY = train_test_split(
    dataX, dataY, test_size=0.20, random_state=11)
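
The effect of random_state can be checked directly: two calls with the same value select the same rows. The sketch below uses a hypothetical 10-row stand-in for the rentals data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the rentals data: 10 rows.
dataX = pd.DataFrame({"area": np.arange(10) * 100 + 400})
dataY = pd.DataFrame({"rent": np.arange(10) * 4000 + 16000})

first = train_test_split(dataX, dataY, test_size=0.20, random_state=11)
second = train_test_split(dataX, dataY, test_size=0.20, random_state=11)

trainX, testX, trainY, testY = first
print(len(trainX), len(testX))                 # number of training and test rows
print(testX.index.equals(second[1].index))     # True when both calls pick the same test rows
```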

Defining the Model

from sklearn.linear_model import LinearRegression

model = LinearRegression ()

Training the Model

model.fit (trainX, trainY)

Evaluating the Results

print('Coefficients:', model.coef_)
print('Intercept:', model.intercept_)
print('R2 on training data:', model.score(trainX, trainY))

Hypothesis:
rent = 911 + 27 x area + 6103 x bedrooms + 840 x furnished

R-Squared (Coefficient of Determination)

CoD: a measure of how much of the variation in the output variable (y) is explained by the input variables (x).

If
$y_i$ are the actual values,
$\hat{y}_i$ are the predicted values,
$\bar{y}$ is the mean of the actual values,

then

$R^2 = 1 - \dfrac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$
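
As a check on the definition, R-squared can be computed by hand (one minus the residual sum of squares divided by the total sum of squares around the mean) and compared with scikit-learn's r2_score. The actual and predicted rents below are hypothetical values chosen for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

actual = np.array([20000.0, 25000.0, 31000.0, 40000.0])      # hypothetical rents
predicted = np.array([21000.0, 24000.0, 33000.0, 38000.0])   # hypothetical model output

ss_res = ((actual - predicted) ** 2).sum()        # residual sum of squares
ss_tot = ((actual - actual.mean()) ** 2).sum()    # total sum of squares around the mean
r2_manual = 1 - ss_res / ss_tot

print(r2_manual, r2_score(actual, predicted))
```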

Mean Squared Error, Mean Absolute Error

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

predictions = model.predict(trainX)

# Root mean squared error: square root of the average squared error
mse = mean_squared_error(trainY, predictions)
rmse = np.sqrt(mse)
print('RMSE on training data: ', rmse)

# Mean absolute error: average of the absolute errors
mae = mean_absolute_error(trainY, predictions)
print('MAE on training data: ', mae)

Predictions for User-Supplied Data

# input() returns text, so convert each value to a number first
area = float(input('Enter area : '))
bedrooms = int(input('Enter bedrooms : '))
furnished = int(input('Enter furnished state (0/1/2) : '))

# One-row DataFrame with the same columns as the training data
customTestX = pd.DataFrame({'area': area,
                            'bedrooms': bedrooms,
                            'furnished': furnished}, index=[0])
prediction = model.predict(customTestX)
print('\nPrediction result :: ', prediction)

The Normal Equation Method

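• The Normal Equation Method is a "closed-form solution" to minimize the cost function J (using OLS, ordinary least squares).
• It is a method to solve for theta analytically: $\theta = (X^T X)^{-1} X^T y$
• Here X is the m x n matrix of input values and y is the m x 1 vector of target values.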
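
A sketch of the closed-form solution theta = (X^T X)^(-1) X^T y using NumPy, on hypothetical data generated from rent = 5000 + 40 x area. It assumes X carries a leading column of ones so that the first element of theta acts as the intercept, and it solves the linear system rather than forming the inverse explicitly, which is the usual numerical practice.

```python
import numpy as np

# Hypothetical data generated from rent = 5000 + 40 * area.
area = np.array([500.0, 750.0, 1000.0, 1250.0, 1500.0])
rent = 5000 + 40 * area

# Matrix X: a column of ones (for the intercept) next to the feature column.
X = np.column_stack([np.ones_like(area), area])
y = rent

# Solve (X^T X) theta = X^T y, which is equivalent to theta = (X^T X)^(-1) X^T y.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)   # approximately [5000, 40]
```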

Normal Equation Method

• A simple technique for regression. However:
• Inverting an (n x n) matrix is computationally intensive. The order of this is between O(n^2.4) and O(n^3). Thus it is too slow when n is very large, say in the thousands.
• Some matrices are not invertible; hence this method will not work for those.
• Does not work for classification and many other categories of ML algorithms.
• Requires all the data to be in memory for the model to train. Some other algorithms can learn incrementally from batches of data, including from a single training record at a time. Thus, available RAM is not a constraint for them.

The LinearRegression Class

• The LinearRegression() class in scikit-learn uses a refinement of the Normal Equation Method based on "Singular Value Decomposition" (SVD).
• This is more efficient than the Normal Equation Method, with the computational complexity being O(n^2). It also works even if the matrix X^T X is not invertible.
• However, like the Normal Equation Method, this also requires all the training data to be in memory.
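
The non-invertible case can be illustrated with NumPy's np.linalg.lstsq, an SVD-based least-squares solver used here as a stand-in (it is not presented as the exact routine inside LinearRegression). The hypothetical data repeats one feature column, so X^T X is singular and the Normal Equation cannot be applied directly, yet the SVD-based solver still returns a valid fit. The rcond threshold is an illustrative choice.

```python
import numpy as np

# Hypothetical data with a duplicated feature, which makes X^T X singular.
area = np.array([500.0, 750.0, 1000.0, 1250.0])
X = np.column_stack([np.ones_like(area), area, area])   # third column repeats the second
y = 5000 + 40 * area

# SVD-based least squares; singular values below rcond * (largest value) are treated as zero.
theta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=1e-10)

print(rank)                        # number of independent columns detected
print(np.allclose(X @ theta, y))   # whether the fit reproduces y
```

When columns are redundant, the solver returns the minimum-norm solution, so the weight of 40 is shared between the two identical columns.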

Gradient Descent

Lab: Manual Regression, Using Spreadsheet

• Step 1:
   - put in the formula for cost, play around with the values of $\theta_0$ and $\theta_1$, and see the impact on average cost.
   - with the same formula for cost, keep one parameter fixed, vary the other, and see the impact on average cost in the data table.

Gradient Descent

[Chart: average cost (vertical axis, 0 to 900) plotted against parameter values from 20 to 70 (horizontal axis)]

Gradient Descent

Image source:
https://stackoverflow.com/questions/64940632/how-to-illustrate-a-3d-graph-of-gradient-descent-using-python-matplotlib

Gradient Descent

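Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

Cost function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

Objective: Minimize J

Algorithm:
Start with some values of $\theta_0$ and $\theta_1$ (say $\theta_0 = 0$, $\theta_1 = 0$)
Repeat until convergence {
    simultaneously update $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$
    (for j = 0 and j = 1)
}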
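
The algorithm can be sketched in a few lines of Python. This version assumes a one-variable hypothesis theta0 + theta1 * x and a cost J equal to the sum of squared errors divided by 2m, so that the two partial derivatives are the average error and the average of (error * x). The function name, the data and the learning rate are illustrative assumptions.

```python
import numpy as np

def gradient_descent(x, y, alpha, iterations):
    """Batch gradient descent for the hypothesis theta0 + theta1 * x."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        error = (theta0 + theta1 * x) - y
        grad0 = error.mean()           # partial derivative of J with respect to theta0
        grad1 = (error * x).mean()     # partial derivative of J with respect to theta1
        # Simultaneous update: both gradients are computed before either theta changes.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Hypothetical small-scale data generated from y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 + 3 * x

theta0, theta1 = gradient_descent(x, y, alpha=0.05, iterations=5000)
print(theta0, theta1)   # approaches 2 and 3
```

A fixed iteration count stands in for "repeat until convergence" to keep the sketch short; a fuller version would stop once the change in J falls below a tolerance.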

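What are $\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ and $\alpha$?

• $\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$ is the slope of the curve of J plotted against $\theta_0$
• Similarly, $\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$ is the slope of the curve of J plotted against $\theta_1$
• $\alpha$ is the learning rate.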

Simultaneous Updates

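Applying partial derivatives, the simultaneous updates become:

$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)$

$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x^{(i)}$

where $h_\theta(x) = \theta_0 + \theta_1 x$ is the hypothesis and $m$ is the number of training examples.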

Lab: Gradient Descent using MS Excel

• In the sheet titled "Step 2":
   - Start with some $\theta_0$ and $\theta_1$ (such as 0, 0)
   - Put in formulas for estimate, cost, error, error * x
   - Put in formulas for average error, average (error * x)
   - Set the learning rate ($\alpha$) to 0.0000001.
   - Manually run a few iterations of gradient descent logic, and see the impact on the chart plotted.

Gradient Descent Results

Gradient Descent Results

Gradient Descent Results

Learning Rate ($\alpha$)

• If $\alpha$ is too small, gradient descent can be too slow.
• If $\alpha$ is too large, gradient descent can overshoot the minimum and fail to converge, or may even diverge.
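
Both failure modes show up in a small experiment. The sketch below runs 50 iterations of batch gradient descent on hypothetical data generated from y = 2 + 3x and reports the final cost for three learning rates; the function name and the specific rates are illustrative assumptions that suit this data only.

```python
import numpy as np

def final_cost(alpha, iterations=50):
    """Run batch gradient descent on y = 2 + 3x and return the final cost J."""
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = 2 + 3 * x
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        error = (theta0 + theta1 * x) - y
        # Simultaneous update of both parameters.
        theta0, theta1 = (theta0 - alpha * error.mean(),
                          theta1 - alpha * (error * x).mean())
    error = (theta0 + theta1 * x) - y
    return (error ** 2).mean() / 2

for alpha in (0.001, 0.05, 0.5):
    print(alpha, final_cost(alpha))
```

On this data the smallest rate leaves the cost high after 50 iterations, the middle rate brings it close to zero, and the largest rate makes the cost grow instead of shrink.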

Impact of Alpha on J

[Four charts of J, one for each learning rate: Alpha = 0.0000001, Alpha = 0.0000003, Alpha = 0.000001, Alpha = 0.000003]

Gradient Descent Variations

• Batch gradient descent: uses the entire training data set to compute the gradient at every step, making it slow with large training data sets.
• Stochastic gradient descent: at every step, a single training instance is randomly selected, and the gradient is computed based on that.
• Mini-batch gradient descent: at every step, computes the gradient based on a small subset of randomly selected instances from the training data set.

Comparison

• Batch gradient descent is the slowest; stochastic gradient descent is the fastest.
• Batch gradient descent eventually settles down near the optimal solution; stochastic gradient descent continues to walk around, never settling down on the optimal.

SGDRegressor

• Stochastic Gradient Descent Regressor.
• Stochastic = having a random probability distribution that may be analyzed statistically but may not be predicted precisely.
• A simple and efficient algorithm for linear regression.
• Scales very well for very large datasets (such as 10^5 rows) and very large numbers of features (such as 10^5).
• https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDRegressor.html

Using SGDRegressor for Stochastic GD

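from sklearn.linear_model import SGDRegressor

model = SGDRegressor()

• SGDRegressor always performs stochastic gradient descent: the weights are updated after each training instance. It has no batch or mini-batch mode.
• The parameter "average" does not select a gradient descent mode. It controls averaged SGD:
   - average = False (the default): plain stochastic gradient descent
   - average = True: the weights are averaged over all updates, and the averaged values are stored in coef_
   - average = an integer greater than 1: averaging begins once the total number of samples seen reaches that value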

Training the Model

model.fit (trainX, trainY)

Training the Model

# SGDRegressor expects the target as a 1-D array, so flatten the single-column DataFrame
model.fit(trainX, trainY.values.ravel())
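
A self-contained sketch of the same call on hypothetical data, showing what ravel() does to the shape of the target. The data is generated from y = 2 + 3x with x between 0 and 1, so that no feature scaling is needed; the max_iter and tol values are illustrative choices, not recommendations.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import SGDRegressor

# Hypothetical small-scale data generated from y = 2 + 3x.
rng = np.random.default_rng(11)
trainX = pd.DataFrame({"x": rng.uniform(0, 1, 200)})
trainY = pd.DataFrame({"y": 2 + 3 * trainX["x"]})

# trainY has shape (200, 1); ravel() turns it into a 1-D array of shape (200,).
target = trainY.values.ravel()

model = SGDRegressor(random_state=11, max_iter=10000, tol=1e-6)
model.fit(trainX, target)

print(trainY.shape, target.shape)
print(model.coef_, model.intercept_)   # roughly 3 and 2
```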

Some of the Model Parameters

• verbose: Default = 0 (silent). Other value = 1 (verbose output).
• eta0: initial learning rate. Default = 0.01.
• max_iter: maximum number of iterations (epochs). Default = 1000.
• shuffle: whether the training data should be shuffled after each iteration (epoch). Default = True.
• random_state: used for shuffling the data, when shuffle = True. Setting it to a constant value results in deterministic randomization.
• tol: the stopping criterion. If for n_iter_no_change iterations (default = 5), best_loss - loss < tol, then training stops.

Making Gradient Descent Work Well

• Observe changes in J after each iteration (with verbose = 1), and if required, change eta0, max_iter, tol.

model = SGDRegressor(verbose=1)

Making Gradient Descent Work Well

• Start with verbose = 1, a low value of eta0 (such as 0.0000001), and a low value for max_iter (such as 10 or 20). Observe the average loss and R^2.
• If the average loss seems to converge, increase eta0 by a factor of 2 or 3, until it no longer appears to converge.
• If the average loss seems to diverge, decrease eta0 by a factor of 2 or 3.
• Once you obtain a promising value of eta0, drop verbose = 1, and gradually increase max_iter. You may also want to tweak tol.

Tweaking Learning Rate

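• For a sufficiently small learning rate, J should decrease on each iteration of batch gradient descent. With stochastic gradient descent, as used by SGDRegressor, the loss can fluctuate between steps but should fall on average.
• But if the learning rate is too small, gradient descent can be slow to converge.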

Making Gradient Descent Work Well

model = SGDRegressor(eta0=0.000005, max_iter=2000000,
                     tol=0.1, shuffle=False)

Day 1 Review Questions

1. What is the difference between conventional programming and ML?
2. Which method in the pandas library reads a CSV file into memory?
3. What does the dataframe.describe() function show?
4. What is standard deviation?
5. What is a percentile score?
6. What is a boxplot?
7. What is an outlier? How are outliers identified?
8. What is the first split of the data that we perform for ML algorithms?
9. Why do we split training and test data, and how?
10. What does the random_state parameter achieve in train_test_split?


Interpreting Measures of Position WEEK 3
No ratings yet
Interpreting Measures of Position WEEK 3
8 pages
Chapter 4&5
No ratings yet
Chapter 4&5
33 pages
Interprets Measures of Position
100% (1)
Interprets Measures of Position
6 pages
MATH 533 Project 1
No ratings yet
MATH 533 Project 1
15 pages
Lesson Two
No ratings yet
Lesson Two
66 pages
MDM4U-Unit3
No ratings yet
MDM4U-Unit3
22 pages
B A Interview
No ratings yet
B A Interview
276 pages
Introduction To Statistics in Python
100% (2)
Introduction To Statistics in Python
211 pages
Maths Paper 2 Final Push (31 October 2024)
No ratings yet
Maths Paper 2 Final Push (31 October 2024)
154 pages
Median Quartile Decile and Percentile
No ratings yet
Median Quartile Decile and Percentile
34 pages
IOAA 2024 Data Analysis
No ratings yet
IOAA 2024 Data Analysis
5 pages
DWM Unit II
No ratings yet
DWM Unit II
76 pages
Bba Question Paper
40% (5)
Bba Question Paper
5 pages
Teacher Learning Area Teaching Date & Time Quarter: Grades 1 To 12 Daily Lesson LOG
No ratings yet
Teacher Learning Area Teaching Date & Time Quarter: Grades 1 To 12 Daily Lesson LOG
11 pages
Semi - Detailed Lesson Plan
100% (1)
Semi - Detailed Lesson Plan
6 pages
CE502 Week 3 (Part 1) Descriptive Statistics
No ratings yet
CE502 Week 3 (Part 1) Descriptive Statistics
36 pages
Ch3 Numerically Summarizing Data
No ratings yet
Ch3 Numerically Summarizing Data
35 pages
Unit 3 Statistical Graphics
No ratings yet
Unit 3 Statistical Graphics
18 pages
Math10 Q4 Week 1 Hybrid Version2
No ratings yet
Math10 Q4 Week 1 Hybrid Version2
15 pages
Add Maths Sba
No ratings yet
Add Maths Sba
27 pages
Final Assignment
No ratings yet
Final Assignment
3 pages
Lecture-2 Descriptive Statistics-Box Plot Descriptive Measures
No ratings yet
Lecture-2 Descriptive Statistics-Box Plot Descriptive Measures
47 pages
Descriptive Statistics (II)
No ratings yet
Descriptive Statistics (II)
14 pages
Module 5
No ratings yet
Module 5
45 pages
STATISTIC
No ratings yet
STATISTIC
100 pages
Basic Concepts of Statistics
No ratings yet
Basic Concepts of Statistics
43 pages
Quartile-Percentile-and-Decile-Compatibility-Mode
No ratings yet
Quartile-Percentile-and-Decile-Compatibility-Mode
5 pages


Machine Learning

• Machine Learning is more about Maths than about Python or

• Giving computers the ability to learn without being explicitly

• Making predictions: consumer behavior, market behavior

• Python with Scikit-learn

• Google Colaboratory, or 'Colab' allows you to write and run

• Pandas (pandas.pydata.org) is an open-source library that

• Open-source, Machine Learning library in Python.

• It is a statistical measure of dispersal of data from the mean

• 90th percentile = 90% of the data elements have lower values
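
One way to check this reading of a percentile is to compute one and count how many values fall below it. The sketch below uses NumPy on made-up values; the data and the variable names are illustrative assumptions, not taken from the course dataset.

```python
import numpy as np

# Illustrative data: the integers 1..100 (not from the course dataset).
values = np.arange(1, 101)

# 90th percentile: the value below which about 90% of the data lies.
p90 = np.percentile(values, 90)

# Fraction of elements strictly below that value.
share_below = np.mean(values < p90)

print("90th percentile:", p90)
print("Share of values below it:", share_below)
```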

• Examples : Height, weight, IQ levels.

• About 68% values fall within 1 standard deviation of the
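
The 68% figure can be checked empirically. This is a minimal sketch that draws a large sample from a normal distribution with NumPy and measures the share of values within one standard deviation of the mean; the mean of 170 and standard deviation of 10 are arbitrary assumptions chosen to resemble heights in centimetres.

```python
import numpy as np

# Simulated, normally distributed "heights" (parameters are assumptions).
rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=170.0, scale=10.0, size=100_000)

mean = sample.mean()
std = sample.std()

# Share of values no further than one standard deviation from the mean.
within_one_std = np.mean(np.abs(sample - mean) <= std)

print(f"Share within 1 standard deviation: {within_one_std:.3f}")
```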

• In a random sample from a population, if the statistics for the

full_data.plot (kind='box', subplots=True,

full_data[['area', 'rent']].plot (kind='box',

• Upper value outliers:

from pandas.plotting import scatter_matrix

• Suppose we consider linear regression with

Size of house Hypothesis Expected Rent

• Given the value of an

• Cost for is taken as

from sklearn.model_selection import train_test_split

trainX, testX, trainY, testY = train_test_split

from sklearn.linear_model import LinearRegression

model.fit (trainX, trainY)

print ('Coefficients:', model.coef_)
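
The split, fit, and inspect steps shown in the fragments above can be put together as follows. This is a sketch under stated assumptions: the data is synthetic (rent generated as roughly 20 × area + 5000 with noise), and the 80/20 split and `random_state` values are choices made for the example, not settings from the course.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the course data: rent = 20 * area + 5000 + noise.
rng = np.random.default_rng(seed=42)
area = rng.uniform(300, 2000, size=200)
rent = 20.0 * area + 5000.0 + rng.normal(0, 500, size=200)

X = area.reshape(-1, 1)  # scikit-learn expects a 2-D feature array
y = rent

# Hold out 20% of the rows for testing (the ratio is an assumption).
trainX, testX, trainY, testY = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(trainX, trainY)

print('Coefficients:', model.coef_)
print('Intercept:', model.intercept_)
print('R-square on test data:', model.score(testX, testY))
```

The learned coefficient should land close to the slope used to generate the data, which is a useful sanity check before moving to a real dataset.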

CoD: a measure of how much

from sklearn.metrics import mean_squared_error,

mae = mean_absolute_error (trainY, predictions)
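
To see what these metrics measure, they can be computed by hand. The sketch below uses NumPy with hypothetical actual and predicted rents; the numbers are invented for illustration. R-square is taken in its usual form, 1 minus the ratio of the residual sum of squares to the total sum of squares.

```python
import numpy as np

# Hypothetical actual and predicted rents, for illustration only.
actual = np.array([10000.0, 15000.0, 20000.0, 25000.0])
predictions = np.array([11000.0, 14000.0, 21000.0, 24000.0])

errors = actual - predictions

mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error

# R-square = 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
r_square = 1.0 - ss_res / ss_tot

print('MAE:', mae)
print('MSE:', mse)
print('R-square:', r_square)
```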

area = input('Enter area : ')

• The Normal Equation Method is a "closed-form solution" to
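
As a rough illustration of a closed-form solution, the normal equation θ = (XᵀX)⁻¹Xᵀy can be evaluated directly with NumPy. The data below is made up so that it follows y = 3 + 2x exactly; the variable names are assumptions for this example.

```python
import numpy as np

# Illustrative data that follows y = 3 + 2x exactly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 3.0 + 2.0 * x

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Normal equation: theta = (X^T X)^(-1) X^T y.
# Solving the linear system avoids forming the inverse explicitly.
theta = np.linalg.solve(X.T @ X, X.T @ y)

print('Intercept and slope:', theta)
```

No learning rate or iteration count is needed here, which is what distinguishes this approach from gradient descent.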

• A simple technique for regression. However,

• The LinearRegression() class in scikit-learn uses a

• is the slope of the curve for

Applying partial derivatives, the simultaneous updates become:

• In the sheet titled “Step 2”:

• If α is too small, gradient descent can be too slow.
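
The effect of a small learning rate can be seen in a short experiment. The sketch below implements batch gradient descent for a one-feature hypothesis h(x) = θ0 + θ1·x, assuming the common squared-error cost J = (1/2m) Σ (h(x) − y)²; the data, the function name `gradient_descent`, and the two learning rates are illustrative assumptions.

```python
import numpy as np

def gradient_descent(x, y, alpha, iterations):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        error = (theta0 + theta1 * x) - y
        # Compute both gradients before changing either parameter
        # (simultaneous update).
        grad0 = error.sum() / m
        grad1 = (error * x).sum() / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    # Cost J after the final iteration.
    cost = (((theta0 + theta1 * x) - y) ** 2).sum() / (2 * m)
    return theta0, theta1, cost

# Illustrative data that follows y = 3 + 2x exactly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 3.0 + 2.0 * x

# Same number of iterations, two different learning rates.
_, _, cost_small = gradient_descent(x, y, alpha=0.0001, iterations=100)
_, _, cost_larger = gradient_descent(x, y, alpha=0.05, iterations=100)

print('Cost with alpha = 0.0001:', cost_small)
print('Cost with alpha = 0.05:  ', cost_larger)
```

After the same number of iterations, the run with the smaller learning rate is still far from the minimum, while the larger (but still stable) rate has reduced the cost much further.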

Alpha = 0.0000001 Alpha = 0.0000003

Alpha = 0.000001 Alpha = 0.000003

• Batch gradient descent: uses the entire training data set to

• Batch gradient descent is the slow, stochastic gradient

• Stochastic Gradient Descent Regressor.

• Three modes of gradient descent:

model.fit (trainX, trainY)

model.fit (trainX, trainY.values.ravel())

• verbose. Default = 0 (silent). Other value = 1 (verbose

• Observe changes in J after each iteration (with verbose = 1),

model = SGDRegressor (verbose = 1)

• Start with verbose = True, a low value of eta0 (such as

• For sufficiently small learning rate, J should decrease on each

model = SGDRegressor(eta0 = 0.000005,
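
A self-contained sketch of fitting an `SGDRegressor` with an explicit `eta0` follows. Several things here are assumptions of the example rather than the course's settings: the data is synthetic, the feature is standardized with `StandardScaler` before training (which is why a much larger `eta0` than the one above is usable), and `learning_rate='constant'`, `max_iter`, and `random_state` are chosen for illustration. Setting `verbose=1` instead of `0` prints the loss after each epoch.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the course data: rent = 20 * area + 5000 + noise.
rng = np.random.default_rng(seed=1)
area = rng.uniform(300, 2000, size=500)
rent = 20.0 * area + 5000.0 + rng.normal(0, 500, size=500)

# Standardizing the feature is an assumption of this sketch.
trainX = StandardScaler().fit_transform(area.reshape(-1, 1))
trainY = rent  # 1-D target, as SGDRegressor expects

model = SGDRegressor(
    eta0=0.01,                 # initial (here constant) learning rate
    learning_rate='constant',  # keep eta0 fixed across iterations
    max_iter=1000,
    verbose=0,                 # set to 1 to print the loss per epoch
    random_state=0,
)
model.fit(trainX, trainY)

r_square = model.score(trainX, trainY)
print('R-square on training data:', r_square)
```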

1. What is the difference between conventional
