02 ML Supervised Learning

Machine learning models are mathematical representations of the output of a training process. They recognize patterns in data and make predictions. Models are trained on labeled example data to learn patterns and relationships. Trained models can then make predictions on new, unlabeled data. Common types of machine learning models include supervised learning models like regression and classification, unsupervised learning models, and reinforcement learning models.

Uploaded by

Adarsh Dash

Machine Learning

Supervised Learning
Machine Learning – Model

What is a Machine Learning Model?

A machine learning model is defined as a mathematical representation of the output of the training process.
It recognizes certain types of patterns. A model is trained over a set of data, providing it an algorithm that it
can use to reason over and learn from those data. Once you have trained the model, you can use it to reason
over data that it hasn't seen before, and make predictions about those data.

Machine learning is the study of algorithms that improve automatically through experience, that is, through
exposure to historical data. A machine learning model is similar to computer software designed to recognize
patterns or behaviors based on previous experience or data. The learning algorithm discovers patterns within
the training data and outputs an ML model that captures these patterns and makes predictions on new
data.

For example, let's say you want to build an application that can recognize a user's emotions based
on their facial expressions. You can train a model by providing it with images of faces that are each
tagged with a certain emotion, and then you can use that model in an application that can
recognize any user's emotion.
Machine Learning – Model

What is a Machine Learning Model?

A machine learning model can be understood as a program that has been trained to find patterns in data and
make predictions on new data. It is represented as a mathematical function that takes a request in the form of
input data, makes a prediction on that input, and returns an output in response. First, a learning algorithm is
run over a set of training data to extract patterns from it; the result is the trained model. Once trained, the
model can be used to make predictions on unseen data.

As another example, suppose you want to build an application that can recognize geometric shapes such as
triangles and rectangles. You can train the model by providing it with examples of different shapes described
by criteria such as the number of sides and the angles. Once the model is trained, it can identify any new
shape described by the same criteria.
Machine Learning – Classification of ML Models

Based on different business goals and data sets, there are three learning models for algorithms. Each machine
learning algorithm settles into one of the three models:

 Supervised Learning

 Unsupervised Learning

 Reinforcement Learning
Machine Learning – ML Models
Supervised Learning

Supervised learning, one of the most widely used methods in ML, takes both training data (also called data
samples) and the associated outputs (also called labels or responses) during the training process. The main goal
of supervised learning methods is to learn the association between the input training data and their labels. To
do this, the algorithm processes many labeled training instances.

Supervised algorithms are called supervised because the machine learning model learns from data samples
where the output is known in advance. In this sense, the learning process can be thought of as being overseen
by a supervisor who already knows the correct answers.
Machine Learning – ML Models
Supervised Learning

Suppose we have a dataset of different types of shapes, including squares, rectangles, triangles, and other
polygons such as hexagons. The first step is to train the model on labelled examples of each shape.
 If the given shape has four sides, and all the sides are equal, then it will be labelled as a Square.

 If the given shape has three sides, then it will be labelled as a triangle.
 If the given shape has six equal sides then it will be labelled as hexagon.

Once the machine has been trained on all of these shape types, it classifies each new shape it encounters
on the basis of its number of sides and predicts the output label.
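
The shape example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two features (number of sides, and a 0/1 flag for "all sides equal"), the tiny labelled dataset, and the choice of a 1-nearest-neighbour rule are all hypothetical and are not taken from the slides.

```python
# Hypothetical labelled training data:
# (number_of_sides, all_sides_equal) -> label
training_data = [
    ((3, 1), "triangle"),
    ((3, 0), "triangle"),
    ((4, 1), "square"),
    ((4, 0), "rectangle"),
    ((6, 1), "hexagon"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, label = min(training_data,
                   key=lambda item: squared_distance(item[0], features))
    return label


print(predict((4, 1)))  # four equal sides
print(predict((3, 0)))  # three sides
print(predict((6, 1)))  # six equal sides
```

The labels are known in advance for every training example, which is what makes this a supervised setup; a real application would learn from many more examples and richer features.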
Machine Learning – ML Models
Unsupervised Learning

Unsupervised learning is a type of machine learning in which models are trained on an unlabeled dataset and are allowed to
act on that data without any supervision.
These models are not guided by labeled training outputs. Instead, the models themselves find hidden patterns and insights in the
given data. This can be compared to the way the human brain learns new things.

Here we have the input data but no corresponding output data. The goal of unsupervised learning is to find the underlying
structure of the dataset, group the data according to similarities, and represent the dataset in a compressed format.

 Examples of Unsupervised Learning

Social network analysis − Social network analysis is used to form clusters of friends depending on the frequency of
connection between them. Such analysis reveals the links between the users of a social networking website.
Market segmentation − Sales organizations can cluster their customers into multiple segments on the basis of their previously
billed items. For instance, a big superstore may want to send an SMS about grocery items specifically to customers who buy
groceries, rather than sending that SMS to all its customers.
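
As a rough sketch of the market segmentation idea, the snippet below groups customers with a very small k-means loop. The customer figures, the choice of two segments, and the starting centroids are all made-up assumptions for illustration; the slides do not name a specific clustering algorithm.

```python
# Hypothetical customers: (monthly grocery spend, monthly electronics spend)
customers = [(90, 5), (100, 10), (110, 0), (10, 95), (5, 100), (0, 105)]


def mean_point(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    assignments = []
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            # Keep the old centroid if a cluster happens to be empty
            new_centroids.append(mean_point(members) if members else old)
        centroids = new_centroids
    return assignments, centroids


# Assumed starting centroids: one customer from each end of the list
assignments, centroids = k_means(customers, [customers[0], customers[3]])
print(assignments)
print(centroids)
```

No labels are supplied anywhere: the two segments emerge only from the similarity of the spending patterns. The store could then target the grocery-heavy segment with the grocery SMS.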
Machine Learning – ML Models
Reinforcement Learning

Reinforcement Learning (RL) is the science of decision making. It is about learning the optimal behavior in an environment
to obtain maximum reward. This optimal behavior is learned through interactions with the environment and observations of
how it responds, similar to children exploring the world around them and learning the actions that help them achieve a goal.
In the absence of a supervisor, the learner must independently discover the sequence of actions that maximizes the reward.
This discovery process is akin to a trial-and-error search. The quality of actions is measured not just by the immediate reward
they return, but also by the delayed reward they might fetch. Because it can learn the actions that lead to eventual success in an
unseen environment without the help of a supervisor, reinforcement learning is a very powerful approach.

 Examples of Reinforcement Learning

•Autonomous Driving. An autonomous driving system must perform multiple perception and planning tasks in an
uncertain environment. Some specific tasks where RL finds application include vehicle path planning and motion
prediction. Vehicle path planning requires several low and high-level policies to make decisions over varying
temporal and spatial scales. Motion prediction is the task of predicting the movement of pedestrians and other
vehicles, to understand how the situation might develop based on the current state of the environment.
Machine Learning – ML Models
Reinforcement Learning

Consider a grid containing a robot, a diamond, and fire. The goal of the robot is to reach the reward, which is the diamond, while
avoiding the hurdles, which are the fire cells. The robot learns by trying the possible paths and then choosing the path that reaches the
reward with the fewest hurdles. Each right step gives the robot a reward and each wrong step subtracts from its reward. The total
reward is calculated when the robot reaches the final reward, the diamond.
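
The robot, diamond, and fire scenario can be sketched with tabular Q-learning. Everything specific here is an assumption made for illustration: the 3x3 grid, the cell positions, the reward values (+10 for the diamond, -10 for fire, -1 per ordinary step), the learning rate, the discount factor, and the purely random exploration. The slides describe the idea but not a particular algorithm.

```python
import random

ROWS, COLS = 3, 3
START, FIRE, DIAMOND = (0, 0), (1, 1), (2, 2)   # assumed layout
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Move one cell; bumping into a wall leaves the robot where it is."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < ROWS and 0 <= c < COLS):
        r, c = state
    nxt = (r, c)
    if nxt == DIAMOND:
        return nxt, 10.0, True
    if nxt == FIRE:
        return nxt, -10.0, True
    return nxt, -1.0, False


def train(episodes=5000, alpha=0.5, gamma=0.9, seed=0):
    """Learn Q-values by trial and error from random starting cells."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(ROWS) for c in range(COLS)]
    q = {(s, a): 0.0 for s in cells for a in ACTIONS}
    free_cells = [s for s in cells if s not in (FIRE, DIAMOND)]
    for _episode in range(episodes):
        state = rng.choice(free_cells)
        for _move in range(50):
            action = rng.choice(list(ACTIONS))  # pure exploration
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: immediate reward plus discounted future value
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until an episode ends."""
    state, path = START, [START]
    for _move in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _reward, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

The discount factor `gamma` is what lets the delayed reward at the diamond influence the value of the very first step, so the learned path goes around the fire cell rather than through it.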
Machine Learning – Supervised Learning

Based on the type of task, supervised learning algorithms can be divided into the following two classes:

 Regression

 Classification

Regression

Regression algorithms are used when the output variable is continuous and there is a relationship between the input
variables and the output variable. They are used to predict continuous quantities, for example in weather forecasting
or market trend analysis. Some popular regression algorithms that come under supervised learning are listed later in this section.

Classification

Classification algorithms are used when the output variable is categorical, which means it takes one of a fixed set of
classes. Often there are two classes, such as Yes-No, Male-Female, or True-False, but there can also be more than two.
Machine Learning – Supervised Learning
Regression

 Some commonly used Regression models are as follows:

 Linear Regression

 Regression Trees

 Non-Linear Regression

 Bayesian Linear Regression

 Polynomial Regression
Machine Learning – Supervised Learning
Classification

 Some commonly used Classification models are as follows:

 Random Forest

 Decision Trees

 Logistic Regression

 Support vector Machines


Machine Learning – Supervised Learning
Regression

Regression analysis is a fundamental concept in the field of machine learning. It helps establish a relationship among
variables by estimating how one variable affects another.
There are various real-world scenarios where we need predictions about the future, such as weather conditions, sales,
or marketing trends. For such cases we need a technique that can make predictions as accurately as possible.
Regression analysis is such a technique: a statistical method that is widely used in machine learning and data science.

Some other reasons for using Regression analysis:

 Regression estimates the relationship between the target and the independent variable.

 It is used to find the trends in data.

 It helps to predict real/continuous values.

 By performing the regression, we can confidently determine the most important factor, the least important factor,
and how each factor is related to the target variable.
Machine Learning – Supervised Learning
Regression

 Examples :-

Car Purchase - Imagine you’re going to purchase a car and have decided that gas mileage is a deciding factor in your
decision to buy. If you wanted to predict the miles per gallon (MPG) of some promising cars, how would you do it? Since
you know the different features of each car (weight, horsepower, displacement, etc.), one possible method is regression. By
plotting the MPG of each car against its features, you can use regression techniques to find the relationship between
the MPG and the input features. The regression function here could be represented as $Y = f(X)$, where Y is the
MPG and X is the set of input features such as weight, displacement, and horsepower.

Advertisement vs Sales - Suppose there is a marketing company A that runs various advertisements every year and
earns sales from them. The company has a record of its advertisement spend in each of the last 5 years and the
corresponding sales.

Now the company wants to spend $200 on advertisement and wants a prediction of the resulting sales. To solve this
type of prediction problem in machine learning, we need regression analysis.
Machine Learning – Supervised Learning
Regression - Linear

Linear regression is a regression technique in which a dependent variable has a linear relationship
with an independent variable. The main goal of linear regression is to take the given data points and find
the trend line that fits the data in the best way possible.

Example-
-Let’s say we have a dataset that contains information about the relationship between X and Y. A number of
observations are made on X and Y and recorded. This will be our training data. Our goal is to design a
model that can predict the Y value if the X value is provided. Using the training data, we obtain the regression
line that gives the minimum error. This linear equation is then applied to new data. That is, if
we give X as an input, our model should be able to predict Y with minimum error.

-As another example, suppose there is a connection between how many hours a student studies and the
marks obtained; regression analysis can help us understand that connection. It provides a
relation that can be visualized as a graph and used to make predictions about the data.
Machine Learning – Supervised Learning
Regression - Linear

 The goal of regression analysis is to create a trend line based on the data. This then allows us to determine
whether other factors apart from hours of study affect the student marks, such as level of stress, etc. Before
taking that into account, we need to look at these factors and attributes and determine whether there is a
correlation between them. Linear Regression can then be used to draw a trend line which can then be used to
confirm or deny the relationship between attributes.
Machine Learning – Supervised Learning
Regression - Linear

 How do we determine the line that best fits the data?


The line is considered the best fit if the predicted values and the observed values are approximately the same. In simple
words, the best fit line is the one for which the sum of the squared vertical distances of the data points from the line is
minimum (the least-squares criterion).
This line is also called the regression line, and the errors are known as residuals. They
can be visualized as the vertical lines from each data point to the regression line.
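
A minimal sketch of the least-squares idea, using the same small x and y values that appear in this section's scikit-learn example. The closed-form formulas below are the standard ones for a single predictor; the variable names are chosen here for illustration.

```python
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 5, 6, 8, 9, 12]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates for the line y = intercept + slope * x
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual = observed value minus the value predicted by the line
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sum_squared_residuals = sum(r ** 2 for r in residuals)

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"sum of squared residuals={sum_squared_residuals:.4f}")
```

Any other straight line through these points would give a larger sum of squared residuals, which is what "best fit" means under this criterion.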
Machine Learning – Supervised Learning
Regression - Linear

 Model Performance
After the model is built, we need to check the difference between the predicted values and the actual data. If the
difference is small, the model is considered a good one.
Machine Learning – Supervised Learning
Regression - Linear

 Python Code

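import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# Training data: x must be 2-D (one column per feature) for scikit-learn
x = np.array([1, 2, 3, 4, 5, 6]).reshape((-1, 1))
y = np.array([2, 5, 6, 8, 9, 12])

# Fit the regression line and predict on the training inputs
model = LinearRegression()
model.fit(x, y)
Y_pred = model.predict(x)

# R^2 (coefficient of determination) of the fit on the training data
r_sq = model.score(x, y)
print('coefficient of determination:', r_sq)

# Plot the observed points and the fitted line
plt.scatter(x, y)
plt.plot(x, Y_pred, color='red')
plt.show()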

Note :- The coefficient of determination (R²) measures the proportion of the variance in the dependent variable that is
explained by the independent variable(s) in the model. A value close to 1 means the regression line explains most of the variation in the data.
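
To make the note concrete, here is a hedged sketch that computes the coefficient of determination by hand as one minus the ratio of the residual sum of squares to the total sum of squares. It reuses the x and y values from the code above and fits the line with NumPy's `polyfit` instead of scikit-learn, purely to keep the snippet self-contained.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([2, 5, 6, 8, 9, 12], dtype=float)

# Degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
y_pred = intercept + slope * x

ss_res = np.sum((y - y_pred) ** 2)       # variation left unexplained by the line
ss_tot = np.sum((y - y.mean()) ** 2)     # total variation around the mean
r_squared = 1.0 - ss_res / ss_tot

print(f"R^2 = {r_squared:.4f}")
```

For this dataset the value is close to 1, so the fitted line accounts for most of the variation in y.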
Machine Learning – Supervised Learning
Multivariate Linear Regression

Linear regression is one of the most widely used statistical models in industry. Its main advantage lies in its
simplicity and interpretability. Linear regression is used to forecast the revenue of a company based on
parameters, forecast a player’s growth in sports, predict the price of a product given the cost of
raw materials, predict crop yield given rainfall, and much more.

During our internship at Ambee, we were given a warm-up task to predict car prices from a given dataset. This task
strengthened our understanding of feature selection for multivariate linear regression and of statistical measures for
choosing the right model. You might be wondering why an environment company makes interns work on a car pricing
dataset. At Ambee, we celebrate outside data as much as inside data. That is what lets us relate things like how a change in
pollutants impacts health businesses’ economies of scale, effects which many do not see directly but which affect them
indirectly. It is important for a data scientist to gain domain knowledge, but it is also important to keep an open
mind about external factors that can be directly or indirectly related.

Regression is a statistical technique used to model continuous target variables. It has also been adopted in machine
learning to predict continuous variables. Regression models the target variable as a function of independent variables,
also called predictors. Linear regression fits a straight line (or, with several predictors, a plane) to the data. Simple
Linear Regression (SLR) models the target variable as a function of a single predictor, whereas Multivariate Linear
Regression (MLR) models the target variable as a function of multiple predictors.
Machine Learning – Supervised Learning
Multivariate Linear Regression

 Problem Statement
A new car manufacturer is looking to set up business in the US market. To take on the competition, they need to know
the factors on which the pricing of a car depends. The company wants to know which variables the price depends on
and to what extent those variables explain the price of a car.

Business Goal
We need to build a model for the price of a car as a function of explanatory variables. The company will then use
it to set the price of a car according to its features, or to configure the features according to a target price. In this
blog post, we shall go through the process of cleaning the data, understanding our variables, and modelling using
linear regression. Let us import our libraries. NumPy is a fast matrix computation library that most of the other
libraries depend on, and we might need it at some point. Pandas is our data manipulation library and one of the
most important libraries in our pipeline. Matplotlib and Seaborn are used for plotting graphs.
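
As a small, hedged illustration of modelling price as a function of several explanatory variables, the sketch below fits a multivariate linear regression with NumPy's least-squares solver. The two predictors (`horsepower`, `weight`), all of the numbers, and the rule used to generate the prices are synthetic assumptions made up for this example; they are not the car pricing dataset mentioned in the text.

```python
import numpy as np

# Hypothetical, synthetic data: these numbers are made up for illustration only
horsepower = np.array([70.0, 90.0, 110.0, 130.0, 150.0, 100.0])
weight = np.array([2000.0, 2400.0, 2300.0, 3000.0, 3200.0, 2800.0])

# Prices generated from a known linear rule so the fitted result is easy to check
price = 5000.0 + 50.0 * horsepower + 2.0 * weight

# Design matrix: a column of ones for the intercept plus one column per predictor
X = np.column_stack([np.ones_like(horsepower), horsepower, weight])

# Least-squares solution of X @ coefficients ~= price
coefficients, *_ = np.linalg.lstsq(X, price, rcond=None)
intercept, b_horsepower, b_weight = coefficients

print(f"intercept={intercept:.2f}")
print(f"price change per unit horsepower={b_horsepower:.2f}")
print(f"price change per unit weight={b_weight:.2f}")
```

Each fitted coefficient estimates how much the price changes when that predictor increases by one unit while the other is held fixed. Real data would include noise, so the recovered coefficients would only approximate any underlying rule.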
Machine Learning – Supervised Learning
Multivariate Linear Regression

 Python Code
Machine Learning – Supervised Learning
Multi Level Models

 Many kinds of data, including observational data collected in the human and biological sciences, have a
hierarchical or clustered structure. For example, children with the same parents tend to be more alike in
their physical and mental characteristics than individuals chosen at random from the population at large.
Individuals may be further nested within geographical areas or institutions such as schools or employers.
Multilevel data structures also arise in longitudinal studies where an individual’s responses over time are
correlated with each other.
Multilevel models recognize the existence of such data hierarchies by allowing for residual components at
each level in the hierarchy. For example, a two-level model which allows for grouping of child outcomes
within schools would include residuals at the child and school level. Thus the residual variance is
partitioned into a between-school component (the variance of the school-level residuals) and a within-
school component (the variance of the child-level residuals). The school residuals, often called ‘school
effects’, represent unobserved school characteristics that affect child outcomes. It is these unobserved
variables which lead to correlation between outcomes for children from the same school.
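
The between-school and within-school split can be illustrated with a simple descriptive calculation. This is not a fitted multilevel model (which would estimate the variance components statistically); it is only a sketch of how total variation in child outcomes divides into the two parts. The school names and scores are made-up numbers.

```python
# Hypothetical test scores for children grouped by school (made-up numbers)
scores_by_school = {
    "school_a": [62, 65, 68, 61],
    "school_b": [75, 78, 72, 79],
    "school_c": [50, 55, 53, 58],
}


def mean(values):
    return sum(values) / len(values)


all_scores = [s for scores in scores_by_school.values() for s in scores]
grand_mean = mean(all_scores)
school_means = {name: mean(scores) for name, scores in scores_by_school.items()}

# Between-school component: spread of school means around the grand mean,
# counted once per child so that larger schools weigh more
between = mean([
    (school_means[name] - grand_mean) ** 2
    for name, scores in scores_by_school.items()
    for _ in scores
])

# Within-school component: spread of children around their own school's mean
within = mean([
    (s - school_means[name]) ** 2
    for name, scores in scores_by_school.items()
    for s in scores
])

total = mean([(s - grand_mean) ** 2 for s in all_scores])
share_between = between / total

print(f"between={between:.2f}, within={within:.2f}, total={total:.2f}")
print(f"share of variation between schools={share_between:.2f}")
```

A large between-school share indicates that children in the same school resemble each other, which is the kind of correlation that multilevel models are designed to account for.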
Machine Learning – Supervised Learning
Multi Level Models

 Why use multilevel models?

1. Correct inferences: Traditional multiple regression techniques treat the units of analysis as
independent observations. One consequence of failing to recognize hierarchical structures is that standard
errors of regression coefficients will be underestimated, leading to an overstatement of statistical
significance. Standard errors for the coefficients of higher-level predictor variables will be the most
affected by ignoring grouping.
2. Substantive interest in group effects: In many situations a key research question concerns the extent
of grouping in individual outcomes, and the identification of ‘outlying’ groups. In evaluations of school
performance, for example, interest centers on obtaining ‘value-added’ school effects on pupil attainment.
Such effects correspond to school-level residuals in a multilevel model which adjusts for prior attainment.
3. Estimating group effects simultaneously with the effects of group-level predictors: An alternative
way to allow for group effects is to include dummy variables for groups in a traditional (ordinary least
squares) regression model. Such a model is called an analysis of variance or fixed effects model. In many
cases there will be predictors defined at the group level, e.g. type of school (mixed vs. single sex). In a fixed
effects model, the effects of group-level predictors are confounded with the effects of the group dummies,
i.e. it is not possible to separate out effects due to observed and unobserved group characteristics. In a
multilevel (random effects) model, the effects of both types of variable can be estimated.
4. Inference to a population of groups: In a multilevel model the groups in the sample are treated as a
random sample from a population of groups. Using a fixed effects model, inferences cannot be made
beyond the groups in the sample.
Machine Learning – Supervised Learning
Regression - Polynomial

Polynomial regression is a special case of linear regression in which the curvilinear relationship between the
independent variable x and the dependent variable y is modeled as an nth degree polynomial. It fits a nonlinear
relationship between the value of x and the corresponding conditional mean of y.

The Polynomial Regression equation is given below:

$y = b_0 + b_1 x + b_2 x^2 + \dots + b_n x^n$

 It is also called the special case of Multiple Linear Regression in ML. Because we add some polynomial terms to the
Multiple Linear regression equation to convert it into Polynomial Regression.
 It is a linear model with some modification in order to increase the accuracy.
 The dataset used in Polynomial regression for training is of non-linear nature.
 It makes use of a linear regression model to fit the complicated and non-linear functions and datasets.
 Hence, "In Polynomial regression, the original features are converted into Polynomial features of required degree
(2,3,..,n) and then modeled using a linear model."
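
The quoted statement can be sketched directly: expand x into polynomial features and then fit an ordinary linear model on those features. The data below are synthetic and generated from a known quadratic, and NumPy's `vander` and `lstsq` are used here as stand-ins for a feature transformer and a linear regressor; both choices are assumptions made for this illustration.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x + 3.0 * x ** 2   # synthetic target following a known quadratic

degree = 2
# Polynomial features: columns are x^0, x^1, ..., x^degree
X_poly = np.vander(x, degree + 1, increasing=True)

# Ordinary linear least squares on the expanded features
coefficients, *_ = np.linalg.lstsq(X_poly, y, rcond=None)

print(coefficients)  # estimates of b_0, b_1, b_2
```

The model is still linear in the coefficients, which is why polynomial regression is treated as a special case of multiple linear regression even though the fitted curve is not a straight line in x.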
Machine Learning – Supervised Learning
Regression - Polynomial

Need for Polynomial Regression:

If we apply a linear model to a linear dataset, it gives a good result, as we have seen in Simple Linear
Regression. But if we apply the same model without any modification to a non-linear dataset, it will fit poorly:
the loss will increase, the error rate will be high, and accuracy will decrease.
So for cases where the data points are arranged in a non-linear fashion, we need the Polynomial Regression
model.

Consider a dataset that is arranged non-linearly. If we try to cover it with a linear model, the line passes close to
hardly any data point. A curve, which is what the Polynomial model produces, is suitable to cover most of the data
points.
Hence, if the dataset is arranged in a non-linear fashion, we should use the Polynomial Regression model
instead of Simple Linear Regression.
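
A quick numerical check of this argument, under the assumption of a synthetic dataset that follows y = x² exactly: fit a straight line and a degree-2 polynomial to the same points and compare their mean squared errors. The helper name `mean_squared_error` is defined locally for this sketch.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 13)
y = x ** 2   # synthetic, clearly non-linear data


def mean_squared_error(degree):
    """Fit a polynomial of the given degree and return its training MSE."""
    coefficients = np.polyfit(x, y, degree)
    predictions = np.polyval(coefficients, x)
    return float(np.mean((y - predictions) ** 2))


mse_linear = mean_squared_error(1)
mse_quadratic = mean_squared_error(2)

print(f"linear fit MSE:    {mse_linear:.4f}")
print(f"quadratic fit MSE: {mse_quadratic:.4f}")
```

The straight line leaves a large error because no single slope can follow the curve, while the quadratic fit reproduces the data almost exactly. On noisy real data the gap would be smaller, and a very high degree could start to overfit.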
Machine Learning – Supervised Learning
Regression – Polynomial
Python Code
=================================
# Polynomial Regression

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
==================================
# Importing the dataset
dataset = pd.read_csv('../input/position-salary-dataset/Position_Salaries.csv')
X = dataset.iloc[:, 1:-1].values
y = dataset.iloc[:, -1].values
==================================
# Displaying the dataset
dataset.head()
==================================
# Displaying X
X
==================================
# Displaying y
y
==================================
Machine Learning – Supervised Learning
Regression – Polynomial
Python Code ==================================
# Training the Linear Regression model on the whole dataset
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)
===================================
# Training the Polynomial Regression model on the whole dataset
from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree = 4)
X_poly = poly_reg.fit_transform(X)
lin_reg_2 = LinearRegression()
lin_reg_2.fit(X_poly, y)
====================================
# Visualising the Linear Regression results
plt.scatter(X, y, color = 'red')
plt.plot(X, lin_reg.predict(X), color = 'blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position Level')
plt.ylabel('Salary')
plt.show()
====================================
Machine Learning – Supervised Learning
Regression – Polynomial
Python Code
=========================================
# Visualising the Polynomial Regression results
plt.scatter(X, y, color = 'red')
# poly_reg is already fitted, so transform() (not fit_transform()) is used for prediction
plt.plot(X, lin_reg_2.predict(poly_reg.transform(X)), color = 'blue')
plt.title('Truth or Bluff (Polynomial Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
=========================================
# Visualising the Polynomial Regression results (for higher resolution and smoother curve)
# X is a 2-D array, so take scalar bounds with X.min() / X.max()
X_grid = np.arange(X.min(), X.max(), 0.1)
X_grid = X_grid.reshape((len(X_grid), 1))
plt.scatter(X, y, color = 'red')
plt.plot(X_grid, lin_reg_2.predict(poly_reg.transform(X_grid)), color = 'blue')
plt.title('Truth or Bluff (Polynomial Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
============================================
# Predicting a new result with Linear Regression
lin_reg.predict([[6.5]])
# Predicting a new result with Polynomial Regression
lin_reg_2.predict(poly_reg.transform([[6.5]]))
=================================================
Machine Learning – Supervised Learning
Regression – Polynomial
Python Code
Machine Learning – Linear Regression
