Lecture 4 Machine Learning - Bcsc
MACHINE LEARNING
WHAT IS MACHINE
LEARNING?
Machine learning is a subset of AI that focuses on designing
systems that learn and make predictions based on
experience (which is data in this case).
Machine learning enables computers to act and
make data-driven decisions rather than being
explicitly programmed.
HOW MACHINE
LEARNING WORKS
TYPES OF MACHINE
LEARNING
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement learning
SUPERVISED LEARNING
In supervised learning, the training data fed into
the ML algorithm includes the desired solutions,
called labels.
Supervised learning can be categorized into
Classification
Regression
SUPERVISED LEARNING (CONT.)
Classification: A classification problem is when the
output variable is a category, such as “red” or “blue”,
“disease” or “no disease”, “spam” or “not spam”.
An example of a classification problem is predicting
the grade of a student.
Regression: A regression problem is when the output
variable is a real value, such as “weight”.
An example of a regression problem is predicting the exam
score of a student.
UNSUPERVISED
LEARNING
Unsupervised learning is the training of a machine
using information that is neither classified nor
labeled and allowing the algorithm to act on that
information without guidance.
Here the task of the machine is to group unsorted
information according to similarities, patterns, and
differences without any prior training on labeled data.
UNSUPERVISED
LEARNING (CONT.)
Unsupervised learning can be categorized into
Clustering
Association
Clustering finds patterns in the data (e.g. shape, size,
colour) that can be used to group the data into
clusters.
Examples of algorithms used here are k-means
clustering and hierarchical clustering.
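As a rough illustration of the grouping idea, here is a minimal k-means sketch on one-dimensional data. The `sizes` values and the starting centroids are made up for the example, and the function name `k_means_1d` is hypothetical; real libraries handle many dimensions and choose starting centroids more carefully.

```python
def k_means_1d(points, centroids, iterations=10):
    """Group 1-D points around the nearest centroid (assign, then update)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = sum(members) / len(members)
    return centroids, clusters


sizes = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]  # made-up "size" measurements
centroids, clusters = k_means_1d(sizes, centroids=[0.0, 10.0])
print(clusters)
```

No labels are supplied anywhere: the two groups emerge only from how close the values are to one another.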
UNSUPERVISED
LEARNING (CONT.)
Association finds the dependencies between one data
item and another and maps them, e.g. a customer
looking for bread in the supermarket is likely to be
looking for milk as well.
Examples of algorithms used include Apriori and
FP-Growth.
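The bread-and-milk dependency can be quantified with two measures commonly used in association rule mining, support and confidence. These measures are not named on the slide and are introduced here as an assumption; the `baskets` data is invented. This is only a sketch of the idea, not an implementation of Apriori or FP-Growth.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the share that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


baskets = [  # made-up shopping baskets
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
]
bread_support = support(baskets, {"bread"})            # 3 of 4 baskets
rule_conf = confidence(baskets, {"bread"}, {"milk"})   # 2 of the 3 bread baskets
print(bread_support, rule_conf)
```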
SEMI SUPERVISED
LEARNING
Semi-supervised learning is the type of machine
learning that uses a combination of labeled data
(usually a small amount) and unlabeled data (usually
a large amount) to train models.
This approach to machine learning is a combination
of supervised machine learning, which uses labeled
training data, and unsupervised learning, which uses
unlabeled training data.
REINFORCEMENT LEARNING
Reinforcement means to establish or create a pattern of
behavior.
Reinforcement learning is a type of machine learning
where a machine learns to behave in an environment by
performing actions and seeing the results.
Agent: the RL algorithm that learns through trial and error
Environment: the world through which an agent moves
Action: any of the possible steps an agent can take
State: the current condition returned by the environment
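To make the four terms concrete, the sketch below puts an agent in a five-cell corridor and lets it learn by trial and error to walk right towards a goal. Several things here are assumptions that the slide does not mention: the reward signal, the use of tabular Q-learning as the learning rule, and the values of `alpha`, `gamma` and `epsilon`.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]  # step left or step right


def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn a table q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore sometimes (or on ties); otherwise take the best-known action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate towards reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Best action in each non-goal state according to the learned table.
policy = [ACTIONS[0 if row[0] > row[1] else 1] for row in q_table[:-1]]
print(policy)
```

The agent is never told that "right" is correct; it only sees the state and reward that the environment returns after each action.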
MACHINE LEARNING
ALGORITHMS
Often ML algorithms are referred to as ML models or
systems.
The model is the core component of machine learning,
and ultimately what we are trying to build
Rather than being edited by people so that they work
well, machine learning models are shaped by data: they
learn from experience.
MACHINE LEARNING
ALGORITHMS (CONT.)
A model can be thought of as a function that accepts data
as an input and produces an output (estimations and
predictions).
Parameter values are continuously adjusted during the
model training process.
The final parameters found after training determine how
the model will perform on unseen data.
MACHINE LEARNING
ALGORITHMS (CONT.)
Models are trained using data plus two pieces of code:
the objective function, and the optimizer.
The objective is what the model is expected to be able to do.
Objective function judges whether the model is doing a
good job or not.
The optimizer is code that changes the model’s parameters
so the model will do a better job next time.
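The three pieces described above (model, objective function, optimizer) can be sketched in a few lines. This example assumes a one-parameter model, mean squared error as the objective, and plain gradient descent as the optimizer; these are illustrative choices, and the data and learning rate are made up.

```python
def model(x, w):
    """A one-parameter model: predict y as w * x."""
    return w * x


def objective(w, xs, ys):
    """Objective function: mean squared error on the data (lower is better)."""
    return sum((model(x, w) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def optimizer_step(w, xs, ys, learning_rate=0.05):
    """Optimizer: nudge w in the direction that lowers the objective."""
    gradient = sum(2 * (model(x, w) - y) * x for x, y in zip(xs, ys)) / len(xs)
    return w - learning_rate * gradient


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]  # made-up data where y = 2x
w = 0.0                                     # initial parameter value
initial_loss = objective(w, xs, ys)
for _ in range(100):                        # training loop
    w = optimizer_step(w, xs, ys)
final_loss = objective(w, xs, ys)
print(w, initial_loss, final_loss)
```

Training changes only the value of `w`; the form of the model (`w * x`) stays the same throughout.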
MACHINE LEARNING
ALGORITHMS (CONT.)
NB: Training a model changes the parameter values inside the
model but does not change the kind of model that is used.
Using a model means providing inputs and receiving an
estimation or prediction -- this is done when testing the model
and when using it in the real world.
TYPES OF MACHINE LEARNING
ALGORITHMS
There are several machine learning algorithms. Some
common ones include
Linear regression
Logistic regression
Decision tree
Random forest
K-Nearest Neighbour
Naïve Bayes
Support Vector Machines
LINEAR REGRESSION
Linear Regression is a supervised machine learning algorithm
in which the model finds the linear relationship between the
dependent and independent variables.
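For a single independent variable, the linear relationship can be found with the standard least-squares formulas for slope and intercept. The sketch below assumes that simple case; `hours` and `scores` are invented values.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


hours = [1.0, 2.0, 3.0, 4.0]       # independent variable (made up)
scores = [52.0, 54.0, 56.0, 58.0]  # dependent variable (made up): 50 + 2 * hours
slope, intercept = fit_line(hours, scores)
print(slope, intercept)
```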
LOGISTIC REGRESSION
Logistic regression is a supervised learning classification
algorithm used to predict the probability of a target variable.
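The probability comes from passing a weighted sum of the inputs through the sigmoid (logistic) function. In the sketch below the `weight` and `bias` values are picked by hand for illustration rather than learned from data, and the 0.5 decision threshold is an assumption.

```python
import math


def predict_probability(x, weight, bias):
    """Probability of the positive class for a single feature x."""
    z = weight * x + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)


def predict_class(x, weight, bias, threshold=0.5):
    """Turn the probability into a class label (1 or 0)."""
    return 1 if predict_probability(x, weight, bias) >= threshold else 0


weight, bias = 1.5, -6.0                          # hand-picked, not trained
p_low = predict_probability(2.0, weight, bias)    # z = -3
p_high = predict_probability(6.0, weight, bias)   # z = +3
print(p_low, p_high, predict_class(6.0, weight, bias))
```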
DECISION TREE
Decision Trees are a type of Supervised Machine
Learning algorithm (used to solve both classification
and regression problems) where the data is
continuously split according to a certain parameter.
With DT, little effort is required for data preparation
Can handle both numerical and categorical data
Nonlinear relationships between features do not affect its performance
DECISION TREE
(DISADVANTAGES)
Overfitting occurs when the algorithm captures
noise in the data
High variance – the model can become unstable due to
small variations in the data
Low-bias tree – a highly complicated DT tends to
have a low bias, which makes it difficult for the
model to work with new data
RANDOM FOREST
Random forest is a supervised machine learning algorithm
used for Classification and Regression problems.
It is based on the concept of ensemble learning which is a
process of combining multiple classifiers to solve a complex
problem and improve the performance of the model.
Random forest therefore uses multiple decision trees to
predict the output.
Random forest reduces the risk of overfitting compared with
a single decision tree. Because the trees are built independently
of one another, they can be trained in parallel.
Additionally, it offers a high level of accuracy.
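For classification, the ensemble idea comes down to letting each tree vote and taking the majority. In the sketch below the three "trees" are hand-written stand-in rules with invented feature names and thresholds; a real random forest learns its trees from random subsets of the training data.

```python
from collections import Counter


# Hypothetical stand-ins for trained decision trees.
def tree_a(sample):
    return "pass" if sample["attendance"] >= 0.7 else "fail"


def tree_b(sample):
    return "pass" if sample["coursework"] >= 40 else "fail"


def tree_c(sample):
    return "pass" if sample["study_hours"] >= 1 else "fail"


def forest_predict(trees, sample):
    """Collect one prediction per tree and return the most common label."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


student = {"attendance": 0.9, "coursework": 35, "study_hours": 2}  # made up
final_label = forest_predict([tree_a, tree_b, tree_c], student)
print(final_label)
```

Here one tree votes "fail" but is outvoted by the other two, which is how combining trees can smooth out the mistakes of any single one.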
KNN (K NEAREST
NEIGHBOUR) ALGORITHM
While K nearest neighbour algorithm can be used
for either regression or classification problems, it is
typically used as a classification algorithm, working
off the assumption that similar points can be found
near one another.
“k” means the number of nearest neighbors the
model will consider.
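A minimal sketch of KNN classification follows: measure the distance from the query point to every training point, keep the k closest, and return the most common label among them. Euclidean distance is assumed here, and the points and labels are made up.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]  # made-up data
labels = ["A", "A", "A", "B", "B", "B"]
prediction = knn_predict(points, labels, query=(2, 2), k=3)
print(prediction)
```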
NAÏVE BAYES
Naïve Bayes is a probabilistic machine learning
algorithm based on the Bayes Theorem, used in a
wide variety of classification tasks.
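For reference, Bayes' theorem and the way Naïve Bayes applies it can be written as follows. The symbols $c$ (a class) and $x_1, \dots, x_n$ (the feature values) are notation introduced here, not taken from the slide. The "naïve" part is the assumption that the features are independent of one another given the class.

```latex
% Bayes' theorem
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Applied to a class c and features x_1, ..., x_n,
% assuming the features are conditionally independent given c
P(c \mid x_1, \dots, x_n) = \frac{P(c) \prod_{i=1}^{n} P(x_i \mid c)}{P(x_1, \dots, x_n)}
```

The predicted class is the one with the highest value of the numerator, since the denominator is the same for every class.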
CHALLENGES OF MACHINE
LEARNING
The challenges that can arise in selecting and training
a machine learning model can be grouped into “bad
model” and “bad data.”
The following can be classified as bad data
Insufficient Quantity of Training Data
Non-representative Training Data
CHALLENGES OF MACHINE
LEARNING (CONT.)
Bad Data (cont.):
Poor Quality of Data
Irrelevant Features
Bad Model:
Overfitting the Training Data
Underfitting the Training Data
IRRELEVANT FEATURES
Feature engineering is the process of coming up with a good
set of features within a dataset to train a model on.
This process involves
Feature selection: selecting the most useful features to train
on among existing features.
Feature extraction: combining existing features to produce
a more useful one (dimensionality reduction algorithms can
help).
OVERFITTING THE
TRAINING DATA
Overfitting is a concept where a model
performs well on training data, but it does not
generalize well to new unseen data.
Generally speaking, Overfitting happens when
the model is too complex relative to the amount
and noisiness of the data.
OVERFITTING THE
TRAINING DATA (CONT.)
The possible solutions are:
To simplify the model by selecting one with fewer
parameters or by reducing the number of attributes in the
training data
To gather more training data
To reduce the noise in the training data, that is, fixing data
errors and removing outliers
UNDERFITTING THE TRAINING DATA
Underfitting is a situation where the model is too simple to
capture the underlying structure of the data.
It generally happens when too little information is available to
construct an accurate model.
The main options to fix this problem are:
Selecting a more powerful model, with more parameters
Feeding better features to the learning algorithm (feature
engineering)
Reducing the constraints on the model (e.g., reducing the
regularization hyperparameter)
TESTING AND VALIDATING
The only way to know how well a model will generalize
to new cases is to actually try it out on new cases
Data is usually split into a training set and a test set, where
the training set is used to train the model and the test
set is used to test the model.
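A minimal sketch of such a split is shown below. The 80/20 ratio, the shuffling before splitting, and the fixed seed are assumptions made for the example; the slide does not prescribe them.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then hold out a fraction as the test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]  # (training set, test set)


data = list(range(10))  # stand-in for ten data rows
train_set, test_set = train_test_split(data, test_fraction=0.2)
print(len(train_set), len(test_set))
```

The model is fitted using only `train_set`; `test_set` plays the role of the new cases it has never seen.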
EVALUATION METRICS
FOR REGRESSION ANALYSIS
MAE – Mean Absolute Error
MSE – Mean Squared Error
RMSE – Root Mean Squared Error
MAPE – Mean Absolute Percentage Error
MPE – Mean Percentage Error
R SQUARED SCORE
ADJUSTED R SQUARED SCORE
MEAN SQUARED ERROR -MSE
Mean Squared Error (MSE): finds the average of the
squared difference between the target value and the
value predicted by the regression model.
It is given by
$$\mathrm{MSE} = \frac{1}{N} \sum_{j=1}^{N} \left( y_j - \hat{y}_j \right)^2$$
Where
• y_j: actual value
• \hat{y}_j: predicted value from the regression model
• N: number of items in the sample
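The definition (the average of the squared differences between actual and predicted values) translates directly into code. The values below are made up for illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)


actual = [3.0, 5.0, 7.0]     # made-up target values
predicted = [2.0, 5.0, 9.0]  # made-up model outputs
mse = mean_squared_error(actual, predicted)  # (1 + 0 + 4) / 3
print(mse)
```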
MEAN SQUARED ERROR –
MSE (CONT.)
Because the errors are squared, large errors are penalized
much more heavily than small ones, which can lead to an
overestimation of how bad the model is.
Error interpretation has to be done with the squaring factor
(scale) in mind: MSE is expressed in squared units of the target.
Due to the squaring factor, it is fundamentally more
sensitive to outliers than other metrics.
MEAN ABSOLUTE ERROR -MAE
Mean Absolute Error is the average of the absolute difference between
the actual values and the predicted values.
Mathematically, it is represented as:
$$\mathrm{MAE} = \frac{1}{N} \sum_{j=1}^{N} \left| y_j - \hat{y}_j \right|$$
Where:
• y_j: actual value
• \hat{y}_j: predicted value from the regression model
• N: number of items in the sample
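As a small sketch, MAE takes the absolute value of each difference before averaging, so positive and negative errors cannot cancel out. The values below are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


actual = [3.0, 5.0, 7.0]     # made-up target values
predicted = [2.0, 5.0, 9.0]  # made-up model outputs
mae = mean_absolute_error(actual, predicted)  # (1 + 0 + 2) / 3
print(mae)
```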
MEAN ABSOLUTE ERROR –
MAE (CONT)
• It’s more robust towards outliers than MSE, since it
doesn’t exaggerate errors.
• It gives a measure of how far the predictions were
from the actual output.
• However, since MAE uses the absolute value of the
residual, it doesn’t give us an idea of the direction of
the error, i.e. whether the model is under-predicting or
over-predicting the data.
ROOT MEAN SQUARED
ERROR (RMSE)
Root Mean Squared Error (RMSE): It is the square root of
MSE, i.e. the square root of the mean squared difference
between actual and predicted values.
Because the errors are squared before averaging, RMSE
penalizes large errors more heavily than MAE does.
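A small sketch of RMSE follows, computed next to MAE on the same made-up predictions so the effect of one large error is visible. The data contains a single deliberately bad prediction.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared difference."""
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def mean_absolute_error(actual, predicted):
    """Average absolute difference, shown for comparison."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


actual = [10.0, 10.0, 10.0, 10.0]
predicted = [11.0, 9.0, 11.0, 20.0]  # the last prediction is off by 10
rmse = root_mean_squared_error(actual, predicted)  # sqrt((1 + 1 + 1 + 100) / 4)
mae = mean_absolute_error(actual, predicted)       # (1 + 1 + 1 + 10) / 4
print(rmse, mae)
```

Both values are in the same units as the target, but the single large error pulls RMSE well above MAE.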
MAE & RMSE - SIMILARITIES
Mean Absolute Error (MAE) and Root mean squared error (RMSE) are
two of the most common metrics used to measure accuracy for
continuous variables.
Both MAE and RMSE express average model prediction error in units of
the variable of interest.
Both metrics can range from 0 to ∞ and are indifferent to the direction of
errors.
They are negatively oriented scores, which means lower values are better.
MAE & RMSE - DIFFERENCES