
AI chapter 5 (1)

Chapter 5 covers the basics of machine learning, including types such as supervised, unsupervised, and reinforcement learning, along with their applications and techniques. It emphasizes the importance of knowledge representation, probabilistic models, and various algorithms like KNN, SVM, and neural networks. The chapter also discusses real-world applications and the significance of machine learning in enhancing decision-making and automation.

Uploaded by

Habiteneh Endale
Copyright
© All Rights Reserved

Chapter 5: Machine Learning Basics

 Presentation Outline

5.1 Knowledge in Learning


5.2 Machine Learning
5.3 Supervised learning
5.3.1 Linear classification models
5.3.2 Probabilistic models
5.4 Unsupervised learning
5.4.1 Clustering models
5.5 Reinforcement learning
5.6 Deep Learning
5.6.1 Neural networks and back-propagation
5.6.2 Convolution neural networks
5.6.3 Recurrent neural networks and LSTMs
5.1 Knowledge in Learning

Knowledge in learning is the information and data that a machine learning model uses to learn and make
decisions.
In short, it concerns knowledge representation and acquisition in machine learning.

Representation: How knowledge is encoded within a model.

Importance of Knowledge in Learning


Essential for training accurate and efficient models.
 Foundation for building intelligent systems that learn from data.

Generalization: Helps models perform well on unseen data.

Adaptability: Allows models to improve over time and adapt to new data.
05/19/2025 2
5.2 Learning Probabilistic Models

Probabilistic models are models that incorporate probability theory to predict outcomes and handle uncertainty.

Usage: Used to estimate the likelihood of various outcomes.

Why Probabilistic Models Matter:


Uncertainty Handling: They manage uncertainty and variability in data.

Real-World Applications: Effective in real-world scenarios where outcomes are not
deterministic.

Flexibility: Can be applied to various types of data and problems.

…5.2 Learning Probabilistic Models

Real-World Applications of Probabilistic Models


Examples:
 Spam Filtering: Naive Bayes Classifier to detect spam emails.

 Recommendation Systems: Predict user preferences and recommend products.

 Speech Recognition: HMMs to model and predict spoken words.

5.2 Machine Learning

Machine learning is a subfield of artificial intelligence (AI) that focuses on the development
of algorithms and statistical models that enable computers to learn and improve their
performance on a specific task without being explicitly programmed.
In essence, machine learning allows computers to learn from data, identify patterns, and
make decisions or predictions based on that learned information.
Machine learning plays a crucial role in enabling computers to learn from data and make
intelligent decisions, leading to advancements in various fields and improving efficiency,
accuracy, and automation in tasks.

…5.2 Machine Learning

With machine learning we can gain insight from a dataset.

We’re going to ask the computer to make some sense from the data.

This is what we mean by learning.

Machine learning is the process of turning data into information and knowledge.

Machine Learning lies at the intersection of computer science, engineering, and statistics
and often appears in other disciplines.

…5.2 Machine Learning

It’s a tool that can be applied to many problems.

Any field that needs to interpret and act on data can benefit from Machine Learning
techniques.
There are many problems where the solution isn’t deterministic.

That is, we don’t know enough about the problem or don’t have enough computing power
to properly model the problem.

Traditional Vs ML systems

In Machine Learning, once the system is provided with the right data and algorithms, it can
“fish for itself”.

…Traditional Vs ML systems

A key aspect of Machine Learning that makes it particularly appealing in terms of business
value is that it does not require as much explicit programming in advance.

…5.2 Machine Learning
Why Machine Learning ?
It is very hard to write programs by hand that solve complex problems like the following, but with ML
it is possible:
 Solving complex problems

 Automation and Efficiency

 Enhanced Decision Making

 Real-Time Applications

 Healthcare Advancements

 Improving Customer Service

 Security Enhancements

5.2 Machine Learning
Key components of machine learning
Data: Machine learning algorithms require data to learn from.

Algorithms: Machine learning algorithms are mathematical models that process data and
learn from it to make predictions or decisions.
Training: During the training phase, machine learning models are exposed to labeled or
unlabeled data to learn patterns and relationships.
Evaluation: The model's accuracy, precision, recall, or other relevant metrics are assessed
to determine its effectiveness.
Deployment: Once the model has been trained and evaluated, it can be deployed to make
predictions or decisions on new, unseen data.
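The evaluation step can be illustrated with a small sketch that computes accuracy, precision, and recall for a binary classifier. The labels are made-up values for illustration (1 = positive class, 0 = negative class), and the function name `evaluate` is an assumption, not something defined in this chapter.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical true labels and model predictions for eight examples
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
accuracy, precision, recall = evaluate(y_true, y_pred)
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

Here 3 true positives, 3 true negatives, 1 false positive, and 1 false negative give 6/8 accuracy and 3/4 for both precision and recall.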
…5.2 Machine Learning

Types of Machine Learning

Supervised Learning (label provided for every input)


Unsupervised Learning (no label provided)
Reinforcement Learning (classical AI and ML)

…5.2 Machine Learning
Types of Machine Learning

5.3 Supervised learning
Supervised learning is a type of machine learning where the algorithm learns from labeled
data.
Each example in the training dataset consists of an input and an associated output label.

The goal of supervised learning is to learn a mapping from inputs to outputs based on this
labeled data, so that it can make predictions or decisions about new, unseen data.
Classification and regression are examples of supervised learning.

This set of problems is known as supervised because we’re telling the algorithm what to
predict.

5.3 Supervised learning

Supervised learning is

5.3 Supervised learning
1. Classification
Classification is a supervised learning task where the goal is to predict the category or class label of new
observations based on past observations with known labels.
Classification problem: the target is a nominal (discrete) value.
 Grouping the data into predetermined classes.

Example 1: Given a patient's information, does the patient have the disease or not?
 Learn to predict an output when given an input vector; there is an outcome that we are trying to predict.

 Features: age, gender, smoking, drinking, etc.

 Labels: has the disease, does not have the disease

 Example 3: Is a given email “spam” or “ham”?


5.3 Supervised learning

1. Classification
In classification, our job is to predict what class an instance of data should fall into.

Example 2:

…5.3 Supervised learning
1. Classification Applications:
Spam Detection: Classifying emails as spam or not spam.

Image Recognition: Identifying objects in images, such as distinguishing between pictures
of cats and dogs.
Medical Diagnosis: Predicting whether a patient has a certain disease based on their
medical data.

…5.3 Supervised learning
1. Classification Techniques:
K-Nearest Neighbors (KNN): Works with numeric and nominal values, and can be used for both
classification and regression.
Logistic Regression: Despite its name, it is used for binary classification problems.

Support Vector Machines (SVM): Finds the hyperplane that best separates different classes
in the feature space.
Decision Trees and Random Forests: Tree-based methods that split data into subsets based
on feature values to make predictions.
Neural Networks: Especially useful for complex classification tasks like image and speech
recognition.
…5.3 Supervised learning
KNN Classification :
Problem:- Classify movies as romance or action movies using the kNN algorithm.
 Features: the number of kisses and kicks in each movie.

Now, you find a movie you haven’t seen yet and want to know if it’s a romance movie or an
action movie.
We find the movie in question and see how many kicks and kisses it has.

…5.3 Supervised learning
KNN Classification :
Movies with the numbers of kicks and numbers of kisses along with their class

Quiz
1. Write the application of ML/DL
2. Write and define the types of ML? Give an example for each
3. Write and define the types of DL? Give an example for each
4. Classify the following film as romance or action. Use the KNN algorithm with K=3.
Feature: No of kicks: 78
No of kisses: 32
…5.3 Supervised learning
KNN Classification :
Solution:- We don’t know what type of movie the question mark movie is.

 First, we calculate the distance to all the other movies.

…5.3 Supervised learning
KNN Classification :
Solution:- We don’t know what type of movie the question mark movie is.

Let’s assume k=3.

Then, the three closest movies are He’s Not Really into Dudes, Beautiful Woman, and
California Man.
Because all three movies are romances, we predict by majority vote that the mystery movie is a
romance movie.
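The procedure above (compute distances, take the k closest, vote) can be sketched as follows. The kick and kiss counts, the query point (18 kicks, 90 kisses), and the three action-movie names are illustrative assumptions, chosen only so that the three nearest neighbours match the ones named above.

```python
import math
from collections import Counter

# name -> (kicks, kisses, class); all counts are illustrative assumptions
movies = {
    "California Man": (3, 104, "romance"),
    "He's Not Really into Dudes": (2, 100, "romance"),
    "Beautiful Woman": (1, 81, "romance"),
    "Action Movie A": (101, 10, "action"),  # hypothetical title
    "Action Movie B": (99, 5, "action"),    # hypothetical title
    "Action Movie C": (98, 2, "action"),    # hypothetical title
}


def knn_classify(query, data, k=3):
    """Return (majority label, names of the k nearest movies)."""
    # Sort all movies by Euclidean distance to the query point
    by_distance = sorted(
        data.items(),
        key=lambda item: math.dist(query, item[1][:2]),
    )
    nearest = by_distance[:k]
    # Majority vote over the labels of the k nearest movies
    votes = Counter(label for _, (_, _, label) in nearest)
    return votes.most_common(1)[0][0], [name for name, _ in nearest]


label, neighbours = knn_classify((18, 90), movies, k=3)
print(label)
print(neighbours)
```

With these assumed numbers the three nearest movies are all romances, so the vote is unanimous; with mixed neighbours the most frequent label would win.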

…5.3 Supervised learning

2. Regression
Regression is a supervised learning task that involves predicting a continuous output
variable based on input features.
Involves predicting continuous values.

Regression is the prediction of a numeric value. (target variable)

Regression Problem: numerical / continuous value.

Given some data, you assume that those values come from some sort of function and try to
find out what the function is.
It is a problem of function approximation or interpolation.
…5.3 Supervised learning

2. Regression
Applications:

House Price Prediction: Estimating the value of a house based on features like location, size,
and age.
Stock Market Forecasting: Predicting the future price of stocks based on historical data.

Sales Forecasting: Estimating future sales volumes based on past sales data and market
trends.

…5.3 Supervised learning
2. Regression
Techniques:

Linear Regression: A method that models the relationship between the dependent variable
and one or more independent variables by fitting a linear equation.
Polynomial Regression: Extends linear regression by considering polynomial relationships
between the dependent and independent variables.
Ridge and Lasso Regression: Techniques that add regularization terms to the linear
regression model to prevent overfitting.
Neural Networks: Also used for regression, particularly when dealing with highly non-linear
relationships.
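As a minimal sketch of the first technique, linear regression with a single input feature can be fitted in closed form by least squares. The function name `fit_line` and the data points are assumptions made for illustration; the points lie exactly on y = 2x + 1 so the result is easy to check.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Ridge and Lasso regression change the objective by adding a penalty on the size of the coefficients; this sketch has no regularization term.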
…5.3 Supervised learning
KNN Regression Problem:

 The similarity measure is dependent on the type of the data:
 Real-valued data: Euclidean distance.
 Categorical or binary data: Hamming distance (P-norm; when p=0)

Query: Q = (4, 2), y = ???

X1, X2    y
1, 6      7
2, 4      8
3, 7      16
6, 8      44
7, 1      50
8, 4      68

Regression: predict y as the average of the k nearest neighbours under the distance d():
 Euclidean: 1-NN _______   3-NN _______
 Manhattan: 1-NN _______   3-NN _______
…5.3 Supervised learning
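KNN Regression Problem:

Query: Q = (4, 2), y = ???

X1, X2    y     ED
1, 6      7     25
2, 4      8     8
3, 7      16    26
6, 8      44    40
7, 1      50    10
8, 4      68    20

Euclidean: $d(X_i, Q) = \sqrt{(X_{1i} - q_1)^2 + (X_{2i} - q_2)^2}$
The ED column lists the squared distances (the value before the square root is taken), which rank the
neighbours in the same order.

Regression: predict y as the average of the k nearest neighbours under the distance d():
 Euclidean: 1-NN ___8___   3-NN ___42__   since (8 + 50 + 68) / 3 = 42
 Manhattan: 1-NN _______   3-NN _______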
…5.3 Supervised learning
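KNN Regression Problem:

Query: Q = (4, 2), y = ???

X1, X2    y     MD
1, 6      7     7
2, 4      8     4
3, 7      16    6
6, 8      44    8
7, 1      50    4
8, 4      68    6

Manhattan: $d(X_i, Q) = |X_{1i} - q_1| + |X_{2i} - q_2|$

Regression: predict y as the average of the k nearest neighbours under the distance d():
 Euclidean: 1-NN ___8____  3-NN ___42___
 Manhattan: 1-NN ___29__   3-NN __35.5__
Tied points are all kept: for 1-NN two points share distance 4, so (8 + 50) / 2 = 29; for 3-NN both
points at distance 6 are included, so (8 + 50 + 16 + 68) / 4 = 35.5.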
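The worked answers for this problem (8 and 42 with Euclidean distance, 29 and 35.5 with Manhattan distance) can be checked with a short sketch. The function names are assumptions, and so is the tie rule: every point tied with the k-th nearest is included in the average, which is the reading that matches the Manhattan answers.

```python
# Training points ((X1, X2), y) and the query Q from the problem
data = [((1, 6), 7), ((2, 4), 8), ((3, 7), 16),
        ((6, 8), 44), ((7, 1), 50), ((8, 4), 68)]
query = (4, 2)


def squared_euclidean(a, b):
    # Squared distance ranks neighbours in the same order as the true Euclidean distance
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_regress(query, data, k, distance):
    """Average y over the k nearest points, keeping every point tied with the k-th."""
    scored = sorted((distance(x, query), y) for x, y in data)
    cutoff = scored[k - 1][0]  # distance of the k-th nearest point
    neighbours = [y for d, y in scored if d <= cutoff]
    return sum(neighbours) / len(neighbours)


print(knn_regress(query, data, 1, squared_euclidean))  # 8.0
print(knn_regress(query, data, 3, squared_euclidean))  # 42.0
print(knn_regress(query, data, 1, manhattan))          # 29.0
print(knn_regress(query, data, 3, manhattan))          # 35.5
```

A different tie rule, such as picking exactly k points and breaking ties arbitrarily, would change the Manhattan results.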

5.4 Unsupervised learning
Unsupervised learning is a type of machine learning where the algorithm is trained on unlabeled data.
 This means that the data provided to the model does not come with predefined labels or outcomes.

The aim of unsupervised learning is to find clusters of similar inputs in the data without
being explicitly told which data points belong to which class.
The algorithm has to discover this similarity by itself.

Discover a good internal representation of the input.

…5.4 Unsupervised learning
Architectures:-

…5.4 Unsupervised learning
Example:-

…5.4 Unsupervised learning
Example:- Segment grocery store shoppers into clusters that exhibit similar behaviors.
There is no “right answer”.

Clustering
Clustering is an unsupervised learning task where the objective is to group a set of objects in
such a way that objects in the same group (cluster) are more similar to each other than to
those in other groups.

…5.4 Unsupervised learning
Clustering

Applications:
Customer Segmentation: Grouping customers based on purchasing behavior to target
marketing efforts more effectively.
Document Clustering: Organizing a large set of documents into topics for improved
information retrieval.
Anomaly Detection: Identifying unusual patterns that do not conform to expected behavior,
useful in fraud detection.

…5.4 Unsupervised learning
Clustering

Techniques:
K-Means Clustering: A popular algorithm that partitions the data into K clusters by minimizing the
variance within each cluster.

Hierarchical Clustering: Builds a tree of clusters by either merging or splitting them iteratively.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Groups together points that are
closely packed together, marking points that lie alone in low-density regions as outliers.

Gaussian Mixture Models (GMM): Assumes data is generated from a mixture of several Gaussian
distributions and assigns probabilities to each point for belonging to each cluster.
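To make the first technique concrete, here is a minimal K-means sketch in plain Python. The points and the choice of initial centroids are made-up for illustration, and passing the initial centroids in explicitly is a simplifying assumption; practical implementations usually choose them randomly or with a seeding heuristic.

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain K-means: alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical 2-D points forming two well-separated groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)     # [0, 0, 0, 1, 1, 1]
print(centroids)
```

Moving each centroid to the mean of its members is what reduces the within-cluster variance at every iteration.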

5.5 Reinforcement learning
Reinforcement learning is a type of machine learning where an agent learns to behave in an environment by
performing actions and seeing the results.
Reinforcement (stimulus/state, action, reward) [behavior scores well] – reward
maximization.
The algorithm searches over the state space of possible inputs and outputs in order to
maximize a reward.
Learn to select an action to maximize payoff.

…5.5 Reinforcement learning
Reinforcement learning system is composed of two main components:
Agent

Environment

The action influences the state of the world which determines its reward.

…5.5 Reinforcement learning
Architecture:

…5.5 Reinforcement learning
Example of RL:

A robot cleaning a room,

Game playing,

Learning how to fly a helicopter,

Scheduling planes to their destinations,

 Reinforcement learning with an analogy:


Scenario 1: The baby starts crawling and reaches the candy. (candy)
Scenario 2: The baby starts crawling but is stopped by a hurdle along the way. (no candy)

…5.5 Reinforcement learning
 There are two approaches:
Model-based RL:
 Learn the model, and use it to derive the optimal policy.
 E.g., the Adaptive Dynamic Programming (ADP) approach
Model-free RL:
 Derive the optimal policy without learning the model.
 E.g., the LMS and Temporal Difference (TD) approaches

…5.5 Reinforcement learning
 Reinforcement Learning Definitions:
 Agent: The RL algorithm that learns from trial and error.
 Environment: The world through which the agent moves.
 Action (A): All the possible steps that the agent can take.
 State (S): Current condition returned by the environment.
 Reward (R): An instant return from the environment to appraise the last action.
 Policy (π): The approach that the agent uses to determine the next action based on the
current state.
 Value (V): The expected long-term return with discount, as opposed to the short-term
reward R.
 Action-value (Q): Similar to the value, except that it takes an extra parameter, the current
action (A).

…5.5 Reinforcement learning
Techniques:

 Markov Decision Process (MDP):- is the mathematical framework used to formulate a
reinforcement learning problem in terms of states, actions, transitions, and rewards.
 Q-Learning:- is a value-based reinforcement learning algorithm used to find the optimal
action-selection policy using a Q function.
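A minimal tabular Q-learning sketch is shown below. The environment is a hypothetical toy: five states in a row, with a reward of 1 for reaching the rightmost state. The learning rate, discount factor, episode count, and all names are assumptions chosen for illustration.

```python
import random

N_STATES = 5          # states 0..4 in a row; state 4 is the goal (toy environment)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9


def step(state, action):
    """Deterministic environment: reward 1 only when the goal state is reached."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Explore with random actions; Q-learning is off-policy, so it can
            # still learn the values of the greedy policy
            action = rng.choice(ACTIONS)
            nxt, reward = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            target = reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q = train()
# Greedy policy: in each non-terminal state pick the action with the highest Q-value
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1], i.e. always move right
```

The learned Q-values shrink by a factor of `GAMMA` for each extra step away from the goal, which is why moving right always scores higher than moving left.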

5.6 Deep Learning
Deep learning is a form of machine learning that uses very complex models called “deep neural
networks”.
Deep Learning Vs Classical Machine Learning:

…5.6 Deep Learning

Deep Learning problem types:


Deep Learning can solve multiple supervised and unsupervised problems.

The majority of its success has been when working with images, natural language, and
audio data.
Image classification and detection.

Semantic segmentation.

Natural language object retrieval.

Speech recognition and language translation.

…5.6 Deep Learning

Example:
Classification and Detection: detect and label the image.
 Person,

 Motor Bike.

…5.6 Deep Learning

Example:
Natural Language object retrieval.

…5.6 Deep Learning

There are some basic deep learning models/architectures:


Convolutional neural networks.

Recurrent neural networks, used for sequence data such as predicting future outcomes (e.g., LSTM).

Generative Adversarial Networks (GAN), etc.

5.6.1 Neural networks and back-propagation

Neural Networks
A neural network consists of layers of interconnected nodes, or neurons, where each
connection represents a weight.
The layers typically include:

Input Layer: Receives the input data.

Hidden Layers: Perform computations and feature extraction.

Output Layer: Produces the final prediction or classification.

Forward Propagation: Data passes through the network from the input layer to the output
layer, and each neuron applies an activation function to its inputs to produce an output.
5.6.1 Neural networks and back-propagation

Back-Propagation
Back-propagation is the process of training neural networks, which involves updating the weights to minimize
the error.
Steps in Back-Propagation:

Forward Pass: Calculate the predicted output by passing the input through the network.

Compute Error: Measure the error using a loss function (e.g., mean squared error,
cross-entropy).
Backward Pass: Propagate the error backward through the network, calculating the
gradient of the loss function with respect to each weight.
Weight Update: Adjust the weights using an optimization algorithm like gradient descent.
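The four steps can be sketched on a deliberately tiny network: one tanh hidden neuron followed by one linear output neuron, trained with mean squared error and plain gradient descent. The training data, initial weights, learning rate, and function names are all assumptions for illustration.

```python
import math

# Hypothetical training data: five (input, target) pairs
DATA = [(-1.0, -0.8), (-0.5, -0.5), (0.0, 0.0), (0.5, 0.5), (1.0, 0.8)]


def forward(params, x):
    """Forward pass: one tanh hidden neuron, then one linear output neuron."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    return h, w2 * h + b2


def mse(params):
    """Mean squared error of the network over DATA."""
    return sum((forward(params, x)[1] - t) ** 2 for x, t in DATA) / len(DATA)


def train(params, lr=0.1, epochs=2000):
    w1, b1, w2, b2 = params
    for _ in range(epochs):
        g_w1 = g_b1 = g_w2 = g_b2 = 0.0
        for x, t in DATA:
            h, y = forward((w1, b1, w2, b2), x)   # 1. forward pass
            d_y = 2.0 * (y - t) / len(DATA)       # 2. error: dLoss/dy for MSE
            g_w2 += d_y * h                       # 3. backward pass (chain rule)
            g_b2 += d_y
            d_z = d_y * w2 * (1.0 - h * h)        #    through tanh: 1 - tanh(z)^2
            g_w1 += d_z * x
            g_b1 += d_z
        w1 -= lr * g_w1                           # 4. weight update (gradient descent)
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2)


initial = (0.5, 0.0, 0.5, 0.0)
trained = train(initial)
print(mse(initial), mse(trained))  # the second value should be smaller
```

In a deeper network the backward pass repeats the same chain-rule step once per layer, from the output layer back to the input layer.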
5.6.2 Convolution neural networks
Convolutional neural networks (CNNs) are a subset of machine learning models, and they are at the heart of deep learning algorithms.

They are distinguished from other neural networks by their superior performance with image, speech, and audio
signal inputs.
A CNN has many layers with various functions.

These can be categorized as main layers and supporting tricks.

The main layers are:


 Convolutional layer

 Pooling layer

 Fully-connected (FC) layer

5.6.2 Convolution neural networks
CNNs are specialized neural networks for processing structured grid data, like images.

Key Components:
Convolutional Layers: Apply convolution operations to the input, using filters (kernels)
to detect features such as edges, textures, and patterns.
Activation Function: Commonly ReLU (Rectified Linear Unit), introduces non-linearity.

Pooling Layers: Downsample the spatial dimensions (e.g., max pooling), reducing the
computational load and focusing on important features.
Fully Connected Layers: Flatten the input and apply dense layers for final classification
or regression.
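The first three components can be sketched on a tiny example in plain Python. The 5x5 image and the 2x2 edge-detecting kernel are made-up illustrative values, and the function names are assumptions; in a real CNN the kernel values are learned during training, and the final fully connected layers are omitted here.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution as used in CNNs (cross-correlation, stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """ReLU activation: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Hypothetical image: a bright vertical stripe in columns 2 and 3
image = [[0, 0, 1, 1, 0] for _ in range(5)]
# Hypothetical kernel that responds to a dark-to-bright edge (left to right)
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)   # 4x4 feature map, each row is [0, 2, 0, -2]
activated = relu(features)         # each row becomes [0, 2, 0, 0]
pooled = max_pool(activated)       # 2x2 map: [[2, 0], [2, 0]]
print(pooled)
```

The filter fires where the stripe begins, ReLU discards the negative response where it ends, and pooling keeps the strongest response in each 2x2 block.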
…5.6.2 Convolution neural networks
Convolutional neural networks (CNNs) are algorithms that work like the brain’s visual processing system.

They can process images and detect objects by filtering a visual prompt and assessing components such as
patterns, texture, shapes, and colors.

CNNs often power computer vision and image recognition, fields of AI that teach machines how to process the
visual world.

Convolutional Neural Networks (CNNs) have been instrumental in advancing the field of computer vision.

…5.6.2 Convolution neural networks
CNN – Application
 Classification

 Recognition

 Feature extraction

 Detection

 Segmentation

Example:- Image classification, object detection , facial recognition, etc.

…5.6.2 Convolution neural networks
Here is a list of popular CNN architectures or models developed in the context of computer vision:

AlexNet: Won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012.

VGGNet: Known for its uniform architecture with deep and stacked convolutional layers.

GoogLeNet (Inception): Introduced inception modules, which allow the network to learn and use filters of various sizes.

ResNet: Addressed the vanishing gradient problem with the introduction of residual connections, enabling the training of
very deep networks.

DenseNet: Utilizes dense connectivity between layers, allowing each layer to receive feature maps from all preceding layers.

MobileNet: Designed for efficient deployment on mobile and embedded devices.

EfficientNet: Introduced a compound scaling method to optimize the balance between depth, width, and resolution for
improved efficiency.

YOLO (You Only Look Once): Designed for real-time object detection, divides an image into a grid and predicts bounding
boxes and class probabilities for each grid cell.
5.6.3 Recurrent neural networks and LSTMs
RNNs are designed for sequential data, where the output depends on previous computations.

Structure: Neurons in RNNs have connections to previous states, allowing information to
persist.
Training Challenges: Suffer from vanishing and exploding gradient problems, making it
difficult to learn long-term dependencies.

5.6.3 Recurrent neural networks and LSTMs
Long Short-Term Memory Networks (LSTMs)
LSTMs are a type of RNN designed to handle long-term dependencies better than standard
RNNs.

Components of LSTM Cell:


Forget Gate: Decides what information to discard from the cell state.

Input Gate: Determines what new information to store in the cell state.

Output Gate: Controls the output based on the cell state.
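The way the three gates interact can be sketched as a single LSTM time step with scalar values. The parameter names (`wf`, `uf`, `bf`, and so on) and all numeric values are assumptions for illustration; real LSTMs use weight matrices and vectors, but the gate equations have the same form.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step with scalar input, hidden state, and cell state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate new information
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g       # discard part of the old state, store part of the new
    h = o * math.tanh(c)         # output is a gated view of the cell state
    return h, c


# Hypothetical parameters: the large biases force the forget gate to ~1
# and the input gate to ~0, so the cell state is carried over unchanged
params = {
    "wf": 0.5, "uf": 0.5, "bf": 100.0,
    "wi": 0.5, "ui": 0.5, "bi": -100.0,
    "wg": 0.5, "ug": 0.5, "bg": 0.0,
    "wo": 0.5, "uo": 0.5, "bo": 0.0,
}
h, c = lstm_step(x=1.0, h_prev=0.2, c_prev=0.7, p=params)
print(c)  # stays at (approximately) 0.7
```

Because the cell state can pass through a step almost unchanged when the forget gate is near 1, information can persist across many steps, which is what helps with long-term dependencies.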

5.6.3 Recurrent neural networks and LSTMs
Applications:
 Time series prediction,

 language modeling,

 speech recognition, and more.

5.6.4 Generative Adversarial Networks (GAN)
GANs have the capability to generate new samples similar to the data they were trained on.

For example: creating new faces after being trained on a large dataset of faces.

…5.6.4 Generative Adversarial Networks (GAN)
 The general idea of how GANs work:

Two networks: a generator (G) and a discriminator (D).

These networks “compete” against each other.

…5.6.4 Generative Adversarial Networks (GAN)
 The general idea of how GANs work:

Eventually after a lot of training (and usually tuning of hyperparameters) the generator will
hopefully be able to generate examples indistinguishable from the real data.
What is not simple is the tuning of the hyperparameters and the training time involved.

 Discriminator overpowering generator:

Sometimes the discriminator begins classifying all the generated examples as fake.

It may help to have the discriminator output unscaled values instead of passing them through a sigmoid.

…5.6.4 Generative Adversarial Networks (GAN)
 The general idea of how GANs work:

 Mode collapse:
 The generator discovers some weakness in the discriminator and then continually
produces a similar example regardless of variation in the input.
 One can try adjusting the training rate or changing the layers of the discriminator in an
attempt to improve this.
 Realistically, only GPU-powered computers will be able to handle training of a GAN.
 Even then, training can take a long time (days to weeks) for certain data.

How does deep learning work?
Deep learning applications work using artificial neural networks—a layered structure of algorithms. To use a
deep learning model, a user must enter an input (unlabeled data).

 It is then sent through the hidden layers of the neural network where it uses mathematical operations to
identify patterns and develop a final output (response).

The algorithm’s design pulls inspiration from the human brain and its network of neurons, which transmit
information via messages.

Applications of machine/deep learning include:
Natural Language Processing (NLP).

Image Recognition

Healthcare (e.g., disease diagnosis)

Finance (e.g., fraud detection)

Transportation (e.g., autonomous vehicles)

Marketing (e.g., customer segmentation)

Recommender Systems

Robotics

Thanks
