Chap 5 Learning
Contents to be covered
• What is learning
- Components of learning Agent
• Classification tasks
- Steps for classification
- Classifier performance measure
- Factors affecting classifier model performance
• Types of learning
• Learning approaches/methods
What is Learning?
• Learning is one of the keys to human intelligence. Do
you agree?
• What is learning? Learning is:
- Memorizing something
- Knowing facts through observation and exploration
- Improving motor and/or cognitive skills through practice
• The idea behind learning is that percepts should not
only be used for acting now, but also for improving
the agent’s ability to act in the future to achieve
the goal.
- Learning is essential for unknown environments,
i.e. when the agent lacks knowledge.
- It enables the agent to organize new knowledge into
general, effective representations.
• Learning modifies the agent's decision-making mechanisms
to improve performance.
Learning Examples
– Learning is memorizing and remembering
• Telephone number
• Reading textbook
• Understanding language (memorizing grammar & practicing)
• Recognizing faces, signatures & fraudulent credit
card transactions
– Learning is improving motor skills
• Riding a bike
• Exercising and practicing the idea
– Learning is understanding the strategy & rules of the game
• Playing chess and football
– Learning is abstraction and exploration
• Develop scientific theory
• Undertaking research
– Learning is nothing but
• Feature extraction
• Classification
Feature extraction
[Figure: examples labeled “Good” and “Bad”]
Classification
• Applications
– Character recognition: different printing styles.
– Speech recognition: use of a dictionary or the syntax of the language.
• Example: Credit scoring
– Differentiating between low-risk & high-risk customers
from their income & savings
Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk
ELSE high-risk
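The discriminant above can be written as a short function. This is a minimal sketch: the function name and the threshold values used in the example call are hypothetical, since the slides do not give concrete values for θ1 and θ2.

```python
def credit_risk(income, savings, theta1, theta2):
    """Apply the discriminant: low-risk only if BOTH thresholds are exceeded."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


# Hypothetical thresholds, chosen only for illustration.
THETA1 = 30_000  # income threshold (theta1)
THETA2 = 5_000   # savings threshold (theta2)

print(credit_risk(45_000, 8_000, THETA1, THETA2))
print(credit_risk(45_000, 1_000, THETA1, THETA2))
```

In a learning setting the thresholds would be estimated from labeled customer data rather than fixed by hand.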
Face Recognition
• Learning: it is training (adaptation) from a data set
• Face recognition is challenging because of the effect of
facial expression, lighting, occlusion, make-up, hair style, etc.
[Figure: training examples of a person and test images]
Learning Agents
[Figure: learning agent architecture, with percepts as input and actions as output]
The Basic Learning Model
• A computer program is said to learn from experience
E with respect to some class of tasks T and
performance measure P,
– if its performance at tasks T, as measured by P, improves
with experience E.
• Learning agents consist of four main components:
– learning element -- the part of the agent responsible
for improving its performance
– performance element -- the part that chooses the
actions to take
– critic -- provides feedback to the learning element on
how the agent is doing with respect to a performance
standard
– problem generator -- suggests problems or actions that
will help in training the system further
Data sets preparation for learning
• Data sets are also known as samples, examples or instances
• Good data is a prerequisite for
producing effective models for any type of
problem.
• Usually, the given data set is divided into training
and test sets.
• Training set
– Used in supervised learning, a training set is a set
of problem instances (described as a set of
properties and their values), together with a
classification of the instance.
• Test set
– A set of instances and their classifications used
to estimate the accuracy of the learned model.
How to Split Data Sets into Training and Testing Sets?
Holdout Method
– Given data is randomly partitioned into
two independent sets
o Training set (2/3) for model construction
o Test set (1/3) for accuracy estimation
– Suitable when many (thousands of) examples are available, including
several hundred examples from each class.
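The holdout method can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `holdout_split` and its `seed` parameter are assumptions, not part of the slides.

```python
import random


def holdout_split(data, train_fraction=2 / 3, seed=0):
    """Randomly partition data into independent training and test sets."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 30 dummy instances: 2/3 go to training, 1/3 to testing.
train_set, test_set = holdout_split(range(30))
print(len(train_set), len(test_set))
```

Every instance lands in exactly one of the two sets, so the test set stays independent of the training set.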
Cross-Validation Method
- Randomly partition the data into k mutually exclusive
subsets, each of approximately equal size.
- k = 10 is the most popular choice.
- At the i-th iteration, use Di as the test set and the others as
the training set.
- Use cross-validation for small data sets.
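A minimal sketch of k-fold cross-validation follows. The helper names (`k_fold_indices`, `cross_validate`) and the `evaluate` callback are hypothetical; `evaluate` stands in for "train a model on the training indices and return its performance on the test indices".

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Randomly partition indices 0..n-1 into k mutually exclusive folds."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    # Striding gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n, k, evaluate, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    folds = k_fold_indices(n, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # At the i-th iteration, fold i is the test set; the rest is training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k


# Dummy evaluation: report the fraction of the data held out in each fold.
average = cross_validate(9, 3, lambda train, test: len(test) / 9)
print(average)
```

Each instance is used for testing exactly once and for training k - 1 times, which is why the method suits small data sets.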
Cross Validation Examples
Example: 3-Fold Cross-Validation
[Figure: in each of the three iterations one fold is held out and the remaining folds are used for training]
Performance per iteration: 67%, 60%, 81%
Average Performance = (67+60+81)/3 = 69.3%
Data sets preparation for learning…
- It is important that the test data is not used in any
way to create the classifier.
Generally,
– The larger the training data, the better the
classifier tends to be
Classification tasks- A two step process
Model construction (learning step or training step):
– Construct the classification model based on the training
data.
• Training data
- A set of tuples
- Each tuple is assumed to belong to a predefined class
- Labeled data (ground truth)
• What does a classification model look like?
- A classification model can be represented in one of the
following forms:
classification rules,
decision trees, or
mathematical formulae
o Thus, model construction refers to describing a set of predefined classes.
Step 1: Model Construction
[Figure: training data is fed to classification algorithms to build the model]
Model usage:
- Before using the model, we first need to test its
accuracy.
• Measuring model accuracy
- To measure the accuracy of a model we need test data.
- Test data is similar in structure to training
data (labeled data).
• How to test?
- The known label of each test sample is compared with
the classified result from the model.
Classification tasks- A two step process…
• Accuracy rate is the percentage of test set samples
that are correctly classified by the model
Accuracy = (Number of correct classifications) / (Total number of test cases)
• Important: test data should be independent of the training
set; otherwise the accuracy estimate will be overly optimistic
and over-fitting will go undetected.
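The accuracy rate can be computed by comparing the known labels with the model's output, as in this small sketch. The function name and the example labels are made up for illustration.

```python
def accuracy_rate(known_labels, predicted_labels):
    """Fraction of test samples whose predicted label matches the known label."""
    if len(known_labels) != len(predicted_labels) or not known_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(k == p for k, p in zip(known_labels, predicted_labels))
    return correct / len(known_labels)


# Made-up test labels: three of the four predictions are correct.
known = ["yes", "no", "no", "yes"]
predicted = ["yes", "no", "yes", "yes"]
print(accuracy_rate(known, predicted))  # 0.75
```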
Step 2: Using the Model in Prediction
[Figure: testing data is passed to the classifier model; the model is then applied to unseen data]
Testing data:
NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes
Unseen data: (Jeff, Professor, 4) -> Tenured?
Classification:- Step1
Split data into train and test sets
Classification:- Step2
Build a model on a training set
Classification:- Step3
Evaluate on test set (Re-train?)
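The three steps can be sketched end to end on toy data. Everything here is an assumption made for illustration: the data is synthetic (labels generated by a made-up rule), and the "model" is a single threshold rule rather than any particular algorithm from the slides.

```python
import random

# Hypothetical toy data: (years_of_service, tenured), generated by a made-up rule.
data = [(years, years > 6) for years in range(1, 16)]

# Step 1: split data into train and test sets (holdout, 2/3 vs 1/3).
rng = random.Random(0)
shuffled = data[:]
rng.shuffle(shuffled)
cut = round(len(shuffled) * 2 / 3)
train, test = shuffled[:cut], shuffled[cut:]


# Step 2: build a model on the training set.
def fit_threshold(samples):
    """Pick t so that the rule 'tenured if years > t' fits the samples best."""
    values = sorted(x for x, _ in samples)
    candidates = [values[0] - 1] + values
    return max(candidates, key=lambda t: sum((x > t) == y for x, y in samples))


def accuracy(threshold, samples):
    """Fraction of samples classified correctly by the threshold rule."""
    return sum((x > threshold) == y for x, y in samples) / len(samples)


theta = fit_threshold(train)

# Step 3: evaluate on the test set (never on the data used for fitting).
train_accuracy = accuracy(theta, train)
test_accuracy = accuracy(theta, test)
print(theta, train_accuracy, test_accuracy)
```

If the test accuracy is unsatisfactory, the usual response is to revisit the model or the training data and re-train, while keeping the test set untouched.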
Confusion matrix
• A confusion matrix is a useful tool for analyzing how well your classifier
can recognize tuples of different classes
• A confusion matrix displays the number of correct and incorrect
predictions made by the model compared with the actual
classifications in the test data.
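For a two-class problem, the four cells of a confusion matrix can be counted directly from the actual and predicted labels. The sketch below uses made-up labels; the function name and the choice of "yes" as the positive class are assumptions.

```python
def confusion_matrix(actual, predicted, positive="yes"):
    """Count TP, FN, FP and TN for a two-class problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1      # positive tuple correctly recognized
        elif a == positive:
            fn += 1      # positive tuple missed by the classifier
        elif p == positive:
            fp += 1      # negative tuple wrongly labeled positive
        else:
            tn += 1      # negative tuple correctly recognized
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Made-up test labels for illustration.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]
counts = confusion_matrix(actual, predicted)
print(counts)  # {'TP': 2, 'FN': 1, 'FP': 1, 'TN': 2}
```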
Observe the following Confusion Matrix
[Figure: confusion matrix; P = ? and N = ? are to be filled in]
Error Rate = 1 - Accuracy
Classifier performance Measure Examples
             Predicted Yes   Predicted No   Total
Actual Yes   90 (TP)         210 (FN)       300 (P)
Actual No    140 (FP)        9560 (TN)      9700 (N)
• How many data samples are there in this example?
Sensitivity = TP/P = 90/300 = 30%
Specificity = TN/N = 9560/9700 = 98.56%
Accuracy = (TP + TN)/(P + N) = 9650/10,000 = 96.50%
Error Rate = 1 - Accuracy = 1 - 0.965 = 3.5%
Precision = TP/(TP + FP) = 90/230 = 39.13%
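The figures in the worked example above can be checked with a few lines of Python. The function name is hypothetical; the four counts are the ones from the example (TP = 90, FN = 210, FP = 140, TN = 9560).

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the standard measures from the four confusion-matrix counts."""
    p = tp + fn                      # all actual positives
    n = fp + tn                      # all actual negatives
    accuracy = (tp + tn) / (p + n)
    return {
        "sensitivity": tp / p,       # also called recall
        "specificity": tn / n,
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "precision": tp / (tp + fp),
    }


metrics = classification_metrics(tp=90, fn=210, fp=140, tn=9560)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Note how the accuracy is high even though only 30% of the positive cases are recognized, which is why accuracy alone can mislead when one class is rare.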
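Sensitivity = TP/(TP + FN) = TP/P
Specificity = TN/(TN + FP) = TN/N
Accuracy = (TP + TN)/(P + N)
Error Rate = 1 - Accuracy
Precision = TP/(TP + FP)
Recall = TP/(TP + FN) (the same as sensitivity)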
Performance Measure
Accuracy
- classifier accuracy: predicting class label
- predictor accuracy: estimating the value of the predicted
attribute
Speed
- time to construct the model (training time)
- time to use the model (classification/prediction time)
Robustness: handling noise and missing values
Interpretability
- understanding and insight provided by the model
Purpose of Evaluation
• The objective of learning classifications from
sample data is to classify and predict
successfully on new data
Sum-up
• What is learning?
• What are the components of a learning agent? Discuss them.
• What is a dataset? Why do we need to split data
into training and test samples/instances?
• What are the methods for splitting a dataset into training
and testing sets?
• What are the two steps of a classification task? Discuss them.
• What is the purpose of a confusion matrix?
• What are the commonly used measures for evaluating
classification (classifier model) performance?
• Which factors affect the performance of a classifier
model?
Types of learning
• Supervised learning
- classification, regression
• Unsupervised learning
- clustering
• Semi- Supervised learning
• Reinforcement learning
Types of learning
• Supervised learning – Classification
- It is the learning process when the outcome variable
is known.
Supervised Learning
An example: data (loan application)
[Table: loan application data]
The learning Process
• Learn a classification model from the data
• Use the model to classify future loan applications
into
– Yes (approved) and
– No (not approved)
• What is the class for the following case/instance?
Example
Classification: a finite set of labels
[Figure: examples labeled apple, apple, banana, banana]
Example: Digit Recognition
Ranking example
Given a query and a set of web pages, rank them
according to relevance
Applications
Medical diagnosis: predicting tumor cells as benign or malignant
Credit approval
Classifying secondary structures of protein as alpha-helix,
beta-sheet, or random coil
… and many more
Types of learning…
• Unsupervised learning – Clustering
- The class labels of the training data are unknown.
- Learning when there is no information about what the
correct outputs are.
– In unsupervised learning or clustering there is no explicit
teacher; the system forms clusters or natural groupings
of the input patterns.
- A form of learning by observation rather than learning by
examples
– Clustering is a technique for finding similarity
groups in data.
• Thus, cluster analysis means:
– Finding groups of objects such that the objects in a
group will be similar (or related) to one another and
different from (or unrelated to) the objects in
other groups
– Intra-cluster distances are minimized
– Inter-cluster distances are maximized
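The two distance criteria can be made concrete with a small sketch that measures them for a given grouping. The helper names and the one-dimensional toy points are assumptions for illustration; this only evaluates a clustering, it is not a clustering algorithm.

```python
from itertools import combinations


def mean_intra_cluster_distance(clusters):
    """Average distance between pairs of points in the SAME cluster."""
    dists = [abs(a - b) for c in clusters for a, b in combinations(c, 2)]
    return sum(dists) / len(dists)


def mean_inter_cluster_distance(clusters):
    """Average distance between pairs of points in DIFFERENT clusters."""
    dists = [abs(a - b)
             for c1, c2 in combinations(clusters, 2)
             for a in c1 for b in c2]
    return sum(dists) / len(dists)


# Hypothetical one-dimensional points forming two natural groups.
clusters = [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0]]
intra = mean_intra_cluster_distance(clusters)
inter = mean_inter_cluster_distance(clusters)
print(intra, inter)
```

A good grouping shows a small intra-cluster value and a large inter-cluster value, as this toy grouping does.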
Types of learning
• Reinforcement learning (RL): an agent interacting with
the world makes observations, takes actions, & is
rewarded or punished; it should learn to choose actions
so as to maximize the reward it obtains.
• Examples
– Game playing: the player knows whether it wins or loses, but
does not know how to move at each step
– Control: a traffic system can measure the delay of cars,
but does not know how to decrease it.
RL is learning from interaction
Example: Backgammon
[Figure: game outcomes WIN! and LOSE!]
Thank you!!!