Lecture Slide 01_ML

The document outlines the fundamentals of machine learning, defining it as a process where systems improve performance through experience. It discusses various types of learning, including supervised, unsupervised, and reinforcement learning, along with their applications in fields such as medicine, finance, and robotics. Additionally, it provides a historical overview of machine learning developments and emphasizes the importance of generalization in learning algorithms.

Fundamentals of Machine Learning

Course 4232: Machine Learning

Dept. of Computer Science


Faculty of Science and Technology

Lecture No: 1 Week No: 1 Semester: Fall 23-24

Instructor: Md Saef Ullah Miah ([email protected])


What is learning?

◼ “Learning is any process by which a system improves performance from experience.” –Herbert Simon

◼ “Learning is constructing or modifying representations of what is being experienced.” –Ryszard Michalski

◼ “Learning is making useful changes in our minds.” –Marvin Minsky


What Is Machine Learning (ML)?
Why “Learn” ?

◼ Machine learning is programming computers to optimize a performance criterion using example data or past experience.
◼ There is no need to “learn” to calculate payroll.
◼ Learning is used when:
 Human expertise does not exist (navigating on Mars)
 Humans are unable to explain their expertise (speech recognition)
 The solution changes over time (routing on a computer network)
 The solution needs to be adapted to particular cases (user biometrics)
Why learn?

◼ Build software agents that can adapt to their users, to other software agents, or to changing environments
 Personalized news or mail filter
 Personalized tutoring
 Mars robot
◼ Develop systems that are too difficult/expensive to construct
manually because they require specific detailed skills or knowledge
tuned to a specific task
 Large, complex AI systems cannot be completely derived by
hand and require dynamic updating to incorporate new
information.
◼ Discover new things or structure that were previously unknown to
humans
 Examples: data mining, scientific discovery
Related Disciplines

The following are closely related disciplines:


 Artificial Intelligence
◼ Machine learning deals with the learning part of AI

 Pattern Recognition
◼ Concentrates more on “tools” rather than theory

 Data Mining
◼ More specific about discovery

The following are useful for machine learning techniques or may give insights:
 Probability and Statistics
 Information theory

 Psychology (developmental, cognitive)


 Neurobiology
 Linguistics
 Philosophy
Data Mining

◼ Retail: Market basket analysis, customer relationship management (CRM)
◼ Finance: Credit scoring, fraud detection
◼ Manufacturing: Control, robotics, troubleshooting
◼ Medicine: Medical diagnosis
◼ Telecommunications: Spam filters, intrusion detection
◼ Bioinformatics: Motifs, alignment
◼ Web mining: Search engines
◼ ...
History of Machine Learning
◼ 1950s
 Samuel’s checker player

◼ 1960s:
 Neural networks: Perceptron
 Minsky and Papert prove the limitations of the Perceptron

◼ 1970s:
 Expert systems and the knowledge acquisition
bottleneck
 Mathematical discovery with AM
 Symbolic concept induction
History of Machine Learning (cont.)

◼ 1980s:
 Resurgence of neural networks (connectionism,
backpropagation)
 Advanced decision tree and rule learning
 Learning, planning and problem solving
 Utility theory
 Analogy

◼ 1990s
 Data mining
 Reinforcement learning (RL)
 Inductive Logic Programming (ILP)
 Ensembles: Bagging, Boosting, and Stacking
History of Machine Learning (cont.)
◼ 2000s
 Kernel methods
◼ Support vector machines

 Graphical models
 Statistical relational learning
 Transfer learning

◼ Applications
 Adaptive software agents and web applications
 Learning in robotics and vision
 E-mail management (spam detection)
…
What is Machine Learning ?

◼ A computer program M is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance as measured by P on tasks in T in an environment Z improves with experience E.

◼ Example:
 T: Cancer diagnosis
 E: A set of diagnosed cases
 P: Accuracy of diagnosis on new cases
 Z: Noisy measurements, occasionally misdiagnosed training cases
 M: A program that runs on a general purpose computer; the
learner
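
To make the example above concrete, here is a minimal sketch (my illustration, not part of the original slides) that uses scikit-learn’s bundled breast-cancer data as a stand-in for the diagnosed cases E, a logistic-regression model as the learner M, and accuracy on held-out cases as the performance measure P; the measured performance typically improves as the learner is given more experience.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Illustrative only: T = a diagnosis task, E = a set of diagnosed cases (the bundled
# breast-cancer data as a stand-in), P = accuracy on held-out "new" cases, M = the learner.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

for n in (25, 100, len(X_train)):                        # growing experience E
    model = LogisticRegression(max_iter=5000)            # the learner M
    model.fit(X_train[:n], y_train[:n])
    acc = accuracy_score(y_test, model.predict(X_test))  # performance P on new cases
    print(f"trained on {n:3d} cases -> test accuracy {acc:.3f}")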
Why Machine Learning ?

◼ Solving tasks that require a system to be adaptive
 Speech, face, or handwriting recognition
 Environment changes over time

◼ Understanding human and animal learning


 How do we learn a new language? Recognize people?
◼ Some tasks are best shown by demonstration
 Driving a car, or, landing an airplane

◼ Objective of Real Artificial Intelligence:
 “If an intelligent system – brilliantly designed, engineered and implemented – cannot learn not to repeat its mistakes, it is not as intelligent as a worm or a sea anemone or a kitten.” (Oliver Selfridge)
Kinds of Learning

◼ Based on the information available


 Association
 Supervised Learning
◼ Classification

◼ Regression

 Reinforcement Learning
 Unsupervised Learning
 Semi-supervised learning

◼ Based on the role of the learner


 Passive Learning
 Active Learning
Major paradigms of machine learning
◼ Rote learning – “Learning by memorization.”
 Employed by first machine learning systems, in 1950s
◼ Samuel’s Checkers program

◼ Supervised learning – Use specific examples to reach general conclusions or extract general rules
◼ Classification (Concept learning)
◼ Regression

◼ Unsupervised learning (Clustering) – Unsupervised identification of natural groups in data

◼ Reinforcement learning – Feedback (positive or negative reward) given at the end of a sequence of steps

◼ Analogy – Determine correspondence between two different representations

◼ Discovery – Unsupervised, specific goal not given

◼ …
Rote Learning is Limited

◼ Memorize I/O pairs and perform exact matching with new inputs

◼ If a computer has not seen the precise case before, it cannot apply its experience

◼ We want computers to “generalize” from prior experience
 Generalization is the most important factor in learning
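
As an illustrative contrast (made-up fruit data, not from the slides), the sketch below shows a rote learner that can only answer exact-match queries next to a nearest-neighbour learner that generalizes to an input it has never seen:

# Hypothetical illustration: a rote learner memorizes (input, output) pairs,
# while a 1-nearest-neighbour learner generalizes to unseen inputs.

def rote_predict(memory, x):
    """Exact-match lookup: fails on anything not seen before."""
    return memory.get(x, None)          # None = "I have no idea"

def nn_predict(examples, x):
    """Generalize by answering with the label of the closest stored input."""
    nearest = min(examples, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Toy experience: fruit size (cm) -> label
examples = [(5.0, "tangerine"), (5.5, "tangerine"), (8.0, "orange"), (8.5, "orange")]
memory = dict(examples)

print(rote_predict(memory, 5.2))   # None: never seen exactly 5.2 cm
print(nn_predict(examples, 5.2))   # "tangerine": generalizes from nearby cases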
The inductive learning problem

◼ Extrapolate from a given set of examples to make accurate predictions about future examples

◼ Supervised versus unsupervised learning


 Learn an unknown function f(X) = Y, where X is an input
example and Y is the desired output.
 Supervised learning implies we are given a training set
of (X, Y) pairs by a “teacher”
 Unsupervised learning means we are only given the Xs.
 Semi-supervised learning: mostly unlabelled data
Learning Associations

◼ Basket analysis:
P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are products/services.

Example: P(sugar | tea) = 0.7
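
As a hedged illustration (the baskets below are invented for this example), such an association can be estimated directly from transaction data by counting:

# Hypothetical sketch: estimating the association P(sugar | tea) from a list of
# market baskets. The baskets below are made-up data for illustration only.
baskets = [
    {"tea", "sugar", "milk"},
    {"tea", "sugar"},
    {"tea", "bread"},
    {"coffee", "sugar"},
    {"tea", "sugar", "lemon"},
]

tea_baskets = [b for b in baskets if "tea" in b]
both = [b for b in tea_baskets if "sugar" in b]

# P(sugar | tea) = #{baskets with tea and sugar} / #{baskets with tea}
p_sugar_given_tea = len(both) / len(tea_baskets)
print(f"P(sugar | tea) = {p_sugar_given_tea:.2f}")   # 0.75 on this toy data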


Supervised Learning
◼ Training experience: a set of labeled examples of the form
< x1, x2, …, xn, y >

◼ where the xj are values of the input variables and y is the output

◼ This implies the existence of a “teacher” who knows the right answers

◼ What to learn: a function f : X1 × X2 × … × Xn → Y, which maps the input variables into the output domain

◼ Goal: minimize the error (loss function) on the test examples
Types of supervised learning
a) Classification:
[Figure: scatter plot of Tangerines vs. Oranges in the feature space x1 = size, x2 = color]
• We are given the labels of the training objects: {(x1, x2, y = T/O)}
• We are interested in classifying future objects (x1’, x2’) with the correct label,
i.e., find y’ for a given (x1’, x2’).

b) Concept Learning:
[Figure: scatter plot of Tangerines vs. Not Tangerines in the same feature space]
• We are given positive and negative samples for the concept we want to learn (e.g. Tangerine): {(x1, x2, y = +/−)}
• We are interested in classifying future objects as members of the class (or positive examples of the concept) or not,
i.e., answer +/− for a given (x1’, x2’).
Types of Supervised Learning

◼ Regression
 The target function is continuous rather than a class membership
 For example, you have the selling prices of houses as their size (sq-mt) changes in a particular location. You may hypothesize that the prices are governed by a particular function f(x). Once you have a function that “explains” this relationship, you can guess a given house’s value from its size. The learning here is the selection of this function f(). Note that the problem is more meaningful and challenging if you imagine several input parameters, resulting in a multi-dimensional input space.
[Figure: y = price plotted against x = size (60 to 150 sq-mt) with a fitted curve f(x)]
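
A minimal sketch of this idea, assuming made-up size/price pairs and a linear hypothesis f(x) = w·x + b fitted by least squares (any richer function family could be substituted):

import numpy as np

# Hypothetical illustration: fit a simple linear function f(x) = w*x + b that
# "explains" the relationship between house size and selling price.
# The sizes and prices below are made-up numbers, not data from the slides.
sizes = np.array([60, 70, 90, 120, 150], dtype=float)       # sq-mt
prices = np.array([110, 125, 160, 210, 265], dtype=float)   # e.g. in 1000s

w, b = np.polyfit(sizes, prices, deg=1)   # least-squares line

def f(x):
    """Learned hypothesis: predicted price for a house of size x."""
    return w * x + b

print(f"price(100 sq-mt) ~ {f(100):.1f}")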
Classification
◼ Example: Credit scoring
◼ Differentiating between low-risk and high-risk customers from their income and savings

Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
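
A hedged sketch of this discriminant as code; the threshold values below are invented, whereas in practice θ1 and θ2 would be estimated from labeled customers:

# Sketch of the threshold discriminant above. The thresholds are made-up values;
# in a real system they would be learned from labeled customer data.
THETA1 = 30_000   # income threshold (assumed, illustrative)
THETA2 = 5_000    # savings threshold (assumed, illustrative)

def credit_risk(income: float, savings: float) -> str:
    """IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk."""
    return "low-risk" if income > THETA1 and savings > THETA2 else "high-risk"

print(credit_risk(45_000, 8_000))   # low-risk
print(credit_risk(45_000, 2_000))   # high-risk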
Classification: Applications

◼ Pattern Recognition

◼ Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style

◼ Character recognition: Different handwriting styles.

◼ Speech recognition: Temporal dependency.


 Use of a dictionary or the syntax of the language.
 Sensor fusion: Combine multiple modalities; e.g., visual (lip image) and acoustic for speech

◼ Medical diagnosis: From symptoms to illnesses

◼ Biometrics: Recognition/authentication using physical and/or behavioral characteristics: face, iris, signature, etc.
Face Recognition
[Figure: training examples of a person and test images from the ORL dataset, AT&T Laboratories, Cambridge UK]
Supervised Learning: Uses

◼ Prediction of future cases: Use the rule or model to predict the output for future inputs

◼ Knowledge extraction: The rule is easy to understand

◼ Compression: The rule is simpler than the data it explains

◼ Outlier detection: Exceptions that are not covered by the rule, e.g., fraud
Unsupervised Learning

◼ Learning “what normally happens”

◼ Training experience: no output, unlabeled data

◼ Clustering: Grouping similar instances

◼ Example applications
 Customer segmentation in CRM
 Image compression: Color quantization
 Bioinformatics: Learning motifs
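
As an illustrative sketch (synthetic customer data, with k-means as one possible clustering method), customer segmentation without labels might look like this:

import numpy as np
from sklearn.cluster import KMeans

# Hedged sketch of clustering for customer segmentation. The two features and
# all customer values are synthetic, made up purely for illustration.
rng = np.random.default_rng(0)
low_spenders  = rng.normal(loc=[20, 1], scale=[5, 0.5], size=(50, 2))
high_spenders = rng.normal(loc=[70, 8], scale=[8, 1.5], size=(50, 2))
customers = np.vstack([low_spenders, high_spenders])   # [annual spend (k), visits/month]

# No labels are given; k-means groups similar customers on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print("cluster centres:\n", kmeans.cluster_centers_)
print("segment of a new customer:", kmeans.predict([[65, 7]])[0])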
Reinforcement Learning

◼ Training experience: interaction with an environment; the learning agent receives a numerical reward
 Learning to play chess: moves are rewarded if they lead to a WIN, else penalized
 No supervised output, but delayed reward

◼ What to learn: a way of behaving that is very rewarding in the long run - learning a policy: a sequence of outputs

◼ Goal: estimate and maximize the long-term cumulative reward


◼ Credit assignment problem
◼ Robot in a maze, game playing
◼ Multiple agents, partial observability, ...
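
A minimal sketch of these ideas, assuming a made-up five-state corridor environment and tabular Q-learning (just one of many RL algorithms): the only reward arrives at the goal, and the update rule propagates that delayed reward back to earlier states, which is one way to handle the credit assignment problem.

import random

# Hedged sketch: states 0..4 in a row, actions 0 = left / 1 = right, and a delayed
# reward of +1 only when the goal state 4 is reached. All numbers are illustrative.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1
Q = [[0.0, 0.0] for _ in range(N_STATES)]      # Q[state][action]

def step(state, action):
    """Environment: move left/right; reward only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def greedy(s):
    """Best known action, breaking ties randomly."""
    if Q[s][0] == Q[s][1]:
        return random.randrange(2)
    return 0 if Q[s][0] > Q[s][1] else 1

random.seed(0)
for _ in range(500):                            # episodes of interaction (experience)
    s, done = 0, False
    while not done:
        a = random.randrange(2) if random.random() < EPSILON else greedy(s)
        s2, r, done = step(s, a)
        # Q-learning update: the delayed reward is propagated back (credit assignment)
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

print("learned policy:", ["right" if Q[s][1] >= Q[s][0] else "left" for s in range(GOAL)])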
Passive Learning and Active Learning

◼ Traditionally, learning algorithms have been passive learners, which take a given batch of data and process it to produce a hypothesis or a model

◼ Data → Learner → Model


◼ Active learners are instead allowed to query the environment
 Ask questions
 Perform experiments
◼ Open issues: how to query the environment optimally? how to
account for the cost of queries?
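
A hedged sketch of an active learner that queries by uncertainty sampling; the synthetic dataset, the logistic-regression model, and the fixed query budget are assumptions made purely for illustration:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative only: the learner "asks a question" each round by requesting the
# label of the pooled example it is least sure about.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])   # small seed set
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):                                      # query budget
    model.fit(X[labeled], y[labeled])
    probs = model.predict_proba(X[pool])[:, 1]
    query = pool[int(np.argmin(np.abs(probs - 0.5)))]    # most uncertain example
    labeled.append(query)                                # the environment answers the query
    pool.remove(query)

print("accuracy on the remaining pool:", round(model.score(X[pool], y[pool]), 3))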
Learning: Key Steps

• data and assumptions
– what data is available for the learning task?
– what can we assume about the problem?
• representation
– how should we represent the examples to be classified?
• method and estimation
– what are the possible hypotheses?
– what learning algorithm to use to infer the most likely hypothesis?
– how do we adjust our predictions based on the feedback?
• evaluation
– how well are we doing?
Evaluation of Learning Systems

◼ Experimental
 Conduct controlled cross-validation experiments to compare various methods on a variety of benchmark datasets (see the sketch after this list).
 Gather data on their performance, e.g. test accuracy,
training-time, testing-time…
 Analyze differences for statistical significance.

◼ Theoretical
 Analyze algorithms mathematically and prove theorems about
their:
◼ Computational complexity

◼ Ability to fit training data

◼ Sample complexity (number of training examples needed to learn an accurate function)
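
A minimal sketch of such an experimental comparison, assuming scikit-learn’s bundled breast-cancer data as the benchmark and two arbitrarily chosen methods; in a full study a statistical significance test (e.g., a paired t-test over the fold scores) would follow:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Hedged sketch of a controlled cross-validation comparison. The dataset and the
# two methods compared are my own choices for illustration, not from the slides.
X, y = load_breast_cancer(return_X_y=True)

for name, model in [("decision tree", DecisionTreeClassifier(random_state=0)),
                    ("logistic regression", LogisticRegression(max_iter=5000))]:
    scores = cross_val_score(model, X, y, cv=10)         # 10-fold CV accuracy
    print(f"{name}: mean = {scores.mean():.3f}, std = {scores.std():.3f}")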
Measuring Performance

Performance of the learner can be measured in one of the following ways, as suitable for the application (a short sketch of the first three follows the list):
 Classification Accuracy
◼ Number of mistakes

◼ Mean Squared Error

◼ Loss functions

 Solution quality (length, efficiency)


 Speed of performance
…
Textbook / Reference Materials

1. Introduction to Machine Learning (MIT Press) by Ethem Alpaydin
