01 Introduction

Machine Learning

Uploaded by

aradhana

Introduction to Machine Learning

What is Machine Learning?
“Learning is any process by which a system improves
performance from experience.”
- Herbert Simon

Definition by Tom Mitchell (1998):

Machine Learning is the study of algorithms that
• improve their performance P
• at some task T
• with experience E.
A well-defined learning task is given by <P, T, E>.
Traditional Programming
Data + Program → Computer → Output

Machine Learning
Data + Output → Computer → Program
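
The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pricing task, the function names, and the sample numbers are hypothetical and do not come from the slides. In the first half a person writes the rule; in the second half the rule is recovered from data and desired outputs.

```python
# Traditional programming: data + program -> output.
# The rule (price per unit = 2.5) is written by hand. Hypothetical task.
def total_price_program(quantity):
    return 2.5 * quantity


# Machine learning: data + output -> program.
# Fit y = k * x by least squares and return the fitted rule as a function.
def learn_proportional_program(inputs, outputs):
    k = sum(x * y for x, y in zip(inputs, outputs)) / sum(x * x for x in inputs)
    return lambda x: k * x


quantities = [1.0, 2.0, 4.0]        # data (made up for illustration)
observed_totals = [2.5, 5.0, 10.0]  # desired outputs (made up for illustration)

learned_program = learn_proportional_program(quantities, observed_totals)

# Both "programs" can now be applied to new data.
print(total_price_program(10.0), learned_program(10.0))
```

The learned function plays the role of the "Program" box in the second diagram: nobody typed in the constant, it was estimated from examples.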
When Do We Use Machine Learning?
ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)

Learning isn’t always useful:

• There is no need to “learn” to calculate payroll
A classic example of a task that requires machine learning:
It is very hard to say what makes a 2

Some more examples of tasks that are best solved by using a learning algorithm
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
Sample Applications
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging software
• [Your favorite area]

Samuel’s Checkers-Player
“Machine Learning: Field of study that gives
computers the ability to learn without being
explicitly programmed.” -Arthur Samuel (1959)

Defining the Learning Task
Improve on task T, with respect to performance metric P, based on experience E

T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself

T: Recognizing handwritten words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing a human driver

T: Categorizing email messages as spam or legitimate
P: Percentage of email messages correctly classified
State of the Art Applications of Machine Learning

Autonomous Cars
• Nevada made it legal for autonomous cars to drive on roads in June 2011
• As of 2013, four states (Nevada, Florida, California, and Michigan) have legalized autonomous cars
[Figure: Penn’s Autonomous Car]
Autonomous Car Sensors

Autonomous Car Technology
[Figure panels: Path Planning; Laser Terrain Mapping; Learning from Human Drivers; Adaptive Vision; Sebastian; Stanley]
Images and movies taken from Sebastian Thrun’s multimedia website.

Deep Learning in the Headlines

Deep Belief Net on Face Images
[Figure: feature hierarchy, listed from top layer to bottom layer: object models; object parts (combination of edges); edges; pixels]
Based on materials by Andrew Ng
Learning of Object Parts

Training on Multiple Objects
Trained on 4 classes (cars, faces, motorbikes, airplanes).
Second layer: Shared features and object-specific features.
Third layer: More specific features.

Scene Labeling via Deep Learning

Inference from Deep Learned Models
Generating posterior samples from faces by “filling in” experiments (cf. Lee and Mumford, 2003). Combine bottom-up and top-down inference.
[Figure rows: Input images; Samples from feedforward inference (control); Samples from full posterior inference]

Machine Learning in Automatic Speech Recognition
A Typical Speech Recognition System

ML is used to predict phone states from the sound spectrogram.

Deep learning has state-of-the-art results:

# Hidden Layers      |    1 |    2 |    4 |    8 |   10 |   12
Word Error Rate (%)  | 16.0 | 12.8 | 11.4 | 10.9 | 11.0 | 11.1

Baseline GMM performance = 15.4%

[Zeiler et al., “On rectified linear units for speech recognition,” ICASSP 2013]
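
As a quick reading aid for the table, the following Python sketch works out which depth gives the lowest word error rate and how large the improvement is relative to the GMM baseline. The numbers are the ones on the slide; the variable names are my own, and "relative reduction" is just one reasonable way to summarize the gap.

```python
# Values copied from the table above.
hidden_layers = [1, 2, 4, 8, 10, 12]
word_error_rate = [16.0, 12.8, 11.4, 10.9, 11.0, 11.1]  # percent
gmm_baseline = 15.4                                      # percent

# Find the depth with the lowest word error rate.
best_wer = min(word_error_rate)
best_layers = hidden_layers[word_error_rate.index(best_wer)]

# Absolute and relative improvement over the GMM baseline.
absolute_reduction = gmm_baseline - best_wer
relative_reduction = absolute_reduction / gmm_baseline

print(f"best depth: {best_layers} hidden layers (WER {best_wer}%)")
print(f"absolute reduction: {absolute_reduction:.1f} points")
print(f"relative reduction: {relative_reduction:.1%}")
```

Note that a single hidden layer (16.0%) is slightly worse than the baseline, and that the error rate stops improving after 8 layers in this table.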
Impact of Deep Learning in Speech Technology

Types of Learning


• Supervised (inductive) learning

– Given: training data + desired outputs (labels)
• Unsupervised learning
– Given: training data (without desired outputs)
• Semi-supervised learning
– Given: training data + a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions

Supervised Learning: Regression
• Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
• Learn a function $f(x)$ to predict $y$ given $x$
– $y$ is real-valued == regression
[Figure: September Arctic Sea Ice Extent (1,000,000 sq km) plotted against Year, 1970 to 2020]
Data from G. Witt. Journal of Statistics Education, Volume 21,
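
A minimal regression sketch in Python, assuming the simplest choice of f: a straight line fitted by ordinary least squares. The five data points below are invented for illustration and are not the sea ice data from the figure; the function name is also my own.

```python
def fit_line(xs, ys):
    """Ordinary least squares for f(x) = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training pairs (x_i, y_i) with real-valued y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)


def f(x):
    """The learned function: predicts y for a new x."""
    return slope * x + intercept


print(slope, intercept, f(6.0))
```

Other function families (polynomials, neural networks, and so on) fit the same "given pairs, learn f" template; only the form of f and the fitting procedure change.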
Supervised Learning: Classification
• Given $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
• Learn a function $f(x)$ to predict $y$ given $x$
– $y$ is categorical == classification
[Figure: Breast Cancer (Malignant / Benign); y-axis: 1 (Malignant), 0 (Benign); x-axis: Tumor Size; regions along the Tumor Size axis are labeled "Predict Benign" and "Predict Malignant"]
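
The figure suggests a decision rule based on a single cut point on tumor size. Here is a hedged Python sketch of learning such a cut point from labeled examples. The sizes and labels are made up, the function names are hypothetical, and picking the threshold with the fewest training errors is just one simple strategy, not necessarily what the original lecture used.

```python
def learn_threshold(sizes, labels):
    """Pick the cut point with the fewest training errors.

    Rule: predict 1 (malignant) when size >= threshold, else 0 (benign).
    Candidate thresholds are midpoints between consecutive distinct sizes.
    """
    distinct = sorted(set(sizes))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    def training_errors(threshold):
        return sum((x >= threshold) != bool(y) for x, y in zip(sizes, labels))

    return min(candidates, key=training_errors)


# Hypothetical labeled data: tumor size and label (0 = benign, 1 = malignant).
sizes = [1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = learn_threshold(sizes, labels)


def predict(size):
    """Categorical output: 1 for 'Predict Malignant', 0 for 'Predict Benign'."""
    return int(size >= threshold)


print(threshold, predict(2.0), predict(4.2))
```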
Supervised Learning
• x can be multi-dimensional
– Each dimension corresponds to an attribute

[Figure: Age plotted against Tumor Size; further attributes listed beside the plot:]
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
…

Unsupervised Learning
• Given $x_1, x_2, \ldots, x_n$ (without labels)
• Output hidden structure behind the x’s
– E.g., clustering
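
As one concrete instance of clustering, here is a small sketch of k-means (Lloyd's algorithm) in plain Python. The slide only says "clustering", so the choice of k-means is an assumption made for illustration, as are the toy points and the simple "first k points" initialization.

```python
def kmeans(points, k, iterations=10):
    """Lloyd's algorithm on 2-D points. Centers start at the first k points."""
    centers = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return centers, clusters


# Unlabeled toy data with two visible groups (made up for illustration).
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]

centers, clusters = kmeans(points, k=2)
print(centers)
print(clusters)
```

No labels are used anywhere: the grouping is the "hidden structure" the algorithm outputs.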

Unsupervised Learning
Genomics application: group individuals by genetic similarity
[Figure: Genes plotted against Individuals]
Unsupervised Learning

• Organize computing clusters
• Social network analysis
• Market segmentation
• Astronomical data analysis
Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)

Unsupervised Learning
• Independent component analysis – separate a
combined signal into its original sources

Image credit: statsoft.com. Audio from http://www.ism.ac.jp/~shiro/research/blindsep.html
Reinforcement Learning
• Given a sequence of states and actions with
(delayed) rewards, output a policy
– Policy is a mapping from states → actions that
tells you what to do in a given state
• Examples:
– Credit assignment problem
– Game playing
– Robot in a maze
– Balance a pole on your hand

The Agent-Environment Interface

Agent and environment interact at discrete time steps: $t = 0, 1, 2, \ldots$
Agent observes state at step $t$: $s_t \in S$
produces action at step $t$: $a_t \in A(s_t)$
gets resulting reward: $r_{t+1} \in \mathbb{R}$
and resulting next state: $s_{t+1}$

$\cdots \; s_t \xrightarrow{a_t} r_{t+1},\, s_{t+1} \xrightarrow{a_{t+1}} r_{t+2},\, s_{t+2} \xrightarrow{a_{t+2}} r_{t+3},\, s_{t+3} \xrightarrow{a_{t+3}} \cdots$
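
The interaction loop can be sketched in Python as follows. Everything concrete here is a hypothetical toy: a four-state corridor with a goal at the right end, a reward of 1 for reaching the goal, and a fixed hand-written policy. The sketch shows the interface only (state, action, reward, next state); it does not learn the policy.

```python
GOAL = 3                     # states are 0, 1, 2, 3; state 3 is terminal
ACTIONS = ("left", "right")  # the same action set A(s) in every state


def step(state, action):
    """Environment: given s_t and a_t, return (r_{t+1}, s_{t+1})."""
    if action == "right":
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0  # delayed reward, only at the goal
    return reward, next_state


# A policy is a mapping from states to actions.
policy = {0: "right", 1: "right", 2: "right"}


def run_episode(policy, start_state=0, max_steps=20):
    """Agent loop: observe state, act, receive reward and next state."""
    state = start_state
    trajectory = []
    total_reward = 0.0
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = policy[state]
        reward, next_state = step(state, action)
        trajectory.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state
    return trajectory, total_reward


trajectory, total_reward = run_episode(policy)
for s, a, r, s_next in trajectory:
    print(f"s={s} a={a} r={r} s'={s_next}")
```

A reinforcement learning algorithm would sit on top of this loop and adjust the policy from the observed rewards.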
Reinforcement Learning

https://www.youtube.com/watch?v=4cgWya-wjgY
Inverse Reinforcement Learning
• Learn policy from user demonstrations

Stanford Autonomous Helicopter

http://heli.stanford.edu/
https://www.youtube.com/watch?v=VCdxqn0fcnE
Framing a Learning Problem

Designing a Learning System
• Choose the training experience
• Choose exactly what is to be learned
– i.e. the target function
• Choose how to represent the target function
• Choose a learning algorithm to infer the target
function from the experience

[Diagram components: Environment/Experience; Training data; Learner; Knowledge; Testing data; Performance Element]
Training vs. Test Distribution
• We generally assume that the training and test examples are independently drawn from the same overall distribution of data
– We call this “i.i.d.”, which stands for “independent and identically distributed”
• If examples are not independent, this requires collective classification
• If the test distribution is different, this requires transfer learning
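
In practice, one common way to approximate the i.i.d. assumption is to shuffle a single pool of examples and split it at random, so that the training and test sets are samples from the same pool. The sketch below assumes that setup; the function name, the split fraction, and the toy examples are hypothetical.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples, then cut it into train and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy labeled examples (x, y), made up for illustration.
examples = [(x, x % 2) for x in range(20)]

train, test = train_test_split(examples)
print(len(train), len(test))
```

A random split does not help if the pool itself differs from the data seen after deployment; that is the transfer learning case mentioned above.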
ML in a Nutshell
• Tens of thousands of machine learning algorithms
– Hundreds of new ones every year
• Every ML algorithm has three components:
– Representation
– Optimization
– Evaluation
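
To see the three components side by side, here is a small Python sketch with one assumed choice for each: a linear function as the representation, mean squared error as the evaluation, and gradient descent as the optimization. The data, learning rate, and step count are illustrative assumptions; any of the three pieces could be swapped for another option.

```python
# Representation: the family of functions we search over, f(x) = w * x + b.
def predict(params, x):
    w, b = params
    return w * x + b


# Evaluation: how good is a candidate function? Lower is better.
def mean_squared_error(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


# Optimization: how do we search the family? One gradient descent step.
def gradient_step(params, data, learning_rate):
    w, b = params
    n = len(data)
    grad_w = sum(2 * (predict(params, x) - y) * x for x, y in data) / n
    grad_b = sum(2 * (predict(params, x) - y) for x, y in data) / n
    return (w - learning_rate * grad_w, b - learning_rate * grad_b)


# Toy data generated from y = 2x + 1 (made up for illustration).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

params = (0.0, 0.0)
initial_error = mean_squared_error(params, data)
for _ in range(2000):
    params = gradient_step(params, data, learning_rate=0.05)
final_error = mean_squared_error(params, data)

print(params, initial_error, final_error)
```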

Various Function Representations
• Numerical functions
– Linear regression
– Neural networks
– Support vector machines
• Symbolic functions
– Decision trees
– Rules in propositional logic
– Rules in first-order predicate logic
• Instance-based functions
– Nearest-neighbor
– Case-based
• Probabilistic Graphical Models
– Naïve Bayes
– Bayesian networks
– Hidden Markov Models (HMMs)
– Probabilistic Context Free Grammars (PCFGs)
– Markov networks
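
To illustrate how different these representations can be, here is a hedged sketch of the instance-based case: a 1-nearest-neighbor classifier, where the "function" is nothing more than the stored training examples plus a distance measure. The two-feature examples and labels are invented for illustration.

```python
def nearest_neighbor_predict(training_examples, query):
    """Return the label of the stored example closest to the query point."""

    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(training_examples, key=lambda ex: squared_distance(ex[0], query))
    return label


# The representation is the data itself: (features, label) pairs.
# Hypothetical features, e.g. (tumor size, clump thickness).
training_examples = [
    ((1.0, 1.0), "benign"),
    ((1.5, 2.0), "benign"),
    ((4.0, 4.5), "malignant"),
    ((5.0, 4.0), "malignant"),
]

print(nearest_neighbor_predict(training_examples, (1.2, 1.1)))
print(nearest_neighbor_predict(training_examples, (4.5, 4.2)))
```

There are no fitted parameters here, in contrast to a numerical function such as linear regression, where the learned weights are the representation.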

Various Search/Optimization Algorithms
• Gradient descent
– Perceptron
– Backpropagation
• Dynamic Programming
– HMM Learning
– PCFG Learning
• Divide and Conquer
– Decision tree induction
– Rule learning
• Evolutionary Computation
– Genetic Algorithms (GAs)
– Genetic Programming (GP)
– Neuro-evolution

Slide credit: Ray Mooney
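
As an example of one entry in the list above, here is a minimal sketch of the perceptron learning rule in Python. The training task (logical AND), the epoch count, and the learning rate are assumptions chosen to keep the example tiny; AND is linearly separable, so the rule is guaranteed to converge on it.

```python
def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Perceptron rule: after each mistake, shift the weights toward the target."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, target in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # -1, 0, or +1
            if error != 0:
                weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
                bias += learning_rate * error
    return weights, bias


def perceptron_predict(weights, bias, features):
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0


# Training data for logical AND (illustrative task).
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_examples)
for features, target in and_examples:
    print(features, target, perceptron_predict(weights, bias, features))
```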
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Likelihood
• Posterior probability
• Cost / Utility
• Margin
• Entropy
• K-L divergence
• etc.
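
A few of these measures are simple enough to write out directly. The sketch below computes accuracy, precision and recall, and mean squared error in plain Python; the label vectors and regression targets are made up for illustration, and the function names are my own.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mean_squared_error(y_true, y_pred):
    """Average squared difference, for real-valued predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical classifier output.
labels_true = [1, 0, 1, 1, 0, 0, 1, 0]
labels_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(labels_true, labels_pred)
precision, recall = precision_recall(labels_true, labels_pred)

# Hypothetical regression output.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])

print(acc, precision, recall, mse)
```

Which measure is appropriate depends on the task: accuracy can be misleading when one class is rare, which is one reason precision and recall are listed separately.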

ML in Practice
• Understand domain, prior knowledge, and goals
• Data integration, selection, cleaning, pre-processing, etc.
• Learn models
• Interpret results
• Consolidate and deploy discovered knowledge
Loop: these steps are iterated.

Lessons Learned about Learning
• Learning can be viewed as using direct or indirect experience to approximate a chosen target function.

• Function approximation can be viewed as a search through a space of hypotheses (representations of functions) for one that best fits a set of training data.

• Different learning methods assume different hypothesis spaces (representation languages) and/or employ different search techniques.

A Brief History of Machine Learning

History of Machine Learning
• 1950s:
– Samuel’s checkers player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM

Slide credit: Ray Mooney
History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s:
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
Slide credit: Ray Mooney
History of Machine Learning (cont.)
• 2000s:
– Support vector machines & kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications (Compilers, Debugging, Graphics, Security)
– E-mail management
– Personalized assistants that learn
– Learning in robotics and vision
• 2010s:
– Deep learning systems
– Learning for big data
– Bayesian methods
– Multi-task & lifelong learning
– Applications to vision, speech, social networks, learning to read, etc.
– ???
