Data in ML

This document provides an introduction to machine learning. It defines machine learning as a computer program that improves its performance on tasks through experience. It discusses important machine learning concepts like training sets, instances, features, and feature vectors. It also describes different types of machine learning including supervised learning for classification and regression, unsupervised learning, and reinforcement learning. Finally, it discusses some challenges in machine learning like data collection, dimensionality, and data issues like noise, outliers, and imbalance.

Uploaded by

Purnama Gaming


INTRODUCTION TO MACHINE LEARNING

Dr. Gede Angga Pradipta M.Eng

What Is Machine Learning?

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

(Tom Mitchell, 1997)

Some Important Terminology
• Training/Evaluation Set
- A set of data used to discover potentially predictive relationships.
• Instances
- An instance (sample) is an item to process (e.g. classify). It can be a document, a picture, a sound,
a video, a row in a database or CSV file, or anything else you can describe with a fixed set of
quantitative traits.
• Features/attributes
- The distinct traits that can be used to describe each item in a
quantitative manner.
• Feature vector
- An n-dimensional vector of numerical features that represents some object.
• Feature extraction
- Preparation of the feature vector
- Transforms data in a high-dimensional space into a space of fewer dimensions
Attributes (features): Name, Balance, Age, Employed
Target attribute (class/label): Write-off

Name      Balance   Age   Employed   Write-off
Mike      $2000     31    No         Yes
Mary      $13500    30    Yes        Yes
Claudio   $200      21    No         No
Robert    $5000     28    Yes        Yes
David     $10000    42    Yes        Yes

Each row is one instance (example). For the third row:
Feature vector is: <Claudio, 200, 21, No>
Class label (value of target attribute) is: No
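
The split between feature vector and class label can be sketched in a few lines of Python. The rows are copied from the table; the helper names `split_instance` and `to_numeric` are hypothetical, and the numeric encoding (dropping the name, mapping Yes/No to 1/0) is only one possible assumption about how the features would be prepared.

```python
# Rows copied from the table; keys mirror its column names.
rows = [
    {"Name": "Mike", "Balance": 2000, "Age": 31, "Employed": "No", "Write-off": "Yes"},
    {"Name": "Mary", "Balance": 13500, "Age": 30, "Employed": "Yes", "Write-off": "Yes"},
    {"Name": "Claudio", "Balance": 200, "Age": 21, "Employed": "No", "Write-off": "No"},
    {"Name": "Robert", "Balance": 5000, "Age": 28, "Employed": "Yes", "Write-off": "Yes"},
    {"Name": "David", "Balance": 10000, "Age": 42, "Employed": "Yes", "Write-off": "Yes"},
]

FEATURES = ["Name", "Balance", "Age", "Employed"]  # attributes
TARGET = "Write-off"                               # class/label


def split_instance(row):
    """Return (feature_vector, class_label) for one instance (one row)."""
    feature_vector = tuple(row[name] for name in FEATURES)
    return feature_vector, row[TARGET]


def to_numeric(row):
    """Hypothetical numeric encoding: drop the name, map Yes/No to 1/0."""
    employed = 1.0 if row["Employed"] == "Yes" else 0.0
    return [float(row["Balance"]), float(row["Age"]), employed]


vector, label = split_instance(rows[2])
print(vector, label)        # ('Claudio', 200, 21, 'No') No
print(to_numeric(rows[2]))  # [200.0, 21.0, 0.0]
```

The numeric form is what the terminology slide calls an n-dimensional vector of numerical features; here n = 3.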
WHY MACHINE LEARNING ?
Traditional Code Vs Machine Learning
Machine Learning Pipeline
• Defining Problem
- Data acquisition (image, text, sound, video, etc.)
- ML problem checklist
• Preparing Data
- Removing noise (e.g. Tomek links)
- Oversampling data
- Feature engineering
- Feature selection
- Data transformation
• Check Algorithms
- Performance measurement
- Evaluate different models
• Improve Result
- Tuning parameters
- Build ensemble model
• Present Result
- Present final performance
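
As an illustration of the oversampling step in data preparation, here is a minimal sketch of plain random oversampling: minority-class samples are duplicated at random until every class is as frequent as the largest one. The function name `random_oversample` is hypothetical, and this is the simplest variant, not necessarily the method the slides have in mind. The labels reuse the Write-off column of the earlier table.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# (Balance, Age) pairs and Write-off labels from the example table: 4 "Yes", 1 "No".
samples = [(2000, 31), (13500, 30), (200, 21), (5000, 28), (10000, 42)]
labels = ["Yes", "Yes", "No", "Yes", "Yes"]

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training portion only, so that duplicated samples do not leak into the data used for evaluation.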
Types of machine learning
Supervised Machine Learning
Supervised learning (Regression)

• Regression is a measure of the relation between the mean value of one variable (e.g. salary) and the corresponding values of other variables (e.g. experience).
• Regression analysis is a statistical process for estimating the relationships among variables.
• Regression means predicting the output value using training data.
• Basic algorithm: linear regression
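
A minimal sketch of simple linear regression with ordinary least squares, using the salary/experience pairing mentioned above. The data values are invented for illustration (chosen to lie exactly on a line so the result is easy to check), and the names `fit_line` and `predict` are hypothetical helpers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the output value for a new input."""
    return slope * x + intercept


# Made-up training data: years of experience vs. salary (in thousands).
experience = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

slope, intercept = fit_line(experience, salary)
print(slope, intercept)               # 5.0 30.0
print(predict(6, slope, intercept))   # 60.0
```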
Supervised learning (Classification)

• Neural network model
• Geometric model
• Logical/rule-based model
• Probabilistic model

Differences between classification and regression

Classification
• A classification problem requires that examples be classified into one of two or more classes.
• A classification can have real-valued or discrete input variables.
• A problem with two classes is often called a two-class or binary classification problem.
• A problem with more than two classes is often called a multi-class classification problem.
• A problem where an example is assigned multiple classes is called a multi-label classification problem.

Regression
• A regression problem requires the prediction of a quantity.
• A regression can have real-valued or discrete input variables.
• A problem with multiple input variables is often called a multivariate regression problem.
• A regression problem where input variables are ordered by time is called a time series forecasting problem.
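
The contrast can be made concrete with one algorithm used both ways. The sketch below uses k-nearest neighbours, chosen only because it is short: for classification the neighbours vote on a class, for regression their target values are averaged into a quantity. The data are invented and the function names are hypothetical.

```python
import math
from collections import Counter


def nearest(train_x, query, k):
    """Indices of the k training points closest to the query (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return order[:k]


def knn_classify(train_x, train_y, query, k=3):
    """Classification: return the majority class among the k nearest neighbours."""
    votes = Counter(train_y[i] for i in nearest(train_x, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_x, train_y, query, k=3):
    """Regression: return the mean target value of the k nearest neighbours."""
    idx = nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / k


# Made-up one-feature training set with a class label and a numeric target.
train_x = [(1.0,), (2.0,), (3.0,), (6.0,), (7.0,), (8.0,)]
classes = ["low", "low", "low", "high", "high", "high"]
quantities = [10.0, 20.0, 30.0, 60.0, 70.0, 80.0]

print(knn_classify(train_x, classes, (6.5,)))    # high
print(knn_regress(train_x, quantities, (6.5,)))  # 70.0
```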
Unsupervised learning

Source: scikit-learn.org
Reinforcement Learning
Some challenging tasks in machine learning
• Data collection/acquisition is difficult
• Not all features are useful for finding a good model (Curse of Dimensionality)
• Some datasets contain noise, outliers, or redundant data
• Small and imbalanced data
Curse of Dimensionality
• As the number of features or dimensions grows, the amount of data we need
to generalize accurately grows exponentially (C. Bishop, 2006)

As the dimensionality increases, the classifier's performance increases until the optimal number of features is reached. Further increasing the dimensionality without increasing the number of training samples results in a decrease in classifier performance.

Source: https://www.datasciencecentral.com/
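
One rough way to see the exponential growth is to count grid cells. Assume, purely for illustration, that each feature is split into 10 equal bins and that we would like at least one training sample per cell. The helper names below are hypothetical.

```python
def cells_needed(bins_per_feature, n_features):
    """Number of cells in a grid over the feature space."""
    return bins_per_feature ** n_features


def samples_per_cell(n_samples, bins_per_feature, n_features):
    """Average number of training samples that fall in each cell."""
    return n_samples / cells_needed(bins_per_feature, n_features)


# With a fixed budget of 1000 samples, coverage collapses as features are added.
for d in (1, 2, 3, 10):
    print(d, cells_needed(10, d), samples_per_cell(1000, 10, d))
```

With one feature, 1000 samples give 100 samples per cell; with three features, one per cell; with ten features, the grid has 10^10 cells and almost all of them are empty.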
Curse of Dimensionality
• If we keep adding features, the dimensionality of the feature space grows, and the space becomes
sparser and sparser.
• Due to this sparsity, it becomes much easier to find a separating hyperplane, because the
likelihood that a training sample lies on the wrong side of the best hyperplane becomes
infinitely small when the number of features becomes infinitely large.

Source: https://www.datasciencecentral.com/
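
The claim about separating hyperplanes can be checked with a small experiment. The sketch below assigns completely random labels to 20 random points and asks whether a perceptron (used here only as a convenient hyperplane finder) reaches zero training errors. All names and settings (`perceptron_separates`, `separable_fraction`, 20 samples, 10 trials, Gaussian features) are assumptions made for this illustration. A perceptron that fails to converge within the epoch limit is treated as "not separated", which is a practical shortcut rather than a proof.

```python
import random


def perceptron_separates(points, labels, max_epochs=1000):
    """Return True if a perceptron reaches zero training errors within max_epochs."""
    dim = len(points[0])
    weights = [0.0] * (dim + 1)  # last entry is the bias term
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(points, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
            if y * activation <= 0:  # misclassified: move the hyperplane toward x
                for i, xi in enumerate(x):
                    weights[i] += y * xi
                weights[-1] += y
                errors += 1
        if errors == 0:
            return True
    return False


def separable_fraction(n_features, n_samples=20, trials=10, seed=0):
    """Fraction of random datasets with random labels that the perceptron separates."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        points = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                  for _ in range(n_samples)]
        labels = [rng.choice([-1, 1]) for _ in range(n_samples)]
        hits += perceptron_separates(points, labels)
    return hits / trials


print("2 features: ", separable_fraction(2))
print("50 features:", separable_fraction(50))
```

The labels carry no information at all, yet with 50 features and only 20 samples a perfect separating hyperplane exists for every dataset. Perfect training accuracy in a sparse, high-dimensional space therefore says little about performance on unseen data.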
Curse of Dimensionality

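A classifier that separates the training data perfectly in such a sparse, high-dimensional space has learned exceptions specific to the training set and generalizes poorly to unseen data. This is called overfitting and is a direct result of the curse of dimensionality.

Source: https://www.datasciencecentral.com/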
How to Evaluate/Validate an ML Model
• K-fold cross-validation
• Train-test data split
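
Both evaluation schemes come down to how sample indices are divided. The sketch below is a pure-Python illustration; `split_train_test` and `k_fold` are hypothetical helper names, not a library API. In a train-test split the data is divided once. In k-fold cross-validation every sample is used for testing exactly once, across k rounds.

```python
import random


def split_train_test(indices, test_fraction=0.2, seed=0):
    """Shuffle the indices once and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k=5):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    indices = list(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


train, test = split_train_test(range(10))
print(len(train), len(test))  # 8 2

for train, test in k_fold(range(10), k=5):
    print(test)
```

In practice the model is fitted on `train` and scored on `test` in each round, and the k scores are averaged.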
THANK YOU
