Machine Learning Overview
Lương Thái Lê
Outline of Machine learning overview
1. Impact of machine learning (ML)
2. ML definition & history
3. Examples of ML problems
4. Process of building an ML model
5. The main components of the ML problem
6. Problems in ML
7. Learning Outcomes and prerequisites
Impact of Machine Learning
• Machine learning (ML) is one of the most exciting recent technologies, used in:
• Google or Bing search engines
• spam email filtering
• photo tagging
• Why is machine learning so prevalent today?
• It grew out of work in AI
• It gives computers a new capability
Pitfalls & Perils of ML
ML definition
• It is an important sub-area of AI that seeks to answer the following
question:
How can we build computer systems that automatically improve with
experience?
• Arthur Samuel (1959):
“Field of study that gives computers the ability to learn without being
explicitly programmed”
• Tom Mitchell (1998):
“A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.”
Clarify the ML definition
• ML Task (T): Classification (pattern recognition), regression
(prediction), clustering, retrieval…
• Experience (E):
• Supervised learning: Labeled data, target value (Decision Tree, SVMs, …)
• Unsupervised learning: Unlabeled data (K-means, DBSCAN, …)
• Reinforcement learning: A reward function (Q-learning,…)
• Performance measure (P):
• Accuracy or Error rate
• Confusion matrix: Precision, Recall…
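The performance measures listed above can be computed directly from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python; the function names and the toy label lists are illustrative assumptions, not part of the original slides.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(y_true, y_pred, positive=1):
    """Derive accuracy, error rate, precision, and recall from the counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        # Precision: of everything predicted positive, how much was right?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of everything actually positive, how much was found?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data (made up for illustration): 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
result = metrics(y_true, y_pred)
print(result)
```

In practice a library such as Scikit-learn provides equivalent functions; the hand-written version is only meant to show what the numbers mean.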
Example Systems that use ML
• Google Search
• Google Car
• Amazon’s recommendation system
• Adobe’s Optical Character Recognition (OCR)
• Facebook’s face tagging, news feed
• Apple’s Siri, Microsoft’s Cortana, Amazon’s Echo (Speech
Recognition)
• Auto-parking and Advanced Driver Assistance Systems (ADAS)
An Incomplete History of ML
• Turing Test (1950)
• Machines do very poorly
• Rosenblatt’s Perceptron (1960’s)
• Kickstarted the mathematical analysis of the learning process
• Key idea behind Support Vector Machines (SVMs) and Neural Networks
• Construction of Fundamentals of Learning Theory (1960-70’s)
• Focus on generalization capability of learning machines
• Performance on unseen data
• Regularization for ill-posed problems
• e.g., linear equations for ill-conditioned matrices
• Neural Networks (1980’s)
• Connectionism
• Back-propagation [LeCun, ’86]
• CNNs, RNNs
• SVMs (1990’s)
• Margin Maximization
• Kernel Methods to handle non-linearity
• Deep Learning (>2006)
• Hinton, Bengio, LeCun at forefront
• (>2012) Craziness!!
What Should You Learn
• Modelling a learning problem
• Various algorithms (techniques) for solving ML problems
• Pitfalls while designing ML systems
• Modelling, Generalization, Regularization & Model Selection, (hyper)-Parameter tuning,
Overfitting, Underfitting
• Importance of Domain Knowledge
• Not treating ML techniques as a black box
• Simplify the learning problem by using domain knowledge
• Engineering Tricks
• Debugging ML systems
• Tools
• Scikit-learn, PyTorch, TensorFlow, OpenCV, etc.
Email Spam Filtering
• T: Predict which emails are spam (so they can be filtered)
• P: % of emails that are classified correctly
• E: A set of sample emails, each represented by a set of attributes (e.g., a keyword set) and a class label (normal email/spam)
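To make T, P, and E concrete, here is a minimal sketch of a keyword-based spam filter in plain Python. It is a deliberately naive word-count voting heuristic chosen for illustration, not a method prescribed by the slides; the function names, labels, and sample emails are all made up.

```python
from collections import Counter


def tokenize(text):
    """Represent an email by its set of lowercase keywords (the attributes)."""
    return set(text.lower().split())


def train(examples):
    """E: learn per-class keyword counts from labeled (text, label) pairs."""
    counts = {"spam": Counter(), "email": Counter()}
    for text, label in examples:
        counts[label].update(tokenize(text))
    return counts


def predict(counts, text):
    """T: label an email as 'spam' if its keywords were seen more often in spam."""
    score = sum(counts["spam"][w] - counts["email"][w] for w in tokenize(text))
    return "spam" if score > 0 else "email"


def accuracy(counts, examples):
    """P: fraction of emails classified correctly."""
    correct = sum(predict(counts, text) == label for text, label in examples)
    return correct / len(examples)


# Hypothetical training and test emails.
train_set = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "email"),
    ("please review the project report", "email"),
]
test_set = [
    ("claim your free money", "spam"),
    ("monday project meeting", "email"),
]

model = train(train_set)
print(accuracy(model, test_set))
```

The model improves at T, as measured by P, only because it has seen more labeled examples E, which is exactly Mitchell's definition in miniature.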
Handwriting recognition
• T: Identify and classify the words in handwritten images
• P: % of words that are recognized and classified correctly
• E: A set of handwritten images, each labeled with the word it contains
Financial loan risk prediction
• T: Determine the level of risk (e.g., high/low) for financial loan applications
• P: % of high-risk loan applications (loans that are not repaid) that are identified correctly
• E: A set of loan applications, each represented by a set of attributes and a risk label (high/low)
Building an ML model
Data set
• Preprocess:
• Remove noise (icons, stickers…) and stop words (if needed)
• Decode abbreviations and native-language expressions
• Handle lowercase/uppercase inconsistencies
• Fill in missing feature values
• Vietnamese: distinguish single words from compound words
• …
• Train/Validation/Test set:
• Train: for training the model
• Validation: for tuning (optimizing) the model
• Test: for the final evaluation of the model
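The preprocessing and splitting steps above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the stop-word list is a hypothetical toy list, missing values are filled with the mean (one of several possible strategies), and the 60/20/20 split ratio is an arbitrary choice.

```python
import random

STOP_WORDS = {"the", "a", "is", "of"}  # hypothetical, tiny stop-word list


def preprocess(text):
    """Lowercase the text, strip non-alphanumeric noise, and drop stop words."""
    tokens = []
    for raw in text.lower().split():
        token = "".join(ch for ch in raw if ch.isalnum())
        if token and token not in STOP_WORDS:
            tokens.append(token)
    return tokens


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:  # nothing to estimate the mean from
        return list(values)
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def split(data, train=0.6, val=0.2, seed=0):
    """Shuffle, then cut into train / validation / test parts."""
    items = list(data)
    random.Random(seed).shuffle(items)  # fixed seed for reproducibility
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


tokens = preprocess("The price is GREAT :) !!!")
filled = fill_missing([1.0, None, 3.0])
train_set, val_set, test_set = split(range(10))
print(tokens, filled, len(train_set), len(val_set), len(test_set))
```

The test part should be touched only once, after all tuning on the validation part is finished; otherwise the reported performance is optimistic.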
ML paradigms
• Supervised Learning:
• learn a function that can be used to predict the output associated with new inputs
• training data is labeled
• Ex: Classification, Regression…
• Unsupervised Learning:
• identify commonalities in the data and react based on the presence or absence of such
commonalities in each new piece of data
• training data is unlabeled
• Ex: Clustering, Community detection…
• Reinforcement learning:
• take actions in an environment in order to maximize some notion of cumulative reward
• the environment is typically stated in the form of a Markov decision process (MDP)
• training data is the set of all possibilities (states and actions) and their corresponding rewards
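For the reinforcement learning paradigm, "cumulative reward" is commonly formalized as a discounted sum of rewards, and Q-learning (mentioned earlier as an example algorithm) updates an estimate of that sum one step at a time. The sketch below assumes a discount factor gamma and a learning rate alpha, neither of which appears in the slides; the state and action names are hypothetical.

```python
def discounted_return(rewards, gamma=0.9):
    """Cumulative reward: r0 + gamma*r1 + gamma^2*r2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning step on the dictionary q[(state, action)]."""
    # Value of the best action available in the next state (0.0 if unseen).
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Move the old estimate toward reward + discounted future value.
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


total = discounted_return([1, 0, 2], gamma=0.5)

q_table = {}
q_update(q_table, "s0", "right", 1.0, "s1", ["left", "right"])
print(total, q_table)
```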
The main components of the ML problem
• Training data:
• Labeled or unlabeled
• Compatible with (representative of) the examples the system will encounter in the future
• Objective function F:
• Determine the form of F:
• F: X -> {0,1}
• F: X -> {Set of classes: c1, c2, …, cn}
• …
• Choose how to represent F:
• a polynomial function
• a set of rules
• a decision tree
• an artificial neural network
• ML algorithm: learns (approximately) the objective function F
• Regression-based
• Back-propagation
• …
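To illustrate the choice of representation, the sketch below writes the same kind of function F: X -> {0,1} in two of the forms listed above: as a set of rules and as a (degree-1) polynomial followed by a threshold. It reuses the loan-risk setting from the earlier example; the attribute names, rule thresholds, and weights are hypothetical values picked by hand, whereas a learning algorithm would fit them from training data.

```python
def f_rules(x):
    """F as a set of rules: returns 1 (high risk) or 0 (low risk)."""
    if x["income"] < 20 and x["debt"] > 10:
        return 1
    if x["late_payments"] >= 3:
        return 1
    return 0


def f_polynomial(x, bias=-1.0, w_income=-0.02, w_debt=0.15, w_late=0.7):
    """F as a degree-1 polynomial of the attributes, thresholded at zero."""
    score = (bias
             + w_income * x["income"]
             + w_debt * x["debt"]
             + w_late * x["late_payments"])
    return 1 if score > 0 else 0


# Hypothetical loan applications.
applicants = [
    {"income": 15, "debt": 12, "late_payments": 0},
    {"income": 50, "debt": 0, "late_payments": 3},
    {"income": 60, "debt": 5, "late_payments": 0},
]
rule_outputs = [f_rules(x) for x in applicants]
poly_outputs = [f_polynomial(x) for x in applicants]
print(rule_outputs, poly_outputs)
```

Both representations agree on these three applicants, but they generalize differently: the rules carve the input space into boxes, while the polynomial draws a single linear boundary.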
Problems in Machine Learning
• Training examples (data)
• How many are enough?
• How do noisy (erroneous) examples and/or examples with missing values affect accuracy?
• Effects of data imbalance
• Learning algorithm (LA)
• Under what conditions will an LA converge (asymptotically) to the objective function that needs to be learned?
• Which LA is best under the specific conditions?
• Learning process
• What is the optimal strategy for choosing the order in which training examples are used?
• How can problem-specific knowledge (besides training examples) contribute to the learning process?
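The effect of data imbalance on accuracy can be seen with a small made-up example: with 5 positive and 95 negative examples, a model that always predicts the majority class looks accurate but never finds a single positive. The data and variable names below are illustrative assumptions.

```python
# Hypothetical imbalanced data set: 5 positives (1) and 95 negatives (0).
labels = [1] * 5 + [0] * 95

# A trivial "model" that always predicts the majority class.
predictions = [0] * len(labels)

# Accuracy: fraction of all examples predicted correctly.
accuracy = sum(t == p for t, p in zip(labels, predictions)) / len(labels)

# Recall: fraction of the actual positives that were found.
true_positives = sum(t == 1 and p == 1 for t, p in zip(labels, predictions))
recall = true_positives / sum(t == 1 for t in labels)

print(accuracy, recall)
```

This is one reason precision and recall are reported alongside accuracy when the classes are imbalanced.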
Learning Outcomes
• Explain the different types of learning problems along with some
techniques to solve them
• Model real-world problems, apply different learning techniques and
quantitatively evaluate the performance
• Identify and use advanced techniques with the help of existing
machine learning tools and libraries
• Analyze performance of ML techniques and comment on their
limitations
Prerequisites
• Required
• Linear Algebra
• Probability and Statistics
• Advanced Calculus (mainly, vector differentiation)
• Introduction to Programming (Python)
• In reality you would need much more than an introduction
• Desired
• Optimization
• At least knowledge of gradient descent used for function minimization
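As a reminder of what gradient descent for function minimization looks like, here is a minimal one-dimensional sketch. The function f(x) = (x - 3)^2, the learning rate, and the number of steps are arbitrary choices made for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # close to 3, the minimizer of f
```

The same update applies to vectors of parameters, which is where the vector differentiation listed under the required prerequisites comes in.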
Q&A - Thank you!