1
Machine Learning (機器學習)
Min Sun (孫民)
National Tsing Hua University (清華大學)
2/13/25
2
Head of VP Applied Scientist
Feb. 2023 - present
3 Textbook
Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
Christopher M. Bishop: Microsoft Technical Fellow and Director of Microsoft Research AI4Science
4 Course Content
• Introduction
• Probability Distributions
• Linear Regression Models
• Linear Classification Models
• Neural Networks & CNNs
• Transformer (new)
• Kernel Methods & SVM
• Graphical Models
• Mixture Models & EM
• Beyond Supervised Learning: Clustering, Transfer Learning, Few-Shot Learning
5 Grading Policy
• Homework (60%)
  • 3 written exercises (selected problems)
  • 3 computer assignments
• No midterm or final exams
• Course project for teams of up to 3 people (35%)
• Self-tutorial (5%)
6 Course Project
• Maximum of 3 people per team
• Kaggle platform
• Grading
  • Mid-term proposal presentation (25%), novelty-oriented
  • Final presentation (20%)
  • Rank in the topics (20%)
  • Final report (35%)
9
2022, 4/15
Among Top 1000
Only 8 from Taiwan
10
https://www.kaggle.com/khyeh0719/competitions
11 Self-tutorial
• Same team as your course project
• 5-minute presentation covering the title, the problem, the solution, and the impact
Examples in Two Minute Papers
https://youtu.be/Lu56xVlZ40M
12 Academic Integrity
13 Sign before next Monday on eeclass.
https://aliensunmin.github.io/teaching/ml2025s/#Sy
14 What’s Machine Learning?
Machine learning is a subfield of AI that aims to teach computers how to learn and act
without being explicitly programmed. More specifically, machine learning involves building
and adapting models, which allow programs to "learn" through experience (e.g., data).
Machine learning involves the construction of algorithms that adapt their models to
improve their ability to make predictions.
15 Looking Back at the History of AI
16 What’s Machine Learning?
A computer program is said to learn from experience E with respect to some task T
and some performance measure P, if its performance on T, as measured by P,
improves with experience E.
Tom Mitchell, Professor of Computer Science and Machine Learning at Carnegie Mellon
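Mitchell's definition can be made concrete with a toy learner. The sketch below is an illustration only, not part of the course material: the task T (labeling a number as above or below 0.3), the experience E (evenly spaced labeled examples), and the performance measure P (accuracy on a fixed test set) are all hypothetical choices.

```python
# Toy illustration of the E / T / P definition (all names and numbers are hypothetical).
# Task T: label a number x in [0, 1] as 1 if x >= 0.3, else 0.
# Experience E: n evenly spaced labeled examples.
# Performance P: accuracy on a fixed test set.

TRUE_BOUNDARY = 0.3


def label(x):
    """Ground-truth labeling rule that the learner never sees directly."""
    return int(x >= TRUE_BOUNDARY)


def make_examples(n):
    """Experience E: n evenly spaced points on [0, 1] with their labels."""
    return [(i / (n - 1), label(i / (n - 1))) for i in range(n)]


def learn_threshold(examples):
    """Place the decision threshold midway between the two classes."""
    largest_negative = max(x for x, y in examples if y == 0)
    smallest_positive = min(x for x, y in examples if y == 1)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold, test_set):
    """Performance P: fraction of test points labeled correctly."""
    correct = sum(int(x >= threshold) == y for x, y in test_set)
    return correct / len(test_set)


TEST_SET = [(k / 1000, label(k / 1000)) for k in range(1000)]
SIZES = (2, 5, 11, 101)
accuracies = [accuracy(learn_threshold(make_examples(n)), TEST_SET) for n in SIZES]

for n, acc in zip(SIZES, accuracies):
    print(f"E = {n:3d} examples -> P = {acc:.3f}")
```

With more examples the learned threshold moves closer to the true boundary, so accuracy on the test set rises: performance on T, measured by P, improves with E.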
17 Why Machine Learning?
Machine Learning is everywhere
Digital Transformation leads to Machine Learning as a flywheel.
https://www.tecton.ai/blog/managing-the-flywheel-of-machine-learning-data/
18 Why Machine Learning?
Software 2.0 vs. 1.0
The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in
languages such as Python, C++, etc. It consists of explicit instructions to the computer
written by a programmer. By writing each line of code, the programmer identifies a
specific point in program space with some desirable behavior.
Software 2.0 is written in much more abstract, human unfriendly language, such as the
weights of a neural network. No human is involved in writing this code because there
are a lot of weights.
Software 1.0 is code we write. Software 2.0 is code written by the optimization based on
an evaluation criterion (such as “classify this training data correctly”). It is likely that any
setting where the program is not obvious but one can repeatedly evaluate the
performance of it (e.g. — did you classify some images correctly? do you win games of
Go?) will be subject to this transition, because the optimization can find much better
code than what a human can write.
https://karpathy.medium.com/software-2-0-a64152b37c35
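The contrast between the two stacks can be sketched on a deliberately tiny problem. The example below is an assumption-laden illustration, not taken from the linked article: the same logical AND behavior is written once by hand (Software 1.0) and once as weights found by a perceptron update rule, which stands in for "the optimization" (Software 2.0). The function names are hypothetical.

```python
# Software 1.0: the programmer writes the rule explicitly.
def and_gate_v1(a, b):
    return int(a == 1 and b == 1)


# Software 2.0 (toy version): the "code" is a set of weights found by
# optimization against an evaluation criterion (classify the training data correctly).
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Linear threshold unit: output 1 when the weighted sum is positive."""
    score = sum(w * x for w, x in zip(weights, inputs)) + bias
    return int(score > 0)


def train_perceptron(data, max_epochs=100):
    """Perceptron learning rule; stops once every training example is correct."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in data:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
        if mistakes == 0:
            break
    return weights, bias


weights, bias = train_perceptron(TRAINING_DATA)


def and_gate_v2(a, b):
    return predict(weights, bias, (a, b))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate_v1(a, b), and_gate_v2(a, b))
```

Nobody typed the final weights; they were reached by repeatedly evaluating the program on data and adjusting it. For a problem this small the hand-written rule is obviously simpler, and the argument on the slide applies to settings where no such rule is obvious.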
19 Why Machine Learning?
Higher Average Salary (US, Dec. 2021)
Machine Learning Engineer (ML Engineer) vs. Software Engineer
https://medium.com/geekculture/machine-learning-engineer-vs-software-engineer-salary-aebc9a5bc2c5
20 Schools of Machine Learning Algorithms
KNN
21 Machine Learning Problems
                    Supervised Learning    Unsupervised Learning
Discrete Output     Classification         Clustering
Continuous Output   Regression             Dimensionality Reduction
22 Machine Learning Problems: Classification (supervised learning, discrete output)
23 Breast Cancer (Malignant, Benign)
24 Breast Cancer (Malignant, Benign)
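A malignant-versus-benign decision is a classification problem: the output is one of a discrete set of labels. The sketch below uses a k-nearest-neighbors vote as one possible classifier. The two features and every measurement are invented for illustration and are not real clinical data.

```python
import math
from collections import Counter

# Hypothetical toy measurements: (tumor size in cm, texture score) and a label.
TRAINING_SET = [
    ((1.0, 2.0), "benign"),
    ((1.5, 1.8), "benign"),
    ((2.0, 2.5), "benign"),
    ((1.2, 3.0), "benign"),
    ((4.0, 6.0), "malignant"),
    ((4.5, 5.5), "malignant"),
    ((5.0, 7.0), "malignant"),
    ((3.8, 6.5), "malignant"),
]


def knn_classify(query, training_set, k=3):
    """Return the majority label among the k training points closest to query."""
    neighbors = sorted(training_set, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


print(knn_classify((1.4, 2.2), TRAINING_SET))
print(knn_classify((4.2, 6.1), TRAINING_SET))
```

The first query sits among the small, low-texture examples and the second among the large, high-texture ones, so the votes are unanimous in both cases.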
25 Face Recognition
Facebook auto-tagging
26 Speech Recognition
2010 2013
27 Speech Recognition – early vision
28 Machine Translation
29 Machine Learning Problems: Regression (supervised learning, continuous output)
30 Housing Price Prediction
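Housing price prediction is a regression problem: the output is a continuous number. A minimal sketch, assuming a single feature (floor area) and ordinary least squares for a straight line, is shown below. The areas and prices are made-up values chosen only to illustrate the fit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: floor area in square meters, price in thousands.
areas = [50, 70, 90, 110, 130]
prices = [155, 205, 275, 325, 390]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 100 + intercept  # prediction for a 100 square-meter home

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted price for 100 m^2: {predicted_price:.1f}")
```

Unlike the classifier's discrete label, the prediction here can take any real value, which is what separates the two rows of the table.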
31 Stock Market
32 Weather Prediction
33 Human Pose Estimation
34 Machine Learning Problems: Clustering (unsupervised learning, discrete output)
35 Clustering People
36 Clustering DNA Sequence
build groups of genes with related expression patterns (also known as coexpressed genes)
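Grouping items with similar patterns, without any labels, can be sketched with k-means, one common clustering algorithm (the slide does not say which method is used). The six two-dimensional "expression profiles" and the choice of initial centroids below are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical expression levels of six genes under two conditions.
profiles = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
labels, centroids = kmeans(profiles, centroids=[profiles[0], profiles[3]])

print(labels)
print(centroids)
```

No labels were supplied; the two groups emerge from the distances alone, which is what makes the problem unsupervised.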
37 Machine Learning Problems: Dimensionality Reduction (unsupervised learning, continuous output)
38 Dimensionality Reduction
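Dimensionality reduction replaces each data point with fewer numbers while keeping most of its variation. The sketch below uses principal component analysis in the simplest possible setting, two dimensions reduced to one, with the closed-form angle of the major axis of a 2x2 covariance matrix. The data points are hypothetical and lie exactly on a line, so one coordinate is enough to rebuild them.

```python
import math


def first_principal_axis(points):
    """Return the mean and the unit direction of largest variance for 2D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Orientation of the major axis of the covariance matrix [[sxx, sxy], [sxy, syy]].
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))


def project(point, mean, direction):
    """Reduce a 2D point to a single coordinate along the principal axis."""
    return (point[0] - mean[0]) * direction[0] + (point[1] - mean[1]) * direction[1]


def reconstruct(t, mean, direction):
    """Map the 1D coordinate back into 2D."""
    return (mean[0] + t * direction[0], mean[1] + t * direction[1])


points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hypothetical data on y = 2x
mean, direction = first_principal_axis(points)
codes = [project(p, mean, direction) for p in points]
rebuilt = [reconstruct(t, mean, direction) for t in codes]

print(direction)
print(codes)
```

Each point is now described by one number instead of two. With noisy data the reconstruction would be approximate rather than exact.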
39 3D Face Modeling
40 3D Shape Modeling
41 Data for ML/AI
42 Data for ML/AI
• More than 90% of all data was generated in the last two years.
• 80–90% of the data that internet users generate daily is unstructured (text, image, audio, etc.).
https://firstsiteguide.com/big-data-stats/
43 Model Size for ML/AI
Leaked Predicted
44 Model Size for ML/AI
Leaked Predicted
45 Model Size for ML/AI
46 Computing Advance for ML/AI
47 Three Pillars of ML/AI
• Data: driven by digital transformation
• Compute: semiconductors and chips become critical
• Algorithm: stable, efficient, and scalable