ML 0

The document outlines a machine learning course led by Professor Sun Min at Tsinghua University, detailing the course content, grading policy, and project requirements. It emphasizes the importance of machine learning in modern applications and discusses various machine learning problems such as supervised and unsupervised learning. Additionally, it highlights the significance of data, computing power, and algorithms in the field of machine learning and AI.

1

Machine Learning

Sun Min (孫民)
National Tsing Hua University (清華大學)

2/13/25
2

Head of VP Applied Scientist


Feb. 2023 - present

3 Textbook

Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
Microsoft Technical Fellow and Director of Microsoft Research AI4Science

4 Course Content
• Introduction
• Probability Distributions
• Linear Regression Models
• Linear Classification Models
• Neural Networks & CNNs
• Transformer (new)
• Kernel Methods & SVM
• Graphical Models
• Mixture Models & EM
• Beyond Supervised Learning: Clustering, Transfer Learning, Few-Shot Learning


5 Grading Policy

• Homework (60%)
  • 3 written exercises (selected problems)
  • 3 computer assignments
• No midterm or final exams
• Course project in teams of at most 3 people (35%)
• Self-tutorial (5%)

6 Course Project

• Max 3 people per team
• Kaggle platform
• Grading
  • Mid-term proposal presentation (25%) – Novelty-oriented
  • Final presentation (20%)
  • Rank in the topics (20%)
  • Final report (35%)

9

2022, 4/15
Among Top 1000
Only 8 from Taiwan

10

https://www.kaggle.com/khyeh0719/competitions
11 Self-tutorial

• Same team as your course project
• 5-minute presentation including
  • Title, the problem, the solution, the impact

Examples in Two Minute Papers
https://youtu.be/Lu56xVlZ40M

12 Academic Integrity

13 Sign before next Monday on eeclass.

https://aliensunmin.github.io/teaching/ml2025s/#Sy

14 What’s Machine Learning?

Machine learning is a subfield of AI that aims to teach computers how to learn and act without being explicitly programmed. More specifically, machine learning involves building and adapting models, which allow programs to "learn" through experience (e.g., data). Machine learning involves the construction of algorithms that adapt their models to improve their ability to make predictions.

15 Looking Back at the History of AI

16 What’s Machine Learning?
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Tom Mitchell, Professor of Computer Science and Machine Learning at Carnegie Mellon
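
The definition can be made concrete with a small sketch. Everything here is an illustrative assumption, not course material: the toy data, the hidden rule x >= 5, and the choice of a 1-nearest-neighbor learner. T is labeling a number, P is accuracy on a fixed test set, and E is the number of labeled examples seen.

```python
# Hypothetical toy data: (feature, label); label is 1 when the hidden rule x >= 5 holds.
EXPERIENCE = [(4.4, 0), (9.8, 1), (6.3, 1), (1.2, 0), (5.1, 1), (3.0, 0)]
TEST_SET = [(x, int(x >= 5)) for x in range(10)]


def predict(train, x):
    """Task T: label x by copying the label of the nearest training example (1-NN)."""
    nearest = min(train, key=lambda example: abs(example[0] - x))
    return nearest[1]


def accuracy(train):
    """Performance measure P: fraction of test points labeled correctly."""
    hits = sum(predict(train, x) == y for x, y in TEST_SET)
    return hits / len(TEST_SET)


def learning_curve(sizes=(2, 4, 6)):
    """Experience E: the first n labeled examples, for growing n."""
    return [accuracy(EXPERIENCE[:n]) for n in sizes]


if __name__ == "__main__":
    for n, acc in zip((2, 4, 6), learning_curve()):
        print(f"E = {n} examples -> P = {acc:.1f}")
```

On this hand-picked data the accuracy rises as more examples are used. With other data the curve need not rise at every step; the definition only asks that P improves with E overall.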
17 Why Machine Learning?
Machine Learning is everywhere
Digital transformation turns machine learning into a flywheel.

https://www.tecton.ai/blog/managing-the-flywheel-of-machine-learning-data/
18 Why Machine Learning?
Software 2.0 vs. 1.0
The “classical stack” of Software 1.0 is what we’re all familiar with: it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer. By writing each line of code, the programmer identifies a specific point in program space with some desirable behavior.

Software 2.0 is written in a much more abstract, human-unfriendly language, such as the weights of a neural network. No human is involved in writing this code because there are a lot of weights.

Software 1.0 is code we write. Software 2.0 is code written by an optimization based on an evaluation criterion (such as “classify this training data correctly”). It is likely that any setting where the program is not obvious but one can repeatedly evaluate its performance (e.g., did you classify some images correctly? do you win games of Go?) will be subject to this transition, because the optimization can find much better code than what a human can write.

https://karpathy.medium.com/software-2-0-a64152b37c35
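
A minimal sketch of the contrast, under loud assumptions: the temperature data, the function names, and the single threshold parameter are all hypothetical, and the exhaustive search below merely stands in for the gradient-based training of neural-network weights that the quoted text has in mind.

```python
# Hypothetical training data: (temperature in Celsius, 1 if labeled "hot" else 0).
TRAINING_DATA = [(12, 0), (18, 0), (21, 0), (24, 0), (27, 1), (30, 1), (33, 1), (36, 1)]


def is_hot_v1(temp_c):
    """Software 1.0: a programmer writes the rule and picks the threshold by hand."""
    return int(temp_c >= 20)


def evaluate(threshold, data):
    """Evaluation criterion: number of training examples classified correctly."""
    return sum(int(temp >= threshold) == label for temp, label in data)


def fit_threshold(data, candidates=range(0, 41)):
    """Software 2.0 (toy version): an optimizer searches for the best-scoring parameter."""
    return max(candidates, key=lambda threshold: evaluate(threshold, data))


def make_is_hot_v2(data):
    """Return a classifier whose 'code' (the threshold) was found by the optimizer."""
    threshold = fit_threshold(data)
    return lambda temp_c: int(temp_c >= threshold)


if __name__ == "__main__":
    is_hot_v2 = make_is_hot_v2(TRAINING_DATA)
    print("hand-written rule correct on:", evaluate(20, TRAINING_DATA), "of", len(TRAINING_DATA))
    print("fitted threshold:", fit_threshold(TRAINING_DATA))
    print("v1 vs v2 at 24 C:", is_hot_v1(24), is_hot_v2(24))
```

The hand-picked threshold misclassifies part of the training data, while the searched one fits all of it. The programmer's job shifts from writing the rule to supplying the data and the evaluation criterion.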
19 Why Machine Learning?

Higher average salary (US, Dec. 2021)

Machine Learning Engineer (ML Engineer) vs. Software Engineer

https://medium.com/geekculture/machine-learning-engineer-vs-software-engineer-salary-aebc9a5bc2c5
20 Schools of Machine Learning Algorithms

KNN

21 Machine Learning Problems

                    Supervised Learning   Unsupervised Learning
Discrete Output     Classification        Clustering
Continuous Output   Regression            Dimensionality Reduction
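
The grid can be read as a lookup on two questions: is the data labeled, and is the output discrete? The sketch below encodes only that reading; the function name and its boolean parameters are hypothetical.

```python
def problem_type(has_labels: bool, discrete_output: bool) -> str:
    """Return the cell of the 2x2 grid selected by the two yes/no questions."""
    grid = {
        (True, True): "classification",
        (True, False): "regression",
        (False, True): "clustering",
        (False, False): "dimensionality reduction",
    }
    return grid[(has_labels, discrete_output)]


if __name__ == "__main__":
    # Labeled tumors, output is one of {malignant, benign}.
    print(problem_type(has_labels=True, discrete_output=True))
    # Labeled house sales, output is a price.
    print(problem_type(has_labels=True, discrete_output=False))
```

Real problems do not always fit one cell cleanly, so the lookup is a study aid rather than a rule.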

22 Machine Learning Problems

                    Supervised Learning   Unsupervised Learning
Discrete Output     Classification        Clustering
Continuous Output   Regression            Dimensionality Reduction

23 Breast Cancer (Malignant, Benign)

24 Breast Cancer (Malignant, Benign)

25 Face Recognition

Facebook auto-tagging

26 Speech Recognition

2010 2013

27 Speech Recognition – early vision

28 Machine Translation

29 Machine Learning Problems

                    Supervised Learning   Unsupervised Learning
Discrete Output     Classification        Clustering
Continuous Output   Regression            Dimensionality Reduction

30 Housing Price Prediction

31 Stock Market

32 Weather Prediction

33 Human Pose Estimation

34 Machine Learning Problems

                    Supervised Learning   Unsupervised Learning
Discrete Output     Classification        Clustering
Continuous Output   Regression            Dimensionality Reduction

35 Clustering People

36 Clustering DNA Sequence

Build groups of genes with related expression patterns (also known as coexpressed genes).
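
As one possible illustration of grouping genes by expression pattern, here is a small k-means sketch. The slide does not name an algorithm, so k-means is a swapped-in choice, and the gene names, expression values, and initialization from the first k genes are all hypothetical.

```python
# Hypothetical expression profiles: gene name -> expression level in three samples.
PROFILES = {
    "geneA": (1.0, 1.2, 0.9),
    "geneB": (5.0, 5.1, 4.8),
    "geneC": (0.8, 1.0, 1.1),
    "geneD": (5.2, 4.9, 5.0),
}


def squared_distance(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of profiles."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(profiles, k=2, iterations=10):
    """Assign each gene to one of k clusters; returns {gene name: cluster index}."""
    names = list(profiles)
    centers = [profiles[name] for name in names[:k]]  # simple deterministic start
    assignment = {}
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest center.
        assignment = {
            name: min(range(k), key=lambda c: squared_distance(profiles[name], centers[c]))
            for name in names
        }
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [profiles[name] for name in names if assignment[name] == c]
            if members:
                centers[c] = mean(members)
    return assignment


if __name__ == "__main__":
    print(kmeans(PROFILES))
```

No labels are used anywhere: the groups come only from which profiles are close to each other, which is what makes this an unsupervised problem.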

37 Machine Learning Problems

                    Supervised Learning   Unsupervised Learning
Discrete Output     Classification        Clustering
Continuous Output   Regression            Dimensionality Reduction

38 Dimensionality Reduction

39 3D Face Modeling

40 3D Shape Modeling

41 Data for ML/AI

42 Data for ML/AI
• More than 90% of all data was generated in the past two years.
• 80–90% of the data that internet users generate daily is unstructured (text, image, audio, etc.).

https://firstsiteguide.com/big-data-stats/
43 Model Size for ML/AI
Leaked Predicted

44 Model Size for ML/AI
Leaked Predicted

45 Model Size for ML/AI

46 Computing Advance for ML/AI

47 Three Pillars of ML/AI

• Data – Driven by digital transformation
• Compute – Semiconductors and chips become critical
• Algorithm – Stable, efficient, and scalable