
DD1420 Foundations of Machine Learning

Kevin Smith, Gustav Henter, Florian Pokorny
KTH Royal Institute of Technology

Document Version: Monday 25th August, 2025 at 13:24


Outline

1. Introduction

2. Course structure

3. What is Machine Learning?

4. The modules & what you will learn

5. Brushing up on prerequisites
Teaching Team: Course Responsible Teachers

Gustav Henter Florian Pokorny Kevin Smith

Teaching Team: TAs

Rafael Cabral Muchacho Jingyu (Jim) Guo Emir Konuk Robert Welch

Emma Fyring Phitchapha (Pim) Lertsiravarameth Changuan Meng

Noel Sarban David Skalin Jen-Hung Wang

Who is this course for?

• You are a Bachelor student and want to seriously pursue ML in your
Master? YES

• You are an ML Master student and missed some prerequisites? YES

• You just want a quick intro to some practical ML? NO

• You want to learn the ins and outs of Tensorflow/Pytorch and train
deep nets? NO
Structure of a typical module

Walkthrough: Canvas and Notion Pages

Canvas: https://canvas.kth.se/courses/55922
Notion: https://dd1420.notion.site/dd1420/DD1420-Lecture-Notes-b555e017345a4119950ce8fd67133275
Grading Summary

PRO1 1.5 credits - Complete all lesson assignments (video lectures and
lecture notes) and pass all practice quizzes with a score of at least
70%. Grade: P/F.

INL1 3.0 credits - Exercises. This component includes both the
completion of exercises (40% of the grade) and performance on oral
examinations (60% of the grade). Grade: A, B, C, D, E, F.

TES1 3.0 credits - Summary Quizzes. These quizzes test your knowledge
and understanding of the course material. Grade: A, B, C, D, E, F.
Grading Scale

A 90-100%

B 80-89%

C 70-79%

D 60-69%

E 50-59%

F less than 50%

Fx is not offered, but the course runs twice per year, so it can be retaken.
Tasks for this week

What motivates Machine Learning, and how is the term defined?

Enter your answers at www.menti.com and use code 8554 0709

What is Machine Learning?

Arthur Samuel, 1959: “the field of study that gives computers the
ability to learn without explicitly being programmed.”

Interesting early work: "Some Studies in Machine Learning Using the Game of Checkers".
IBM Journal of Research and Development. 1959, 3 (3): 210–229.

Cambridge Dictionary, 2022: “the process of computers changing
the way they carry out tasks by learning from new data, without a
human being needing to give instructions in the form of a program”
What is Machine Learning?

Tom M. Mitchell, 1997: “A computer program is said to learn from
experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P,
improves with experience E.”

Machine Learning, Tom Mitchell, McGraw Hill, 1997

Not examples according to this definition: some methods you may still
think of as being “intelligent” or part of “AI”, e.g. a manually programmed
chat bot, aspects of Game Theory, vanilla search, motion planning, and
optimization, whenever these methods do not improve with experience E.
What are some examples of modern machine learning?

Waymo’s Self-Driving Car

Waymo self-driving car, Image: CC BY-SA 4.0 wikimedia.org user Grendelkhan
NVIDIA’s StyleGAN

github.com/NVlabs/stylegan, Image: NVIDIA Corporation CC BY-NC 4.0


Many many examples...

• OpenAI’s ChatGPT

• Automated translation

• Recommender systems when you go shopping

• ...

What is the most basic categorization of ML methods?
Supervised Learning

Learn a function f : X → Y from example input-output pairs
(x1 , y1 ), . . . , (xn , yn ) ∈ X × Y.

• Predict the temperature tomorrow based on the temperature of the
last d days. What is f ? X = Rd , Y = R. (See the sketch below.)

• Learn to diagnose whether a medical condition is present or not from
imaging data. What is f ? X = photographs, Y = {true, false}.

• Synthesize spoken audio from text. What is f ? X = text, Y = audio.

• Determine motor currents that let a robot arm reach a given position
with its hand. What is f ? X = Rd , Y = Rm .
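For the first bullet, a minimal sketch in Python/Numpy of what “learning f ”
can mean concretely; the data here are made-up stand-ins for real
temperature records, not course-provided code:

import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200                               # predict tomorrow from the last d days
X = rng.normal(10.0, 3.0, (n, d))           # toy "temperature history" inputs x_i
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(0.0, 0.5, n)    # toy targets y_i

# Choose f(x) = w^T x with w minimizing the squared error over the pairs (x_i, y_i).
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
x_new = rng.normal(10.0, 3.0, d)            # the last d days, as a vector in R^d
print("predicted temperature tomorrow:", x_new @ w_hat)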
Supervised Learning

Learn a function f : X → Y from example input-output pairs
(x1 , y1 ), . . . , (xn , yn ) ∈ X × Y.

• When Y is a discrete/finite set, we call this classification.
Examples: Y = {true, false}, Y = {cat, dog, tiger, tree},
Y = {1, 2, 3, 4, 5}. (A tiny classifier is sketched below.)

• When Y lies in Rd , we call this regression.
Examples: images, GPS coordinates, motor currents.
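A minimal classification sketch (assumed toy data, not course code): a
1-nearest-neighbor classifier, one of the simplest ways to realize f when
Y is finite:

import numpy as np

rng = np.random.default_rng(1)
# Two toy classes in R^2, so Y = {0, 1}
X_train = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y_train = np.array([0] * 50 + [1] * 50)

def f(x):
    """Predict the label of the nearest training example."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

print(f(np.array([3.5, 4.2])))              # -> 1 (closest to the second class)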
Can we also learn without labels yi ∈ Y?
Unsupervised Learning

In unsupervised learning, we assume we only have data x1 , . . . , xn , without
corresponding labels y1 , . . . , yn .

• The problem of automatically detecting classes or ‘groupings’ of data
is referred to as clustering. (See the sketch below.)

• Examples: Phoneme recognition performed by children. Determining
groups of customer behaviors, or patterns of disease in medicine.
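A minimal clustering sketch (Lloyd's k-means algorithm on assumed toy
data; the course treats clustering properly later):

import numpy as np

rng = np.random.default_rng(2)
# Two well-separated toy groups in R^2, with no labels given
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(3, 0.5, (100, 2))])

k = 2
centers = X[rng.choice(len(X), k, replace=False)]   # random initial centres
for _ in range(20):
    # Assign each point to its nearest centre ...
    labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
    # ... then move each centre to the mean of its assigned points.
    # (A robust version would also handle clusters that end up empty.)
    centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
print(centers)                                      # near (0, 0) and (3, 3)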
Unsupervised Learning

• Learning which aspects of the input data matter for a problem, either
by removing variables/aspects (feature selection) or by mapping the
data to a lower-dimensional space (dimensionality reduction). (See the
sketch below.)

• Examples: Knowing your favorite movie does not help me predict
whether it is going to rain today. Understanding the concept of a
real-world elephant from a child’s drawing. Dimensionality reduction
for visualization.
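A minimal dimensionality-reduction sketch (PCA computed via the SVD, on
assumed toy data):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))          # 200 points in R^10
Xc = X - X.mean(axis=0)                 # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                       # project onto the top-2 principal directions
print(Z.shape)                          # (200, 2): a low-dimensional view for visualization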
Some Milestones
1950s-1970s: Initial Euphoria

• Alan Turing defines the “Turing Test”

• Arthur Samuel, learning the game of checkers

• First neural network ideas, the Perceptron

• Development of nearest neighbor algorithm, statistical methods

• Extremely optimistic expectations of progress

Some Milestones
1974-1980: First AI Winter

• Real-world progress fails to match expectations

• Reduction in funding

• Neural Networks fall out of favour and are criticised by leading
researchers, e.g. Marvin Minsky

• neats vs scruffies: precise logic vs “frames and scripts”
Some Milestones

1980-1987: Recovering from the crisis

• Expert systems / knowledge bases

• Hopfield networks

• Neural network training with backpropagation

Some Milestones

1987-1993: Second AI Winter

• Again, progress did not match the expectations that had built up
Some Milestones
1993-2011: Real-World Use Cases with Limits

• Larger training data: e.g. ImageNet

• Probabilistic ML, Kernel Methods, and Bayesian Inference mature

• Accepted wisdom: Deep Neural Networks are “impossible to train”

• Moore’s law

• Real-world commercial use cases: Recommender Systems for web
shops, etc.
Some Milestones
2011-: Breakthroughs in Scale and Methodology

• Deep Learning revolution: from AlexNet to GANs, VAEs, and many
more. Starting with breakthrough applications in Computer Vision,
then extending to most areas of application, including Robotics,
Translation, Speech, etc.

• GPU and cloud infrastructure for ML matures rapidly, with a focus on
Big Data

• Transformers, ChatGPT, Diffusion Models
ML Today

• True societal and commercial impact

• Hybrid Deep Learning + X, where X is probabilistic, geometric, or
optimization methods

• Breakneck publication speed

• Mathematical foundations and explainability?

• Are we again overestimating what current methods can do?

• Concerns about ethics, data privacy, and bias in ML come to the fore

• A very exciting but challenging time to study and teach Machine Learning!

Image: IEEE Spectrum, 20 Oct 2014. See also:
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Landscape of Machine Learning?

Ingredients of Machine Learning
1. Introduction (this week)

• Understand the structure of the course
• Think about motivation for applying ML
• Definition and applications of ML
• History of ML
• Supervised vs unsupervised ML
• Revision: Linear Algebra and Probability
• Quickstart: Python/Numpy
Programming in Python

Good to refresh your knowledge of:

• Basic Python programming

• Numpy - linear algebra library

• Matplotlib - plotting library

A short refresher sketch follows below.
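The sketch touches all three (an assumed toy example, just to test your setup):

import numpy as np
import matplotlib.pyplot as plt

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)            # Numpy: solve the linear system Ax = b
print(x)

t = np.linspace(0, 2 * np.pi, 100)
plt.plot(t, np.sin(t))               # Matplotlib: plot a curve
plt.xlabel("t")
plt.ylabel("sin(t)")
plt.show()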
2. ML & Optimization

• Optimization Basics
• Gradient-based Optimization (see the sketch below)
• Constrained Optimization
• Duality and Quadratic Programming
• Other Techniques
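As a preview, a minimal sketch (an assumed example, not course material)
of gradient-based optimization: minimizing a mean squared error by
repeatedly stepping against the gradient:

import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])          # toy problem with a known solution

w = np.zeros(3)
lr = 0.1                                    # step size (learning rate)
for _ in range(500):
    grad = (2 / len(y)) * X.T @ (X @ w - y) # gradient of the mean squared error
    w -= lr * grad                          # step downhill
print(w)                                    # close to [1, -2, 0.5]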
3. ML & Generalization

• Intro to Generalization
• Generalization Basics
• Learning theory
• Generalization in Practice (previewed in the sketch below)
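As a preview, a minimal sketch (assumed toy data) of the central
phenomenon: models that fit the training data better do not necessarily
generalize better:

import numpy as np

rng = np.random.default_rng(8)
x_train = rng.uniform(-1, 1, 15)
x_test = rng.uniform(-1, 1, 200)
y_train = np.sin(3 * x_train) + rng.normal(0, 0.2, 15)
y_test = np.sin(3 * x_test) + rng.normal(0, 0.2, 200)

for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)   # fit a polynomial of this degree
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_err, 3), round(test_err, 3))
# High degrees drive the training error down while the test error
# typically rises again: overfitting.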
4. ML & Neural Networks

• Basics of Neural Networks
• Backpropagation (sketched below)
• Universal Approximation Theorem
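As a preview, a minimal sketch (assumed, not the course's reference code)
of a one-hidden-layer network trained with manual backpropagation; the
module derives these gradient rules properly:

import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X)                       # target function to fit

W1, b1 = rng.normal(0, 1, (1, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 1, (16, 1)), np.zeros(1)

lr = 0.05
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)            # forward pass: hidden layer
    y_hat = h @ W2 + b2                 # forward pass: output layer
    g = 2 * (y_hat - y) / len(X)        # dLoss/dy_hat for the mean squared error
    gW2, gb2 = h.T @ g, g.sum(axis=0)   # backward pass: chain rule, layer by layer
    gh = g @ W2.T
    gz = gh * (1 - h ** 2)              # tanh'(z) = 1 - tanh(z)^2
    gW1, gb1 = X.T @ gz, gz.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1      # gradient step on every parameter
    W2 -= lr * gW2; b2 -= lr * gb2
print(float(np.mean((y_hat - y) ** 2))) # training error after 2000 steps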
5. ML & Geometry

• Geometry and Data, Geodesic Distances (sketched below)
• Dimensionality Reduction
• Clustering
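As a preview, a minimal sketch (assumed; it uses SciPy alongside Numpy)
of approximating geodesic distances on a point cloud by shortest paths in
a k-nearest-neighbor graph, the idea behind Isomap:

import numpy as np
from scipy.sparse.csgraph import shortest_path

rng = np.random.default_rng(7)
t = rng.uniform(0, 3 * np.pi, 200)
X = np.c_[t * np.cos(t), t * np.sin(t)]           # points along a spiral "manifold"

D = np.linalg.norm(X[:, None] - X[None], axis=2)  # pairwise Euclidean distances
k = 8
W = np.zeros_like(D)                              # zero entries mean "no edge"
for i in range(len(X)):
    nn = np.argsort(D[i])[1:k + 1]                # keep edges to the k nearest points
    W[i, nn] = D[i, nn]
geo = shortest_path(W, directed=False)            # graph shortest paths ~ geodesics
print(D[0, 1], geo[0, 1])                         # geodesic can exceed straight-line distance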
6. ML & Kernel Methods

• Kernel Trick and Duality
• Kernel Design
• Kernel Regression (sketched below)
• Gaussian Processes
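As a preview, a minimal sketch (assumed toy data) of kernel ridge
regression with an RBF kernel; the module develops the kernel trick and
its dual view properly:

import numpy as np

def rbf(A, B, gamma=10.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for all pairs of rows."""
    d2 = ((A[:, None] - B[None]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(6)
X = rng.uniform(0, 1, (50, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.1, 50)

lam = 1e-3                                                 # ridge regularization strength
alpha = np.linalg.solve(rbf(X, X) + lam * np.eye(50), y)   # dual coefficients
X_test = np.linspace(0, 1, 5)[:, None]
print(rbf(X_test, X) @ alpha)                              # predictions, close to sin(2*pi*x)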
7. ML & Probability

P(A|B) = P(B|A)P(A) / P(B)

• Parametric principles and simple models
• Mixture models and EM algorithm
• Nonparametric methods

(A worked numeric example of Bayes’ rule follows below.)
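The example uses assumed, illustrative numbers: a diagnostic test with 99%
sensitivity and a 5% false-positive rate, for a condition with 1% prevalence:

p_A = 0.01                     # P(A): condition present (prevalence)
p_B_given_A = 0.99             # P(B|A): test positive given condition present
p_B_given_notA = 0.05          # P(B|not A): false-positive rate
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)   # law of total probability
p_A_given_B = p_B_given_A * p_A / p_B                  # Bayes' rule
print(round(p_A_given_B, 3))   # ~0.167: even a positive test leaves P(A|B) low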
8. ML & Information Theory

• Introduction
• Fundamentals of Information Theory
• Decision Trees (see the entropy sketch below)
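As a preview, a minimal sketch (assumed toy data) of the quantity behind
decision-tree splits: the entropy H(Y) and the information gain of
splitting on a binary feature:

import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))   # H(Y) in bits

y = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # toy class labels
x = np.array([0, 0, 1, 1, 1, 1, 0, 0])   # a binary feature to split on
h_before = entropy(y)
h_after = sum((x == v).mean() * entropy(y[x == v]) for v in (0, 1))
print("information gain:", h_before - h_after)   # > 0: the split is informative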
9. ML & Data Generation

Image: “Astronaut Riding a Horse (SDXL)”, Stable Diffusion, CC0 1.0, Author: VulcanSphere,
available on the Wikipedia Stable Diffusion article

• Similarities to and differences from classification
• Synthesis methods (exemplars, regression, GANs, flows, etc.)
• How to evaluate and improve synthesis models (a minimal generative
sketch follows below)
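As a preview, the simplest possible generative sketch (assumed toy data):
fit a Gaussian to observed data, then synthesize new samples from it; the
module's methods replace the Gaussian with far more expressive models:

import numpy as np

rng = np.random.default_rng(9)
X = rng.multivariate_normal([2.0, -1.0], [[1.0, 0.8], [0.8, 2.0]], size=500)

mu = X.mean(axis=0)                     # "training": estimate the mean ...
Sigma = np.cov(X, rowvar=False)         # ... and covariance from the data
X_new = rng.multivariate_normal(mu, Sigma, size=5)   # "synthesis": draw new data
print(X_new)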
Our Main Expected Learning Outcomes (see syllabus)

• be able to define problems in data analysis clearly

• formulate a suitable solution with machine learning and strengthen
this solution through critical and quantitative evaluation

• be well prepared to take advanced courses in machine learning.
Some ML courses to take after this course

• DD2434 Machine Learning, Advanced Course
• DD2447 Statistical Methods in Applied Computer Science
• DD2429 Probabilistic Graphical Models
• DD2437 Artificial Neural Networks and Deep Architectures
• DD2424 Deep Learning in Data Science
• EQ2341 Pattern Recognition and Machine Learning
• EL2805 Reinforcement Learning
Questions?
