APS1070 Lecture (5) Slides Annotated
Lecture 5:
• Linear Algebra
• Analytical Geometry
• Data Augmentation
Samin Aref
Mid-term assessment logistics
• Midterm Assessment: Oct 21 at 9:20 at EX 200
• Absence from the assessment or not having a student ID = a mark of 0. No excuses.
• Everything from lectures 1-5 (inclusive) is included: the slides and code for weeks 1-5,
plus the same topics from the main textbook, all Piazza posts, Project 1, Tutorials 0-2, and
Reading Assignments 1-4.
• You may practice with the sample midterm questions (solutions are available now on
Quercus) and the worked examples in the textbook and lecture slides.
Slide Attribution
These slides contain materials from various sources. Special
thanks to the following authors:
• Marc Deisenroth
• Mark Schmidt
• Jason Riordon
Agenda
➢ Linear Algebra
➢ Scalars, Vectors, Matrices
➢ Solving Systems of Linear Equations
➢ Linear Independence
➢ Linear Mappings
➢ Analytic Geometry
➢ Norms, Inner Products, Lengths, etc.
➢ Angles and Orthonormal Basis
➢ Data Augmentation
Today’s Theme: Data Processing
Part 1
Linear Algebra
Readings:
• Chapter 2.1-5 MML Textbook
Systems of Linear Equations
➢ The solution space of a system of two linear equations in two variables can be
geometrically interpreted as the intersection of two lines.
➢ With three variables, each linear equation defines a plane, and the solution space is the
intersection of those planes.
[Figure: the lines 4x₁ + 4x₂ = 5 and 2x₁ − 4x₂ = 1 intersecting at a single point in the (x₁, x₂) plane. A system in three variables has its solution at the intersection of planes; a system with 2 equations and three variables typically has a line of solutions.]
Matrix Representation
➢ Used to solve systems of linear equations more systematically
➢ Compact notation collects coefficients into vectors, and vectors
into matrices:
$$\begin{cases} a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\ \quad\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n = b_m \end{cases} \iff \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}$$
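As a minimal sketch (assuming NumPy is available), the compact form Ax = b can be solved numerically; the values below are the two-equation system from the figure above.

```python
import numpy as np

# Coefficients of 4x1 + 4x2 = 5 and 2x1 - 4x2 = 1
A = np.array([[4.0, 4.0],
              [2.0, -4.0]])
b = np.array([5.0, 1.0])

# Solve Ax = b (valid when A is square and invertible)
x = np.linalg.solve(A, b)
print(x)            # solution vector [1.  0.25]
print(A @ x - b)    # residual, should be ~0
```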
Matrix Notation
➢ A matrix has m x n elements (with 𝑚, 𝑛 ∈ ℕ, and aij, i=1,…,m; j=1,…,n)
which are ordered according to a rectangular scheme consisting of m rows
and n columns:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad a_{ij} \in \mathbb{R}$$
➢ Scalar multiplication:
$$\alpha b = \alpha \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} \alpha b_1 \\ \alpha b_2 \end{bmatrix}$$
Addition and Scalar Multiplication
➢ Matrix addition: The sum of two matrices 𝐴 ∈ ℝ𝑚×𝑛, 𝐵 ∈ ℝ𝑚×𝑛 is
defined as the element-wise sum:
$$A + B := \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$$
➢ Scalar multiplication of a matrix multiplies every element by α:
$$\alpha A = \begin{bmatrix} \alpha\, a_{11} & \cdots & \alpha\, a_{1n} \\ \vdots & & \vdots \\ \alpha\, a_{m1} & \cdots & \alpha\, a_{mn} \end{bmatrix}$$
Example: Matrix Multiplication
➢ For two matrices
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{bmatrix} \in \mathbb{R}^{2\times 3}, \quad B = \begin{bmatrix} 0 & 2 \\ 1 & -1 \\ 0 & 1 \end{bmatrix} \in \mathbb{R}^{3\times 2},$$
➢ we obtain:
$$AB = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 1 & -1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 2 & 5 \end{bmatrix} \in \mathbb{R}^{2\times 2},$$
$$BA = \begin{bmatrix} 0 & 2 \\ 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 4 & 2 \\ -2 & 0 & 2 \\ 3 & 2 & 1 \end{bmatrix} \in \mathbb{R}^{3\times 3}$$
Not commutative! 𝐴𝐵 ≠ 𝐵𝐴
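A quick NumPy check of the example above (a sketch, assuming NumPy), showing that AB and BA differ even in shape:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [3, 2, 1]])        # 2x3
B = np.array([[0, 2],
              [1, -1],
              [0, 1]])           # 3x2

print(A @ B)   # 2x2: [[2, 3], [2, 5]]
print(B @ A)   # 3x3: [[6, 4, 2], [-2, 0, 2], [3, 2, 1]]
```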
Basic Properties
➢ A few properties:
➢ Associativity:
∀𝐴 ∈ ℝ𝑚×𝑛, 𝐵 ∈ ℝ𝑛×𝑝, 𝐶 ∈ ℝ𝑝×𝑞: (𝐴𝐵)𝐶 = 𝐴(𝐵𝐶)
➢ Distributivity:
∀𝐴, 𝐵 ∈ ℝ𝑚×𝑛, 𝐶, 𝐷 ∈ ℝ𝑛×𝑝: (𝐴 + 𝐵)𝐶 = 𝐴𝐶 + 𝐵𝐶
𝐴(𝐶 + 𝐷) = 𝐴𝐶 + 𝐴𝐷
Transpose
➢ Transpose definition: For 𝐴 ∈ ℝ𝑚×𝑛 the matrix 𝐵 ∈ ℝ𝑛×𝑚 with 𝑏𝑖𝑗 =
𝑎𝑗𝑖 is called transpose of A. We write 𝐵 = 𝐴𝑇 .
➢ Symmetric Matrix: A matrix 𝐴 ∈ ℝ𝑛×𝑛 is symmetric if 𝐴 = 𝐴𝑇 .
➢ Some useful identities:
𝐴𝐴−1 = 𝐼 = 𝐴−1 𝐴
(𝐴𝐵)−1 = 𝐵 −1 𝐴−1
(𝐴 + 𝐵)−1 ≠ 𝐴−1 + 𝐵 −1
(𝐴𝑇 )𝑇 = 𝐴
(𝐴 + 𝐵)𝑇 = 𝐴𝑇 + 𝐵𝑇
(𝐴𝐵)𝑇 = 𝐵𝑇 𝐴𝑇
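A short NumPy sketch (the random matrices are chosen only for illustration; random square matrices are almost surely invertible) that numerically checks two of the identities above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# (AB)^T == B^T A^T
print(np.allclose((A @ B).T, B.T @ A.T))                  # True

# (AB)^-1 == B^-1 A^-1
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))   # True
```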
Inner Product and Outer Product
➢ The inner product between vectors 𝑥, 𝑦 ∈ ℝ𝑛 of the same length is the scalar
$$x^T y = \sum_{i=1}^{n} x_i y_i$$
➢ The outer product between 𝑥 ∈ ℝ𝑚 and 𝑦 ∈ ℝ𝑛 is the 𝑚 × 𝑛 matrix 𝑥𝑦ᵀ with entries (𝑥𝑦ᵀ)ᵢⱼ = 𝑥ᵢ𝑦ⱼ.
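A minimal NumPy illustration of both products (the example vectors are made up):

```python
import numpy as np

x = np.array([1.0, 7.0])
y = np.array([3.0, 5.0])

inner = x @ y            # scalar: 1*3 + 7*5 = 38
outer = np.outer(x, y)   # 2x2 matrix with entries x_i * y_j

print(inner)
print(outer)
```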
Identity Matrix
➢ We define the identity matrix as shown:
$$I_n := \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n \times n}$$
➢ Any matrix multiplied by the identity will not change the matrix:
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1+0+0 & 0+2+0 & 0+0+3 \\ 4+0+0 & 0+5+0 & 0+0+6 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$
Inverse
➢ If square matrices 𝐴 ∈ ℝ𝑛×𝑛 and 𝐵 ∈ ℝ𝑛×𝑛 have the property that
𝐴𝐵 = 𝐼𝑛 = 𝐵𝐴, then B is called the inverse of A and is denoted by A⁻¹.
$$A = \begin{bmatrix} 1 & 2 & 1 \\ 4 & 4 & 5 \\ 6 & 7 & 7 \end{bmatrix}, \quad B = A^{-1} = \begin{bmatrix} -7 & -7 & 6 \\ 2 & 1 & -1 \\ 4 & 5 & -4 \end{bmatrix}$$
➢ Row echelon form of an augmented matrix:
$$\left[\begin{array}{ccc|c} 2 & 1 & 1 & 5 \\ 0 & -8 & -2 & -12 \\ 0 & 0 & 1 & 2 \end{array}\right]$$
Example: Reduced Row Echelon Form
➢ We can simplify this even further. The row echelon form corresponds to the system
$$2x_1 + x_2 + x_3 = 5, \qquad -8x_2 - 2x_3 = -12, \qquad x_3 = 2$$
$$\left[\begin{array}{ccc|c} 2 & 1 & 1 & 5 \\ 0 & -8 & -2 & -12 \\ 0 & 0 & 1 & 2 \end{array}\right]$$
➢ Solving with the inverse:
$$Ax = b$$
$$A^{-1}Ax = A^{-1}b$$
$$I_n x = A^{-1}b$$
$$x = A^{-1}b$$
Note that 𝐴⁻¹ cancels 𝐴 only if multiplied from the left-hand side; otherwise we would have 𝐴𝑥𝐴⁻¹.
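A sketch (assuming NumPy) that solves the triangular system above by back-substitution and cross-checks the answer with x = A⁻¹b:

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [0.0, -8.0, -2.0],
              [0.0, 0.0, 1.0]])
b = np.array([5.0, -12.0, 2.0])

# Back-substitution on the row echelon form: solve from the last row upward
x = np.zeros(3)
for i in range(2, -1, -1):
    x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
print(x)                       # [1. 1. 2.]

# Same answer via the inverse (left-multiplying by A^-1)
print(np.linalg.inv(A) @ b)    # [1. 1. 2.]
```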
Calculating an Inverse Matrix
➢ To determine the inverse of a matrix A, we augment it with the identity, [A | Iₙ], and apply Gaussian elimination until the left block becomes Iₙ; the right block is then A⁻¹. For example:
$$A = \begin{bmatrix} 1 & 0 & 2 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$
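As a sketch (assuming NumPy), the inverse of the 4×4 matrix A above can be obtained numerically rather than by hand-eliminating [A | I₄]:

```python
import numpy as np

A = np.array([[1, 0, 2, 0],
              [1, 1, 0, 0],
              [1, 2, 0, 1],
              [1, 1, 1, 1]], dtype=float)

A_inv = np.linalg.inv(A)
print(A_inv)
print(np.allclose(A @ A_inv, np.eye(4)))   # True: A A^-1 = I_4
```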
What can go wrong?
➢ Applying Gaussian Elimination (row reduction) does not always lead
to a solution.
➢ Singular case: when we cannot obtain a nonzero pivot in some column. Such a
matrix is not invertible.
➢ For example:
$$\left[\begin{array}{ccc|c} 2 & 1 & 1 & 1 \\ 0 & 0 & 3 & -2 \\ 0 & 0 & 4 & 2 \end{array}\right] \quad \text{and} \quad \left[\begin{array}{ccc|c} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 0 \end{array}\right] \quad \text{(both singular)}$$
[Figure: row picture vs. column picture of Ax = b. By rows, the solution x = [1.7, 2.2]ᵀ lies at the intersection of the hyperplanes from row 1 and row 2. By columns, b = 1.7·Column1 + 2.2·Column2, so x = [1.7, 2.2]ᵀ.]
What can go wrong?
➢ By rows: [Figure: hyperplanes from row 1 and row 2.]
➢ By columns: [Figure: columns of the matrix and their column space.]
Infinite solution
➢ By columns: [Figure: columns of the matrix; when the columns are linearly dependent, b can be written as a combination of them in infinitely many ways.]
Solutions to Ax=b
➢ Q: In general, when does Ax = b have a unique solution?
[Figure: a vector b that is not in the column space of A (no solution).]
Linear Dependence
➢ A vector is linearly dependent on a set of vectors if it can be written as a
linear combination of them:
c = α₁b₁ + α₂b₂ + … + αₙbₙ
➢ We say that c is “linearly dependent” on {b₁, b₂, …, bₙ}, and that the set
{c, b₁, b₂, …, bₙ} is “linearly dependent”.
Linear Independence
➢ A set of vectors is either linearly dependent or linearly independent.
➢ If the vectors are independent, then there is no way to represent any of
the vectors as a combination of the others.
Linear Dependence vs Independence
➢ Independence in R2:
[Figure: four pairs of vectors in ℝ², each pair labeled as dependent or independent.]
Linear Independence
➢ Consider a set of three vectors {x₁, x₂, x₃} ⊆ ℝ⁴:
$$x_1 = \begin{bmatrix} 1 \\ 2 \\ -3 \\ 4 \end{bmatrix}, \quad x_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 2 \end{bmatrix}, \quad x_3 = \begin{bmatrix} -1 \\ -2 \\ 1 \\ 1 \end{bmatrix}$$
➢ To check whether they are linearly dependent, we write the vectors xᵢ, i = 1, 2, 3, as the columns of a matrix and bring it to row echelon form: the vectors are linearly independent if and only if every column is a pivot column.
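A sketch using NumPy: stack the vectors as columns and check the rank (full column rank is equivalent to linear independence):

```python
import numpy as np

x1 = np.array([1, 2, -3, 4])
x2 = np.array([1, 1, 0, 2])
x3 = np.array([-1, -2, 1, 1])

M = np.column_stack([x1, x2, x3])      # 4x3 matrix with the vectors as columns
rank = np.linalg.matrix_rank(M)
print(rank)                            # 3 -> all three columns are pivot columns
print("independent" if rank == M.shape[1] else "dependent")
```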
Basis
➢ Basis in a vector space V ⊆ ℝ²: every linearly independent set of vectors that spans V is called a basis of V.
[Figure: example sets of vectors in ℝ², labeled “basis” and “not a basis”.]
Linear Mapping/Transformation
➢ A vector has different coordinate representations
depending on which coordinate system or basis is
chosen.
[Figure: the same vector x shown with respect to two different bases, (e₁, e₂) and (b₁, b₂).]
➢ Example (change of basis, source: Eli Bendersky): in the basis v we are given
[u₁]_v = [2, 3]ᵀ, [u₂]_v = [4, 5]ᵀ, [f]_v = [2, 4]ᵀ.
What is [f]_u, the representation of f in the basis (u₁, u₂)?
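A sketch (assuming NumPy) of the change-of-basis question above: find coefficients c with c₁[u₁]_v + c₂[u₂]_v = [f]_v, i.e. solve a small linear system.

```python
import numpy as np

U = np.array([[2.0, 4.0],      # columns are [u1]_v and [u2]_v
              [3.0, 5.0]])
f_v = np.array([2.0, 4.0])     # [f]_v

f_u = np.linalg.solve(U, f_v)  # coordinates of f in the basis (u1, u2)
print(f_u)                     # [ 3. -1.]  ->  f = 3*u1 - 1*u2
```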
Examples of Transforms
➢Scale ➢Rotation ➢Horizontal Mirror
Source: mathisfun.com
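A minimal sketch of these transforms as 2×2 matrices acting on points (NumPy assumed; the angle, scale factors, and sample points are made up for illustration):

```python
import numpy as np

theta = np.pi / 4                               # 45-degree rotation
scale = np.array([[2.0, 0.0],
                  [0.0, 0.5]])                  # scale x by 2, y by 0.5
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
mirror_h = np.array([[-1.0, 0.0],
                     [0.0, 1.0]])               # mirror across the vertical axis (x -> -x)

points = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]]).T               # 2x3: each column is a point

print(scale @ points)
print(rotation @ points)
print(mirror_h @ points)
```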
Part 2
Analytical Geometry
Readings:
• Chapter 3.1-5,8,9 MML Textbook
Norms
➢ A norm is a scalar measure of a vector’s length.
➢ The most important norm is the Euclidean norm and for 𝑥 ∈
ℝ𝑛 is defined as:
$$\|x\| = \|x\|_2 := \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x^T x}$$
[Figure: unit balls in ℝ²: the set ‖x‖₁ = 1 (a diamond) and the set ‖x‖₂ = 1 (a circle).]
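A small NumPy sketch computing the two norms shown above (the example vector is made up):

```python
import numpy as np

x = np.array([3.0, -4.0])

l2 = np.linalg.norm(x)           # Euclidean norm: sqrt(9 + 16) = 5
l2_alt = np.sqrt(x @ x)          # same value via sqrt(x^T x)
l1 = np.linalg.norm(x, ord=1)    # Manhattan norm: |3| + |-4| = 7

print(l2, l2_alt, l1)
```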
Dot product
➢ Dot product:
$$x^T y = \sum_{i=1}^{n} x_i y_i$$
$$a_1 \cdot b_1 = \begin{bmatrix} 1 \\ 7 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 5 \end{bmatrix} = 1 \cdot 3 + 7 \cdot 5 = 38$$
Lengths and Distances
➢ Consider an inner product space.
➢ Then
$$d(x, y) := \|x - y\| = \sqrt{\langle x - y, x - y \rangle}$$
is called the distance between x and y.
Angles
➢ The angle 𝜽 between two vectors 𝒙, 𝒚 is computed using the inner product:
$$\cos \theta = \frac{\langle x, y \rangle}{\|x\|\,\|y\|}$$
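A sketch (NumPy assumed, example vectors made up) combining the last two slides: distance and angle from the inner product.

```python
import numpy as np

x = np.array([1.0, 7.0])
y = np.array([3.0, 5.0])

dist = np.linalg.norm(x - y)                          # d(x, y) = ||x - y||
cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))      # angle in radians

print(dist, np.degrees(theta))
```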
Orthonormal Basis
➢ In n-dimensional space we need n linearly independent basis vectors. If these
vectors are also pairwise orthogonal and each has length 1, we get a special
case: an orthonormal basis.
➢ Consider an n-dimensional vector space V and a basis 𝑏1 , … , 𝑏𝑛 of V. If
$$\langle b_i, b_j \rangle = 0 \ \text{for}\ i \neq j, \qquad \langle b_i, b_i \rangle = 1$$
for all 𝑖, 𝑗 = 1, …, 𝑛, then the basis is called an orthonormal basis (ONB). Note
that ⟨𝑏ᵢ, 𝑏ᵢ⟩ = 1 implies that every basis vector has length/norm 1.
Orthogonal Projections
➢ Projection of 𝒙 onto a one-dimensional subspace U spanned by 𝒃:
$$\pi_U(x) = \lambda b = \frac{b^T x}{\|b\|^2}\, b = \frac{b b^T}{\|b\|^2}\, x$$
Example: Orthogonal Projections
$$\pi_U(x) = \frac{b b^T}{\|b\|^2}\, x$$
[Figure: points in ℝ² projected onto the line spanned by b.]
Projection Matrix
➢ We can also use a projection matrix 𝑃𝜋, which allows us to
project any vector x onto the subspace U:
$$\pi_U(x) = P_\pi\, x, \qquad P_\pi = \frac{b b^T}{\|b\|^2}$$
Example: Applying Projection Matrix
$$P_\pi = \frac{b b^T}{\|b\|^2}$$
[Figure: applying the projection matrix to points in ℝ², projecting them onto the line spanned by b.]
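A sketch (NumPy assumed; b and x are made up) that builds P_π = bbᵀ/‖b‖² and applies it:

```python
import numpy as np

b = np.array([1.0, 2.0])        # direction spanning the subspace U
x = np.array([3.0, 1.0])        # vector to project

P = np.outer(b, b) / (b @ b)    # projection matrix b b^T / ||b||^2
x_proj = P @ x                  # pi_U(x)

print(P)
print(x_proj)                   # [1. 2.]
print(np.allclose(P @ P, P))    # True: projection matrices are idempotent
```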
Part 3
Data Augmentation
What do these datasets have in common?
Non-Representative Data
➢ Everything our algorithms learn comes from
the data used to train them.
Capacity and Training
➢ Deep learning algorithms have the
capacity to classify real images in
various orientations and scales.
Data Augmentation
➢ Use linear algebra to perform common transformations (e.g., rotation, scaling,
mirroring) and create additional training examples.
➢ Advanced: generative models (e.g., deep learning) to create new images with
similar characteristics.
[Figure: GAN-generated fake celebrity faces. Source: kaggle.com]
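A minimal sketch of a few common augmentations (NumPy only; the “image” below is a made-up random array standing in for real pixel data):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32))                          # stand-in for a grayscale training image

flipped = np.fliplr(image)                            # horizontal mirror
rotated = np.rot90(image)                             # 90-degree rotation
noisy = image + rng.normal(0.0, 0.05, image.shape)    # additive Gaussian noise

augmented = [image, flipped, rotated, noisy]          # extra training examples from one image
print(len(augmented), [a.shape for a in augmented])
```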
Test Time Data Augmentation
➢ You can also apply data augmentation to better
evaluate your performance on test examples.
➢ It is a great way to assess the limitations of your model on
images with different rotations, scales, noise, etc.
Next Time
➢ Week 6 labs on Wednesday and Thursday: Q&A sessions before the midterm
➢ Week 6 sessions on Friday: midterm, no lectures
Google Colab