Linear Algebra and Learning from Data


a note from the author

This message and this new textbook are about an established subject—linear algebra—leading to the much newer
subject of deep learning. May I express separate thoughts about those two subjects, and connect them. Two other
subjects are essential to success—statistics and optimization—and the book shows how and where they play a crucial
part.

Linear Algebra is completely accepted as basic to the undergraduate curriculum. But I don’t see that its surge
in importance is fully recognized. Multivariable algebra is far more widely used than multivariable calculus. Our
students are really missing out if our teaching is limited to matrix manipulation. It is factorization of the matrices that
we need in applications—into orthogonal and diagonal and triangular matrices.

Deep Learning is a particularly successful application to understanding data. It constructs a learning function
F(v) = w. The data vectors are v, and their meaning is w. F is constructed from a training set of known pairs v
and w. The word deep indicates that F is a composition F_L(. . . (F_1(v))) of L simple steps (the “depth” is L). Each
step involves a matrix A_i, a vector b_i, and a fixed nonlinear activation function : often F_i(v_{i-1}) = max(A_i v_{i-1} + b_i, 0), the ReLU nonlinearity applied componentwise. The matrices A_i and the vectors b_i are optimized to reproduce F(v) = w on the known training data, leading to good
accuracy on the unseen data.
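
A minimal sketch of this forward pass in NumPy. The depth, layer sizes, and random weights below are illustrative assumptions, not values from the book:

    import numpy as np

    def learning_function(v, weights, biases):
        # F(v) = F_L(...(F_1(v))) : each step is v -> max(A_i v + b_i, 0),
        # the ReLU activation applied componentwise
        for A, b in zip(weights, biases):
            v = np.maximum(A @ v + b, 0)
        return v

    rng = np.random.default_rng(0)
    sizes = [4, 8, 8, 2]      # assumed layer widths (depth L = 3)
    weights = [rng.standard_normal((m, n)) for n, m in zip(sizes, sizes[1:])]
    biases = [rng.standard_normal(m) for m in sizes[1:]]

    v = rng.standard_normal(4)                  # a data vector v
    w = learning_function(v, weights, biases)   # its computed meaning w

In training, the entries of each A_i and b_i are adjusted (by the optimization methods of Part VI) until F(v) reproduces the known w across the training set.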

This is the course I now teach : Math 18.065. And students come to it, knowing that the two subjects are important
for their future. They learn quite a lot about linear algebra, and they see how optimization finds those matrices Ai
in the learning function. Research labs and companies have data to analyze and understand, and this deep learning
approach has become widespread. Students learn key ideas from statistics, to measure the success of the learning
function F .

The course needs an instructor who wants to help. It begins with linear algebra—matrix factorizations A = QR
from Gram-Schmidt orthogonalization and S = QΛQ^T from eigenvalues and A = UΣV^T from singular values.
This is the heart of the subject and you could not teach any mathematics that is more useful.
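
As a quick illustration, the three factorizations can be computed and verified in a few lines of NumPy. The 3 by 3 matrix is a made-up example; it is symmetric, so S = A and all three factorizations apply to it:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

    Q, R = np.linalg.qr(A)           # A = QR : orthonormal Q, triangular R
    lam, Qe = np.linalg.eigh(A)      # S = QΛQ^T : eigenvalues of symmetric A
    U, sig, Vt = np.linalg.svd(A)    # A = UΣV^T : singular values

    assert np.allclose(Q @ R, A)
    assert np.allclose(Qe @ np.diag(lam) @ Qe.T, A)
    assert np.allclose(U @ np.diag(sig) @ Vt, A)

Each assert checks that the computed factors multiply back to A.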

To help instructors and students, the 2018 lectures were recorded for MIT’s OpenCourseWare. They will be on
ocw.mit.edu in mid-April. It’s now 2019 and we have the textbook and more experience with the course. I would be
happy to send you the new 2019 videos when SIAM sends you a sample copy of the book.

Gilbert Strang
Department of Mathematics, MIT



Preface and Acknowledgments vi

Part I : Highlights of Linear Algebra 1


I.1 Multiplication Ax Using Columns of A 2
I.2 Matrix-Matrix Multiplication AB 9
I.3 The Four Fundamental Subspaces 14
I.4 Elimination and A = LU 21
I.5 Orthogonal Matrices and Subspaces 29
I.6 Eigenvalues and Eigenvectors 36
I.7 Symmetric Positive Definite Matrices 44
I.8 Singular Values and Singular Vectors in the SVD 56
I.9 Principal Components and the Best Low Rank Matrix 71
I.10 Rayleigh Quotients and Generalized Eigenvalues 81
I.11 Norms of Vectors and Functions and Matrices 88
I.12 Factoring Matrices and Tensors : Positive and Sparse 97

Part II : Computations with Large Matrices 113


II.1 Numerical Linear Algebra 115
II.2 Least Squares : Four Ways 124
II.3 Three Bases for the Column Space 138
II.4 Randomized Linear Algebra 146

Part III : Low Rank and Compressed Sensing 159


III.1 Changes in A^{-1} from Changes in A 160
III.2 Interlacing Eigenvalues and Low Rank Signals 168
III.3 Rapidly Decaying Singular Values 178
III.4 Split Algorithms for ℓ^2 + ℓ^1 184
III.5 Compressed Sensing and Matrix Completion 195

Part IV : Special Matrices 203


IV.1 Fourier Transforms : Discrete and Continuous 204
IV.2 Shift Matrices and Circulant Matrices 213
IV.3 The Kronecker Product A ⊗ B 221
IV.4 Sine and Cosine Transforms from Kronecker Sums 228
IV.5 Toeplitz Matrices and Shift Invariant Filters 232
IV.6 Graphs and Laplacians and Kirchhoff’s Laws 239
IV.7 Clustering by Spectral Methods and k-means 245
IV.8 Completing Rank One Matrices 255
IV.9 The Orthogonal Procrustes Problem 257
IV.10 Distance Matrices 259
Part V : Probability and Statistics 263
V.1 Mean, Variance, and Probability 264
V.2 Probability Distributions 275
V.3 Moments, Cumulants, and Inequalities of Statistics 284
V.4 Covariance Matrices and Joint Probabilities 294
V.5 Multivariate Gaussian and Weighted Least Squares 304
V.6 Markov Chains 311

Part VI : Optimization 321


VI.1 Minimum Problems : Convexity and Newton’s Method 324
VI.2 Lagrange Multipliers = Derivatives of the Cost 333
VI.3 Linear Programming, Game Theory, and Duality 338
VI.4 Gradient Descent Toward the Minimum 344
VI.5 Stochastic Gradient Descent and ADAM 359

Part VII : Learning from Data 371


VII.1 The Construction of Deep Neural Networks 375
VII.2 Convolutional Neural Nets 387
VII.3 Backpropagation and the Chain Rule 397
VII.4 Hyperparameters : The Fateful Decisions 407
VII.5 The World of Machine Learning 413

Books on Machine Learning 416

Eigenvalues and Singular Values : Rank One 417

Codes and Algorithms for Numerical Linear Algebra 418

Counting Parameters in the Basic Factorizations 419

Index of Authors 420

Index 423

Index of Symbols 432
