
Statistics 135

Chapter 2 - Linear Algebra

Chris Drake
Department of Statistics
University of California, Davis
Vectors (deterministic)
1 definition of vector, multiplication by a constant and addition of
two vectors
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad
cx = \begin{pmatrix} cx_1 \\ cx_2 \\ \vdots \\ cx_n \end{pmatrix}, \quad
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
x + y = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix}$$
2 length or norm of a vector
$$L_x = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2} = \|x\|, \qquad L_{cx} = |c| \times L_x = |c| \cdot \|x\|$$
3 unit vector: $L_x = \|x\| = 1$
4 inner product of two vectors: $x'y = \sum_{i=1}^{n} x_i y_i$
5 angle θ between x and y is given by
$$\cos(\theta) = \frac{x'y}{\sqrt{x'x}\,\sqrt{y'y}}$$
6 def: a set of m-tuples (vectors) together with all other vectors that can be
obtained as linear combinations of the form $x = \sum_{i=1}^{k} a_i x_i$ is called
a vector space
7 def: the linear span of a set of vectors $x_1, \dots, x_k$ is the set of all linear
combinations $x = \sum_{i=1}^{k} a_i x_i$
8 def: a set of vectors $x_1, \dots, x_k$ is linearly independent if $\sum_{i=1}^{k} a_i x_i = 0$
implies $a_1 = \dots = a_k = 0$
9 def: a set of m linearly independent m-tuples (vectors) is called a
basis for the space of m-tuples.
10 fact: any vector can be expressed uniquely in terms of a given basis
11 fact: x and y are orthogonal (perpendicular) if x′y = 0
12 fact: the projection of x on y is
$$P_y = \frac{x'y}{\|y\|^2}\, y$$
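A minimal numpy sketch illustrating the norm, inner product, angle, and projection defined above; the example vectors are hypothetical, not taken from the notes.

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])   # hypothetical example vectors
y = np.array([3.0, 0.0, 4.0])

norm_x = np.sqrt(np.sum(x**2))                                  # L_x = ||x||, same as np.linalg.norm(x)
inner = x @ y                                                   # inner product x'y
cos_theta = inner / (np.linalg.norm(x) * np.linalg.norm(y))     # cos of the angle between x and y
proj = (inner / np.linalg.norm(y)**2) * y                       # projection P_y = (x'y / ||y||^2) y

print(norm_x, inner, cos_theta, proj)
```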
Matrices (deterministic)
A matrix A is an array of real numbers with n rows and p columns.
Its transpose A′ is the array with rows and columns exchanged.
$$A = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1p} \\ x_{21} & x_{22} & \dots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \dots & x_{np} \end{pmatrix}
\quad\text{and}\quad
A' = \begin{pmatrix} x_{11} & x_{21} & \dots & x_{n1} \\ x_{12} & x_{22} & \dots & x_{n2} \\ \vdots & \vdots & & \vdots \\ x_{1p} & x_{2p} & \dots & x_{np} \end{pmatrix}$$
1 the multiplication of a matrix by a constant is the element by
element multiplication by this constant, similar to a vector multi-
plication by a constant.
2 two matrices of the same dimension (n rows, p columns) are added
by summing element by element: $(A + B)_{ij} = a_{ij} + b_{ij}$.
3 the product of two matrices A·B is defined if the number of columns
of A equals the number of rows of B (say k); then $(AB)_{ij} = \sum_{l=1}^{k} a_{il} b_{lj}$ and
A · B has as many rows as A and as many columns as B (see the first numpy
sketch after this list).
4 def: a square matrix has the same number of rows and columns; a
symmetric matrix is a square matrix that equals its transpose.
5 fact: if A and B are square matrices then A · B and B · A are
both defined but need not be equal; matrix multiplication is not
commutative in general.
6 def: an inverse B for a square matrix A is defined by that fact that
B · A = A · B = I.
7 fact: a square matrix A has an inverse if the columns (rows) are
linearly independent.
8 def: a square matrix Q is called orthogonal if Q · Q′ = Q′ · Q = I.
9 def: an eigenvalue of a square matrix A is a number λ that solves
the equation Ax = λx for some nonzero vector x; the vector x is called an
eigenvector. For a symmetric matrix, the number of nonzero eigenvalues equals
the number of linearly independent columns (rows); eigenvectors belonging to
two different eigenvalues are orthogonal; if a matrix has an eigenvalue of
multiplicity > 1 the associated eigenvectors can be chosen orthogonal but are
not unique.
10 def: a square n×n matrix A is called of full rank if it has n linearly
independent columns (rows); the number of independent columns
(rows) is called the rank of A; row and column rank of a matrix
are always equal.
11 def: the spectral decomposition of a symmetric square matrix A is given by
the following: A = PΛP′ where P is orthogonal and Λ is a diagonal
matrix with
$$\Lambda = \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix}$$
The number of nonzero eigenvalues equals the rank of A (see the second numpy
sketch after this list).
12 def: an expression x′Ax with A an n × n square matrix and x a
vector (n-tuple) is called a quadratic form (note that this form is
an expression containing only quadratic terms $x_i^2$ and second order
cross terms $x_i x_j$).
13 def: a quadratic form and matrix A are called positive definite if
x′Ax > 0 for all x ̸= 0.
14 def: if x′Ax ≥ 0 for all x then A and the quadratic form are called
nonnegative definite.
15 def: the square root of a matrix A is given by $A^{1/2} = P\Lambda^{1/2}P'$
where
$$\Lambda^{1/2} = \begin{pmatrix} \sqrt{\lambda_1} & 0 & \dots & 0 \\ 0 & \sqrt{\lambda_2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sqrt{\lambda_n} \end{pmatrix}$$
16 fact:
$A^{1/2}$ is symmetric
$A^{1/2}A^{1/2} = A$
$A^{-1/2} = P\Lambda^{-1/2}P'$
17 a full rank square matrix A has inverse $A^{-1} = P\Lambda^{-1}P'$ where $\Lambda^{-1}$
has ith diagonal element $\lambda_i^{-1}$.
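A short numpy sketch (hypothetical matrices, not from the notes) illustrating the conformability requirement for the matrix product in item 3 and the shape of the result.

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)    # 2 x 3 matrix (hypothetical values)
B = np.arange(12.0).reshape(3, 4)   # 3 x 4 matrix: columns of A == rows of B

C = A @ B                           # (AB)_{ij} = sum_l a_{il} b_{lj}
print(C.shape)                      # (2, 4): as many rows as A, as many columns as B
```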
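A numpy sketch, using a small hypothetical symmetric matrix, illustrating the spectral decomposition, the eigenvalue check for positive definiteness, and the construction of $A^{1/2}$ and $A^{-1}$ described in items 11-17 above.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                       # hypothetical symmetric matrix

eigvals, P = np.linalg.eigh(A)                   # eigh is for symmetric matrices
Lam = np.diag(eigvals)

print(np.allclose(P @ Lam @ P.T, A))             # spectral decomposition A = P Λ P'
print(np.allclose(P @ P.T, np.eye(2)))           # P is orthogonal
print(np.all(eigvals > 0))                       # all eigenvalues > 0 -> A is positive definite

A_half = P @ np.diag(np.sqrt(eigvals)) @ P.T     # A^{1/2} = P Λ^{1/2} P'
A_inv  = P @ np.diag(1.0 / eigvals) @ P.T        # A^{-1}  = P Λ^{-1}  P'

print(np.allclose(A_half @ A_half, A))           # A^{1/2} A^{1/2} = A
print(np.allclose(A_inv, np.linalg.inv(A)))      # matches the usual inverse
```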
Random Vectors and Matrices
X = (X1, .., Xn)′ is a vector of random variables. Similarly, a ran-
dom matrix is a (n × p) array of random variables. The mean of
a random vector and matrix are obtained by taking expectation for
each element.
$$E(X) = \begin{pmatrix} E(X_{11}) & E(X_{12}) & \dots & E(X_{1p}) \\ E(X_{21}) & E(X_{22}) & \dots & E(X_{2p}) \\ \vdots & \vdots & & \vdots \\ E(X_{n1}) & E(X_{n2}) & \dots & E(X_{np}) \end{pmatrix}$$
1 fact: E(X + Y) = E(X) + E(Y)
2 fact: E(AXB) = AE(X)B for A, B constant matrices and X a
random matrix.
3 fact: the behavior of random vectors and matrices is generally
described by their probability distribution; generally, the variables
are not independent.
4 fact: if X is a p-variate random vector with density $f(x_1, x_2, .., x_p)$
then $X_1, .., X_p$ are independent if and only if $f(x_1, x_2, .., x_p) = f_1(x_1) \times
\dots \times f_p(x_p)$, the product of the marginal densities. Independence implies
zero correlation but the converse is not true.
5 variance of a random vector
$$\mathrm{Var}(X) = \Sigma = E(XX') - E(X)E(X)'$$
note that the (i, j)th element of the first matrix is $E(X_i X_j)$ and
of the second matrix it is $E(X_i) \times E(X_j) = \mu_i \times \mu_j$
6 the off-diagonal elements of the variance-covariance matrix are de-
noted by σij and the diagonal elements by σii.

7 the (i, j)th correlation is given by $\rho_{ij} = \sigma_{ij} / \sqrt{\sigma_{ii}\,\sigma_{jj}}$
8 if $V^{1/2}$ denotes the diagonal matrix with $\sqrt{\sigma_{ii}}$ on the diagonal and
R the correlation matrix, then $\Sigma = V^{1/2}RV^{1/2}$ (see the first numpy
sketch after this list)
9 X = (X1, X2) is a partition of X; the mean vector is partitioned
accordingly into µ = (µ1, µ2) and the variance-covariance matrix is
partitioned into
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$
Note, $\Sigma_{12} = \Sigma_{21}'$
10 linear combinations of random variables (vector notation); X a
random vector (p-tuple); c a vector of length p of constants then
c′X is a linear combination of the X ′s with mean c′µ and variance
c′Σc.
11 extended Cauchy-Schwarz inequality: b, d are p × 1 vectors and
B a positive definite matrix; then
$$(b'd)^2 \le (b'Bb)(d'B^{-1}d)$$
For B = I we get the usual Cauchy-Schwarz inequality.


12 maximization lemma: $B_{p\times p}$ positive definite and $d_{p\times 1}$ a vector;
then for an arbitrary nonzero vector x we have
$$\max_{x \ne 0} \frac{(x'd)^2}{x'Bx} = d'B^{-1}d$$
for $B_{p\times p}$ positive definite with eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p$ we
get
$$\max_{x \ne 0} \frac{x'Bx}{x'x} = \lambda_1, \qquad \min_{x \ne 0} \frac{x'Bx}{x'x} = \lambda_p$$
The maximum is achieved when x equals e1 the eigenvector asso-
ciated with λ1; similarly for the minimum.
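A numpy sketch (hypothetical µ, Σ, and c, not from the notes) illustrating the relation $\Sigma = V^{1/2}RV^{1/2}$ from item 8 and the mean c′µ and variance c′Σc of a linear combination from item 10.

```python
import numpy as np

mu = np.array([1.0, 2.0])                           # hypothetical mean vector
Sigma = np.array([[4.0, 2.0],
                  [2.0, 9.0]])                      # hypothetical covariance matrix

# correlation matrix: R = V^{-1/2} Σ V^{-1/2}, so that Σ = V^{1/2} R V^{1/2}
V_half = np.diag(np.sqrt(np.diag(Sigma)))           # standard deviations on the diagonal
V_half_inv = np.diag(1.0 / np.sqrt(np.diag(Sigma)))
R = V_half_inv @ Sigma @ V_half_inv
print(np.allclose(V_half @ R @ V_half, Sigma))      # True

# linear combination c'X: mean c'µ, variance c'Σc, checked by simulation
c = np.array([0.5, -1.0])
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mu, Sigma, size=200_000)
print(c @ mu, (X @ c).mean())                       # close
print(c @ Sigma @ c, (X @ c).var())                 # close
```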
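A quick numpy check, with hypothetical b, d, and positive definite B, of the extended Cauchy-Schwarz inequality (item 11) and the Rayleigh quotient bounds in the maximization lemma above.

```python
import numpy as np

B = np.array([[3.0, 1.0],
              [1.0, 2.0]])                          # hypothetical positive definite matrix
b = np.array([1.0, -2.0])
d = np.array([0.3, 1.0])

# extended Cauchy-Schwarz: (b'd)^2 <= (b'Bb)(d'B^{-1}d)
print((b @ d) ** 2 <= (b @ B @ b) * (d @ np.linalg.inv(B) @ d))       # True

# maximization lemma: x'Bx / x'x lies between λ_p and λ_1 for every nonzero x
eigvals, P = np.linalg.eigh(B)                      # eigenvalues in ascending order
lam_p, lam_1 = eigvals[0], eigvals[-1]
rng = np.random.default_rng(1)
xs = rng.normal(size=(10_000, 2))
ratios = np.einsum('ij,jk,ik->i', xs, B, xs) / np.einsum('ij,ij->i', xs, xs)
print(ratios.max() <= lam_1 + 1e-12, ratios.min() >= lam_p - 1e-12)   # True True
e1 = P[:, -1]                                       # eigenvector for λ_1 attains the maximum
print(np.isclose(e1 @ B @ e1 / (e1 @ e1), lam_1))   # True
```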
