
MATH 896-2021 Lecture 10 SEP 28, 2021

Gaussian random vectors

The standard normal random vector. Recall that Z is a standard normal (or
Gaussian) random variable if its pdf is given by
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad -\infty < z < \infty.$$

For brevity, we denote this distribution by N(0, 1). All other normal random variables can be obtained as linear transformations of Z. Thus, if X = µ + σZ, then X ∼ N(µ, σ²). We will use the same idea to introduce multivariate normal distributions.
Let Z1, ..., Zn be independent standard normal random variables. We define the standard normal random vector as

$$Z = \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_n \end{pmatrix}.$$

By our convention about expectation of random vectors,

$$EZ = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = 0 \quad \text{and} \quad \mathrm{Cov}\, Z = I,$$

where 0 = 0n×1 is the column vector of 0’s, and I = In×n is the identity matrix.
The distribution of the standard normal random vector Z is denoted N (0, I). Due
to independence of its components, the joint pdf of the standard normal vector Z
is given by

$$f_Z(z) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}\, e^{-z_i^2/2} = \frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\sum_{i=1}^{n} z_i^2} = \frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2} z'z} = \frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\|z\|^2}.$$

Note that here we treat the vector variable z as a column vector. Then z′z coincides with the dot product z · z and equals the squared Euclidean norm ∥z∥².
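As a quick numerical illustration (a minimal Python/NumPy sketch, not part of the lecture; the dimension, sample size, and seed are arbitrary choices), one can sample many copies of Z and check that the sample mean is close to 0 and the sample covariance is close to I:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 200_000                             # dimension n and number of samples m

# Each row of Z is one draw of the standard normal vector Z = (Z_1, ..., Z_n)'
Z = rng.standard_normal((m, n))

print(np.round(Z.mean(axis=0), 3))            # sample mean: close to the zero vector
print(np.round(np.cov(Z, rowvar=False), 3))   # sample covariance: close to the identity I
```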
Defining normal random vectors via linear transformations. Now let us
consider a linear transformation

$$Y = \mu + AZ = \mu_{n\times 1} + A_{n\times n}\, Z_{n\times 1},$$

where µ is an arbitrary n × 1 column-vector and A is a square n × n matrix (with
real entries). In a more detailed form,

$$Y_i = \mu_i + \sum_{j=1}^{n} A_{ij} Z_j.$$

Thus, as a sum of independent normal random variables (plus a constant), every component Yi has a normal distribution. Their expected values, variances, and covariances are easier to compute using matrix notation. By the rules for expected values and covariance matrices discussed in Lecture 9,

EY = µ + A · 0 = µ, Cov Y = A · Cov Z · A′ = AIA′ = AA′ := V.

We say that the random vector Y has an n-dimensional normal distribution, with the
mean vector µ and variance-covariance matrix V, and denote this as

Y ∼ N (µ, V).

Of course, for i = 1, ..., n,

$$EY_i = \mu_i, \qquad \sigma_i^2 = \mathrm{Var}\, Y_i = V_{ii}.$$
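The identities EY = µ and Cov Y = AA′ = V are easy to check empirically. The following sketch (with an illustrative µ and an invertible A chosen only for demonstration) compares the sample mean and covariance of simulated draws of Y with µ and AA′:

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0, 0.5])                # illustrative mean vector
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.3, 1.5]])                # an illustrative (invertible) matrix A

Z = rng.standard_normal((500_000, 3))
Y = mu + Z @ A.T                               # each row is one draw of Y = mu + A Z

print(np.round(Y.mean(axis=0), 2))             # close to mu
print(np.round(np.cov(Y, rowvar=False), 2))    # close to ...
print(np.round(A @ A.T, 2))                    # ... the theoretical covariance V = A A'
```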

Multivariate normal pdf. We will show next that the random vector Y has a
pdf if and only if the matrix A is invertible, and derive its pdf in the latter case.
First, let the matrix A be invertible; then V = AA′ is also invertible. In this
case, one can invert the relation

y = µ + Az

as follows:

$$z = A^{-1}(y - \mu), \quad \text{or} \quad z_i = \sum_{j=1}^{n} \left(A^{-1}\right)_{ij} (y_j - \mu_j).$$

Next, we need to find the Jacobian of this transformation:

$$J_{ij} = \frac{\partial z_i}{\partial y_j} = \left(A^{-1}\right)_{ij}, \qquad |J| = |A^{-1}| = \frac{1}{|A|} = \frac{1}{\sqrt{|V|}}.$$

Which properties of determinants have been used here? Now, by Theorem 2 from Lecture 9, the joint pdf of the transformed vector Y is given by

$$f_Y(y) = f_Z(z(y))\,|J| = \frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2} z'z}\,|J| = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}\, e^{-\frac{1}{2}\left(A^{-1}(y-\mu)\right)'\left(A^{-1}(y-\mu)\right)}$$
$$= \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}\, e^{-\frac{1}{2}(y-\mu)'\left(A^{-1}\right)' A^{-1}(y-\mu)} = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}\, e^{-\frac{1}{2}(y-\mu)'\left(A'\right)^{-1} A^{-1}(y-\mu)}$$
$$= \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}\, e^{-\frac{1}{2}(y-\mu)'\left(AA'\right)^{-1}(y-\mu)} = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}\, e^{-\frac{1}{2}(y-\mu)' V^{-1}(y-\mu)}$$
$$= \frac{1}{|2\pi V|^{1/2}}\, e^{-\frac{1}{2}(y-\mu)' V^{-1}(y-\mu)}. \tag{1}$$

Which properties of determinants and transposed matrices have been used here?
Thus, under the assumption that the matrix A is invertible, the joint pdf of Y is given by (1).
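Formula (1) can be evaluated directly and compared against an independent library implementation. The sketch below assumes SciPy is available and uses scipy.stats.multivariate_normal purely as a cross-check of the formula (the values of µ, A, and y are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])
A = np.array([[2.0, 0.0],
              [1.0, 1.5]])                     # invertible, so Y = mu + A Z has a pdf
V = A @ A.T
y = np.array([0.5, -1.0])                      # point at which to evaluate the pdf

# Formula (1): f_Y(y) = |2 pi V|^{-1/2} exp(-(y - mu)' V^{-1} (y - mu) / 2)
d = y - mu
quad = d @ np.linalg.solve(V, d)
pdf_formula = np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(2 * np.pi * V))

pdf_library = multivariate_normal(mean=mu, cov=V).pdf(y)
print(pdf_formula, pdf_library)                # the two numbers agree
```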
Now, let us show that if the matrix A is not invertible, the Gaussian vector Y cannot have a pdf. Note that the image ARⁿ of the whole n-dimensional space Rⁿ is a linear subspace L ⊂ Rⁿ, i.e., for any y1, y2 ∈ L and any real α1, α2,

α1 y1 + α2 y2 ∈ L.

Why? Certainly, L ≠ Rⁿ, since if L coincided with Rⁿ, the matrix A would be invertible, in contradiction with our assumption!
The image of Rⁿ under the transformation z → µ + Az is the so-called affine subspace µ + L. Here again µ + L ≠ Rⁿ, since the opposite would imply that L = Rⁿ.
Thus, if the random vector Y had a pdf fY(y), it would necessarily vanish everywhere outside the affine subspace µ + L. But then one would have

$$\int_{\mathbb{R}^n} f_Y(y)\, dy = \int_{\mu + L} f_Y(y)\, dy = 0, \tag{2}$$

which is impossible, since a pdf must integrate to 1.
Remark. To make better sense of (2), think of a two-dimensional integral ∬_G f(x, y) dx dy, where the region G is a straight line. Such an integral necessarily vanishes, since the area of any straight line equals 0.
From the all-important formula (1), one can derive a number of corollaries.

Corollary 1. Let Y1, ..., Yn be jointly Gaussian, with σi² = Var Yi > 0. Then the Yi are independent if and only if they are uncorrelated. Indeed, independent random variables are always uncorrelated. Now, if the Yi are uncorrelated and jointly Gaussian, then

$$V = \mathrm{Cov}\, Y = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix}, \quad \text{so that} \quad V^{-1} = \begin{pmatrix} \sigma_1^{-2} & 0 & \cdots & 0 \\ 0 & \sigma_2^{-2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^{-2} \end{pmatrix}.$$

Since

$$(y - \mu)' V^{-1} (y - \mu) = \sum_{i,j=1}^{n} (y_i - \mu_i) \left(V^{-1}\right)_{ij} (y_j - \mu_j) = \sum_{i=1}^{n} \left(V^{-1}\right)_{ii} (y_i - \mu_i)^2 = \sum_{i=1}^{n} \frac{(y_i - \mu_i)^2}{\sigma_i^2},$$

and

$$|V| = \prod_{i=1}^{n} \sigma_i^2,$$

one gets

$$f_Y(y) = \frac{1}{(2\pi)^{n/2} \left(\prod_{i=1}^{n} \sigma_i^2\right)^{1/2}}\, e^{-\frac{1}{2}\sum_{i=1}^{n} \frac{(y_i - \mu_i)^2}{\sigma_i^2}} = \prod_{i=1}^{n} \frac{1}{(2\pi\sigma_i^2)^{1/2}}\, e^{-\frac{(y_i - \mu_i)^2}{2\sigma_i^2}} = \prod_{i=1}^{n} f_{Y_i}(y_i),$$

which shows that the Yi are independent random variables with Yi ∼ N(µi, σi²), by one of our equivalent definitions of independence.
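The factorization in Corollary 1 is easy to confirm numerically for one concrete diagonal V: the joint pdf should coincide with the product of the univariate normal pdfs. A minimal sketch using SciPy (the values of µi and σi are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

mu = np.array([0.0, 1.0, -1.0])                # illustrative means
sigma = np.array([1.0, 2.0, 0.5])              # illustrative standard deviations, sigma_i > 0
V = np.diag(sigma**2)                          # uncorrelated => diagonal covariance matrix

y = np.array([0.3, 2.0, -0.7])
joint = multivariate_normal(mean=mu, cov=V).pdf(y)
product = np.prod(norm.pdf(y, loc=mu, scale=sigma))
print(joint, product)                          # equal up to rounding error
```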
One can further generalize this important property. Suppose a Gaussian random n-vector X can be partitioned as follows:

$$X_{n\times 1} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, V \right),$$

where

$$(X_1)_{p\times 1}, \quad (X_2)_{q\times 1}, \qquad p + q = n.$$

Suppose the components of X1 and X2 are mutually uncorrelated, resulting in a block-diagonal covariance matrix

$$V = \mathrm{Cov}\, X = \begin{pmatrix} \mathrm{Cov}\, X_1 & 0 \\ 0 & \mathrm{Cov}\, X_2 \end{pmatrix} = \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix}.$$
Then it is easy to verify that

$$|V| = |V_1| \cdot |V_2|, \qquad V^{-1} = \begin{pmatrix} V_1^{-1} & 0 \\ 0 & V_2^{-1} \end{pmatrix},$$

and

$$(x - \mu)' V^{-1} (x - \mu) = \sum_{i,j=1}^{n} (x_i - \mu_i) \left(V^{-1}\right)_{ij} (x_j - \mu_j)$$
$$= \sum_{i,j=1}^{p} (x_i - \mu_i) \left(V^{-1}\right)_{ij} (x_j - \mu_j) + \sum_{i,j=p+1}^{n} (x_i - \mu_i) \left(V^{-1}\right)_{ij} (x_j - \mu_j)$$
$$= (x_1 - \mu_1)' V_1^{-1} (x_1 - \mu_1) + (x_2 - \mu_2)' V_2^{-1} (x_2 - \mu_2).$$

Thus,

$$f_X(x) = \frac{1}{(2\pi)^{(p+q)/2} \left(|V_1| \cdot |V_2|\right)^{1/2}}\, e^{-\frac{1}{2}(x_1 - \mu_1)' V_1^{-1} (x_1 - \mu_1) - \frac{1}{2}(x_2 - \mu_2)' V_2^{-1} (x_2 - \mu_2)}$$
$$= \frac{1}{(2\pi)^{p/2}\,|V_1|^{1/2}}\, e^{-\frac{1}{2}(x_1 - \mu_1)' V_1^{-1} (x_1 - \mu_1)} \cdot \frac{1}{(2\pi)^{q/2}\,|V_2|^{1/2}}\, e^{-\frac{1}{2}(x_2 - \mu_2)' V_2^{-1} (x_2 - \mu_2)} = f_{X_1}(x_1)\, f_{X_2}(x_2).$$

Using the elimination rule, we find that X1 ∼ N(µ1, V1) and X2 ∼ N(µ2, V2). Since their joint pdf fX(x) factorizes into the product of their marginal pdfs, the vectors X1 and X2 are independent normal vectors. □
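The determinant and inverse identities used above can be confirmed numerically for a concrete block-diagonal V; the blocks below are just illustrative positive definite matrices:

```python
import numpy as np

V1 = np.array([[2.0, 0.5],
               [0.5, 1.0]])                    # positive definite 2x2 block
V2 = np.array([[3.0]])                         # positive definite 1x1 block
V = np.block([[V1, np.zeros((2, 1))],
              [np.zeros((1, 2)), V2]])         # block-diagonal covariance matrix

# |V| = |V1| * |V2|
print(np.isclose(np.linalg.det(V), np.linalg.det(V1) * np.linalg.det(V2)))

# V^{-1} is block-diagonal with blocks V1^{-1} and V2^{-1}
print(np.round(np.linalg.inv(V), 4))
print(np.round(np.linalg.inv(V1), 4), np.round(np.linalg.inv(V2), 4))
```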

More on covariance matrices. Obviously, any covariance matrix V = CovX = (Cov(Xi, Xj)) is a symmetric matrix. Suppose a random vector X has a pdf fX(x) in Rⁿ; then X − EX cannot lie, with probability 1, in any proper linear subspace L ⊂ Rⁿ, since such a subspace has zero volume. Hence, for any vector u ∈ Rⁿ, u ≠ 0,

$$u' V u = u' (\mathrm{Cov}\, X)\, u = E\left[\, u'(X - EX) \cdot (X - EX)' u \,\right] = E\left[ \left( u \cdot (X - EX) \right)^2 \right] > 0. \tag{3}$$

Here the equality to 0 on the right-hand side is excluded, because it would imply
that
$$u \cdot (X - EX) = \sum_{i=1}^{n} u_i (X_i - EX_i) = 0$$

almost surely, in which case X − EX would belong to the proper linear subspace L of all vectors x ∈ Rⁿ satisfying u · x = 0, in contradiction with our assumption.
Any matrix V satisfying (3) for every u ∈ Rⁿ, u ≠ 0, is said to be positive definite. It is known from Linear Algebra that any symmetric, positive definite matrix V can be represented as V = A², where A is also a symmetric and positive definite, hence invertible, matrix. This matrix is denoted A = V^{1/2}.
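The square root A = V^{1/2} can be computed from the spectral (eigenvalue) decomposition V = QΛQ′ as A = QΛ^{1/2}Q′. A minimal NumPy sketch with an illustrative V:

```python
import numpy as np

V = np.array([[4.0, 1.0],
              [1.0, 3.0]])                     # illustrative symmetric, positive definite matrix

lam, Q = np.linalg.eigh(V)                     # V = Q diag(lam) Q', with lam > 0
A = Q @ np.diag(np.sqrt(lam)) @ Q.T            # symmetric square root A = V^{1/2}

print(np.round(A @ A, 6))                      # A^2 recovers V
```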
Standardization of normal vectors. Let Y ∼ N(µ, V). Then, just as it was shown in (1) that the transformation Y = µ + AZ has the normal distribution N(µ, V), it can be demonstrated that the inverse transformation Z = A⁻¹(Y − µ) has the standard multivariate normal distribution, Z ∼ N(0, I). In other words, any normal vector Y ∼ N(µ, V) having a pdf can be represented as

Y = µ + AZ,

with a vector µ ∈ Rⁿ and an invertible n × n matrix A. This is the multivariate version of the so-called standardization. From this we immediately get the following

Corollary 2. a) Any normal vector Y ∼ N (µ, V) having a pdf is a linear function
of a standard normal random vector Z.
b) Any linear function U = Cn×1 + Bn×n Yn×1 with an invertible matrix B is also
normal (since, ultimately, it is also a linear function of a standard normal random
vector Z).
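Standardization can be illustrated by simulating Y ∼ N(µ, V) and applying Z = A⁻¹(Y − µ) with A = V^{1/2}; the recovered vector should behave like N(0, I). A sketch continuing with the illustrative V from above:

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([5.0, -3.0])
V = np.array([[4.0, 1.0],
              [1.0, 3.0]])

lam, Q = np.linalg.eigh(V)
A = Q @ np.diag(np.sqrt(lam)) @ Q.T            # A = V^{1/2}, so A A' = V

Y = mu + rng.standard_normal((300_000, 2)) @ A.T     # draws of Y ~ N(mu, V)
Z = (Y - mu) @ np.linalg.inv(A).T                    # standardization Z = A^{-1}(Y - mu)

print(np.round(Z.mean(axis=0), 3))             # close to the zero vector
print(np.round(np.cov(Z, rowvar=False), 3))    # close to the identity matrix I
```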
Orthogonal transformations. A square matrix An×n is called orthogonal, if
its row-vectors Ai· are orthonormal, i.e. mutually orthogonal and of length one:

$$A_{i\cdot} \cdot A_{j\cdot} = \begin{cases} 1, & i = j, \\ 0, & i \neq j. \end{cases}$$

The above can be written as

AA′ = I, or A−1 = A′ ,

where A′ is the transpose of the matrix A. Thus, an orthogonal matrix A is invertible, and A′ is the right inverse of A. However, as is well known from Linear Algebra, if A is invertible, there exists exactly one left-inverse matrix, exactly one right-inverse matrix, exactly one inverse, and all these matrices are the same, A⁻¹.
Thus, for an orthogonal matrix also

A′A = I,

which shows that the column-vectors of an orthogonal matrix are also orthonormal.
Orthogonal matrices have some special properties. First, such matrices preserve
lengths of vectors: if we consider the linear transformation

$$A_{n\times n}\, y_{n\times 1} = (Ay)_{n\times 1},$$

then ∥Ay∥ = ∥y∥. Indeed, by the rules of transposition,

$$\|Ay\|^2 = (Ay)'(Ay) = (y'A')(Ay) = y'(A'A)y = y'y = \|y\|^2.$$

Second, orthogonal matrices preserve the standard normal distribution. Indeed, if Z ∼ N(0, I), then, as we know, Y = AZ ∼ N(0, AA′) = N(0, I). Thus, we have proved the following
Corollary 3. Let A be an orthogonal matrix and let Z ∼ N (0, I). Then the
transformed vector Y = AZ is also a standard normal vector, Y ∼ N (0, I).
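Corollary 3 can be checked by taking a random orthogonal matrix (here obtained from a QR decomposition, an arbitrary choice of ours) and verifying that AZ still behaves like a standard normal vector and preserves lengths:

```python
import numpy as np

rng = np.random.default_rng(3)
A, _ = np.linalg.qr(rng.standard_normal((3, 3)))     # A is orthogonal: A A' = I

Z = rng.standard_normal((300_000, 3))                # rows are draws of Z ~ N(0, I)
Y = Z @ A.T                                          # rows are the transformed vectors A z

print(np.round(A @ A.T, 6))                          # identity matrix
print(np.round(np.cov(Y, rowvar=False), 3))          # close to I, as Corollary 3 predicts
print(np.allclose(np.linalg.norm(Y, axis=1), np.linalg.norm(Z, axis=1)))  # lengths preserved
```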
