Lecture Booklet v01
J.W. Grizzle
Fall 2015
Notation:
Direct Proofs: We derive a result by applying the rules of logic to the given assumptions, definitions, axioms, and (already) known theorems.
Example:
Def. An integer n is even if n = 2k for some integer k; it is odd if n = 2k + 1 for some integer k. Prove that the sum of two odd integers is even. (A worked version of the direct proof follows the notation below.)
For the companion claim "n² is even ⇒ n is even," set:
p: n² is even, ∼p: n² is odd
q: n is even, ∼q: n is odd
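A one-line direct proof of the first claim, written out:
$$m = 2j + 1,\quad n = 2k + 1 \;\Longrightarrow\; m + n = 2j + 2k + 2 = 2(j + k + 1),$$
which is even because j + k + 1 is an integer.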
Proof by Exhaustion: Reduce the proof to a finite number of cases, and then prove each case separately.
Proofs by Induction:
Example:
Claim: For all n ≥ 1, 1 + 3 + 5 + · · · + (2n − 1) = n².
Proof: Base case (n = 1): 1 = 1². Induction step: suppose 1 + 3 + · · · + (2k − 1) = k² for some k ≥ 1. Adding the next odd number gives 1 + 3 + · · · + (2k − 1) + (2(k + 1) − 1) = k² + (2(k + 1) − 1).
But,
$$k^2 + (2(k+1) - 1) = k^2 + 2k + 2 - 1 = k^2 + 2k + 1 = (k+1)^2,$$
which is what we wanted to show.
Example:
Def.: A natural number n is composite if it can be factored as n = a · b, where a and b are natural numbers satisfying 1 < a, b < n. Otherwise, n is prime.
Claim: Every natural number n ≥ 2 can be written as a product of primes.
Proof (by strong induction):
Base Case: The number 2 can be written as the product of a single prime.
a = p1 · p2 · · · pi, for some primes pi
b = q1 · q2 · · · qj, for some primes qj
Hence, a · b = (p1 · p2 · · · pi) · (q1 · q2 · · · qj) is a product of primes.
Proof by Contradiction: To prove p, assume ∼p and derive a contradiction, meaning some statement that is obviously false, such as "1 + 1 = 3". More formally, show that ∼p implies both R and ∼R for some statement R; since that is impossible, ∼p must be false, and hence p is true!
Example: Prove that √2 is an irrational number.
Proof by Contradiction: Assume √2 is rational.
Write √2 = m/n, where m and n are integers with no common factors. Then m² = 2n², so m² is even; by the fact above, m is even: m = 2k for some integer k.
∴ 2n² = m² = (2k)² = 4k²
∴ n² = 2k² ⇒ n² is even ⇒ n is even
Conclusion: m and n have 2 as a common factor. This contradicts m and n having no common factors.
Hence, √2 is not a rational number.
∴ √2 must be irrational.
Explanation:
p: √2 is irrational.
We start with the assumption that (∼p:) √2 is a rational number and derive a contradiction, so ∼p must be false.
∴ p is true.
• Proof by Contraposition: to prove p ⇒ q, assume ∼q and arrive at ∼p.
• Proof by Contradiction: assume p ∧ (∼q).
(Assume p is true and q is false. Find that both R and ∼R are true, which is a contradiction.)
Negating a Statement:
Examples:
p: x ≥ 0    ∼p: x < 0
Exercise: Let y ∈ R, …
Answer:
Def.
Def: Field: (Chen, 2nd edition, page 8): A field consists of a set, denoted by F, of elements called scalars and two operations called addition + and multiplication ·; the two operations are defined over F such that they satisfy the following conditions: closure (α + β and α · β are in F); associativity and commutativity of both operations; distributivity of · over +; existence of an additive identity 0 and a multiplicative identity 1; and existence of an additive inverse −α for every α and a multiplicative inverse α⁻¹ for every α ≠ 0.
Examples: R, C, Q. Non-examples: the integers Z (no multiplicative inverse for 2), the natural numbers N.
Def: Vector Space (Linear Space) (Chen 2nd Edition, page 9): A linear space over a field F, denoted by (X, F), consists of a set, denoted by X, of elements called vectors, a field F, and two operations called vector addition and scalar multiplication. The two operations are defined over X and F such that they satisfy all the following conditions: closure; associativity and commutativity of vector addition, with an identity 0 and inverses −x; and compatibility of scalar multiplication: α(βx) = (αβ)x, 1 · x = x, α(x + y) = αx + αy, and (α + β)x = αx + βx.
Examples:
1. Every field forms a vector space over itself: (F, F). Examples: (R, R), (C, C), (Q, Q).
2. X = C, F = R: (C, R).
3. F = R, D ⊂ R (examples: D = [a, b]; D = (0, ∞); D = R) and X = {f : D → R} = {functions from D to R}.
For f, g ∈ X, define f + g ∈ X by ∀t ∈ D, (f + g)(t) := f(t) + g(t), and for α ∈ R, define α · f ∈ X by ∀t ∈ D, (α · f)(t) := α · f(t).
4. Let F be a field and define Fⁿ, the set of n-tuples written as columns:
$$\mathcal{X} = F^n = \left\{ \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix} \;\middle|\; \alpha_i \in F,\ 1 \le i \le n \right\}$$
Vector Addition: $\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix} + \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix} = \begin{bmatrix} \alpha_1 + \beta_1 \\ \vdots \\ \alpha_n + \beta_n \end{bmatrix}$
Scalar Multiplication: $\alpha \cdot x = \begin{bmatrix} \alpha x_1 \\ \vdots \\ \alpha x_n \end{bmatrix}$
5. X = F^{n×m} = {n × m matrices with coefficients in F}
Non-examples:
Def.: Let Y ⊂ X. (Y, F) is a subspace if (Y, F) is itself a vector space.
Non-example:
X = R², F = R,
$$Y = \left\{ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \mathbb{R}^2 \;\middle|\; x_1 + x_2 = 3 \right\}.$$
Let $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in Y$ and $\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \in Y$. Then $\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix} \notin Y$ because $x_1 + y_1 + x_2 + y_2 = 6 \ne 3$.
Therefore, x + y ∉ Y, which means that this space is not closed under vector addition! Thus, it is not a subspace!
Something of the form $\sum_{k=1}^{\infty} \alpha_k v^k$ is not a linear combination because it is not finite.
Suppose α₁ ≠ 0. Then
$$\alpha_1 v^1 = -\alpha_2 v^2 - \alpha_3 v^3 - \cdots - \alpha_k v^k$$
$$v^1 = -\frac{\alpha_2}{\alpha_1} v^2 - \frac{\alpha_3}{\alpha_1} v^3 - \cdots - \frac{\alpha_k}{\alpha_1} v^k$$
p(t) = α₀ + α₁t + · · · + αₙtⁿ
0 = p(0) ⟺ α₀ = 0
$$0 = \left.\frac{dp(t)}{dt}\right|_{t=0} = \left(\alpha_1 + 2\alpha_2 t + \cdots + n\alpha_n t^{n-1}\right)\Big|_{t=0} \iff \alpha_1 = 0$$
Etc.
Example: Let X = {2 × 3 matrices with real coefficients}. Let
$$v^1 = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix},\quad v^2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\quad v^3 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},\quad v^4 = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$
{v¹, v²} is a linearly independent set:
$$\alpha_1 v^1 + \alpha_2 v^2 = 0 \iff \begin{bmatrix} \alpha_1 & 0 & 0 \\ 2\alpha_1 & 0 & 0 \end{bmatrix} + \begin{bmatrix} \alpha_2 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \iff \alpha_1 = \alpha_2 = 0.$$
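A quick numerical sanity check of this example (a sketch using numpy; each 2 × 3 matrix is flattened into a vector of R⁶ so that linear independence becomes a rank computation):

```python
import numpy as np

v1 = np.array([[1, 0, 0],
               [2, 0, 0]]).flatten()
v2 = np.array([[1, 0, 0],
               [0, 0, 0]]).flatten()

V = np.column_stack([v1, v2])      # 6 x 2 matrix of flattened vectors
print(np.linalg.matrix_rank(V))    # 2  =>  {v1, v2} is linearly independent
```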
• B is linearly independent.
• span{B} = X .
Example: (Fⁿ, F) where F is R or C. Let
$$e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix},\;\ldots,\; e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.$$
{e₁, e₂, . . . , eₙ} is both linearly independent and its span is Fⁿ.
∴ It is a basis. It is called the Natural Basis.
Moreover, {e₁, e₂, . . . , eₙ, je₁, je₂, . . . , jeₙ} is a basis for Cⁿ in (Cⁿ, R). However, it is not a basis for Cⁿ in (Cⁿ, C).
Let
$$v^1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\quad v^2 = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 0 \end{bmatrix},\;\ldots,\; v^n = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$
{v¹, v², . . . , vⁿ} is also a basis for (Fⁿ, F), where F is R or C.
Def. Let n > 0 be an integer. The vector space (X, F) has finite dimension n if the maximal number of linearly independent vectors in X is n.
Examples:
dim(Fⁿ, F) = n
dim(Cⁿ, R) = 2n
dim(P(t), R) = ∞
Basis 1: $v^1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\; v^2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\; v^3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\; v^4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$
Basis 2: $w^1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\; w^2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},\; w^3 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix},\; w^4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$
$$x = \begin{bmatrix} 5 & 3 \\ 1 & 4 \end{bmatrix} = 5w^1 + 2w^2 + 1w^3 + 4w^4$$
Therefore, $[x]_w = \begin{bmatrix} 5 \\ 2 \\ 1 \\ 4 \end{bmatrix} \in \mathbb{R}^4$.
Easy Facts:
Change of Basis Matrix: Let {u¹, · · · , uⁿ} and {ū¹, · · · , ūⁿ} be two bases for (X, F). Is there a relation between [x]_u and [x]_ū?
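A sketch of the relation, consistent with the computation that follows: write x = ᾱ₁ū¹ + · · · + ᾱₙūⁿ and expand each ūⁱ in the u-basis, so that the i-th column of P̄ is P̄ᵢ = [ūⁱ]_u. Then
$$[x]_u = \bar{P}\,[x]_{\bar{u}}, \qquad [x]_{\bar{u}} = P\,[x]_u, \qquad P = \bar{P}^{-1}.$$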
We choose to compute P̄:
$$\bar{P}_1 = [\bar{u}^1]_u = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},\quad \bar{P}_2 = [\bar{u}^2]_u = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix},\quad \bar{P}_3 = [\bar{u}^3]_u = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix},\quad \bar{P}_4 = [\bar{u}^4]_u = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$
Therefore,
$$\bar{P} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad\text{and}\quad P = \bar{P}^{-1}.$$
Example: $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, det(λI − A) = λ² + 1 = 0.
Therefore, the eigenvalues are λ₁ = j, λ₂ = −j.
To find eigenvectors, we need to solve (A − λᵢI)vⁱ = 0.
The eigenvectors are $v^1 = \begin{bmatrix} 1 \\ j \end{bmatrix},\; v^2 = \begin{bmatrix} 1 \\ -j \end{bmatrix}$.
Note that both eigenvalues and eigenvectors occur in complex conjugate pairs.
Proof: We prove the contrapositive and show there is a repeated e-value (λᵢ = λⱼ for some i ≠ j).
Because vⁱ is an e-vector,
$$(A - \lambda_j I)v^i = Av^i - \lambda_j v^i = \lambda_i v^i - \lambda_j v^i = (\lambda_i - \lambda_j)v^i.$$
Equivalently,
L(x + z) = L(x) + L(z)
L(αx) = αL(x)
Example:
Using linearity,
$$L(x) = L(\alpha_1 u^1 + \cdots + \alpha_m u^m) = \alpha_1 L(u^1) + \cdots + \alpha_m L(u^m) = \alpha_1 A_1 + \cdots + \alpha_m A_m = \big[A_1 | A_2 | \cdots | A_m\big] \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix} = A\,[x]_{\{u^1,\cdots,u^m\}}$$
$$\therefore\ [L(x)]_{\{v^1,\cdots,v^n\}} = A\,[x]_{\{u^1,\cdots,u^m\}}$$
Example: L = d/dt on polynomials of degree ≤ 3, in the basis {1, t, t², t³}:
$$A_1 = [L(1)]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},\quad A_2 = [L(t)]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},\quad A_3 = [L(t^2)]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} 0 \\ 2 \\ 0 \\ 0 \end{bmatrix},\quad A_4 = [L(t^3)]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} 0 \\ 0 \\ 3 \\ 0 \end{bmatrix}$$
and thus
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
p(t) = a₀ + a₁t + a₂t² + a₃t³
and
$$[p(t)]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}$$
$$A\,[p(t)]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} a_1 \\ 2a_2 \\ 3a_3 \\ 0 \end{bmatrix}$$
$$\frac{d}{dt}p(t) = a_1 + 2a_2 t + 3a_3 t^2 \quad\Longrightarrow\quad \left[\frac{d}{dt}p(t)\right]_{\{1,t,t^2,t^3\}} = \begin{bmatrix} a_1 \\ 2a_2 \\ 3a_3 \\ 0 \end{bmatrix}$$
$$\therefore\ A\,[p(t)]_{\{1,t,t^2,t^3\}} = \left[\frac{d}{dt}p(t)\right]_{\{1,t,t^2,t^3\}}.$$
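A short numerical check of this matrix representation (a numpy sketch; coefficients are ordered as in the basis {1, t, t², t³}):

```python
import numpy as np

# Matrix of d/dt on polynomials of degree <= 3 in the basis {1, t, t^2, t^3}
A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]])

a = np.array([5.0, -1.0, 2.0, 4.0])   # p(t) = 5 - t + 2t^2 + 4t^3
print(A @ a)                          # [-1, 4, 12, 0] <-> p'(t) = -1 + 4t + 12t^2
```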
Normed Spaces:
Let the field F be R or C.
Def. A function ‖·‖ : X → R is a norm if it satisfies: (i) ‖x‖ ≥ 0, with ‖x‖ = 0 ⇔ x = 0; (ii) ‖αx‖ = |α| ‖x‖; and (iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality).
Examples:
1. F = R or C, X = Fⁿ.
i) $\|x\|_2 = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2}$, two norm, Euclidean norm
ii) $\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$, 1 ≤ p < ∞, p-norm
2. X = {f : [a, b] → R}:
i) $\|f\|_2 = \left( \int_a^b |f(t)|^2\, dt \right)^{1/2}$
ii) $\|f\|_p = \left( \int_a^b |f(t)|^p\, dt \right)^{1/p}$, 1 ≤ p < ∞
iii) $\|f\|_\infty = \max_{a \le t \le b} |f(t)|$, which is also written $\|f\|_\infty = \sup_{a \le t \le b} |f(t)|$
Important questions:
Remarks:
Examples:
a) (Cⁿ, C): $\langle x, y \rangle = x^\top \bar{y} = \sum_{i=1}^{n} x_i \bar{y}_i$
b) (Rⁿ, R): $\langle x, y \rangle = x^\top y = \sum_{i=1}^{n} x_i y_i$
Therefore, we can conclude that |⟨x, y⟩|² ≤ ⟨x, x⟩⟨y, y⟩ ⇒ |⟨x, y⟩| ≤ ⟨x, x⟩^{1/2} ⟨y, y⟩^{1/2}.
Orthogonal Bases
Proof: (For F = R) We only check the triangle inequality ‖x + y‖ ≤ ‖x‖ + ‖y‖, which is equivalent to showing ‖x + y‖² ≤ ‖x‖² + ‖y‖² + 2‖x‖ · ‖y‖:
$$\|x+y\|^2 = \langle x+y,\, x+y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle = \|x\|^2 + \|y\|^2 + 2\langle x, y \rangle \le \|x\|^2 + \|y\|^2 + 2|\langle x, y \rangle| \le \|x\|^2 + \|y\|^2 + 2\|x\| \cdot \|y\|,$$
where the last step is the Cauchy-Schwarz inequality.
Def.
Remark:
For x ≠ 0, x/‖x‖ has norm 1:
$$\left\| \frac{x}{\|x\|} \right\| = \frac{1}{\|x\|} \cdot \|x\| = 1$$
Remarks:
Proof:
Claim 1: If m₀ ∈ M satisfies ‖x − m₀‖ = d(x, M), then x − m₀ ⊥ M.
Proof: (By contrapositive) Assume x − m₀ ⊥̸ M; we will find m₁ ∈ M such that ‖x − m₁‖ < ‖x − m₀‖.
Suppose x − m₀ ⊥̸ M. Hence, ∃m ∈ M such that ⟨x − m₀, m⟩ ≠ 0. We know m ≠ 0, and hence we define m̃ = m/‖m‖ ∈ M.
Define δ := ⟨x − m₀, m̃⟩ ≠ 0 and m₁ = m₀ + δm̃. ∴ m₁ ∈ M, and
$$\|x - m_1\|^2 = \|x - m_0 - \delta \tilde{m}\|^2 = \langle x - m_0 - \delta \tilde{m},\, x - m_0 - \delta \tilde{m} \rangle = \langle x - m_0,\, x - m_0 \rangle - \delta \underbrace{\langle x - m_0, \tilde{m} \rangle}_{\delta} - \delta \underbrace{\langle \tilde{m}, x - m_0 \rangle}_{\delta} + \delta^2 \underbrace{\langle \tilde{m}, \tilde{m} \rangle}_{=1} = \|x - m_0\|^2 - \delta^2$$
∴ ‖x − m₁‖² < ‖x − m₀‖².
Step 1: v¹ = y¹.
Remark: v¹ ≠ 0 because {y¹, . . . , yⁿ} is linearly independent.
Step 1:
v¹ = y¹
Step 2:
v² = y² − a₂₁v¹
$$\langle v^2, v^1 \rangle = 0 \iff a_{21} = \frac{\langle y^2, v^1 \rangle}{\|v^1\|^2}$$
Step 3:
v³ = y³ − a₃₁v¹ − a₃₂v²
Choose the coefficients such that ⟨v³, v¹⟩ = 0 and ⟨v³, v²⟩ = 0:
$$0 = \langle v^3, v^1 \rangle = \langle y^3, v^1 \rangle - a_{31} \langle v^1, v^1 \rangle - a_{32} \underbrace{\langle v^2, v^1 \rangle}_{=0}$$
$$0 = \langle v^3, v^2 \rangle = \langle y^3, v^2 \rangle - a_{31} \underbrace{\langle v^1, v^2 \rangle}_{=0} - a_{32} \langle v^2, v^2 \rangle$$
$$\therefore\ a_{31} = \frac{\langle y^3, v^1 \rangle}{\|v^1\|^2}, \qquad a_{32} = \frac{\langle y^3, v^2 \rangle}{\|v^2\|^2}$$
Therefore, we can conclude that
$$v^k = y^k - \sum_{j=1}^{k-1} \frac{\langle y^k, v^j \rangle}{\|v^j\|^2}\, v^j.$$
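This recursion transcribes directly into code (a minimal numpy sketch, without normalization; the vectors yᵏ are the rows of Y):

```python
import numpy as np

def gram_schmidt(Y):
    """Orthogonalize the rows of Y: v_k = y_k - sum_j <y_k, v_j>/||v_j||^2 v_j."""
    V = []
    for y in Y:
        v = y.astype(float).copy()
        for w in V:
            v -= (np.dot(y, w) / np.dot(w, w)) * w
        V.append(v)
    return np.array(V)

Y = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
V = gram_schmidt(Y)
print(np.round(V @ V.T, 10))   # off-diagonal entries are 0: rows are orthogonal
```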
Intermediate Facts
Proposition: Let (X, F) be an n-dimensional vector space and let {v¹, · · · , vᵏ} be a linearly independent set with 0 < k < n. Then ∃vᵏ⁺¹ such that {v¹, · · · , vᵏ, vᵏ⁺¹} is linearly independent.
Let M be a subspace of X. Then
X = M ⊕ M⊥.
Proof: If x ∈ M ∩ M⊥, then ⟨x, x⟩ = 0 ⇔ x = 0. Hence, M ∩ M⊥ = {0}.
Why? Take an orthogonal basis {v¹, · · · , vᵏ} of M, extended to an orthogonal basis {v¹, · · · , vⁿ} of X, and write
x = α₁v¹ + · · · + αₖvᵏ + αₖ₊₁vᵏ⁺¹ + · · · + αₙvⁿ.
x ⊥ M ⇔ ⟨x, vⁱ⟩ = 0 for 1 ≤ i ≤ k, and
$$\langle x, v^i \rangle = \alpha_1 \underbrace{\langle v^1, v^i \rangle}_{=0} + \cdots + \alpha_i \langle v^i, v^i \rangle + \cdots + \alpha_n \underbrace{\langle v^n, v^i \rangle}_{=0} = \alpha_i \langle v^i, v^i \rangle = \alpha_i \|v^i\|^2$$
∴ x ⊥ M ⇔ x = αₖ₊₁vᵏ⁺¹ + · · · + αₙvⁿ ⇔ x ∈ span{vᵏ⁺¹, · · · , vⁿ}.
∴ x ∈ M⊥ ⇔ x ∈ span{vᵏ⁺¹, · · · , vⁿ}.
Projection Theorem
Moreover, m0 is characterized by x − m0 ⊥ M .
(a) x − m₀ ⊥ M.
(b) ∃m̃ ∈ M⊥ such that x = m₀ + m̃.
(c) ‖x − m₀‖ = d(x, M) = inf_{m∈M} ‖x − m‖.
Normal Equations
Let X be a finite-dimensional (real) inner product space and M = span{y¹, · · · , yᵏ}, with {y¹, · · · , yᵏ} linearly independent. Given x ∈ X, seek x̂ ∈ M such that ‖x − x̂‖ = d(x, M).
Remark: One solution is Gram-Schmidt and the orthogonal projection operator. We provide an alternative way to compute the answer. Write
x̂ = α₁y¹ + α₂y² + · · · + αₖyᵏ
and impose x − x̂ ⊥ M ⇔ x − x̂ ⊥ yⁱ, 1 ≤ i ≤ k:
⟨x̂, yⁱ⟩ = ⟨x, yⁱ⟩, i = 1, 2, · · · , k
⇔ ⟨α₁y¹ + α₂y² + · · · + αₖyᵏ, yⁱ⟩ = ⟨x, yⁱ⟩, i = 1, 2, · · · , k.
Writing this out for each i, the k-th equation is
α₁⟨y¹, yᵏ⟩ + α₂⟨y², yᵏ⟩ + · · · + αₖ⟨yᵏ, yᵏ⟩ = ⟨x, yᵏ⟩.
Def. (Gram matrix)
$$G = G(y^1, \cdots, y^k) = \begin{bmatrix} \langle y^1, y^1 \rangle & \langle y^1, y^2 \rangle & \cdots & \langle y^1, y^k \rangle \\ \langle y^2, y^1 \rangle & \langle y^2, y^2 \rangle & \cdots & \langle y^2, y^k \rangle \\ \vdots & \vdots & & \vdots \\ \langle y^k, y^1 \rangle & \langle y^k, y^2 \rangle & \cdots & \langle y^k, y^k \rangle \end{bmatrix}$$
where
$$\beta_i = \langle x, y^i \rangle, \qquad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}.$$
x̂ = α₁y¹ + α₂y² + · · · + αₖyᵏ
Gᵀα = β
G_{ij} = ⟨yⁱ, yʲ⟩
βᵢ = ⟨x, yⁱ⟩.
Solution:
$$\mathcal{X} = \mathbb{R}^n,\ F = \mathbb{R},\quad \langle x, y \rangle = x^\top y = y^\top x = \sum_{i=1}^{n} x_i y_i$$
Therefore,
$$\|x\|^2 = \langle x, x \rangle = \sum_{i=1}^{n} |x_i|^2.$$
Write
A = [A₁ | A₂ | · · · | Aₘ] and α = [α₁, α₂, · · · , αₘ]ᵀ,
and we note that
Aα = α₁A₁ + α₂A₂ + · · · + αₘAₘ.
Aside:
$$A^\top = \begin{bmatrix} A_1^\top \\ A_2^\top \\ \vdots \\ A_m^\top \end{bmatrix}, \qquad A = [A_1 | \cdots | A_m]$$
G = Gᵀ = AᵀA
(Aᵀb)ᵢ = Aᵢᵀb
From our construction of the normal equations, Gᵀα = 0 if, and only if,
⟨α₁y¹ + α₂y² + · · · + αₖyᵏ, yⁱ⟩ = 0, i = 1, 2, · · · , k.
This is equivalent to
α₁y¹ + α₂y² + · · · + αₖyᵏ ⊥ yⁱ, i = 1, 2, · · · , k,
which is equivalent to
α₁y¹ + α₂y² + · · · + αₖyᵏ ⊥ span{y¹, · · · , yᵏ} =: M,
and thus
α₁y¹ + α₂y² + · · · + αₖyᵏ ∈ M ∩ M⊥ = {0}. Because {y¹, · · · , yᵏ} is linearly independent, this forces
α₁ = α₂ = · · · = αₖ = 0.
Symmetric Matrices
where ⟨x, y⟩ = xᵀȳ and ‖x‖² = ⟨x, x⟩ = xᵀx̄ = x̄ᵀx. Because ‖v‖² ≠ 0, we deduce that λ = λ̄, proving the result.
Claim 2: Eigenvectors of A = Aᵀ corresponding to distinct eigenvalues are orthogonal, with ⟨x, y⟩ = xᵀy.
Proof: Av¹ = λ₁v¹. Take the transpose of both sides, and use A = Aᵀ. Then,
(v¹)ᵀA = λ₁(v¹)ᵀ
(v¹)ᵀAv² = λ₁(v¹)ᵀv²
(v¹)ᵀλ₂v² = λ₁(v¹)ᵀv²
(λ₁ − λ₂)(v¹)ᵀv² = 0
λ₁ ≠ λ₂ ⇒ (v¹)ᵀv² = 0.
Claim 3: Suppose the eigenvalues of A are all distinct. Then there exists an orthogonal matrix Q such that QᵀAQ = Λ = diag(λ₁, · · · , λₙ).
Useful Observation: Let A be an m × n real matrix. Then both AᵀA and AAᵀ are symmetric, and hence their eigenvalues are real.
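A numerical illustration of Claim 3 (a numpy sketch; eigh is numpy's symmetric eigensolver and returns an orthogonal Q):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # A = A^T

lam, Q = np.linalg.eigh(A)          # real eigenvalues, orthogonal Q
print(np.round(Q.T @ Q, 10))        # identity
print(np.round(Q.T @ A @ Q, 10))    # diag(lambda_1, lambda_2)
```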
Quadratic Forms
Exercise: For M a real matrix,
$$M = \underbrace{\frac{M + M^\top}{2}}_{\text{symmetric}} + \underbrace{\frac{M - M^\top}{2}}_{\text{skew symmetric}}.$$
Def. (M + Mᵀ)/2 is the symmetric part of M.
Exercise: $x^\top M x = x^\top \left( \frac{M + M^\top}{2} \right) x$.
Notation: P > 0 means P is positive definite. (This does not mean all entries of P are positive.)
Proof:
Claim 1: P positive definite ⇒ all eigenvalues of P are greater than 0.
Proof: Let λ ∈ R, Px = λx, x ≠ 0 (λ is an eigenvalue of P). Then we have
xᵀPx = xᵀλx = λ‖x‖² > 0,
and since ‖x‖ > 0, it follows that λ > 0.
Normalizing the eigenvector, we may take xᵀx = 1, and then
$$x^\top P x \ge \min_{x \in \mathbb{R}^n,\, \|x\| = 1} x^\top P x = \lambda_{\min}(P).$$
Exercise: Show
$$P = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} > 0.$$
Writing P = QΛQᵀ with Q orthogonal and Λ = diag(λ₁, λ₂) > 0, define N = Λ^{1/2}Qᵀ.
∴ NᵀN = P.
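Such an N is not unique; a convenient way to produce one is the Cholesky factorization (a numpy sketch; cholesky returns a lower-triangular L with LLᵀ = P, so N = Lᵀ works):

```python
import numpy as np

P = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

L = np.linalg.cholesky(P)      # exists because P > 0; L is lower triangular
N = L.T
print(np.round(N.T @ N, 10))   # recovers P
```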
Theorem (Schur complement test): Let $M = \begin{bmatrix} A & B \\ B^\top & C \end{bmatrix}$ be real and symmetric. Then the following are equivalent:
1. M > 0.
2. A > 0, and C − BᵀA⁻¹B > 0.
3. C > 0, and A − BC⁻¹Bᵀ > 0.
For an arbitrary $\begin{bmatrix} x \\ y \end{bmatrix}$, define x̄ = x + A⁻¹By.
Note that $\begin{bmatrix} x \\ y \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix} \iff \begin{bmatrix} \bar{x} \\ y \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix}$.
$$\begin{bmatrix} x \\ y \end{bmatrix}^\top M \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \bar{x} - A^{-1}By \\ y \end{bmatrix}^\top M \begin{bmatrix} \bar{x} - A^{-1}By \\ y \end{bmatrix} = \begin{bmatrix} \bar{x} \\ 0 \end{bmatrix}^\top M \begin{bmatrix} \bar{x} \\ 0 \end{bmatrix} + \begin{bmatrix} -A^{-1}By \\ y \end{bmatrix}^\top M \begin{bmatrix} -A^{-1}By \\ y \end{bmatrix} + 2 \begin{bmatrix} \bar{x} \\ 0 \end{bmatrix}^\top M \begin{bmatrix} -A^{-1}By \\ y \end{bmatrix} = \bar{x}^\top A \bar{x} + y^\top (C - B^\top A^{-1} B) y + 0 > 0.$$
Normal Equations:
x̂ = α̂₁A₁ + α̂₂A₂ + · · · + α̂ₘAₘ
Gᵀα̂ = β, with G = Gᵀ
[Gᵀ]ᵢⱼ = [G]ᵢⱼ = ⟨Aᵢ, Aⱼ⟩ = AᵢᵀQAⱼ = [AᵀQA]ᵢⱼ
βᵢ = ⟨b, Aᵢ⟩ = bᵀQAᵢ = AᵢᵀQb = [AᵀQb]ᵢ.
Model:
yᵢ = Cᵢx + eᵢ, i = 1, 2, 3, · · ·
Cᵢ ∈ R^{m×n}
i = time index
x = an unknown constant vector ∈ Rⁿ
yᵢ = measurements ∈ Rᵐ
eᵢ = model "mismatch" ∈ Rᵐ
Solution:
$$\hat{x}_k := \operatorname*{argmin}_{x \in \mathbb{R}^n} \left( \sum_{i=1}^{k} (y_i - C_i x)^\top S_i (y_i - C_i x) \right) = \operatorname*{argmin}_{x \in \mathbb{R}^n} \left( \sum_{i=1}^{k} e_i^\top S_i e_i \right)$$
where Sᵢ is an m × m positive definite matrix (Sᵢ > 0 for all time indices i).
Batch Solution:
$$Y_k = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{bmatrix}, \quad A_k = \begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_k \end{bmatrix}, \quad E_k = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_k \end{bmatrix}$$
$$R_k = \begin{bmatrix} S_1 & & & \\ & S_2 & & \\ & & \ddots & \\ & & & S_k \end{bmatrix} = \operatorname{diag}(S_1, S_2, \cdots, S_k) > 0$$
Yₖ = Aₖx + Eₖ  [model for 1 ≤ i ≤ k]
‖Yₖ − Aₖx‖² = ‖Eₖ‖² := EₖᵀRₖEₖ
Since x̂ₖ is the value minimizing the error ‖Eₖ‖, which is the unexplained part of the model,
$$\hat{x}_k = \operatorname*{argmin}_{x \in \mathbb{R}^n} \|E_k\| = \operatorname*{argmin}_{x \in \mathbb{R}^n} \|Y_k - A_k x\|.$$
Goal: Find a recursive means to compute x̂ₖ₊₁ in terms of x̂ₖ and the new measurement yₖ₊₁!
The normal equations give
$$\left( \sum_{i=1}^{k} C_i^\top S_i C_i \right) \hat{x}_k = \sum_{i=1}^{k} C_i^\top S_i y_i.$$
We define
$$Q_k = \sum_{i=1}^{k} C_i^\top S_i C_i$$
so that
$$Q_{k+1} = Q_k + C_{k+1}^\top S_{k+1} C_{k+1}.$$
At time k + 1,
$$\underbrace{\left( \sum_{i=1}^{k+1} C_i^\top S_i C_i \right)}_{Q_{k+1}} \hat{x}_{k+1} = \sum_{i=1}^{k+1} C_i^\top S_i y_i$$
or
$$Q_{k+1} \hat{x}_{k+1} = \underbrace{\sum_{i=1}^{k} C_i^\top S_i y_i}_{Q_k \hat{x}_k} + C_{k+1}^\top S_{k+1} y_{k+1}.$$
Continuing,
$$\hat{x}_{k+1} = Q_{k+1}^{-1} \left( Q_k \hat{x}_k + C_{k+1}^\top S_{k+1} y_{k+1} \right).$$
Because
$$Q_k = Q_{k+1} - C_{k+1}^\top S_{k+1} C_{k+1},$$
we have
$$\hat{x}_{k+1} = \hat{x}_k + \underbrace{Q_{k+1}^{-1} C_{k+1}^\top S_{k+1}}_{\text{Kalman gain}} \underbrace{\left( y_{k+1} - C_{k+1} \hat{x}_k \right)}_{\text{Innovations}}.$$
Applying the Matrix Inversion Lemma yields a recursion for Qₖ⁻¹. Upon defining
Pₖ = Qₖ⁻¹,
we have
$$P_{k+1} = P_k - P_k C_{k+1}^\top \left( C_{k+1} P_k C_{k+1}^\top + S_{k+1}^{-1} \right)^{-1} C_{k+1} P_k.$$
We note that we are now inverting a matrix that is m × m, instead of one that is n × n. Typically, n > m, sometimes by a lot!
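The full recursion collected into code (a minimal numpy sketch of recursive weighted least squares; x̂ and P must be initialized, e.g., from a small batch solve):

```python
import numpy as np

def rls_update(x_hat, P, C, S, y):
    """One step of recursive (weighted) least squares.

    x_hat : current estimate x_hat_k, shape (n,)
    P     : P_k = Q_k^{-1}, shape (n, n)
    C, S  : C_{k+1} (m, n) and weight S_{k+1} > 0 (m, m)
    y     : new measurement y_{k+1}, shape (m,)
    """
    # Only an m x m matrix is inverted, exactly as in the P_{k+1} recursion.
    M = C @ P @ C.T + np.linalg.inv(S)
    K = P @ C.T @ np.linalg.inv(M)          # Kalman gain
    x_new = x_hat + K @ (y - C @ x_hat)     # innovations update
    P_new = P - K @ C @ P                   # P_{k+1}
    return x_new, P_new
```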
Overdetermined Equation:
Let Ax = b, where x ∈ Rⁿ, b ∈ Rᵐ, A is m × n, n < m, and rank(A) = n. Then we conclude that x̂ = (AᵀSA)⁻¹AᵀSb, where x̂ = argmin_{x∈Rⁿ} ‖Ax − b‖_S.
Underdetermined Equation:
Let Ax = b, where x ∈ Rⁿ, b ∈ Rᵐ, A is m × n, n > m, and rank(A) = m. In other words, we are assuming the rows of A are linearly independent instead of the columns of A.
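For completeness, the standard minimum-norm answer in this case (stated here without the derivation):
$$\hat{x} = \operatorname*{argmin}_{Ax = b} \|x\|_S = S^{-1} A^\top \left( A S^{-1} A^\top \right)^{-1} b,$$
which reduces to x̂ = Aᵀ(AAᵀ)⁻¹b when S = I.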
Aside:
(v + w)ᵀ(v + w) = vᵀv + wᵀw + vᵀw + wᵀv = ‖v‖² + ‖w‖² + 2vᵀw (because vᵀw is a scalar).
$$= \sum_{i=1}^{n} \|k_i^\top\|_Q^2$$
Remarks:
• Comparing Weighted Least Squares to BLUE, we see that they are identical
when the weighting matrix is taken as the inverse of the covariance matrix
of the noise term: S = Q−1 .
• Another way to say this: if you solve a least squares problem with weight matrix S, you are implicitly assuming that your uncertainty in the measurements has zero mean and a covariance matrix of Q = S⁻¹.
• If you know the uncertainty has zero mean and a covariance matrix of Q, using S = Q⁻¹ makes a lot of sense! For simplicity, assume that Q is diagonal. A large entry of Q means high variance, which means the measurement is highly uncertain. Hence, the corresponding component of y should not be weighted very much in the optimization problem... and indeed, taking S = Q⁻¹ does just that, because the weight term S is small for large terms in Q.
• The inverse of the covariance matrix is sometimes called the information
matrix. Hence, there is low information when the variance (or covariance)
is large!
• Wow! We do all this abstract math, and the answer makes sense!
Stochastic assumptions:
Remark: E{xεᵀ} = 0 implies that the states and noise are uncorrelated.
Recall that uncorrelated does NOT imply independence, except for Gaussian
random variables.
F = R,
X = span{x₁, x₂, . . . , xₙ, ε₁, ε₂, . . . , εₘ},
where
$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \quad\text{and}\quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_m \end{bmatrix}$$
M = span{y₁, y₂, . . . , yₘ} ⊂ X (measurements),
$$y_i = C_i x + \varepsilon_i = \sum_{j=1}^{n} C_{ij} x_j + \varepsilon_i, \quad 1 \le i \le m \quad (i\text{-th row of } y)$$
G = CPCᵀ + Q.
Gᵀα̂ = β
[CPCᵀ + Q] α̂ = CPᵢ
α̂ = [CPCᵀ + Q]⁻¹ CPᵢ
Remarks:
4. BLUE vs MVE
Solution to Exercise
(x̂ − x)(x̂ − x)ᵀ = (KC − I)xxᵀ(KC − I)ᵀ + KεεᵀKᵀ − 2(KC − I)xεᵀKᵀ
Solution to MIL
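For reference, the Matrix Inversion Lemma states that, whenever the indicated inverses exist,¹
$$(A + BCD)^{-1} = A^{-1} - A^{-1} B \left( C^{-1} + D A^{-1} B \right)^{-1} D A^{-1}.$$
The computation below verifies the equivalent identity used in the estimation formulas.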
Hence,
$$\begin{aligned}
[C^\top Q^{-1} C + P^{-1}]^{-1} C^\top Q^{-1} &= P C^\top Q^{-1} - P C^\top [Q + C P C^\top]^{-1} C P C^\top Q^{-1} \\
&= P C^\top \left( I - [Q + C P C^\top]^{-1} C P C^\top \right) Q^{-1} \\
&= P C^\top \left( [Q + C P C^\top]^{-1} [Q + C P C^\top] - [Q + C P C^\top]^{-1} C P C^\top \right) Q^{-1} \\
&= P C^\top [Q + C P C^\top]^{-1} \left( [Q + C P C^\top] - C P C^\top \right) Q^{-1} \\
&= P C^\top [Q + C P C^\top]^{-1} [Q]\, Q^{-1} \\
&= P C^\top [Q + C P C^\top]^{-1}
\end{aligned}$$
¹ The sizes are such that the matrix products and sum in A + BCD make sense.
Matrix Factorizations
Notes:
1) QᵀQ = I_{n×n}
2) [R]ᵢⱼ = 0 for i > j (R is upper triangular):
$$R = \begin{bmatrix} r_{11} & \cdots & \cdots & r_{1n} \\ & r_{22} & \cdots & \vdots \\ & & \ddots & \vdots \\ 0 & & & r_{nn} \end{bmatrix}$$
3) Columns of A linearly independent ⇔ R is invertible
Utility of QR Decomposition:
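One standard use is least squares (a sketch, assuming A = QR with linearly independent columns):
$$A^\top A \hat{x} = A^\top b \;\Rightarrow\; R^\top Q^\top Q R\, \hat{x} = R^\top Q^\top b \;\Rightarrow\; R\, \hat{x} = Q^\top b,$$
which is solved by back-substitution because R is upper triangular and invertible.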
Computation of QR Factorization:
Gram-Schmidt with Normalization:
A = [A₁|A₂| · · · |Aₙ], Aᵢ ∈ Rᵐ, ⟨x, y⟩ = xᵀy.
For 1 ≤ k ≤ n, map {A₁, A₂, · · · , Aₙ} → {v¹, v², · · · , vⁿ} by
$$v^1 = \frac{A_1}{\|A_1\|}; \qquad v^2 = A_2 - \langle A_2, v^1 \rangle v^1,\quad v^2 = \frac{v^2}{\|v^2\|}; \qquad \ldots$$
$$v^k = A_k - \langle A_k, v^1 \rangle v^1 - \langle A_k, v^2 \rangle v^2 - \cdots - \langle A_k, v^{k-1} \rangle v^{k-1},\quad v^k = \frac{v^k}{\|v^k\|}.$$
In pseudocode:
For k = 1 : n
    vᵏ = Aₖ
    For j = 1 : k − 1
        vᵏ = vᵏ − ⟨Aₖ, vʲ⟩vʲ
    End
    vᵏ = vᵏ / ‖vᵏ‖
End
Q = [v¹|v²| · · · |vⁿ] has orthonormal columns, and hence QᵀQ = I_{n×n} because [QᵀQ]ᵢⱼ = ⟨vⁱ, vʲ⟩ = δᵢⱼ.
What about R? Since Aᵢ ∈ span{v¹, · · · , vⁱ},
$$A_i = \langle A_i, v^1 \rangle v^1 + \langle A_i, v^2 \rangle v^2 + \cdots + \langle A_i, v^i \rangle v^i.$$
We define
$$R_i = \begin{bmatrix} \langle A_i, v^1 \rangle \\ \vdots \\ \langle A_i, v^i \rangle \\ 0 \\ \vdots \\ 0 \end{bmatrix},$$
where the entries of Rᵢ are 0 from the (i+1)-th element to the n-th element.
∴ QRᵢ = Aᵢ ⇔ QR = A
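A quick check against a library implementation (numpy's qr; the two factorizations may differ by signs of the columns of Q and rows of R):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

Q, R = np.linalg.qr(A)          # reduced QR: Q is 3x2, R is 2x2 upper triangular
print(np.round(Q.T @ Q, 10))    # identity
print(np.round(Q @ R - A, 10))  # zero matrix
```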
Lecture 17
For the SVD A = UΣVᴴ:
a) m > n: $\Sigma = \begin{bmatrix} S \\ 0 \end{bmatrix}$, where S is an n × n diagonal matrix;
b) m < n: $\Sigma = \begin{bmatrix} S & 0 \end{bmatrix}$, where S is an m × m diagonal matrix.
Projection! Notice how Ỹ₂ gets multiplied by 0 in the last line above. Here we are throwing away the orthogonal parts. We decomposed Y into a part in the column span of A, namely Ỹ₁, and a part not in the span, Ỹ₂.
Ax = Y
⇒ AᵀAx̂ = AᵀY
⇒ Q₂S²Q₂ᵀx̂ = Q₂SỸ₁
∴ x̂ = Q₂S⁻¹Ỹ₁
Remarks:
• Only S⁻¹ scales.
$$A = \begin{cases} U \begin{bmatrix} \Sigma \\ 0 \end{bmatrix} V^H, & m > n \\ U \begin{bmatrix} \Sigma & 0 \end{bmatrix} V^H, & m < n \end{cases}$$
Fact: The numerical rank of A is the number of singular values that are larger
than a given threshold. Often the threshold is chosen as a percentage of the
largest singular value.
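A sketch of this rank test in numpy (the threshold, 1% of the largest singular value, is an assumed choice):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.000001],   # nearly a multiple of row 1
              [1.0, 0.0, 1.0]])

s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
tol = 0.01 * s[0]                         # threshold: 1% of largest
numerical_rank = int(np.sum(s > tol))
print(s, numerical_rank)                  # rank 2 at this tolerance
```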
1 Random Variables
I will assume the definitions of a probability space, a set of events, and a random variable are known. My scanned lecture notes are attached at the end of this handout.
Given: (Ω, F , P ) a probability space
X : Ω → R random variable
2 Random Vectors
X : Ω → Rp with p = n + m
3 Conditioning
4 Moments
Suppose g : Rp → R
$$E\{g(X)\} = \int_{\mathbb{R}^p} g(x) f_X(x)\, dx = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \ldots, x_p) f_X(x_1, \ldots, x_p)\, dx_1 \cdots dx_p$$
Covariance Matrices
cov(X) = cov(X, X) = E{(X − µ)(X − µ)T }
where
(X − µ) is p × 1, (X − µ)T is 1 × p, (X − µ)(X − µ)T is p × p
If we have X decomposed in blocks
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \qquad X_1 : \Omega \to \mathbb{R}^n, \quad X_2 : \Omega \to \mathbb{R}^m,$$
we may compute
cov(X₁, X₂) = E{(X₁ − µ₁)(X₂ − µ₂)ᵀ}
where
(X₁ − µ₁) is n × 1, (X₂ − µ₂)ᵀ is 1 × m, and (X₁ − µ₁)(X₂ − µ₂)ᵀ is n × m.
$$P(A \cap B_\varepsilon) = \int_{-\infty}^{x_1} \int_{x_2 - \varepsilon}^{x_2 + \varepsilon} f_{X_1 X_2}(\bar{x}_1, \bar{x}_2)\, d\bar{x}_2\, d\bar{x}_1$$
$$P(B_\varepsilon) = \int_{x_2 - \varepsilon}^{x_2 + \varepsilon} f_{X_2}(\bar{x}_2)\, d\bar{x}_2$$
$$F_{X_1 | X_2}(x_1 \mid x_2) = \frac{P(A \cap B_\varepsilon)}{P(B_\varepsilon)} = \frac{\int_{-\infty}^{x_1} \int_{x_2-\varepsilon}^{x_2+\varepsilon} f_{X_1 X_2}(\bar{x}_1, \bar{x}_2)\, d\bar{x}_2\, d\bar{x}_1}{\int_{x_2-\varepsilon}^{x_2+\varepsilon} f_{X_2}(\bar{x}_2)\, d\bar{x}_2}, \quad \varepsilon \text{ small}$$
Lecture 19: There was no lecture on this day.
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$$
where X₁ ∈ Rⁿ and X₂ ∈ Rᵐ, and let p = n + m.
Then, the distribution function
Conditioning:
Conditional Density:
$$f_{X_1 | X_2} = \frac{f_{X_1 X_2}(x_1, x_2)}{f_{X_2}(x_2)}$$
Sometimes, it is convenient to write f (x1 |x2 ).
all functions g : Rᵐ → Rⁿ.
Luenberger Observers
xk+1 = Axk
yk = Cxk
Question 1: When can we reconstruct the initial condition x₀ from the measurements y₀, y₁, y₂, . . . ?
y₀ = Cx₀
y₁ = Cx₁ = CAx₀
y₂ = Cx₂ = CAx₁ = CA²x₀
⋮
yₖ = CAᵏx₀
$$\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_k \end{bmatrix} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^k \end{bmatrix} x_0$$
We note that if $\operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^k \end{bmatrix} = n$, then we can determine x₀ uniquely on the basis of the measurements. Moreover, by the Cayley-Hamilton Theorem,
$$\operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = \operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{k} \end{bmatrix} \quad \text{for all } k \ge n - 1.$$
Theorem: $\operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = n$ means that we can determine x₀ uniquely from the measurements. (This is called the Kalman observability rank condition.)
eₖ₊₁ = (A − LC)eₖ
$$\operatorname{rank} \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = n = \dim(x).$$
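The rank condition in code (a numpy sketch with an illustrative A and C, not from the lecture):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]

# Observability matrix O = [C; CA; ...; CA^{n-1}]
O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
print(np.linalg.matrix_rank(O) == n)   # True: (A, C) is observable
```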
1. Reason to choose one gain over the other: Optimality of the estimate when
you know the noise statistics.
Lecture 22
Real Analysis
Recall:
Def.
Examples:
1. (R², ‖·‖₂): Euclidean norm
2. (R², ‖·‖₁): one norm
3. ‖·‖∞: ‖x‖∞ = max_{1≤i≤n} |xᵢ|
Corollary:
Def.
P̊ = {p ∈ P | p is an interior point} = {p ∈ P | ∃ε > 0 such that B_ε(p) ⊂ P}
Example:
Def.
2.
Example:
Proposition:
x ∈ X: x ∈ P̄ ⇔ d(x, P) = 0.
x ∈ X: x ∈ P̊ ⇔ d(x, ∼P) > 0.
Proposition:
P is closed ⇔ P = P̄.
P is open ⇔ P = P̊.
Proposition:
P is closed ⇔ ∼ P is open.
P is open ⇔ ∼ P is closed.
Proof:
$$\sim(\mathring{P}) = \{x \in \mathcal{X} \mid d(x, \sim P) = 0\} = \overline{\sim P}.$$
Hence P is open ⇔ P = P̊ ⇔ ∼P = ∼(P̊) = cl(∼P) ⇔ ∼P is closed.
Sequence
1. ‖xₙ‖ → ‖x‖
2. supₙ ‖xₙ‖ < ∞ (the sequence is bounded)
3. If xₙ → y, then y = x (limits are unique).
Proof:
3. $\|x - y\| = \|x - x_n + x_n - y\| \le \|x - x_n\| + \|x_n - y\| \xrightarrow{n \to \infty} 0.$
Example:
X = {f : [0, 1] → R | f continuous}, with ‖f‖₁ = ∫₀¹ |f(τ)| dτ.
Define a sequence as follows:
$$f_n(t) = \begin{cases} 0 & 0 \le t \le \frac{1}{2} - \frac{1}{n} \\ 1 + n\left(t - \frac{1}{2}\right) & \frac{1}{2} - \frac{1}{n} \le t \le \frac{1}{2} \\ 1 & t \ge \frac{1}{2} \end{cases}$$
Then $\|f_n - f_m\|_1 = \frac{1}{2}\left| \frac{1}{n} - \frac{1}{m} \right| \to 0$ as n, m → ∞, but there is no continuous f such that fₙ → f in ‖·‖₁.
Theorem:
Idea: Have xₖ, seek xₖ₊₁ such that h(xₖ₊₁) − y ≈ 0. We write xₖ₊₁ = xₖ + ∆xₖ so that h(xₖ + ∆xₖ) − y ≈ 0. Applying Taylor's Theorem and keeping only the zeroth and first order terms,
$$h(x_k) + \frac{\partial h}{\partial x}(x_k)\, \Delta x_k - y \approx 0$$
$$\frac{\partial h}{\partial x}(x_k)\, \Delta x_k \approx -(h(x_k) - y)$$
$$\Delta x_k \approx -\left( \frac{\partial h}{\partial x}(x_k) \right)^{-1} (h(x_k) - y)$$
$$\therefore\ x_{k+1} = \underbrace{x_k - \left( \frac{\partial h}{\partial x}(x_k) \right)^{-1} (h(x_k) - y)}_{T(x_k)}$$
As indicated, we define $T(x) = x - \left( \frac{\partial h}{\partial x}(x) \right)^{-1} (h(x) - y)$. Then,
Questions:
1. When does ∃x∗ s.t. T (x∗ ) = x∗ ? (Fixed point)
2. If a fixed point exists, is it unique?
3. When can a fixed point be determined by the Method of Successive Approximations: xₙ₊₁ = T(xₙ)?
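A compact implementation of the iteration for scalar h (a sketch; the function h and its derivative below are illustrative choices, not from the notes):

```python
def newton_solve(h, dh, y, x0, tol=1e-10, max_iter=50):
    """Successive approximation x_{k+1} = T(x_k) = x_k - (h(x_k) - y)/h'(x_k)."""
    x = x0
    for _ in range(max_iter):
        step = (h(x) - y) / dh(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Solve h(x) = x^2 = 2, i.e., find sqrt(2)
print(newton_solve(lambda x: x**2, lambda x: 2*x, y=2.0, x0=1.0))
```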
Claim: x* = T(x*).
Proof: For every n ≥ 1,
$$\|x^* - T(x^*)\| = \|x^* - x_n + x_n - T(x^*)\| = \|x^* - x_n + T(x_{n-1}) - T(x^*)\| \le \|x^* - x_n\| + \|T(x_{n-1}) - T(x^*)\| \le \|x^* - x_n\| + \alpha \|x_{n-1} - x^*\| \xrightarrow{n \to \infty} 0.$$
Claim: x* is unique.
Proof: Suppose y* = T(y*). Then
‖x* − y*‖ = ‖T(x*) − T(y*)‖ ≤ α‖x* − y*‖, with 0 ≤ α < 1.
The only non-negative real number γ that satisfies γ ≤ αγ for some 0 ≤ α < 1 is γ = 0. Hence, by the properties of norms, 0 = ‖x* − y*‖ ⇔ x* = y*.
Theorem: Let (X, ‖·‖) and (Y, |||·|||) be two normed spaces, and let f : X → Y be a function.
(a) (xₙ) takes a finite number of distinct values, and at least one of them must be used infinitely many times.
(b) (xₙ) takes an infinite number of distinct values.
and
∃x* ∈ C s.t. f(x*) = inf_{x∈C} f(x).
∴ f* = f(x*).
Remark:
[Figure: graphs of a convex and a non-convex function f, comparing the chord through (x, f(x)) and (y, f(y)) with the graph of f between x and y.]
Theorem: If D and f are both convex, then any local minimum is also a global minimum.
Additional Facts:
• Special case: B₁(0), the unit ball about the origin, is a convex set.
• Let C be an open, bounded, and convex set with 0 ∈ C. Then ∃ ‖·‖ : X → [0, ∞) such that C = {x ∈ X | ‖x‖ < 1} = B₁(0).
• K₁ convex, K₂ convex ⇒ K₁ ∩ K₂ is convex. (Proved by taking a line segment inside the set.)
• Consider (Rⁿ, R), A a real m × n matrix, b ∈ Rᵐ. Then:
K = {x ∈ Rⁿ | Ax ≤ b} is convex (linear inequality);
K = {x ∈ Rⁿ | Ax = b} is convex (linear equality);
K = {x ∈ Rⁿ | A_eq x = b_eq, A_in x ≤ b_in} is convex as well (intersection property).
Quadratic Programming
x ∈ Rⁿ, Q ≥ 0.
$$\text{Minimize: } \underbrace{x^\top Q x}_{\text{quadratic term}} + \underbrace{f x}_{\text{linear term}} \quad \text{subject to } A_{\text{in}} x \le b_{\text{in}} \text{ and } A_{\text{eq}} x = b_{\text{eq}}$$
Note: the cost is convex, and the constraint sets defined by A_in and A_eq are convex. Also, check whether the constraints form the empty set.
Suppose the desired feedback signal is u = γ(q, q̇), but we need to respect bounds on the ground reaction forces:
Fᵛ ≥ 0.2 m_total g,
that is, the normal force should be at least 20% of the total weight, and
|Fʰ| ≤ 0.6 Fᵛ,
that is, the friction force lies in a cone: its magnitude is less than 60% of the normal force.
QP:
$$u^* = \operatorname*{argmin}\; u^\top u + p\, d^\top d \quad \text{subject to} \quad A_{\text{in}}(q)\, u \le b_{\text{in}}(q, \dot{q}), \quad u = \gamma(q, \dot{q}) + d,$$
where d is often called the relaxation parameter. Further, p is a weighting factor and it should be very large, e.g., p = 1 · 10⁴. Dr. Grizzle finished by showing his handout on linear programming and quadratic programming. And remember Stephen Boyd!
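A sketch of such a relaxed QP in cvxpy (all data below are hypothetical placeholders; cvxpy is one of several QP solvers one could use):

```python
import cvxpy as cp
import numpy as np

n = 2                                   # number of inputs (illustrative)
u_des = np.array([1.0, -0.5])           # desired feedback gamma(q, qdot)
A_in = np.array([[1.0, 0.0],
                 [0.0, 1.0]])
b_in = np.array([0.8, 0.8])             # e.g., actuator limits
p = 1e4                                 # large weight on the relaxation term

u = cp.Variable(n)
d = cp.Variable(n)                      # relaxation of the tracking constraint
cost = cp.sum_squares(u) + p * cp.sum_squares(d)
constraints = [A_in @ u <= b_in, u == u_des + d]
cp.Problem(cp.Minimize(cost), constraints).solve()
print(u.value, d.value)
```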