Solution 3
Instructor: Soheil Mohajer
Problem 1
(a)

L(C) = \sum_{i=1}^{n} p_i \ell_i    (1)

H(P) = -\sum_{i=1}^{n} p_i \log p_i    (2)
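As a quick numerical illustration of (1) and (2) (a hypothetical sketch, not part of the original solution; the distribution and code lengths below are made-up), both quantities can be computed directly in Python:

import math

# Hypothetical example: probabilities p_i and codeword lengths l_i
# (made-up values; any prefix-free code lengths would do).
p = [0.5, 0.25, 0.125, 0.125]
l = [1, 2, 3, 3]

# Average codeword length, equation (1): L(C) = sum_i p_i * l_i
L = sum(pi * li for pi, li in zip(p, l))

# Source entropy, equation (2): H(P) = -sum_i p_i * log2(p_i)
H = -sum(pi * math.log2(pi) for pi in p)

print(L, H)  # 1.75 1.75 -- this particular code is optimal, so L(C) = H(P)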
(b) To check the validity of the given Huffman trees, we verify that T^u and T_u satisfy the properties of an optimal Huffman tree. The required properties are:

i) If \ell_i < \ell_j then p_i \geq p_j.

– For T^u: if u \notin \{i, j\}, then since T is a valid Huffman tree, the property carries over directly from the corresponding property of T. Now suppose i = u: by construction of T^u, the probability of u is smaller than that of the nodes at smaller depth and larger than that of the nodes at larger depth; otherwise T would not be a valid Huffman tree. The case j = u is similar.

– For T_u: this property holds because T_u is a subtree of T, for which we already had \ell_i < \ell_j \iff p_i \geq p_j, and hence

\ell_i - \ell < \ell_j - \ell \iff \frac{p_i}{q} \geq \frac{p_j}{q}.

ii) The two least probable codewords have the largest length.

iii) The two least probable codewords differ only in the last bit.

Properties ii) and iii) are immediate consequences of the Huffman procedure (T being a valid Huffman tree); a concrete check is sketched below.
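To make the checks above concrete, here is a minimal Python sketch, assuming a standard heap-based Huffman construction (the function name huffman_lengths and the example probabilities are our own, not from the problem); it builds the codeword lengths and verifies property i):

import heapq

def huffman_lengths(probs):
    # Each heap entry: (probability, tie-breaker, indices of leaves below it).
    heap = [(prob, i, [i]) for i, prob in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    # Repeatedly merge the two least probable nodes; every merge pushes
    # all leaves under the merged node one level deeper.
    while len(heap) > 1:
        p1, _, leaves1 = heapq.heappop(heap)
        p2, tie, leaves2 = heapq.heappop(heap)
        for leaf in leaves1 + leaves2:
            lengths[leaf] += 1
        heapq.heappush(heap, (p1 + p2, tie, leaves1 + leaves2))
    return lengths

probs = [0.4, 0.25, 0.15, 0.1, 0.1]  # made-up source distribution
lengths = huffman_lengths(probs)

# Property i): a strictly shorter codeword never has smaller probability.
for i in range(len(probs)):
    for j in range(len(probs)):
        if lengths[i] < lengths[j]:
            assert probs[i] >= probs[j]
print(lengths)  # e.g. [1, 2, 3, 4, 4]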
(c) Note that P^u includes all source symbols \{x_1, \dots, x_k\}, together with the new symbol u. Therefore,

L(P^u) = \sum_{i=1}^{k} p_i \ell_i + q \ell

H(P^u) = -\sum_{i=1}^{k} p_i \log p_i - q \log q
(d) We again need to prove properties i), ii), and iii) as in part (b). Properties ii) and iii) are clear by construction of T. To prove i), recall that the length of the codeword associated with symbol i by T_u is given by \ell_i - \ell. Now, since T was originally a valid Huffman tree, we have

\ell_i < \ell_j \iff p_i \geq p_j,

and subtracting the common offset \ell preserves the ordering of the lengths, so property i) holds for T_u as well.
(e)

H(P_u) = -\sum_{i=k+1}^{n} \frac{p_i}{q} \log \frac{p_i}{q}
(f) We have

L(P^u) + q L(P_u) = \sum_{i=1}^{k} p_i \ell_i + q \ell + q \sum_{i=k+1}^{n} \frac{p_i}{q} (\ell_i - \ell)

= \sum_{i=1}^{k} p_i \ell_i + q \ell + \sum_{i=k+1}^{n} p_i \ell_i - \ell \sum_{i=k+1}^{n} p_i

= \sum_{i=1}^{n} p_i \ell_i = L(P),

where the last step uses \sum_{i=k+1}^{n} p_i = q.
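A quick numerical sanity check of this decomposition (a sketch with made-up probabilities and depths; the identity holds for any choice consistent with the construction):

# Numerical check of L(P) = L(P^u) + q * L(P_u), with made-up numbers:
# symbols x_3,...,x_5 hang below a node u of depth ell = 2 in the full tree T.
p = [0.4, 0.2, 0.15, 0.15, 0.1]   # source probabilities p_1..p_n
depth = [1, 2, 3, 4, 4]           # codeword lengths ell_i in T
k, ell = 2, 2                     # first k symbols stay above u; depth(u) = ell
q = sum(p[k:])                    # q = p_{k+1} + ... + p_n

L_P = sum(pi * d for pi, d in zip(p, depth))
L_Pu_top = sum(pi * d for pi, d in zip(p[:k], depth[:k])) + q * ell      # L(P^u)
L_Pu_bot = sum((pi / q) * (d - ell) for pi, d in zip(p[k:], depth[k:]))  # L(P_u)

assert abs(L_P - (L_Pu_top + q * L_Pu_bot)) < 1e-12
print(L_P, L_Pu_top + q * L_Pu_bot)  # 2.25 2.25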
Problem 2
(b) Let t be chosen such that P[B_n(t)] \geq 1 - \delta. Then we have

P[(B_n(t))^c \cup (A_\epsilon^{(n)})^c] \leq P[(B_n(t))^c] + P[(A_\epsilon^{(n)})^c] \leq \delta + \epsilon.

Therefore,

P[B_n(t) \cap A_\epsilon^{(n)}] = 1 - P[(B_n(t))^c \cup (A_\epsilon^{(n)})^c] \geq 1 - \epsilon - \delta.

This implies

1 - \epsilon - \delta \leq P[B_n(t) \cap A_\epsilon^{(n)}] = \sum_{x \in B_n(t) \cap A_\epsilon^{(n)}} P(x) \leq |B_n(t) \cap A_\epsilon^{(n)}| \, 2^{-n(H(X)-\epsilon)},

that is,

|B_n(t) \cap A_\epsilon^{(n)}| \geq (1 - \epsilon - \delta) \, 2^{n(H(X)-\epsilon)}.
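The bound can be checked exhaustively for a small Bernoulli source. Note that the definition of B_n(t) is not shown in this excerpt, so the sketch below simply takes B_n(t) to be the set of sequences whose number of ones is within t of the mean; this choice is our own illustration, not the problem's definition:

import itertools, math

# Exhaustive check of |B_n(t) ∩ A_eps^(n)| >= (1 - eps - delta) 2^{n(H(X)-eps)}
# for a Bernoulli(p) source.  B_n(t) is defined here, for illustration only,
# as the sequences whose number of ones is within t of the mean n*p.
n, p, eps, t = 12, 0.3, 0.2, 4
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

count, prob_B = 0, 0.0
for x in itertools.product([0, 1], repeat=n):
    ones = sum(x)
    px = p**ones * (1 - p)**(n - ones)          # P(x)
    in_B = abs(ones - n * p) <= t               # x in B_n(t)
    in_A = 2**(-n * (H + eps)) <= px <= 2**(-n * (H - eps))  # x typical
    prob_B += px * in_B
    count += in_B and in_A

delta = 1 - prob_B                              # so P[B_n(t)] >= 1 - delta
print(count, (1 - eps - delta) * 2**(n * (H - eps)))  # count exceeds the bound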
Problem 3
(a) We have

E[-\log_2 q(X)] = -\sum_x p(x) \log_2 q(x) = \sum_x p(x) \log_2 \frac{p(x)}{p(x) q(x)}

= \sum_x p(x) \log_2 \frac{1}{p(x)} + \sum_x p(x) \log_2 \frac{p(x)}{q(x)} = H(p) + D(p \| q).
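A short numerical check of this identity (with made-up distributions p and q):

import math

# Check E[-log2 q(X)] = H(p) + D(p||q) for made-up distributions p, q.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

E = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))        # E[-log2 q(X)]
H = -sum(pi * math.log2(pi) for pi in p)                    # H(p)
D = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))    # D(p||q)

assert abs(E - (H + D)) < 1e-12
print(E, H, D)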
(b) When q(x) is an integer power of 1/2, -\log_2 q(x) will be an integer. Therefore, the solution of the minimization problem for \sum_x q(x) l(c(x)) (recall the Lagrange multiplier method) will be valid: l(c(x)) = -\log_2 q(x).
(c) From parts (a) and (b) we see that

E[l(c(X))] = \sum_x p(x) l(c(x)) = -\sum_x p(x) \log_2 q(x) = H(p) + D(p \| q).

Hence,

E[l(c(X))] - H(p) = D(p \| q).
(d) Observe that -\frac{1}{n} \log_2 q(X_1, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} (-\log_2 q(X_i)). Since the X_i are i.i.d., so are the -\log_2 q(X_i). Thus, the weak law of large numbers tells us that

\frac{1}{n} \sum_{i=1}^{n} (-\log_2 q(X_i)) \longrightarrow E[-\log_2 q(X)] = H(p) + D(p \| q)

in probability.
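A small simulation illustrating this convergence (made-up p and q; seed fixed for reproducibility):

import math, random

# Simulate part (d): the empirical average of -log2 q(X_i), with X_i ~ p,
# should approach H(p) + D(p||q).  p and q are made-up examples.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
target = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))  # = H(p) + D(p||q)

random.seed(0)
n = 100_000
xs = random.choices(range(len(p)), weights=p, k=n)
avg = sum(-math.log2(q[x]) for x in xs) / n
print(avg, target)  # close for large n, as the weak law predicts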
(e) We see from part (d) that as n gets large, -\frac{1}{n} \log_2 q(X_1, \dots, X_n) gets close to H(p) + D(p \| q) with high probability. Under the assumption that H(p) + D(p \| q) \notin [H(q), H(q) + \epsilon], we then conclude that as n gets large, (X_1, \dots, X_n) will not be in A(n, \epsilon) with high probability, and thus the source output will not be assigned a codeword with high probability.
(f) By the same reasoning as in part (e), under the assumption that H(p) + D(p \| q) \in [H(q), H(q) + \epsilon], as n gets large the source output will be assigned a codeword with high probability. Now, since

|A(n, \epsilon)| \geq (1 - \epsilon) \, 2^{n(H(q) - \epsilon)},

the code must contain roughly 2^{n H(q)} codewords, i.e., about n H(q) bits are needed per source block.
Problem 4
This implies

(\mu_0 \ \mu_1) \cdot P = (\mu_0 \ \mu_1),

i.e., (\mu_0, \mu_1) is the stationary distribution of the chain; solving gives \mu_0 = \frac{p_{10}}{p_{01} + p_{10}} and \mu_1 = \frac{p_{01}}{p_{01} + p_{10}}.
(b) Given X_1 = 0, X_2 is a Bernoulli random variable with a (1 - p_{01}, p_{01}) distribution. Hence, its entropy is H(X_2 | X_1 = 0) = h_B(p_{01}). Similarly, we have H(X_2 | X_1 = 1) = h_B(p_{10}).
(c) Using the chain rule and the stationarity and Markov properties of the process,

H(X) = \lim_{t \to \infty} \frac{H(X_1, X_2, \dots, X_t)}{t}

= \lim_{t \to \infty} \frac{H(X_1) + H(X_2 | X_1) + \dots + H(X_t | X_{t-1})}{t}

= \lim_{t \to \infty} \frac{H(X_1) + (t - 1) H(X_2 | X_1)}{t}

= H(X_2 | X_1).

Moreover,

H(X_2 | X_1) = H(X_{t+1} | X_t) = H(X_{t+1} | X_t = 0) P(X_t = 0) + H(X_{t+1} | X_t = 1) P(X_t = 1)

= \mu_0 h_B(p_{01}) + \mu_1 h_B(p_{10}) = \frac{p_{10} h_B(p_{01}) + p_{01} h_B(p_{10})}{p_{01} + p_{10}}.
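A short Python sketch evaluating this entropy rate for example transition probabilities (the values p01 = 0.2, p10 = 0.5 are our own illustration):

import math

def h_b(x):
    # Binary entropy in bits.
    return 0.0 if x in (0.0, 1.0) else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

# Entropy rate of the two-state chain; p01, p10 are example values.
p01, p10 = 0.2, 0.5
mu0 = p10 / (p01 + p10)   # stationary distribution, from (mu0, mu1) P = (mu0, mu1)
mu1 = p01 / (p01 + p10)
rate = mu0 * h_b(p01) + mu1 * h_b(p10)
print(mu0, mu1, rate)     # 0.714..., 0.285..., entropy rate in bits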
(d) Since the process has only two states, the entropy rate is at most 1 bit. It can be achieved iff p_{01} = p_{10} = 1/2.
For the constrained chain (no consecutive ones, i.e., p_{10} = 1), maximizing over p:

\max_p H(X_2 | X_1) = \frac{h_B(p^\star)}{1 + p^\star} = 0.694 \text{ bits},

where p^\star satisfies (1 - p^\star)^2 = p^\star, i.e., p^\star = (3 - \sqrt{5})/2. Note that 1/(1 - p^\star) = (1 + \sqrt{5})/2 is the Golden Ratio!
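A brute-force check of this maximization (grid search; the grid resolution is arbitrary):

import math

def h_b(x):
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

# Grid search for max_p h_b(p)/(1+p); this corresponds to p10 = 1 (no
# consecutive ones) in the entropy-rate formula of part (c).
grid = [k / 100000 for k in range(1, 100000)]
best = max(grid, key=lambda p: h_b(p) / (1 + p))
print(best, h_b(best) / (1 + best))                  # ~0.38197, ~0.69424
print((3 - math.sqrt(5)) / 2, math.log2((1 + math.sqrt(5)) / 2))  # p*, log2(phi)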
(g) Note that the Markov chain of part (c) doesn't allow consecutive ones. Consider any allowable sequence of symbols of length t. If the first symbol is 1, then the next symbol must be 0, and the remaining t - 2 symbols can form any of the N(t - 2) allowable sequences. If the first symbol is 0, then the remaining t - 1 symbols can form any of the N(t - 1) allowable sequences. So we have

N(t) = N(t - 1) + N(t - 2).

Note that one can find the closed form of N(t) and then compute \lim_{t \to \infty} \frac{1}{t} \log N(t) from that. However, a simpler way to find this limit is the following. Assume the limit exists and the sequence converges to some constant \lambda_0. Therefore, as t grows, \log N(t) behaves like \lambda_0 t, or N(t) \approx c \, 2^{\lambda_0 t} = c \lambda_1^t where \lambda_1 = 2^{\lambda_0}. Plugging this into the recursive equation, we get

c \lambda_1^t = c \lambda_1^{t-1} + c \lambda_1^{t-2},

or \lambda_1^2 - \lambda_1 - 1 = 0. Solving for \lambda_1, we get \lambda_1 = (1 + \sqrt{5})/2. Therefore

\lim_{t \to \infty} \frac{1}{t} \log N(t) = \lambda_0 = \log \lambda_1 = \log \frac{1 + \sqrt{5}}{2} = 0.694 \text{ bits}.
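The recursion and the limit can be verified numerically (a minimal sketch):

import math

# N(t): number of binary strings of length t with no consecutive ones,
# via N(t) = N(t-1) + N(t-2) with N(1) = 2, N(2) = 3.
N = {1: 2, 2: 3}
for t in range(3, 201):
    N[t] = N[t - 1] + N[t - 2]

for t in (10, 50, 200):
    print(t, math.log2(N[t]) / t)   # -> log2((1 + sqrt(5))/2) ~ 0.694
print(math.log2((1 + math.sqrt(5)) / 2))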
Since there are only N(t) possible outcomes for X_1, X_2, \dots, X_t, an upper bound on H(X_1, X_2, \dots, X_t) is \log N(t), so the entropy rate of the Markov chain of part (c) is at most H_0 = \lim_{t \to \infty} \frac{1}{t} \log N(t). And we saw in part (d) that this bound is achievable.
Therefore, N(t) can be written as N(t) = u\alpha^t + v\beta^t, where \alpha and \beta are the roots of Z^t = Z^{t-1} + Z^{t-2}, or equivalently Z^2 = Z + 1. Solving this equation, we get \alpha = (1 + \sqrt{5})/2 and \beta = (1 - \sqrt{5})/2. To find u and v we can use the initial conditions of the recursive equation:

2 = N(1) = u\alpha + v\beta

3 = N(2) = u\alpha^2 + v\beta^2.
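Solving this 2x2 system numerically and checking the closed form against the recursion (a sketch using Cramer's rule):

import math

# Solve 2 = u*alpha + v*beta, 3 = u*alpha^2 + v*beta^2 by Cramer's rule
# and check the closed form N(t) = u*alpha^t + v*beta^t against the recursion.
alpha = (1 + math.sqrt(5)) / 2
beta = (1 - math.sqrt(5)) / 2

det = alpha * beta**2 - beta * alpha**2
u = (2 * beta**2 - 3 * beta) / det
v = (3 * alpha - 2 * alpha**2) / det

N = {1: 2, 2: 3}
for t in range(3, 20):
    N[t] = N[t - 1] + N[t - 2]
for t in range(1, 20):
    assert abs(u * alpha**t + v * beta**t - N[t]) < 1e-6
print(u, v)  # u ~ 1.1708, v ~ -0.1708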