
EE5581 - Information Theory and Coding, UMN, Fall 2018, Oct. 27

Solution 3
Instructor: Soheil Mohajer

Problem 1

(a) The expected length and the entropy are

$$L(C) = \sum_{i=1}^{n} p_i \ell_i \qquad (1)$$

$$H(P) = -\sum_{i=1}^{n} p_i \log p_i \qquad (2)$$

(b) To check the validity of the Huffman trees, we verify whether $T^u$ satisfies the properties of optimal Huffman trees. The required properties are:
i) If $\ell_i < \ell_j$ then $p_i \ge p_j$.
First consider $i$ and $j$ such that $u \notin \{i, j\}$. Then, since $T$ is a valid Huffman tree, the property clearly holds. Now assume $i = u$. In this case, by the construction of $T$, the probability of $u$ is smaller than that of the nodes with less depth and larger than that of the nodes with more depth; otherwise $T$ would not be a valid Huffman tree.
ii) The two least probable codewords have the largest length.
iii) The two least probable codewords differ only in one bit.
Both ii) and iii) are again immediate consequences of $T$ being a valid Huffman tree.

i) If $\ell_i < \ell_j$ then $p_i \ge p_j$;
– For $T^u$: if $u \notin \{i, j\}$, then since $T$ is a valid Huffman tree, the property still holds from the first property of $T$. Now consider $i = u$: in this case, by the construction of $T$, the probability of $u$ is smaller than that of the nodes with less depth and larger than that of the nodes with more depth; otherwise $T$ would not be a valid Huffman tree. For $j = u$ the argument is similar.
– For $T_u$: this property holds because $T_u$ is part of $T$, for which we already had
$$\ell_i < \ell_j \iff p_i \ge p_j,$$
and now
$$\ell_i - \ell < \ell_j - \ell \iff \frac{p_i}{q} \ge \frac{p_j}{q}.$$
ii) The two least probable codewords have the largest length.
iii) The two least probable codewords differ only in one bit.
These two properties are simply guaranteed by the Huffman procedure.

(c) Note that $P^u$ includes all source symbols $\{x_1, \dots, x_k\}$, together with the new symbol $u$. Therefore,

$$L(P^u) = \sum_{i=1}^{k} p_i \ell_i + q\ell$$

$$H(P^u) = -\sum_{i=1}^{k} p_i \log p_i - q \log q$$

(d) We again need to prove properties i), ii), and iii) as in part (b). Properties ii) and iii) are clear by the construction of $T$. To prove i), recall that the length of the codeword associated to symbol $i$ by $T_u$ is given by $\ell_i - \ell$. Now, since $T$ was originally a valid Huffman tree, we have

$$\ell_i < \ell_j \iff p_i \ge p_j.$$

This simply implies

$$\ell_i - \ell < \ell_j - \ell \iff \frac{p_i}{q} \ge \frac{p_j}{q}.$$
(e)

$$L(P_u) = \sum_{i=k+1}^{n} \frac{p_i}{q} (\ell_i - \ell)$$

$$H(P_u) = -\sum_{i=k+1}^{n} \frac{p_i}{q} \log \frac{p_i}{q}$$

(f) We have

$$\begin{aligned}
L(P^u) + qL(P_u) &= \sum_{i=1}^{k} p_i \ell_i + q\ell + q \sum_{i=k+1}^{n} \frac{p_i}{q} (\ell_i - \ell) \\
&= \sum_{i=1}^{k} p_i \ell_i + q\ell + \sum_{i=k+1}^{n} p_i \ell_i - \ell \sum_{i=k+1}^{n} p_i \\
&= \sum_{i=1}^{n} p_i \ell_i = L(P),
\end{aligned}$$

where the last equality uses $q = \sum_{i=k+1}^{n} p_i$.

Similarly, we can combine the entropies as

$$\begin{aligned}
H(P^u) + qH(P_u) &= -\sum_{i=1}^{k} p_i \log p_i - q \log q - q \sum_{i=k+1}^{n} \frac{p_i}{q} \log \frac{p_i}{q} \\
&= -\sum_{i=1}^{k} p_i \log p_i - q \log q - \sum_{i=k+1}^{n} p_i \left[\log p_i - \log q\right] \\
&= -\sum_{i=1}^{n} p_i \log p_i - q \log q + \log q \sum_{i=k+1}^{n} p_i = H(P).
\end{aligned}$$
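Both decompositions above are purely algebraic, so they can be sanity-checked numerically. A minimal sketch, using a made-up distribution and placeholder code lengths (the values of `p`, `ell`, and `l_u` below are illustrative, not actual Huffman depths from the problem):

```python
import math

# Hypothetical example: first k symbols stay, the last n - k merge into u.
p = [0.3, 0.25, 0.2, 0.15, 0.1]   # source distribution P over n = 5 symbols
k = 3
q = sum(p[k:])                     # probability of the merged symbol u
ell = [2, 2, 2, 3, 3]              # placeholder code lengths l_i
l_u = 2                            # placeholder depth l of node u in T

# L(P) and H(P) for the full source
L_P = sum(pi * li for pi, li in zip(p, ell))
H_P = -sum(pi * math.log2(pi) for pi in p)

# Reduced tree T^u: symbols 1..k plus u at depth l
L_top = sum(p[i] * ell[i] for i in range(k)) + q * l_u
H_top = -sum(p[i] * math.log2(p[i]) for i in range(k)) - q * math.log2(q)

# Subtree T_u: conditional probabilities p_i / q, lengths l_i - l
L_sub = sum((p[i] / q) * (ell[i] - l_u) for i in range(k, len(p)))
H_sub = -sum((p[i] / q) * math.log2(p[i] / q) for i in range(k, len(p)))

assert math.isclose(L_top + q * L_sub, L_P)
assert math.isclose(H_top + q * H_sub, H_P)
print("decomposition verified:", L_P, H_P)
```

Any choice of lengths and probabilities satisfies both identities, which is exactly why parts (f) holds independently of the Huffman construction.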

Problem 2

(a) Note that

$$1 = \sum_{x^n \in \mathcal{X}^n} P(x^n) \ge \sum_{x^n \in B_n(t)} P(x^n) \ge \sum_{x^n \in B_n(t)} 2^{-nt} = |B_n(t)| \, 2^{-nt}.$$

Multiplying both sides by $2^{nt}$, we get the desired inequality $|B_n(t)| \le 2^{nt}$.
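This bound can be checked by brute force for a small example source; the Bernoulli parameter, $n$, and $t$ below are illustrative choices, not from the problem statement:

```python
import itertools

# Brute-force check of |B_n(t)| <= 2^{nt} for a small i.i.d. Bernoulli source.
p1 = 0.3          # P(X = 1), an illustrative value
n, t = 8, 0.7

def prob(x):
    """Probability of the binary sequence x under the i.i.d. Bernoulli(p1) source."""
    ones = sum(x)
    return (p1 ** ones) * ((1 - p1) ** (n - ones))

# B_n(t): sequences whose probability is at least 2^{-nt}
B = [x for x in itertools.product([0, 1], repeat=n) if prob(x) >= 2 ** (-n * t)]

assert 0 < len(B) <= 2 ** (n * t)
print(len(B), "<=", round(2 ** (n * t), 1))
```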

2
(b) Let $t$ be chosen such that $P[B_n(t)] \ge 1 - \delta$. Then we have

$$P\left[(B_n(t))^c \cup \left(A_\epsilon^{(n)}\right)^c\right] \le P\left[(B_n(t))^c\right] + P\left[\left(A_\epsilon^{(n)}\right)^c\right] \le \delta + \epsilon.$$

Therefore,

$$P\left[B_n(t) \cap A_\epsilon^{(n)}\right] = 1 - P\left[(B_n(t))^c \cup \left(A_\epsilon^{(n)}\right)^c\right] \ge 1 - \epsilon - \delta.$$

This implies

$$1 - \epsilon - \delta \le P\left[B_n(t) \cap A_\epsilon^{(n)}\right] = \sum_{x \in B_n(t) \cap A_\epsilon^{(n)}} P(x) \le \sum_{x \in B_n(t) \cap A_\epsilon^{(n)}} 2^{-n(H(X)-\epsilon)} = \left|B_n(t) \cap A_\epsilon^{(n)}\right| 2^{-n(H(X)-\epsilon)},$$

that is,

$$\left|B_n(t) \cap A_\epsilon^{(n)}\right| \ge (1 - \epsilon - \delta)\, 2^{n(H(X)-\epsilon)}$$

for any $\epsilon, \delta > 0$ and sufficiently large $n$.


Now, assume $t < H(X)$. Choose $\epsilon \in (0, H(X) - t)$. For large enough $n$, we have

$$\begin{aligned}
B_n(t) \cap A_\epsilon^{(n)} &= \left\{x : p(x) \ge 2^{-nt}\right\} \cap \left\{x : -\tfrac{1}{n}\log p(x) \in [H(X)-\epsilon, H(X)+\epsilon]\right\} \\
&= \left\{x : -\tfrac{1}{n}\log p(x) \le t\right\} \cap \left\{x : -\tfrac{1}{n}\log p(x) \in [H(X)-\epsilon, H(X)+\epsilon]\right\} = \emptyset,
\end{aligned}$$

where the last equality holds since $t < H(X) - \epsilon$. This implies $\left|B_n(t) \cap A_\epsilon^{(n)}\right| = 0$, which contradicts what we derived in class.
Next, consider some $t > H(X)$. By choosing $\epsilon \in (0, t - H(X))$, we have

$$\begin{aligned}
B_n(t) \cap A_\epsilon^{(n)} &= \left\{x : p(x) \ge 2^{-nt}\right\} \cap \left\{x : -\tfrac{1}{n}\log p(x) \in [H(X)-\epsilon, H(X)+\epsilon]\right\} \\
&= \left\{x : -\tfrac{1}{n}\log p(x) \le t\right\} \cap \left\{x : -\tfrac{1}{n}\log p(x) \in [H(X)-\epsilon, H(X)+\epsilon]\right\} = A_\epsilon^{(n)}.
\end{aligned}$$

Therefore $A_\epsilon^{(n)} \subseteq B_n(t)$, and for sufficiently large $n$ we have

$$P[B_n(t)] \ge P\left[A_\epsilon^{(n)}\right] \to 1.$$

Therefore $t > H(X)$ is the range of $t$ for which $P[B_n(t)] \to 1$.
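The threshold behavior at $t = H(X)$ can be illustrated numerically. The sketch below assumes an i.i.d. Bernoulli(0.3) source (an example not taken from the problem, with $H(X) \approx 0.881$ bits) and computes $P[B_n(t)]$ exactly for one $t$ above and one below $H(X)$:

```python
import math

# Illustration: for an i.i.d. Bernoulli(p1) source, P[B_n(t)] tends to 1
# when t > H(X) and to 0 when t < H(X).
p1 = 0.3
H = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))  # entropy, ~0.881 bits

def prob_B(n, t):
    """Exact P[B_n(t)], grouping sequences by their number of ones."""
    total = 0.0
    for k in range(n + 1):
        p_seq = (p1 ** k) * ((1 - p1) ** (n - k))
        if p_seq >= 2 ** (-n * t):          # sequence belongs to B_n(t)
            total += math.comb(n, k) * p_seq
    return total

for n in (50, 200, 800):
    print(n, round(prob_B(n, 1.0), 4), round(prob_B(n, 0.7), 4))

assert prob_B(800, 1.0) > 0.99   # t = 1.0 > H(X): mass concentrates in B_n(t)
assert prob_B(800, 0.7) < 0.01   # t = 0.7 < H(X): B_n(t) captures almost nothing
```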

Problem 3

(a) We have

$$\begin{aligned}
E[-\log_2 q(X)] &= -\sum_x p(x) \log_2 q(x) = \sum_x p(x) \log_2 \frac{p(x)}{p(x)q(x)} \\
&= \sum_x p(x) \log_2 \frac{1}{p(x)} + \sum_x p(x) \log_2 \frac{p(x)}{q(x)} = H(p) + D(p\|q).
\end{aligned}$$
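As a quick numerical sanity check of this identity, with made-up distributions $p$ and $q$:

```python
import math

# Check E[-log2 q(X)] = H(p) + D(p||q) on an illustrative pair of distributions.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

cross = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))      # E[-log2 q(X)]
H_p = -sum(pi * math.log2(pi) for pi in p)                    # H(p)
D_pq = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))   # D(p||q)

assert math.isclose(cross, H_p + D_pq)
print(cross, "=", H_p, "+", D_pq)
```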

(b) When $q(x)$ is an integer power of $1/2$, then $-\log_2 q(x)$ will be an integer. Therefore, the solution of the minimization problem for $\sum_x q(x) l(c(x))$ (recall the Lagrange multiplier method) will be valid: $l(c(x)) = -\log_2 q(x)$.

(c) From parts (a) and (b) we see that

$$E[l(c(X))] = \sum_x p(x) l(c(x)) = -\sum_x p(x) \log_2 q(x) = H(p) + D(p\|q).$$

Hence,

$$E[l(c(X))] - H(p) = D(p\|q).$$

(d) Observe that $-\frac{1}{n}\log_2 q(X_1, \dots, X_n) = \frac{1}{n}\sum_{i=1}^{n} (-\log_2 q(X_i))$. Since the $X_i$ are i.i.d., so are the $-\log_2 q(X_i)$. Thus, the weak law of large numbers tells us that

$$\frac{1}{n}\sum_{i=1}^{n} (-\log_2 q(X_i)) \longrightarrow E[-\log_2 q(X)] = H(p) + D(p\|q),$$

or equivalently, for any $\delta > 0$,

$$\lim_{n\to\infty} P\left[\left|-\frac{1}{n}\log_2 q(X_1, \dots, X_n) - (H(p) + D(p\|q))\right| > \delta\right] = 0.$$
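The convergence in part (d) can also be seen by simulation; the distributions below are made-up examples:

```python
import math
import random

# Monte Carlo illustration of part (d): the empirical average of -log2 q(X_i)
# concentrates around H(p) + D(p||q).
random.seed(0)
symbols = [0, 1, 2]
p = [0.5, 0.3, 0.2]    # true source distribution
q = [0.25, 0.25, 0.5]  # mismatched coding distribution

target = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))  # H(p) + D(p||q)

n = 200_000
xs = random.choices(symbols, weights=p, k=n)
avg = sum(-math.log2(q[x]) for x in xs) / n

assert abs(avg - target) < 0.01
print(avg, "close to", target)
```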

(e) We see from part (d) that as $n$ gets large, $-\frac{1}{n}\log_2 q(X_1, \dots, X_n)$ gets close to $H(p) + D(p\|q)$ with probability approaching 1. Under the assumption that $H(p) + D(p\|q) \notin [H(q), H(q) + \epsilon]$, we then conclude that as $n$ gets large, $(X_1, \dots, X_n)$ will not be in $A_\epsilon^{(n)}$ with probability approaching 1, and thus the source output will not be assigned a codeword.

(f) By the same reasoning as in part (e), under the assumption that $H(p) + D(p\|q) \in [H(q), H(q) + \epsilon]$, as $n$ gets large the source output will be assigned a codeword with probability approaching 1. Now, since

$$\left|A_\epsilon^{(n)}\right| \ge (1 - \epsilon)\, 2^{n(H(q)-\epsilon)},$$

we see that as $n$ gets large

$$\frac{1}{n}\left\lceil \log_2 \left|A_\epsilon^{(n)}\right| \right\rceil \ge H(q) - \epsilon + O(1/n).$$

Under the assumption, $H(q) \ge H(p) + D(p\|q) - \epsilon$, and thus for large $n$ the codeword length per source letter exceeds

$$H(p) + D(p\|q) - 2\epsilon.$$

Problem 4

(a) In the stationary regime we have

$$\mu_0 = \lim_{t\to\infty} P(X_t = 0) = \lim_{t\to\infty} P(X_{t-1} = 0)$$
$$\mu_1 = \lim_{t\to\infty} P(X_t = 1) = \lim_{t\to\infty} P(X_{t-1} = 1).$$

This implies

$$\begin{bmatrix} \mu_0 & \mu_1 \end{bmatrix} \cdot P = \begin{bmatrix} \mu_0 & \mu_1 \end{bmatrix}.$$

Solving this equation, we get

$$\mu_0 = \frac{p_{10}}{p_{01} + p_{10}}, \qquad \mu_1 = \frac{p_{01}}{p_{01} + p_{10}}.$$
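A quick check that this pair indeed solves $\mu P = \mu$, with illustrative values of $p_{01}$ and $p_{10}$:

```python
# Sanity check of the stationary distribution of the two-state chain.
p01, p10 = 0.2, 0.5   # illustrative P(0 -> 1) and P(1 -> 0)

mu0 = p10 / (p01 + p10)
mu1 = p01 / (p01 + p10)

# One step of the chain must leave (mu0, mu1) unchanged: mu P = mu.
next0 = mu0 * (1 - p01) + mu1 * p10
next1 = mu0 * p01 + mu1 * (1 - p10)

assert abs(next0 - mu0) < 1e-12 and abs(next1 - mu1) < 1e-12
print(mu0, mu1)
```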

(b) Given X1 = 0, X2 is a Bernoulli random variable with a (1 − p01 , p01 ) distribution. Hence, its entropy is
H(X2 |X1 = 0) = hB (p01 ). Similarly, we have H(X2 |X1 = 1) = hB (p10 ).

(c) First note that

$$\begin{aligned}
H(\mathcal{X}) &= \lim_{t\to\infty} \frac{H(X_1, X_2, \dots, X_t)}{t} \\
&= \lim_{t\to\infty} \frac{H(X_1) + H(X_2|X_1) + \cdots + H(X_t|X_{t-1})}{t} \\
&= \lim_{t\to\infty} \frac{H(X_1) + (t-1)H(X_2|X_1)}{t} \\
&= H(X_2|X_1).
\end{aligned}$$

This last term can be simplified to

$$\begin{aligned}
H(X_2|X_1) &= H(X_{t+1}|X_t) = H(X_{t+1}|X_t = 0)P(X_t = 0) + H(X_{t+1}|X_t = 1)P(X_t = 1) \\
&= \mu_0 h_B(p_{01}) + \mu_1 h_B(p_{10}) = \frac{p_{10}\, h_B(p_{01}) + p_{01}\, h_B(p_{10})}{p_{01} + p_{10}}.
\end{aligned}$$
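A sanity check that this closed form agrees with $H(X_2|X_1)$ computed directly from the joint distribution of $(X_1, X_2)$, again with illustrative transition probabilities:

```python
import math

def hB(p):
    """Binary entropy in bits, with hB(0) = hB(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p01, p10 = 0.2, 0.5   # illustrative transition probabilities
mu0, mu1 = p10 / (p01 + p10), p01 / (p01 + p10)

# Conditional entropy from the joint distribution of (X1, X2)
joint = {(0, 0): mu0 * (1 - p01), (0, 1): mu0 * p01,
         (1, 0): mu1 * p10,       (1, 1): mu1 * (1 - p10)}
H_cond = -sum(pr * math.log2(pr / (mu0 if x1 == 0 else mu1))
              for (x1, _), pr in joint.items() if pr > 0)

# Closed form from the solution
H_rate = (p10 * hB(p01) + p01 * hB(p10)) / (p01 + p10)

assert math.isclose(H_cond, H_rate)
print(H_rate)
```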

(d) Since the process has only two states, the entropy rate is at most 1 bit. This can be achieved iff $p_{01} = p_{10} = 1/2$.

(e) The entropy rate is

$$H(\mathcal{X}) = H(X_2|X_1) = \mu_0 h_B(p) + \mu_1 h_B(1) = \frac{h_B(p)}{p+1},$$

since $h_B(1) = 0$ and $\mu_0 = 1/(p+1)$.

(f) After some calculus, we see that the maximum occurs at $p^\star = (3 - \sqrt{5})/2 \approx 0.382$, and the maximum is

$$\max_p H(X_2|X_1) = \frac{h_B(p^\star)}{1 + p^\star} \approx 0.694 \text{ bits.}$$

Note that $p^\star$ is related to the Golden Ratio: $p^\star = 1/\varphi^2$, where $\varphi = (1+\sqrt{5})/2$.
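The maximization can be confirmed numerically; as a side note, the maximal value equals $\log_2 \varphi$, matching the limit found in part (g). A grid-search sketch:

```python
import math

def hB(p):
    """Binary entropy in bits for p in (0, 1)."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Numerically maximize hB(p) / (1 + p) over a fine grid.
best_p = max((k / 100000 for k in range(1, 100000)),
             key=lambda p: hB(p) / (1 + p))
best_val = hB(best_p) / (1 + best_p)

p_star = (3 - math.sqrt(5)) / 2                  # ~0.38197
phi = (1 + math.sqrt(5)) / 2                     # Golden Ratio

assert abs(best_p - p_star) < 1e-3
assert abs(best_val - math.log2(phi)) < 1e-6     # max value is log2 of phi
print(best_p, best_val)
```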

(g) Note that the Markov chain of part (e) does not allow consecutive ones. Consider any allowable sequence of symbols of length $t$. If the first symbol is 1, then the next symbol must be 0, and the remaining $t-2$ symbols can form any allowable sequence, giving $N(t-2)$ possibilities. If the first symbol is 0, then the remaining $t-1$ symbols can form any allowable sequence, giving $N(t-1)$ possibilities. So we have:

$$N(t) = N(t-1) + N(t-2), \qquad N(1) = 2, \quad N(2) = 3.$$

Note that one can find the closed form of $N(t)$ and then compute $\lim_{t\to\infty} \frac{1}{t}\log N(t)$ from it. However, a simpler way to find this limit is the following. Assume the limit exists and equals some constant $\lambda_0$. Then, as $t$ grows, $\log N(t)$ behaves like $\lambda_0 t$, i.e., $N(t) \approx 2^{\lambda_0 t} = \lambda_1^t$, where $\lambda_1 = 2^{\lambda_0}$.
Plugging this into the recursive equation, we get

$$\lambda_1^t = \lambda_1^{t-1} + \lambda_1^{t-2},$$

or $\lambda_1^2 - \lambda_1 - 1 = 0$. Solving for $\lambda_1$, we get $\lambda_1 = (1+\sqrt{5})/2$. Therefore

$$\lim_{t\to\infty} \frac{1}{t}\log N(t) = \lambda_0 = \log \lambda_1 = \log \frac{1+\sqrt{5}}{2} \approx 0.694 \text{ bits.}$$

Since there are only $N(t)$ possible outcomes for $X_1, X_2, \dots, X_t$, an upper bound on $H(X_1, X_2, \dots, X_t)$ is $\log N(t)$, so the entropy rate of the Markov chain of part (e) is at most $\lambda_0 \approx 0.694$ bits. And we saw in part (f) that this bound is achievable.

Remark: If you are interested in finding the exact closed form of $N(t)$, you can use a standard result: the solution of any linear recursive equation $x(n) = \sum_{i=1}^{k} a_i x(n-i)$ can be expressed as $x(n) = \sum_{i=1}^{k} b_i z_i^n$, where the $z_i$'s are the non-zero roots of $Z^k = \sum_{i=1}^{k} a_i Z^{k-i}$.

Therefore, $N(t)$ can be written as $N(t) = u\alpha^t + v\beta^t$, where $\alpha$ and $\beta$ are the roots of $Z^t = Z^{t-1} + Z^{t-2}$, or equivalently $Z^2 = Z + 1$. Solving this equation, we get $\alpha = (1+\sqrt{5})/2$ and $\beta = (1-\sqrt{5})/2$. To find $u$ and $v$ we can use the initial conditions of the recursive equation:

$$2 = N(1) = u\alpha + v\beta$$
$$3 = N(2) = u\alpha^2 + v\beta^2.$$

Solving this system of linear equations for $u$ and $v$, we get

$$u = \frac{5 + 3\sqrt{5}}{10} = \frac{1}{\sqrt{5}}\alpha^2, \qquad v = \frac{5 - 3\sqrt{5}}{10} = -\frac{1}{\sqrt{5}}\beta^2.$$

Therefore,

$$N(t) = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^{t+2} - \left(\frac{1-\sqrt{5}}{2}\right)^{t+2}\right].$$

Note that the second term vanishes as $t \to \infty$. Hence,

$$\lim_{t\to\infty} \frac{1}{t}\log N(t) = \lim_{t\to\infty} \frac{-\frac{1}{2}\log 5 + (t+2)\log \frac{1+\sqrt{5}}{2}}{t} = \log \frac{1+\sqrt{5}}{2}.$$
