Huffman Coding
Vasiliki Velona
November 2014
Outline
1 Introduction
3 Huffman Encoding
The Algorithm
An example
The algorithm’s Complexity and Optimality
Closure
The problem
Fixed-length Codes
Variable-length Codes
Our aim is to reduce the rate $\bar{L} = L/n$ of encoded bits per original source symbol.
Prefix-free Codes
The expected value of $\bar{L}$ for a given code, with codeword lengths $l(a_j)$ and symbol probabilities $p_X(a_j)$, is:
$\bar{L} = E[L] = \sum_{j=1}^{M} l(a_j)\, p_X(a_j)$
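For instance (an illustrative example, not from the original slides): take $M = 3$ with probabilities $0.5, 0.25, 0.25$ and codeword lengths $1, 2, 2$. Then:
$\bar{L} = 0.5 \cdot 1 + 0.25 \cdot 2 + 0.25 \cdot 2 = 1.5$ bits per source symbol.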
Kraft’s inequality
A prefix code with codeword lengths $l_1, l_2, \ldots, l_M$ exists if and only if:
$\sum_{i=1}^{M} 2^{-l_i} \le 1$
Proof:
Each codeword of length $l_i$ occupies $2^{l_{max}-l_i}$ leaves at depth $l_{max}$ of the full binary tree, and for a prefix-free code these leaf sets are disjoint. Hence:
$\sum_{i=1}^{M} 2^{l_{max}-l_i} \le 2^{l_{max}} \Rightarrow \sum_{i=1}^{M} 2^{-l_i} \le 1$
For the converse:
Assume that the lengths are sorted in increasing order.
Start with a full binary tree of depth $l_{max}$. Choose a free node at depth $l_i$ for each $l_i$ (removing its descendants) until all codewords are placed.
Note that at each step $i$ there are still free leaves at the maximum depth $l_{max}$:
The number of remaining leaves is (using Kraft's inequality):
$2^{l_{max}} - \sum_{j=1}^{i-1} 2^{l_{max}-l_j} = 2^{l_{max}}\left(1 - \sum_{j=1}^{i-1} 2^{-l_j}\right) > 2^{l_{max}}\left(1 - \sum_{j=1}^{M} 2^{-l_j}\right) \ge 0$
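The constructive direction translates directly into code. A minimal Python sketch (the function names are my own, not from the slides) that checks Kraft's inequality and builds a prefix code from the lengths, mirroring the tree argument:

def kraft_ok(lengths):
    # Kraft's inequality: the lengths are feasible iff sum 2^(-l_i) <= 1
    return sum(2 ** -l for l in lengths) <= 1

def prefix_code_from_lengths(lengths):
    # Greedy construction from the proof: walk the lengths in increasing
    # order and take the next free node at each depth.
    assert kraft_ok(lengths)
    code, next_node, prev_len = [], 0, 0
    for l in sorted(lengths):
        next_node <<= (l - prev_len)           # descend to depth l
        code.append(format(next_node, f"0{l}b"))
        next_node += 1                         # next free node at this depth
        prev_len = l
    return code

print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']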
Information and Entropy
$H[X] = -\sum_j p_j \log p_j$
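As a quick numerical check (a sketch of my own, not part of the slides), entropy in Python:

from math import log2

def entropy(probs):
    # H[X] = -sum_j p_j log2 p_j; terms with p_j = 0 contribute nothing
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))  # 1.5 bits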
We'll prove that if $\bar{L}_{min}$ is the minimum expected length over all prefix-free codes for X, then:
$H[X] \le \bar{L}_{min} < H[X] + 1$
Proof:
(First inequality)
$H[X] - \bar{L} = \sum_{j=1}^{M} p_j \log \frac{1}{p_j} - \sum_{j=1}^{M} p_j l_j = \sum_{j=1}^{M} p_j \log \frac{2^{-l_j}}{p_j}$
Thus, $H[X] - \bar{L} \le (\log e) \sum_{j=1}^{M} p_j \left( \frac{2^{-l_j}}{p_j} - 1 \right) = (\log e) \left( \sum_{j=1}^{M} 2^{-l_j} - \sum_{j=1}^{M} p_j \right) \le 0$,
where the inequality $\ln x \le x - 1$, the Kraft inequality, and $\sum_j p_j = 1$ have been used.
(Second Inequality) We need to prove that there exists a prefix-free code such that $\bar{L} < H[X] + 1$. It suffices to choose $l_j = \lceil -\log p_j \rceil$. Then $-\log p_j \le l_j < -\log p_j + 1$; the left part is equivalent to $2^{-l_j} \le p_j$, thus $\sum_j 2^{-l_j} \le \sum_j p_j = 1$ and the Kraft inequality is satisfied. Taking expectations over the right part, $\bar{L} = \sum_j p_j l_j < \sum_j p_j (-\log p_j + 1) = H[X] + 1$.
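The choice $l_j = \lceil -\log p_j \rceil$ (the Shannon code lengths) is easy to verify numerically; a small Python sketch with an illustrative distribution of my own choosing:

from math import ceil, log2

def shannon_lengths(probs):
    # l_j = ceil(-log2 p_j), as chosen in the proof above
    return [ceil(-log2(p)) for p in probs]

probs = [0.4, 0.3, 0.2, 0.1]
lengths = shannon_lengths(probs)                    # [2, 2, 3, 4]
kraft = sum(2 ** -l for l in lengths)               # 0.6875 <= 1: Kraft holds
L_bar = sum(p * l for p, l in zip(probs, lengths))  # 2.4
H = -sum(p * log2(p) for p in probs)                # about 1.846
print(L_bar < H + 1)                                # True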
Algorithm Revisited
For i = 1 to n − 1 do
  Merge the last two subtrees;
  Rearrange subtrees in nonincreasing order of root probability;
End for
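A runnable counterpart to this pseudocode (a minimal Python sketch, not the author's code): a min-heap replaces the explicit re-sorting, since popping twice always yields the two least-probable subtrees.

import heapq
from itertools import count

def huffman_code(probs):
    # Build a Huffman code for a dict {symbol: probability}.
    tie = count()  # tie-breaker so equal probabilities never compare lists
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)  # least probable subtree
        p2, _, syms2 = heapq.heappop(heap)  # second least probable
        for s in syms1:
            code[s] = "0" + code[s]         # prepend the branch bit
        for s in syms2:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p1 + p2, next(tie), syms1 + syms2))
    return code

print(huffman_code({"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.15}))
# possible output: {'a': '0', 'b': '10', 'c': '110', 'd': '111'}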
General Comments
Thank you!