

Huffman Coding and Entropy Bounds

Vasiliki Velona

November, 2014




Outline

1 Introduction

2 Codes, Compression, Entropy
  Codes and Compression
  Information and Entropy

3 Huffman Encoding
  The Algorithm
  An example
  The algorithm's Complexity and Optimality
  Closure


The problem

Transform a symbol string into a binary symbol string in the most economical way.


Fixed-length Codes

Each symbol from the alphabet X is mapped into a codeword C(x), and all the codewords are of the same length L.

For example, if X = {a, b, c, d, e} then we could use L = 3:
C(a) = 000
C(b) = 001
C(c) = 010
C(d) = 011
C(e) = 100

There are 2^L different L-tuples, thus for an alphabet of size M we need L = ⌈log₂ M⌉ bits.
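
A minimal sketch of such a fixed-length assignment in Python (the function name is ours, not from the slides):

    import math

    def fixed_length_code(alphabet):
        """Assign each symbol a binary codeword of length ceil(log2(M))."""
        L = math.ceil(math.log2(len(alphabet)))
        return {sym: format(i, f"0{L}b") for i, sym in enumerate(alphabet)}

    print(fixed_length_code("abcde"))
    # {'a': '000', 'b': '001', 'c': '010', 'd': '011', 'e': '100'}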


Variable-length Codes

Our aim is to reduce the rate L̄ = L/n of encoded bits per original source symbol.

The idea is to map more probable symbols into shorter bit sequences, and less likely symbols into longer bit sequences.
We need unique decodability.
Example: If X = {a, b, c} and
C(a) = 0
C(b) = 1
C(c) = 01
then 01 decodes both as ab and as c, so this code is not uniquely decodable.

Solution: Prefix-free Codes
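
A quick check of the ambiguity (the code table from the slide; the helper name is ours):

    code = {"a": "0", "b": "1", "c": "01"}

    def encode(s, code):
        return "".join(code[ch] for ch in s)

    print(encode("ab", code))  # 01
    print(encode("c", code))   # 01 -- the same bit string, so decoding is ambiguous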


Prefix-free Codes

A code is prefix-free (or just a prefix code) if no codeword is a prefix of any other codeword. For example, {0, 10, 11} is prefix-free, but the code {0, 1, 01} is not.

Every prefix-free code is uniquely decodable. Why? Every prefix-free code corresponds to a binary code tree in which each node is either a codeword or a proper prefix of a codeword; a decoder walks the tree from the root and emits a symbol each time it reaches a codeword, so no lookahead is needed.

Note: The converse is not true; there are uniquely decodable codes that are not prefix-free.
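
A minimal decoder sketch along these lines (the function name is ours); for a prefix-free code, at most one codeword can match at each position:

    def decode(bits, code):
        """Decode a bit string with a prefix-free code by emitting a symbol
        whenever the buffered bits form a codeword."""
        inverse = {w: sym for sym, w in code.items()}
        out, buf = [], ""
        for b in bits:
            buf += b
            if buf in inverse:  # prefix-freeness: this match is unambiguous
                out.append(inverse[buf])
                buf = ""
        if buf:
            raise ValueError("leftover bits do not form a codeword")
        return "".join(out)

    print(decode("0101110", {"a": "0", "b": "10", "c": "11"}))  # abcb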


Optimum Source Coding problem

Suppose that X = {a_1, a_2, ..., a_M}, with probabilities {p(a_1), p(a_2), ..., p(a_M)} and lengths {l(a_1), l(a_2), ..., l(a_M)} respectively, where the lengths correspond to a prefix-free code.

Then the expected value of L̄ for the given code is given by:

L̄ = E[L] = Σ_{j=1}^{M} l(a_j) p_X(a_j)

and we want to minimize this quantity.


Kraft's inequality
A prefix-free code with codeword lengths l_1, l_2, ..., l_M exists if and only if:

Σ_{i=1}^{M} 2^{−l_i} ≤ 1

Proof:
(⇒) Embed the codewords in a full binary tree of depth l_max. The codeword of length l_i has 2^{l_max − l_i} leaf descendants, and these sets of descendants are disjoint because no codeword is a prefix of another. Hence

Σ_{i=1}^{M} 2^{l_max − l_i} ≤ 2^{l_max}  ⇒  Σ_{i=1}^{M} 2^{−l_i} ≤ 1

For the converse:
Assume that the lengths are sorted in increasing order.
Start with a full binary tree of depth l_max. Choose a free node at depth l_i for each l_i, removing its descendants, until all codewords are placed.
Note that at each step i there are still free leaves at the maximum depth l_max: the number of the remaining leaves is (using Kraft's inequality):

2^{l_max} − Σ_{j=1}^{i−1} 2^{l_max − l_j} = 2^{l_max} (1 − Σ_{j=1}^{i−1} 2^{−l_j}) > 2^{l_max} (1 − Σ_{j=1}^{M} 2^{−l_j}) ≥ 0
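
A sketch of the converse construction (the function name is ours): given lengths that satisfy Kraft's inequality, assign codewords greedily in order of increasing length:

    def kraft_codewords(lengths):
        """Build prefix-free codewords for lengths with sum(2^-l) <= 1.
        Tracks the next free node at the current depth as a binary counter."""
        assert sum(2 ** -l for l in lengths) <= 1, "Kraft's inequality violated"
        words, next_code, prev_len = [], 0, 0
        for l in sorted(lengths):
            next_code <<= (l - prev_len)  # descend to depth l
            words.append(format(next_code, f"0{l}b"))
            next_code += 1                # skip past this codeword's subtree
            prev_len = l
        return words

    print(kraft_codewords([1, 2, 3, 3]))  # ['0', '10', '110', '111']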

Entropy, Lower and Upper Bounds


Entropy Definition:

H[X] = − Σ_j p_j log p_j

(logarithms are base 2 throughout)

We'll prove that if L̄_min is the minimum expected length over all prefix-free codes for X then:

H[X] ≤ L̄_min ≤ H[X] + 1 bit per symbol


Entropy, Lower and Upper Bounds, cont.

Proof:
(First inequality)

H[X] − L̄ = Σ_{j=1}^{M} p_j log(1/p_j) − Σ_{j=1}^{M} p_j l_j = Σ_{j=1}^{M} p_j log(2^{−l_j}/p_j)

Thus, H[X] − L̄ ≤ (log e) Σ_{j=1}^{M} p_j (2^{−l_j}/p_j − 1) = (log e) (Σ_{j=1}^{M} 2^{−l_j} − Σ_{j=1}^{M} p_j) ≤ 0

where the inequality ln x ≤ x − 1, the Kraft inequality, and Σ_j p_j = 1 have been used.

(Second inequality) We need to prove that there exists a prefix-free code such that L̄ < H[X] + 1. It suffices to choose l_j = ⌈− log p_j⌉. Then − log p_j ≤ l_j < − log p_j + 1. The left part is equivalent to 2^{−l_j} ≤ p_j, thus Σ_j 2^{−l_j} ≤ Σ_j p_j = 1 and the Kraft inequality is satisfied, so such a prefix-free code exists. The right part then gives L̄ = Σ_j p_j l_j < Σ_j p_j (− log p_j + 1) = H[X] + 1.
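
A quick numeric check of the bounds with the Shannon lengths l_j = ⌈− log₂ p_j⌉ (the probabilities here are a hypothetical example):

    import math

    p = [0.4, 0.3, 0.2, 0.1]                # hypothetical source distribution
    H = -sum(q * math.log2(q) for q in p)   # entropy in bits
    lengths = [math.ceil(-math.log2(q)) for q in p]  # Shannon code lengths
    Lbar = sum(q * l for q, l in zip(p, lengths))    # expected length

    assert sum(2 ** -l for l in lengths) <= 1  # Kraft holds by construction
    print(H, Lbar)                             # H <= Lbar < H + 1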


Huffman Encoding Algorithm

1 Pick two letters x, y from alphabet A with the smallest frequencies and create a subtree that has these two characters as leaves. Label the root of this subtree z.

2 Set frequency f(z) = f(x) + f(y). Remove x, y and add z, creating the new alphabet A′ = A ∪ {z} − {x, y}. Then |A′| = |A| − 1.

3 Repeat this procedure with the new alphabet A′ until only one symbol is left; reading the codewords off the resulting tree gives the Huffman code (see the sketch below).
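
A compact sketch of the procedure in Python (names are ours); the merged tree is then walked to read off the codewords:

    import heapq

    def huffman(freq):
        """freq: dict symbol -> frequency. Returns dict symbol -> codeword."""
        # Heap entries are (frequency, tiebreak, tree); a tree is a symbol
        # (leaf) or a pair of trees (internal node).
        heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            fx, _, x = heapq.heappop(heap)  # the two smallest frequencies
            fy, _, y = heapq.heappop(heap)
            heapq.heappush(heap, (fx + fy, count, (x, y)))  # merged symbol z
            count += 1
        code = {}
        def walk(tree, word):
            if isinstance(tree, tuple):
                walk(tree[0], word + "0")
                walk(tree[1], word + "1")
            else:
                code[tree] = word or "0"  # degenerate one-symbol alphabet
        walk(heap[0][2], "")
        return code

    print(huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))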

An example

[The slides in this section stepped through the construction of a Huffman tree in a sequence of figures; the figures are not recoverable from the text.]
Algorithm Revisited

For i = 1 TO n − 1 do
    Merge the last two subtrees (those with the smallest root probabilities);
    Rearrange subtrees in nonincreasing order of root probability
End for

Complexity: O(n log n) if a heap is used; each of the n − 1 iterations costs O(log n) for the extract-min and insert operations.
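
As an aside (not on the slides): if the probabilities are already sorted, two FIFO queues replace the heap and the whole run is O(n), because merged weights are produced in nondecreasing order. A sketch:

    from collections import deque

    def huffman_lengths_sorted(probs):
        """probs sorted in increasing order. Returns the expected codeword
        length E[L] in O(n), as the sum of all merge weights."""
        leaves = deque(probs)   # original symbols, already sorted
        merged = deque()        # merged weights, produced in sorted order
        cost = 0.0
        def pop_min():
            if not merged or (leaves and leaves[0] <= merged[0]):
                return leaves.popleft()
            return merged.popleft()
        while len(leaves) + len(merged) > 1:
            z = pop_min() + pop_min()  # merge the two smallest weights
            cost += z                  # each merge adds its weight to E[L]
            merged.append(z)
        return cost

    print(huffman_lengths_sorted([0.1, 0.2, 0.3, 0.4]))  # ≈ 1.9 = E[L]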


Huffman Coding is Optimal

1 Optimal prefix-free codes have the property that the associated code tree is full (every intermediate node has two children; otherwise some codeword could be shortened).

2 Optimal prefix-free codes have the property that, for each of the longest codewords in the code, the sibling of that codeword is another longest codeword.

3 Assuming p(a_1) ≥ p(a_2) ≥ ... ≥ p(a_M), there is an optimal prefix-free code for X in which the codewords for a_{M−1} and a_M are siblings and have maximal length within the code.

4 An optimal code for the reduced alphabet X′ = X − {a_{M−1}, a_M} ∪ {z} yields an optimal code for X. (Note that L̄ = L̄′ + p_{M−1} + p_M, since replacing the leaf z by a parent of a_{M−1} and a_M lengthens exactly those two codewords by one bit.)
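
Writing out the identity in step 4 (l′(z) is the length of z's codeword in the reduced code; in the code for X, a_{M−1} and a_M hang one level below z's position while all other codewords coincide):

L̄ = Σ_{j=1}^{M−2} p_j l(a_j) + (p_{M−1} + p_M)(l′(z) + 1) = L̄′ + p_{M−1} + p_M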


General Comments

Huffman coding is useful for finding an optimal code, while the entropy bounds provide insightful performance bounds.

The expected length of a Huffman code is generally close to the entropy.

By coding in large k-blocks (using X^k as the source alphabet) we can find codings that approach the entropy lower bound as closely as we want (for large k). This is not practical though, due to the size of |X|^k.
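
A numeric illustration of the block-coding effect, reusing the huffman() sketch above (the two-symbol distribution is a hypothetical example):

    import itertools, math

    p = {"a": 0.9, "b": 0.1}                        # a skewed source
    H = -sum(q * math.log2(q) for q in p.values())  # ≈ 0.469 bits/symbol

    for k in (1, 2, 3):
        # product distribution over blocks of k i.i.d. source symbols
        blocks = {"".join(t): math.prod(p[s] for s in t)
                  for t in itertools.product(p, repeat=k)}
        code = huffman(blocks)
        rate = sum(blocks[w] * len(code[w]) for w in blocks) / k
        print(k, rate)  # bits per source symbol; approaches H as k grows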


Sources used

1 Robert Gallager, course materials for 6.450 Principles of Digital Communications I, Fall 2006. MIT OpenCourseWare (http://ocw.mit.edu/)
2 Notes from 2005 Design and Analysis of Algorithms, Hong Kong University
3 Stathis Zachos, 2014, NTUA
4 Anadolu University, notes from 2010 Algorithm Analysis and Complexity
5 Linköping University, 2008 Data Compression notes


Thank you!
