
ECE F344

Information Theory and Coding

Instructor-in-Charge: Dr. Amit Ranjan Azad


Email: [email protected]

Birla Institute of Technology and Science Pilani, Hyderabad Campus


Department of Electrical and Electronics Engineering

Lecture - 6
Module - 1
Information and Source Coding

Outline
• Shannon-Fano-Elias Coding
• Problem With Prefix Codes
• Arithmetic Coding
• The Lempel-Ziv Algorithm
• Run Length Encoding (RLE)

Shannon-Fano-Elias Coding
• Codes that use the codeword lengths $l(x) = \left\lceil \log \frac{1}{P(x)} \right\rceil$ are called Shannon Codes.
• Shannon codeword lengths satisfy the Kraft Inequality and can therefore be used to construct a
uniquely decodable code.
• We will discuss another simple method for constructing uniquely decodable codes, based on the Shannon-Fano-Elias encoding technique.
• It uses the Cumulative Distribution Function to allocate the codewords.
• The cumulative distribution function is defined as
$$F(x) = \sum_{z \le x} P(z)$$
where $P(z)$ is the probability of occurrence of the symbol $z$.
• The cumulative distribution function is a staircase function with a step of size P(x) at each symbol x.

Shannon-Fano-Elias Coding
• Let us define the modified cumulative distribution function as
$$\bar{F}(x) = \sum_{z < x} P(z) + \frac{1}{2} P(x)$$
where F̅(x) represents the sum of the probabilities of all symbols less than x plus half of the probability of the symbol x.
• The value of the function F̅(x) is the midpoint of the step corresponding to x of the cumulative
distribution function.
• Since probabilities are positive, F(x) ≠ F(y) if x ≠ y.
• Thus, it is possible to determine x given F̅(x) merely by looking at the graph of the cumulative
distribution function. Therefore, the value of F̅(x) can be used to code x.
• In general, F̅(x) is a real number.
• This means we require an infinite number of bits to represent F̅(x), which would lead to an
inefficient code.
• Suppose we round off F̅(x) and use only the first l(x) bits, denoted by $\lfloor \bar{F}(x) \rfloor_{l(x)}$.
Shannon-Fano-Elias Coding
• By the definition of rounding off, we have
$$\bar{F}(x) - \lfloor \bar{F}(x) \rfloor_{l(x)} < \frac{1}{2^{l(x)}}$$
• If $l(x) = \left\lceil \log \frac{1}{P(x)} \right\rceil + 1$, then
$$\frac{1}{2^{l(x)}} < \frac{P(x)}{2} = \bar{F}(x) - F(x-1)$$
• This implies that $\lfloor \bar{F}(x) \rfloor_{l(x)}$ lies within the step corresponding to x, and l(x) bits are sufficient to describe x.
• The interval corresponding to any codeword is of length $2^{-l(x)}$. We see that this interval is less than half the height of the step corresponding to x.
• Since we use $l(x) = \left\lceil \log \frac{1}{P(x)} \right\rceil + 1$ bits to represent x, the expected length of this code is
$$R = \sum_x P(x)\, l(x) = \sum_x P(x) \left( \left\lceil \log \frac{1}{P(x)} \right\rceil + 1 \right) < H(X) + 2$$
• The Shannon-Fano-Elias coding scheme thus achieves an expected codeword length within two bits of the entropy.
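As an illustration of the construction above, here is a minimal Python sketch that computes Shannon-Fano-Elias codewords from a table of symbol probabilities. The function name and the dictionary-based interface are illustrative choices, not part of the lecture.

```python
import math

def sfe_codewords(probabilities):
    """Shannon-Fano-Elias codewords for an ordered {symbol: probability} table.

    F-bar(x) is the sum of the probabilities of all earlier symbols plus half
    of P(x); the codeword is F-bar(x) truncated to l(x) = ceil(log2(1/P(x))) + 1 bits.
    A minimal sketch using plain floating-point arithmetic.
    """
    codewords = {}
    cumulative = 0.0
    for symbol, p in probabilities.items():
        f_bar = cumulative + p / 2                   # midpoint of the step for x
        length = math.ceil(math.log2(1.0 / p)) + 1   # l(x)
        bits, frac = [], f_bar
        for _ in range(length):                      # binary expansion of F-bar(x),
            frac *= 2                                # truncated to l(x) bits
            bit = int(frac)
            bits.append(str(bit))
            frac -= bit
        codewords[symbol] = "".join(bits)
        cumulative += p
    return codewords

# Fed the dyadic distribution of Example 1 (next slide), this reproduces
# the codewords 01, 101, 1101 and 1111.
print(sfe_codewords({"x1": 1/2, "x2": 1/4, "x3": 1/8, "x4": 1/8}))
```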

Shannon-Fano-Elias Coding (Example - 1)
• Consider the dyadic distribution given in the following table.

 1 
Symbol Probability F(x) F̅(x) F̅(x) (binary) l  x   log  1 Codeword
 P  x 

x1 1/2 0.5 0.25 0.01 2 01


x2 1/22 0.75 0.625 0.101 3 101
x3 1/23 0.875 0.8125 0.1101 4 1101
x4 1/23 1 0.9375 0.1111 4 1111

• The entropy of this distribution is 1.75 bits. However, the average codeword length for the
Shannon-Fano-Elias coding scheme is 2.75 bits.
• It is easy to observe that if the last bit of every codeword is deleted, we get the optimal code (the Huffman code).
• It is worthwhile to note that, unlike in the Huffman coding procedure, here we do not have to arrange the probabilities in decreasing order first.
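As a quick numerical check of the two figures quoted above (1.75 bits of entropy versus 2.75 bits of average codeword length), here is a small Python snippet; the variable names are only illustrative.

```python
import math

probabilities = {"x1": 1/2, "x2": 1/4, "x3": 1/8, "x4": 1/8}
lengths = {"x1": 2, "x2": 3, "x3": 4, "x4": 4}   # l(x) values from the table above

entropy = sum(p * math.log2(1 / p) for p in probabilities.values())
average_length = sum(probabilities[s] * lengths[s] for s in probabilities)
print(entropy, average_length)   # 1.75 2.75
```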

Shannon-Fano-Elias Coding (Example - 2)
• Let us shuffle the probabilities and redo the exercise.

 1 
Symbol Probability F(x) F̅(x) F̅(x) (binary) l  x   log  1 Codeword
 P  x 

x1 1/22 0.25 0.125 0.001 3 001


x2 1/2 0.75 0.5 0.10 2 10
x3 1/23 0.875 0.8125 0.1101 4 1101
x4 1/23 1 0.9375 0.1111 4 1111

• We observe that the codewords obtained from the Shannon-Fano-Elias coding procedure are not unique. The average codeword length is again 2.75 bits.
• However, this time we cannot get the optimal code simply by deleting the last bit from every codeword. If we do so, the code no longer remains a prefix code (a quick check is sketched below).
• The basic concept of Shannon-Fano-Elias coding is used in a computationally efficient algorithm
for encoding and decoding called Arithmetic Coding.
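A quick way to verify the prefix-code observation made for Example 2 is the following Python sketch. The truncated codeword lists are taken directly from Examples 1 and 2; the checking function itself is a standard construction, not from the slides.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

# Example 1 codewords with the last bit deleted: still a prefix code (the Huffman code)
print(is_prefix_free(["0", "10", "110", "111"]))   # True
# Example 2 codewords with the last bit deleted: '1' is a prefix of '110' and '111'
print(is_prefix_free(["00", "1", "110", "111"]))   # False
```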

Shannon-Fano-Elias Coding (Example - 3)

 1 
Symbol Probability F(x) F̅(x) F̅(x) (binary) l  x   log  1 Codeword
 P  x 

x1 0.25 0.25 0.125 0.001 3 001


x2 0.25 0.5 0.375 0.011 3 011
x3 0.2 0.7 0.600 0.10011 4 1001
x4 0.15 0.85 0.775 0.1100011 4 1100
x5 0.15 1 0.925 0.1110110 4 1110

Note:
• The binary expansions of F̅(x) for x3, x4 and x5 do not terminate; the trailing bits keep on recurring, and each codeword is obtained by truncating the expansion to l(x) bits.

Problem With Prefix Codes
• If we consider the prefix codes being generated using a binary tree, the decisions between tree
branches always take one bit.
• The Huffman Code also needs one bit for each decision.
• Huffman Codes achieve the entropy bound exactly (zero redundancy) only if the probabilities of the symbols are negative powers of two.
• Arithmetic Coding does not have this restriction.
• It works by representing the file to be encoded by an interval of real numbers between 0 and 1.
• Successive symbols in the message reduce this interval according to the probability of that symbol.

Arithmetic Coding
• Let our alphabet consist of only three symbols A, B and C, with probabilities of occurrence P(A) = 0.5, P(B) = 0.25 and P(C) = 0.25.
• We first divide the interval [0, 1) into three intervals proportional to their probabilities.
• Thus, the variable A corresponds to [0, 0.5), the variable B corresponds to [0.5, 0.75) and the
variable C corresponds to [0.75, 1.0).
• Note that the lengths of these intervals are proportional to their probabilities.
• Next, suppose the input symbol stream is B A C A ...
• We first encode B. This is simply choosing the corresponding interval, i.e., [0.5, 0.75).
• Now, this interval is again subdivided into three intervals, proportional to the probabilities of
occurrence.
• So, for the second step, the variable A corresponds to [0.5, 0.625), the variable B corresponds to
[0.625, 0.6875) and the variable C corresponds to [0.6875, 0.75).
• Since the next symbol to arrive after B is A, we choose the interval corresponding to A, which is
[0.5, 0.625).
• This is again subdivided to yield the interval [0.5, 0.5625) for A, the interval [0.5625, 0.59375) for
B and the interval [0.59375, 0.625) for C.

Arithmetic Coding
• Now we look at the next symbol to encode, which is C. This corresponds to the interval [0.59375,
0.625).
• Continuing this process, after encoding A, we are left with the interval [0.59375, 0.609375).
• The arithmetic code for B A C A is any number that lies within this interval.
• To complete this example, we can say that the arithmetic code for the sequence B A C A is 0.59375.
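The interval-narrowing steps above can be written down compactly. The following Python sketch is a simplified floating-point version of the encoder (a practical arithmetic coder works with integer arithmetic and incremental bit output); the `model` dictionary and the function name are illustrative assumptions.

```python
def arithmetic_encode(message, model):
    """Narrow [0, 1) symbol by symbol; `model` maps each symbol to its
    (low, high) sub-interval of [0, 1), proportional to its probability."""
    low, high = 0.0, 1.0
    for symbol in message:
        span = high - low
        sym_low, sym_high = model[symbol]
        low, high = low + span * sym_low, low + span * sym_high
    return low, high   # any number in [low, high) is a valid code for the message

model = {"A": (0.0, 0.5), "B": (0.5, 0.75), "C": (0.75, 1.0)}
print(arithmetic_encode("BACA", model))   # (0.59375, 0.609375), as in the example
```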

Arithmetic Coding
(Figure: graphical illustration of the successive interval subdivisions for this example.)
Arithmetic Coding
• Next consider the decoding at the receiver. The receiver needs to know, a priori, the probabilities of A, B and C.
• So, it will also have an identical number line, partitioned into three segments, proportional to the
probabilities of A, B and C.
• Let us say that the receiver receives 0.59375.
• First it checks where this number lies. Clearly, 0.5 < 0.59375 < 0.75, which is the segment
corresponding to B. So the 1st decoded symbol is B.
• Now we split the segment corresponding to B into three sub-segments proportional to the
probabilities of A, B and C (exactly as we did at the encoder side).
• Again we map the received number 0.59375, and find that it lies in the region of A.
• The decoding is instantaneous.
• Mechanically, we proceed to decode the next symbol, and so on.
• But how will the receiver know when to stop?
• Therefore, we need a stopping criterion or a pre-decided protocol, such as an agreed message length or a special end-of-message symbol.
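A matching decoder sketch, again in simplified floating-point form and assuming the agreed stopping criterion is a known message length (the names are illustrative):

```python
def arithmetic_decode(value, model, n_symbols):
    """Repeatedly locate the sub-interval of the current interval that
    contains `value`; stop after the agreed number of symbols."""
    low, high = 0.0, 1.0
    decoded = []
    for _ in range(n_symbols):
        span = high - low
        for symbol, (s_low, s_high) in model.items():
            if low + span * s_low <= value < low + span * s_high:
                decoded.append(symbol)
                low, high = low + span * s_low, low + span * s_high
                break
    return "".join(decoded)

model = {"A": (0.0, 0.5), "B": (0.5, 0.75), "C": (0.75, 1.0)}
print(arithmetic_decode(0.59375, model, 4))   # BACA
```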

The Lempel-Ziv Algorithm
• Huffman coding requires symbol probabilities.
• But most real-life scenarios do not provide the symbol probabilities in advance (i.e., the statistics of the source are unknown).
• In principle, it is possible to observe the output of the source for a long enough time period and estimate the symbol probabilities. However, this is impractical for real-time applications.
• Also, Huffman coding is optimal for a discrete memoryless source (DMS), where the occurrence of one symbol does not alter the probabilities of the subsequent symbols.
• Huffman coding is not the best choice for a source with memory.
• For example, consider the problem of compression of written text.
• We know that many letters occur in pairs or groups, like ‘q-u’, ‘t-h’, ‘i-n-g’, etc.
• It might be more efficient to use the statistical inter-dependence of the letters in the alphabet along
with their individual probabilities of occurrence.
• Such a scheme was proposed by Lempel and Ziv in 1977. Their source coding algorithm does not
need the source statistics.
• It is a variable-to-fixed length source coding algorithm and belongs to the class of Universal Source
Coding algorithms.

The Lempel-Ziv Algorithm
• The logic behind Lempel-Ziv Universal Coding is as follows.
• The compression of an arbitrary sequence of bits is possible by coding a series of 0’s and 1’s as
some previous such string (the prefix string) plus one new bit.
• Then, the new string formed by adding the new bit to the previously used prefix string becomes a
potential prefix string for future strings.
• These variable length blocks are called phrases.
• The phrases are listed in a dictionary which stores the existing phrases and their locations.
• In encoding a new phrase, we specify the location of the existing phrase in the dictionary and
append the new letter.
• We can derive a better understanding of how the Lempel-Ziv algorithm works by the following
example.

The Lempel-Ziv Algorithm (Example)
• Suppose we wish to code the string:
101011011010101011
• We will begin by parsing it into comma-separated phrases, each of which consists of a previously seen phrase (the prefix) plus one new bit.
• The first bit, a 1, has no predecessors, so it has a null prefix string and the one extra bit is itself.
1, 01011011010101011
• The same goes for the 0 that follows since it can’t be expressed in terms of the only existing prefix:
1, 0, 1011011010101011
• So far our dictionary contains the strings ‘1’ and ‘0’.
• Next we encounter a 1, but it already exists in our dictionary. Hence we proceed further.
• The following 10 is obviously a combination of the prefix 1 and a 0, so we now have:
1, 0, 10, 11011010101011
• Continuing in this way, we eventually parse the whole string as follows:
1, 0, 10, 11, 01, 101, 010, 1011

The Lempel-Ziv Algorithm (Example)
• Now, since we found 8 phrases, we will use a three-bit index: 000 labels the null (empty) phrase and 001 through 111 label the first seven phrases. The eighth and last phrase does not need an index of its own, since it is never used as the prefix of a later phrase.
• Next, we write the string in terms of the prefix phrase plus the new bit needed to create the new
phrase.
• We will use parentheses and commas to separate these at first, in order to aid our visualization of
the process.
• The eight phrases can be described by:
(000,1), (000,0), (001,0), (001,1), (010,1), (011,1), (101,0), (110,1)
• It can be read out as: (phrase at location 0, new bit 1), (phrase at location 0, new bit 0), (phrase at location 1, new bit 0), (phrase at location 1, new bit 1), (phrase at location 2, new bit 1), (phrase at location 3, new bit 1) ...
• Thus, the coded version of the above string is:
00010000001000110101011110101101
• The dictionary for this example is given in the table (next slide).

The Lempel-Ziv Algorithm (Example)
Table: Dictionary for the Lempel-Ziv algorithm

Dictionary Location | Dictionary Content | Fixed-Length Codeword
001                 | 1                  | 0001
010                 | 0                  | 0000
011                 | 10                 | 0010
100                 | 11                 | 0011
101                 | 01                 | 0101
110                 | 101                | 0111
111                 | 010                | 1010
–                   | 1011               | 1101
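The parsing and encoding steps of this example can be reproduced with a short Python sketch. The function name is illustrative, and the sketch assumes, as in the example, that the input ends exactly on a phrase boundary.

```python
def lempel_ziv_encode(bits):
    """Parse `bits` into phrases (an existing phrase plus one new bit) and
    emit, for each phrase, a fixed-length dictionary location plus the new bit."""
    phrases, locations = [], {}
    current = ""
    for bit in bits:
        current += bit
        if current not in locations:           # a new phrase has been found
            phrases.append(current)
            locations[current] = len(phrases)  # dictionary location (null phrase = 0)
            current = ""
    index_bits = max(1, (len(phrases) - 1).bit_length())   # 3 bits for 8 phrases
    coded = ""
    for phrase in phrases:
        prefix, new_bit = phrase[:-1], phrase[-1]
        coded += format(locations.get(prefix, 0), f"0{index_bits}b") + new_bit
    return phrases, coded

phrases, coded = lempel_ziv_encode("101011011010101011")
print(phrases)   # ['1', '0', '10', '11', '01', '101', '010', '1011']
print(coded)     # 00010000001000110101011110101101
```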

The Lempel-Ziv Algorithm (Length of the Table)
• In this case, we have not obtained any compression; the coded string (32 bits) is actually longer than the original string (18 bits).
• However, the longer the initial string, the more saving we get as we move along, because long prefixes become representable as small numerical indices.
• In fact, Ziv proved that for long documents, the compression of the file approaches the optimum
obtainable as determined by the information content of the document.
• The next question is what should be the length of the table.
• In practical applications, regardless of the length of the table, it will eventually overflow.
• This problem can be solved by pre-deciding a large enough size of the dictionary.
• The encoder and decoder can update their dictionaries by periodically replacing the less frequently used phrases with more frequently used ones.

Run Length Encoding (RLE)
• Run Length Encoding or RLE is a technique used to reduce the size of a repeating string of
characters.
• This repeating string is called a run.
• Typically RLE encodes a run of symbols into two bytes, a count and a symbol.
• RLE can compress any type of data regardless of its information content, but the content of data to
be compressed affects the compression ratio.
• RLE cannot achieve high compression ratios compared to other compression methods, but it is easy
to implement and is quick to execute.
• RLE is supported by most bitmap file formats, such as TIFF, BMP and PCX, and run-length coding is also used within JPEG compression and by fax machines.

RLE (Example)
• Consider the following bit stream:
S = 11111111111111100000000000000000001111
• This can be represented as: fifteen 1’s, nineteen 0’s, four 1’s, i.e., (15, 1), (19, 0), (4, 1).
• Since the maximum number of repetitions is 19, which can be represented with 5 bits, we can
encode the bit stream as (01111, 1), (10011, 0), (00100, 1).
• The compression ratio in this case is 18:38 = 1:2.11.
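This computation can be reproduced with a short Python sketch; the function name and the 5-bit count width are illustrative parameters, not part of the lecture.

```python
def rle_encode(bits, count_width=5):
    """Split a bit string into runs and emit (count, symbol) pairs,
    with each count written on `count_width` bits."""
    pairs = []
    i = 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1                                  # extend the current run
        pairs.append((format(j - i, f"0{count_width}b"), bits[i]))
        i = j
    return pairs

s = "11111111111111100000000000000000001111"
print(rle_encode(s))   # [('01111', '1'), ('10011', '0'), ('00100', '1')]
```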
