School of Computing Science and Engineering
Lecture Notes
on
Information Theory and Coding
July 2020
(Be safe and stay at home)
Course: Information Theory and Coding Course Code: BCSE3087 Dr. Saurabh Kumar Srivastava
17.07.2020 Assignment
Question 1: An analog signal is band-limited to 4 kHz. It is sampled at the Nyquist rate and
the samples are quantized into 4 levels. Q1, Q2, Q3, Q4 are independent messages with
probabilities P1 = P2 = 1/8 and P3 = P4 = 3/8.
Find the information rate.
Question 2: If P1 = P2 = P3 = P4 = 0.1, P5 = P6 = P7 = P8 = 0.05,
P9 = P10 = P11 = P12 = 0.075, P13 = P14 = P15 = P16 = 0.025,
and fm = 3 kHz, find R.
Question 3: If the probability of occurrence of event e1 is ¼ and the probability of
occurrence of another event e2 is ¾,
prove that more uncertainty carries more information.
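As a quick cross-check for Question 1 above, here is a minimal Python sketch of the standard recipe (Nyquist rate r = 2*fm, entropy H from the quantization-level probabilities, information rate R = r*H); the variable names are illustrative only.

from math import log2

fm = 4000                               # bandwidth of the analog signal in Hz
r = 2 * fm                              # Nyquist sampling rate (samples/second)
p = [1/8, 1/8, 3/8, 3/8]                # probabilities of Q1, Q2, Q3, Q4

H = sum(pk * log2(1/pk) for pk in p)    # entropy in bits/sample
R = r * H                               # information rate in bits/second
print(H, R)                             # approx. 1.811 bits/sample, approx. 14490 bits/second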
Information Theory and Coding Techniques (BCSE3087)
Information- (intelligence/ ideas or messages/ symbols/ signals)
Types of information- speech, images, video, audio
Source of information (S)
Communication System – (S), (D), Channel
1-Hartley's Assumptions-
The amount of information contained in two messages should be the sum of the
information contained in the individual messages: I(M) = I(M1) + I(M2).
This means-
The amount of information in a sequence of L messages should be equal to L times the
amount of information in one message: I = L · I(M).
Note: Hartley's assumptions were based on –
1. Each symbol has an equal probability of occurrence.
2. All symbols are independent of each other.
2-Shannon’s Definition- (Popular)
To understand Shannon’s definition let’s take an example-
Surprise/ Uncertainty (associated with the content)
i. Tomorrow the sun will rise in the east.
ii. India wins the cricket match series against Australia by 5-0.
iii. There is snowfall in Mumbai.
First, we identify the probability of occurrence of the event associated with each of
i, ii, and iii.
A probability of occurrence close to 0 indicates high uncertainty or surprise.
Higher surprise/uncertainty indicates higher information.
i. Tomorrow the sun will rise in the east. (probability close to 1, no surprise, least
information)
ii. India wins the cricket match series against Australia by 5-0. (high probability,
little surprise, least information)
iii. There is snowfall in Mumbai. (probability close to 0, high surprise, high information)
According to Shannon-
Information ∝ (directly proportional to) Uncertainty/Surprise
Information is inversely proportional to the probability of occurrence of the event.
P -> 1  =>  I -> 0
P -> 0  =>  I -> infinity (very large)
Let E be an event whose probability of occurrence is P(E).
I(E) = log2(1/P(E))  (bits/message) ----(1)
Note- log2(x) = log10(x) / log10(2) (conversion formula; with log2 the unit is bits)
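A minimal Python sketch of equation (1); the helper name self_information is just an illustrative choice.

from math import log2

def self_information(p):
    # Information (in bits) carried by an event with probability p, as in eq. (1).
    return log2(1 / p)

print(self_information(1/2))   # 1.0 bit  (e.g. a fair coin toss)
print(self_information(1.0))   # 0.0 bits (a certain event carries no information)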
S, D, C (Important factors to convey information)
S- Source
D- Destination
C- Channel
Communication Flow using S, C, and D
Information -> Encoder -> Transmitter -> Channel -> Decoder -> Receiver (D)
(Information, encoder and transmitter form the source side; decoder and receiver form the
destination side.)
Data Compression-
Data compression is the process of modifying, encoding or converting the bit
structure of data in such a way that it consumes less space on disk. It enables
reducing the storage size of one or more data instances or elements. Data
compression is also known as source coding or bit-rate reduction.
Information flows from-
Source -> Destination
Some source is generating symbols/messages –
X = {x0, x1, ..., xn}
P = {p0, p1, ..., pn}
As per probability theory-
Summation of Pk = 1
The information measure associated with symbol xk is-
Ik = log2(1/Pk)
Properties of information
1- More uncertainty about a message = more information.
2- If the receiver already knows the message being transmitted, the information is zero.
3- If I1 is the information of message m1 and I2 is the information of m2, then the
combined information carried by m1 and m2 (independent messages) is I1 + I2.
4- If there are M = 2^N equally likely messages, then the amount of information
carried by each message will be N bits.
Proofs are available in the PPT.
Question 1- Calculate the amount of information if-
a- Pk=1/4 b- Pk=3/4
Solution-
Ik= log2(1/Pk)
a. Ik = log2(1/Pk) = log2(4) = log2(2^2) = 2 bits
b. Ik = log2(4/3) = log10(4/3) / log10(2) = 0.415 bits.
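Both parts can be checked with a couple of lines of Python (a sketch of the same calculation):

from math import log2

print(log2(1 / (1/4)))   # 2.0 bits
print(log2(1 / (3/4)))   # approx. 0.415 bits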
Question 2-
A card is selected at random from a deck of 52 playing cards. You have
been told that it is red in colour (there are 26 red cards). Find out-
i. How much information you have received? (Pi)
ii. How much more information you needed to completely specify
the card? (Pii)
Solution
i. We have the information that the card is one of the 26 red cards.
Probability of being red: P(red) = 26/52 = 1/2
Information received: I(red) = log2(2) = 1 bit   Answer
ii. Probability of completely specifying the card: P(specify) = 1/52
Information needed to specify a card from scratch: I(SC) = log2(52) = 5.7 bits
Additional information needed = I(SC) - I(red) = log2(52) - log2(2)
= 5.7 - 1 = 4.7 bits.
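A small Python check of these two numbers (a sketch; log2(52) = 5.7 is rounded):

from math import log2

I_red = log2(52 / 26)        # information from learning the card is red: 1 bit
I_card = log2(52)            # information needed to fully specify a card: approx. 5.70 bits
print(I_red, I_card, I_card - I_red)   # 1.0, 5.700..., 4.700... bits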
Entropy (H)- (Average information of the content, or average
information contained in the symbols or messages)
A source S generates M messages-
M = {m1, m2, m3, m4, ...}
P = {p1, p2, p3, p4, ...}
Suppose a sequence of L messages is transmitted (L very large)-
(m1 m2, m1 m3, m4 m1, ...) -> L messages
In such a long sequence, message m1 occurs approximately L·p1 times, so
L·p1 messages of m1 are transmitted,
L·p2 messages of m2 are transmitted,
...
L·pM messages of mM are transmitted.
Each occurrence of m1 carries I(m1) = log2(1/p1) bits, and similarly for the other messages, so
I(total) = p1·L·log2(1/p1) + p2·L·log2(1/p2) + ------ + pM·L·log2(1/pM)
I(total)Avg = (p1·L·log2(1/p1) + p2·L·log2(1/p2) + ------ + pM·L·log2(1/pM)) / L
H = Summation of Pk log2(1/Pk)
Properties of Entropy-
1. Entropy is zero if the event is sure.
2. When Pk = 1/M for all M messages (i.e. the messages are equally
likely), then H = log2(M).
3. The upper bound on entropy is given as
Hmax = log2(M).
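A minimal Python sketch of the entropy formula, also checking properties 1-3 above (the helper name entropy is illustrative):

from math import log2

def entropy(probs):
    # H = sum of Pk * log2(1/Pk), skipping zero-probability terms.
    return sum(p * log2(1 / p) for p in probs if p > 0)

print(entropy([1.0]))            # 0.0     -> a sure event has zero entropy
print(entropy([0.25] * 4))       # 2.0     -> equally likely: H = log2(M) = log2(4)
print(entropy([0.7, 0.2, 0.1]))  # ~1.1568 -> always <= Hmax = log2(3) = 1.585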
Source Efficiency and Redundancy-
Efficiency of source (represented by Eta, η) = H/Hmax
Redundancy of source = 1 - Eta
Information Rate- represented by R
R = r * H
H -> entropy (bits/message)
r -> rate at which messages are generated by the source (messages/second)
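A short Python sketch tying these definitions together (the function name source_stats is illustrative, not standard):

from math import log2

def source_stats(probs, r):
    # Returns (H, Hmax, Eta, redundancy, R) for a source with the given
    # message probabilities and message rate r (messages/second).
    H = sum(p * log2(1 / p) for p in probs if p > 0)
    Hmax = log2(len(probs))
    eta = H / Hmax
    return H, Hmax, eta, 1 - eta, r * H

print(source_stats([0.5, 0.25, 0.25], r=100))   # here H = 1.5 bits/message, R = 150 bits/second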
Question 1-
For a discrete memoryless source there are three symbols (m1, m2, m3)
with probabilities p1 = x and p2 = p3. Find the entropy of the source.
Solution-
p1 = x
p2 = p3 = ?
p1 + p2 + p3 = 1
x + p2 + p3 = 1
2p2 = 1 - x
p2 = p3 = (1 - x)/2
H = x log2(1/x) + 2 · ((1-x)/2) · log2(2/(1-x)) = x log2(1/x) + (1-x) log2(2/(1-x)) bits/message
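A quick Python sketch of H as a function of x for this source (a numerical illustration, assuming 0 < x < 1):

from math import log2

def H_source(x):
    # Entropy of a 3-symbol source with p1 = x and p2 = p3 = (1 - x)/2.
    p = [x, (1 - x) / 2, (1 - x) / 2]
    return sum(pk * log2(1 / pk) for pk in p if pk > 0)

print(H_source(1/3))   # approx. 1.585 bits (all three symbols equally likely: log2(3))
print(H_source(0.5))   # 1.5 bits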
Question 2
Show that the entropy of the source with the following probability distribution
is-
2 - (n+2)/2^n
Si:  S1   S2   S3   ...  Sj     ...  Sn-1        Sn
Pi:  1/2  1/4  1/8  ...  1/2^j  ...  1/2^(n-1)   1/2^n
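A small numerical check of this identity in Python (n = 8 is an arbitrary choice for the sketch):

from math import log2

n = 8
p = [1 / 2**j for j in range(1, n + 1)]      # Pj = 1/2^j for j = 1..n
H = sum(pj * log2(1 / pj) for pj in p)       # = sum of j/2^j
print(H, 2 - (n + 2) / 2**n)                 # both print 1.9609375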
Question 3:
The source emits three messages with probabilities, p1=0.7, p2=0.2 and
p3=0.1. Calculate-
i. Source Entropy
ii. Max Entropy
iii. Source Efficiency
iv. Redundancy
Try this--
1. Source Entropy (H) = Σ (k=1 to 3) Pk log2(1/Pk)
Answer: 1.1568 bits/message
2. Max Entropy (Hmax) = log2(M)
Answer: 1.585 bits/message
3. Eta(source) = H/Hmax
Answer: 0.73
4. Redundancy(source) = 1 - Eta(source)
Answer: 0.27
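All four answers can be cross-checked with a few lines of Python (a sketch using the same formulas):

from math import log2

p = [0.7, 0.2, 0.1]
H = sum(pk * log2(1 / pk) for pk in p)   # approx. 1.1568 bits/message
Hmax = log2(len(p))                      # approx. 1.585 bits/message
eta = H / Hmax                           # approx. 0.73
print(H, Hmax, eta, 1 - eta)             # redundancy approx. 0.27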
Question 4- A discrete source emits one of six symbols once every millisecond
(r = 1000 messages/second). The symbol probabilities are 1/2, 1/4, 1/8,
1/16, 1/32 and 1/32. Find the source entropy and the information rate.
H = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/16)(4) + (1/32)(5) + (1/32)(5) = 1.9375 bits/message
R = r * H
R = 1937.5 bits/second
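A matching Python check (a sketch; the symbol probabilities are taken directly from the question):

from math import log2

r = 1000                                   # messages/second
p = [1/2, 1/4, 1/8, 1/16, 1/32, 1/32]
H = sum(pk * log2(1 / pk) for pk in p)     # 1.9375 bits/message
print(H, r * H)                            # 1.9375, 1937.5 bits/second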