
Mathematical Preliminaries for Lossless Compression

C.M. Liu
Perceptual Lab, College of Computer Science
National Chiao-Tung University
http://www.csie.nctu.edu.tw/~cmliu/Courses/Compression/

Office: EC538
(03)5731877
[email protected]
Outline

- Introduction
- Information Theory
- Models
- Coding

Achieving Data Compression

Most data has natural redundancy
- I.e., a 'straightforward' encoding contains more data than the actual information in the data
- E.g., audio sampling

Achieving Data Compression (2)

Compression == 'squeezing out' the inefficiencies of the information representation
- Note #1: in lossy compression we also throw out less important/imperceptible information
- Note #2: we must be able to reverse the process to make the data usable again

Q1: What data can be compressed?
Q2: By how much?
Q3: How close are we to optimal compression?

Information Theory: a mathematical description of information and its properties

Representing Data

Analog (continuous) data
- Represented by real numbers
- Note: cannot be represented exactly by computers

Digital (discrete) data
- Given a finite set of symbols {a1, a2, ..., an}
- All data is represented as symbol sequences (strings) over the symbol set
- E.g.: {a, b, c, d, r} => abc, car, bar, abracadabra, ...
- We use digital data to approximate analog data

Common Symbol Sets

- Roman alphabet plus punctuation
- ASCII: 128 symbols (256 in the extended 8-bit variants)
- Braille, Morse
- Binary: {0, 1}
  - 0 and 1 are called bits
  - All digital data can be represented efficiently in binary
  - E.g., a fixed-length binary representation of {a, b, c, d} (2 bits/symbol):

    Symbol  a   b   c   d
    Binary  00  01  10  11

Information

- First formally developed by Claude Shannon at Bell Labs in the 1940s/50s
- Explains limits on coding/communication using probability theory

Self-information
- Given an event A with probability P(A):

  i(A) = log_b (1 / P(A)) = -log_b P(A)

Self-Information

Observations
- Low P(A) => high i(A)
- High P(A) => low i(A)
  (figure: plot of -log2 x for x in [0, 1])

Rationale
- Low-probability (surprising) events carry more information;
  think "man bites dog" vs. "dog bites man"

If A and B are independent, then i(AB) = i(A) + i(B):

  i(AB) = log_b (1 / P(AB)) = log_b (1 / (P(A) P(B)))
        = log_b (1 / P(A)) + log_b (1 / P(B)) = i(A) + i(B)

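A minimal Python sketch of these definitions (the function name and the example probabilities are illustrative, not from the slides):

    import math

    def self_information(p: float, base: float = 2.0) -> float:
        """i(A) = log_b(1 / P(A)) = -log_b P(A); in bits when base = 2."""
        return -math.log(p, base)

    # An event with probability 1/2 carries exactly 1 bit of information.
    print(self_information(0.5))                        # 1.0

    # Additivity for independent events: i(AB) = i(A) + i(B).
    pA, pB = 0.25, 0.1                                  # arbitrary illustrative values
    print(self_information(pA * pB))                    # 5.3219...
    print(self_information(pA) + self_information(pB))  # 5.3219...
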
Coin Flip Example

Fair coin
- Let H & T be the outcomes
- If P(H) = P(T) = 1/2, then
  i(H) = i(T) = -log2(1/2) = 1 bit

Unfair coin
- Let P(H) = 1/8, P(T) = 7/8
- i(H) = log2(8) = 3 bits
- i(T) = log2(8/7) ≈ 0.193 bits

Note that P(H) + P(T) = 1

(First-Order) Entropy

Let
- A1, ..., An be all the independent possible outcomes of an experiment,
- with probabilities P(A1), ..., P(An). Then

  H = ∑_{i=1..n} P(Ai) i(Ai) = -∑_{i=1..n} P(Ai) log_b P(Ai)

- If the experiment generates symbols, then (for b = 2) H is the average number of binary symbols needed to code the symbols.
- Shannon: no lossless compression algorithm can do better.
- Note: the general expression for H is more complex, but reduces to the above for iid sources.

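A small Python sketch of this formula (the function name and sample distributions are illustrative):

    import math

    def entropy(probs, base: float = 2.0) -> float:
        """First-order entropy H = -sum_i P(A_i) log_b P(A_i)."""
        return -sum(p * math.log(p, base) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))                  # 1.0 bit: a fair coin
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits
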
Entropy Example #1

Consider the sequence:
- 1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
- Assume it correctly describes the probabilities generated by the source; then
  - P(1) = P(6) = P(7) = P(10) = 1/16
  - P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
- Assuming the sequence is iid:

  H = -∑_{i=1..10} P(i) log2 P(i)
    = -4 (1/16) log2(1/16) - 6 (2/16) log2(2/16) = 3.25 bits

Entropy Example #2

Assume sample-to-sample correlation
- Instead of coding samples, code the differences:
  1 1 1 -1 1 1 1 -1 1 1 1 1 1 -1 1 1
- Now P(1) = 13/16, P(-1) = 3/16
- H = 0.70 bits (per symbol)
- The model also needs to be coded

Knowing something about the source can help us 'reduce' the entropy
- Note that we cannot actually reduce the entropy of the source, as long as our coding is lossless
- Instead, we are reducing our estimate of the entropy

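A sketch of this example in Python, estimating entropy from relative frequencies (the helper name is illustrative):

    import math
    from collections import Counter

    def estimated_entropy(seq, base=2.0):
        """Estimate first-order entropy from relative symbol frequencies."""
        counts = Counter(seq)
        n = len(seq)
        return -sum(c / n * math.log(c / n, base) for c in counts.values())

    samples = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]
    # Keep the first sample, then code sample-to-sample differences.
    diffs = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

    print(estimated_entropy(samples))   # ~3.25 bits/symbol
    print(estimated_entropy(diffs))     # ~0.70 bits/symbol
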
Entropy Example #3

Consider the sequence:
- 1 2 1 2 3 3 3 3 1 2 3 3 3 3 1 2 3 3 1 2
- P(1) = P(2) = 1/4, P(3) = 1/2, so H = 1.5 bits/symbol
- Total bits: 20 x 1.5 = 30

Reconsider the sequence in pairs:
- (1 2) (1 2) (3 3) (3 3) (1 2) (3 3) (3 3) (1 2) (3 3) (1 2)
- P(1 2) = 1/2, P(3 3) = 1/2
- H = 1 bit/pair x 10 pairs = 10 bits

In theory, structure can eventually be extracted by taking larger samples.
In reality, we need an accurate model, as it is often impractical to observe a source for long.

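The same pairing trick in Python (a sketch; `estimated_entropy` is the same illustrative helper as in the previous example):

    import math
    from collections import Counter

    def estimated_entropy(seq, base=2.0):
        counts = Counter(seq)
        n = len(seq)
        return -sum(c / n * math.log(c / n, base) for c in counts.values())

    s = "12123333123333123312"
    print(len(s) * estimated_entropy(s))            # 30.0 bits coding single symbols

    pairs = [s[i:i + 2] for i in range(0, len(s), 2)]   # "12", "12", "33", ...
    print(len(pairs) * estimated_entropy(pairs))    # 10.0 bits coding pairs
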
Models

Physical models
- Based on an understanding of the process generating the data
  - E.g., speech
- A good model leads to good compression
- Usually impractical
  - Use empirical data instead
  - Statistical methods can help take a proper sample

Probability Models

Ignorance model
1. Assume each letter is generated independently from the rest
2. Assume all letters are generated with equal probability
- Examples? ASCII, RGB, CDDA, ...

Improvement: drop assumption 2
- A = {a1, a2, ..., an}, P = {P(a1), P(a2), ..., P(an)}
- Very efficient coding schemes already exist

Note
- If assumption 1 does not hold, a better solution likely exists

Markov Models

Assume that each output symbol depends on the previous k symbols. Formally:
- Let {x_n} be a sequence of observations
- We call {x_n} a kth-order discrete Markov chain (DMC) if

  P(x_n | x_{n-1}, ..., x_{n-k}) = P(x_n | x_{n-1}, ..., x_{n-k}, ...)

Usually, we use a first-order DMC:

  P(x_n | x_{n-1}) = P(x_n | x_{n-1}, x_{n-2}, ...)

Linear dependency model
- x_n = ρ x_{n-1} + ε_n, where ε_n is white noise

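As an illustration of a first-order model, here is a small Python sketch (the helper name and the sample string are illustrative) that estimates the conditional probabilities P(x_n | x_{n-1}) by counting transitions in an observed sequence:

    from collections import Counter, defaultdict

    def first_order_model(seq):
        """Estimate P(x_n | x_{n-1}) from an observed sequence by counting transitions."""
        counts = defaultdict(Counter)
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
        return {prev: {cur: c / sum(cnt.values()) for cur, c in cnt.items()}
                for prev, cnt in counts.items()}

    print(first_order_model("abracadabra"))
    # {'a': {'b': 0.5, 'c': 0.25, 'd': 0.25}, 'b': {'r': 1.0},
    #  'r': {'a': 1.0}, 'c': {'a': 1.0}, 'd': {'a': 1.0}}
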
Non-linear Markov Models

Consider a BW image as a string of black & white pixels (e.g., row by row)
- Define two states, Sw & Sb, for the current pixel
  (figure: two-state diagram with transitions P(w|w), P(b|w), P(w|b), P(b|b))
- Define probabilities:
  - P(Sb) = probability of being in Sb
  - P(Sw) = probability of being in Sw
- Transition probabilities P(b|b), P(b|w), P(w|b), P(w|w), with
  P(w|w) = 1 - P(b|w),  P(b|b) = 1 - P(w|b)
- Entropy of each state:
  H(Sw) = -P(b|w) log P(b|w) - P(w|w) log P(w|w)
  H(Sb) = -P(w|b) log P(w|b) - P(b|b) log P(b|b)
- Overall entropy:
  H = P(Sb) H(Sb) + P(Sw) H(Sw)

Markov Model (MM) Example

Assume
  P(Sw) = 30/31, P(Sb) = 1/31
  P(w|w) = 0.99, P(b|w) = 0.01, P(b|b) = 0.7, P(w|b) = 0.3

For the iid model:
  H_iid = -(30/31) log2(30/31) - (1/31) log2(1/31) ≈ 0.206 bits

For the Markov model:
  H(Sb) = -0.3 log2 0.3 - 0.7 log2 0.7 = 0.881 bits
  H(Sw) = -0.01 log2 0.01 - 0.99 log2 0.99 = 0.081 bits
  H_Markov = (30/31)(0.081) + (1/31)(0.881) ≈ 0.107 bits

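A quick check of these numbers in Python (a sketch; `h2` is just a binary-entropy helper, not from the slides):

    import math

    def h2(p: float) -> float:
        """Binary entropy in bits: -p log2 p - (1-p) log2 (1-p)."""
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    P_Sw, P_Sb = 30 / 31, 1 / 31

    H_iid = h2(P_Sb)                    # iid model ignores the previous pixel
    H_Sw, H_Sb = h2(0.01), h2(0.3)      # P(b|w) = 0.01, P(w|b) = 0.3
    H_markov = P_Sw * H_Sw + P_Sb * H_Sb

    print(round(H_iid, 3), round(H_Sw, 3), round(H_Sb, 3), round(H_markov, 3))
    # 0.206 0.081 0.881 0.107
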
Markov Models in Text Compression

In written English, the probability of the next letter is heavily influenced by the previous ones
- E.g., u after q

Shannon's work
- 2nd-order MM, 26 letters + space: H = 3.1 bits/letter
- Word-based model: H = 2.4 bits/letter
- Human prediction based on 100 previous letters: 0.6 <= H <= 1.3 bits/letter

Longer context => better prediction

Practical concerns:
- Context model storage (e.g., 4th order with 95 characters = 95^4 contexts)
- Zero-frequency problem

Composite Source Model

Many sources cannot be adequately described by a single model
- E.g., an executable contains code and resources (text, images, ...)

Solution: a composite model that switches between several sub-models, each describing part of the data


Coding

Alphabet
- A collection of symbols called letters

Code
- A set of binary sequences called codewords

Coding
- The process of mapping letters to codewords
- Fixed- vs. variable-length coding

Example: letter 'A'
- ASCII: 01000001
- Morse: •—

Code rate
- Average number of bits per symbol

Uniquely Decodable Codes

Example
- Alphabet = {a1, a2, a3, a4}
- P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8
- H = 1.75 bits
- n(ai) = length(codeword(ai)), i = 1..4
- Average length l = ∑_{i=1..4} P(ai) n(ai)

Possible codes:

  Letter  Probability  Code 1  Code 2  Code 3  Code 4
  a1      0.500        0       0       0       0
  a2      0.250        0       1       10      01
  a3      0.125        1       00      110     011
  a4      0.125        10      11      111     0111
  l                    1.125   1.250   1.750   1.875

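A few lines of Python reproducing the average lengths and comparing them to H (a sketch with illustrative names):

    import math

    probs = [0.5, 0.25, 0.125, 0.125]
    codes = {
        "Code 1": ["0", "0", "1", "10"],
        "Code 2": ["0", "1", "00", "11"],
        "Code 3": ["0", "10", "110", "111"],
        "Code 4": ["0", "01", "011", "0111"],
    }

    H = -sum(p * math.log2(p) for p in probs)
    print("H =", H)                                   # 1.75 bits

    for name, cws in codes.items():
        l = sum(p * len(cw) for p, cw in zip(probs, cws))
        print(name, "average length =", l)            # 1.125, 1.25, 1.75, 1.875
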
Uniquely Decodable Codes (2)

  Letter  Probability  Code 1  Code 2  Code 3  Code 4
  a1      0.500        0       0       0       0
  a2      0.250        0       1       10      01
  a3      0.125        1       00      110     011
  a4      0.125        10      11      111     0111
  l                    1.125   1.250   1.750   1.875

Code 1
- Identical codewords for a1 & a2 => decode('00') = ???
Code 2
- Distinct codewords but still ambiguous: decode('00') / decode('11') = ???
Code 3
- Uniquely decodable, instantaneous
Code 4
- Uniquely decodable, 'near-instantaneous'

Uniquely Decodable Codes (3)

Unique decodability:
- Given any sequence of codewords, there is a unique decoding of it.

Unique != instantaneous
- E.g.:
  a1 <-> 0
  a2 <-> 01
  a3 <-> 11
- decode(0111111111) = a1 a3 ... or a2 a3 ... ?
  - We don't know until the end of the string:
    0111111111 -> 01111111 a3 -> 011111 a3a3 -> 0111 a3a3a3 -> 01 a3a3a3a3 -> a2 a3a3a3a3

Unique Decodability Test

Prefix & dangling suffix
- Let a = a1...ak and b = b1...bn be binary codewords with k < n
- If a1...ak = b1...bk, then a is a prefix of b, and
- bk+1...bn is the dangling suffix: ds(a, b)

Algorithm
- Let C be the set of all codewords, and L a working list initialized to C

  repeat
      for all pairs (ci, cj) in L such that ci is a prefix of cj:
          if ds(ci, cj) is a codeword in C:
              return NOT_UNIQUE        // a dangling suffix equals a codeword
          else if ds(ci, cj) is not already in L:
              add ds(ci, cj) to L
  until no new dangling suffixes are added
  return UNIQUE

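A direct Python sketch of this test (function names are illustrative; it follows the procedure above, growing a working set of dangling suffixes until none are new or one of them is a codeword):

    def dangling_suffix(a: str, b: str):
        """Return the dangling suffix of b if a is a proper prefix of b, else None."""
        if len(a) < len(b) and b.startswith(a):
            return b[len(a):]
        return None

    def is_uniquely_decodable(codewords) -> bool:
        codewords = list(codewords)
        code = set(codewords)
        if len(code) != len(codewords):      # duplicate codewords are never UD
            return False
        work = set(code)                     # codewords plus dangling suffixes found so far
        while True:
            new = set()
            for a in work:
                for b in work:
                    ds = dangling_suffix(a, b)
                    if ds is None:
                        continue
                    if ds in code:           # a dangling suffix is itself a codeword
                        return False
                    if ds not in work:
                        new.add(ds)
            if not new:                      # no new dangling suffixes => uniquely decodable
                return True
            work |= new

    print(is_uniquely_decodable(["0", "1", "00", "11"]))      # False (Code 2)
    print(is_uniquely_decodable(["0", "10", "110", "111"]))   # True  (Code 3)
    print(is_uniquely_decodable(["0", "01", "011", "0111"]))  # True  (Code 4)
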
Prefix Codes

Prefix code:
- No codeword is a prefix of another.
- Prefix codes are also known as prefix-free codes, prefix condition codes, comma-free codes (although this is incorrect), and instantaneous codes.

Binary trees as prefix decoders:

  Example code: a -> 00, b -> 01, c -> 1
  (figure: decoding tree; from the root, bit 1 leads to leaf c, bit 0 leads to an internal node whose 0/1 children are the leaves a and b)

  repeat
      curr = root
      repeat
          if get_bit(input) = 1
              curr = curr.right
          else
              curr = curr.left
      until is_leaf(curr)
      output curr.symbol
  until eof(input)

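A runnable Python version of this pseudocode (a sketch; the Node class, the build_tree helper, and the bit-string input are assumptions, not part of the slides):

    class Node:
        def __init__(self):
            self.left = None       # child taken on bit 0
            self.right = None      # child taken on bit 1
            self.symbol = None     # set only at leaves

    def build_tree(code):
        """Build a decoding tree from a {symbol: codeword} table of a prefix code."""
        root = Node()
        for symbol, bits in code.items():
            curr = root
            for b in bits:
                if b == "1":
                    curr.right = curr.right or Node()
                    curr = curr.right
                else:
                    curr.left = curr.left or Node()
                    curr = curr.left
            curr.symbol = symbol
        return root

    def decode(bits, root):
        """Walk the tree bit by bit; emit a symbol and restart at the root at each leaf."""
        out, curr = [], root
        for b in bits:
            curr = curr.right if b == "1" else curr.left
            if curr.symbol is not None:        # reached a leaf
                out.append(curr.symbol)
                curr = root
        return "".join(out)

    tree = build_tree({"a": "00", "b": "01", "c": "1"})
    print(decode("0001100", tree))             # abca
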
Decoding Prefix Codes: Example

  Symbol  Code
  a       0
  b       10
  c       110
  d       1110
  r       1111

(figure: the corresponding decoding tree; each 0-branch ends in a leaf a, b, c, d, and the all-ones path 1111 ends in r)

abracadabra = 0 10 1111 0 110 0 1110 0 10 1111 0 = 010111101100111001011110

Decoding Example

Input = 010111101100111001011110

Walk the tree bit by bit, emitting a symbol at each leaf:

  Bits read   Leaf reached   Output so far
  0           a              a
  10          b              ab
  1111        r              abr
  0           a              abra
  110         c              abrac
  0           a              abraca
  ... and so on, until the remaining bits 1110 0 10 1111 0 yield abracadabra.

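The same decoding in a few lines of Python (a sketch; greedy matching works here precisely because no codeword is a prefix of another):

    def decode_prefix(bits: str, code: dict) -> str:
        """Accumulate bits until they match a codeword, emit its symbol, repeat."""
        inverse = {cw: sym for sym, cw in code.items()}
        out, buf = [], ""
        for b in bits:
            buf += b
            if buf in inverse:
                out.append(inverse[buf])
                buf = ""
        return "".join(out)

    code = {"a": "0", "b": "10", "c": "110", "d": "1110", "r": "1111"}
    print(decode_prefix("010111101100111001011110", code))   # abracadabra
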
Summary

Basic definitions of Information Theory
- Information
- Entropy
- Models
- Codes
  - Unique decodability
  - Prefix codes

Homework (pp. 38-39): 3, 4, 7
Program (p. 39): 5

