
Basics of Compression

Dr. Sania Bhatti

Outline
- Need for compression and classification of compression algorithms
- Basic Coding Concepts
  - Fixed-length coding and variable-length coding
  - Compression Ratio
  - Entropy
- RLE Compression (Entropy Coding)
- Huffman Compression (Statistical Entropy Coding)


Need for Compression

- Uncompressed audio
  - 8 KHz, 8 bit: 8 KB per second, 30 MB per hour
  - 44.1 KHz, 16 bit: 88.2 KB per second, 317.5 MB per hour
  - a 100 GB disk holds about 315 hours of CD-quality music
- Uncompressed video
  - 640 x 480 resolution, 8 bit color, 24 fps: 7.37 MB per second, 26.5 GB per hour
  - 640 x 480 resolution, 24 bit (3 bytes) color, 30 fps: 27.6 MB per second, 99.5 GB per hour
  - a 100 GB disk holds about 1 hour of high-quality video
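These rates follow directly from sample rate x sample size (audio) and resolution x bytes per pixel x frame rate (video). A quick back-of-the-envelope check, as a sketch that assumes decimal megabytes and gigabytes, matching the figures above:

```python
# Rough sanity check of the uncompressed data rates quoted above.
MB = 10**6   # decimal megabytes, as used on the slide
GB = 10**9

# Audio: sample rate (Hz) x bytes per sample
print(8_000 * 1 / 1_000, "KB/s")                   # 8 kHz, 8-bit   -> 8 KB/s
print(44_100 * 2 / 1_000, "KB/s")                  # 44.1 kHz, 16-bit -> 88.2 KB/s

# Video: width x height x bytes per pixel x frames per second
print(640 * 480 * 1 * 24 / MB, "MB/s")             # 8-bit color, 24 fps  -> ~7.37 MB/s
print(640 * 480 * 3 * 30 / MB, "MB/s")             # 24-bit color, 30 fps -> ~27.6 MB/s
print(640 * 480 * 3 * 30 * 3600 / GB, "GB/hour")   # -> ~99.5 GB per hour
```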

Broad Classification
- Entropy Coding (statistical)
  - lossless; independent of data characteristics
  - e.g., RLE, Huffman, LZW, arithmetic coding
- Source Coding
  - lossy; may consider semantics of the data
  - depends on characteristics of the data
  - e.g., DCT, DPCM, ADPCM, color model transform
- Hybrid Coding (used by most multimedia systems)
  - combines entropy coding with source coding
  - e.g., JPEG-2000, H.264, MPEG-2, MPEG-4, MPEG-7


Data Compression
- A branch of information theory
  - minimize the amount of information to be transmitted
- Transform a sequence of characters into a new string of bits
  - same information content
  - length as short as possible

Concepts
- Coding (the code) maps source messages from an alphabet (A) into code words (B)
- A source message (symbol) is the basic unit into which a string is partitioned
  - can be a single letter or a string of letters
- EXAMPLE: aa bbb cccc ddddd eeeeee fffffffgggggggg
  - A = {a, b, c, d, e, f, g, space}
  - B = {0, 1}


Taxonomy of Codes
- Block-block
  - source messages and code words of fixed length; e.g., ASCII
- Block-variable
  - source messages fixed, code words variable; e.g., Huffman coding
- Variable-block
  - source messages variable, code words fixed; e.g., RLE
- Variable-variable
  - source messages variable, code words variable; e.g., arithmetic coding

Example of Block-Block
- Coding "aa bbb cccc ddddd eeeeee fffffffgggggggg"
- Requires 120 bits

  Symbol   Code word
  a        000
  b        001
  c        010
  d        011
  e        100
  f        101
  g        110
  space    111

Example of Variable-Variable
- Coding "aa bbb cccc ddddd eeeeee fffffffgggggggg"
- Requires 30 bits
  - don't forget the spaces

  Symbol     Code word
  aa         0
  bbb        1
  cccc       10
  ddddd      11
  eeeeee     100
  fffffff    101
  gggggggg   110
  space      111
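A sketch of the same message under this variable-variable code: each run of letters, and each space, is one source message, giving 12 codewords and 30 bits in total:

```python
# Variable-variable coding: each run of letters, and each space, gets one codeword.
codebook = {'aa': '0', 'bbb': '1', 'cccc': '10', 'ddddd': '11',
            'eeeeee': '100', 'fffffff': '101', 'gggggggg': '110', ' ': '111'}

runs = ['aa', ' ', 'bbb', ' ', 'cccc', ' ', 'ddddd', ' ',
        'eeeeee', ' ', 'fffffff', 'gggggggg']
encoded = ''.join(codebook[r] for r in runs)

print(len(encoded), "bits")   # 30 bits, versus 120 bits for the fixed-length code
```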

Concepts (cont.)
- A code is
  - distinct if each code word can be distinguished from every other (the mapping is one-to-one)
  - uniquely decodable if every code word is identifiable when immersed in a sequence of code words
- e.g., with the previous table, the message 11 could be decoded as either ddddd or bbbbbb, so that code is not uniquely decodable
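A small sketch that makes the ambiguity concrete: a brute-force parser (the `parses` helper is my own, not part of the slides) finds two valid decodings of the bit string 11 under the table above:

```python
# Brute-force search for every way to split a bit string into codewords.
codebook = {'aa': '0', 'bbb': '1', 'cccc': '10', 'ddddd': '11',
            'eeeeee': '100', 'fffffff': '101', 'gggggggg': '110', ' ': '111'}

def parses(bits):
    """Return every decomposition of `bits` into source messages."""
    if not bits:
        return [[]]
    results = []
    for symbol, code in codebook.items():
        if bits.startswith(code):
            results += [[symbol] + rest for rest in parses(bits[len(code):])]
    return results

print(parses('11'))   # [['bbb', 'bbb'], ['ddddd']] -> two readings, not uniquely decodable
```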


Static Codes
- Mapping is fixed before transmission
  - a message is represented by the same codeword every time it appears in the ensemble
  - Huffman coding is an example
- Better for independent sequences
  - probabilities of symbol occurrences must be known in advance

Dynamic Codes
- Mapping changes over time
  - also referred to as adaptive coding
- Attempts to exploit locality of reference
  - periodic, frequent occurrences of messages
  - dynamic Huffman coding is an example
- Hybrids?
  - build a set of codes, then select one based on the input


Traditional Evaluation Criteria
- Algorithm complexity
  - running time
- Amount of compression
  - redundancy
  - compression ratio
- How to measure?

Measure of Information
- Consider symbols si and the probability of occurrence of each symbol, p(si)
- With fixed-length coding, the smallest number of bits per symbol needed is
  - L ≥ log2(N) bits per symbol, where N is the number of distinct symbols
- Example: a message drawn from 5 symbols needs 3 bits per symbol (log2 5 ≈ 2.32, rounded up to 3)
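A quick check of this bound (a sketch): with fixed-length coding, the whole number of bits per symbol is log2(N) rounded up.

```python
import math

# Fixed-length coding: N distinct symbols need ceil(log2(N)) bits per symbol.
for n in (2, 5, 8, 256):
    print(n, "symbols ->", math.ceil(math.log2(n)), "bits per symbol")
# 5 symbols -> 3 bits, as in the example above
```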


Variable-Length Coding - Entropy
- What is the minimum number of bits per symbol?
- Answer: Shannon's result: the theoretical minimum average number of bits per code word is known as the entropy (H)

  H = - Σ p(si) log2 p(si),  summed over i = 1 to n
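A minimal Python sketch of this definition (the function name `entropy` is my own; the symbol probabilities are assumed to be given and to sum to 1):

```python
import math

def entropy(probabilities):
    """Shannon entropy H = -sum(p_i * log2(p_i)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```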

Entropy Example
- Alphabet = {A, B}
- p(A) = 0.4; p(B) = 0.6
- Compute the entropy (H):
  - H = -0.4 * log2(0.4) - 0.6 * log2(0.6) ≈ 0.97 bits
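The same arithmetic as a one-line check:

```python
import math
print(round(-(0.4 * math.log2(0.4) + 0.6 * math.log2(0.6)), 2))   # 0.97 bits per symbol
```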


Entropy Example
- Calculate the entropy for an image with only two levels, 0 and 255, where P(0) = 0.5 and P(255) = 0.5
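For reference, the same formula applied to this case: two equally likely levels carry exactly one bit per pixel.

```python
import math
print(-(0.5 * math.log2(0.5) + 0.5 * math.log2(0.5)))   # 1.0 bit per symbol
```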

Entropy Example
- A gray scale image has 256 levels, A = {0, 1, 2, ..., 255}, with equal probabilities. Calculate the entropy.
- H = -256 * (1/256) * log2(1/256) = 8 bits
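A quick check of this result (a one-line sketch):

```python
import math
print(-256 * (1/256) * math.log2(1/256))   # 8.0 bits per symbol
```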


Entropy Example
- Calculate the entropy of aaabbbbccccdd
  - P(a) = 0.23
  - P(b) = 0.3
  - P(c) = 0.3
  - P(d) = 0.15
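One way to work this exercise (a sketch): count the symbols directly and feed the exact relative frequencies (3/13, 4/13, 4/13, 2/13, which the slide rounds to 0.23, 0.3, 0.3, 0.15) into the entropy formula.

```python
from collections import Counter
import math

message = "aaabbbbccccdd"
counts = Counter(message)                        # a: 3, b: 4, c: 4, d: 2
probs = [n / len(message) for n in counts.values()]

H = -sum(p * math.log2(p) for p in probs)
print(round(H, 2), "bits per symbol")            # ~1.95 bits
```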
