
Lossless Compression

Compression reduces the size of data to reduce storage and transmission requirements. It is possible due to redundancy in digital media like audio and video. Compression can be lossless, preserving all data, or lossy, tolerating some loss. Lossless methods like run-length encoding and Huffman coding remove redundancy. Lossy methods like JPEG and MPEG also remove redundancy and exploit human perception, introducing tolerable loss.

Uploaded by

GeHad Mohey

Why Compress?

 To reduce the volume of data to be transmitted (text, fax, images)
 To reduce the bandwidth required for transmission and to reduce storage requirements (speech, audio, video)

Compression
 How is compression possible?
– Redundancy in digital audio, image, and video data
– Properties of human perception
 Digital audio is a series of sample values; image
is a rectangular array of pixel values; video is a
sequence of images played out at a certain rate
 Neighboring sample values are correlated
Classification
 Lossless compression
– used for legal and medical documents and computer programs
– exploits only data redundancy
 Lossy compression
– used for digital audio, image, and video, where some errors or loss can be tolerated
– exploits both data redundancy and properties of human perception
 Constant bit rate versus variable bit rate coding
Data Compression: Entropy
• Entropy is a measure of the information content of a message.
 Messages with higher entropy carry more information than messages with lower entropy.
• How to determine the entropy:
 Find the probability p(x) of each symbol x in the message.
 The entropy contribution H(x) of the symbol x is:
$H(x) = -p(x) \log_2 p(x)$

• The average entropy of the message, in bits per symbol, is the sum of these contributions over all n distinct symbols in the message:
$H = \sum_{i=1}^{n} H(x_i) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$
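The entropy calculation above can be sketched in a few lines of Python. This is a minimal illustration, assuming the symbol probabilities are estimated from the symbol counts in the message itself; the function names `symbol_entropies` and `message_entropy` are made up for this example.

```python
import math
from collections import Counter


def symbol_entropies(message):
    """Return {symbol: -p(x) * log2 p(x)} for every distinct symbol x."""
    counts = Counter(message)
    total = len(message)
    # Assumption: p(x) is the relative frequency of x in this message.
    return {s: -(c / total) * math.log2(c / total) for s, c in counts.items()}


def message_entropy(message):
    """Sum the per-symbol contributions: average bits per symbol."""
    return sum(symbol_entropies(message).values())


if __name__ == "__main__":
    for text in ("AAAA", "AABB", "ABCD"):
        print(text, message_entropy(text))
```

A message made of one repeated symbol has zero entropy, while four equally likely symbols need two bits per symbol on average.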
Data Compression Methods
• Data compression is about storing and sending a
smaller number of bits.
• There are two major categories of data compression methods: lossless and lossy.
LOSSLESS COMPRESSION METHODS
Lossless Compression Methods
• In lossless methods, the original data and the data after compression and decompression are exactly the same.
• Redundant data is removed during compression and added back during decompression.
• Lossless methods are used when we can’t afford to lose any data: legal and medical documents, computer programs.
Run-length encoding
• The simplest method of compression.
• How: replace each run of consecutive occurrences of a symbol with one occurrence of the symbol followed by the number of occurrences.

• The method can be more efficient if the data uses only two symbols (0s and 1s) in bit patterns and one symbol is more frequent than the other.
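A small sketch of run-length encoding as described above: each run is stored as the symbol followed by its count. The pair-list representation and the names `rle_encode` and `rle_decode` are assumptions chosen for readability, not a standard file format.

```python
from itertools import groupby


def rle_encode(data):
    """Collapse each run of equal symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs):
    """Expand each (symbol, count) pair back into a run."""
    return "".join(symbol * count for symbol, count in pairs)


if __name__ == "__main__":
    text = "BBBBBBBBBAAAAAAAAAAAAAAAANMMMMMMMMMM"
    pairs = rle_encode(text)
    print(pairs)
    print(rle_decode(pairs) == text)
```

Because decoding restores the input exactly, the method is lossless. It only saves space when the data contains long runs.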
Huffman coding
• In Huffman coding, you assign shorter codes
to symbols that occur more frequently and
longer codes to those that occur less
frequently.
• For example:
Character    A    B    C    D    E
------------------------------------
Frequency   17   12   12   27   32
Table 15.1 Frequency of characters
Figure 15-4: Huffman coding
Figure 15-5: Final tree and code
Figure 15-6: Huffman encoding
Figure 15-7: Huffman decoding
Huffman coding
• The beauty of Huffman coding is that no code is the prefix of another code.
• There is no ambiguity in encoding.
• The receiver can decode the received data without ambiguity.
• A Huffman code is called an instantaneous code because the decoder can recognize each symbol as soon as its last bit arrives, without looking ahead at later bits.
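The following sketch builds a Huffman code for the frequencies in Table 15.1 and checks the prefix property. It is one possible implementation using a priority queue. The 0/1 labels depend on how ties are broken, so the exact bit patterns may differ from the ones in the figures, but the code lengths should not. All function names are illustrative.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a prefix code: frequent symbols get shorter bit strings."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent nodes into one subtree.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Emit a symbol as soon as the buffered bits match a code."""
    inverse = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


FREQ = {"A": 17, "B": 12, "C": 12, "D": 27, "E": 32}  # Table 15.1

if __name__ == "__main__":
    codes = huffman_codes(FREQ)
    print(codes)
    print(decode(encode("EAEBAECDEA", codes), codes))
```

With these frequencies, A, D, and E receive 2-bit codes and B and C receive 3-bit codes, for a total of 224 bits over the 100 characters. A fixed 3-bit code would need 300 bits.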
Lempel Ziv encoding
• LZ encoding is an example of a category of
algorithms called dictionary-based encoding.
• The idea is to create a dictionary (table) of
strings used during the communication
session.
• The compression algorithm extracts the smallest substring that cannot be found in the dictionary from the remaining uncompressed string.
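A minimal sketch of dictionary-based encoding in the style described above. It assumes the LZ78 variant, in which each output item is a pair (index of the longest known prefix, next character) and every newly extracted substring is added to the dictionary. The function names and the sample string are chosen for this example and are not taken from the figures.

```python
def lz78_encode(text):
    """Emit (prefix_index, next_char) pairs; index 0 means 'no prefix'."""
    dictionary = {}  # substring -> 1-based index
    output = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate  # keep growing until the substring is new
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:  # leftover input that is already in the dictionary
        output.append((dictionary[phrase], ""))
    return output


def lz78_decode(pairs):
    """Rebuild the same dictionary on the receiving side while decoding."""
    phrases = [""]  # index 0 is the empty prefix
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


if __name__ == "__main__":
    pairs = lz78_encode("BAABABBBAABBBBAA")
    print(pairs)
    print(lz78_decode(pairs))
```

The dictionary never has to be transmitted: the decoder reconstructs it entry by entry from the pairs it receives.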
Figure 15-8, Part I: Example of Lempel Ziv encoding
Figure 15-8, Part II: Example of Lempel Ziv encoding
Figure 15-9, Part I: Example of Lempel Ziv decoding
Figure 15-9, Part II: Example of Lempel Ziv decoding

LOSSY COMPRESSION METHODS
Lossy compression methods
 Loss of information is acceptable in a picture or video.
 The reason is that our eyes and ears cannot distinguish subtle changes.
 Loss of information is not acceptable in a text file or a program file.
 Examples:
– Joint Photographic Experts Group (JPEG)
– Moving Picture Experts Group (MPEG)
JPEG Encoding
 Used to compress pictures and graphics.
 In JPEG, a grayscale picture is divided into 8x8 pixel blocks to decrease the number of calculations.
 Basic idea:
 Change the picture into a linear (vector) set of numbers that reveals the redundancies.
 The redundancies are then removed by one of the lossless compression methods.
Figure 15-11: JPEG process
• DCT: discrete cosine transform
• Quantization
• Compression
Figure 15-12: Discrete cosine transform, case 1: uniform gray scale
• T(0, 0): DC value (direct current value)
• T(m, n): AC values (represent changes in the pixel values)
Figure 15-13: Discrete cosine transform, case 2: two sections
Figure 15-14: Discrete cosine transform, case 3: gradient gray scale
DCT discussion
• The DCT transformation creates table T from
table P.
• The DC value gives the average value of the
pixels.
• The AC values give the changes.
• Lack of changes in neighboring pixels creates
0s.
• The DCT transformation is reversible.
• Appendix F (Mathematical formula for DCT
transformation)
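The points above can be checked numerically with a direct 8x8 DCT. The formula in Appendix F is not reproduced here, so this sketch assumes the scaling commonly used for JPEG, T(m, n) = 1/4 · C(m) · C(n) · ΣΣ P(x, y) · cos((2x+1)mπ/16) · cos((2y+1)nπ/16), with C(0) = 1/√2 and C(k) = 1 otherwise. The name `dct2` is illustrative.

```python
import math

N = 8  # JPEG block size


def dct2(P):
    """Direct 2-D DCT of an N x N block P (assumed JPEG-style scaling)."""

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    T = [[0.0] * N for _ in range(N)]
    for m in range(N):
        for n in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (P[x][y]
                          * math.cos((2 * x + 1) * m * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * n * math.pi / (2 * N)))
            T[m][n] = 0.25 * c(m) * c(n) * s
    return T


if __name__ == "__main__":
    uniform = [[20] * N for _ in range(N)]  # case 1: uniform gray scale
    T = dct2(uniform)
    print(round(T[0][0], 6))  # DC value
    print(max(abs(T[m][n]) for m in range(N) for n in range(N) if (m, n) != (0, 0)))
```

Under this assumed scaling, a uniform block of value 20 gives T(0, 0) = 160, which is 8 times the pixel average, so the DC value tracks the average up to a constant factor. All AC values come out as zero apart from floating-point rounding, matching the statement that a lack of change creates 0s.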
Quantization
• After the T table is created, the values are
quantized to reduce the number of bits
needed for encoding.
• Quantization:
– Divide the number by a constant and then drop
the fraction.
– The quantizing phase is not reversible.
– Some information will be lost.
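A short sketch of the quantization step as stated above: divide by a constant and drop the fraction. Using a single constant for every table entry follows the slide. Real JPEG encoders typically use a table of per-position divisors, which this example does not model. The names `quantize` and `dequantize` are illustrative.

```python
import math


def quantize(T, q):
    """Divide every value by the constant q and drop the fraction."""
    return [[math.trunc(v / q) for v in row] for row in T]


def dequantize(Q, q):
    """Best-effort inverse: multiply back. The dropped fractions are gone."""
    return [[v * q for v in row] for row in Q]


if __name__ == "__main__":
    T = [[160, 7, -3], [12, 4, 0], [1, 0, 0]]
    Q = quantize(T, 10)
    print(Q)
    print(dequantize(Q, 10))
```

Multiplying back does not restore the original values (7 becomes 0, 12 becomes 10), which is why this phase is not reversible and is where the loss is introduced.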
Compression
• After quantization, the values are read from
the table, and redundant 0s are removed.
• The reason is that if the picture does not have
fine changes, the bottom right corner of the T
table is all 0s.
• Fig. 15.15
Figure 15-15: Reading the table
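One way to read the table so that the zeros from the bottom right corner end up together at the end is a diagonal zigzag scan, which is the order JPEG uses. The figure itself is not reproduced here, so treat the exact traversal below as an assumption. The names `zigzag_order` and `read_table` are illustrative.

```python
def zigzag_order(n):
    """Cell order for an n x n table: walk the anti-diagonals, alternating direction."""
    cells = [(i, j) for i in range(n) for j in range(n)]

    def key(cell):
        i, j = cell
        diagonal = i + j
        # Odd diagonals run top to bottom, even diagonals bottom to top.
        return (diagonal, i if diagonal % 2 else -i)

    return sorted(cells, key=key)


def read_table(T):
    """Read T in zigzag order and drop the trailing run of zeros."""
    values = [T[i][j] for i, j in zigzag_order(len(T))]
    while values and values[-1] == 0:
        values.pop()
    return values


if __name__ == "__main__":
    Q = [[16, 2, 0], [1, 0, 0], [0, 0, 0]]
    print(zigzag_order(3))
    print(read_table(Q))
```

A decoder that knows the table size can pad the list back out with zeros, so dropping the trailing zeros loses nothing.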
Video compression--MPEG
• MPEG method
– Spatial compression
• The spatial compression of each frame is done
with JPEG.
– Temporal compression
• The temporal compression removes the
redundant frames.
• MPEG method first divides frames into three
categories: I-frames, P-frames, B-frames.
Figure 15-16: MPEG frames
• I-frames (intra-coded frames):
– An I-frame is an independent frame that is not related to any other frame.
– I-frames are present at regular intervals.
– I-frames cannot be constructed from other frames.
• P-frames (predicted frames):
– A P-frame is related to the preceding I-frame or P-frame.
– Each P-frame contains only the changes from the preceding frame.
– P-frames can be constructed only from previous I- or P-frames.
• B-frames (bidirectional frames):
– A B-frame is relative to the preceding and following I-frame or P-frame.
– Each B-frame is relative to the past and the future.
– A B-frame is never related to another B-frame.
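The idea that a P-frame "contains only the changes from the preceding frame" can be illustrated with a toy example. This is a deliberately simplified sketch: it stores changed pixels one by one, whereas real MPEG encoders predict blocks of pixels with motion compensation. The names `make_p_frame` and `apply_p_frame` are hypothetical.

```python
def make_p_frame(previous, current):
    """Record only the pixels that differ: {(row, col): new_value}."""
    return {
        (r, c): value
        for r, row in enumerate(current)
        for c, value in enumerate(row)
        if previous[r][c] != value
    }


def apply_p_frame(previous, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = [row[:] for row in previous]  # copy so the reference frame is kept
    for (r, c), value in changes.items():
        frame[r][c] = value
    return frame


if __name__ == "__main__":
    i_frame = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]  # stored in full
    next_frame = [[5, 5, 5], [5, 9, 5], [5, 5, 5]]
    changes = make_p_frame(i_frame, next_frame)
    print(changes)
    print(apply_p_frame(i_frame, changes) == next_frame)
```

When consecutive frames are nearly identical, the set of changes is far smaller than a full frame, which is the redundancy that temporal compression removes.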
Figure 15-17: MPEG frame construction (input sequence and MPEG sequence)
Hatem ZAKARIA, 24th February 2013
