7.5 Dictionary-based Coding
LZW uses fixed-length code words to represent
variable-length strings of symbols/characters that
commonly occur together, e.g., words in English
text
LZW encoder and decoder build up the same dictionary
dynamically while receiving the data
LZW places longer and longer repeated entries into
a dictionary, and then emits the code for an
element, rather than the string itself, if the element
has already been placed in the dictionary
1/30/09 CSE 40373/60373: Multimedia Systems page 1
LZW compression for string
“ABABBABCABABBA”
The output codes are: 1 2 4 5 2 3 4 6 1. Instead of
sending 14 characters, only 9 codes need to be
sent (compression ratio = 14/9 = 1.56).
S     C     Output   Code   String
                     1      A
                     2      B
                     3      C
A     B     1        4      AB
B     A     2        5      BA
A     B
AB    B     4        6      ABB
B     A
BA    B     5        7      BAB
B     C     2        8      BC
C     A     3        9      CA
A     B
AB    A     4        10     ABA
A     B
AB    B
ABB   A     6        11     ABBA
A     EOF   1
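The table above can be reproduced with a short encoder. This is a minimal sketch, not a production implementation: the names `lzw_encode`, `lookup`, `MAX_ENTRIES` and `MAX_LEN` are made up for illustration, the dictionary is a fixed-size array searched linearly, and every input symbol is assumed to appear in the initial alphabet.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 64   /* illustrative dictionary capacity (assumption) */
#define MAX_LEN     32   /* illustrative maximum entry length (assumption) */

/* Code k (1-based, as on the slide) is stored in dict[k - 1]. */
static char dict[MAX_ENTRIES][MAX_LEN];
static int dict_size;

/* Returns the code of string s, or 0 if s is not in the dictionary. */
static int lookup(const char *s) {
    for (int i = 0; i < dict_size; i++)
        if (strcmp(dict[i], s) == 0) return i + 1;
    return 0;
}

/* Encodes input into out[]; returns the number of codes, or -1 on overflow. */
int lzw_encode(const char *input, const char *alphabet, int *out) {
    int n = 0;
    char s[MAX_LEN] = "";                       /* current string S */
    assert(strlen(alphabet) <= MAX_ENTRIES);
    dict_size = 0;
    for (const char *a = alphabet; *a; a++) {   /* single-symbol entries */
        dict[dict_size][0] = *a;
        dict[dict_size][1] = '\0';
        dict_size++;
    }
    for (const char *p = input; *p; p++) {
        char sc[MAX_LEN];                       /* S followed by next char C */
        size_t len = strlen(s);
        if (len + 2 > MAX_LEN) return -1;
        memcpy(sc, s, len);
        sc[len] = *p;
        sc[len + 1] = '\0';
        if (lookup(sc)) {
            strcpy(s, sc);                      /* S + C is known: keep growing */
        } else {
            out[n++] = lookup(s);               /* emit the code for S */
            if (dict_size < MAX_ENTRIES)
                strcpy(dict[dict_size++], sc);  /* add S + C as a new entry */
            s[0] = *p;                          /* restart from C */
            s[1] = '\0';
        }
    }
    if (s[0] != '\0') out[n++] = lookup(s);     /* flush the last string at EOF */
    return n;
}
```

Calling `lzw_encode("ABABBABCABABBA", "ABC", out)` yields the nine codes 1 2 4 5 2 3 4 6 1 from the slide.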
LZW decompression (1 2 4 5 2 3 4 6 1)
S     K     Entry/output   Code   String
                           1      A
                           2      B
                           3      C
NIL   1     A
A     2     B              4      AB
B     4     AB             5      BA
AB    5     BA             6      ABB
BA    2     B              7      BAB
B     3     C              8      BC
C     4     AB             9      CA
AB    6     ABB            10     ABA
ABB   1     A              11     ABBA
A     EOF
Decoded output: ABABBABCABABBA
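The decoding steps above can be sketched in the same way. The function name `lzw_decode` and the size limits are illustrative assumptions. The branch for a code that is not yet in the dictionary does not occur in this example; it is added here because the decoder always lags one entry behind the encoder.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 64   /* illustrative dictionary capacity (assumption) */
#define MAX_LEN     32   /* illustrative maximum entry length (assumption) */

/* Decodes ncodes codes into out (capacity out_cap); returns 0, or -1 on error. */
int lzw_decode(const int *codes, int ncodes, const char *alphabet,
               char *out, size_t out_cap) {
    char dict[MAX_ENTRIES][MAX_LEN];
    char s[MAX_LEN] = "";                  /* previous entry S (NIL at start) */
    int dict_size = 0;
    assert(out_cap > 0);
    for (const char *a = alphabet; *a && dict_size < MAX_ENTRIES; a++) {
        dict[dict_size][0] = *a;
        dict[dict_size][1] = '\0';
        dict_size++;
    }
    out[0] = '\0';
    for (int i = 0; i < ncodes; i++) {
        int k = codes[i];
        char entry[MAX_LEN];
        size_t len = strlen(s);
        if (k >= 1 && k <= dict_size) {
            strcpy(entry, dict[k - 1]);    /* normal case: code is known */
        } else if (k == dict_size + 1 && len > 0 && len + 2 <= MAX_LEN) {
            memcpy(entry, s, len);         /* code not defined yet: S + S[0] */
            entry[len] = s[0];
            entry[len + 1] = '\0';
        } else {
            return -1;                     /* invalid code */
        }
        if (strlen(out) + strlen(entry) + 1 > out_cap) return -1;
        strcat(out, entry);
        if (len > 0 && dict_size < MAX_ENTRIES) {
            if (len + 2 > MAX_LEN) return -1;
            memcpy(dict[dict_size], s, len);   /* new entry: S + entry[0] */
            dict[dict_size][len] = entry[0];
            dict[dict_size][len + 1] = '\0';
            dict_size++;
        }
        strcpy(s, entry);
    }
    return 0;
}
```

With the codes 1 2 4 5 2 3 4 6 1 and the alphabet "ABC", the output buffer ends up holding ABABBABCABABBA, and the dictionary grows through the same entries 4 to 11 as in the table.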
LZW Coding (cont’d)
In real applications, the code length l is kept in the
range [l_0, l_max]. The dictionary initially has a size
of 2^(l_0). When it is filled up, the code length is
increased by 1; this is allowed to repeat until l = l_max
When l_max is reached and the dictionary is filled up,
it needs to be flushed (as in Unix compress), or to
have the LRU (least recently used) entries
removed
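The growth rule can be expressed as a small helper. This is only a sketch: `code_length` is a hypothetical name, and the values 9 and 16 used in the usage note are illustrative choices for l_0 and l_max rather than values given on the slide.

```c
#include <assert.h>

/* Returns the code length in bits for a dictionary holding n_entries,
   starting at l0 bits and never exceeding lmax bits. */
int code_length(unsigned n_entries, int l0, int lmax) {
    int l = l0;
    assert(l0 >= 1 && l0 <= lmax && lmax < 32);
    while (l < lmax && n_entries > (1u << l))
        l++;                 /* dictionary outgrew 2^l entries: add one bit */
    return l;
}
```

For example, with l_0 = 9 and l_max = 16, a dictionary of 512 entries still fits in 9-bit codes, 513 entries need 10 bits, and any size beyond 2^16 stays capped at 16 bits, which is the point where flushing or LRU removal has to take over.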
7.6 Arithmetic Coding
Arithmetic coding is a more modern coding method
that usually out-performs Huffman coding
Huffman coding assigns each symbol a codeword
which has an integral bit length. Arithmetic coding
can treat the whole message as one unit
More details in the book
7.7 Lossless Image Compression
Due to spatial redundancy in normal images I, the
difference image d will have a narrower histogram
and hence a smaller entropy
Lossless JPEG
A special case of the JPEG image compression
The Predictive method
1. Forming a differential prediction: A predictor combines
the values of up to three neighboring pixels as the
predicted value for the current pixel
Predictor   Prediction
P1          A
P2          B
P3          C
P4          A + B - C
P5          A + (B - C) / 2
P6          B + (A - C) / 2
P7          (A + B) / 2
2. Encoding: The encoder compares the prediction
with the actual pixel value at the position ‘X’ and
encodes the difference using Huffman coding
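The two steps can be sketched as follows. The neighbor layout is an assumption, since the slide's figure is not reproduced here: A is taken as the pixel to the left of X, B the pixel above, and C the pixel above-left. The function names are hypothetical, and plain C integer division stands in for whatever rounding a real codec uses.

```c
/* Step 1: predicted value for pixel X from neighbors A, B, C.
   mode selects predictor P1..P7 from the table. */
int predict(int mode, int A, int B, int C) {
    switch (mode) {
    case 1: return A;
    case 2: return B;
    case 3: return C;
    case 4: return A + B - C;
    case 5: return A + (B - C) / 2;
    case 6: return B + (A - C) / 2;
    case 7: return (A + B) / 2;
    default: return 0;            /* unknown mode: no prediction */
    }
}

/* Step 2: the difference that would be passed on to the Huffman coder. */
int residual(int mode, int X, int A, int B, int C) {
    return X - predict(mode, A, B, C);
}
```

With A = 100, B = 110, C = 105 and an actual value X = 103, predictor P4 predicts 105 and the encoder would code the difference -2. In smooth regions these differences cluster near zero, which is what makes the Huffman stage effective.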
Performance: generally poor, with compression ratios of at most about 2-3
Compression Program        Compression Ratio
                           Lena    Football   F-18    Flowers
Lossless JPEG              1.45    1.54       2.29    1.26
Optimal Lossless JPEG      1.49    1.67       2.71    1.33
Compress (LZW)             0.86    1.24       2.21    0.87
Gzip (LZ77)                1.08    1.36       3.10    1.05
Gzip -9 (optimal LZ77)     1.08    1.36       3.13    1.05
Pack (Huffman coding)      1.02    1.12       1.19    1.00
Implementation details for VLC
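Consider the code for HELLO: 10 110 0 0 111. How
do you extract a bit (decoding)?
Bit operators: & |
/* Mask the top bit of 0xb1, then shift it down to bit 0. */
one = (0xb1 & 0x80) >> 7;
Bit fields:
/* One named field per bit. The order in which the fields are laid out
   within the byte is implementation-defined, so which field corresponds
   to the most significant bit of chr depends on the compiler. */
union bitField {
    struct {
        unsigned int one:1;
        unsigned int two:1;
        unsigned int thr:1;
        unsigned int fou:1;
        unsigned int fiv:1;
        unsigned int six:1;
        unsigned int sev:1;
        unsigned int eig:1;
    } bit;
    unsigned char chr;
};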
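The mask-and-shift approach generalizes to reading any bit of a packed buffer. The sketch below assumes the bits are packed MSB first, so the ten bits of the HELLO code occupy the bytes 0xb1 0xc0 with six padding bits. The names `get_bit` and `decode_hello` are hypothetical, and the code table H=10, E=110, L=0, O=111 is read off the slide's example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns bit number pos (0 = most significant bit of buf[0]) as 0 or 1. */
int get_bit(const unsigned char *buf, size_t pos) {
    return (buf[pos / 8] >> (7 - pos % 8)) & 1;
}

/* Decodes nbits bits with the prefix code H=10, E=110, L=0, O=111.
   Writes a NUL-terminated string to out and returns its length. */
size_t decode_hello(const unsigned char *buf, size_t nbits,
                    char *out, size_t cap) {
    size_t pos = 0, n = 0;
    assert(cap > 0);
    while (pos < nbits && n + 1 < cap) {
        if (get_bit(buf, pos++) == 0) { out[n++] = 'L'; continue; }  /* 0   */
        if (pos >= nbits) break;            /* truncated code word */
        if (get_bit(buf, pos++) == 0) { out[n++] = 'H'; continue; }  /* 10  */
        if (pos >= nbits) break;            /* truncated code word */
        out[n++] = get_bit(buf, pos++) ? 'O' : 'E';            /* 111 / 110 */
    }
    out[n] = '\0';
    return n;
}
```

Decoding the first 10 bits of `{0xb1, 0xc0}` gives back HELLO. Unlike the bit-field union, shifts and masks behave the same way on every compiler, which is one reason to prefer them when the bit layout matters.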
Chapter 8: Lossy compression
Information is permanently lost in the compression
process to achieve higher compression ratios
Metrics: Mean square error, SNR, Peak SNR
Primary loss mechanism: quantization to reduce
the number of different levels in the input
Three different forms of quantization
– Uniform: midrise and midtread quantizers
– Nonuniform: companded quantizer (μ-law, A-law)
– Vector Quantization
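The three distortion metrics named above are commonly written as follows. The notation is an assumption, not taken from the slide: x_n is the original sample, y_n the reconstructed sample, N the number of samples, \sigma_x^2 the mean square value of the original signal, and x_peak the largest possible sample value (for example 255 for 8-bit images).

```latex
\sigma_d^2 = \frac{1}{N} \sum_{n=1}^{N} (x_n - y_n)^2
\qquad \text{(mean square error)}

\mathrm{SNR} = 10 \log_{10} \frac{\sigma_x^2}{\sigma_d^2}
\qquad \text{(in dB)}

\mathrm{PSNR} = 10 \log_{10} \frac{x_{\mathrm{peak}}^2}{\sigma_d^2}
\qquad \text{(in dB)}
```

A lower MSE and a higher SNR or PSNR both indicate a reconstruction that is closer to the original.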
Transform coding
The rationale behind transform coding:
If Y is the result of a linear transform T of the input vector
X in such a way that the components of Y are much less
correlated, then Y can be coded more efficiently than X
If most information is accurately described by the
first few components of a transformed vector, then
the remaining components can be coarsely
quantized, or even set to zero, with little signal
distortion
Discrete Cosine Transform (DCT)
Spatial Frequency and DCT
Spatial frequency indicates how many times pixel
values change across an image block
The DCT formalizes this notion with a measure of
how much the image contents change in
correspondence to the number of cycles of a
cosine wave per block
The role of the DCT is to decompose the original
signal into its DC and AC components; the role of
the IDCT is to reconstruct (re-compose) the signal
Graphical Illustration of 8 × 8 2D DCT basis
Chapter 9: Image compression
JPEG standard - JPEG is a lossy image
compression method. It employs a transform
coding method using the DCT (Discrete Cosine
Transform)
An image is a function of i and j (or conventionally x and
y) in the spatial domain. The 2D DCT is used as one step
in JPEG in order to yield a frequency response which is a
function F(u, v) in the spatial frequency domain, indexed
by two integers u and v
Observations for JPEG Image Compression
The effectiveness of the DCT transform coding method
in JPEG relies on 3 major observations:
Observation 1: Useful image contents change relatively
slowly across the image, i.e., it is unusual for intensity
values to vary widely several times in a small area, for
example, within an 8×8 image block.
much of the information in an image is repeated (“spatial
redundancy”)
Observation 2: Psychophysical experiments suggest that
humans are much less likely to notice the loss of very high
spatial frequency components than the loss of lower
frequency components
spatial redundancy reduced by reducing the high spatial
frequency contents
Observation 3: Visual acuity (accuracy in distinguishing
closely spaced lines) is much greater for gray (“black and
white”) than for color
chroma subsampling (4:2:0) is used in JPEG
JPEG encoder
DCT on image blocks
Image is divided into 8 × 8 blocks. The 2D DCT is
applied to each block image f(i, j), with output being
the DCT coefficients F(u, v) for each block
Using blocks, however, has the effect of isolating
each block from its neighboring context. This is
why JPEG images look choppy (“blocky”) when a
high compression ratio is specified by the user
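For reference, the 8 × 8 forward and inverse transforms are usually written as below. The slides do not spell the formula out, so this is the standard textbook form, using the document's symbols f(i, j) and F(u, v) plus a normalization factor C introduced here.

```latex
F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7}
         \cos\frac{(2i+1)\,u\,\pi}{16} \,
         \cos\frac{(2j+1)\,v\,\pi}{16} \, f(i,j)

\tilde{f}(i,j) = \sum_{u=0}^{7} \sum_{v=0}^{7} \frac{C(u)\,C(v)}{4}
         \cos\frac{(2i+1)\,u\,\pi}{16} \,
         \cos\frac{(2j+1)\,v\,\pi}{16} \, F(u,v)

C(\xi) = \begin{cases} \dfrac{\sqrt{2}}{2} & \text{if } \xi = 0 \\[4pt]
                       1 & \text{otherwise} \end{cases}
```

F(0, 0) is the DC coefficient, proportional to the block average; the other 63 values are the AC coefficients.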
Quantization
F(u, v) represents a DCT coefficient, Q(u, v) is a
“quantization matrix” entry, and F̂(u, v) represents
the quantized DCT coefficients which JPEG will
use in the succeeding entropy coding:
F̂(u, v) = round(F(u, v) / Q(u, v))
quantization step is the main source for loss in JPEG
The entries of Q(u, v) tend to have larger values towards
the lower right corner. This aims to introduce more loss at
the higher spatial frequencies — a practice supported by
Observations 1 and 2
default Q(u, v) values obtained from psychophysical
studies with the goal of maximizing the compression ratio
while minimizing perceptual losses in JPEG images.
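The quantization step divides each DCT coefficient by its matrix entry and rounds to the nearest integer. The sketch below uses integer arithmetic and rounds halves away from zero, which is an assumption about the rounding convention; the function names are hypothetical.

```c
#include <assert.h>

/* Divides F by Q and rounds to the nearest integer
   (halves away from zero). Q must be positive. */
int quantize(int F, int Q) {
    assert(Q > 0);
    return (F >= 0) ? (F + Q / 2) / Q : -((-F + Q / 2) / Q);
}

/* Decoder side: the rounding error is not recoverable. */
int dequantize(int Fq, int Q) {
    return Fq * Q;
}

/* Applies quantize() to every entry of an 8x8 block. */
void quantize_block(int F[8][8], int Q[8][8], int Fq[8][8]) {
    for (int u = 0; u < 8; u++)
        for (int v = 0; v < 8; v++)
            Fq[u][v] = quantize(F[u][v], Q[u][v]);
}
```

For instance, a DC coefficient of 515 with Q = 16 quantizes to 32 and is reconstructed as 512; the difference of 3 is the loss introduced at this step. Larger Q values toward the lower right corner push most high-frequency coefficients to 0.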
The Luminance Quantization Table
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
The Chrominance Quantization Table
17 18 24 47 99 99 99 99
18 21 26 66 99 99 99 99
24 26 56 99 99 99 99 99
47 66 99 99 99 99 99 99
99 99 99 99 99 99 99 99
99 99 99 99 99 99 99 99
99 99 99 99 99 99 99 99
99 99 99 99 99 99 99 99
An 8 × 8 block f(i, j):
200 202 189 188 189 175 175 175
200 203 198 188 189 182 178 175
203 200 200 195 200 187 185 175
200 200 200 200 197 187 187 187
200 205 200 200 195 188 187 175
200 200 200 200 200 190 187 175
205 200 199 200 191 187 187 175
210 200 200 200 188 185 187 186
Its DCT coefficients F(u, v):
515  65 -12   4   1   2  -8   5
-16   3   2   0   0 -11  -2   3
-12   6  11  -1   3   0   1  -2
 -8   3  -4   2  -2  -3  -5  -2
  0  -2   7  -5   4   0  -1  -4
  0  -3  -1   0   4   1  -1   0
  3  -2  -3   3   3  -1  -1   3
 -2   5  -2   4  -2   2  -3   0
Run-length Coding on AC coefficients
To make it most likely to hit a long run of zeros, a
zig-zag scan is used to turn the 8×8 matrix of
quantized DCT coefficients into a 64-vector
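A sketch of both steps follows: the zig-zag scan, and a run-length pass over the 63 AC entries that emits (run of zeros, nonzero value) pairs and ends with a (0, 0) end-of-block pair. The function names are hypothetical, and details of real JPEG such as the limit on run lengths are left out.

```c
/* Reads an 8x8 block in zig-zag order into a 64-entry vector. */
void zigzag_scan(int block[8][8], int out[64]) {
    int i = 0, j = 0;
    for (int k = 0; k < 64; k++) {
        out[k] = block[i][j];
        if ((i + j) % 2 == 0) {        /* even diagonal: moving up-right */
            if (j == 7) i++;           /* right edge: step down */
            else if (i == 0) j++;      /* top edge: step right */
            else { i--; j++; }
        } else {                       /* odd diagonal: moving down-left */
            if (i == 7) j++;           /* bottom edge: step right */
            else if (j == 0) i++;      /* left edge: step down */
            else { i++; j--; }
        }
    }
}

/* Run-length codes the AC part zz[1..63] as (zero run, value) pairs.
   pairs must have room for 64 rows; returns the number of pairs written,
   including the final (0, 0) end-of-block pair. */
int rlc_ac(const int zz[64], int pairs[][2]) {
    int n = 0, run = 0;
    for (int k = 1; k < 64; k++) {
        if (zz[k] == 0) {
            run++;
        } else {
            pairs[n][0] = run;
            pairs[n][1] = zz[k];
            n++;
            run = 0;
        }
    }
    pairs[n][0] = 0;                   /* end of block */
    pairs[n][1] = 0;
    return n + 1;
}
```

If each block entry holds its own row-major index, the scan starts 0, 1, 8, 16, 9, 2, 3, 10, ... Because the nonzero quantized coefficients cluster in the upper left corner, the tail of the 64-vector is typically one long run of zeros that collapses into the single end-of-block pair.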
DPCM on DC coefficients
The DC coefficients are coded separately from the
AC ones. Differential Pulse Code Modulation
(DPCM) is the coding method
If the DC coefficients for the first 5 image blocks
are 150, 155, 149, 152, 144, then the DPCM would
produce 150, 5, -6, 3, -8, assuming d_i = DC_i −
DC_(i−1) for i ≥ 1, and d_0 = DC_0
AC components are Huffman coded
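The DPCM step on the DC coefficients amounts to a running difference, and decoding is a running sum. A minimal sketch with hypothetical function names:

```c
/* d[0] = dc[0]; every later d[i] is the change from the previous block. */
void dpcm_encode(const int *dc, int n, int *d) {
    for (int i = 0; i < n; i++)
        d[i] = (i == 0) ? dc[0] : dc[i] - dc[i - 1];
}

/* Inverse: accumulate the differences to recover the DC coefficients. */
void dpcm_decode(const int *d, int n, int *dc) {
    for (int i = 0; i < n; i++)
        dc[i] = (i == 0) ? d[0] : dc[i - 1] + d[i];
}
```

Encoding 150, 155, 149, 152, 144 gives 150, 5, -6, 3, -8, and decoding that sequence restores the original values. Since neighboring blocks tend to have similar averages, the differences are small and cheap to entropy code.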
Four Commonly Used JPEG Modes
Sequential Mode — the default JPEG mode, each
graylevel image or color image component is
encoded in a single left-to-right, top-to-bottom scan
Progressive Mode
Hierarchical Mode
Lossless Mode — discussed in Chapter 7
Progressive Mode
Progressive JPEG delivers low quality versions of
the image quickly, followed by higher quality passes
1. Spectral selection: Takes advantage of the
“spectral” (spatial frequency spectrum)
characteristics of the DCT coefficients: higher AC
components provide detail information
Scan 1: Encode DC and first few AC components, e.g.,
AC1, AC2
Scan 2: Encode a few more AC components, e.g., AC3,
AC4, AC5
...
Scan k: Encode the last few ACs, e.g., AC61, AC62,
AC63.
Progressive Mode (Cont’d)
2. Successive approximation: Instead of
gradually encoding spectral bands, all DCT
coefficients are encoded simultaneously but with
their most significant bits (MSBs) first
Scan 1: Encode the first few MSBs, e.g., Bits 7, 6, 5, 4.
Scan 2: Encode a few more less significant bits, e.g., Bit 3.
...
Scan m: Encode the least significant bit (LSB), Bit 0.
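The MSB-first idea can be illustrated by slicing an 8-bit magnitude into the bit ranges sent by each scan. This is only a sketch of the principle: `bit_slice` is a hypothetical helper, and the handling of signed coefficients and scan signalling in real progressive JPEG is omitted.

```c
#include <assert.h>

/* Keeps only bits hi..lo (inclusive, bit 0 = LSB) of an 8-bit magnitude,
   i.e. the part of the value that one scan contributes. */
unsigned bit_slice(unsigned mag, int hi, int lo) {
    unsigned mask;
    assert(0 <= lo && lo <= hi && hi <= 7);
    mask = ((1u << (hi - lo + 1)) - 1u) << lo;
    return mag & mask;
}
```

For a magnitude of 0xB5 (181), a first scan carrying bits 7 to 4 already conveys 0xB0 (176). The later scans for bits 3, 2, 1 and 0 add 0, 4, 0 and 1, so the decoder's approximation improves with each pass until it reaches the exact value.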