Introduction to JPEG and MPEG
Ingemar J. Cox
University College London
Outline
Elementary information theory
Lossless compression
Quantization
UCL Adastral Park Postgraduate Campus
Fundamentals of images
Discrete Cosine Transform (DCT)
JPEG
MPEG-1, MPEG-2
Nov 27th 2006 Ingemar J. Cox 2
Bibliography
D. MacKay, Information Theory, Inference, and Learning
Algorithms, Cambridge University Press, 2003.
http://www.inference.phy.cam.ac.uk/itprnn/book.html
W. B. Pennebaker and J. L. Mitchell, JPEG Still Image
Data Compression Standard, Chapman Hall, 1993
(ISBN 0-442-01272-1).
G. K. Wallace, The JPEG Still-Picture Compression
Standard, IEEE Trans. On Consumer Electronics, 38,
1, 18-34, 1992.
http://en.wikipedia.org/wiki/JPEG
http://en.wikipedia.org/wiki/MPEG-2
T. Sikora, MPEG Digital Video-Coding
Standards, IEEE Signal Processing Magazine,
82-100, September 1997
Elementary Information Theory
Elementary Information Theory
How much information does a symbol convey?
Intuitively, the more unpredictable or surprising
it is, the more information is conveyed.
Conversely, if we strongly expected something,
and it occurs, we have not learnt very much
Elementary Information Theory
If p is the probability that a symbol will occur
Then the amount of information, I, conveyed is:
$I = \log_2 \frac{1}{p}$
The information, I, is measured in bits
It is the optimum code length for the symbol
Elementary Information Theory
The entropy, H, is the average information per
symbol
$H = \sum_s p(s) \log_2 \frac{1}{p(s)}$
Provides a lower bound on the compression
that can be achieved
Elementary Information theory
A simple example. Suppose we need to
transmit four possible weather conditions:
1. Sunny
2. Cloudy
3. Rainy
4. Snowy
If all conditions are equally likely, p(s)=0.25,
and H=2
i.e. we need a minimum of 2 bits per symbol
Elementary information theory
Suppose instead that it is:
1. Sunny 0.5 of the time
2. Cloudy 0.25 of the time
3. Rainy 0.125 of the time, and
4. Snowy 0.125 of the time
Then the entropy is
$H = 0.5 \log_2 \frac{1}{0.5} + 0.25 \log_2 \frac{1}{0.25} + 2 \times 0.125 \log_2 \frac{1}{0.125}$
$H = 0.5 \times 1 + 0.25 \times 2 + 2 \times 0.125 \times 3$
$H = 0.5 + 0.5 + 0.75 = 1.75$
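The two weather examples can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `entropy` and the variable names are illustrative.

```python
import math

def entropy(probabilities):
    """Average information per symbol in bits: H = sum of p * log2(1/p)."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # all four conditions equally likely
skewed = [0.5, 0.25, 0.125, 0.125]   # sunny, cloudy, rainy, snowy

print(entropy(uniform))  # 2.0
print(entropy(skewed))   # 1.75
```

The skewed distribution needs fewer bits per symbol on average because the common symbol (sunny) carries little information.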
Elementary Information Theory
Variable length codewords
Huffman code: integer code lengths
Arithmetic codes: non-integer code lengths
Elementary Information Theory
Huffman code
Weather   Probability   Information (bits)   Integer code
Sunny     0.5           1                    0
Cloudy    0.25          2                    10
Rainy     0.125         3                    110
Snowy     0.125         3                    111
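A Huffman code for these four symbols can be built by repeatedly merging the two least probable subtrees. The sketch below is one common way to do this with a priority queue and is not taken from the slides. The helper name `huffman_code` is illustrative, and the exact bit patterns depend on how ties are broken, although the code lengths do not in this example.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a binary Huffman code for a {symbol: probability} mapping."""
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)  # least probable subtree
        p1, _, codes1 = heapq.heappop(heap)  # next least probable subtree
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

weather = {"Sunny": 0.5, "Cloudy": 0.25, "Rainy": 0.125, "Snowy": 0.125}
codes = huffman_code(weather)
average_length = sum(weather[s] * len(codes[s]) for s in weather)

for symbol in weather:
    print(symbol, codes[symbol])
print(average_length)  # 1.75 bits per symbol, equal to the entropy here
```

Because every probability in this example is a power of 1/2, the integer code lengths match the information content exactly and the average length reaches the entropy bound.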
Elementary Information Theory
The previous illustration is an example of a lossless code
I.e. we are able to recover the information exactly
Elementary Information Theory
Note that we have assumed that each symbol
is independent of the other symbols
I.e. the current symbol provides no information
regarding the next symbol
Quantization
Quantization is the process of approximating a
continuous value (or a large range of values) by a (much)
smaller set of values
x 0.5
Q( x, ) Round
Where Round(y) rounds y to the nearest integer
Δ is the quantization stepsize
Quantization
Example: Δ=2
-5 -4 -3 -2 -1 0 1 2 3 4 5
-2 -1 0 1 2
-4 -2 0 2 4
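A uniform quantizer can be sketched in a few lines. This is a minimal sketch under an assumption: the index is taken to be x divided by the step size, rounded to the nearest integer with ties rounded up. That is a common convention and may differ in detail from the formula on the slide. The names `quantize` and `dequantize` are illustrative.

```python
import math

def quantize(x, step):
    """Quantizer index: x / step rounded to the nearest integer (ties round up)."""
    return math.floor(x / step + 0.5)

def dequantize(index, step):
    """Reconstruction value for a quantizer index."""
    return index * step

step = 2
for x in range(-5, 6):
    q = quantize(x, step)
    print(x, q, dequantize(q, step))  # input, index, reconstructed value
```

The difference between an input and its reconstructed value is the quantization error, which is at most half the step size.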
Quantization
Quantization plays an important role in lossy
compression
This is where the loss happens
Fundamentals of Images
Fundamentals of images
An image consists of pixels (picture elements)
Each pixel represents luminance (and colour)
Typically, 8-bits per pixel
Fundamentals of images
Colour
Colour spaces (representations)
RGB (red-green-blue)
CMY (cyan-magenta-yellow)
YUV
Y = 0.3R + 0.6G + 0.1B (luminance)
U = B - Y
V = R - Y
Greyscale
Binary
Fundamentals of images
A TV frame is about 640×480 pixels
If each pixel is represented by 8 bits for each
colour, then the total image size is
640×480×3 = 921,600 bytes, or about 7.4 Mbits
At 30 frames per second, this would be about
220 Mbits/second
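The arithmetic behind these figures, as a short sketch; the variable names are illustrative and three colour components per pixel are assumed.

```python
width, height = 640, 480
bytes_per_pixel = 3        # 8 bits for each of three colour components
frames_per_second = 30

frame_bytes = width * height * bytes_per_pixel
frame_bits = frame_bytes * 8
bits_per_second = frame_bits * frames_per_second

print(frame_bytes)             # 921600 bytes per frame
print(frame_bits / 1e6)        # 7.3728 Mbits per frame
print(bits_per_second / 1e6)   # 221.184 Mbits per second
```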
Fundamentals of images
Do we need all these bits?
Fundamentals of images
Here is an image represented with 8 bits per pixel
Here is the same image at 7 bits per pixel
And at 6 bits per pixel
And at 5 bits per pixel
And at 4 bits per pixel
[Figures: the same image shown at 8, 7, 6, 5 and 4 bits per pixel]
Fundamentals of images
Do we need all these bits?
No!
The previous example illustrated the eye's
sensitivity to luminance
We can build a perceptual model
Only code what is important to the human visual
system (HVS)
Usually a function of spatial frequency
Fundamentals of Images
Just as audio has temporal frequencies
Images have spatial frequencies
Transforms
Fourier transform
Discrete cosine transform
Wavelet transform
Hadamard transform
Discrete cosine transform
Forward DCT
$S(u) = \frac{C(u)}{2} \sum_{n=0}^{N-1} s(n) \cos\left(\frac{u\pi}{8}(n + 0.5)\right)$
Inverse DCT
$s(n) = \sum_{u=0}^{N-1} \frac{C(u)}{2} S(u) \cos\left(\frac{u\pi}{8}(n + 0.5)\right)$
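The forward and inverse transforms can be sketched directly from their definitions. This is a minimal, unoptimised sketch assuming N = 8 and the scaling C(0) = 1/√2, C(u) = 1 otherwise, as given with the 2-D transform later in these slides. The function names are illustrative, and practical codecs use fast algorithms instead of this direct summation.

```python
import math

N = 8

def c(u):
    """Scaling factor: 1/sqrt(2) for the DC term, 1 for all other terms."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct(s):
    """Forward 8-point DCT of a list of 8 samples."""
    return [
        c(u) / 2 * sum(s[n] * math.cos(u * math.pi / 8 * (n + 0.5)) for n in range(N))
        for u in range(N)
    ]

def idct(S):
    """Inverse 8-point DCT of a list of 8 coefficients."""
    return [
        sum(c(u) / 2 * S[u] * math.cos(u * math.pi / 8 * (n + 0.5)) for u in range(N))
        for n in range(N)
    ]

signal = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]  # arbitrary test signal
coefficients = dct(signal)
reconstructed = idct(coefficients)
print([round(v, 4) for v in coefficients])
print([round(v, 4) for v in reconstructed])
```

Applying `idct` to the output of `dct` recovers the original samples up to floating-point rounding, so the transform by itself loses no information.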
Basis functions
[Figures: the eight 1-D DCT basis functions, one per slide: the DC term, then the first through seventh terms]
DCT Example
Example
[Figure: the example signal]
Example
DCT coefficients are:
4.2426, 0, -3.1543, 0, 0, 0, -0.2242, 0
Example: DCT decomposition
[Figures: the DC term, the 2nd AC term and the 6th AC term of the example signal]
Example: summation of DCT terms
[Figures: the sum of the first two non-zero coefficients, then the sum of all 3 non-zero coefficients]
Example
What if we quantize DCT coefficients?
Δ=1
Quantized DCT coefficients are:
4, 0, -3, 0, 0, 0, 0, 0
Example
[Figure: approximate reconstruction from the quantized coefficients]
Example
[Figure: exact reconstruction from the unquantized coefficients]
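The two reconstructions can be compared numerically by applying the inverse DCT to both coefficient sets from the example. This is a sketch assuming the 8-point inverse DCT with C(0) = 1/√2 and C(u) = 1 otherwise; the function and variable names are illustrative.

```python
import math

def idct(S):
    """Inverse 8-point DCT, with C(0) = 1/sqrt(2) and C(u) = 1 otherwise."""
    def c(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0
    return [
        sum(c(u) / 2 * S[u] * math.cos(u * math.pi / 8 * (n + 0.5)) for u in range(8))
        for n in range(8)
    ]

exact_coefficients = [4.2426, 0, -3.1543, 0, 0, 0, -0.2242, 0]
quantized_coefficients = [4, 0, -3, 0, 0, 0, 0, 0]

exact = idct(exact_coefficients)
approximate = idct(quantized_coefficients)
squared_error = sum((a - b) ** 2 for a, b in zip(exact, approximate))

print([round(v, 3) for v in exact])
print([round(v, 3) for v in approximate])
print(round(squared_error, 4))
```

With this scaling the transform is orthonormal, so the total squared error in the signal equals the total squared error introduced in the coefficients.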
2-D DCT Transform
Let i(x,y) represent an image block with N rows and
M columns (M = N = 8 in the formula below)
Its DCT I(u,v) is given by
$I(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} i(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$
where
$C(0) = \frac{1}{\sqrt{2}}$ and $C(u) = 1$ for $u > 0$
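A direct, unoptimised sketch of the 2-D transform for a single 8×8 block, assuming indices that run from 0 to 7 and the C(u) scaling defined above. The name `dct2` is illustrative; real encoders use fast, separable implementations.

```python
import math

def c(u):
    """Scaling factor: 1/sqrt(2) for index 0, 1 otherwise."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct2(block):
    """2-D DCT of an 8x8 block given as a list of 8 rows of 8 values."""
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / 16)
                        * math.cos((2 * y + 1) * v * math.pi / 16)
                    )
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

flat_block = [[1.0] * 8 for _ in range(8)]  # constant block: only the DC term is non-zero
coefficients = dct2(flat_block)
print(round(coefficients[0][0], 6))  # 8.0
```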
Fundamentals of images
Discrete cosine transform
Coefficients are approximately uncorrelated
Except DC term
C.f. original 8×8 pixel block
Concentrates more power in the low frequency
coefficients
Computationally efficient
Block-based DCT
Compute DCT on 8×8 blocks of pixels
Fundamentals of images
Basis functions for the 8×8 DCT (courtesy
Wikipedia)
[Figure: the 64 basis functions of the 8×8 DCT]
Fundamentals of JPEG
Fundamentals of JPEG
Encoder: DCT → Quantizer → Entropy coder → compressed image data
Decoder: compressed image data → Entropy decoder → Dequantizer → IDCT
Fundamentals of JPEG
JPEG works on 8×8 blocks
Extract 8×8 block of pixels
Convert to DCT domain
Quantize each coefficient
Different stepsize for each coefficient
Based on sensitivity of human visual system
Order coefficients in zig-zag order
Entropy code the quantized values
Fundamentals of JPEG
A common quantization table is
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
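Each DCT coefficient is divided by the step size at the same position in the table and rounded. The sketch below assumes rounding to the nearest integer with ties rounded up; the function names and the test coefficients are illustrative, not taken from the slides.

```python
import math

Q_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize_block(coeffs, table=Q_TABLE):
    """Divide each coefficient by its own step size and round to the nearest integer."""
    return [[math.floor(coeffs[u][v] / table[u][v] + 0.5) for v in range(8)] for u in range(8)]

def dequantize_block(indices, table=Q_TABLE):
    """Multiply each index by its step size to approximate the original coefficient."""
    return [[indices[u][v] * table[u][v] for v in range(8)] for u in range(8)]

# Hypothetical coefficient block: a few non-zero values, the rest zero.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 100.0, -30.0, 40.0

indices = quantize_block(coeffs)
restored = dequantize_block(indices)
print(indices[0][0], indices[0][1], indices[7][7])
print(restored[0][0], restored[0][1], restored[7][7])
```

The large step sizes towards the bottom right of the table mean that high-frequency coefficients are often quantized to zero, which is what makes the later run-length coding effective.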
Fundamentals of JPEG
Zig-zag ordering
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
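The zig-zag order can be generated instead of stored: positions are visited one anti-diagonal at a time, alternating direction on each diagonal. The sketch below is one way to do this and reproduces the index table above; the function names are illustrative.

```python
def zigzag_positions(n=8):
    """(row, column) positions of an n x n block in zig-zag scan order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal (r + c). Odd diagonals run top to bottom (by row),
    # even diagonals run bottom to top (equivalently, by increasing column).
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def zigzag_scan(block):
    """Flatten a square block into a list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_positions(len(block))]

# Rebuild the index table: entry (r, c) holds that position's place in the scan.
index = [[0] * 8 for _ in range(8)]
for k, (r, c) in enumerate(zigzag_positions()):
    index[r][c] = k

for row in index:
    print(row)
```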
Fundamentals of JPEG
Entropy coding
Run length encoding followed by
Huffman
Arithmetic
DC term treated separately
Differential Pulse Code Modulation (DPCM)
2-step process
1. Convert zig-zag sequence to a symbol sequence
2. Convert symbols to a data stream
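The first step, turning coefficients into symbols, can be illustrated with a simplified sketch. It is not the exact JPEG symbol format, which also encodes the size category of each value and limits run lengths. The function names, the `"EOB"` end-of-block marker and the sample values are assumptions made for illustration.

```python
def dpcm(dc_terms):
    """Code each block's DC term as the difference from the previous block's DC term."""
    previous = 0
    differences = []
    for dc in dc_terms:
        differences.append(dc - previous)
        previous = dc
    return differences

def run_length_symbols(ac_terms):
    """Convert zig-zag ordered AC terms into (zero_run, value) pairs plus an end marker."""
    symbols = []
    run = 0
    for value in ac_terms:
        if value == 0:
            run += 1              # count zeros preceding the next non-zero value
        else:
            symbols.append((run, value))
            run = 0
    symbols.append("EOB")         # trailing zeros are implied by the end-of-block marker
    return symbols

print(dpcm([52, 55, 50]))                           # [52, 3, -5]
print(run_length_symbols([5, 0, 0, -3, 0, 0, 0]))   # [(0, 5), (2, -3), 'EOB']
```

The resulting symbols would then be passed to the Huffman or arithmetic coder in the second step.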
Fundamentals of JPEG
Modes
Sequential
Progressive
Spectral selection
Send lower frequency coefficients first
Successive approximation
Send lower precision first, and subsequently refine
Lossless
Hierarchical
Send low resolution image first
Fundamentals of MPEG-1/2
Fundamentals of MPEG
A sequence of 2D images
Temporal correlation as well as spatial
correlation
TV broadcast
Frame-based
Field-based
MPEG
Moving Picture Experts Group
Standard for video compression
Similarities with JPEG
MPEG
Design is a compromise between
Bit rate
Encoder/decoder complexity
Random access capability
MPEG
Images
Spatial redundancy
Perceptual redundancy
Video
Spatial redundancy
Intraframe coding
Temporal redundancy
Interframe coding
Perceptual redundancy
MPEG
Consider a sequence of n frames of video.
It consists of:
I-frames
P-frames
B-frames
A sequence of one I-frame followed by P- and
B-frames is known as a GOP
Group of Pictures
E.g. IBBPBBPBBPBBP
MPEG
I-frames
Intraframe coded
No motion compensation
P-frames
Interframe coded
Motion compensation
Based on past frames only
B-frames
Interframe coded
Motion compensation
Based on past and future frames
MPEG
Motion-compensated prediction
Divide current frame, i, into disjoint 16×16
macroblocks
Search a window in previous frame, i-1, for closest
match
Calculate the prediction error
For each of the four 8×8 blocks in the macroblock,
perform DCT-based coding
Transmit motion vector + entropy coded prediction
error (lossy coding)
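The search step can be sketched as exhaustive block matching. This is a minimal sketch under assumptions that are not in the slides: the match criterion is the sum of absolute differences (SAD), every offset in the window is tried, and frames are lists of rows of luminance values. The function names and the small test frames are illustrative; real encoders use faster search strategies.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(size)
        for i in range(size)
    )

def best_motion_vector(cur, ref, bx, by, size=16, search=7):
    """Try every offset within +/- search pixels; return (dx, dy, cost) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > width or y + size > height:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]

# Hypothetical 12x12 frames: the current frame is the reference shifted by (2, 1).
W = H = 12
ref = [[y * W + x for x in range(W)] for y in range(H)]
cur = [[ref[min(y + 1, H - 1)][min(x + 2, W - 1)] for x in range(W)] for y in range(H)]

dx, dy, cost = best_motion_vector(cur, ref, bx=4, by=4, size=4, search=3)
print(dx, dy, cost)  # 2 1 0
```

Here the motion vector (dx, dy) points from the block's position in the current frame to the matching block in the reference frame; the prediction error is the difference between the two blocks and is zero in this toy example.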
MPEG
Like JPEG, the DC term is treated separately
DPCM
B-frame compression is high
But B-frames need buffering and introduce delay
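The buffering requirement follows from the dependencies: a B-frame cannot be decoded until the later frame it refers to has arrived, so frames are transmitted in a different order from the one in which they are displayed. The sketch below assumes each B-frame depends on the nearest I- or P-frame on either side; the function name is illustrative.

```python
def coding_order(display_order):
    """Reorder frames so that each B-frame follows both of its reference frames."""
    ordered = []
    pending_b = []
    for index, frame_type in enumerate(display_order):
        if frame_type == "B":
            pending_b.append((index, frame_type))  # hold until the next reference frame
        else:
            ordered.append((index, frame_type))    # I- or P-frame goes out first
            ordered.extend(pending_b)              # then the B-frames that needed it
            pending_b = []
    ordered.extend(pending_b)  # trailing B-frames with no later reference in this sequence
    return ordered

print(coding_order("IBBPBBP"))
# [(0, 'I'), (3, 'P'), (1, 'B'), (2, 'B'), (6, 'P'), (4, 'B'), (5, 'B')]
```

Both encoder and decoder must therefore hold frames in a buffer, which adds delay.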