DIP Unit 4 image compression

The document discusses digital image processing with a focus on image coding and compression techniques. It explains data redundancy, including coding, interpixel, and psychovisual redundancies, and outlines various compression methods such as error-free and lossy compression. Additionally, it describes the structure of compression systems, including encoders and decoders, and emphasizes the importance of reducing redundancies to achieve efficient data representation.

Uploaded by

rehna
Copyright © All Rights Reserved

VII SEMESTER ELECTRONICS Elective – I : Digital Image Processing

Unit 4: Image Coding and Compression


Image Coding Fundamentals; Image Compression Model; Fundamentals - redundancy:
coding, interpixel, psychovisual; fidelity criteria. Basic compression methods:
Error-Free Compression - variable length, bit plane, LZW, arithmetic, Lossless
Predictive; Lossy Compression - Lossy Predictive.
Fundamentals of JPEG, MPEG, fractals.

Prof. Vijay V. Chakole, Department of Electronics Engineering, KDKCE, Nagpur Page 1



Image compression

The term data compression refers to the process of reducing the amount of data required to
represent a given quantity of information. A clear distinction must be made between data and
information. They are not synonymous.
In fact, data are the means by which information is conveyed. Various amounts of
data may be used to represent the same amount of information. Such might be the case,
for example, if a long-winded individual and someone who is short and to the point were to
relate the same story. Here, the information of interest is the story; words are the data used to
relate the information.
If the two individuals use a different number of words to tell the same basic story, two
different versions of the story are created, and at least one includes nonessential data.
That is, it contains data (or words) that either provide no relevant information or
simply restate that which is already known. It is thus said to contain data redundancy.
Data redundancy is a central issue in digital image compression.
It is not an abstract concept but a mathematically quantifiable entity. If n1 and n2
denote the number of information-carrying units in two data sets that represent the same
information, the relative data redundancy RD of the first data set (the one characterized by n1)
can be defined as

$$R_D = 1 - \frac{1}{C_R}$$

where CR, commonly called the compression ratio, is

$$C_R = \frac{n_1}{n_2}$$

For the case n2 = n1, CR = 1 and RD = 0, indicating that (relative to the second data set) the first
representation of the information contains no redundant data.
When n2 << n1, CR → ∞ and RD → 1, implying significant compression and highly
redundant data.
Finally, when n2 >> n1, CR → 0 and RD → -∞, indicating that the second data set
contains much more data than the original representation.


This, of course, is the normally undesirable case of data expansion. In general, CR
and RD lie in the open intervals (0, ∞) and (-∞, 1), respectively.
A practical compression ratio, such as 10 (or 10:1), means that the first data set has 10
information carrying units (say, bits) for every 1 unit in the second or compressed data set.
The corresponding redundancy of 0.9 implies that 90% of the data in the first data set is
redundant.
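
The two quantities can be checked numerically. The following is a minimal sketch, assuming the definitions CR = n1/n2 and RD = 1 - 1/CR; the function names are illustrative and not part of the original notes.

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2: units in the first data set per unit in the second."""
    return n1 / n2


def relative_redundancy(n1, n2):
    """R_D = 1 - 1/C_R: fraction of the first data set that is redundant."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)


# The 10:1 case discussed above: 10 units in for every 1 unit out.
cr = compression_ratio(10, 1)
rd = relative_redundancy(10, 1)
print(f"C_R = {cr:.1f}, R_D = {rd:.2f}")

# Data expansion (n2 > n1) gives C_R < 1 and a negative R_D.
print(relative_redundancy(1, 4))
```

A negative redundancy simply signals that the second representation is larger than the first.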
In digital image compression, three basic data redundancies can be identified and exploited:
coding redundancy,
interpixel redundancy, and
psychovisual redundancy.
Data compression is achieved when one or more of these redundancies are reduced or
eliminated.

Coding Redundancy:
Here we show how the gray-level histogram of an image can
provide a great deal of insight into the construction of codes to reduce the amount of data
used to represent it.
Let us assume that a discrete random variable rk in the interval [0, 1]
represents the gray levels of an image and that each rk occurs with probability pr(rk):

$$p_r(r_k) = \frac{n_k}{n}, \qquad k = 0, 1, 2, \ldots, L-1$$

where L is the number of gray levels, nk is the number of times that the kth gray level
appears in the image, and n is the total number of pixels in the image. If the number of bits
used to represent each value of rk is l(rk), then the average number of bits required to
represent each pixel is

$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$$

That is, the average length of the code words assigned to the various gray-level values is
found by summing the product of the number of bits used to represent each gray level and the
probability that the gray level occurs. Thus the total number of bits required to code an M x
N image is MN·Lavg.
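
As a rough illustration of the average-length formula, the sketch below estimates pr(rk) from a histogram and compares a fixed-length code with a variable-length one. The tiny 8-pixel image and both sets of code lengths are hypothetical values chosen for the example.

```python
from collections import Counter


def average_code_length(pixels, code_lengths):
    """L_avg = sum over gray levels of l(r_k) * p_r(r_k), with p_r(r_k) = n_k / n."""
    n = len(pixels)
    histogram = Counter(pixels)  # n_k for every gray level that occurs
    return sum(code_lengths[level] * n_k / n for level, n_k in histogram.items())


# Hypothetical 4-level image with 8 pixels; level 0 dominates.
pixels = [0, 0, 0, 0, 1, 1, 2, 3]
fixed_lengths = {0: 2, 1: 2, 2: 2, 3: 2}      # natural 2-bit binary code
variable_lengths = {0: 1, 1: 2, 2: 3, 3: 3}   # shorter words for likelier levels

l_fixed = average_code_length(pixels, fixed_lengths)
l_variable = average_code_length(pixels, variable_lengths)
total_bits = len(pixels) * l_variable          # M * N * L_avg

print(l_fixed, l_variable, total_bits)
```

Assigning fewer bits to the more probable gray levels lowers the average, which is exactly the coding redundancy being removed.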

Interpixel Redundancy:

Fig.1.1 Two images and their gray-level histograms and normalized autocorrelation
coefficients along one line.
Consider the images shown in Figs. 1.1(a) and (b). As Figs. 1.1(c) and (d) show, these
images have virtually identical histograms. Note also that both histograms are trimodal,
indicating the presence of three dominant ranges of gray-level values. Because the gray levels
in these images are not equally probable, variable-length coding can be used to reduce the
coding redundancy that would result from a straight or natural binary encoding of their
pixels.
The coding process, however, would not alter the level of correlation between the
pixels within the images.


In other words, the codes used to represent the gray levels of each image have nothing
to do with the correlation between pixels. These correlations result from the structural or
geometric relationships between the objects in the image.
Figures 1.1(e) and (f) show the respective autocorrelation coefficients computed along one
line of each image:

$$\gamma(\Delta n) = \frac{A(\Delta n)}{A(0)}$$

where

$$A(\Delta n) = \frac{1}{N - \Delta n} \sum_{y=0}^{N-1-\Delta n} f(x, y)\, f(x, y + \Delta n)$$

The scaling factor in the equation above accounts for the varying number of sum terms that arise for
each integer value of Δn. Of course, Δn must be strictly less than N, the number of pixels on a
line. The variable x is the coordinate of the line used in the computation.
Note the dramatic difference between the shape of the functions shown in Figs. 1.1(e) and (f).
Their shapes can be qualitatively related to the structure in the images in Figs. 1.1(a) and (b).
This relationship is particularly noticeable in Fig. 1.1(f), where the high correlation
between pixels separated by 45 and 90 samples can be directly related to the spacing between
the vertically oriented matches of Fig. 1.1(b). In addition, the adjacent pixels of both images
are highly correlated.
When Δn is 1, γ is 0.9922 and 0.9928 for the images of Figs. 1.1(a) and (b),
respectively. These values are typical of most properly sampled television images.
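
A small sketch of the normalized autocorrelation along one image line may help. It assumes the form γ(Δn) = A(Δn)/A(0), with A(Δn) the scaled sum of products of pixels Δn samples apart; the test line below is a made-up periodic pattern, not data from Fig. 1.1.

```python
def autocorrelation(line, dn):
    """A(dn): mean product of pixels that are dn samples apart on one line."""
    n = len(line)
    if not 0 <= dn < n:
        raise ValueError("dn must satisfy 0 <= dn < N")
    terms = n - dn  # the number of products shrinks as dn grows
    return sum(line[y] * line[y + dn] for y in range(terms)) / terms


def gamma(line, dn):
    """Normalized autocorrelation coefficient A(dn) / A(0)."""
    return autocorrelation(line, dn) / autocorrelation(line, 0)


# Hypothetical line with a period of two samples.
line = [0, 10] * 4
print(gamma(line, 1), gamma(line, 2))
```

For this pattern the coefficient drops to zero at an offset of one sample and returns to one at an offset of two, mirroring how peaks in Fig. 1.1(f) reveal the spacing of repeated structures.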
These illustrations reflect another important form of data redundancy—one directly related to
the interpixel correlations within an image. Because the value of any given pixel can be
reasonably predicted from the value of its neighbors, the information carried by individual
pixels is relatively small. Much of the visual contribution of a single pixel to an image is
redundant; it could have been guessed on the basis of the values of its neighbors.
A variety of names, including spatial redundancy, geometric redundancy, and
interframe redundancy, have been coined to refer to these interpixel dependencies. We use
the term interpixel redundancy to encompass them all.


In order to reduce the interpixel redundancies in an image, the 2-D pixel array normally used
for human viewing and interpretation must be transformed into a more efficient (but usually
"nonvisual") format. For example, the differences between adjacent pixels can be used to
represent an image. Transformations of this type (that is, those that remove interpixel
redundancy) are referred to as mappings. They are called reversible mappings if the original
image elements can be reconstructed from the transformed data set.
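
The adjacent-pixel difference mapping mentioned above can be sketched in a few lines. This is only an illustration under the assumption that the first pixel of a row is stored as is; the sample row is hypothetical.

```python
def difference_map(row):
    """Keep the first pixel, then store each pixel as the change from its left neighbor."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]


def inverse_difference_map(diffs):
    """Undo the mapping with a running sum, which makes the mapping reversible."""
    row, total = [], 0
    for d in diffs:
        total += d
        row.append(total)
    return row


row = [100, 101, 103, 103, 102, 104]   # hypothetical, slowly varying gray levels
diffs = difference_map(row)
print(diffs)
print(inverse_difference_map(diffs) == row)
```

The differences cluster around zero even though the pixel values do not, which is what later coding stages can exploit.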

Psychovisual Redundancy:
The brightness of a region, as perceived by the eye, depends on factors other than simply the
light reflected by the region. For example, intensity variations (Mach bands) can be perceived
in an area of constant intensity. Such phenomena result from the fact that the eye does not
respond with equal sensitivity to all visual information. Certain information simply has less
relative importance than other information in normal visual processing. This information is
said to be psychovisually redundant.
It can be eliminated without significantly impairing the quality of image perception.
That psychovisual redundancies exist should not come as a surprise, because human
perception of the information in an image normally does not involve quantitative analysis of
every pixel value in the image. In general, an observer searches for distinguishing features
such as edges or textural regions and mentally combines them into recognizable groupings.
The brain then correlates these groupings with prior knowledge in order to complete the
image interpretation process.
Psychovisual redundancy is fundamentally different from the redundancies discussed
earlier. Unlike coding and interpixel redundancy, psychovisual redundancy is associated with
real or quantifiable visual information. Its elimination is possible only because the
information itself is not essential for normal visual processing. Since the elimination of
psychovisually redundant data results in a loss of quantitative information, it is commonly
referred to as quantization.
This terminology is consistent with normal usage of the word, which generally means the
mapping of a broad range of input values to a limited number of output values. As it is
an irreversible operation (visual information is lost), quantization results in lossy data
compression.
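
To make the irreversibility concrete, here is a minimal sketch of uniform quantization of 8-bit gray levels. Keeping only the 4 most significant bits is an assumption chosen for the example, not a method prescribed by the notes.

```python
def quantize(pixel, bits=4):
    """Map an 8-bit gray level to one of 2**bits levels by dropping low-order bits."""
    shift = 8 - bits
    return (pixel >> shift) << shift


samples = [0, 7, 15, 16, 127, 128, 255]
quantized = [quantize(p) for p in samples]
levels = {quantize(p) for p in range(256)}

print(quantized)
print(len(levels))
```

Inputs 0, 7, and 15 all map to the same output, so the original values cannot be recovered from the quantized data.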


Image compression models

As the figure shows, a compression system consists of two distinct structural blocks: an encoder and a
decoder.
An input image f(x, y) is fed into the encoder, which creates a set of symbols from the
input data. After transmission over the channel, the encoded representation is fed to the
decoder, where a reconstructed output image f̂(x, y) is generated.
In general, f̂(x, y) may or may not be an exact replica of f(x, y). If it is, the system is error
free or information preserving; if not, some level of distortion is present in the reconstructed
image.
Both the encoder and decoder shown in Fig. consist of two relatively independent
functions or sub-blocks. The encoder is made up of a source encoder, which removes input
redundancies, and a channel encoder, which increases the noise immunity of the source
encoder's output. As would be expected, the decoder includes a channel decoder followed by
a source decoder.
If the channel between the encoder and decoder is noise free (not prone to error), the
channel encoder and decoder are omitted, and the general encoder and decoder become the
source encoder and decoder, respectively.

The Source Encoder and Decoder:


The source encoder is responsible for reducing or eliminating any coding, interpixel,
or psychovisual redundancies in the input image. The specific application and associated
fidelity requirements dictate the best encoding approach to use in any given situation.
Normally, the approach can be modeled by a series of three independent operations. As Fig.
(a) shows, each operation is designed to reduce one of the three redundancies. Figure (b)
depicts the corresponding source decoder. In the first stage of the source encoding process,
the mapper transforms the input data into a (usually nonvisual) format designed to reduce
interpixel redundancies in the input image. This operation generally is reversible and may or
may not directly reduce the amount of data required to represent the image.

(a) Source encoder and (b) source decoder model

Run-length coding is an example of a mapping that directly results in data compression in
this initial stage of the overall source encoding process. The representation of an image by a
set of transform coefficients is an example of the opposite case. Here, the mapper transforms
the image into an array of coefficients, making its interpixel redundancies more accessible for
compression in later stages of the encoding process.
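
Run-length coding, cited above as a mapping that compresses directly, can be sketched as follows. The (value, run length) pair format and the sample row are assumptions made for the illustration.

```python
def rle_encode(row):
    """Replace each run of identical pixels with a (value, run_length) pair."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


row = [0] * 6 + [255] * 3 + [0] * 7   # hypothetical binary-looking scan line
runs = rle_encode(row)
print(runs)
print(rle_decode(runs) == row)
```

Sixteen pixels collapse to three pairs here, while a row with no repeated neighbors would grow, which is why the mapping suits images with long constant runs.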
The second stage, or quantizer block in Fig. (a), reduces the accuracy of the mapper's output
in accordance with some preestablished fidelity criterion. This stage reduces the psychovisual
redundancies of the input image. This operation is irreversible. Thus it must be omitted when
error-free compression is desired.
In the third and final stage of the source encoding process, the symbol coder creates a
fixed- or variable-length code to represent the quantizer output and maps the output in
accordance with the code.
The term symbol coder distinguishes this coding operation from the overall source
encoding process.
In most cases, a variable-length code is used to represent the mapped and quantized
data set.
It assigns the shortest code words to the most frequently occurring output values and
thus reduces coding redundancy.


The operation, of course, is reversible. Upon completion of the symbol coding step,
the input image has been processed to remove each of the three redundancies.
Figure (a) shows the source encoding process as three successive operations, but all three
operations are not necessarily included in every compression system. Recall, for example,
that the quantizer must be omitted when error-free compression is desired.
In addition, some compression techniques normally are modeled by merging blocks
that are physically separate in Fig. (a). In the predictive compression systems, for instance,
the mapper and quantizer are often represented by a single block, which simultaneously
performs both operations.
The source decoder shown in Fig. (b) contains only two components: a symbol decoder and
an inverse mapper. These blocks perform, in reverse order, the inverse operations of the
source encoder's symbol encoder and mapper blocks. Because quantization results in
irreversible information loss, an inverse quantizer block is not included in the general source
decoder model shown in Fig. (b).

The Channel Encoder and Decoder:


The channel encoder and decoder play an important role in the overall encoding-
decoding process when the channel of Fig. is noisy or prone to error. They are designed to
reduce the impact of channel noise by inserting a controlled form of redundancy into the
source encoded data. As the output of the source encoder contains little redundancy, it would
be highly sensitive to transmission noise without the addition of this "controlled redundancy."
One of the most useful channel encoding techniques was devised by R. W. Hamming
(Hamming [1950]). It is based on appending enough bits to the data being encoded to ensure
that some minimum number of bits must change between valid code words. Hamming
showed, for example, that if 3 bits of redundancy are added to a 4-bit word, so that the
distance between any two valid code words is 3, all single-bit errors can be detected and
corrected. (By appending additional bits of redundancy, multiple-bit errors can be detected
and corrected.) The 7-bit Hamming (7, 4) code word h1h2h3...h6h7 associated with a
4-bit binary number b3b2b1b0 is

$$
\begin{aligned}
h_1 &= b_3 \oplus b_2 \oplus b_0, & h_3 &= b_3, \\
h_2 &= b_3 \oplus b_1 \oplus b_0, & h_5 &= b_2, \\
h_4 &= b_2 \oplus b_1 \oplus b_0, & h_6 &= b_1, \\
    &                             & h_7 &= b_0,
\end{aligned}
$$

where ⊕ denotes the exclusive OR operation. Note that bits h1, h2, and h4 are even-
parity bits for the bit fields b3b2b0, b3b1b0, and b2b1b0, respectively. (Recall that a string
of binary bits has even parity if the number of bits with a value of 1 is even.)
To decode a Hamming encoded result, the channel decoder must check the encoded
value for odd parity over the bit fields in which even parity was previously established. A
single-bit error is indicated by a nonzero parity word c4c2c1, where

$$
\begin{aligned}
c_1 &= h_1 \oplus h_3 \oplus h_5 \oplus h_7, \\
c_2 &= h_2 \oplus h_3 \oplus h_6 \oplus h_7, \\
c_4 &= h_4 \oplus h_5 \oplus h_6 \oplus h_7.
\end{aligned}
$$

If a nonzero value is found, the decoder simply complements the code word bit
position indicated by the parity word. The decoded binary value is then extracted from the
corrected code word as h3h5h6h7.
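
The encoding and single-error correction steps can be sketched as below. The parity assignments follow the bit fields named in the text (h1 over b3b2b0, h2 over b3b1b0, h4 over b2b1b0); the function names and the list layout of the code word are assumptions for illustration.

```python
def hamming74_encode(b3, b2, b1, b0):
    """Return the code word [h1, ..., h7] for the 4-bit number b3b2b1b0."""
    h1 = b3 ^ b2 ^ b0   # even parity over b3, b2, b0
    h2 = b3 ^ b1 ^ b0   # even parity over b3, b1, b0
    h4 = b2 ^ b1 ^ b0   # even parity over b2, b1, b0
    return [h1, h2, b3, h4, b2, b1, b0]


def hamming74_decode(word):
    """Correct at most one flipped bit and return (b3, b2, b1, b0)."""
    h = list(word)
    h1, h2, h3, h4, h5, h6, h7 = h
    c1 = h1 ^ h3 ^ h5 ^ h7
    c2 = h2 ^ h3 ^ h6 ^ h7
    c4 = h4 ^ h5 ^ h6 ^ h7
    position = 4 * c4 + 2 * c2 + c1   # parity word c4c2c1 read as a binary number
    if position:                      # nonzero: complement the indicated bit
        h[position - 1] ^= 1
    return (h[2], h[4], h[5], h[6])   # h3 h5 h6 h7


word = hamming74_encode(1, 0, 1, 1)
corrupted = list(word)
corrupted[4] ^= 1                     # flip h5 to simulate channel noise
print(word, corrupted, hamming74_decode(corrupted))
```

Any single flipped bit among the seven is located by the parity word and repaired before the data bits are read out.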

Variable-Length Coding:

The simplest approach to error-free image compression is to reduce only coding redundancy.
Coding redundancy normally is present in any natural binary encoding of the gray levels in
an image. It can be eliminated by coding the gray levels so that the average code length is
minimized. To do so requires construction of a variable-length code that assigns the shortest
possible code words to the most probable gray levels.
Here, we examine several optimal and near optimal techniques for constructing such a
code. These techniques are formulated in the language of information theory. In practice, the
source symbols may be either the gray levels of an image or the output of a gray-level
mapping operation (pixel differences, run lengths, and so on).

Huffman coding:


The most popular technique for removing coding redundancy is due to Huffman (Huffman
[1952]). When coding the symbols of an information source individually, Huffman coding
yields the smallest possible number of code symbols per source symbol. In terms of the
noiseless coding theorem, the resulting code is optimal for a fixed value of n, subject to the
constraint that the source symbols be coded one at a time.

The first step in Huffman's approach is to create a series of source reductions by
ordering the probabilities of the symbols under consideration and combining the lowest
probability symbols into a single symbol that replaces them in the next source reduction.
Fig. 4.1 illustrates this process for binary coding (K-ary Huffman codes can also be
constructed).

At the far left, a hypothetical set of source symbols and their probabilities are ordered
from top to bottom in terms of decreasing probability values. To form the first source
reduction, the bottom two probabilities, 0.06 and 0.04, are combined to form a "compound
symbol" with probability 0.1. This compound symbol and its associated probability are
placed in the first source reduction column so that the probabilities of the reduced source are
also ordered from the most to the least probable. This process is then repeated until a reduced
source with two symbols (at the far right) is reached.

The second step in Huffman's procedure is to code each reduced source, starting with
the smallest source and working back to the original source. The minimal length binary code
for a two-symbol source, of course, is the symbols 0 and 1. As Fig. 4.2 shows, these symbols are
assigned to the two symbols on the right (the assignment is arbitrary; reversing the order of
the 0 and 1 would work just as well). As the reduced source symbol with probability 0.6
was generated by combining two symbols in the reduced source to its left, the 0 used to code
it is now assigned to both of these symbols, and a 0 and 1 are arbitrarily appended to each to
distinguish them from each other. This operation is then repeated for each reduced source
until the original source is reached. The final code appears at the far left in Fig. 4.2.


Fig.4.1 Huffman source reductions.

Fig.4.2 Huffman code assignment procedure.

The average length of this code is

$$L_{avg} = (0.4)(1) + (0.3)(2) + (0.1)(3) + (0.1)(4) + (0.06)(5) + (0.04)(5) = 2.2 \text{ bits/symbol}$$

and the entropy of the source is 2.14 bits/symbol. The resulting Huffman code efficiency is
2.14/2.2 = 0.973. Huffman's procedure creates the optimal code for a set of symbols and probabilities
subject to the constraint that the symbols be coded one at a time.
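
The source-reduction procedure can be sketched with a priority queue. This is a minimal illustration, not the figure's exact construction: the six probabilities are assumed from the standard textbook example (the text quotes only 0.06, 0.04, and the derived 0.1 and 0.6), and tie-breaking may yield code words that differ from Fig. 4.2 while having the same average length.

```python
import heapq
import math
from itertools import count


def huffman_code(probabilities):
    """Repeatedly merge the two least probable entries, prefixing 0 and 1."""
    tiebreak = count()  # keeps heap comparisons away from the dict payload
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)
        p1, _, code1 = heapq.heappop(heap)
        merged = {sym: "0" + word for sym, word in code0.items()}
        merged.update({sym: "1" + word for sym, word in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Assumed source probabilities (see the lead-in above).
probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probs)

l_avg = sum(p * len(code[sym]) for sym, p in probs.items())
entropy = -sum(p * math.log2(p) for p in probs.values())
print(code)
print(round(l_avg, 3), round(entropy, 2), round(entropy / l_avg, 3))
```

Under these assumed probabilities the average length, entropy, and efficiency agree with the 2.14 bits/symbol and 0.973 figures quoted above.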


After the code has been created, coding and/or decoding is accomplished in a simple
lookup-table manner. The code itself is an instantaneous uniquely decodable block code. It is
called a block code because each source symbol is mapped into a fixed sequence of code
symbols. It is instantaneous, because each code word in a string of code symbols can be
decoded without referencing succeeding symbols.
It is uniquely decodable, because any string of code symbols can be decoded in only
one way. Thus, any string of Huffman encoded symbols can be decoded by examining the
individual symbols of the string in a left-to-right manner. For the binary code of Fig. 4.2, a left-
to-right scan of the encoded string 010100111100 reveals that the first valid code word is
01010, which is the code for symbol a3. The next valid code is 011, which corresponds to
symbol a1. Continuing in this manner reveals the completely decoded message to be
a3a1a2a2a6.
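
The left-to-right scan can be sketched as a table lookup. The code words for a3, a1, a2, and a6 follow from the decoding described above; the words for a4 and a5 are not given in the text and are assumed here so that the table forms a complete prefix code.

```python
# Code table: a3, a1, a2, a6 follow from the text; a4 and a5 are assumed.
CODE = {
    "a2": "1",
    "a6": "00",
    "a1": "011",
    "a4": "0100",    # assumption
    "a3": "01010",
    "a5": "01011",   # assumption
}


def huffman_decode(bits, code):
    """Scan left to right, emitting a symbol as soon as a code word is complete."""
    lookup = {word: sym for sym, word in code.items()}
    symbols, word = [], ""
    for bit in bits:
        word += bit
        if word in lookup:        # instantaneous: no look-ahead is needed
            symbols.append(lookup[word])
            word = ""
    if word:
        raise ValueError("trailing bits do not form a code word")
    return symbols


decoded = huffman_decode("010100111100", CODE)
print(decoded)
```

Because no code word is a prefix of another, the first match found during the scan is always the correct one.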

Arithmetic coding:

Unlike the variable-length codes described previously, arithmetic coding generates nonblock
codes. In arithmetic coding, which can be traced to the work of Elias, a one-to-one
correspondence between source symbols and code words does not exist. Instead, an entire
sequence of source symbols (or message) is assigned a single arithmetic code word. The code
word itself defines an interval of real numbers between 0 and 1.
As the number of symbols in the message increases, the interval used to represent it
becomes smaller and the number of information units (say, bits) required to represent the
interval becomes larger. Each symbol of the message reduces the size of the interval in
accordance with its probability of occurrence.
Because the technique does not require, as does Huffman's approach, that each source
symbol translate into an integral number of code symbols (that is, that the symbols be coded
one at a time), it achieves (but only in theory) the bound established by the noiseless coding
theorem.


Fig.5.1 Arithmetic coding procedure

Fig. 5.1 illustrates the basic arithmetic coding process. Here, a five-symbol sequence or
message, a1a2a3a3a4, from a four-symbol source is coded. At the start of the coding process,
the message is assumed to occupy the entire half-open interval [0, 1). As Table 1 shows,
this interval is initially subdivided into four regions based on the probabilities of each source
symbol. Symbol a1, for example, is associated with subinterval [0, 0.2). Because it is the first
symbol of the message being coded, the message interval is initially narrowed to [0, 0.2).
Thus in Fig. 5.1, [0, 0.2) is expanded to the full height of the figure and its end points labeled by
the values of the narrowed range. The narrowed range is then subdivided in accordance with
the original source symbol probabilities and the process continues with the next message
symbol.

Table 1 Arithmetic coding example


In this manner, symbol a2 narrows the subinterval to [0.04, 0.08), a3 further narrows it
to [0.056, 0.072), and so on. The final message symbol, which must be reserved as a special
end-of-message indicator, narrows the range to [0.06752, 0.0688). Of course, any number
within this subinterval—for example, 0.068—can be used to represent the message.
In the arithmetically coded message of Fig. 5.1, three decimal digits are used to
represent the five-symbol message. This translates into 3/5 or 0.6 decimal digits per source
symbol and compares favorably with the entropy of the source, which is 0.58 decimal digits
or 10-ary units/symbol. As the length of the sequence being coded increases, the resulting
arithmetic code approaches the bound established by the noiseless coding theorem.
In practice, two factors cause coding performance to fall short of the bound:
(1) the addition of the end-of-message indicator that is needed to separate one
message from another; and
(2) the use of finite precision arithmetic. Practical implementations of arithmetic
coding address the latter problem by introducing a scaling strategy and a rounding strategy
(Langdon and Rissanen [1981]).
The scaling strategy renormalizes each subinterval to the [0, 1) range before
subdividing it in accordance with the symbol probabilities. The rounding strategy guarantees
that the truncations associated with finite precision arithmetic do not prevent the coding
subintervals from being represented accurately.
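
The interval-narrowing procedure can be reproduced exactly with rational arithmetic. In this sketch the symbol probabilities (0.2, 0.2, 0.4, 0.2) are inferred from the subintervals quoted in the text, and the decoder is told the message length instead of relying on an end-of-message symbol; both choices are assumptions for illustration. It uses unlimited-precision fractions, so it sidesteps the finite-precision issues that the scaling and rounding strategies address.

```python
from fractions import Fraction


def build_intervals(probabilities):
    """Assign each symbol a half-open subinterval of [0, 1) in listed order."""
    intervals, low = {}, Fraction(0)
    for sym, p in probabilities.items():
        intervals[sym] = (low, low + p)
        low += p
    return intervals


def arithmetic_encode(message, probabilities):
    """Narrow [0, 1) once per symbol and return the final interval."""
    intervals = build_intervals(probabilities)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        s_low, s_high = intervals[sym]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low, low + width


def arithmetic_decode(value, length, probabilities):
    """Recover `length` symbols from any number inside the final interval."""
    intervals = build_intervals(probabilities)
    message = []
    for _ in range(length):
        for sym, (s_low, s_high) in intervals.items():
            if s_low <= value < s_high:
                message.append(sym)
                value = (value - s_low) / (s_high - s_low)  # rescale to [0, 1)
                break
    return message


PROBS = {"a1": Fraction(1, 5), "a2": Fraction(1, 5),
         "a3": Fraction(2, 5), "a4": Fraction(1, 5)}
MESSAGE = ["a1", "a2", "a3", "a3", "a4"]

low, high = arithmetic_encode(MESSAGE, PROBS)
print(float(low), float(high))
print(arithmetic_decode(Fraction("0.068"), 5, PROBS))
```

The final interval matches the [0.06752, 0.0688) range given above, and the value 0.068 decodes back to the original message.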

LZW Coding:

The technique, called Lempel-Ziv-Welch (LZW) coding, assigns fixed-length code words to
variable length sequences of source symbols but requires no a priori knowledge of the
probability of occurrence of the symbols to be encoded. LZW compression has been
integrated into a variety of mainstream imaging file formats, including the graphic
interchange format (GIF), tagged image file format (TIFF), and the portable document format
(PDF). LZW coding is conceptually very simple (Welch [1984]). At the onset of the
coding process, a codebook or "dictionary" containing the source symbols to be coded is
constructed. For 8-bit monochrome images, the first 256 words of the dictionary are assigned
to the gray values 0, 1, 2, ..., and 255. As the encoder sequentially examines the image's pixels,
gray-level sequences that are not in the dictionary are placed in algorithmically determined
(e.g., the next unused) locations. If the first two pixels of the image are white, for instance,
sequence "255-255" might be assigned to location 256, the address following the locations
reserved for gray levels 0 through 255. The next time that two consecutive white pixels are
encountered, code word 256, the address of the location containing sequence 255-255, is used
to represent them. If a 9-bit, 512-word dictionary is employed in the coding process, the
original (8 + 8) bits that were used to represent the two pixels are replaced by a single 9-bit
code word. Clearly, the size of the dictionary is an important system parameter. If it is too
small, the detection of matching gray-level sequences will be less likely; if it is too large, the
size of the code words will adversely affect compression performance.

Consider the following 4 x 4, 8-bit image of a vertical edge:

Table 6.1 details the steps involved in coding its 16 pixels. A 512-word dictionary with the
following starting content is assumed:

Dictionary Location    Entry
0                      0
1                      1
...                    ...
255                    255
256                    (unused)
...                    ...
511                    (unused)

Locations 256 through 511 are initially unused. The image is encoded by processing its pixels
in a left-to-right, top-to-bottom manner. Each successive gray-level value is concatenated
with a variable (column 1 of Table 6.1) called the "currently recognized sequence." As can
be seen, this variable is initially null or empty. The dictionary is searched for each
concatenated sequence and, if it is found, as was the case in the first row of the table, the
currently recognized sequence is replaced by the newly concatenated and recognized (i.e.,
located in the dictionary) sequence. This was done in column 1 of row 2.

Table 6.1 LZW coding example

No output codes are generated, nor is the dictionary altered. If the concatenated
sequence is not found, however, the address of the currently recognized sequence is output as
the next encoded value, the concatenated but unrecognized sequence is added to the
dictionary, and the currently recognized sequence is initialized to the current pixel value. This
occurred in row 2 of the table.
The last two columns detail the gray-level sequences that are added to the dictionary
when scanning the entire 4 x 4 image. Nine additional code words are defined. At the
conclusion of coding, the dictionary contains 265 code words and the LZW algorithm has
successfully identified several repeating gray-level sequences, leveraging them to reduce the
original 128-bit image to 90 bits (i.e., ten 9-bit codes). The encoded output is obtained by
reading the third column from top to bottom. The resulting compression ratio is 128/90, or 1.42:1.
A unique feature of the LZW coding just demonstrated is that the coding dictionary or
code book is created while the data are being encoded. Remarkably, an LZW decoder builds
an identical decompression dictionary as it simultaneously decodes the encoded data stream.
Although not needed in this example, most practical applications require a strategy for
handling dictionary overflow. A simple solution is to flush or reinitialize the dictionary when
it becomes full and continue coding with a newly initialized dictionary. A more complex option
is to monitor compression performance and flush the dictionary when it becomes poor or
unacceptable. Alternatively, the least used dictionary entries can be tracked and replaced when
necessary.
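
The encoding and decoding steps can be sketched as below. The pixel values of the 4 x 4 edge image are not given in the text; the rows 39 39 126 126 used here are assumed from the standard textbook example, and they reproduce the ten output codes and nine new dictionary entries described above. The overflow handling (simply stop adding entries when the dictionary is full) is a simplification, not one of the strategies listed in the notes.

```python
def lzw_encode(pixels, max_size=512):
    """Return the output codes and the final dictionary (sequence -> address)."""
    dictionary = {(v,): v for v in range(256)}   # addresses 0..255: single gray levels
    current, codes = (), []
    for pixel in pixels:
        candidate = current + (pixel,)
        if candidate in dictionary:
            current = candidate                  # keep extending the recognized sequence
        else:
            codes.append(dictionary[current])    # emit address of the recognized part
            if len(dictionary) < max_size:
                dictionary[candidate] = len(dictionary)   # next unused location
            current = (pixel,)
    if current:
        codes.append(dictionary[current])
    return codes, dictionary


def lzw_decode(codes, max_size=512):
    """Rebuild the same dictionary on the fly and return the pixel sequence."""
    dictionary = {v: (v,) for v in range(256)}
    previous = dictionary[codes[0]]
    pixels = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                                    # code defined by the step being decoded
            entry = previous + (previous[0],)
        pixels.extend(entry)
        if len(dictionary) < max_size:
            dictionary[len(dictionary)] = previous + (entry[0],)
        previous = entry
    return pixels


# Assumed vertical-edge image, scanned left to right, top to bottom.
image = [39, 39, 126, 126] * 4
codes, dictionary = lzw_encode(image)
print(codes)
print(len(dictionary), 9 * len(codes), lzw_decode(codes) == image)
```

The decoder never receives the dictionary; it reconstructs each entry one step behind the encoder, which is why only the code stream needs to be stored.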

Bit-Plane Coding:

An effective technique for reducing an image's interpixel redundancies is to process the
image's bit planes individually. The technique, called bit-plane coding, is based on the
concept of decomposing a multilevel (monochrome or color) image into a series of binary
images and compressing each binary image via one of several well-known binary
compression methods.

Bit-plane decomposition:

The gray levels of an m-bit gray-scale image can be represented in the form of the
base 2 polynomial

$$a_{m-1}2^{m-1} + a_{m-2}2^{m-2} + \cdots + a_1 2^1 + a_0 2^0$$

Based on this property, a simple method of decomposing the image into a
collection of binary images is to separate the m coefficients of the polynomial into m 1-bit bit
planes. The zeroth-order bit plane is generated by collecting the a0 bits of each pixel, while
the (m - 1)st-order bit plane contains the am-1 bits or coefficients. In general, each bit plane
is numbered from 0 to m-1 and is constructed by setting its pixels equal to the values of the
appropriate bits or polynomial coefficients from each pixel in the original image. The
inherent disadvantage of this approach is that small changes in gray level can have a
significant impact on the complexity of the bit planes.

If a pixel of intensity 127 (01111111) is adjacent to a pixel of intensity 128 (10000000), for
instance, every bit plane will contain a corresponding 0 to 1 (or 1 to 0) transition. For
example, as the most significant bits of the two binary codes for 127 and 128 are different, bit
plane 7 will contain a zero-valued pixel next to a pixel of value 1, creating a 0 to 1 (or 1 to 0)
transition at that point.
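
A short Python sketch makes the 127/128 example concrete. It assumes an 8-bit image stored as a list of rows; the function name `bit_planes` is hypothetical.

```python
def bit_planes(image, m=8):
    """Return [plane_0, ..., plane_(m-1)]; plane k holds bit k of every pixel."""
    return [[[(pixel >> k) & 1 for pixel in row] for row in image]
            for k in range(m)]


# Two adjacent pixels with gray levels 127 and 128.
image = [[127, 128]]
planes = bit_planes(image)

# Indices of the bit planes in which the two neighbours differ.
planes_with_transition = [k for k, plane in enumerate(planes)
                          if plane[0][0] != plane[0][1]]
```

For this pair every one of the eight planes contains a transition, which is the worst case for the binary coders applied to each plane.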

An alternative decomposition approach (which reduces the effect of small gray-level
variations) is to first represent the image by an m-bit Gray code. The m-bit Gray code
$g_{m-1} \ldots g_2 g_1 g_0$ that corresponds to the polynomial above can be computed from

$$g_i = a_i \oplus a_{i+1}, \quad 0 \le i \le m-2$$
$$g_{m-1} = a_{m-1}$$

Here, $\oplus$ denotes the exclusive OR operation. This code has the unique property that successive
code words differ in only one bit position. Thus, small changes in gray level are less likely to
affect all m bit planes. For instance, when gray levels 127 and 128 are adjacent, only bit
plane 7 will contain a 0 to 1 transition, because the Gray codes that correspond to 127 and
128 are 01000000 and 11000000, respectively.
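
The Gray code mapping can be checked with a few lines of Python. Shifting the binary value right by one place lines up each bit with its more significant neighbour, so a single XOR evaluates the relation for every bit at once. The function names are hypothetical.

```python
def to_gray(value):
    """Binary-to-Gray conversion: each bit is XORed with the next higher bit."""
    return value ^ (value >> 1)


def from_gray(gray):
    """Gray-to-binary conversion by accumulating XORs from the top bit down."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


gray_127 = format(to_gray(127), "08b")
gray_128 = format(to_gray(128), "08b")
differing_bits = bin(to_gray(127) ^ to_gray(128)).count("1")
```

Only the most significant bit differs between the two code words, so only bit plane 7 sees a transition between these neighbouring pixels.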

Lossless Predictive Coding:

This error-free compression approach does not require decomposition of an image into
a collection of bit planes. The approach, commonly referred to as lossless predictive coding,
is based on eliminating the interpixel redundancies of closely spaced pixels by extracting and
coding only the new information in each pixel. The new information of a pixel is defined as
the difference between the actual and predicted value of that pixel.
Figure shows the basic components of a lossless predictive coding system. The
system consists of an encoder and a decoder, each containing an identical predictor. As each
successive pixel of the input image, denoted $f_n$, is introduced to the encoder, the predictor
generates the anticipated value of that pixel based on some number of past inputs. The output
of the predictor is then rounded to the nearest integer, denoted $\hat{f}_n$, and used to form the
difference or prediction error

$$e_n = f_n - \hat{f}_n$$

which is coded using a variable-length code (by the symbol
encoder) to generate the next element of the compressed data stream.

Fig. A lossless predictive coding model: (a) encoder; (b) decoder

The decoder of Fig. 8.1 (b) reconstructs $e_n$ from the received variable-length code words and
performs the inverse operation

$$f_n = e_n + \hat{f}_n$$

Various local, global, and adaptive methods can be used to generate $\hat{f}_n$. In most cases,
however, the prediction is formed by a linear combination of m previous pixels. That is,

$$\hat{f}_n = \text{round}\left[\sum_{i=1}^{m} \alpha_i f_{n-i}\right]$$

where m is the order of the linear predictor, round is a function used to denote the
rounding or nearest integer operation, and the $\alpha_i$, for i = 1, 2, ..., m, are prediction coefficients.
In raster scan applications, the subscript n indexes the predictor outputs in accordance with
their time of occurrence. That is, $f_n$, $\hat{f}_n$ and $e_n$ in the equations above could be replaced with the
more explicit notation f(t), $\hat{f}(t)$, and e(t), where t represents time. In other cases, n is used as
an index on the spatial coordinates and/or frame number (in a time sequence of images) of an
image. In 1-D linear predictive coding, for example, the prediction equation above can be written as

$$\hat{f}(x, y) = \text{round}\left[\sum_{i=1}^{m} \alpha_i f(x, y-i)\right]$$

where each subscripted variable is now expressed explicitly as a function of spatial
coordinates x and y. The equation indicates that the 1-D linear prediction $\hat{f}(x, y)$ is a function of
the previous pixels on the current line alone. In 2-D predictive coding, the prediction is a
function of the previous pixels in a left-to-right, top-to-bottom scan of an image. In the 3-D
case, it is based on these pixels and the previous pixels of preceding frames. Equation above
cannot be evaluated for the first m pixels of each line, so these pixels must be coded by using
other means (such as a Huffman code) and considered as an overhead of the predictive coding
process. A similar comment applies to the higher-dimensional cases.
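
As a rough illustration of the 1-D case, the sketch below encodes one image line with a linear predictor and decodes it again. It assumes the first m pixels are passed through uncoded as overhead and it stops before the variable-length coding stage; the function names and the sample line are hypothetical.

```python
import math


def predict(previous, alphas):
    """Rounded linear combination of the m previous pixels (nearest first)."""
    return int(math.floor(sum(a * p for a, p in zip(alphas, previous)) + 0.5))


def encode_line(line, alphas=(1.0,)):
    """Return (overhead pixels, prediction errors) for one image line."""
    m = len(alphas)
    header = list(line[:m])            # first m pixels cannot be predicted
    errors = []
    for n in range(m, len(line)):
        previous = [line[n - i] for i in range(1, m + 1)]
        errors.append(line[n] - predict(previous, alphas))
    return header, errors


def decode_line(header, errors, alphas=(1.0,)):
    """Invert encode_line using an identical predictor."""
    m = len(alphas)
    line = list(header)
    for e in errors:
        previous = [line[-i] for i in range(1, m + 1)]
        line.append(e + predict(previous, alphas))
    return line


line = [100, 102, 103, 103, 101, 98]   # assumed gray levels along one row
header, errors = encode_line(line)     # previous-pixel predictor (alpha = 1)
```

The prediction errors cluster around zero, which is what lets the later variable-length code spend fewer bits per pixel than the raw gray levels would need.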

Lossy Predictive Coding:

In this type of coding, we add a quantizer to the lossless predictive model and examine the
resulting trade-off between reconstruction accuracy and compression performance. As Fig.9
shows, the quantizer, which absorbs the nearest integer function of the error-free encoder, is
inserted between the symbol encoder and the point at which the prediction error is formed. It
maps the prediction error into a limited range of outputs, denoted $\dot{e}_n$, which establish the
amount of compression and distortion associated with lossy predictive coding.

Fig. A lossy predictive coding model: (a) encoder and (b) decoder.

In order to accommodate the insertion of the quantization step, the error-free encoder of the
earlier figure must be altered so that the predictions generated by the encoder and decoder are
equivalent. As Fig. (a) shows, this is accomplished by placing the lossy encoder's predictor
within a feedback loop, where its input, denoted $\dot{f}_n$, is generated as a function of past
predictions and the corresponding quantized errors. That is,

$$\dot{f}_n = \dot{e}_n + \hat{f}_n$$

This closed loop configuration prevents error buildup at the decoder's output. Note from Fig.
(b) that the output of the decoder also is given by the above equation.
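
The feedback loop can be sketched as follows. This is an illustration under stated assumptions rather than the scheme of the figure: the predictor is simply the previous reconstructed sample, the quantizer is a uniform one with an assumed step size, the first sample is sent unquantized, and the sample values are made up.

```python
def quantize(error, step):
    """Uniform quantizer: map the prediction error to the nearest multiple of step."""
    return step * round(error / step)


def dpcm_encode(samples, step=4):
    """Closed-loop encoder: predict from reconstructed values, not from the input."""
    reconstructed = samples[0]          # assumption: first sample sent as-is
    quantized_errors = []
    for f in samples[1:]:
        prediction = reconstructed      # previous reconstructed sample
        q = quantize(f - prediction, step)
        quantized_errors.append(q)
        reconstructed = q + prediction  # same value the decoder will compute
    return samples[0], quantized_errors


def dpcm_decode(first, quantized_errors):
    """Decoder: add each quantized error to the running prediction."""
    out = [first]
    for q in quantized_errors:
        out.append(q + out[-1])
    return out


samples = [14, 15, 14, 15, 13, 15, 15, 14, 20, 26, 27, 28, 27, 27, 29, 37]
first, q_errors = dpcm_encode(samples, step=4)
decoded = dpcm_decode(first, q_errors)
worst_error = max(abs(a - b) for a, b in zip(samples, decoded))
```

Because the encoder predicts from the same reconstructed values the decoder has, the reconstruction error at every sample stays within half a quantizer step instead of accumulating along the line.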

Optimal predictors:

The optimal predictor used in most predictive coding applications minimizes the encoder's
mean-square prediction error

$$E\{e_n^2\} = E\{[f_n - \hat{f}_n]^2\}$$

subject to the constraint that

$$\dot{f}_n = \dot{e}_n + \hat{f}_n \approx e_n + \hat{f}_n = f_n$$

and

$$\hat{f}_n = \sum_{i=1}^{m} \alpha_i f_{n-i}$$

That is, the optimization criterion is chosen to minimize the mean-square prediction error, the
quantization error is assumed to be negligible ($\dot{e}_n \approx e_n$), and the prediction is constrained to a
linear combination of m previous pixels. These restrictions are not essential, but they
simplify the analysis considerably and, at the same time, decrease the computational
complexity of the predictor. The resulting predictive coding approach is referred to as
differential pulse code modulation (DPCM).

Transform Coding:

All the predictive coding techniques operate directly on the pixels of an image and thus are
spatial domain methods. In this section, we consider compression techniques that are based
on modifying the transform of an image. In transform coding, a reversible, linear transform
(such as the Fourier transform) is used to map the image into a set of transform coefficients,
which are then quantized and coded. For most natural images, a significant number of the
coefficients have small magnitudes and can be coarsely quantized (or discarded entirely) with
little image distortion. A variety of transformations, including the discrete Fourier transform
(DFT), can be used to transform the image data.

Fig. A transform coding system: (a) encoder; (b) decoder.

Figure shows a typical transform coding system. The decoder implements the inverse
sequence of steps (with the exception of the quantization function) of the encoder, which
performs four relatively straightforward operations: subimage decomposition, transformation,
quantization, and coding. An N × N input image first is subdivided into subimages of size
n × n, which are then transformed to generate $(N/n)^2$ subimage transform arrays, each of size
n × n. The goal of the transformation process is to decorrelate the pixels of each subimage, or
to pack as much information as possible into the smallest number of transform coefficients.
The quantization stage then selectively eliminates or more coarsely quantizes the
coefficients that carry the least information. These coefficients have the smallest impact on
reconstructed subimage quality. The encoding process terminates by coding (normally using
a variable-length code) the quantized coefficients. Any or all of the transform encoding steps
can be adapted to local image content, called adaptive transform coding, or fixed for all
subimages, called nonadaptive transform coding.
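
The transformation and quantization stages can be sketched for a single n × n subimage. The example below is a simplified illustration: it uses the discrete cosine transform rather than the Fourier transform named above (an assumption made because the DCT is real-valued), and its "quantizer" just keeps the k largest-magnitude coefficients. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C times its transpose is the identity."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def forward(block):
    """2-D transform of one subimage: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def inverse(coeffs):
    """Inverse 2-D transform: C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def keep_largest(coeffs, k):
    """Crude quantizer: zero everything below the k-th largest magnitude
    (ties at the threshold are all kept)."""
    magnitudes = sorted((abs(v) for row in coeffs for v in row), reverse=True)
    threshold = magnitudes[k - 1]
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in coeffs]


flat_block = [[100] * 4 for _ in range(4)]   # perfectly correlated pixels
flat_coeffs = forward(flat_block)
```

For a constant subimage all of the energy lands in the single DC coefficient, which is the extreme case of the information packing the text describes; a real coder would follow this stage with variable-length coding of the retained coefficients.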

Wavelet Coding:

The wavelet coding is based on the idea that the coefficients of a transform that decorrelates
the pixels of an image can be coded more efficiently than the original pixels themselves. If
the transform's basis functions—in this case wavelets—pack most of the important visual
information into a small number of coefficients, the remaining coefficients can be quantized
coarsely or truncated to zero with little image distortion.

Figure shows a typical wavelet coding system. To encode a $2^J \times 2^J$ image, an analyzing
wavelet, Ψ, and minimum decomposition level, J - P, are selected and used to compute the
image's discrete wavelet transform. If the wavelet has a complementary scaling function φ,
the fast wavelet transform can be used. In either case, the computed transform converts a
large portion of the original image to horizontal, vertical, and diagonal decomposition
coefficients with zero mean and Laplacian-like distributions.

Fig. A wavelet coding system: (a) encoder; (b) decoder.

Since many of the computed coefficients carry little visual information, they can be
quantized and coded to minimize intercoefficient and coding redundancy. Moreover, the
quantization can be adapted to exploit any positional correlation across the P decomposition
levels. One or more of the lossless coding methods, including run-length, Huffman,
arithmetic, and bit-plane coding, can be incorporated into the final symbol coding step.

Decoding is accomplished by inverting the encoding operations, with the exception of
quantization, which cannot be reversed exactly. The principal difference between the
wavelet-based system and the transform coding system is the omission of the transform
coder's subimage processing stages. Because wavelet transforms are both computationally
efficient and inherently local (i.e., their basis functions are limited in duration), subdivision of
the original image is unnecessary.
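
One decomposition level can be illustrated with the Haar wavelet, chosen here only because it is the simplest analyzing wavelet (an assumption; the text does not name one). The sketch works on the whole image without subdividing it and returns the approximation plus horizontal, vertical, and diagonal detail coefficients. The function names and band naming are hypothetical conventions.

```python
def haar_analysis(image):
    """One level of an orthonormal 2-D Haar transform on a square image
    whose side length is even. Returns (approx, horiz, vert, diag)."""
    half = len(image) // 2
    approx = [[0.0] * half for _ in range(half)]
    horiz = [[0.0] * half for _ in range(half)]
    vert = [[0.0] * half for _ in range(half)]
    diag = [[0.0] * half for _ in range(half)]
    for r in range(half):
        for c in range(half):
            a, b = image[2 * r][2 * c], image[2 * r][2 * c + 1]
            d, e = image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1]
            approx[r][c] = (a + b + d + e) / 2
            horiz[r][c] = (a + b - d - e) / 2   # difference between the two rows
            vert[r][c] = (a - b + d - e) / 2    # difference between the two columns
            diag[r][c] = (a - b - d + e) / 2
    return approx, horiz, vert, diag


def haar_synthesis(approx, horiz, vert, diag):
    """Exact inverse of haar_analysis."""
    half = len(approx)
    image = [[0.0] * (2 * half) for _ in range(2 * half)]
    for r in range(half):
        for c in range(half):
            ll, lh, hl, hh = approx[r][c], horiz[r][c], vert[r][c], diag[r][c]
            image[2 * r][2 * c] = (ll + lh + hl + hh) / 2
            image[2 * r][2 * c + 1] = (ll + lh - hl - hh) / 2
            image[2 * r + 1][2 * c] = (ll - lh + hl - hh) / 2
            image[2 * r + 1][2 * c + 1] = (ll - lh - hl + hh) / 2
    return image


smooth = [[10, 10, 12, 12],
          [10, 10, 12, 12],
          [20, 20, 22, 22],
          [20, 20, 22, 22]]
approx, horiz, vert, diag = haar_analysis(smooth)
```

For this locally flat image every detail coefficient is zero, so only the quarter-size approximation carries information; in a coder the small detail coefficients of natural images would be coarsely quantized or truncated before symbol coding.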

1. What is image compression?


Image compression refers to the process of reducing the amount of data required to represent a
given quantity of information in a digital image. The basis of the reduction process is the removal of
redundant data.

2. What is Data Compression?


Data compression requires the identification and extraction of source redundancy. In other
words, data compression seeks to reduce the number of bits used to store or transmit information.

3. What are two main types of Data compression?


1. Lossless compression can recover the exact original data after compression. It is used mainly
for compressing database records, spreadsheets or word processing files, where exact
replication of the original is essential.
2. Lossy compression will result in a certain loss of accuracy in exchange for a substantial
increase in compression. Lossy compression is more effective when used to compress graphic
images and digitised voice where losses outside visual or aural perception can be tolerated.

4. What is the need for Compression?


1. In terms of storage, the capacity of a storage device can be effectively increased with methods
that compress a body of data on its way to a storage device and decompress it when it is
retrieved.
2. In terms of communications, the bandwidth of a digital communication link can be
effectively increased by compressing data at the sending end and decompressing data at the
receiving end.
3. At any given time, the ability of the Internet to transfer data is fixed. Thus, if data can
effectively be compressed wherever possible, significant improvements of data
throughput can be achieved. Many files can be combined into one compressed document,
making sending easier.

5. What are different Compression Methods?


Run-length encoding (RLE), arithmetic coding, Huffman coding, and transform coding.

6. Define coding redundancy

If the gray levels of an image are coded in a way that uses more code symbols than necessary to
represent each gray level, then the resulting image is said to contain coding redundancy.

7. Define interpixel redundancy


The value of any given pixel can be predicted from the values of its neighbors, so the information
carried by an individual pixel is small and much of its visual contribution to the image is redundant.
This is also called spatial redundancy or geometric redundancy. Run-length coding is one method
that exploits it.

8. What is run length coding?


Run-length encoding (RLE) is a technique used to reduce the size of a repeating string of
characters. This repeating string is called a run; typically RLE encodes a run of symbols into two
bytes, a count and a symbol. RLE can be applied to any type of data regardless of its information
content, but the content of the data to be compressed affects the compression ratio. Compression is
normally measured with the compression ratio.

9. Define compression ratio.


Compression ratio = (original size / compressed size) : 1
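
The two definitions above can be tried together in a small Python sketch. It assumes each run is stored as two bytes (a count capped at 255, then the symbol); the function names and the sample strings are hypothetical.

```python
def rle_encode(data):
    """Encode a byte string as a list of (count, symbol) runs, count <= 255."""
    runs = []
    for symbol in data:
        if runs and runs[-1][1] == symbol and runs[-1][0] < 255:
            runs[-1][0] += 1                 # extend the current run
        else:
            runs.append([1, symbol])         # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (count, symbol) runs back into the original byte string."""
    return bytes(symbol for count, symbol in runs for _ in range(count))


def compression_ratio(original_size, compressed_size):
    return original_size / compressed_size


repetitive = b"AAAAAABBBCCCCCCCCD"           # 18 bytes in 4 runs
runs = rle_encode(repetitive)
ratio_repetitive = compression_ratio(len(repetitive), 2 * len(runs))

varied = b"ABCD"                              # no repetition at all
ratio_varied = compression_ratio(len(varied), 2 * len(rle_encode(varied)))
```

The repetitive string compresses at 2.25 : 1, while the string with no runs doubles in size (0.5 : 1), which shows how strongly the content of the data affects the ratio.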

10. Define psycho visual redundancy


In normal visual processing, certain information has less relative importance than other
information. Such information is said to be psychovisually redundant.

11. Define encoder


An encoder has two components: (a) a source encoder and (b) a channel encoder. The source
encoder is responsible for reducing the coding, interpixel, and psychovisual redundancies.

12. Define source encoder


The source encoder performs three operations:
1) Mapper - This transforms the input data into a (usually nonvisual) format. It reduces the
interpixel redundancy.
2) Quantizer - This reduces the psychovisual redundancy of the input image. This step is omitted
if the system is error free.
3) Symbol encoder - This reduces the coding redundancy. This is the final stage of the encoding
process.

13. Define channel encoder


The channel encoder reduces the impact of channel noise by inserting redundant bits
into the source encoded data. Example: Hamming code.
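
As an illustration of inserting redundant bits, the sketch below implements a Hamming (7,4) code, which adds three parity bits to each group of four data bits and can correct any single-bit error. The choice of the (7,4) variant and the function names are assumptions made for the example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit word laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4      # even parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # even parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # even parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word):
    """Correct at most one flipped bit and return the 4 data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    error_position = s1 + 2 * s2 + 4 * s4   # 0 means no error detected
    if error_position:
        w[error_position - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1                             # simulate one bit of channel noise
recovered = hamming74_decode(received)
```

The three parity checks form a binary number that points directly at the corrupted position, so the decoder can repair the word without retransmission.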

14. What are the types of decoder?


Source decoder - has two components:
a) Symbol decoder - This performs the inverse operation of the symbol encoder.
b) Inverse mapper - This performs the inverse operation of the mapper.
Channel decoder - This is omitted if the channel is error free.

15. What are the operations performed by error free compression?


1) Devising an alternative representation of the image in which its interpixel redundancies are
reduced.
2) Coding the representation to eliminate coding redundancy.

16. What is Variable Length Coding?


Variable Length Coding is the simplest approach to error free compression. It reduces only the
coding redundancy. It assigns the shortest possible codeword to the most probable gray levels.

17. Define Huffman coding and mention its limitation


1. Huffman coding is a popular technique for removing coding redundancy.
2. When coding the symbols of an information source individually, the Huffman code yields the
smallest possible number of code symbols per source symbol.
Limitation: Huffman coding assigns a whole number of code symbols to each source symbol, so for
equiprobable symbols it offers little or no gain over a fixed-length code.
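
A compact way to see the Huffman construction is to merge the two least probable entries repeatedly with a priority queue. The sketch below does this with Python's `heapq`; the six symbol probabilities are assumed values chosen for illustration, and the function name is hypothetical. A source with a single symbol would receive an empty code word in this sketch.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Build a binary Huffman code; returns {symbol: bit string}."""
    tie_breaker = count()   # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tie_breaker), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # two least probable groups
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie_breaker), merged))
    return heap[0][2]


# Assumed source probabilities (they sum to 1).
probabilities = {"a1": 0.1, "a2": 0.4, "a3": 0.06,
                 "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probabilities)
average_length = sum(probabilities[s] * len(code[s]) for s in probabilities)
```

For these probabilities the average code length works out to 2.2 bits per symbol, compared with 3 bits for a fixed-length code over six symbols; the exact bit patterns depend on how ties are broken, but the average does not.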

18. Define Block code


Each source symbol is mapped into fixed sequence of code symbols or code words. So it is
called as block code.

19. Define instantaneous code

A code in which no code word is a prefix of any other code word is called an instantaneous or
prefix code.

20. Define uniquely decodable code


A code in which no code word is a combination of other code words, so that any string of code
symbols can be decoded in only one way, is said to be a uniquely decodable code.

21. Define B2 code


Each code word is made up of continuation bits c and information bits, which are binary numbers.
This is called a B code. It is called a B2 code when two information bits are used per
continuation bit.

22. Define the procedure for Huffman shift coding


List all the source symbols along with their probabilities in descending order. Divide the total
number of symbols into blocks of equal size. Sum the probabilities of all the source symbols
outside the reference block. Now apply the Huffman procedure to the reference block, including the
prefix source symbol. The code words for the remaining symbols can be constructed by means of one
or more prefix codes followed by the reference block code, as in the case of the binary shift code.

23. Define arithmetic coding


In arithmetic coding, a one-to-one correspondence between source symbols and code words does not
exist. Instead, a single arithmetic code word is assigned to an entire sequence of source symbols.
The code word defines an interval of real numbers between 0 and 1.
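
The interval narrowing can be followed step by step in code. The sketch below uses exact fractions to avoid rounding; the four-symbol source, its probabilities, and the five-symbol message are assumed values for illustration, and the function name is hypothetical.

```python
from fractions import Fraction


def arithmetic_interval(message, probabilities):
    """Return the [low, high) interval that represents the whole message."""
    # Give each symbol a fixed sub-interval of [0, 1) sized by its probability.
    bounds = {}
    cumulative = Fraction(0)
    for symbol, p in probabilities.items():
        bounds[symbol] = (cumulative, cumulative + p)
        cumulative += p

    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = bounds[symbol]
        # Shrink the current interval to the symbol's share of it.
        low, high = low + width * s_low, low + width * s_high
    return low, high


# Assumed source model and message.
probabilities = {"a1": Fraction(2, 10), "a2": Fraction(2, 10),
                 "a3": Fraction(4, 10), "a4": Fraction(2, 10)}
message = ["a1", "a2", "a3", "a3", "a4"]
low, high = arithmetic_interval(message, probabilities)
```

Any number inside the final interval, here from 0.06752 up to but not including 0.0688, identifies the whole five-symbol message, so no individual symbol has a code word of its own.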

24. What is bit plane Decomposition?


An effective technique for reducing an image’s interpixel redundancies is to process the image’s
bit plane individually. This technique is based on the concept of decomposing multilevel images
into a series of binary images and compressing each binary image via one of several well-known
binary compression methods.

25. Draw the block diagram of transform coding system

26. How effectiveness of quantization can be improved?


1. Introducing an enlarged quantization interval around zero, called a dead zone.
2. Adapting the size of the quantization intervals from scale to scale.
In either case, the selected quantization intervals must be transmitted to the decoder with the
encoded image bit stream.

27. What are the coding systems in JPEG


1. A lossy baseline coding system, which is based on the DCT and is adequate for most
compression applications.
2. An extended coding system for greater compression, higher precision or progressive
reconstruction applications.
3. A lossless independent coding system for reversible compression.

28. What is JPEG?


The acronym is expanded as "Joint Photographic Experts Group". It became an international
standard in 1992. It works with color and grayscale images and is used in many applications,
e.g., satellite and medical imaging.

29. What are the basic steps in JPEG?


The Major Steps in JPEG Coding involve:
1. DCT (Discrete Cosine Transformation) 2. Quantization 3. Zigzag Scan
4. DPCM on DC component 5. RLE on AC Components 6. Entropy Coding
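
Step 3 and the run-length idea of step 5 can be sketched for one block of quantized coefficients. This is a simplified illustration, not the JPEG bit stream format: the function names are hypothetical, the block is an assumed 4 x 4 example rather than 8 x 8, and trailing zeros are simply dropped where a real coder would emit an end-of-block symbol.

```python
def zigzag_order(n=8):
    """Visit an n x n block along anti-diagonals, alternating direction."""
    def key(position):
        i, j = position
        s = i + j
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        return (s, i if s % 2 else -i)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)


def scan_block(block):
    """Return the DC value and (zeros_before, value) pairs for the AC terms."""
    sequence = [block[i][j] for i, j in zigzag_order(len(block))]
    dc, ac = sequence[0], sequence[1:]
    pairs, zeros = [], 0
    for value in ac:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    return dc, pairs        # trailing zeros are implied by an end-of-block marker


quantized = [[12, 3, 5, 0],
             [-2, 0, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 0]]
dc, ac_pairs = scan_block(quantized)
```

The zigzag scan groups the low-frequency coefficients first, so the many high-frequency zeros left by quantization end up in one long run at the end of the sequence.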

30. What is MPEG?


The acronym is expanded as "Moving Picture Experts Group". It became an international standard
in 1992. It works with video and is also used in teleconferencing.

31. Define I-frame


An I-frame is an intraframe or independent frame. An I-frame is compressed independently of all
other frames. It resembles a JPEG encoded image. It is the reference point for the motion estimation
needed to generate subsequent P- and B-frames.

32. Define P-frame


P-frame is called predictive frame. A P-frame is the compressed difference between the current
frame and a prediction of it based on the previous I or P-frame

33. Define B-frame


A B-frame is a bidirectional frame. A B-frame is the compressed difference between the current
frame and a prediction of it based on the previous I- or P-frame and the next P-frame. Accordingly, the
decoder must have access to both past and future reference frames.

34. What is data redundancy? Explain three basic data redundancy?


Irrelevant and repeated information is said to be redundant. Relative redundancy is given by
R = 1 - 1/C, where C is the compression ratio. The three basic data redundancies are coding
redundancy, spatial or interpixel redundancy, and psychovisual redundancy.

35. What is image compression? Explain variable length coding compression schemes.
Reducing the amount of irrelevant and repeated information is called image compression.
Explain Shannon-Fano coding, Huffman coding and Golomb coding techniques with examples.

36. Explain about Image compression model?

37. Explain about Error free Compression?


Lossless predictive coding has encoder and decoder and predictor. This predictor generates the
anticipated value of each sample based on a specified number of past samples

38. Explain about Lossy compression?


Lossy predictive coding has an encoder, decoder, quantizer and predictor. Delta modulation is
one such lossy compression method.

39. Explain the schematics of image compression standard JPEG.


JPEG defines three different coding systems
(1) a lossy baseline coding system which is based on the DCT and is adequate for most
compression applications
(2) an extended coding system for greater compression, higher precision or progressive
reconstruction applications
(3) a lossless independent coding system for reversible compression.
To be JPEG compatible, a product or system must include support for the baseline system.

40. Explain how compression is achieved in transform coding and explain about DCT.

41. Explain arithmetic coding


Arithmetic coding generates nonblock codes. In this coding, an entire sequence of source
symbols is assigned a single arithmetic code word.

42. Discuss about MPEG standard and compare with JPEG


The acronym is expanded as "Moving Picture Experts Group". It became an international standard
in 1992. It works with video and is also used in teleconferencing.

43. Draw and explain the block diagram of MPEG encoder


Macroblock formation, frame formation, frame encoding, motion estimation, audio compression and
quantization, and frame construction.
