
Distortion Measures

{x_n} – source output
{y_n} – reconstructed sequence

Squared error measure: d(x, y) = (x − y)²
Absolute difference measure: d(x, y) = |x − y|

Usually the average of the squared error measure is used.

Mean squared error (MSE): σ_d² = (1/N) Σ (x_n − y_n)²

SNR = σ_x² / σ_d²

SNR (in dB) = 10 log₁₀ (σ_x² / σ_d²)

PSNR (in dB) = 10 log₁₀ (x_peak² / σ_d²)
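As a quick illustration, the sketch below computes these measures for a pair of sample arrays. The function names and the peak value of 255 (an 8-bit source) are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def mse(x, y):
    """Mean squared error between source x and reconstruction y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean((x - y) ** 2)

def snr_db(x, y):
    """SNR in dB: 10 log10(source variance / MSE)."""
    x = np.asarray(x, dtype=float)
    return 10 * np.log10(np.var(x) / mse(x, y))

def psnr_db(x, y, peak=255.0):
    """PSNR in dB: 10 log10(peak^2 / MSE); peak=255 assumes 8-bit samples."""
    return 10 * np.log10(peak ** 2 / mse(x, y))
```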
Example
• Let X = {0, 1, 2, …, 15} (source alphabet)
• Encode by dropping the LSB: transmitted data = {0, 1, 2, 3, 4, 5, 6, 7}
• Decode by inserting a 0 as the LSB: reconstruction alphabet Y = {0, 2, 4, 6, …, 14}


• As the source and reconstruction alphabets
can be distinct, we need to be able to talk
about the information relationships between
two random variables that take on values
from two different alphabets.

Review of:
Conditional Entropy
Mutual Information
• Let X be a random variable that takes values from the source alphabet X = {x_0, x_1, …, x_{N−1}}.
• Let Y be a random variable that takes on values from the reconstruction alphabet Y = {y_0, y_1, …, y_{M−1}}.
• A measure of the relationship between two random variables is the conditional entropy (the average value of the conditional self-information):
  H(X|Y) = −Σ_i Σ_j P(x_i, y_j) log₂ P(x_i | y_j)
Previous example
Given a reconstructed value y, the source output can only be y or y + 1. If the source values are equally likely, these two possibilities are equiprobable, so H(X|Y) = 1 bit: our uncertainty about the source output is 1 bit.
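A small sketch, assuming equiprobable source symbols, that computes H(X|Y) for the drop-LSB example and confirms the 1-bit uncertainty.

```python
import numpy as np
from collections import defaultdict

# Joint distribution of (X, Y) for the drop-LSB example, X uniform on {0..15}.
p_xy = defaultdict(float)
for x in range(16):
    y = (x >> 1) << 1          # reconstructed value
    p_xy[(x, y)] += 1 / 16

# Marginal of Y, then H(X|Y) = -sum p(x, y) log2 p(x | y).
p_y = defaultdict(float)
for (x, y), p in p_xy.items():
    p_y[y] += p

h_x_given_y = -sum(p * np.log2(p / p_y[y]) for (x, y), p in p_xy.items())
print(h_x_given_y)   # 1.0 bit: given y, x is one of two equally likely values
```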
Continuous Amplitude Sources
• Differential Entropy
• The differential entropy of a continuous random variable X with probability density function (pdf) f_X(x) is
  h(X) = −∫ f_X(x) log₂ f_X(x) dx
• While the random variable X cannot generally take on a particular value with nonzero probability, it can take on a value in an interval with nonzero probability. Therefore, let us divide the range of the random variable into intervals of size Δ. Then, by the mean value theorem, in each interval [(i − 1)Δ, iΔ) there exists a number x_i such that
  f_X(x_i) Δ = ∫ from (i − 1)Δ to iΔ of f_X(x) dx
• The discretized random variable X_d takes the value x_i whenever X falls in the i-th interval.
• Let us define the random variable Y_d in a similar manner, as the discretized version of a continuous-valued random variable Y.

Then I(X;Y) = h(X) − h(X|Y)
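To make the definition concrete, here is a rough sketch that approximates h(X) numerically by summing over intervals of width Δ. The Gaussian test density and the closed form ½ log₂(2πeσ²) used as a check are standard facts, not taken from these slides.

```python
import numpy as np

def differential_entropy_bits(pdf, lo, hi, delta=1e-3):
    """Approximate h(X) = -integral of f(x) log2 f(x) dx with a Riemann sum."""
    x = np.arange(lo, hi, delta)
    f = pdf(x)
    f = f[f > 0]                      # skip points where the density vanishes
    return -np.sum(f * np.log2(f)) * delta

sigma = 2.0
gauss = lambda x: np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

print(differential_entropy_bits(gauss, -40, 40))    # ~3.047 bits
print(0.5 * np.log2(2 * np.pi * np.e * sigma**2))   # known closed form, ~3.047 bits
```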


Rate Distortion Theory
• Rate distortion theory is concerned with the trade-offs
between distortion and rate in lossy compression
schemes. Rate is defined as the average number of bits
used to represent each sample value. One way of
representing the trade-offs is via a rate distortion function
R(D).
• The rate distortion function R(D) specifies the lowest
rate at which the output of a source can be encoded
while keeping the distortion less than or equal to D.
• In the previous example, knowledge of the value of the
input at time k completely specifies the reconstructed
value at time k.
Distortion
Previous Example
Rate Distortion function for the binary source (simple explanation)

• We have to find the minimum bit rate corresponding to different distortions, i.e., the lower bound on the average mutual information.

Average Mutual Information

For the binary source with P(X = 1) = p and Hamming (probability-of-error) distortion D:
• H(X) is H_b(p)
• H(X|Y) is H_b(D)
• R(D) is the minimum value of I(X;Y) = H(X) − H(X|Y)
• i.e., R(D) = H_b(p) − H_b(D)
• When D = p, R(D) = 0.
• For D > p, H_b(p) − H_b(D) would be negative, which is not possible since mutual information is non-negative.
• Hence
  R(D) = H_b(p) − H_b(D)  for D ≤ p
  R(D) = 0                for D > p
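A small sketch that evaluates this R(D) for a binary source; the helper names and the choice p = 0.5 are illustrative.

```python
import numpy as np

def hb(q):
    """Binary entropy function H_b(q) in bits."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

def rate_distortion_binary(d, p=0.5):
    """R(D) = Hb(p) - Hb(D) for D <= p, else 0 (assumes p <= 0.5)."""
    if d >= p:
        return 0.0
    return hb(p) - hb(d)

for d in (0.0, 0.1, 0.25, 0.5):
    print(d, rate_distortion_binary(d))
# D = 0 needs Hb(p) bits/sample; the required rate falls to 0 as D approaches p.
```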
R-D Curve for the Binary Source
Rate distortion function for the
Gaussian source
In order to minimize the right-hand side of the above equation, we have to maximize the second term subject to the constraint given by Equation (7.67). This term is maximized if X − Y is Gaussian, and the constraint can be satisfied if
E[(X − Y)²] = D.
Therefore, h(X − Y) is the differential entropy of a Gaussian random variable with variance D, and the lower bound becomes
  I(X;Y) ≥ h(X) − h(X − Y) = (1/2) log₂ (σ²/D)
If D = σ², the bound on I(X;Y) is 0.
If D > σ², the bound would be negative, which is not possible for a rate since mutual information is non-negative.

Hence
  R(D) = (1/2) log₂ (σ²/D)  for D ≤ σ²
  R(D) = 0                  for D > σ²
This rate distortion bound is achieved by the construction X = Y + Z, with Y and Z independent:

X – Gaussian with zero mean and variance σ²
Z – Gaussian with zero mean and variance D
Y – Gaussian with zero mean and variance σ² − D
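A quick Monte Carlo sketch of this construction: draw independent Y and Z with the stated variances, form X = Y + Z, and check that X has variance σ² and that the distortion E[(X − Y)²] equals D. The sample size and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, D, n = 4.0, 1.0, 1_000_000

y = rng.normal(0.0, np.sqrt(sigma2 - D), n)   # Y ~ N(0, sigma^2 - D)
z = rng.normal(0.0, np.sqrt(D), n)            # Z ~ N(0, D), independent of Y
x = y + z                                     # X ~ N(0, sigma^2)

print(np.var(x))                    # ~4.0 = sigma^2
print(np.mean((x - y) ** 2))        # ~1.0 = D, the allowed distortion
print(0.5 * np.log2(sigma2 / D))    # R(D) = 0.5 log2(sigma^2 / D) = 1 bit/sample
```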
R-D curve for Gaussian Source
• It often helps to interchange the independent and dependent variables, thus ending up with a distortion rate function defined by
  D(R) = min D   over all p(y_j | x_i) such that I(X;Y) ≤ R
• There is no objection to this definition since minimizing average distortion is desirable.
• In this definition R is the independent variable instead of D.
Distortion Rate Function (Gaussian Source)

  D(R) = 2^(−2R) σ²  for R > 0
  D(R) = σ²          for R ≤ 0
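A brief sketch tabulating D(R) = 2^(−2R) σ² and the corresponding SNR, showing the familiar gain of about 6.02 dB per additional bit. The helper name and parameter values are illustrative.

```python
import numpy as np

def distortion_rate_gaussian(r, sigma2=1.0):
    """D(R) = 2**(-2R) * sigma^2 for R > 0, sigma^2 for R <= 0."""
    return sigma2 if r <= 0 else sigma2 * 2.0 ** (-2 * r)

for r in range(0, 5):
    d = distortion_rate_gaussian(r)
    snr_db = 10 * np.log10(1.0 / d)     # sigma^2 / D with sigma^2 = 1
    print(r, d, round(snr_db, 2))
# Each extra bit of rate divides the distortion by 4, i.e., about 6.02 dB of SNR.
```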
Properties of Rate Distortion Function

• R(D) is non-negative, since mutual information I(X;Y) is non-negative.
• R(D) is well defined for all D ≥ Dmin.
• Without loss of generality, we assume Dmin = 0.
• In this case R(0) = entropy of the source. This means that all of the information must be reproduced at the destination.
• R(D) = 0 for D ≥ Dmax. Among all possible values of D which satisfy R(D) = 0, the smallest one is defined as Dmax.

• R(D) is non-increasing in D.
• R(D) is a convex ∪ function. That is, for any pair of distortion values D and D1 and for any number λ ∈ [0, 1], we have the inequality
  R(λD + (1 − λ)D1) ≤ λR(D) + (1 − λ)R(D1)
• Convexity of R(D) implies that R(D) is strictly decreasing for increasing distortion.
• R(Dmin) ≤ H, where H is the entropy of the source. If for each x ∈ X there is a unique y ∈ Y that minimizes d(x, y), and each y minimizes d(x, y) for at most one x, then we can conclude that R(Dmin) = H.

• In summary, R(D) is a continuous, monotonically decreasing, convex ∪ function in the interval of interest from D = 0 to D = Dmax, and R(D) = 0 for D ≥ Dmax.
• For each D ∈ [0, Dmax] there is one and only one relative minimum of the average mutual information; this minimum value is R(D), and it always occurs at a point where the average distortion is D.
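As an illustration, the small sketch below numerically checks the convexity inequality using the binary-source R(D) from earlier; the distortion pair and the choice p = 0.5 are illustrative.

```python
import numpy as np

def hb(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * np.log2(q) - (1 - q) * np.log2(1 - q)

def r_binary(d, p=0.5):
    """R(D) = Hb(p) - Hb(D) for D < p, else 0 (binary source, Hamming distortion)."""
    return hb(p) - hb(d) if d < p else 0.0

d0, d1 = 0.05, 0.30
for lam in np.linspace(0.0, 1.0, 6):
    lhs = r_binary(lam * d0 + (1 - lam) * d1)             # R(lam*D + (1-lam)*D1)
    rhs = lam * r_binary(d0) + (1 - lam) * r_binary(d1)   # chord between the points
    print(round(float(lam), 1), round(lhs, 4), "<=", round(rhs, 4), lhs <= rhs + 1e-12)
```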
A typical rate distortion curve for a discrete memoryless source and a single-letter distortion measure is shown in Figure 1.

[Figure 1: A typical rate distortion curve. The vertical axis is R(D); the curve starts at R(0) ≤ H and decreases to 0 at D = Dmax on the horizontal (D) axis.]
Probability models used in Lossy Compression

• Uniform, Gaussian, Laplacian, and Gamma distributions are four probability models commonly used in the design and analysis of lossy compression systems.
• Uniform Distribution: As for lossless compression, this is again our ignorance model. If we do not know anything about the distribution of the source output, except possibly the range of values, we can use the uniform distribution to model the source. The probability density function for a random variable uniformly distributed between a and b is
  f_X(x) = 1/(b − a)  for a ≤ x ≤ b
  f_X(x) = 0          otherwise
• Gaussian Distribution: The Gaussian distribution is one of the most commonly used probability models for two reasons: 1) it is mathematically tractable, and 2) by virtue of the central limit theorem, it can be argued that in the limit the distribution of interest goes to a Gaussian distribution.
• The probability density function for a random variable with a Gaussian distribution, mean μ, and variance σ² is
  f_X(x) = (1 / √(2πσ²)) exp(−(x − μ)² / (2σ²))
• Laplacian Distribution: Many sources that we deal with have distributions that are quite peaked at zero.
• For example, speech consists mainly of silence; therefore, samples of speech will be zero or close to zero with high probability.
• Image pixel values themselves are not clustered around zero. However, there is a high degree of correlation among neighboring pixels, so a large number of the pixel-to-pixel differences will have values close to zero. In these situations, a Gaussian distribution is not a very close match to the data.
• A closer match is the Laplacian distribution, which is peaked at zero.
• The probability density function for a zero-mean random variable with a Laplacian distribution and variance σ² is
  f_X(x) = (1 / (√2 σ)) exp(−√2 |x| / σ)
• Gamma Distribution: A distribution that is even more peaked, though considerably less tractable, than the Laplacian distribution is the Gamma distribution.
• The probability density function for a Gamma-distributed random variable with zero mean and variance σ² is given by
  f_X(x) = √(√3 / (8πσ|x|)) exp(−√3 |x| / (2σ))
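To illustrate the increasing peakedness, the sketch below evaluates the four densities above, each scaled to zero mean and unit variance, at a few points near zero. The grid of x values and the helper names are illustrative choices.

```python
import numpy as np

sigma = 1.0   # all four densities scaled to zero mean and unit variance

def uniform_pdf(x):
    a, b = -np.sqrt(3) * sigma, np.sqrt(3) * sigma   # variance (b - a)^2 / 12 = sigma^2
    return np.where((x >= a) & (x <= b), 1.0 / (b - a), 0.0)

def gaussian_pdf(x):
    return np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def laplacian_pdf(x):
    return np.exp(-np.sqrt(2) * np.abs(x) / sigma) / (np.sqrt(2) * sigma)

def gamma_pdf(x):
    # The peaked form given above; it diverges at x = 0, so evaluate at x != 0.
    return np.sqrt(np.sqrt(3) / (8 * np.pi * sigma * np.abs(x))) * \
           np.exp(-np.sqrt(3) * np.abs(x) / (2 * sigma))

xs = np.array([0.05, 0.5, 1.0, 2.0])
for name, pdf in [("uniform", uniform_pdf), ("gaussian", gaussian_pdf),
                  ("laplacian", laplacian_pdf), ("gamma", gamma_pdf)]:
    print(name, np.round(pdf(xs), 4))
# Near zero the ordering is uniform < gaussian < laplacian < gamma (most peaked).
```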
