
U.U.D.M.

Project Report 2021:9

Introduction to Fractal Image Compression

Fredrik Mellin

Degree project in mathematics, 15 credits
Supervisor: Örjan Stenflo
Examiner: Martin Herschend
May 2021

Department of Mathematics
Uppsala University
Contents

1 Introduction
2 Fractals
3 Mathematical Preliminaries
  3.1 Basic topology and mathematical tools
  3.2 The metric space (H(X), h)
4 Iterated Function Systems (IFS)
  4.1 The Collage Theorem
  4.2 Drawing the attractor of an IFS
5 Fractal Image Compression
  5.1 Metric Spaces of Images
  5.2 The Fractal Block Coding Algorithm
  5.3 Fractal Compression of Grayscale Digital Images
  5.4 Implementation and Results
  5.5 Conclusion/Discussion
6 Appendix
1 Introduction

There are many different fractal image compression techniques, and they form only a small family among all image compression techniques. An image compression technique is considered fractal if its core idea is to exploit the self-similarities that naturally occur in many images. One can often find a part of an image which, if modified in some way, fits into another part of the same image. This is where the ”fractal” part comes from, since fractals are often described as self-similar objects.

We will return to the connection between fractals and the image compression techniques later on.
We will also touch upon the benefits and drawbacks of the general method and showcase a simple fractal
image compression scheme. But before venturing into the topic of fractal image compression we need
some knowledge about the mathematical objects called fractals.

2 Fractals

The word ”fractal” originates from the Latin word frāctus, meaning ”broken” or ”uneven”, and was coined by Benoit B. Mandelbrot. Instead of trying to give a precise definition of a fractal, one can approach the subject in a different way. As Falconer states in ”Fractal Geometry: Mathematical Foundations and Applications” [2], we can make a list of properties which characterize fractals. A fractal set F may not possess all, but at least some, of the following properties:

(i) F has some form of self-similarity;

(ii) F is detailed on every scale;

(iii) Generally, the fractal dimension of F (defined in some way) does not need to be an integer;

(iv) Most of the time F has some simple algorithmic description.

The most common example of a fractal is a self-repeating (self-similar) geometrical figure. By this we mean a set which consists of smaller copies of itself. Below are some examples of fractals which have this property together with some of the other properties stated above.

Example 2.1 (Cantor set). The classical Cantor set C is a subset of the unit interval [0,1], created by
iteratively deleting the middle-third open subintervals as follows:

I0 = [0, 1]
I1 = [0, 1/3] ∪ [2/3, 1]
I2 = [0, 1/9] ∪ [2/9, 3/9] ∪ [6/9, 7/9] ∪ [8/9, 1]
...
In = In−1 with the middle open third of each interval in In−1 removed.

Then one can define the Cantor set as the intersection of all In ’s:

C = ⋂_{n=0}^{∞} In.

The Cantor set is clearly defined in a simple recursive way. In fact, we can construct the Cantor set using these two functions of scaling and translation:

w1(x) = x/3
w2(x) = x/3 + 2/3.

Both functions scale by 1/3 and the second one also moves the result 2/3 to the right on the real line. Now define the function W(S) = w1(S) ∪ w2(S) = {w1(x) : x ∈ S} ∪ {w2(x) : x ∈ S}, where S is a subset of [0, 1]; then W(I0) = I1. If we take the output I1 as input to our function W we have W(I1) = W(W(I0)) = I2. Repeating this pattern, by taking the output as input, we can define the Cantor set as the limit set:

C = lim_{n→∞} W∘n(I0).

Even though Figure 1 only illustrates the first few steps in the construction of the Cantor set, it is easy to see that each In consists of two smaller copies of In−1. So the Cantor set is clearly self-similar, but it is also detailed on every scale. If one imagines what the Cantor set looks like and then zooms in anywhere on the set, the pattern repeats itself and the gaps only become more apparent to the eye.

Figure 1: Construction of the Cantor set, showing I0 through I4.
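To make the recursive construction concrete, the following is a minimal Python sketch (added for illustration, not part of the original report) that applies W to a list of intervals, each interval stored as a pair of endpoints:

    # Cantor set construction via W(S) = w1(S) ∪ w2(S),
    # where each In is represented as a list of closed intervals (a, b).

    def w1(a, b):
        return (a / 3, b / 3)                    # scale by 1/3

    def w2(a, b):
        return (a / 3 + 2 / 3, b / 3 + 2 / 3)    # scale by 1/3, shift right by 2/3

    def W(intervals):
        # Apply both maps to every interval and take the union.
        return [w1(a, b) for a, b in intervals] + [w2(a, b) for a, b in intervals]

    I = [(0.0, 1.0)]        # I0 = [0, 1]
    for n in range(4):      # compute I1, ..., I4
        I = W(I)
        print(n + 1, sorted(I))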

Example 2.2 (Sierpinski triangle). The Sierpinski triangle is a classical example of a fractal. It is constructed by starting off with an equilateral triangle T0 and locating the midpoint of each side of the triangle. Drawing lines between these points forms a new triangle, and by removing this new triangle we are left with three new equilateral triangles inside the original one (see Figure 2); this new set of triangles is called T1. By repeating this for each of the new triangles the next iteration is created, and by repeating it infinitely many times the Sierpinski triangle is formed. Similar to the Cantor set, the Sierpinski triangle is self-similar and detailed on every scale.

Figure 2: Construction of the Sierpinski triangle: (a) T0, (b) T1, (c) T2, (d) T3, (e) T8.

The area and circumference of the Sierpinski triangle are quite interesting as well. We start off by investigating the area of Tn. If T0 has side length 1 it has height √(1² − (1/2)²) = √3/2.

n=0: The area a0 of the first triangle T0 then is (1 · √3/2)/2 = √3/4.

n=1: Since we removed a fourth of the triangle T0 to create T1, the area of T1 is a1 = a0 − (1/4)a0 = 3√3/16.

n=2: T2 was created by removing a fourth of each triangle in T1, therefore the area of T2 is a2 = (3²/4²)a0 = 9√3/64.

In general the area of Tn is an = (3^n/4^n)a0 = 3^n √3 / 4^(n+1) for n ≥ 0. Hence, the area of the Sierpinski triangle is lim_{n→∞} an = 0. The circumference of Tn, where we also count the circumference of each hole, can be found by:

n=0: Each side of T0 has length 1, which yields the circumference l0 = 3.

n=1: T1 consists of three equilateral triangles, each with 1/2 the side length of T0. This gives us the circumference l1 = 3 · 3 · (1/2) = 9/2.

n=2: Since it is a repeating pattern, T2 consists of three times as many triangles as T1, each with 1/2 the side length of a triangle in T1. Hence the circumference of T2 is l2 = 3 · (1/2) · l1 = 27/4.

In general one can see that Tn consists of 3^n triangles with side length 1/2^n. The circumference of Tn then is ln = 3 · 3^n/2^n = 3^(n+1)/2^n, which tends to +∞ as n → +∞. Therefore, the Sierpinski triangle has area 0 but infinite circumference.

The third property of fractals stated above explains this seemingly strange phenomenon. The so-called box dimension of a set is a common illustrative notion of fractal dimension. It can be thought of as a scaling relationship, where the number of boxes it takes to cover a set scales with the side length of the boxes. Figure 3 shows that the Sierpinski triangle is covered by 4 boxes with side length 1/2 in (a) and by 12 boxes with side length 1/4 in (b). In general, if N boxes with side length ε are needed to cover the Sierpinski triangle, then it takes 3N boxes with side length ε/2 to cover it. Since the number of boxes grows by a factor of 2^d each time the side length is halved, where d = log 3/log 2, we say that this number d is the box dimension of the Sierpinski triangle. In many cases the box dimension is equal to other notions of dimension, and it then makes sense to refer to d as the fractal dimension of the set. With this in mind it seems more reasonable that the Sierpinski triangle has no area but is more than just a curve, since its dimension is a number between 1 and 2. The box dimension is not the only notion of a fractal dimension; there are also other definitions, such as the Hausdorff dimension, which is more mathematically convenient. Fractal dimension is an interesting topic but not the topic of this thesis. However, for the interested reader more information can be found in [2].
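To spell out how the value of d follows from the scaling relation, here is a short worked derivation (added for clarity; it is the standard box-counting argument, not quoted from the report), written in LaTeX:

    N(\varepsilon) \sim \varepsilon^{-d}, \qquad
    N(\varepsilon/2) = 3\,N(\varepsilon)
    \;\Longrightarrow\; 2^{d} = 3
    \;\Longrightarrow\; d = \frac{\log 3}{\log 2} \approx 1.585 .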

Figure 3: Illustration of the boxes it takes to cover the Sierpinski triangle.

Example 2.3 (The Koch curve). Another famous example of a fractal is the von Koch curve, named after the Swedish mathematician Helge von Koch (1870–1924). Start off with a straight line K0 and divide it into three equal parts (just as for the Cantor set in Example 2.1). Replace the middle segment with two sides of an equilateral triangle of the same length as the segment being removed and end up with K1 (Figure 4b). Repeat the procedure for each line segment in the new figure to create the next iteration. The Koch curve is theoretically obtained by applying this transformation an infinite number of times.

Figure 4: Different iterations in the construction of the Koch curve: (a) K0, (b) K1, (c) K2, (d) K10.

Later on we will construct systems of transformations describing the Sierpinski triangle and the von Koch curve in a similar fashion as with the Cantor set in Example 2.1. This raises the question: is this possible for any given figure?

3 Mathematical Preliminaries

This section reviews some basic topological ideas and some other theory that will prove useful when studying the fractals we are interested in. As fractal image compression is connected to the theory of fractals, some of the definitions and results presented here will come in handy in that section as well. The results are stated with little or no explanation, as they are not the main topic of this thesis. However, a more in-depth treatment can be found in essentially any real analysis textbook, such as [5].

3.1 Basic topology and mathematical tools

We start off by letting X denote an arbitrary set. By defining X in an abstract way we can talk about a variety of different sets. Later on we will work with sets of sets and also, in some sense, sets of images. However, for the reader who is unfamiliar with the concepts stated below it might be helpful to think of X as R or R².

Definition 3.1. A metric space (X, d) is a set X together with a real-valued function d : X × X → R such that for any x, y, z ∈ X, the following holds:

(i) d(x, y) = 0 ⇐⇒ x = y;

(ii) 0 < d(x, y) < ∞, if x ≠ y;

(iii) d(x, y) = d(y, x);

(iv) d(x, y) ≤ d(x, z) + d(z, y).

Such a function d is called a metric.

It is worth noting that a space need not have a unique metric. For instance, in the space R² we can measure the distance between two points x = (x1, x2) and y = (y1, y2) by

d1(x, y) = √((x1 − y1)² + (x2 − y2)²).

This real-valued function clearly satisfies the properties (i)–(iv) in Definition 3.1, and thus it is a metric. On the other hand we also have the following metric, sometimes called the ”taxicab metric”,

d2(x, y) = |x1 − y1| + |x2 − y2|,

which also satisfies the properties. Hence, both d1 and d2 are metrics on the space R². The generalised form of d1 is

de(x, y) = √((x1 − y1)² + (x2 − y2)² + · · · + (xn − yn)²),

and we will refer to this metric de as the Euclidean metric.

The metric enables us to talk about convergence of a sequence in said metric space. A sequence {xn} in a metric space (X, d) is said to converge to a point x ∈ X if, given any ε > 0, there exists an N ∈ N such that d(xn, x) < ε whenever n > N.

Definition 3.2. A sequence {xn}_{n=1}^∞ in a metric space (X, d) is called a Cauchy sequence if, for any ε > 0, there is an N ∈ N such that

d(xn, xm) < ε   ∀ n, m > N.

In other words, in a Cauchy sequence the points get closer and closer together the further along the sequence one moves. However, they need not approach a point in the space X. For example, let (Q, d) be the metric space where Q is the set of rational numbers and d is the Euclidean metric. Then the sequence an = (1 + 1/n)^n converges to the number e, which is not in Q. This motivates the following definition:
Definition 3.3. A metric space (X, d) is complete if every Cauchy sequence in X converges in X.

One more very important concept when developing the theory of fractals is the notion of compact subsets. To understand the definition of compact subsets one needs to recall what a subsequence is. Given a sequence {xn}, a subsequence {xn_k} can be derived from {xn} by deleting some or no elements without changing the order of the elements that remain. For instance, the sequence 1/2, 1/4, 1/6, . . . is the subsequence of even denominators of the sequence {1/n}_{n=1}^∞ = 1, 1/2, 1/3, . . .

Definition 3.4. Let E ⊂ X be a subset of a metric space (X, d). E is said to be compact if every
infinite sequence in E contains a subsequence that converges to an element in E.

The concept of compact subsets might be a bit odd to the reader who has not encountered it before. Therefore we will state some other definitions regarding subsets of metric spaces, together with a theorem, that will help build intuition for compact subsets.
Definition 3.5. A subset E of a metric space (X, d) is open if, for each point p ∈ E there is some
r > 0 such that {q ∈ X : d(p, q) < r} is contained in E.
Definition 3.6. A subset E of a metric space (X, d) is closed if the complement of E, denoted E c , is
open.
Remark. A closed set contains all its limit points.
Definition 3.7. A subset E of a metric space (X, d) is bounded if there exists a real number M > 0
and a point q ∈ X such that d(p, q) < M for all p ∈ E.
Theorem 3.1. Any compact subset K ⊂ X in a metric space (X, d) is closed and bounded.

The converse of Theorem 3.1 is not true in general, but in the case when X = R^k together with the Euclidean metric, for example, we do have equivalence. This version of the theorem, when X = R^k, is called the Heine–Borel Theorem. The fractals discussed in the previous section, and the ones constructed later on, can be thought of as closed and bounded subsets of R² (or of R in the case of the Cantor set).

We will end this review section by saying something about functions on a metric space (X, d). We
will sometimes refer to functions as mappings or transformations.
Definition 3.8. A contraction mapping on the metric space (X, d) is a mapping f : X → X from the
metric space into itself, such that
d(f (x), f (y)) ≤ c · d(x, y) ∀x, y ∈ X (3.1)
for some constant 0 ≤ c < 1. The smallest such constant c is called the contractivity factor of f .

Theorem 3.2 (The Contraction Mapping Theorem). Let (X, d) be a non-empty complete metric space
with a contraction mapping f : X → X. Then f has a unique fixed point xf ∈ X, i.e. f (x) = x if and
only if x = xf . Furthermore, for any point x ∈ X, the sequence {f ◦n (x) : n = 0, 1, 2, . . . } converges to
the fixed point, i.e.
lim_{n→∞} f∘n(x) = xf.

The uniqueness of the fixed point in Theorem 3.2 is easy to show. If we let f (xf ) = xf and
f (yf ) = yf , then by (3.1)
d(f (xf ), f (yf )) = d(xf , yf ) ≤ c · d(xf , yf ).
This only holds for d(xf , yf ) = 0, since c ∈ [0, 1), and therefore xf = yf . The proof of the existence of
such a point xf can be found in [5]. The convergence of {f ◦n (x)} to the fixed point xf follows from the
same proof.
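As a concrete illustration of Theorem 3.2 (a small sketch added here, not taken from the report), consider the contraction f(x) = x/2 + 1 on R with the Euclidean metric; its contractivity factor is 1/2 and its fixed point is xf = 2, and iterating f from any starting point converges to it:

    def f(x):
        return x / 2 + 1     # contraction on R with factor 1/2, fixed point 2

    x = 10.0                 # arbitrary starting point
    for n in range(20):
        x = f(x)
    print(x)                 # approximately 2.0, the unique fixed point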

3.2 The metric space (H(X), h)

Since the fractals we are interested in studying are compact subsets of a complete metric space we
want to define a space which contains them. Let (X, d) be a complete metric space and let H(X) denote
the space of the non-empty compact subsets of X. Then a fractal F is a point in this space H(X). In
order to be able to talk about distances in this space, a metric is needed.

Let (X, d) be a complete metric space and B a compact subset of X. Define the distance between a point x ∈ X and B as:

d̂(x, B) = min{d(x, y) : y ∈ B}.

This minimum exists since B is compact and therefore closed and bounded. Furthermore, the distance between another compact subset A of X and B is defined as:

d̃(A, B) = max{d̂(x, B) : x ∈ A}.

The same reasoning applies here as well: since A and B are both compact, this maximum exists and is finite. Note that d̃(A, B) ≠ d̃(B, A) in general, so d̃ is not a metric, since the symmetry condition in the definition of a metric is not satisfied. However, this problem can be evaded by simply taking the maximum of d̃(A, B) and d̃(B, A).

Definition 3.9. Let (X, d) be a complete metric space. Then

h(A, B) = max{d̃(A, B), d̃(B, A)},   A, B ∈ H(X)

is called the Hausdorff metric.

Theorem 3.3. Let (X, d) be a complete metric space. Then H(X) together with the Hausdorff metric
h forms a complete metric space.

A proof of this theorem can be found in [1].
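For finite point sets the Hausdorff metric can be computed directly from the definitions above; the following is a small illustrative Python sketch (added here, not part of the report), using the Euclidean metric in R²:

    import math

    def d(p, q):
        # Euclidean metric in R^2
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def d_hat(x, B):
        # distance from a point x to a (finite) compact set B
        return min(d(x, y) for y in B)

    def d_tilde(A, B):
        # directed, non-symmetric distance from A to B
        return max(d_hat(x, B) for x in A)

    def h(A, B):
        # Hausdorff metric: h(A, B) = max{d_tilde(A, B), d_tilde(B, A)}
        return max(d_tilde(A, B), d_tilde(B, A))

    A = [(0.0, 0.0), (1.0, 0.0)]
    B = [(0.0, 0.0), (0.0, 2.0)]
    print(h(A, B))   # 2.0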

4 Iterated Function Systems (IFS)

Definition 4.1. An Iterated Function System (IFS) is a finite set of contraction mappings on a complete
metric space (X, d),
{wi : X → X | i = 1, 2, . . . , N }.
Each contraction mapping wi has a corresponding contractivity factor ci . An alternative notation for
the same IFS is,
{X; w1 , w2 , . . . , wN }.

In Example 2.1 we saw that the two mappings w1 and w2 , which in fact are contraction mappings,
could be used to construct the Cantor set. We will now state a theorem that generalises this concept
for any given IFS. Recall that if E is a subset of X and f : X → X is a function, then we define
f (E) = {f (x) : x ∈ E}.

Theorem 4.1. Let {X; wi, i = 1, 2, . . . , N} be an IFS with contractivity factor c = max{c1, c2, . . . , cN}. Define the transformation W : H(X) → H(X) by

W(B) = ⋃_{i=1}^{N} wi(B)

for all B ∈ H(X). Then:

(i) W is a contraction mapping with contractivity factor c with respect to the Hausdorff metric;

(ii) Its unique fixed point A ∈ H(X), with A = W(A) = ⋃_{i=1}^{N} wi(A), is given by A = lim_{n→∞} W∘n(B) for any B ∈ H(X).

The fixed point A ∈ H(X) is called the attractor of the IFS.

Example 4.1 (Sierpinski triangle). Earlier we constructed the Sierpinski triangle by removing the middle part of a triangle and repeating the procedure. However, it is also possible to represent the Sierpinski triangle in terms of an Iterated Function System. Start off with a solid triangle T0. Then T1 is constructed with the use of three functions; these three functions are affine transformations (see Definition 6.1 in the Appendix). Each transformation scales the triangle by one half and, by way of a translation, places the scaled-down copy in one of the corners of T0.

Figure 5: Using the IFS to construct the Sierpinski triangle: (a) T0, (b) T1 = W(T0), (c) T2 = W(W(T0)), (d) T3 = W(W(W(T0))).

The corresponding IFS is given by {R²; w1, w2, w3}, where the contractive transformations w1, w2 and w3 are given by

w1(x1, x2) = (x1/2, x2/2)
w2(x1, x2) = (x1/2 + 1/2, x2/2)
w3(x1, x2) = (x1/2 + 1/4, x2/2 + √3/4).

The attractor T of this IFS is the Sierpinski triangle and is given by T = lim_{n→∞} W∘n(T0).

Example 4.2 (Koch curve). The Koch curve is also an attractor of an IFS. To identify the IFS one may consider K1 as four copies of K0 scaled, rotated and translated (see Figure 4). Let the first line segment of K1 be a copy of K0 scaled by 1/3; the second is also scaled by 1/3 but rotated counterclockwise by 60° and translated 1/3 to the right. The third line segment is similar to the second, but rotated clockwise and translated 1/2 to the right and √3/6 up. Lastly, the fourth line segment is like the first but translated 2/3 to the right. These scalings, rotations and translations compose the following IFS:

w1(x1, x2) = (x1/3, x2/3)
w2(x1, x2) = (x1/6 − (√3/6)x2 + 1/3, (√3/6)x1 + x2/6)
w3(x1, x2) = (x1/6 + (√3/6)x2 + 1/2, −(√3/6)x1 + x2/6 + √3/6)
w4(x1, x2) = (x1/3 + 2/3, x2/3).

4.1 The Collage Theorem

Theorem 4.1 states that the attractor of an IFS is unique and that, given an IFS, it does not matter what the initial set is: the iterates of W will tend to its attractor. For the inverse question, how one should go about finding an IFS for a given attractor, the Collage Theorem proves useful.

Theorem 4.2 (The Collage Theorem, Barnsley). Let (X, d) be a complete metric space. Let L ∈ H(X) and ε ≥ 0 be given. Choose an IFS {X; w1, w2, . . . , wn} with contractivity factor 0 ≤ c < 1, such that

h(L, ⋃_{i=1}^{n} wi(L)) ≤ ε,

where h is the Hausdorff metric. Then

h(L, A) ≤ h(L, ⋃_{i=1}^{n} wi(L))/(1 − c) ≤ ε/(1 − c)   for all L ∈ H(X),

where A is the attractor of the IFS.

This result follows from the Contraction Mapping Theorem. Consider f to be a contraction mapping with unique fixed point xf, and let x ∈ X be such that d(x, f(x)) < ε for a given ε > 0. Then

d(x, xf) = d(x, f(xf)) ≤ d(x, f(x)) + d(f(x), f(xf)) ≤ d(x, f(x)) + c · d(x, xf).

Hence,

d(x, xf) ≤ d(x, f(x))/(1 − c) ≤ ε/(1 − c).

In Example 4.1 and Example 4.2 above we identified the IFSs by analysing which transformations are needed to construct the Sierpinski triangle and the Koch curve. This is essentially what the Collage Theorem tells us: given a set L, if we can cover L well with transformed copies of itself, then the attractor of the corresponding IFS is close to L.

4.2 Drawing the attractor of an IFS

We shall now present an algorithm used for drawing the attractor of an IFS. The algorithm in
question is in fact an application of Theorem 4.1. Let {X; w1 , w2 , . . . , wN } be an IFS and choose an
A0 ∈ H(X). Compute An = W∘n(A0) inductively as

An+1 = W(An) = ⋃_{i=1}^{N} wi(An)   for n = 0, 1, 2, . . .

We thus construct a sequence {An : n = 0, 1, 2, . . . } ⊂ H(X). By Theorem 4.1, the sequence An converges to the attractor of the IFS in the Hausdorff metric [1].

In Example 4.1 we concluded that the Sierpinski triangle is the attractor of the IFS {R²; w1, w2, w3}, where the contraction mappings are given by

w1(x1, x2) = (x1/2, x2/2)
w2(x1, x2) = (x1/2 + 1/2, x2/2)
w3(x1, x2) = (x1/2 + 1/4, x2/2 + √3/4).

Figure 5 in the mentioned example illustrates the first step of the algorithm when the initial set is a
triangle. However, Theorem 4.1 states that we can choose an arbitrary compact subset of R2 and still
tend to the attractor of this IFS. Figure 6 illustrates that even if the initial set is a disk C0 , the set
Cn = W ◦n (C0 ) will tend to the Sierpinski triangle as n grows larger.

Figure 6: The Sierpinski triangle constructed from a disk: (a) C0, (b) C1 = W(C0), (c) C2 = W(C1), (d) C3 = W(C2), (e) C4 = W(C3), (f) Cn = W(Cn−1).
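A minimal Python sketch of this drawing algorithm (added for illustration; the report's own implementation is not shown). A set is approximated by a finite list of points, every map of the Sierpinski IFS above is applied to all points, and the result is iterated; matplotlib is assumed for plotting:

    import matplotlib.pyplot as plt

    # The Sierpinski IFS {R^2; w1, w2, w3} from Example 4.1.
    maps = [
        lambda x, y: (x / 2,       y / 2),
        lambda x, y: (x / 2 + 1/2, y / 2),
        lambda x, y: (x / 2 + 1/4, y / 2 + 3 ** 0.5 / 4),
    ]

    def W(points):
        # One application of W(B) = w1(B) ∪ w2(B) ∪ w3(B) on a finite point set.
        return [w(x, y) for w in maps for (x, y) in points]

    A = [(0.0, 0.0)]        # any non-empty starting set A0 works
    for _ in range(8):      # A8 = W∘8(A0) is already a good approximation
        A = W(A)

    xs, ys = zip(*A)
    plt.scatter(xs, ys, s=0.1)
    plt.show()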

There are many interesting and beautiful attractors of IFS’s. Below in Figure 7 are some examples
of fractals which are attractors of iterated function systems, and therefore can be drawn using the
algorithm. (Corresponding code tables for the IFS’s can be found in Appendix Table 1).

Figure 7: Images created by the algorithm: (a) Spiral, (b) Barnsley’s fern, (c) Fractal tree, (d) Pentagon pattern.

5 Fractal Image Compression

A photo in the physical world is simply a piece of paper with ink on it, but a photo in the digital world is a collection of small logical units called pixels. The resolution of a digital image refers to the number of pixels it consists of. If an image is n pixels wide and m pixels high we say that the image resolution is n × m. One may consider a digital image of resolution n × m as an n by m matrix, where every entry of the matrix represents a pixel. The value of an entry, the pixel value, is what determines the color of the pixel. The number of distinct colors that can be represented by a pixel depends on the number of bits per pixel that is used. For example, 8-bit color allows for 2^8 = 256 colors to be displayed. A bit (short for ”binary digit”) contains a single binary value of ”0” or ”1”, so it can only answer a simple yes or no question. Each pixel of a grayscale digital image often consists of 8 bits, which means that a grayscale digital image with resolution 1024 × 1024 requires 1024 · 1024 · 8 ≈ 8.4 · 10^6 bits to store. We will refer to the storage size required to save an image as the memory size of the image. The memory size is usually measured in bytes, where one byte = 8 bits.

Even though high-speed Internet access is expanding all around the world, and connection speeds are increasing, bandwidth is still limited. The time it takes to send a data file depends on the connection speed, but also on the memory size of the file. Hence, sending a high resolution image, or a collection of images, may still take a significant amount of time. Compressing the images reduces the amount of information that needs to be transferred and thereby also the time it takes. But how can one compress images? An important property of the human eye is its insensitivity to various kinds of information loss. In other words, an image can be changed in ways that the human eye will not detect. If there is a high amount of redundant data that does not affect ”the bigger picture”, then the data can be greatly compressed. Methods that lose some information during the compression are referred to as lossy compression methods, while their counterparts, where no original data is lost, are called lossless compression methods [3].

The main idea of fractal image compression is to store (also called encode) images as a collection of transformations. In order for this to be useful there has to be a way to decompress the image, i.e. a way to reconstruct the image from the stored information. The decompression (or decoding) involves repeatedly applying the transformations to an arbitrary starting image, ending up with an image that is either the original or, in most cases, one very similar to it. Each image in Figure 7 can be stored as a collection of affine transformations instead of as pixel values. If the numbers in the transformations are of the commonly used data type float, then the memory size of each number is 32 bits. Storing the tree as a collection of transformations, for instance, only requires 4 transformations × 6 numbers per transformation × 32 bits per number = 768 bits. Storing it as a collection of pixels, however, requires 512 · 512 · 1 = 262144 bits for the resolution 512 × 512 (since it is only black and white we only need 1 bit to store the color of each pixel). With this in mind one might ask: can we find a small number of affine transformations representing any given image? The answer is simply no, since a natural image is not exactly self-similar. But natural images are not completely without self-similarity either. As stated earlier, by looking at an image one might find a part of it that, if scaled and rotated, fits into another part of the same image. These types of self-similarities can be found in most images of faces, cars, mountains etc. To utilize these similarities we need to partition the image in some way and compare these bits and pieces with one another.

5.1 Metric Spaces of Images

In order to make use of the main results in previous sections we need a complete metric space. A
mathematical model for a grayscale image is a function f : S → G, where the domain S represents
points on a paper and the range G is the color of the points. For simplicity we assume that S is a closed
rectangular region in R2 and for every (x, y) ∈ S we have that f (x, y) ∈ G, where G represents a closed
interval of grayscale values ranging from black to white. We can generate a 3D-graph with the function
f (x, y), where the height represents the gray level at each point (x, y) of the paper. To be able to say
something about differences or distance between two images f and g we define the metric
d∗(f, g) = √( ∫_S |f(x, y) − g(x, y)|² dxdy ).    (5.1)

If we define F to be the space of real-valued square-integrable functions f : S → G, then F together with the metric d∗ forms a complete metric space [6].

Recall that W : F → F is a contraction mapping if for some constant c, 0 ≤ c < 1

d∗ (W (f ), W (g)) ≤ c · d∗ (f, g),

where c is called the contractivity factor of W . Then, by the Contraction mapping theorem, there exists
a unique fixed point fW ∈ F satisfying W (fW ) = fW .

We can state the Collage Theorem for grayscale images in the following way. Let f be a grayscale image and assume that W : F → F is a contraction mapping such that

d∗(f, W(f)) ≤ ε.

Then

d∗(f, fW) ≤ ε/(1 − c),

where c is the contractivity factor of W, fW is its fixed point, and W∘n(f0) → fW ≈ f for any initial image f0.

5.2 The Fractal Block Coding Algorithm

The Collage Theorem is the foundation of fractal image compression. The theorem ensures that it is possible to find a fractal representation of an image provided we can find a contraction mapping on F. With this in mind, A. Jacquin presented the basic fractal block coding algorithm in 1992 [4]. The algorithm starts off by partitioning the image into m non-overlapping range blocks Ri (1 ≤ i ≤ m). These range blocks can be thought of as functions Ri : Ri → G from the ”spatial part” Ri ⊂ S of the range block Ri to G. Then another partition of the image is made, this time into n non-overlapping domain blocks Dj (1 ≤ j ≤ n). In the same way Dj : Dj → G are functions from the ”spatial part” Dj of Dj to G. Each domain block has, in general, twice the side length of the range blocks. An illustrative example can be seen in Figure 8.

Figure 8: Example of 64 range blocks of size B × B and 16 domain blocks of size 2B × 2B.

Given these two partitions, the fractal block coding algorithm searches, for each range block Ri, for the best match amongst the domain blocks. Since it is unlikely that all range blocks have a good match amongst the domain blocks, we are allowed to modify the domain blocks. This can be done by shifting the grayscale value of the entire domain block by a constant β, and scaling each grayscale value by a constant α. In the end we have, for each range block Ri, a matching domain block Dj(i) together with the αi and βi values for the match. The list of triples (”index of domain block”, α, β) forms the encoding of the image.

Let αi and βi denote the best grayscale scaling and shift respectively, and let vi : Dj(i) → Ri denote the unique affine map of the form vi(x, y) = (1/2)(x, y) + (a, b), mapping Dj(i) onto Ri for 1 ≤ i ≤ m. Then we have the following theorem:

Theorem 5.1. Let Wi : F → F, for 1 ≤ i ≤ m, be defined as

Wi(f)(x, y) = αi f(vi⁻¹(x, y)) + βi   if (x, y) ∈ Ri,
Wi(f)(x, y) = 0                       if (x, y) ∈ S \ Ri.

Then Wi is a contraction mapping with respect to d∗ if |αi| < 2.

Proof.

(d∗(Wi(f), Wi(g)))² = ∫_{Ri} |Wi(f)(x, y) − Wi(g)(x, y)|² dxdy
                    = αi² ∫_{Ri} |f(vi⁻¹(x, y)) − g(vi⁻¹(x, y))|² dxdy
                    = (αi²/4) ∫_{Dj(i)} |f(x, y) − g(x, y)|² dxdy
                    ≤ (αi²/4) ∫_{S} |f(x, y) − g(x, y)|² dxdy
                    = (αi²/4) (d∗(f, g))².

This means that if αi²/4 < 1, then Wi is a contraction.

Since the Ri's (1 ≤ i ≤ m) form a partition of S, we can define W : F → F by

W(f)(x, y) = Σ_{i=1}^{m} Wi(f)(x, y) = αi f(vi⁻¹(x, y)) + βi   if (x, y) ∈ Ri.

If we choose αi such that Wi is a contraction for all i ∈ {1, 2, . . . , m}, then by the Contraction Mapping
theorem, through iteratively applying Wi to any starting image f we will recover the fixed point fWi .
If we now define fW as the sum of all fWi , we have the following theorem.

Theorem 5.2. Suppose c := max_i |αi/2| < 1. Let fWi denote the unique fixed point of the contraction mapping Wi (i = 1, . . . , m), and let fW = Σ_{i=1}^{m} fWi. If W(f) = Σ_{i=1}^{m} Wi(f) for any f ∈ F, then there exists a constant γ depending on f such that

d∗(W∘j(f), fW) ≤ γ · c^j.

Proof.

d∗(W∘j(f), fW) ≤ Σ_{i=1}^{m} d∗(Wi∘j(f), fWi)
              ≤ Σ_{i=1}^{m} (max_i |αi|/2)^j d∗(f, fWi)
              = (max_i |αi/2|)^j Σ_{i=1}^{m} d∗(f, fWi),

so the claim holds with γ = Σ_{i=1}^{m} d∗(f, fWi).

5.3 Fractal Compression of Grayscale Digital Images

A grayscale digital image can be represented by a function

f˜ : {1, 2, . . . , n} × {1, 2, . . . , m} → {0, 1, . . . , 255},

and as mentioned earlier, when working with digital images (which are of fixed size, say n × m) we can
think of them as matrices. We let [f˜i,j ] for i = 1, . . . , n and j = 1, . . . , m, be a matrix where each entry
f˜i,j = f˜(i, j). This gives us a way to compute the difference of two digital images with the so called rms
(root mean square) metric:
drms(f̃, g̃) = √( Σ_{i=1}^{n} Σ_{j=1}^{m} |f̃(i, j) − g̃(i, j)|² ).    (5.2)

For simplicity we consider square images/matrices such that m = n for some n = 2^p. Now each range block and domain block represents a submatrix, but the domain blocks are still twice the size of the range blocks. In order to compare a domain block with a range block they have to be of the same size. This is made possible by averaging the pixel values of the domain block, reducing its size to the size of a range block. We can now state the fractal block coding algorithm for grayscale digital images as follows:

Algorithm 1 Fractal Block Coding


1: for Ri (1 ≤ i ≤ m) do
2: for D j (1 ≤ j ≤ n) do
3: Downscale D j to match the size of Ri and call it D̂ j .
4: Find best α and β for the pair (Ri , D̂j ) using rms.
5: Compute the error using rms, and if the error is smaller than for any other domain block so far, remember the pair along with the α and β.
6: end for
7: end for
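The core numerical step in the inner loop is finding the best α and β for a pair of blocks. The following is a small sketch (added for illustration; the function names and the least-squares formulation are my own, the report does not spell out its implementation) of the downscaling and of the α, β fit, assuming numpy arrays:

    import numpy as np

    def downscale(D):
        # Average each 2x2 block of the domain block so it matches the range block size.
        return (D[0::2, 0::2] + D[1::2, 0::2] + D[0::2, 1::2] + D[1::2, 1::2]) / 4.0

    def fit_alpha_beta(R, D_hat):
        # Least-squares choice of contrast alpha and brightness beta, i.e. the values
        # minimizing the rms error (5.2) between alpha * D_hat + beta and R.
        d = D_hat.flatten()
        r = R.flatten()
        var = np.var(d)
        alpha = 0.0 if var == 0 else np.cov(d, r, bias=True)[0, 1] / var
        beta = r.mean() - alpha * d.mean()
        err = np.sqrt(np.sum((alpha * d + beta - r) ** 2))
        return alpha, beta, err

In the encoder, fit_alpha_beta would be called for every pair (Ri, D̂j), and the domain index with the smallest error is kept together with its α and β.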

This algorithm is a basic version of fractal image compression. There are many variations and improvements of this method, but the fundamental idea stays the same. A small alteration to improve the matching between the range and domain blocks is to introduce rotation and flipping of the domain block. We have opted to call this method the enhanced fractal block coding algorithm.

In the same way as in the original algorithm, the image is partitioned in two ways: the first partition consists of non-overlapping domain blocks and the second of non-overlapping range blocks (see Figure 8). Then, for each range block Ri, we find the domain block together with a transformation that is closest to Ri. The transformations tested for each domain block include:

• Flipping;

• Rotating;

• Changing contrast and brightness.

Flipping is simply a reflection of the scaled-down domain block, and the rotations consist of rotating the block 0°, 90°, 180° or 270°. Since we can either flip or not flip the domain block and then rotate it in 4 different ways, we have 8 variants of each domain block to compare each range block with. Then for each range block Ri we find the version of the downscaled domain block D̂j(i), together with a contrast scaling constant α and a brightness controlling constant β, which gives the lowest root mean square error, i.e. we find the function and domain block

wi(Dj(i)) = α × rotate(flip(D̂j(i))) + β

that minimizes equation (5.2) with f̃ = Ri and g̃ = wi(Dj(i)).

The algorithm can be stated as follows:

Algorithm 2 Enhanced Fractal Block Coding
1: for Ri (1 ≤ i ≤ m) do
2: for D j (1 ≤ j ≤ n) do
3: Downscale D j to match the size of Ri and call it D̂ j .
4: Generate D̂ j,k (k = 1, . . . , 8), the different rotations and flipping of D̂ j .
5: Find best α and β for the pair (Ri , D̂ j,k ) using rms.
6: Compute the error using rms, and if the error is smaller than for any other D̂j,k so far, remember the pair (Ri, Dj) along with the rotation, flipping, α and β.
7: end for
8: end for
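Generating the 8 variants in step 4 amounts to combining an optional flip with the four rotations; a small illustrative sketch (assuming numpy and a downscaled block D_hat as in the sketch above):

    import numpy as np

    def variants(D_hat):
        # The 8 symmetries of a square block: four rotations of the block
        # and four rotations of its mirror image.
        out = []
        for block in (D_hat, np.fliplr(D_hat)):
            for k in range(4):               # rotations by 0°, 90°, 180°, 270°
                out.append(np.rot90(block, k))
        return out                           # list of 8 arrays, indexed k = 0, ..., 7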

The decoding is very much the same for both algorithms, the only difference being the saved parameters. We start off by generating an arbitrary image of the same size as the original image, and then iteratively apply the transformations corresponding to the saved parameters a fixed number of times. With the restriction |α| < 2 we ensure that each transformation is a contraction, and then by Theorem 5.2 the decoded image will be close to the original.
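A sketch of the decoding loop for the standard algorithm (added for illustration; the layout of the stored triples (j, alpha, beta), the block bookkeeping and the helper downscale from the earlier sketch are assumptions, not the report's actual code):

    import numpy as np

    def decode(triples, image_size, block_size, n_iter=10):
        # triples[i] = (domain index j, alpha, beta) for range block i,
        # with range blocks and domain blocks both listed row by row.
        B = block_size
        f = np.random.rand(image_size, image_size) * 255     # arbitrary starting image
        range_per_row = image_size // B
        domain_per_row = image_size // (2 * B)
        for _ in range(n_iter):
            g = np.empty_like(f)
            for i, (j, alpha, beta) in enumerate(triples):
                ry, rx = divmod(i, range_per_row)
                dy, dx = divmod(j, domain_per_row)
                D = f[2 * B * dy:2 * B * (dy + 1), 2 * B * dx:2 * B * (dx + 1)]
                g[B * ry:B * (ry + 1), B * rx:B * (rx + 1)] = alpha * downscale(D) + beta
            f = g
        return np.clip(f, 0, 255)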

When working with lossy image compression (such as fractal image compression) it is helpful to be able to measure the quality of the decompressed image. One common way to do this is to calculate the peak signal-to-noise ratio (PSNR). In order to do this we first need the mean square error (MSE). For the original m × n image f and the lossy compressed image f∗ the MSE is defined as:

MSE = (1/(m · n)) Σ_{i=1}^{n} Σ_{j=1}^{m} |f(i, j) − f∗(i, j)|².

Since we are working with 8-bit grayscale images the largest pixel value is 255. Given this, together with the MSE, the PSNR is defined as:

PSNR = 10 · log₁₀(255² / MSE).

It is important to note that the PSNR only measures the overall difference in the pixel values of two
images. In other words it does not say anything about how the human eye will experience the image
quality. A high PSNR is considered better as we aim to have a small mean square error between the
original and the approximated image.
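These formulas translate directly into a few lines of numpy (a small illustrative sketch, not the report's own code):

    import numpy as np

    def psnr(original, decoded):
        # Mean square error between the original and the decoded 8-bit grayscale image.
        mse = np.mean((original.astype(float) - decoded.astype(float)) ** 2)
        # Peak signal-to-noise ratio in dB, with peak value 255 for 8-bit images.
        return 10 * np.log10(255 ** 2 / mse)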

5.4 Implementation and Results

The two fractal compression algorithms presented above were implemented in Python. To test and
compare the algorithms, experiments on two different types of images were conducted. The first image
is a digital photo of Enya and the second is a QR code with the message ”Fractal image compression”.
The original images can be seen in Figure 9.

Figure 9: Original images: (a) Enya and (b) the QR code with the message ”Fractal image compression”.

By Theorem 5.1 we need to restrict |α| < 2 to ensure a contraction. However, in practice it is useful to restrict |α| further to reduce the number of iterations needed before reaching the fixed point. This might affect the quality of the image slightly, but it guarantees that the sequence of images converges faster and that only a few decoding steps are necessary. Figure 10 illustrates the decoding of a fractal compressed image with all scaling parameters |α| < 1. Here we reach the approximated image after 6 iterations.

Figure 10: Iterations 0–9 in the decoding of the image of Enya encoded by the standard fractal block coding algorithm with range block size 8 × 8. The image in the top left corner is a randomly generated image and the PSNR is displayed inside the parentheses.

The memory size of an image compressed by either of the presented algorithms does not depend on the original image per se. What determines the memory size is the number of range and domain blocks we choose, together with the choices of α and β. For the enhanced method we also need to take the rotation and flipping into account. The test images in Figure 9 both have resolution 512 × 512, so the memory size after compression will be the same when using the same range block size. Two different range block sizes were tested: in the first case they were 16 × 16, with a corresponding domain block size of 32 × 32. Since the resolution of the test images is 512 × 512 we have a total of 256 domain blocks of size 32 × 32, so the index of the domain block requires 8 bits. In the second case we have range blocks of size 8 × 8 and domain blocks of size 16 × 16, which results in a total of 1024 domain blocks, so here 10 bits are required for the index.

The α’s were chosen to be any combination of c1·(1/2) + c2·(1/4) + c3·(1/8) + c4·(1/16), where ci ∈ {0, 1} for i = 1, 2, 3, 4. Thus, 4 bits were used to represent them. The β’s, on the other hand, were chosen to take any integer value between −255 and 255, which is a total of 2⁹ − 1 = 511 different values. Therefore, 9 bits were needed in the representation of the β’s.

The image created by the standard fractal block coding algorithm is stored as 1024(8 + 4 + 9) =
21504 bits = 2688 bytes for the larger block sizes and 4096(10 + 4 + 9) = 94208 bits = 11776 bytes
for the smaller block sizes. As mentioned, the enhanced method also needs to store information about
the rotation and flipping of the domain block. 1 bit is used for the flipping and 2 bits are used
for the rotation. This results in the total memory size of the images with the larger block sizes as
1024(8 + 1 + 2 + 4 + 9) = 24576 bits = 3072 bytes, and for the smaller block sizes 4096(10 + 1 + 2 + 4 + 9) = 106496 bits = 13312 bytes. By dividing the memory size of the original images (which is 512 · 512 · 8 = 2097152 bits = 262144 bytes) by the memory size of the compressed images, we get the compression ratio. For example, the compression ratio of the standard fractal block coding algorithm with range blocks of size 16 × 16 is 262144/2688 ≈ 97.5.
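The memory-size bookkeeping above is easy to reproduce; a tiny sketch (illustrative only) for the standard algorithm:

    def compressed_size_bits(image_size, range_size, index_bits, alpha_bits=4, beta_bits=9):
        # Number of range blocks times the bits stored per block (domain index, alpha, beta).
        n_range = (image_size // range_size) ** 2
        return n_range * (index_bits + alpha_bits + beta_bits)

    bits = compressed_size_bits(512, 16, index_bits=8)
    print(bits, bits // 8)            # 21504 bits, 2688 bytes
    print(512 * 512 * 8 / bits)       # compression ratio of about 97.5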

5.5 Conclusion/Discussion

By inspection of Figure 11 and Figure 12 it is obvious that both algorithms have a tougher time with an image like the QR code than with Enya. Since the QR code consists mainly of very dark or very bright grayscale values, each ”error” is quite noticeable, especially for the larger block sizes. In the image of Enya the coloring is in some sense smoother, except for some parts (e.g. around the eyes) where the same problem as with the QR code arises, but on a smaller scale. However, it is not only that the errors are more apparent in the images of the QR code; the PSNR results also support the conclusion that the quality is better in the images of Enya. Despite all this, the four fractal-compressed images of the QR code in Figure 12 are successfully readable by a QR code scanner.

The most computationally intensive part of the algorithms is finding the best match for each range block. Since the enhanced method includes 8 variants of each domain block, the run-time of the encoding part is approximately 8 times longer with the implementation used. This is substantial, since the encoding part is already quite a long procedure, especially for smaller block sizes. Theoretically, the result of the enhanced method should be better (or at least as good) compared with the standard fractal block coding algorithm. The PSNR results support this, but the trade-off in memory may not be worth it in most cases.

As mentioned earlier, fractal image compression is a lossy compression method. Another, more well-known, lossy compression method is JPEG, developed by the Joint Photographic Experts Group. Even though the compression ratio is sometimes great for the fractal block coding algorithm, the long encoding time is an essential drawback. Other fractal image compression methods try to address this weakness to make fractal compression a more competitive option. However, the existing fractal methods are to this day still considered time-consuming compression methods in comparison to e.g. JPEG.

Figure 11: Results of both fractal compression methods for Enya with two different range block sizes. The PSNR, memory size and compression ratio are presented for each compressed image.

(a) Standard algorithm, range block size 16 × 16: PSNR 26.3, memory size 2688 bytes, compression ratio 97.5.
(b) Standard algorithm, range block size 8 × 8: PSNR 30.1, memory size 11776 bytes, compression ratio 22.3.
(c) Enhanced algorithm, range block size 16 × 16: PSNR 26.9, memory size 3072 bytes, compression ratio 85.3.
(d) Enhanced algorithm, range block size 8 × 8: PSNR 30.9, memory size 13312 bytes, compression ratio 19.7.

Figure 12: Results of both fractal compression methods for the QR code with two different range block sizes. The PSNR, memory size and compression ratio are presented for each compressed image.

(a) Standard algorithm, range block size 16 × 16: PSNR 17.6, memory size 2688 bytes, compression ratio 97.5.
(b) Standard algorithm, range block size 8 × 8: PSNR 26.9, memory size 11776 bytes, compression ratio 22.3.
(c) Enhanced algorithm, range block size 16 × 16: PSNR 19.2, memory size 3072 bytes, compression ratio 85.3.
(d) Enhanced algorithm, range block size 8 × 8: PSNR 28.0, memory size 13312 bytes, compression ratio 19.7.

6 Appendix

Definition 6.1. A transformation w : R² → R² of the form

w(x1, x2) = (a·x1 + b·x2 + e, c·x1 + d·x2 + f),

where a, b, c, d, e and f are real numbers, is called an affine transformation. An affine transformation consists of a linear transformation followed by a translation.

Table 1: IFS code tables.

(a) IFS code for the Spiral.

w a b c d e f
1 0.752 -0.274 0.274 0.752 0 0
2 0.200 0 0 0.200 1 -0.364
3 0.200 0 0 0.200 -0.364 1
4 0.200 0 0 0.200 -0.728 -0.728

(b) IFS code for Barnsley’s fern.

w a b c d e f
1 0 0 0 0.16 0 0
2 0.85 0.04 -0.04 0.85 0 1.6
3 0.2 -0.26 0.23 0.22 0 1.6
4 -0.15 0.28 0.26 0.24 0 0.44

(c) IFS code for a fractal tree.

w a b c d e f
1 0.195 -0.488 0.344 0.443 0.722 0.536
2 0.462 0.414 -0.252 0.361 0.538 1.167
3 -0.058 -0.070 0.453 -0.111 1.125 0.185
4 -0.045 0.091 -0.469 -0.022 0.863 0.871

(d) IFS code for a pentagon pattern.

w a b c d e f
1 0.382 0 0 0.382 0.3072 0.6190
2 0.382 0 0 0.382 0.6033 0.4044
3 0.382 0 0 0.382 0.0139 0.4044
4 0.382 0 0 0.382 0.1253 0.0595
5 0.382 0 0 0.382 0.4920 0.0595

References

[1] Michael F. Barnsley. Fractals Everywhere (2nd edition). Academic Press, 1993.

[2] Kenneth Falconer. Fractal Geometry: Mathematical Foundations and Applications. Wiley, 1990.

[3] Yuval Fisher. Fractal Image Compression: Theory and Applications. Springer-Verlag, 1995.

[4] A. Jacquin. Image coding based on a fractal theory of iterated contractive image transformations. IEEE Transactions on Image Processing, 1(1):18–30, 1992.

[5] Walter Rudin. Principles of Mathematical Analysis. McGraw-Hill Education, 1976.

[6] Stephen Welstead. Fractal and Wavelet Image Compression Techniques. SPIE Press, 1999.

