Imagemanipulation PDF

This document discusses image manipulation in MATLAB. It begins by explaining that digital images are matrices of pixels and can be manipulated using matrix operations in MATLAB. It then covers basic MATLAB commands for reading, displaying, and writing images. Transformation matrices are introduced for shifting images by modifying pixel positions. The discrete cosine transform (DCT) is explained for image compression, representing images as combinations of cosine functions of different frequencies. Low frequency components dominate smooth images, allowing compression by removing high frequency components.

Uploaded by

Girindra_W
Copyright
© Attribution Non-Commercial (BY-NC)

Image Manipulation in MATLAB

1 Introduction
Digital images are just matrices of pixels, and any type of matrix operation can be applied to a
matrix containing image data. In this project you will explore some ways to manipulate images
using MATLAB. We'll start off with transformation matrices, and move on to image compression.
2 Basic MATLAB commands
MATLAB has some nice built-in functions for reading and writing image files. The first command
we'll be using is imread, which reads in an image file saved in the current directory. imread works
with most popular image file formats. To view an image in a MATLAB figure, use imagesc. imagesc
is similar to image, but for our purposes will work more consistently. The following code will read
in an image with file name photo1.jpg, save it as the variable X, and display the image in a MATLAB
figure window. Make sure you run the code from the same folder that contains the image.
X = imread('photo1.jpg');
imagesc(X)
After reading in an image like this, X(:,:,1) is a 2-D matrix with intensity values for the red
channel, X(:,:,2) for the green and X(:,:,3) for the blue.
When images are read in using imread, MATLAB stores the data as integers (typically of class
uint8). If we want to perform mathematical operations on the image data using floating point
numbers, the integers must be converted to floats as well. If you just read in an image as X, you
can use X = double(X); to perform the floating point conversion. Note that at this point, imagesc
will no longer display X properly, since for a color image of class double it expects values
between 0 and 1, while X still holds values between 0 and 255.
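The reason for converting to double can be seen in a small experiment. The sketch below assumes the image data is of class uint8 and uses a made-up 1-by-3 array (X_int) in place of a real image. Integer arithmetic in MATLAB saturates at the limits of the type and rounds fractional results, while double arithmetic does not.

```matlab
% Hypothetical stand-in for a few pixel values returned by imread
X_int = uint8([10 200 250]);

% Integer arithmetic saturates at 255 and rounds fractional results
sum_int  = X_int + 100;        % 300 and 350 do not fit in uint8
half_int = X_int / 4;          % results are rounded to whole numbers

% Converting to double first keeps the exact values
X_double    = double(X_int);
sum_double  = X_double + 100;
half_double = X_double / 4;

disp(class(sum_int));          % the result of integer arithmetic stays uint8
disp(sum_int);
disp(sum_double);
```

Comparing sum_int with sum_double shows which pixel values were clipped by the integer type.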
If you want to write image data to an image file, you can use imwrite. Note that if you converted
the image data to the double format, you'll need to convert back to integer values. The
command to do this is uint8. The following code shows how to read in an image, convert to double,
and write to a .jpg file. You'll want to build your image manipulation code around this template
if you wish to write your output to an image file.
X_int = imread('photo1.jpg');
X_double = double(X_int);
%
% manipulate the image
%
imwrite(uint8(X_double),'outputFileName.jpg')
3 Image Manipulation
Matrix multiplication allows us to transform a vector in many ways. The following matrix takes
the entries of a vector and shifts them down one position, cycling the last entry around to the top.
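\[
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{bmatrix}
=
\begin{bmatrix} 5 \\ 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}
\]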
Notice that the rows of the matrix were transformed in the same way as the entries of the product,
if you think of the matrix as starting off as the identity matrix: each row of the identity was
shifted down one, and the last row cycled around to the top. This is called a Transformation
Matrix, and it is obtained by performing the desired transformation on each column of the identity
matrix. In this case, applying the shift to column 1 of the identity yields column 2, applying it to
column 2 yields column 3, and so on. This transformation matrix works on matrices too.
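\[
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 6 & 11 & 16 & 21 \\
2 & 7 & 12 & 17 & 22 \\
3 & 8 & 13 & 18 & 23 \\
4 & 9 & 14 & 19 & 24 \\
5 & 10 & 15 & 20 & 25
\end{bmatrix}
=
\begin{bmatrix}
5 & 10 & 15 & 20 & 25 \\
1 & 6 & 11 & 16 & 21 \\
2 & 7 & 12 & 17 & 22 \\
3 & 8 & 13 & 18 & 23 \\
4 & 9 & 14 & 19 & 24
\end{bmatrix}
\]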
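The two products above can be checked numerically. This is a minimal sketch: the variable names (I5, T, x, A) are chosen here for illustration, and the matrix is built by reordering the rows of the identity, which is one of several ways to obtain it.

```matlab
n = 5;
I5 = eye(n);

% Shift the rows of the identity down by one, wrapping the last row to the top
T = I5([n, 1:n-1], :);

x = (1:n)';                 % column vector with entries 1..5
A = reshape(1:n^2, n, n);   % the 5-by-5 matrix used in the example above

Tx = T * x;                 % entries shifted down one position
TA = T * A;                 % every column of A shifted the same way

disp(Tx');
disp(TA);
```

Because T acts on each column independently, the matrix case is just the vector case repeated once per column.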
Notice how it cycled the rows around in a matrix the same way it did in the vector. Since an image
is just a matrix, we can transform it using a linear transformation. The following image was
transformed using a 256 × 256 version of the transformation matrix above, shifting the image down
by 50 pixels.
Here's the code that produced the shifted image above.
[m,n] = size(X_gray);   % m rows, n columns
r = 50;                 % number of pixels to shift down
E = eye(m);             % T multiplies X_gray from the left, so it must be m-by-m
T = zeros(m,m);
% fill in the first r rows of T with the last r rows of E
T(1:r,:) = E(m-(r-1):m,:);
% fill in the rest of T with the first part of E
T(r+1:m,:) = E(1:m-r,:);
X_shift = T*X_gray;
imagesc(uint8(X_shift));
colormap('gray');
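A sketch of the same construction applied to a matrix that is not square, using a small made-up matrix (X_test) rather than an image. The names X_test and same are illustrative, and the built-in circshift is used only as an independent check of the result.

```matlab
X_test = reshape(1:24, 4, 6);         % 4-by-6 test matrix, not square
[m, n] = size(X_test);
r = 3;                                % shift down by r rows

E = eye(m);                           % left multiplication needs an m-by-m matrix
T = zeros(m, m);
T(1:r, :)   = E(m-(r-1):m, :);        % last r rows of E go to the top
T(r+1:m, :) = E(1:m-r, :);            % remaining rows move down by r

X_shift = T * X_test;

% Independent check: circshift along dimension 1 wraps the rows the same way
same = isequal(X_shift, circshift(X_test, r, 1));
disp(same);
```

The size of T depends only on the number of rows of the matrix being shifted, which is why the check also passes for a 4-by-6 input.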
Rows can be transformed too, just by multiplying by the transformation matrix on the right side.
As an example:
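\[
\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \end{bmatrix}
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
=
\begin{bmatrix} 2 & 3 & 4 & 5 & 1 \end{bmatrix}
\]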

However, the reordering is not the same as before: the transpose of the transformation matrix must
be used to shift the elements so that the last element is first.
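\[
\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
=
\begin{bmatrix} 5 & 1 & 2 & 3 & 4 \end{bmatrix}
\]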
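Both right-multiplication examples can be reproduced in a few lines. This is a sketch with illustrative names (left_cycle, right_cycle, A_right); it also applies the transposed matrix to a small 5-by-5 matrix to show that every row is cycled the same way, with circshift along dimension 2 available as a cross-check.

```matlab
n = 5;
E = eye(n);
T = E([n, 1:n-1], :);        % the down-shift matrix from the examples above

x = 1:n;                     % row vector with entries 1..5
left_cycle  = x * T;         % first entry moves to the end
right_cycle = x * T';        % last entry moves to the front

A = reshape(1:n^2, n, n);
A_right = A * T';            % every row of A cycled one place to the right

disp(left_cycle);
disp(right_cycle);
disp(A_right);
```

Multiplying on the left reorders rows and multiplying on the right reorders columns, so the two can be combined in a single expression when both kinds of shift are wanted.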

Just for fun: this code performs the shift above, but animates it through the entire image.
% m is the number of rows of X_gray, from [m,n] = size(X_gray) above
for r = 1:m
    E = eye(m);
    T = zeros(m,m);
    T(1:r,:) = E(end-(r-1):end,:);
    T(r+1:end,:) = E(1:end-r,:);
    X_shift = T*X_gray;
    imagesc(uint8(X_shift));
    colormap('gray');
    drawnow;
end
4 Image Compression
4.1 The Discrete Cosine Transform
You can think of the discrete cosine transform (DCT) as decomposing a vector into a linear
combination of cosine functions with different frequencies. If the data in a vector are smooth, then
the low frequency components will dominate the linear combination. If the data are not smooth
(discontinuous, jagged, rapidly increasing or decreasing), then there will be more weight placed on
the higher frequency components.
If something is smooth, that is, it doesn't wiggle around very much, then most of the information
can be retained by only looking at the first few cosine modes. Think about Taylor series: smooth
functions can be approximated locally pretty well by low order Taylor polynomials. The idea is
very similar with a cosine transform, except instead of using increasingly higher order (wigglier)
polynomial terms we're using increasingly higher frequency (wigglier) cosine functions to represent
a given set of data.
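There are several ways to define the DCT matrix; for this assignment use:
\[
C_{i,j} =
\begin{cases}
\dfrac{1}{\sqrt{n}}, & \text{for } i = 1 \\[1.5ex]
\sqrt{\dfrac{2}{n}} \cos\left( \dfrac{(2(j-1)+1)(i-1)\pi}{2n} \right), & \text{for } i > 1
\end{cases}
\]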
Where C is an n × n matrix, and i and j represent the row and column index respectively. There
are several ways to construct this matrix; the easiest way is to use nested for loops. If you get
stuck, use the following skeleton code to get started.
C = zeros(n,n); % initialize C
C(1,:) = % fill in first row
for i = 2:n
    for j = 1:n
        C(i,j) = % fill in rest of matrix
    end
end
To apply this transform matrix to a vector, just multiply. So if y is the transformed version of
x, we would obtain it by doing y = Cx. Since C is square, the 1-D DCT is an operation that
takes in a vector of length n and returns another vector of length n. For 1-D data, the output is
a vector containing weights for the different frequency components; the higher the weight, the more
important that frequency component.
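To see the weights behave as described, the sketch below applies the 1-D transform to one smooth and one jagged vector of length 8. It is only one possible construction: C is filled in with meshgrid rather than nested loops, and the test vectors and the names low_smooth and low_jagged are made up for this illustration.

```matlab
n = 8;
[col, row] = meshgrid(1:n, 1:n);   % row(i,j) = i and col(i,j) = j
C = sqrt(2/n) * cos((2*(col-1) + 1) .* (row-1) * pi / (2*n));
C(1, :) = 1/sqrt(n);               % first row uses the separate i = 1 formula

x_smooth = linspace(0, 1, n)';     % slowly varying ramp
x_jagged = (-1).^(1:n)';           % alternates between -1 and 1

y_smooth = C * x_smooth;           % y = Cx for each test vector
y_jagged = C * x_jagged;

% Fraction of the squared weight carried by the two lowest frequencies
low_smooth = sum(y_smooth(1:2).^2) / sum(y_smooth.^2);
low_jagged = sum(y_jagged(1:2).^2) / sum(y_jagged.^2);
fprintf('smooth: %.3f   jagged: %.3f\n', low_smooth, low_jagged);
```

For the ramp nearly all of the squared weight sits in the first two entries of y, while for the alternating vector almost none does.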
This is just the 1-D DCT matrix. To transform our image data, we will need the 2-D transform.
Thankfully it's very easy to compute the 2-D DCT using the 1-D matrix. Let $X_g$ be the
grayscale version of the image data (see footnote 1); then the 2-D DCT for the image $X_g$ is:
\[
Y = C X_g C^{-1}
\]
The DCT matrix has the special property that its inverse is the same as its transpose. So for our
DCT matrix C,
\[
C^{-1} = C^T
\]
This is useful, since transposes are significantly easier to compute than inverses. Now we can define
the 2-D transformed image as:
\[
Y = C X_g C^T
\]
Intuitively, you can think of $C X_g$ as applying the 1-D DCT to the columns of $X_g$, and
$X_g C^T$ as applying the 1-D DCT to the rows of $X_g$. So $C X_g C^T$ applies the 1-D transform
to both the rows and the columns of $X_g$.
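A quick numerical check of the orthogonality property and of the 2-D transform, sketched for n = 8. The matrix C is built with meshgrid here, which is only one possible construction, and X_flat is a hypothetical perfectly uniform gray image rather than real photo data.

```matlab
n = 8;
[col, row] = meshgrid(1:n, 1:n);   % row(i,j) = i and col(i,j) = j
C = sqrt(2/n) * cos((2*(col-1) + 1) .* (row-1) * pi / (2*n));
C(1, :) = 1/sqrt(n);

% If the inverse of C equals its transpose, C*C' should be the identity
orth_err = norm(C*C' - eye(n));

% A perfectly uniform (hypothetical) gray image: every pixel equals 100
X_flat = 100 * ones(n);
Y_flat = C * X_flat * C';          % 2-D DCT of the uniform image

% A uniform image has no variation, so only the lowest-frequency
% coefficient Y_flat(1,1) should differ noticeably from zero
rest = Y_flat;
rest(1, 1) = 0;
fprintf('orthogonality error: %.2e\n', orth_err);
fprintf('Y(1,1) = %.4f, largest other |Y| = %.2e\n', Y_flat(1,1), max(abs(rest(:))));
```

The uniform image is the extreme case of a smooth image: all of its weight lands in the upper left corner of Y.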


Figure 1: DCT coefficients for an image (values in the matrix Y). Values
in the upper left are weights on low frequency cosine components while
values in the lower right are weights on high frequency cosine components.
Since the values in the upper left are significantly larger than those in the
lower right, we can see that low frequencies dominate the overall image.
Footnote 1: $X_g$ is a matrix whose (i, j) entry represents the grayscale level at pixel position
(i, j). In our case, the values range from 0 to 255, with 0 being black and 255 being white.
4.2 Compression
JPEG is a type of lossy compression, which means that the compressed file contains less
information than the original. Since human eyes are better at seeing lower frequency components, we
can afford to toss out the highest frequency components of our transformed image. The more uniform
an image, the more data we can throw away without causing a noticeable loss in quality. More
complicated images can still be compressed, but heavy compression is more noticeable. Thankfully
the DCT can sort out which components of the image are represented by low frequencies, which are
the ones we need to keep.
The information corresponding to the highest frequencies is stored in the lower right of the
transformed matrix, while the lowest is stored in the upper left. Therefore, we want to save data in
the upper left, and delete data in the lower right. The following code will zero out the matrix below
the off diagonal (the diagonal running from the lower left to the upper right).
p = 0.5;
% when p=0, no data are saved
% when p=1, all data are saved
for i = 1:n
    for j = 1:n
        if i+j > p*2*n
            Y(i,j) = 0;
        end
    end
end
Adjusting the value of p moves the diagonal up and down the matrix, affecting how much data are
retained. This illustration shows how the off diagonal moves with changes in p.
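The effect of p can also be seen by counting how many entries survive. The sketch below uses an 8 × 8 matrix of ones (Y_full) as a hypothetical stand-in for a coefficient matrix and replaces the nested loops with an equivalent logical mask; the names p_values and kept are illustrative.

```matlab
n = 8;
Y_full = ones(n);                     % hypothetical stand-in for a coefficient matrix
[col, row] = meshgrid(1:n, 1:n);      % row(i,j) = i and col(i,j) = j

p_values = [0.25 0.5 0.75 1];
kept = zeros(size(p_values));
for k = 1:numel(p_values)
    p = p_values(k);
    Y = Y_full;
    Y(row + col > p*2*n) = 0;         % same condition as in the nested loops
    kept(k) = nnz(Y);                 % number of coefficients retained
end

% First row: values of p, second row: number of entries kept out of n^2
disp([p_values; kept]);
```

Because the retained region is a triangle, the number of kept entries does not grow linearly with p.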
After deleting the high frequency data, the inverse 2-D DCT must be applied to return the
transformed image back to normal space (right now it will look nothing like the original photograph).
Since none of the zeros need to be stored, this process could allow for a significant reduction in file
size.
5 Tasks
1. Write a MATLAB function to read in the file photo2.jpg and store it as a matrix of doubles.
Convert the color array into a grayscale matrix formed by the linear combination of 30% red,
59% green, and 11% blue. Include this grayscale image in your writeup.
2. Perform a horizontal shift of 128 pixels to photo2.jpg.
3. How could you perform a horizontal and vertical shift? That is, what matrix operations
would need to be applied to get an image to wrap around both horizontally and vertically?
Apply transformations to the original matrix that result in both a horizontal and vertical
shift. Again, shift 128 pixels in each direction.
4. Using what you learned about transformation matrices, determine what matrix would be
required to flip an image upside down. Using that transformation, flip photo2.jpg upside
down.
5. What should transposing an image matrix do? Try it. Does it look the way you expected?
6. Using your own DCT matrix code, make a plot of the determinant (see footnote 2) of C as a
function of n for n from 1 to 32 (you will need to create 32 different DCT matrices). Do you notice
anything interesting about the relationship? What can you say about the rank of C?
7. Given the dimensions of our DCT matrix, what restrictions must we impose on the aspect
ratio of images we wish to transform?
8. Determine what steps need to be taken to undo the 2-D DCT. Remember that our DCT is
defined by $Y = C X_g C^T$, and also the special property of C. You can easily check to see if
your inverse transform works by applying it to Y and viewing it with imagesc.
9. Perform our simplified JPEG-type compression on the image of Albus the cat, photo2.jpg:
   - Read an image into MATLAB and store as a matrix of doubles
   - Convert the 3-D RGB matrix to a 2-D grayscale matrix
   - Perform the 2-D discrete cosine transform on the grayscale image data
   - Delete some of the less important values in the transformed matrix using the included
     algorithm
   - Perform the inverse discrete cosine transform
   - View the image or write to a file.
Compress the image with several different values of p. Include sample images for compression
values that don't cause an obvious drop in quality, as well as some that do. (Something like
the 4 images at the top of this project writeup)
10. You should be able to make p pretty small before noticing a significant loss in quality. Explain
why you think this might be the case. The point of image compression is to reduce storage
requirements; compare the number of non-zero entries in the transformed image matrix (Y,
not $X_g$) to a qualitative impression of compressed image quality. What value of p do you
think provides a good balance? (no correct answer, just explain)
UPDATE: Include all of your code in an appendix. Make sure to comment your code clearly, so
it's very obvious which problem the code was for. Output not supported by code in the appendix
will not be counted.
Footnote 2: You may use MATLAB's det command.
6 Using Mathematica
You can do all of the tasks in this project using Mathematica as well. You can use the functions
Import and Export to read and write image files. Table will allow you to construct the DCT
matrix. The documentation for FourierDCT has an example of reading in an image and storing as
an array. However, you may not use FourierDCT; you must construct your own DCT matrix using
the definition listed previously. Do not copy code from the Mathematica documentation or other
sources online, as that would be plagiarism.