
Fundamentals of Computer Vision (Undergrad) - B. Nasihatkon
Spring 2025

Lab Instructions - session 8


Geometric Image Transformations

Translation
We want to translate the image by the vector 𝑡 = (𝑡_𝑥, 𝑡_𝑦). Translation is a special case of an
affine transformation 𝑥' = 𝐴𝑥 + 𝑡 in which 𝐴 is the 2 by 2 identity matrix 𝐼. The 2 by 3 matrix
𝑀 = [𝐴 𝑡] can represent an affine transformation. Thus, a translation is represented by the
2 by 3 matrix 𝑀 = [𝐼 𝑡]. Then we can translate the image by applying the affine transformation 𝑀
using the cv2.warpAffine function. Run the following script to translate an image by the vector (tx, ty).

File: translation.py
import cv2
import numpy as np

I = cv2.imread('karimi.jpg')

# translations in x and y directions
tx = 100
ty = 40

# use an affine transformation matrix (2x3)
M = np.array([[1, 0, tx],
              [0, 1, ty]]).astype(np.float32)

output_size = (I.shape[1], I.shape[0])  # output image size (width, height)
#output_size = (I.shape[1]+200, I.shape[0]+200)

J = cv2.warpAffine(I, M, output_size)

cv2.imshow('I', I)
cv2.waitKey(0)

cv2.imshow('J', J)
cv2.waitKey(0)

#! use a homography transformation matrix (3x3)
#H = np.array([[1, 0, tx],
#              [0, 1, ty],
#              [0, 0, 1]]).astype(np.float32)
#K = cv2.warpPerspective(I, H, output_size)
#cv2.imshow('K', K)
#cv2.waitKey(0)

cv2.destroyAllWindows()

● Change tx and ty and see the result. Set one or both of them to a negative value.
● We have chosen the output image size (output_size) as the original image size.
Change the output image size (e.g. to the one commented out in the code:
#output_size = ...) and see the result.

You can also apply translation using a 3 by 3 homography matrix given to the
cv2.warpPerspective function. Uncomment the following part in the code and display
the resulting image K.
#! use a homography transformation matrix (3x3)
H = np.array([[1, 0, tx],
[0, 1, ty],
[0, 0, 1]]).astype(np.float32)
K = cv2.warpPerspective(I,H, output_size)
cv2.imshow('K',K)

● Notice that the 3 by 3 matrix H is the matrix M plus an extra row [0 0 1]. Compare
the images K and J and observe that they are identical.
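
The relation between M and H can also be checked on individual points, without warping an image. The following is a minimal sketch using only NumPy; the sample points in pts are arbitrary values chosen for illustration. It applies both matrices to the same points and confirms that the results agree, because the last row [0 0 1] keeps the homogeneous coordinate equal to 1.

```python
import numpy as np

tx, ty = 100, 40

# 2x3 affine matrix and its 3x3 homography counterpart
M = np.array([[1, 0, tx],
              [0, 1, ty]], dtype=np.float32)
H = np.vstack([M, [0, 0, 1]]).astype(np.float32)

# a few illustrative pixel locations (x, y), one per column
pts = np.array([[0, 10, 250],
                [0, 20, 120]], dtype=np.float32)

# homogeneous coordinates: append a row of ones
pts_h = np.vstack([pts, np.ones((1, pts.shape[1]), dtype=np.float32)])

affine_out = M @ pts_h                 # 2xN result of the affine map
proj = H @ pts_h                       # 3xN homogeneous result
homography_out = proj[:2] / proj[2]    # divide by the last coordinate

print(affine_out)
print(homography_out)
print(np.allclose(affine_out, homography_out))
```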

Euclidean (Rigid) transformation


The following code rotates the image by an angle th around the origin (pixel location
(0,0)). For a rotation about the origin, the 2 by 3 affine transformation matrix is
𝑀 = [𝑅 0], where 𝑅 is the 2 by 2 rotation matrix. We can also add a translation vector, in
which case 𝑀 = [𝑅 𝑡].

File: rigid.py
import cv2
import numpy as np

I = cv2.imread('karimi.jpg', 0)

tx = 0
ty = 0

th = 20             # angle of rotation (degrees)
th *= np.pi / 180   # convert to radians

M = np.array([[np.cos(th), -np.sin(th), tx],
              [np.sin(th),  np.cos(th), ty]])

J = cv2.warpAffine(I, M, (I.shape[1], I.shape[0]))

cv2.imshow('I', I)
cv2.waitKey(0)

cv2.imshow('J', J)
cv2.waitKey(0)

● Why have we converted the rotation angle to radians?
● Change the translation vector elements tx and ty, and see the result.
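
The name "rigid" reflects that such a transformation preserves distances between points. A small NumPy-only sketch of this property follows; the translation and the two sample points are arbitrary illustrative values, not taken from the lab files.

```python
import numpy as np

th = np.deg2rad(20)          # same value as 20 * np.pi / 180
tx, ty = 30, -15             # illustrative translation

R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
t = np.array([[tx], [ty]], dtype=float)

# two illustrative points, stored as columns
x = np.array([[10.0, 200.0],
              [40.0,  90.0]])

y = R @ x + t                # x' = R x + t applied to both points

dist_before = np.linalg.norm(x[:, 0] - x[:, 1])
dist_after = np.linalg.norm(y[:, 0] - y[:, 1])

print(np.allclose(R.T @ R, np.eye(2)))       # R is orthogonal
print(np.isclose(np.linalg.det(R), 1.0))     # and has determinant 1
print(np.isclose(dist_before, dist_after))   # distances are unchanged
```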

Task 1:
In the above example, we saw how to rotate around the origin (pixel (0,0)). We can also
rotate around an arbitrary point 𝑐 = (𝑐_𝑥, 𝑐_𝑦) by adding a proper translation vector. This
can be done by translating every point by −𝑐 (so that 𝑐 moves to the
origin), rotating around the origin, and translating back by +𝑐.
The transformation then becomes 𝑥' = 𝑅(𝑥 − 𝑐) + 𝑐 = 𝑅𝑥 + (𝑐 − 𝑅𝑐).
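
The identity above can be checked numerically on a single point. This is only a sketch: the center c and the point x below are hypothetical values, and no image is involved. It confirms that both forms of the transformation agree and that c itself stays fixed.

```python
import numpy as np

th = np.deg2rad(30)
R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])

c = np.array([[320.0], [240.0]])   # hypothetical center (cx, cy)
x = np.array([[400.0], [240.0]])   # an arbitrary point

lhs = R @ (x - c) + c              # translate, rotate, translate back
rhs = R @ x + (c - R @ c)          # one rotation plus one translation

print(np.allclose(lhs, rhs))
print(np.allclose(R @ c + (c - R @ c), c))   # c maps to itself
```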

The following Python code keeps rotating the image I around the origin (0,0). Run the
code and see the result. You need to change the code so the image is rotated about its
center c. (c has been computed in the code). You are not allowed to use the
cv2.getRotationMatrix2D function.

File: task1.py
import cv2
import numpy as np

I = cv2.imread('karimi.jpg', 0)

# center of the image
c = np.array([[I.shape[1]/2.0], [I.shape[0]/2.0]])

for theta in range(0, 360):
    th = theta * np.pi / 180  # convert to radians

    R = np.array([[np.cos(th), -np.sin(th)],
                  [np.sin(th),  np.cos(th)]])

    t = np.zeros((2, 1))  # you need to change this!

    # concatenate R and t to create the 2x3 transformation matrix
    M = np.hstack([R, t])

    J = cv2.warpAffine(I, M, (I.shape[1], I.shape[0]))

    cv2.imshow('J', J)

    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

Task 2: Similarity transform


A similarity transformation consists of translation, rotation and global scaling. The
transformation matrix is 𝑀 = [𝑠𝑅 𝑡]. Start with the code from rigid.py, and modify it
by introducing a scaling parameter s that multiplies the rotation matrix R.

● Set the scale factor to a number > 1 (e.g., s = 2) and see the result.
● How can we change the size of the output image accordingly?
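
What a similarity transformation does to distances and angles can be examined on a few points. The following NumPy-only sketch uses arbitrary illustrative points and a hypothetical helper angle_at_first; it builds 𝑀 = [𝑠𝑅 𝑡], measures how one edge length changes, and compares the angle at a vertex before and after the mapping.

```python
import numpy as np

s = 2.0
th = np.deg2rad(20)
R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
t = np.array([[5.0], [10.0]])      # illustrative translation
M = np.hstack([s * R, t])          # 2x3 similarity matrix [sR t]

# three illustrative points (columns) in homogeneous coordinates
x = np.array([[0.0, 100.0,  0.0],
              [0.0,   0.0, 50.0],
              [1.0,   1.0,  1.0]])
y = M @ x                          # 2x3 array of mapped points

def angle_at_first(p):
    """Angle at vertex p[:, 0] between the edges to p[:, 1] and p[:, 2]."""
    u = p[:, 1] - p[:, 0]
    v = p[:, 2] - p[:, 0]
    return np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# ratio of an edge length after and before the transformation
ratio = np.linalg.norm(y[:, 1] - y[:, 0]) / np.linalg.norm(x[:2, 1] - x[:2, 0])

print(ratio)
print(np.isclose(angle_at_first(x[:2]), angle_at_first(y)))
```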

Affine transformation
The affine transformation has the form 𝑥' = 𝐴𝑥 + 𝑡, where 𝐴 is an arbitrary 2 by 2
matrix. Thus, 𝑀 = [𝐴 𝑡].

File: affine.py
import cv2
import numpy as np

I = cv2.imread('karimi.jpg')

t = np.array([[30],
              [160]], dtype=np.float32)
A = np.array([[0.7, 0.8],
              [-0.3, 0.6]], dtype=np.float32)

M = np.hstack([A, t])

output_size = (I.shape[1], I.shape[0])
J = cv2.warpAffine(I, M, output_size)

cv2.imshow('I', I)
cv2.waitKey(0)

cv2.imshow('J', J)
cv2.waitKey(0)

● Notice that the parallel lines remain parallel.
● Change the elements of matrix A and see the results.
● What happens when matrix A is diagonal?
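
The observation that parallel lines remain parallel can be verified on points. Below is a minimal NumPy sketch using the same A and t as affine.py; the two segments and the helper name affine are illustrative assumptions. Both mapped segments end up with the same direction, so their 2D cross product is zero.

```python
import numpy as np

A = np.array([[0.7, 0.8],
              [-0.3, 0.6]])
t = np.array([[30.0], [160.0]])

def affine(p):
    """Apply x' = A x + t to points stored as columns of p."""
    return A @ p + t

# two parallel segments: same direction (1, 2), different start points
d = np.array([[1.0], [2.0]])
a0 = np.array([[0.0], [0.0]])
b0 = np.array([[50.0], [10.0]])

dir_a = affine(a0 + d) - affine(a0)   # direction of the first mapped segment
dir_b = affine(b0 + d) - affine(b0)   # direction of the second mapped segment

# the 2D cross product is zero exactly when the directions are parallel
cross = dir_a[0, 0] * dir_b[1, 0] - dir_a[1, 0] * dir_b[0, 0]

print(dir_a.ravel(), dir_b.ravel())
print(np.isclose(cross, 0.0))
```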

Perspective transformation (Homography)


A perspective transformation (projective transformation or homography) can be
represented by a 3 by 3 matrix H. Look at the following example. It adds some
perspective to the above affine transformation:

File: perspective.py
import cv2
import numpy as np

I = cv2.imread('karimi.jpg')

t = np.array([[30],
              [160]], dtype=np.float32)
A = np.array([[0.7, 0.8],
              [-0.3, 0.6]], dtype=np.float32)

M = np.hstack([A, t])

# perspective effect
p = np.array([[0.001, 0.002, 1]])

H = np.vstack([M,
               p])

output_size = (I.shape[1], I.shape[0])
J = cv2.warpPerspective(I, H, output_size)

cv2.imshow('I', I)
cv2.waitKey(0)

cv2.imshow('J', J)
cv2.waitKey(0)

● Change the first two entries of p (p[0,0] and p[0,1]) and see what happens. Set them to 0 or
negative values.
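
The effect of the last row can also be seen on points. The sketch below uses only NumPy, with the same numbers as perspective.py; the two segments and the helper names apply_homography and direction_cross are illustrative assumptions. It maps two parallel segments with H and shows that they stop being parallel, while zeroing the first two entries of the last row restores the affine behaviour.

```python
import numpy as np

H = np.array([[0.7,   0.8,    30.0],
              [-0.3,  0.6,   160.0],
              [0.001, 0.002,   1.0]])

def apply_homography(H, pts):
    """Map 2xN points with H: lift to homogeneous, multiply, divide by w."""
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])
    out = H @ pts_h
    return out[:2] / out[2]

# end points (columns) of two parallel horizontal segments
seg_a = np.array([[0.0, 200.0], [0.0, 0.0]])
seg_b = np.array([[0.0, 200.0], [100.0, 100.0]])

def direction_cross(H):
    """2D cross product of the two mapped segment directions."""
    a = apply_homography(H, seg_a)
    b = apply_homography(H, seg_b)
    da = a[:, 1] - a[:, 0]
    db = b[:, 1] - b[:, 0]
    return da[0] * db[1] - da[1] * db[0]

H_affine = H.copy()
H_affine[2, :2] = 0.0             # last row becomes [0 0 1]

print(direction_cross(H))         # nonzero: no longer parallel
print(np.isclose(direction_cross(H_affine), 0.0))
```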

Estimating a homography transformation from point correspondences
If we have a set of 2D points 𝑥_1, 𝑥_2, ..., 𝑥_𝑚 in one image and a set of corresponding
points 𝑦_1, 𝑦_2, ..., 𝑦_𝑚 in the second image, we can estimate a transformation which maps
each point 𝑥_𝑖 to its corresponding point 𝑦_𝑖 (or sometimes a point close to 𝑦_𝑖).

In the next example, we have two photographs of a painting taken from different views.
Thus, the relation between them is a homography.

We want to map the first image to the second. We have found four pairs of
corresponding points (corners of the frame) in both images. Using these four point
correspondences, we can estimate a perspective transformation matrix H using the
function cv2.getPerspectiveTransform. We then apply the transformation to the first
image. Run the following file and see the results.

File: compute_perspective.py
import cv2
import numpy as np

I1 = cv2.imread('farshchian1.jpg')
I2 = cv2.imread('farshchian2.jpg')

points1 = np.array([(82, 14),
                    (242, 17),
                    (241, 207),
                    (81, 206)]).astype(np.float32)

points2 = np.array([(46, 75),
                    (196, 61),
                    (220, 227),
                    (76, 251)]).astype(np.float32)

for i in range(4):
    # cv2.circle expects integer pixel coordinates
    cv2.circle(I1, (int(points1[i,0]), int(points1[i,1])), 3, [0,0,255], 2)
    cv2.circle(I2, (int(points2[i,0]), int(points2[i,1])), 3, [0,0,255], 2)

# compute homography from point correspondences
H = cv2.getPerspectiveTransform(points1, points2)

output_size = (I2.shape[1], I2.shape[0])
J = cv2.warpPerspective(I1, H, output_size)

cv2.imshow('I1', I1)
cv2.waitKey(0)

cv2.imshow('I2', I2)
cv2.waitKey(0)

cv2.imshow('J', J)
cv2.waitKey(0)

● Can you think of an application for this, considering that I1 has better quality
than I2?
● How can you transform I2 to I1?
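
To see why four correspondences are enough, the estimation step can be sketched without OpenCV. The code below is one standard way to do it (a direct linear solve with the last entry of H fixed to 1), written with NumPy only; it is not claimed to be how cv2.getPerspectiveTransform is implemented, and the function name homography_from_4_points is hypothetical. Each correspondence contributes two linear equations, so four of them determine the eight unknowns.

```python
import numpy as np

points1 = np.array([(82, 14), (242, 17), (241, 207), (81, 206)], dtype=np.float64)
points2 = np.array([(46, 75), (196, 61), (220, 227), (76, 251)], dtype=np.float64)

def homography_from_4_points(src, dst):
    """Solve the 8 linear equations for H, with H[2, 2] fixed to 1."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), rearranged
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # v = (h21 x + h22 y + h23) / (h31 x + h32 y + 1), rearranged
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)

H = homography_from_4_points(points1, points2)

# check: every point of points1 should land on its partner in points2
src_h = np.hstack([points1, np.ones((4, 1))]).T   # 3x4 homogeneous points
mapped = H @ src_h
mapped = (mapped[:2] / mapped[2]).T               # back to 4x2 pixel coordinates

print(H)
print(np.allclose(mapped, points2))
```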

Task 3: Perspective Correction


Look at the following traffic sign. We want to correct the perspective and extract the sign
plate as if we are looking at it from the front. The transformed image J must contain exactly
the sign plate (and no other parts of the image). We have already found the coordinates of
the four corners of the sign plate and stored them in the array points1. Complete the task
by changing the following code. You need to find the proper transformation matrix H and
apply it to the image.

File: task3.py
import numpy as np
import cv2

I = cv2.imread('sign.jpg')

p1 = (135, 105)
p2 = (331, 143)
p3 = (356, 292)
p4 = (136, 290)

points1 = np.array([p1, p2, p3, p4], dtype=np.float32)

n = 480
m = 320
output_size = (n, m)

J = np.zeros((m, n))  # delete this!!

# mark corners of the plate in image I
for i in range(4):
    # cv2.circle expects integer pixel coordinates
    cv2.circle(I, (int(points1[i,0]), int(points1[i,1])), 5, [0,0,255], 2)

cv2.imshow('I', I)
cv2.waitKey(0)

cv2.imshow('J', J)
cv2.waitKey()

Task 4: Masking
In this task, you will use a homography transformation to replace the license plate with the
KNTU logo. This simulates real-world applications such as anonymizing license plates in
online vehicle ads. You should:

● Take as input the original image, a set of four corner points specifying the region to
replace, and the cover image (kntu.jpg).
● Compute a homography matrix from the cover image corners to the destination region.
● Warp and overlay the cover image onto the target using binary masking.
● Return or display the final blended image.

File: task4.py
import numpy as np
import cv2

# Load target and logo image
target_image = cv2.imread('car.jpg')
logo_image = cv2.imread('kntu.jpg')

# Define the destination points in the target image
destination_points = np.array([
    (281.85645, 325.7745),
    (478.53232, 329.53046),
    (477.8494, 374.26056),
    (282.8808, 369.8217)
], dtype=np.float32)

# TODO: Define the source points (the corners of the logo image)
# using image width and height
h, w = None, None
source_points = None

# TODO: Compute homography H that maps 'source_points' to
# 'destination_points' using cv2.getPerspectiveTransform
H = None

# TODO: Warp the logo image with the computed homography using
# cv2.warpPerspective
output_size = (target_image.shape[1], target_image.shape[0])
warped_source = None

# Create a binary mask from the warped logo
gray_warped_source = cv2.cvtColor(warped_source, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(gray_warped_source, 1, 255,
                          cv2.THRESH_BINARY)
mask_inv = cv2.bitwise_not(mask)

# Extract background and foreground using masks
target_bg = cv2.bitwise_and(target_image, target_image, mask=mask_inv)
source_fg = cv2.bitwise_and(warped_source, warped_source, mask=mask)

# Blend the warped logo with the background
result = cv2.add(target_bg, source_fg)

# Display the original source image
cv2.imshow('Source Image (Logo)', logo_image)
cv2.waitKey(0)

# Display the original target image
cv2.imshow('Target Image', target_image)
cv2.waitKey(0)

# Show the warped texture alone
cv2.imshow('Warped Source (Intermediate)', warped_source)
cv2.waitKey(0)

# Display the final image with the texture mapped
cv2.imshow('Result (Texture Mapped)', result)
cv2.waitKey(0)

● Why do we threshold the warped logo to create a binary mask?
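
The compositing step in task4.py can be illustrated on tiny arrays. The following NumPy-only sketch uses made-up pixel values and a simplified mask (any nonzero channel counts as foreground) as a stand-in for the cvtColor and threshold calls; it is not a replacement for the code you need to write in the task.

```python
import numpy as np

# tiny stand-ins for the images: 2x3 pixels, 3 channels (values are made up)
target = np.full((2, 3, 3), 200, dtype=np.uint8)   # "background" image
warped = np.zeros((2, 3, 3), dtype=np.uint8)       # warped logo, black elsewhere
warped[0, 1] = (10, 20, 30)                        # two "logo" pixels
warped[1, 2] = (40, 50, 60)

# binary mask: True where the warped image has any nonzero channel
mask = warped.max(axis=2) >= 1

# keep target pixels where the mask is False, logo pixels where it is True
result = np.where(mask[..., None], warped, target)

print(mask.astype(np.uint8))
print(result[0, 1], result[0, 0])
```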



Task 5: Classifying a Geometric Transformation Matrix (3×3 Only)


In this task, you will complete the classify_transformation() function to analyze a 3×3
geometric transformation matrix and print the type of transformation it represents. You should
replace the pass statements with appropriate classification messages such as:

● "Translation"
● "Rigid (Rotation + Translation)"
● "Similarity (Rotation + Uniform Scaling + Translation)"
● "Affine"
● "Projective (Homography)"

Then test the function using different transformation matrices.

File: task5.py
import numpy as np
import cv2

def classify_transformation(matrix):
    if matrix.shape != (3, 3):
        print("Unsupported matrix shape. Only 3x3 matrices are allowed.")
        return

    A = matrix[:2, :2]
    t = matrix[:2, 2:3]
    bottom_row = matrix[2, :]

    ATA = A.T @ A

    print("Transformation type: ", end='')

    # TODO: Print the appropriate Transformation type in the console
    if np.allclose(bottom_row, [0, 0, 1], atol=1e-4):
        if np.allclose(A, np.eye(2)) and not np.allclose(t, 0):
            pass
        elif np.allclose(ATA, np.eye(2), atol=1e-2):
            pass
        elif (np.allclose(ATA[0, 0], ATA[1, 1], atol=1e-2) and
              np.allclose(ATA[0, 1], 0, atol=1e-2)):
            pass
        else:
            pass
    else:
        pass

# Example
M = np.array([
    [2 * np.cos(0.4), -2 * np.sin(0.4), 30],
    [2 * np.sin(0.4),  2 * np.cos(0.4), 40],
    [0, 0, 1]
])
classify_transformation(M)
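
For testing, it helps to have one matrix of each kind at hand. The sketch below builds such a set with NumPy; the helper names rotation and make_3x3, the dictionary test_matrices, and all numeric values are illustrative assumptions. Each matrix can then be passed to your completed classify_transformation().

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for an angle in radians."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def make_3x3(A, t, bottom=(0, 0, 1)):
    """Assemble a 3x3 matrix from a 2x2 block A, a translation t and a bottom row."""
    top = np.hstack([np.asarray(A, dtype=float), np.reshape(t, (2, 1))])
    return np.vstack([top, np.asarray(bottom, dtype=float)])

# one example per transformation family, labeled by how it was constructed
test_matrices = {
    "translation": make_3x3(np.eye(2), [30, 40]),
    "rigid":       make_3x3(rotation(0.4), [30, 40]),
    "similarity":  make_3x3(2 * rotation(0.4), [30, 40]),
    "affine":      make_3x3([[0.7, 0.8], [-0.3, 0.6]], [30, 160]),
    "projective":  make_3x3([[0.7, 0.8], [-0.3, 0.6]], [30, 160],
                            bottom=(0.001, 0.002, 1)),
}

for name, T in test_matrices.items():
    print(name)
    print(T)
```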
