Computer Vision
Syllabus:
Imaging Concepts: The Pinhole Camera Model, Camera Calibration, Digital Images - Raster
& Vector Images. Photometry, Shading and Illumination Models – Ambient, Diffuse &
Specular. Anatomy of Human Vision. Color Perception – Trichromatic Color Models and
Visible Light Spectrum.
Image Processing in Spatial and Frequency Domain. Image Filters. Convolution and Linear
Filters. Histogram, Color and Pixel level operations. Thresholding. Non-Linear and
Morphological Operations. Geometrical Operations. Gaussian, Wavelets and Gabor Filters.
Computer Vision Concepts. Introduction to Vision Problems – Categorization, Classification,
Segmentation, Feature Detection, Object Recognition & Tracking, Labelling and Annotation,
Image Registration, Restoration, and Content Based Retrieval. Stereo Vision.
Basics of Pattern Recognition. Image Features – Local vs Holistic. Feature Extraction –
Template Based, HoG, Haar, Harris, LoG, DoG, SIFT, Bag of Words. Dimensionality
Reduction - PCA, LDA, ICA. Challenges in Image Classification. Introduction to Precision,
Recall and Confusion Matrix.
Learning filters through Supervised Learning. Application of CNNs, RNNs and LSTM in
Computer Vision. Case Study of AlexNet, GoogLeNet, ResNet, YOLO. Generative
Adversarial Networks & their applications. Introduction to OpenCV, Keras and PyTorch.
References:
1. Digital Image Processing, Rafael C. Gonzalez (Ref:
https://dl.icdst.org/pdfs/files4/01c56e081202b62bd7d3b4f8545775fb.pdf)
2. Computer Vision: Algorithms and Applications, 2nd ed. (Ref:
https://szeliski.org/Book/download.php)
3. OpenCV-Python Tutorials (Ref:
https://docs.opencv.org/4.x/d6/d00/tutorial_py_root.html)
Laboratory Experiments:
Use either the OpenCV or the Scikit-Image library, in C++ or Python, to perform the
following experiments:
1. WAP to open the webcam, capture an image, print the number of pixels in the image
and save the image in JPG and PNG format.
2. WAP to open a JPG image file and perform the following operations on the image
without using the corresponding library functions:
a. Scaling
b. Rotation
c. Flipping
3. WAP to build a GUI to adjust the following parameters of a webcam image using the
corresponding library functions:
a. Contrast
b. Brightness
c. Sharpness
d. Hue
e. Saturation
4. WAP to perform high-pass and low-pass filtering (in the Fourier domain) on a
JPG image file and save the results.
5. WAP to plot the histogram of an input image before and after performing histogram
equalization.
6. WAP to open any medical image file (MRI/CT/PET scan) and perform edge detection
on it.
7. WAP to identify a shape (e.g. circle, line, square, rectangle) in an input image.
8. WAP to add a watermark to an input image and later remove it using inpainting.
9. Mini Project 1: Take an image of any printed text (e.g. from a book) as input.
Output each individual character as a separate image. Save the characters of each
row of the input text image in a separately numbered folder.
10. Mini Project 2: Use any inbuilt feature extractor for any real-world object recognition
task (e.g. face recognition). You may use any suitable ML classifier for the task.
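Illustrative sketch for Experiment 2: one possible way to implement flipping, scaling and rotation with plain NumPy indexing instead of the library's own resize/rotate/flip routines. The function names (flip, scale_nearest, rotate_nearest), the nearest-neighbour sampling and the fixed output canvas for rotation are assumptions made for this example, not requirements of the experiment. Reading and writing the JPG file would still be done with the chosen library and is not shown.

```python
import numpy as np


def flip(img, horizontal=True):
    """Mirror the image by reversing the column (or row) order."""
    return img[:, ::-1] if horizontal else img[::-1, :]


def scale_nearest(img, fx, fy):
    """Resize by factors fx (width) and fy (height), nearest-neighbour."""
    h, w = img.shape[:2]
    new_h = max(1, int(round(h * fy)))
    new_w = max(1, int(round(w * fx)))
    # Inverse mapping: for every output pixel, find the source pixel
    # it comes from, clamped to the image bounds.
    rows = np.minimum((np.arange(new_h) / fy).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / fx).astype(int), w - 1)
    return img[rows[:, None], cols[None, :]]


def rotate_nearest(img, angle_deg):
    """Rotate about the image centre, keeping the original canvas size.

    With the usual row-down image layout, a positive angle appears
    clockwise on screen; negate the angle for the other direction.
    Pixels that map outside the source image are left black.
    """
    h, w = img.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    x0, y0 = xs - cx, ys - cy
    # Inverse mapping: rotate each output coordinate back into the source.
    src_x = np.cos(theta) * x0 + np.sin(theta) * y0 + cx
    src_y = -np.sin(theta) * x0 + np.cos(theta) * y0 + cy
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out


if __name__ == "__main__":
    # Small synthetic image so the sketch runs without any input file.
    demo = np.arange(12, dtype=np.uint8).reshape(3, 4)
    print(flip(demo))
    print(scale_nearest(demo, 2, 2).shape)
    print(rotate_nearest(demo, 180))
```

Nearest-neighbour sampling keeps the code short but gives blocky results; bilinear interpolation and an enlarged output canvas for rotation are natural refinements.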
Submission Instructions:
Create a public GitHub repository named <ROLLNO>-<Name>-<CVLAB>-2025. For each
experiment, create a new folder containing the source code files, a sample input and
the corresponding output. Submit the repository URL through the form by 10-01-2025
at the latest:
https://forms.gle/59VDphb8ExnZpJn38
Submission Deadlines:
Exp. 1 & 2: 18 Jan 2025
Exp. 3 & 4: 30 Jan 2025
Exp. 5 & 6: 15 Feb 2025
Exp. 7 & 8: 28 Feb 2025
Mini Project 1: 20 Mar 2025
Mini Project 2: 15 Apr 2025