
Unit-2

Color Fundamentals

Color fundamentals refer to the basic principles and concepts that describe how colors are
perceived, represented, and manipulated in various fields, including image processing,
computer graphics, and visual arts. Understanding these fundamentals is crucial for working
with color images and creating visually appealing designs.

1. Color Perception

 Human Vision:
o The human eye perceives color through photoreceptor cells called cones,
which are sensitive to different wavelengths of light.
o There are three types of cones: S-cones (sensitive to short wavelengths, blue
light), M-cones (sensitive to medium wavelengths, green light), and L-cones
(sensitive to long wavelengths, red light).
o The brain processes signals from these cones to create the perception of
various colors.
 Visible Spectrum:
o The visible spectrum is the range of electromagnetic wavelengths that the
human eye can detect, approximately from 380 nm (violet) to 700 nm (red).
o Colors are perceived based on the wavelength of light, with shorter
wavelengths appearing blue and longer wavelengths appearing red.
 Color Attributes:
o Hue: The type of color, determined by the wavelength of light (e.g., red,
green, blue).
o Saturation: The intensity or purity of the color; a fully saturated color contains no added white.
o Brightness (Value or Lightness): The perceived intensity or luminance of the
color; how light or dark the color appears.

2. Color Models

 Color models are mathematical representations of colors that facilitate their manipulation in digital systems, such as computers and cameras.
 RGB Color Model:
o Components: Red, Green, Blue.
o Additive Color Model: Colors are created by combining different intensities
of red, green, and blue light. When all three are combined at full intensity, the
result is white.
o Common Usage: Used in digital displays (monitors, TVs), cameras, and
image processing.
o Color Space: The RGB color space is a 3D space where each axis
corresponds to one of the primary colors (R, G, B).
 CMY/CMYK Color Model:
o Components: Cyan, Magenta, Yellow, (Key/Black in CMYK).
o Subtractive Color Model: Colors are created by subtracting light using inks
or dyes. Cyan, magenta, and yellow inks are combined to absorb (subtract)
different wavelengths of light, producing a wide range of colors. Black (K) is
added in CMYK to enhance depth and contrast.
o Common Usage: Used in printing.
 HSV/HSI/HSB Color Model:
o Components: Hue, Saturation, Value (or Intensity, or Brightness).
o Purpose: Represents colors in a way that is more aligned with human
perception, making it easier to adjust colors in terms of their visual attributes
rather than their RGB components.
o Common Usage: Used in color editing, image processing, and design
applications.
 CIE XYZ Color Model:
o Components: X, Y, Z.
o Purpose: A color space that aims to represent all perceivable colors based on
human vision. It serves as a standard reference for other color spaces.
o Common Usage: Used in color science and for converting between different
color models.
 YUV and YCbCr Color Models:
o Components: Y (Luminance), U (Chrominance), V (Chrominance) in YUV;
Y (Luminance), Cb (Blue Chrominance), Cr (Red Chrominance) in YCbCr.
o Purpose: Separate the luminance (brightness) and chrominance (color)
components, which is useful in video compression and broadcasting.
o Common Usage: Used in television broadcasting and in image and video compression (e.g., JPEG, MPEG).
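
As a concrete illustration, the short sketch below converts an image between several of these models using OpenCV in Python (a minimal sketch; the filename is a placeholder, and note that OpenCV loads images in BGR rather than RGB order):

    import cv2

    img = cv2.imread('photo.jpg')                   # loaded in BGR order
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)      # hue, saturation, value
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)  # luminance + chrominance
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # luminance only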

3. Color Spaces

 Definition: A color space is a specific organization of colors, often defined by a color model and a range of allowable values.
 sRGB:
o A standard RGB color space used in most consumer-grade devices and web
content.
 Adobe RGB:
o A wider color space than sRGB, used in professional photography and
printing.
 ProPhoto RGB:
o An even wider color space, designed for high-quality photographic work.

4. Color Mixing

 Additive Color Mixing:
o Involves combining different colors of light to create new colors.
o Used in devices like screens, where red, green, and blue light are combined.
 Subtractive Color Mixing:
o Involves combining different pigments or dyes that absorb (subtract) light.
o Used in printing, where cyan, magenta, and yellow inks are combined.

5. Color Balance and Correction

 White Balance:
o The process of adjusting the colors in an image to ensure that whites appear
neutral, compensating for the color temperature of the light source.
 Color Correction:
o Adjusting the overall color balance of an image to achieve a desired look or to
correct color casts.
 Gamma Correction:
o Adjusting the luminance of the colors in an image to correct for the nonlinear
response of display devices.
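
As an illustration of gamma correction, here is a minimal look-up-table sketch with NumPy and OpenCV (the gamma value of 2.2 and the filename are illustrative; 2.2 approximates the nonlinearity of typical displays):

    import cv2
    import numpy as np

    gamma = 2.2
    # Map each of the 256 input levels to its gamma-corrected output level.
    lut = np.array([255 * (i / 255) ** (1 / gamma) for i in range(256)],
                   dtype=np.uint8)
    img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)
    corrected = cv2.LUT(img, lut)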

6. Color Vision Deficiency

 Definition: A condition where individuals perceive colors differently due to the absence or malfunction of one or more types of cone cells in the eyes.
 Types:
o Protanopia: Red-blindness, where red cones are absent.
o Deuteranopia: Green-blindness, where green cones are absent.
o Tritanopia: Blue-blindness, where blue cones are absent.

7. Applications of Color Fundamentals

 Digital Imaging: Understanding color models and spaces is crucial for accurate
image capture, editing, and display.
 Computer Vision: Color information is used for object detection, recognition, and
tracking.
 Graphics Design: Effective use of color models and color theory enhances visual
communication.
 Printing: Correct use of subtractive color models ensures accurate reproduction of
colors in print media.

Color fundamentals provide the foundation for working with color in various fields, from
digital imaging to visual arts. They encompass the science of how colors are perceived,
represented, and manipulated in both analog and digital formats. Understanding these
concepts is essential for anyone involved in image processing, computer graphics,
photography, or any field that involves color management and reproduction.

Pseudo color image processing

Pseudo-color image processing is a technique used to enhance visual interpretation of grayscale images by mapping intensity levels to specific colors. This approach assigns colors to different ranges of pixel values, making it easier to identify patterns, structures, and details that might be difficult to discern in a grayscale image.

Key Concepts of Pseudo-Color Image Processing

1. Grayscale Image:
o A grayscale image consists of varying shades of gray, where each pixel's
intensity value ranges from 0 (black) to 255 (white) in an 8-bit image.
o In pseudo-color processing, these grayscale intensity levels are mapped to a
set of colors.
2. Color Mapping:
o Color Look-Up Table (LUT):
 A LUT is a predefined table that associates each intensity value in the
grayscale image with a specific color.
 For example, an intensity value of 0 could be mapped to blue, 128 to
green, and 255 to red.
o Custom Color Maps:
 Users can define custom color maps based on the specific application
or desired visualization effect.
3. Applications of Pseudo-Color:
o Medical Imaging:
 Enhancing features in X-rays, MRIs, or CT scans to better visualize
tissues, tumors, or other structures.
o Remote Sensing:
 Analyzing satellite images to highlight different land covers,
vegetation types, or water bodies.
o Thermography:
 Representing temperature distributions in thermal images, where
different temperatures are assigned different colors.
o Scientific Visualization:
 Visualizing data like elevation maps, heat maps, and other scalar fields
where color mapping helps in interpreting the data.
4. Methods of Pseudo-Color Processing:
o Intensity Slicing:
 The grayscale image is divided into several ranges or "slices" of
intensity values. Each range is assigned a specific color.
 Example: Pixel values from 0-50 could be mapped to blue, 51-100 to
green, 101-150 to yellow, and 151-255 to red (see the code sketch after this list).
o Color Transformation Functions:
 A function is applied to the intensity values to produce corresponding
color values. Common functions include linear and nonlinear
transformations.
 Example: Applying a rainbow color map where low intensities map to
blue, mid-range to green, and high intensities to red.
5. Advantages of Pseudo-Color Processing:
o Enhanced Visualization:
 By adding color, important details and patterns in the data become
more apparent, facilitating easier interpretation.
o Improved Contrast:
 Different regions of an image with similar grayscale intensities can be
distinguished more easily when color is applied.
o Customizability:
 Users can apply different color mappings based on the specific
application, allowing for flexible and targeted visualization.
6. Disadvantages of Pseudo-Color Processing:
o Potential Misinterpretation:
 Incorrect or misleading color mappings can cause confusion or
misinterpretation of the data.
o Loss of Intensity Information:
 In some cases, the original intensity information might be obscured or
lost in the color mapping process.
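
The sketch below illustrates both methods from point 4 in OpenCV: a built-in rainbow-style color map (a color transformation function) and hand-rolled intensity slicing (the thresholds, colors, and filename are illustrative):

    import cv2
    import numpy as np

    gray = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)

    # Color transformation function: apply a predefined rainbow-like LUT.
    pseudo = cv2.applyColorMap(gray, cv2.COLORMAP_JET)

    # Intensity slicing: map ranges of gray levels to fixed BGR colors.
    sliced = np.zeros((*gray.shape, 3), dtype=np.uint8)
    sliced[gray <= 50] = (255, 0, 0)                      # blue
    sliced[(gray > 50) & (gray <= 100)] = (0, 255, 0)     # green
    sliced[(gray > 100) & (gray <= 150)] = (0, 255, 255)  # yellow
    sliced[gray > 150] = (0, 0, 255)                      # red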

Pseudo-color image processing is a powerful tool for enhancing the visual interpretation of
grayscale images by assigning colors to different intensity levels. It is widely used in various
fields such as medical imaging, remote sensing, and scientific visualization, helping to
highlight important features and patterns that might not be easily detectable in grayscale.
While it offers significant advantages in terms of visualization, careful consideration is
required to avoid potential misinterpretation of the data.

Image Processing using Neural Networks

Neural networks, particularly convolutional neural networks (CNNs), are widely used for
image processing tasks such as classification, segmentation, object detection, and
enhancement. Neural networks are designed to learn patterns and features from images,
making them highly effective in solving complex image-related problems.

1. Convolutional Neural Networks (CNNs) in Image Processing

CNNs are the most popular type of neural network used for image processing. They are
designed to automatically detect spatial hierarchies in images and learn representations of
image content through a series of layers.

Key Components of CNNs:

 Convolutional Layers: These layers apply a convolution operation (a filter or kernel) to the
input image to extract local features like edges, textures, and shapes.
 Pooling Layers: These layers reduce the spatial dimensions of the feature maps, making the
network more efficient and reducing the risk of overfitting. Max pooling and average pooling
are common pooling techniques.
 Fully Connected Layers: These layers are typically found at the end of the network and are
used to output the final classification result by combining all the learned features.
 Activation Functions: Commonly used functions like ReLU (Rectified Linear Unit) are applied
after convolution to introduce non-linearity to the model, helping it to capture complex
patterns.
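
To make these components concrete, here is a minimal CNN sketch in Keras (the layer sizes are arbitrary, and the input shape assumes 32×32 RGB images such as CIFAR-10):

    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation='relu',
                      input_shape=(32, 32, 3)),      # convolution + ReLU
        layers.MaxPooling2D((2, 2)),                 # pooling
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),         # fully connected
        layers.Dense(10, activation='softmax'),      # one score per class
    ])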

Applications in Image Processing:

 Image Classification: The CNN learns features from images and assigns them to a specific
class. For example, classifying an image as a cat or a dog.
 Object Detection: CNNs can detect objects within an image, predicting both the object class
and its location using bounding boxes.
 Image Segmentation: In tasks like semantic segmentation, CNNs assign a label to every pixel
in an image, identifying the objects or regions.
 Image Enhancement: Neural networks can be used for image super-resolution, denoising,
and other enhancement tasks by learning the mapping from low-quality to high-quality
images.
2. Training Neural Networks for Image Processing

Training a neural network on image data involves the following steps:

 Dataset Preparation: Large labeled datasets like ImageNet, CIFAR-10, or custom datasets are
prepared, with images resized and augmented to improve generalization.
 Loss Function: The network is trained using a loss function, such as categorical cross-entropy
for classification or mean squared error for regression tasks.
 Optimization: Backpropagation and optimizers like stochastic gradient descent (SGD) or
Adam are used to minimize the loss and adjust the network weights.
 Evaluation: Once trained, the performance of the model is evaluated using metrics such as
accuracy, precision, recall, and the confusion matrix.
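
Continuing the Keras sketch above (it reuses the model defined there; the epoch count and batch size are illustrative), these steps might look like:

    from tensorflow.keras.datasets import cifar10
    from tensorflow.keras.utils import to_categorical

    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0   # normalize to [0, 1]
    y_train, y_test = to_categorical(y_train), to_categorical(y_test)

    model.compile(optimizer='adam',                     # optimization
                  loss='categorical_crossentropy',      # loss function
                  metrics=['accuracy'])                 # evaluation
    model.fit(x_train, y_train, epochs=5, batch_size=64,
              validation_data=(x_test, y_test))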

Performance Metrics for Image Processing with Neural Networks

To evaluate the performance of neural networks on image classification tasks, various metrics
are used. One of the most important tools for performance evaluation is the Confusion
Matrix.

1. Confusion Matrix

A confusion matrix is a table used to describe the performance of a classification model by comparing the predicted labels with the actual labels. It is especially useful for evaluating multi-class classification problems.

Structure of a Confusion Matrix:

 True Positives (TP): The number of instances where the model correctly predicted the
positive class.
 True Negatives (TN): The number of instances where the model correctly predicted the
negative class.
 False Positives (FP): The number of instances where the model incorrectly predicted the
positive class (also known as Type I error).
 False Negatives (FN): The number of instances where the model incorrectly predicted the
negative class (also known as Type II error).

For a binary classification problem, the confusion matrix can be structured as:

                    Predicted Positive    Predicted Negative
Actual Positive     TP                    FN
Actual Negative     FP                    TN

For multi-class classification, the confusion matrix expands to show the results for each class.

2. Performance Metrics Derived from the Confusion Matrix

Several important performance metrics can be calculated using the confusion matrix:
 Accuracy:

Accuracy is the ratio of correctly predicted instances (both positive and negative) to the total number of instances:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)

 Precision:

Precision measures how many of the predicted positive instances are actually positive. It is also called the Positive Predictive Value (PPV):

    Precision = TP / (TP + FP)

 Recall (Sensitivity or True Positive Rate):

Recall measures how many of the actual positive instances were correctly identified by the model:

    Recall = TP / (TP + FN)

 F1-Score:

The F1-score is the harmonic mean of precision and recall. It is useful when there is an imbalance between classes:

    F1 = 2 × (Precision × Recall) / (Precision + Recall)

 Specificity (True Negative Rate):

Specificity measures how many of the actual negative instances were correctly identified:

    Specificity = TN / (TN + FP)
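
All of these metrics can be computed directly from predictions, for example with scikit-learn if it is available (the label vectors here are illustrative):

    from sklearn.metrics import confusion_matrix, classification_report

    y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # actual labels (0 = cat, 1 = dog)
    y_pred = [0, 1, 1, 1, 0, 0, 1, 0]   # model predictions

    print(confusion_matrix(y_true, y_pred))
    print(classification_report(y_true, y_pred, target_names=['cat', 'dog']))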
3. Example: Confusion Matrix for Image Classification

Assume we have a model that classifies images into two classes: "Cats" and "Dogs." The
confusion matrix for this binary classification could look like this:

                Predicted Cats    Predicted Dogs
Actual Cats     50 (TP)           10 (FN)
Actual Dogs     5 (FP)            35 (TN)

From this matrix, taking "Cats" as the positive class, we can calculate the following:

    Accuracy    = (50 + 35) / 100   = 0.85
    Precision   = 50 / (50 + 5)     ≈ 0.91
    Recall      = 50 / (50 + 10)    ≈ 0.83
    F1-Score    = 2 × (0.91 × 0.83) / (0.91 + 0.83) ≈ 0.87
    Specificity = 35 / (35 + 5)     = 0.875

Similar calculations can be done for the "Dogs" class.

4. Importance of the Confusion Matrix

The confusion matrix helps provide a detailed breakdown of model performance, especially
when working with imbalanced datasets where accuracy alone might be misleading. For
example, in a dataset where 95% of the images are of cats, a model that always predicts
"cats" will achieve high accuracy but perform poorly on dog images. The confusion matrix
reveals this problem by showing false negatives and false positives for each class.

Neural networks, especially CNNs, play a critical role in modern image processing tasks like
classification, segmentation, and object detection. The performance of these models is often
evaluated using the confusion matrix and derived metrics such as accuracy, precision, recall,
and F1-score. The confusion matrix provides a comprehensive view of the model’s
performance by identifying the number of correct and incorrect predictions for each class,
offering insights beyond just accuracy.

Introduction to Basic Operations on Binary and Grayscale Images: Dilation, Erosion, Opening

In image processing, morphological operations are fundamental techniques used for analyzing and manipulating the structure of images, particularly binary (black and white) and grayscale images. The most common operations include dilation, erosion, and opening. These operations are widely used for tasks such as noise removal, image enhancement, shape analysis, and object detection.

1. Binary Images vs. Grayscale Images

 Binary Images: In a binary image, each pixel has only two possible values: 0 (black) or 1
(white). These images are typically used in tasks where the goal is to distinguish between
foreground (objects) and background.
 Grayscale Images: In grayscale images, each pixel can take a range of values, typically from 0
(black) to 255 (white), representing varying intensities of gray. Grayscale images are often
used when more detail is required compared to binary images.

2. Structuring Element

Before we dive into the specific operations, it's essential to understand the concept of the
structuring element (also called a kernel or mask). This is a small binary or grayscale matrix
used to probe or scan the input image. The structuring element defines the neighborhood
around each pixel that is considered during operations like dilation, erosion, or opening.

Common shapes for structuring elements include:

 Square
 Rectangle
 Disk (circular)
 Cross

The size and shape of the structuring element play a crucial role in how the operations affect
the image.

3. Dilation

Dilation is a morphological operation that expands or "grows" the boundaries of the foreground (white) objects in an image. It increases the size of the object, filling in gaps and small holes in binary images. In grayscale images, it brightens the regions around the object.

Working of Dilation:

 In binary images: Dilation adds pixels to the boundaries of objects. If at least one pixel in the
neighborhood (defined by the structuring element) is 1, the central pixel is set to 1.
 In grayscale images: Dilation replaces the pixel value with the maximum value in the
neighborhood. As a result, bright regions (high intensity) grow outward.

Applications of Dilation:

 Filling small gaps in objects.
 Connecting disjointed components in an image.
 Enlarging objects to enhance features.

Example of Dilation (Binary Image):

The white pixel in the center causes the surrounding black pixels to turn white after dilation.

Dilation Function:

In OpenCV (Python):
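
A minimal sketch (the 3×3 square structuring element and the filename are illustrative):

    import cv2
    import numpy as np

    img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    kernel = np.ones((3, 3), np.uint8)               # square structuring element
    dilated = cv2.dilate(img, kernel, iterations=1)  # grow white regions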

4. Erosion

Erosion is the opposite of dilation. It "shrinks" the boundaries of foreground objects, reducing
their size. In binary images, erosion removes pixels from the boundaries of objects. In
grayscale images, erosion darkens regions by reducing the intensity of bright areas.

Working of Erosion:

 In binary images: If any pixel in the neighborhood is 0 (black), the central pixel is set to 0.
 In grayscale images: Erosion replaces the pixel value with the minimum value in the
neighborhood. As a result, dark regions grow outward, and bright areas shrink.

Applications of Erosion:

 Removing small white noise or isolated pixels.
 Separating objects that are close to each other.
 Thinning objects and removing small details.

Example of Erosion (Binary Image):

Here, the central white pixel is surrounded by black pixels, so it is eroded away, leaving a
smaller shape.

Erosion Function:

In OpenCV (Python):
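
A minimal sketch (same illustrative kernel and filename as in the dilation example):

    import cv2
    import numpy as np

    img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    kernel = np.ones((3, 3), np.uint8)
    eroded = cv2.erode(img, kernel, iterations=1)    # shrink white regions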

5. Opening

Opening is a combination of erosion followed by dilation. It is used to remove small objects or noise while preserving the shape and size of larger objects in the image. The primary purpose of opening is to eliminate noise or separate objects that are close to each other.

Working of Opening:

 First Step (Erosion): Removes small objects or noise from the image.
 Second Step (Dilation): Restores the shape of the remaining objects after erosion.

Applications of Opening:

 Removing noise (small bright spots) from binary images.
 Separating closely connected objects by eliminating the narrow connections between them.
 Smoothing the contour of larger objects without significantly changing their size.
Example of Opening:

The small noise is removed after erosion, and dilation restores the shape of the main object.

Opening Function:

In OpenCV (Python):
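
A minimal sketch using OpenCV's combined morphology call (kernel and filename illustrative):

    import cv2
    import numpy as np

    img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    kernel = np.ones((3, 3), np.uint8)
    opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)  # erosion, then dilation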

Visual Summary of Morphological Operations:

 Dilation: Expands objects by adding pixels to their boundaries.
 Erosion: Shrinks objects by removing pixels from their boundaries.
 Opening: Removes noise or small objects using erosion, followed by dilation to restore the main objects.

Summary

Morphological operations like dilation, erosion, and opening are crucial in image processing
for tasks like noise removal, shape analysis, and object enhancement. These operations can be
applied to both binary and grayscale images using structuring elements.

 Dilation enlarges objects, filling gaps and connecting regions.
 Erosion shrinks objects, removing noise or small unwanted details.
 Opening is erosion followed by dilation, which effectively removes small noise while preserving the shape of larger objects.

These operations are foundational in image preprocessing and segmentation tasks, especially
in medical imaging, computer vision, and object detection.

Opening & Closing Morphological Algorithms: Boundary & Region Extraction, Convex Hull, Thinning, Thickening, Skeletons, Pruning

Morphological image processing focuses on the structure or shape of objects in an image. Operations like opening and closing, combined with advanced morphological techniques like boundary extraction, convex hull, thinning, thickening, skeletonization, and pruning, are key tools used in image analysis and processing.

These operations are crucial for tasks like object recognition, shape analysis, noise removal,
and image enhancement.

1. Opening & Closing

Both opening and closing are composite morphological operations involving combinations
of dilation and erosion.

Opening:

 Opening is defined as erosion followed by dilation.
 Purpose: Used to remove small objects or noise, especially when objects are smaller than the structuring element.
 Effect: Smooths object contours, breaks narrow connections, and removes thin protrusions.

Closing:

 Closing is defined as dilation followed by erosion.
 Purpose: Used to fill small holes or gaps in objects and smooth boundaries.
 Effect: Bridges narrow gaps, closes small holes, and smooths object boundaries.
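
Both composites are available as single OpenCV calls; a minimal sketch (kernel size and filename illustrative):

    import cv2
    import numpy as np

    img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    kernel = np.ones((5, 5), np.uint8)
    opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)   # erosion, then dilation
    closed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)  # dilation, then erosion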

2. Boundary Extraction

Boundary extraction is the process of obtaining the boundaries or edges of objects within an
image.

Method:

1. Erode the original image.
2. Subtract the eroded image from the original image.

The remaining image will highlight the boundaries of objects, as the erosion will shrink the objects, and subtracting it will leave only the boundary pixels.

Formula:

    Boundary(A) = A − (A ⊖ SE)

Where:

 A is the original image.
 SE is the structuring element.
 ⊖ denotes erosion.

Application:

 Boundary detection of objects in binary images.
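
A minimal sketch of this method in OpenCV (kernel and filename illustrative):

    import cv2
    import numpy as np

    img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    kernel = np.ones((3, 3), np.uint8)
    boundary = cv2.subtract(img, cv2.erode(img, kernel))  # A minus (A eroded by SE)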


3. Region Extraction

Region extraction involves identifying and extracting connected components or regions of interest from an image. This process often uses connected-component labeling, where all pixels that are connected and share the same value are grouped as a single region.

Methods:

 Flood fill algorithms: Group pixels with similar properties (e.g., color, intensity) into a single
region.
 Morphological operations: Can help clean the image before region extraction, ensuring that
small noise is removed or objects are connected properly.

Application:

 Extracting objects from images for further analysis (e.g., in medical imaging for tumor
detection).
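
A minimal sketch of connected-component labeling in OpenCV (the threshold value and filename are illustrative):

    import cv2

    gray = cv2.imread('objects.png', cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    num_labels, labels = cv2.connectedComponents(binary)  # label 0 is background
    print(num_labels - 1, 'regions found')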

4. Convex Hull

The convex hull of a shape is the smallest convex shape that entirely encloses the object. It's
like wrapping a rubber band around the shape — the convex hull is the tightest boundary that
can be drawn around it.

Algorithm:

The convex hull can be computed by identifying the extreme points of the object and
connecting them in such a way that the resulting shape is convex.

Application:

 Convex hull is often used in shape analysis and object recognition tasks to simplify complex
shapes.
 It is also used in image processing to approximate the shape of objects or to fill concave
regions.
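
A minimal sketch using OpenCV's contour functions (assumes OpenCV 4, where findContours returns two values; the filename is illustrative):

    import cv2
    import numpy as np

    binary = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    canvas = np.zeros_like(binary)
    for cnt in contours:
        hull = cv2.convexHull(cnt)                   # extreme points, joined convexly
        cv2.drawContours(canvas, [hull], -1, 255, 1)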

5. Thinning

Thinning is a morphological operation that reduces the thickness of object boundaries or lines
to a single-pixel-wide skeleton while preserving the original structure and topology.

Method:

Thinning is performed by iteratively removing boundary pixels (typically using hit-or-miss transforms) until the objects are reduced to a skeleton, without breaking them apart. The process retains the essential features of the shape, such as connectivity and topology.
Applications:

 Thinning is commonly used in OCR (Optical Character Recognition), where it helps extract
and recognize characters in text images.
 Used in fingerprint recognition for extracting ridges and valleys.

Example (Binary Image Thinning):
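
A minimal sketch, assuming scikit-image is available (its morphology module provides a topology-preserving thin function; the tiny input array stands in for a real binary image):

    import numpy as np
    from skimage.morphology import thin

    binary = np.array([[0, 0, 0, 0, 0],
                       [0, 1, 1, 1, 0],
                       [0, 1, 1, 1, 0],
                       [0, 0, 0, 0, 0]], dtype=bool)
    thinned = thin(binary)   # reduces the block to a one-pixel-wide line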

6. Thickening

Thickening is the opposite of thinning. It involves expanding the boundaries of objects in an image by adding pixels to their edges, without significantly altering the overall shape or structure.

Method:

Thickening is typically performed by a series of dilation operations, gradually expanding the object until it reaches a desired thickness.

Applications:

 Thickening is useful in applications where the objects are too thin and need to be more
prominent for subsequent processing, such as in road detection or vessel enhancement in
medical images.

7. Skeletonization

Skeletonization reduces a shape to its "skeleton", a thin version of the object that retains its
overall structure but is reduced to a single-pixel-wide representation. This is similar to
thinning, but with the goal of preserving the general form and connectivity of the object.
Method:

 Skeletonization can be seen as a more extreme form of thinning, where all unnecessary
pixels are removed except for those that lie along the "centerline" of the shape.

Applications:

 Skeletonization is useful in applications like pattern recognition and shape analysis, where
the skeleton gives a simplified representation of the object while retaining key structural
information.

Example (Skeletonization):
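
A sketch of the classic morphological skeleton, built from repeated erosion and opening in OpenCV (assumes a binary 0/255 input image; the filename is illustrative):

    import cv2
    import numpy as np

    img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
    kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
    skel = np.zeros_like(img)
    while cv2.countNonZero(img) > 0:
        eroded = cv2.erode(img, kernel)
        opened = cv2.dilate(eroded, kernel)                    # opening of img
        skel = cv2.bitwise_or(skel, cv2.subtract(img, opened))
        img = eroded   # keep eroding until nothing is left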

8. Pruning

Pruning is a process used to remove small, spurious branches from the skeleton of an object.
After thinning or skeletonization, small extraneous parts may remain that are not part of the
main structure of the object. Pruning eliminates these artifacts while preserving the overall
skeleton.
Method:

Pruning is achieved by detecting and removing small, unnecessary branches or endpoints from the skeleton. This process typically follows thinning or skeletonization.
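
A simple pruning sketch that repeatedly deletes endpoint pixels (skeleton pixels with exactly one neighbor); it assumes SciPy is available and a boolean skeleton image, and the iteration count is illustrative:

    import numpy as np
    from scipy.ndimage import convolve

    def prune(skel, n_iter=5):
        # Count each pixel's 8 neighbors; endpoints have exactly one.
        neighbor_kernel = np.array([[1, 1, 1],
                                    [1, 0, 1],
                                    [1, 1, 1]])
        skel = skel.astype(bool)
        for _ in range(n_iter):
            counts = convolve(skel.astype(np.uint8), neighbor_kernel,
                              mode='constant')
            endpoints = skel & (counts == 1)
            skel = skel & ~endpoints   # peel one layer of spurs
        return skel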

Application:

 Fingerprint analysis: Removing noise from the skeletons of fingerprint images.
 Road network analysis: Removing small, irrelevant branches that may result from noise in the detection process.

Summary of Operations:

 Opening: Erosion followed by dilation; used to remove small objects or noise.
 Closing: Dilation followed by erosion; used to fill small holes and close gaps in object boundaries.
 Boundary Extraction: Extracts the edges of objects by subtracting the eroded image from the original image.
 Region Extraction: Identifies and extracts connected components from an image.
 Convex Hull: The smallest convex shape that encloses an object, used for shape analysis.
 Thinning: Reduces objects to a single-pixel-wide skeleton while preserving their overall structure.
 Thickening: Expands the boundaries of objects by adding pixels to their edges.
 Skeletonization: Produces a simplified version of the object by reducing it to a single-pixel-wide skeleton.
 Pruning: Removes small, unnecessary branches from the skeleton of an object.

Each of these morphological operations is essential in image processing tasks that require
shape analysis, noise removal, object detection, or image enhancement. They enable fine
control over the structure and features of objects in an image, making them powerful tools in
various applications such as medical imaging, character recognition, and object segmentation.
