Computer vision

Computer vision technology enables machines to interpret and respond to images and videos, facing challenges such as the need for extensive datasets and time-sensitive decision-making. Key tasks include object classification, identification, detection, and segmentation, utilizing various techniques like geometric transformations and feature detection. The document outlines the importance of deep learning algorithms in enhancing object recognition and detection capabilities.

Computer vision technology is a crucial part of AI that helps build machines with the ability to look at an image or video, understand it, and respond to it.

Challenges in Computer Vision:


 An image is stored digitally as an array of pixel values. Deep
learning techniques are required to extract insights from this data.
 A very large dataset is required to train the system to identify
objects at various angles and under different environmental
conditions.
 Time-sensitive decision-making. Example: a surveillance robot
must generate an alert when someone crosses a railway line
while a train is approaching; otherwise, the crossing should be
considered normal.
 For living objects, the ability to differentiate between the
live object, a statue of the object, and a life-size poster or photo
of the object.
 Understanding the object within its context.

What is Computer Vision?

Computer vision tasks include methods for acquiring, processing,
analyzing, and understanding digital images, and for extracting
high-dimensional data from the real world in order to produce
numerical or symbolic information, e.g. in the form of decisions.
PURPOSE OF COMPUTER VISION
Object Classification
Object Identification
Object Verification
Object Detection
Object Landmark Detection
Object Segmentation
Object Recognition

Visualization
The purpose is to observe objects that are not directly visible in an
image

Image Sharpening and Restoration

The purpose is to create a better image

Image Retrieval

The purpose is to search for the image of interest

Measurement of Pattern

The purpose is to measure various objects in an image

Image Recognition

The purpose is to distinguish the objects in an image


Changing color spaces:
Color spaces define how pixel values are represented for
different mediums. Some widely used color spaces include RGB,
HSV, CMYK, the LAB color space, the YCrCb color space, etc.

Example: By default, digital images are in the RGB (Red, Green,
Blue) color space. In RGB, the pixels carry only color-component
information and have no direct details about brightness or
saturation. This poses complexity if we wish to process the
brightness of the image. Similarly, some kinds of processing can
produce poor results in certain color spaces. Hence, we might
choose to convert the image to a different color space based on
the requirement.

CMYK - Cyan, Magenta, Yellow and blacK - the color space which
is widely used in printers.

HSV - Hue, Saturation, Value - the preferred color space in high-
quality graphics, due to the property that it has separate
components for color (hue), color purity (saturation), and
brightness (value). So it is easy to maintain the color and alter
just the brightness of an image.
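As a concrete illustration of the HSV property described above, here is a minimal sketch using Python's standard-library colorsys module. The pixel values are made up for illustration; a real image would be an array of such pixels, and libraries often use 0-255 integers rather than the normalized 0-1 floats colorsys expects.

```python
import colorsys

# A single RGB pixel, normalized to [0, 1] (illustrative values).
r, g, b = 0.8, 0.4, 0.2

# Convert to HSV: hue (color), saturation (purity), value (brightness).
h, s, v = colorsys.rgb_to_hsv(r, g, b)

# Darken the pixel by halving only the value component, leaving hue
# and saturation untouched -- the property that makes HSV convenient.
darker_rgb = colorsys.hsv_to_rgb(h, s, v * 0.5)

print((h, s, v))
print(darker_rgb)
```

Doing the same darkening directly in RGB would require scaling all three channels while hoping the perceived color stays stable; in HSV it is a single-component change.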
Geometric Transformations:
As the name indicates, geometric transformations refer to altering
the geometric aspects of an image, such as size and orientation,
without affecting the actual contents of the image. Some of the
geometric transformation techniques are listed below:

Scaling:

Images are generally scaled down to minimize computational
time and resources, while taking into consideration that the
feature details are not lost.
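A minimal sketch of scaling down with NumPy, using a tiny made-up 4x4 image; real pipelines would use a library resize with proper interpolation, but the trade-off the text mentions is already visible here:

```python
import numpy as np

# Hypothetical 4x4 single-channel image (arbitrary intensities).
img = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Scale down by a factor of 2 with simple subsampling: keep every
# second pixel in each dimension. Fast, but detail between the kept
# pixels is simply discarded.
half = img[::2, ::2]

# A gentler alternative: average each 2x2 block, so neighbouring
# pixels still contribute to the result.
avg = img.reshape(2, 2, 2, 2).mean(axis=(1, 3))

print(half)
print(avg)
```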

Rotation/Reflection/Translation:

Images can be rotated by different angles so that they can be
analysed from different perspectives.
Mirroring can also be done to obtain insights about the image.

Image translation refers to moving the pixels of an image from one
position to another, with the aim of changing an object's position.
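The mirroring and translation operations described above can be sketched with plain NumPy indexing; the 3x3 image below is invented purely for illustration:

```python
import numpy as np

# Hypothetical 3x3 image used only to illustrate the operations.
img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# Mirroring (horizontal reflection): reverse the column order.
mirrored = img[:, ::-1]

# Translation by one pixel to the right: shift the columns and fill
# the vacated column with zeros (a common, but not the only, choice
# for pixels that have no source).
translated = np.zeros_like(img)
translated[:, 1:] = img[:, :-1]

print(mirrored)
print(translated)
```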
IMAGE SEGMENTATION

In computer vision, segmentation is the process of extracting


pixels in an image that are related. Segmentation algorithms
usually take an image and produce a group of contours (the
boundary of an object that has well-defined edges in an image) or
a mask where a set of related pixels are assigned to a unique
color value to identify it.

The main purpose of image segmentation is to partition an
image into a collection of sets of pixels and to identify:

– Meaningful regions (coherent objects)

– Linear structures (line, curve, …)

– Shapes (circles, ellipses, …)

A very simple example of image segmentation is the method
based on thresholding.

Binary Segmentation

Suppose you have an image and want to segment it into two parts
(light and dark). You can use a threshold value, for example 100.
After segmenting, the image is split into two parts: the pixels
with intensity higher than 100, which can be set to 255 (the
maximum intensity value, white), and the pixels with intensity
less than or equal to 100, which are set to 0 (the minimum
intensity value, black). This is called binary segmentation. If you
use more thresholds, you get more segments.
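The thresholding just described can be sketched in a few lines of NumPy; the 3x3 image and the threshold of 100 are illustrative:

```python
import numpy as np

# Hypothetical grayscale image with intensities in 0-255.
img = np.array([[ 30, 120, 200],
                [ 90, 150,  60],
                [210,  40, 130]], dtype=np.uint8)

# Binary segmentation at threshold 100: pixels above the threshold
# become white (255), the rest black (0), as described above.
mask = np.where(img > 100, 255, 0).astype(np.uint8)

print(mask)
```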
Feature

A feature is a piece of information which is relevant for solving the


computational task related to a certain application. Features may
be specific structures in the image such as points, edges or
objects. Features may also be the result of a general
neighborhood operation or feature detection applied to the
image.

Main Components of Feature Detection and Matching

Detection:

Identify the interest points in the image.

Features that occur at specific locations in an image, such as
mountain peaks, building corners, doorways, or interestingly
shaped patches of snow, are often called keypoint features (or
even corners) and are often described by the appearance of the
patch of pixels surrounding the point location.
Features that can be matched based on their orientation and
local appearance (edge profiles) are called edges; they can
also be good indicators of object boundaries and occlusion events
in an image sequence.

Description:

The local appearance around each feature point is described in


some way that is (ideally) invariant under changes in illumination,
translation, scale, and in-plane rotation. We typically end up with
a descriptor vector for each feature point.

Matching:

Descriptors are compared across the images to identify similar
features. For two images we may get a set of pairs (Xi, Yi) ↔ (Xi′,
Yi′), where (Xi, Yi) is a feature in one image and (Xi′, Yi′) its
matching feature in the other image.
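A hedged sketch of the matching step: brute-force comparison of descriptor vectors by Euclidean distance. The tiny 2-D descriptors are made up; real descriptors (e.g. SIFT's 128-D vectors) are much longer, but the logic is the same:

```python
import numpy as np

# Hypothetical descriptor vectors for feature points in two images
# (one row per feature point).
desc_a = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.5, 0.5]])
desc_b = np.array([[0.1, 0.9],
                   [0.9, 0.1],
                   [0.4, 0.6]])

# Pairwise Euclidean distances between every descriptor in A and
# every descriptor in B, then pick the nearest neighbour in B for
# each feature in A.
dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
matches = dists.argmin(axis=1)  # matches[i] = index in B matched to A[i]

print(matches)
```

Production matchers add a ratio test or cross-checking to reject ambiguous matches; this sketch shows only the nearest-neighbour core.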

IMAGE RECOGNITION
Recognition is one of the toughest challenges in
computer vision. For the human eye, recognizing an object's
features or attributes is very easy. Humans can recognize
multiple objects with very little effort. However, this does not
apply to a machine. It is very hard for a machine to
recognize or detect an object because objects vary in terms
of viewpoint, size, and scale.

Object Recognition

Object recognition is used to identify an object in an image or
video. It is a product of machine learning and deep learning
algorithms. Object recognition tries to acquire the innate human
ability to understand certain features or visual details of
an image.

The output of object recognition includes the identified object
category along with the probability of correctness.

Object recognition refers to identification of what is present in the


image, while object detection refers to locating where it is present
in the image.

Object recognition through deep learning can be achieved
by training models from scratch or by utilizing pre-trained deep
learning models. To train a model from scratch, the first thing you
need to do is collect a large dataset. Then you need to design
an architecture that will be used for creating the model.

Just as with deep learning, object recognition through algorithmic
approaches is also possible.

The following algorithms are commonly used approaches:

 HOG feature extraction
 Bag of words model
 Viola-Jones algorithm
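To give a flavor of the first approach, here is a heavily simplified sketch of the core HOG idea: a magnitude-weighted histogram of gradient orientations over one cell. It omits the block structure and normalization of real HOG, and the image is invented:

```python
import numpy as np

# Hypothetical single "cell" of an image (arbitrary intensities).
img = np.array([[0,   0,   0, 0],
                [0,  50, 100, 0],
                [0, 100, 200, 0],
                [0,   0,   0, 0]], dtype=float)

gy, gx = np.gradient(img)                    # per-pixel gradients
mag = np.hypot(gx, gy)                       # gradient magnitude
ang = np.degrees(np.arctan2(gy, gx)) % 180   # unsigned orientation

# 9 orientation bins over 0-180 degrees, weighted by magnitude --
# the descriptor for this cell.
hist, _ = np.histogram(ang, bins=9, range=(0, 180), weights=mag)

print(hist.round(1))
```

A full HOG descriptor concatenates such histograms over a grid of cells and normalizes them over overlapping blocks before feeding them to a classifier.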

IMAGE DETECTION
Image or Object Detection is a technique that processes the
image and detects objects in it.

When it comes to applying deep learning to image
detection, developers use Python along with open-source libraries
like OpenCV, Open Detection, Luminoth, ImageAI, and others.
These libraries simplify the learning process
and offer a ready-to-use environment.

The commonly used techniques for object detection are:

• Haar cascades algorithm

• Viola-Jones algorithm

Object detection uses an object's features to classify its class.

For example, when looking for circles in an image, the machine
will detect any round object. To recognize instances of an object
class, the algorithm uses learning techniques and features
extracted from the image.
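A toy sketch of detection by sliding a small template over the image and scoring each window; this illustrates the idea of locating an object by its appearance, not a production detector (the image, template, and scoring choice are all made up):

```python
import numpy as np

# Toy 6x6 image with a 2x2 bright "object" planted at rows 2-3,
# columns 3-4.
img = np.zeros((6, 6), dtype=float)
img[2:4, 3:5] = 1.0
template = np.ones((2, 2))   # what we are looking for

# Slide the template over every 2x2 window and keep the window with
# the lowest sum of absolute differences (lower = better match).
best, best_pos = float("inf"), None
for r in range(img.shape[0] - 1):
    for c in range(img.shape[1] - 1):
        window = img[r:r + 2, c:c + 2]
        score = np.abs(window - template).sum()
        if score < best:
            best, best_pos = score, (r, c)

print(best_pos, best)   # top-left corner and score of the best window
```

Real detectors replace the raw-pixel comparison with learned features (e.g. HOG plus a classifier, or a convolutional network) and scan at multiple scales, but the sliding-window search is the same underlying idea.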
