COMPUTER
VISION
Introduction
• Data Science is a concept that unifies
statistics, data analysis, machine learning
and their related methods in order to
understand and analyse actual phenomena
with data.
• Computer Vision, a domain of Artificial
Intelligence, enables machines to see by
taking in images or other visual data and
processing and analysing them with
algorithms and methods, in order to
understand actual phenomena from images.
• The concept of computer vision was first
introduced in the 1970s.
• However, in recent years the world has
witnessed a significant leap in the
technology, which has put computer vision
on the priority list of many industries.
• Today, the technology has advanced enough
to make its applications easily available
to everyone.
• These new applications of computer vision
have excited everyone.
Applications of Computer Vision
• Facial Recognition
• Face Filters
• Google’s Search by Image
• Computer Vision in Retail
• Self-Driving Cars
• Medical Imaging
• Google Translate App
Computer Vision: Getting Started
• Computer Vision is a domain of Artificial
Intelligence that deals with images.
• It combines concepts from image processing
and machine learning models to build a
Computer Vision based application.
Computer Vision Tasks
• The various applications of Computer Vision are
based on a set of tasks performed to extract
information from the input image; this information
can either be used directly for prediction or
form the base for further analysis.
• The tasks used in a computer vision application
are:
FOR SINGLE OBJECTS
Classification:
• Image Classification is assigning a label to an
input image from a fixed set of labelling
categories.
• For example, given an image containing a
person, the task is to label the image
frame as "human".
Classification + Localization:
• This involves both identifying the object
in the image and locating that object
within the image.
• It is used only for single objects.
FOR MULTIPLE OBJECTS
Object Detection:
• Object detection is the process of finding
instances of real-world objects such as humans, animals,
food, faces, bicycles, and buildings in images or videos.
• Object detection algorithms typically use extracted
features and learning algorithms to recognize instances of
an object category.
• It is commonly used in applications such as image
retrieval, unlocking the phone using face recognition and
automated vehicle parking systems.
Instance Segmentation:
• Instance Segmentation is the process of detecting
instances of the objects, giving them a category and then
giving each pixel a label on the basis of that.
• A segmentation algorithm takes an image as input and
outputs a collection of regions (or segments).
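The idea that a segmentation algorithm assigns every pixel a label can be illustrated with a toy sketch (assuming NumPy is available; this is a simple threshold, not a real instance-segmentation model, and the array values are made up for illustration):

```python
import numpy as np

# Toy 4x4 grayscale image: a bright 2x2 blob on a dark background.
img = np.array([
    [ 10,  12,  11,  10],
    [ 10, 200, 210,  12],
    [ 11, 220, 205,  10],
    [ 12,  10,  11,  13],
], dtype=np.uint8)

# Give each pixel a label: 1 for the bright region, 0 for background.
labels = (img > 128).astype(np.uint8)

foreground_pixels = int(labels.sum())  # 4 pixels belong to the blob
```

Every pixel now carries a region label, which is the basic output a segmentation algorithm produces; real models do this per object instance rather than by a single brightness threshold.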
Basics of Images
Basics of Pixels
• The word “pixel” means a picture element. Every
photograph, in digital form, is made up of pixels.
• They are the smallest unit of information that
make up a picture.
• Usually round or square, they are typically
arranged in a 2-dimensional grid.
Resolution
• The number of pixels in an image is
sometimes called the resolution.
• When the term is used to describe pixel
count, one convention is to express
resolution as the width by the height, for
example a monitor resolution of 1280×1024.
• This means there are 1280 pixels from one
side to the other, and 1024 from top to
bottom.
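As a quick check of the width-by-height convention above, the total pixel count of a 1280×1024 monitor can be computed directly (a minimal Python sketch; the variable names are illustrative):

```python
# Resolution expressed as width x height, e.g. 1280x1024.
width, height = 1280, 1024

total_pixels = width * height          # 1280 * 1024 = 1310720 pixels
megapixels = total_pixels / 1_000_000  # roughly 1.31 megapixels
```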
Pixel value
• Each of the pixels that represents an image
stored inside a computer has a pixel value
which describes how bright that pixel is,
and/or what colour it should be.
• The most common pixel format is the byte
image, where this number is stored as an 8-
bit integer giving a range of possible values
from 0 to 255.
• Typically, zero is taken to be no colour,
or black, and 255 is taken to be full
colour, or white.
Pixel value
• Each pixel of an image uses 1 byte,
which is equivalent to 8 bits of data.
• Since each bit can take two possible values,
8 bits give 2⁸ = 256 possible values,
starting at 0 and ending at 255.
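The 8-bit byte-image range can be sketched with NumPy's `uint8` type (assuming NumPy is available; the array contents are just sample values):

```python
import numpy as np

# A byte image stores each pixel as an 8-bit unsigned integer.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

num_values = 2 ** 8            # 256 distinct values per pixel
darkest = int(pixels.min())    # 0   -> no colour / black
brightest = int(pixels.max())  # 255 -> full colour / white
```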
Grayscale Images
• Grayscale images are images which have a
range of shades of gray without apparent colour.
• The darkest possible shade is black, which is
the total absence of colour, i.e. a pixel value
of 0.
• The lightest possible shade is white, which is
the total presence of colour, i.e. a pixel value
of 255.
• Intermediate shades of gray are represented
by equal brightness levels of the three primary
colours.
• A grayscale image stores each pixel in 1 byte,
in a single plane, i.e. a 2D array of pixels.
• The size of a grayscale image is defined as the
Height x Width of that image.
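The single-plane layout above can be sketched with a 2D NumPy array, so the image size in bytes comes out as Height × Width (the 480×640 dimensions are an illustrative assumption):

```python
import numpy as np

# One plane of 1-byte pixels: Height x Width.
height, width = 480, 640
gray = np.zeros((height, width), dtype=np.uint8)  # an all-black image

size_in_bytes = gray.nbytes  # 480 * 640 = 307200 bytes, 1 byte per pixel
```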
RGB Images
• Most of the images we see around us are
coloured images.
• These images are made up of three primary
colours: Red, Green and Blue.
• All visible colours can be made by
combining different intensities of red, green
and blue.
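The mixing idea can be illustrated with a tiny three-channel array (a minimal sketch using NumPy; the specific colours chosen are arbitrary examples):

```python
import numpy as np

# An RGB image has three planes: Height x Width x 3.
img = np.zeros((2, 2, 3), dtype=np.uint8)

img[0, 0] = [255, 0, 0]      # full red only           -> red
img[0, 1] = [0, 255, 0]      # full green only         -> green
img[1, 0] = [255, 255, 0]    # red + green             -> yellow
img[1, 1] = [255, 255, 255]  # all three at full value -> white
```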
Image Features
• In computer vision and image processing, a
feature is a piece of information which is
relevant for solving the computational task
related to a certain application.
• Features may be specific structures in the image
such as points, edges or objects.
• Let’s Reflect:
• Let us consider each patch individually and try to
find its exact location in the image.
• For Patch A and B: Patches A and B are flat surfaces in the
image and are spread over a large area.
• They can match at almost any location within such a flat
region, so pinpointing them is hard.
• For Patch C and D: Patches C and D are simpler than
A and B.
• They are edges of a building: we can find an approximate
location for these patches, but finding the exact location is
still difficult.
• This is because the pattern looks the same everywhere along
the edge.
• For Patch E and F: Patches E and F are the easiest to find in
the image.
• The reason is that E and F are corners of the building.
• At a corner, wherever we move the patch, it looks
different.
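The patch intuition above can be made concrete with a toy experiment (a minimal sketch, not a real corner detector): compare a small patch against shifted copies of itself. On an edge, a shift along the edge direction leaves the patch unchanged; at a corner, every shift changes it.

```python
import numpy as np

# Toy 8x8 image: bright top-left quadrant on a dark background,
# giving vertical and horizontal edges and one corner at (4, 4).
img = np.zeros((8, 8), dtype=float)
img[:4, :4] = 255.0

def shift_response(y, x, dy, dx, size=2):
    """Sum of squared differences between the patch at (y, x)
    and the same-sized patch shifted by (dy, dx)."""
    p1 = img[y:y + size, x:x + size]
    p2 = img[y + dy:y + dy + size, x + dx:x + dx + size]
    return float(((p1 - p2) ** 2).sum())

shifts = [(0, 1), (1, 0), (1, 1)]

# Patch straddling the vertical edge, away from the corner:
# shifting along the edge (dy=1, dx=0) gives zero change.
edge = [shift_response(1, 3, dy, dx) for dy, dx in shifts]

# Patch at the corner: every shift changes the patch content.
corner = [shift_response(3, 3, dy, dx) for dy, dx in shifts]
```

The edge patch has at least one shift with zero response (it is ambiguous along the edge), while the corner patch responds to every shift; that directional-response idea is the basis of classical corner detectors such as Harris.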
Conclusion
• In image processing, we can get a lot of features from
the image.
• It can be either a blob, an edge or a corner.
• These features help us to perform various tasks and
then get the analysis done on the basis of the
application.
• Now the question arises: which of these are
good features to use?
• As you saw in the previous activity, features
containing corners are easy to find, since they
occur only at a particular location in the image,
whereas edges, which are spread along a line,
look the same all along their length.
• This tells us that corners are always good features
to extract from an image, followed by the edges.
THANK YOU