
ARTIFICIAL INTELLIGENCE

MISS VEENA (PGT CS)

UNIT-5 COMPUTER VISION


CLASS X
QUESTION PAPER CODE:417
INTRODUCTION TO CV:

Artificial Intelligence is a technique that enables computers to mimic human intelligence. As humans,
we can see things, analyse them and then take the required action on the basis of what we see.

But can machines do the same?

Can machines have the eyes that humans have?

If you answered Yes, then you are absolutely right. “The Computer Vision domain of Artificial
Intelligence enables machines to see through images or visual data, and to process and analyse them on the
basis of algorithms and methods in order to analyse actual phenomena with images.”

APPLICATIONS OF CV:

1. Facial Recognition

• With the advent of smart cities and smart homes, Computer Vision plays a vital role in making
the home smarter.

• Security, being the most important application, involves the use of Computer Vision for facial
recognition.

Ex: It can be used either for guest recognition or for log maintenance of visitors.

Ex: It also finds application in schools, in attendance systems based on facial recognition of
students.

2. Face Filters

• Modern-day apps like Instagram and Snapchat have a lot of features based on the use
of Computer Vision.

• The application of face filters is one among them.

• Through the camera, the machine or the algorithm is able to identify the facial dynamics of the
person and apply the selected face filter.


3. Google’s Search by Image

• Most searches on Google’s search engine use textual data, but it also has an interesting
feature of getting search results through an image.

• This uses Computer Vision, as it compares different features of the input image with the database
of images and gives us the search result, while at the same time analysing various features of
the image.

4. Computer Vision in Retail

• The retail field has been one of the fastest growing fields and at the same time is using
Computer Vision to make the user experience more fruitful.

• Retailers can use Computer Vision techniques to track customers’ movements through stores,
analyse navigational routes and detect walking patterns.

• Through security camera image analysis, a Computer Vision algorithm can generate a very
accurate estimate of the items available in the store.


5. Self-Driving Cars

• Computer Vision is the fundamental technology behind developing autonomous vehicles.

• Most leading car manufacturers in the world are reaping the benefits of investing in artificial
intelligence for developing on-road versions of hands-free technology.

• This involves identifying objects, getting navigational routes and, at the
same time, monitoring the environment.

6. Medical Imaging

• For the last few decades, computer-supported medical imaging applications have been a trustworthy
help for physicians.

• They not only create and analyse images, but also act as an assistant and help doctors
with their interpretation.

• Such an application is used to read and convert 2D scan images into interactive 3D models that
enable medical professionals to gain a detailed understanding of a patient’s health condition.

7. Google Translate App

• All you need to do to read signs in a foreign language is to point your phone’s camera at the
words and let the Google Translate app tell you what they mean in your preferred language
almost instantly.

• By using optical character recognition to read the image and augmented reality to overlay an
accurate translation, this is a convenient tool that uses Computer Vision.


COMPUTER VISION TASKS:

The various applications of Computer Vision are based on a certain number of tasks which are
performed to get certain information from the input image. This information can be used directly for prediction
or can form the base for further analysis. The tasks used in a computer vision application are:

• Classification

The Image Classification problem is the task of assigning an input image one label from a fixed set of
categories. This is one of the core problems in CV that, despite its simplicity, has a large variety of
practical applications.

• Classification + Localisation

This task involves both identifying what object is present in the image and, at
the same time, identifying at what location that object is present in the image. It is used only for single
objects.


• Object Detection

Object detection is the process of finding instances of real-world objects such as faces, bicycles, and
buildings in images or videos. Object detection algorithms typically use extracted features and learning
algorithms to recognize instances of an object category. It is commonly used in applications such as
image retrieval and automated vehicle parking systems.

• Instance Segmentation

Instance Segmentation is the process of detecting instances of the objects, giving them a category and
then giving each pixel a label on the basis of that. A segmentation algorithm takes an image as input
and outputs a collection of regions (or segments).

IMAGES:

We all see a lot of images around us and use them daily, either through our mobile phones or computer
systems. But do we ask ourselves some basic questions about them while we use them on such a regular basis?

1. Basics of Pixels:

• The word “pixel” means a picture element.

• Every photograph, in digital form, is made up of pixels.

• They are the smallest unit of information that makes up a picture.

• Usually round or square, they are typically arranged in a 2-dimensional grid.

• The more pixels you have, the more closely the image resembles the original.


• Ex: In the image below, one portion has been magnified many times over so that you can see
its individual composition in pixels.

• As you can see, the pixels approximate the actual image.

2. Resolution

• The number of pixels in an image is sometimes called the resolution.

• The area covered by the pixels is conventionally known as the resolution.

• For example, 1080 x 720 pixels is a resolution giving the number of pixels along the width and height of that
picture.

• A megapixel is a million pixels.
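
The arithmetic behind resolution and megapixels can be sketched in a few lines of Python. This is only an illustration using the 1080 x 720 example above; the variable names are our own.

```python
# Resolution example from the notes: 1080 pixels wide, 720 pixels high.
width, height = 1080, 720

# Total number of pixels is width multiplied by height.
total_pixels = width * height

# A megapixel is a million pixels.
megapixels = total_pixels / 1_000_000

print(total_pixels)  # 777600
print(megapixels)    # 0.7776
```

So an image of 1080 x 720 pixels is a little under one megapixel.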

3. Pixel value

• The pixel value represents the brightness of the pixel.

• The range of a pixel value is 0-255 (255 = 2^8 - 1),

• where 0 is taken as black or no colour and 255 is taken as white.

Why do we have a value of 255?

• In computer systems, data is stored in the form of ones and zeros, which we call the
binary system.

• Each bit in a computer system can have either a zero or a one.

• Each pixel uses 1 byte of an image, which is equivalent to 8 bits of data.

• Since each bit can have two possible values, the 8 bits together can have 2^8 = 256
possible values, which start from 0 and end at 255.
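
The counting argument above can be checked with a short Python sketch. The variable names are chosen here only for illustration.

```python
# One pixel of a grayscale image uses 1 byte = 8 bits.
bits_per_pixel = 8

# Each bit has 2 possible values, so 8 bits give 2^8 combinations.
possible_values = 2 ** bits_per_pixel

# Counting starts at 0, so the largest value is one less than the count.
max_value = possible_values - 1

print(possible_values)  # 256
print(max_value)        # 255

# The smallest and largest pixel values written as 8 binary digits.
print(format(0, "08b"))          # 00000000
print(format(max_value, "08b"))  # 11111111
```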


4. Grayscale Images

• Grayscale images are images which have a range of shades of gray without apparent color.

• The darkest possible shade is black, which is the total absence of color or zero value of pixel.

• The lightest possible shade is white, which is the total presence of color or 255 value of a pixel.

• A grayscale image has each pixel of size 1 byte and is stored as a single plane, that is, one 2D array of pixels.

• The size of a grayscale image is defined as the Height x Width of that image.

Here is an example of a grayscale image.

As you can check, the values of the pixels are within the range 0-255. Computers store the images we see
in the form of these numbers.
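
As a rough illustration, a tiny grayscale image can be written in Python as a list of rows, where each number is one pixel value. The pixel values below are made up for this sketch and do not come from any real picture.

```python
# A made-up grayscale image: 2 rows (height) and 3 columns (width).
# Each number is one pixel: 0 is black, 255 is white.
image = [
    [0,  64, 128],
    [64, 128, 255],
]

height = len(image)     # number of rows
width = len(image[0])   # number of pixels in each row

# Every pixel value must lie in the range 0-255.
all_valid = all(0 <= pixel <= 255 for row in image for pixel in row)

# One byte per pixel, so the size in bytes is Height x Width.
size_in_bytes = height * width

print(height, width)   # 2 3
print(all_valid)       # True
print(size_in_bytes)   # 6
```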

5. RGB Images

• All the images that we see around are coloured images.

• These images are made up of three primary colours: Red, Green and Blue.

• All the colours that are present can be made by combining different intensities of red, green
and blue.

How do computers store RGB images?

• Every RGB image is stored in the form of three different channels called the R channel, G
channel and the B channel.

• Each plane separately has a number of pixels with each pixel value varying from 0 to 255.

• All the three planes when combined together form a colour image.

• This means that in an RGB image, each pixel has a set of three different values which together
give colour to that particular pixel.


• Example: Here, each colour image is stored in the form of three different channels, each
having a different intensity. All three channels combine to form the colour we see.

• If we split the image into three different channels, namely Red (R), Green (G) and Blue (B), the
individual layers will have the following intensity of colours of the individual pixels.

• These individual layers, when stored in memory, look like the image on the extreme right.

• The layers look like grayscale images because each pixel has an intensity value from 0 to 255
and, as studied earlier, 0 is considered black or no presence of colour and 255 means white
or full presence of colour.

• These three individual RGB values, when combined, form the colour of each pixel.

• Therefore, each pixel in the RGB image has three values to form the complete colour.
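
The idea of splitting an RGB image into three channels and combining them again can be sketched as follows. This is a minimal illustration with a made-up 2 x 2 image; the function names `split_channels` and `merge_channels` are our own and are not part of any library.

```python
# A made-up 2 x 2 RGB image. Each pixel is a (R, G, B) set of three values.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],        # red pixel, green pixel
    [(0, 0, 255), (255, 255, 255)],    # blue pixel, white pixel
]

def split_channels(image):
    """Return the R, G and B planes as three separate 2D lists."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue

def merge_channels(red, green, blue):
    """Combine three planes back into one image of (R, G, B) pixels."""
    return [
        [(red[y][x], green[y][x], blue[y][x]) for x in range(len(red[0]))]
        for y in range(len(red))
    ]

red, green, blue = split_channels(rgb_image)
print(red)    # [[255, 0], [0, 255]]
print(green)  # [[0, 255], [0, 255]]
print(blue)   # [[0, 0], [255, 255]]

# Combining the three planes gives back the original colour image.
print(merge_channels(red, green, blue) == rgb_image)  # True
```

Each plane on its own holds only one number per pixel, which is why a single channel looks like a grayscale image.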

IMAGE FEATURE

• A feature is a descrip on of an image.

• Features are the specific structures in the image such as points, edges or objects.

• Other examples of features are related to motion in image sequences, or to shapes
defined in terms of curves or boundaries between different image regions.

• In image processing, we can get a lot of features from the image.

• It can be either a blob, an edge or a corner.

• These features help us to perform various tasks and then get the analysis done on the basis of
the application.

• Example: Try to find the exact location.


What are good features to be used?

• Features containing corners are easy to find, as they can be found only at a particular
location in the image, whereas edges, which are spread along a line, look the
same all along.

• As the example showed, corners are good features to extract from
an image, followed by edges.
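
One way to see why corners are easier to locate than edges is to count how many places in an image look exactly like a small patch. The sketch below is only an illustration: the 8 x 8 image (a white square on a black background), the 2 x 2 patches and the helper names `make_image` and `count_matches` are all assumptions made for this example.

```python
def make_image(size=8, low=2, high=5):
    """Black image (0) with a white square (255) in rows/columns low..high."""
    return [
        [255 if low <= r <= high and low <= c <= high else 0 for c in range(size)]
        for r in range(size)
    ]

def count_matches(image, patch):
    """Count the positions where the patch matches the image exactly."""
    patch_h, patch_w = len(patch), len(patch[0])
    count = 0
    for r in range(len(image) - patch_h + 1):
        for c in range(len(image[0]) - patch_w + 1):
            window = [row[c:c + patch_w] for row in image[r:r + patch_h]]
            if window == patch:
                count += 1
    return count

image = make_image()

flat_patch = [[0, 0], [0, 0]]        # plain background
edge_patch = [[0, 0], [255, 255]]    # part of the top edge of the square
corner_patch = [[0, 0], [0, 255]]    # top-left corner of the square

flat_count = count_matches(image, flat_patch)
edge_count = count_matches(image, edge_patch)
corner_count = count_matches(image, corner_patch)

print(flat_count, edge_count, corner_count)  # 24 3 1
```

The flat patch fits in many places and the edge patch fits anywhere along the edge, but the corner patch fits in exactly one location.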

INTRODUCTION TO OpenCV:

• Now that we have learnt about image features and their importance in image processing, we will learn
about a tool we can use to extract these features from our image for further processing.

• OpenCV or Open-Source Computer Vision Library is that tool which helps a computer extract
these features from the images.

• It is used for all kinds of images and video processing and analysis.

• It is capable of processing images and videos to identify objects, faces, or even handwriting.

• We will use OpenCV for basic image processing operations on images such as resizing, cropping
and many more.

• To install the OpenCV library, open Anaconda Prompt and then run the following command:

pip install opencv-python


Convolution Neural Networks (CNN):

• Can you guess the layers:

What is a Convolutional Neural Network?

• A Convolutional Neural Network (CNN) is a Deep Learning algorithm

• which can take in an input image, assign importance (learnable weights and biases) to various
aspects/objects in the image and be able to differentiate one from the other.

The process of deploying a CNN is as follows:

We give an input image, which is processed through a CNN, which then gives a prediction on the basis
of the labels given in the particular dataset.

The different layers of a Convolutional Neural Network (CNN) are as follows:


A convolutional neural network consists of the following layers:

1) Convolution Layer

2) Rectified Linear Unit (ReLU)

3) Pooling Layer

4) Fully Connected Layer

1. Convolution Layer

• It is the first layer of a CNN.

• The objective of the Convolution Operation is to extract features, such
as edges, from the input image.

• A CNN need not be limited to only one Convolutional Layer.

• Conventionally, the first Convolution Layer is responsible for capturing the Low-Level
features such as edges, colour, gradient orientation, etc.

• With added layers, the architecture adapts to the High-Level features as well, giving
us a network which has a wholesome understanding of the images in the dataset.

• It uses the convolution operation on the images.

• In the convolution layer, there are several kernels that are used to produce several
features.

• The output of this layer is called the feature map.

• A feature map is also called the activation map.

There are several uses we derive from the feature map:

• We reduce the image size so that it can be processed more efficiently.

• We only focus on the features of the image that can help us in processing the image further.

For example, you might only need to recognize someone’s eyes, nose and mouth to recognize the
person. You might not need to see the whole face.
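
The convolution operation described above can be sketched in plain Python. This is a simplified illustration, not how OpenCV or a deep learning library implements it: the 4 x 4 image, the 2 x 2 kernel values and the function name `convolve` are assumptions chosen for the example. In a real CNN the kernel values are learnt during training.

```python
def convolve(image, kernel):
    """Slide the kernel over the image and return the feature map.

    As is common in CNNs, the kernel is applied without flipping it.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply each kernel value with the pixel under it and add up.
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Made-up 4 x 4 grayscale image: black on the left, white on the right.
image = [[0, 0, 255, 255] for _ in range(4)]

# Hypothetical kernel that responds where brightness rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve(image, kernel)
for row in feature_map:
    print(row)
# [0, 510, 0]
# [0, 510, 0]
# [0, 510, 0]
```

The 4 x 4 image becomes a smaller 3 x 3 feature map, and the large values in the middle column mark where the vertical edge is.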


2. Rectified Linear Unit Function

• The next layer in the Convolution Neural Network is the Rectified Linear Unit function
or the ReLU layer.

• After we get the feature map, it is passed on to the ReLU layer.

• This layer simply replaces all the negative numbers in the feature map with zero and lets the
positive numbers stay as they are.

• Passing the feature map through the ReLU layer introduces non-linearity in the feature
map.

Let us see it through a graph.

• If we see the two graphs side by side, the one on the left is a linear graph.

• This graph, when passed through the ReLU layer, gives the one on the right.

• The ReLU graph is a horizontal straight line at zero for negative inputs and then increases linearly for
positive inputs.

As shown in the above convolved image, there is a smooth grey gradient change from black to white.
After applying the ReLU function, we can see a more abrupt change in colour, which makes the edges
more obvious. This acts as a better feature for the further layers in a CNN, as it enhances the activation
layer.
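
The ReLU step is simple enough to write in a couple of lines. The sketch below uses a made-up feature map; the function name `relu` is our own choice.

```python
def relu(feature_map):
    """Replace every negative value with 0 and keep the rest unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]

# A made-up feature map containing negative and positive values.
feature_map = [
    [-510, 0, 510],
    [120, -30, 0],
]

activated = relu(feature_map)
print(activated)  # [[0, 0, 510], [120, 0, 0]]
```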


3. Pooling Layer

• The Pooling layer is responsible for reducing the spatial size of the Convolved Feature
while still retaining the important features.

• A type of pooling which can be performed on an image is:

• Max Pooling: Max Pooling returns the maximum value from the portion of the image
covered by the Kernel.

• The pooling layer is an important layer in the CNN as it performs a series of tasks, which
are as follows:
a. Makes the image smaller and more manageable.
b. Makes the image more resistant to small transformations, distortions and
translations in the input image.

A small difference in the input image will create a very similar pooled image.
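
Max Pooling can be sketched as follows. This is a minimal illustration assuming a 2 x 2 window that moves 2 pixels at a time (the window size and step are our assumptions, and the feature map values are made up); the function name `max_pool` is our own.

```python
def max_pool(feature_map, size=2):
    """Keep only the largest value from each size x size block."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# A made-up 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 8],
    [1, 1, 7, 5],
]
pooled = max_pool(feature_map)
print(pooled)  # [[6, 2], [2, 9]]

# The same feature map with one small change (3 becomes 5 in the first row).
changed = [row[:] for row in feature_map]
changed[0][1] = 5
print(max_pool(changed))  # [[6, 2], [2, 9]]
```

The 4 x 4 feature map shrinks to 2 x 2, and the small change in the input does not change the pooled result at all.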

4. Fully Connected Layer

• The final layer in the CNN is the Fully Connected (FC) Layer.

• The objective of a fully connected layer is to take the results of the
convolution/pooling process and use them to classify the image into a label.

• The output of convolution/pooling is flattened into a single vector of values, each
representing a probability that a certain feature belongs to a label.

• For example, if the image is of a cat, features representing things like whiskers or fur
should have high probabilities for the label “cat”.
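
The flatten-and-classify step can be sketched as below. This is a rough illustration only: the pooled values, the two labels, the weights and the biases are all hypothetical numbers picked for the example (a real network learns its weights from the dataset), and softmax is used here as one common way, assumed for this sketch, of turning scores into probabilities.

```python
import math

def flatten(feature_map):
    """Turn a 2D feature map into a single vector of values."""
    return [value for row in feature_map for value in row]

def fully_connected(inputs, weights, biases):
    """Compute one score per label: weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(label_weights, inputs)) + bias
        for label_weights, bias in zip(weights, biases)
    ]

def softmax(scores):
    """Convert scores into probabilities that add up to 1."""
    largest = max(scores)
    exps = [math.exp(score - largest) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "dog"]

# Hypothetical pooled output and hypothetical (untrained) weights.
pooled = [[1.0, 0.5], [0.0, 2.0]]
weights = [
    [0.2, 0.1, 0.0, 0.3],   # weights for the label "cat"
    [0.1, 0.0, 0.2, 0.1],   # weights for the label "dog"
]
biases = [0.0, 0.0]

features = flatten(pooled)
scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
predicted = labels[probabilities.index(max(probabilities))]

print(features)   # [1.0, 0.5, 0.0, 2.0]
print(predicted)  # cat
```

The label with the highest probability is taken as the prediction for the image.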


Final output

Conclusion
