MVS_Expt8 Object Detection and Reconstruction Using CNN

Machine vision system research and practical conduction

JSPM’s

RAJARSHI SHAHU COLLEGE OF ENGINEERING


TATHAWADE, PUNE-33
(An Autonomous Institute Affiliated to Savitribai Phule Pune
University, Pune)
DEPARTMENT OF AUTOMATION AND ROBOTICS

MVS Experiment No. 8 Object Detection and Reconstruction Using CNN

Aim: Object Detection and Reconstruction Using CNN


Theory:

Introduction

In this practical we implement two essential computer vision tasks using a
Convolutional Neural Network (CNN). The first task involves object detection using a CNN
model trained on the CIFAR-10 dataset.

Convolutional Neural Network (CNN)


 A Convolutional Neural Network (CNN) is a type of artificial neural network (ANN)
that is specifically designed to handle image data. CNNs are inspired by the structure
of the human visual cortex and have a hierarchical architecture that allows them to
extract features from images at different scales.
 CNNs use a series of convolutional layers to extract features from images. Each
convolutional layer applies a set of filters to its input, and the output of each filter is a
feature map. Pooling layers, usually interleaved with the convolutional layers, reduce
the spatial size of the feature maps. Finally, the resulting features are fed into one or
more fully connected layers, which produce the final output of the network.
 A CNN typically consists of three main types of layers:

 Convolutional layer: The convolutional layer applies filters to the input
image to extract local features.
 Pooling layer: The pooling layer reduces the spatial size of the feature maps
generated by the convolutional layer.
 Fully connected layer: The fully connected layer introduces a more
traditional neural network architecture, where each neuron is connected to
every neuron in the previous layer.
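
The three layer types above can be illustrated on a toy input. The following is a minimal NumPy sketch, not the Keras implementation: the 6x6 "image" and the vertical-edge kernel are made-up values, and the helper names conv2d_valid and max_pool_2x2 are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1.

    Like deep learning frameworks, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    blocks = fmap[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 "image" (assumption)
kernel = np.array([[1.0, 0.0, -1.0]] * 3)         # hypothetical vertical-edge filter

feature_map = conv2d_valid(image, kernel)  # convolutional layer: shape (4, 4)
pooled = max_pool_2x2(feature_map)         # pooling layer: shape (2, 2)
flat = pooled.reshape(-1)                  # 1D vector a fully connected layer would consume

print(feature_map.shape, pooled.shape, flat.shape)
```

A real convolutional layer learns many such kernels and applies them across all input channels; the loop version here is only meant to make the sliding-window idea visible.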

 There are many popular tools and frameworks for developing CNNs, including:

 TensorFlow: An open-source software library for deep learning developed by
Google.
 PyTorch: An open-source deep learning framework developed by Facebook (now
Meta).
 MXNet: An open-source deep learning framework developed under the Apache
Software Foundation.
 Keras: A high-level deep learning API for Python. It ships with TensorFlow as
tf.keras, and Keras 3 can also run on JAX or PyTorch backends.

Object Detection Using CNN

1. Objective: Object detection aims to identify and localize objects within an image.
Note that the CIFAR-10 model built here assigns a single class label to the whole
image; it does not output bounding boxes.
2. Implementation: We use TensorFlow and Keras to implement the object detection task.
The CIFAR-10 dataset, which consists of 60,000 32x32 colour images in 10 classes,
will serve as the dataset.
3. Model Architecture: We construct a CNN model comprising convolutional layers,
max pooling layers, dense layers and a softmax output layer.
4. Training and Evaluation: The model will be trained using the Adam optimizer
(as in the code below) and categorical cross-entropy loss. Training will be
conducted for 10 epochs with a batch size of 32.
5. Prediction and Visualization: After training we'll load an external image and make
predictions using the trained model.

Steps Involved

1. Import Necessary Libraries

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import cv2  # imported but not directly used in this code

 TensorFlow: For deep learning operations and building the CNN model.
 NumPy: For numerical operations.
 Matplotlib: For visualizing the image and predictions.
 OpenCV: Imported but not directly used in this code.

2. Define Class Names


class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

 These are the 10 class labels corresponding to the CIFAR-10 dataset.

3. Load the CIFAR-10 Dataset


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

 CIFAR-10 is a dataset containing 60,000 32x32 color images in 10 classes, with
50,000 for training and 10,000 for testing.

4. Preprocess the Data


x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

 Normalization: Divide pixel values by 255 to scale them to the range [0, 1].
 One-Hot Encoding: Convert class labels into a binary matrix for compatibility with
the categorical cross-entropy loss function.
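
To make the two preprocessing steps concrete, here is a small NumPy sketch of what they do to the data. It is an illustration only: the sample pixel values and labels are made up, and the np.eye indexing stands in for to_categorical rather than reproducing its implementation.

```python
import numpy as np

# Normalization: raw 8-bit pixel values become floats in [0, 1].
pixels = np.array([[0, 128, 255]], dtype=np.uint8)  # made-up sample values
scaled = pixels.astype("float32") / 255

# One-hot encoding: CIFAR-10 labels arrive as integers with shape (N, 1).
labels = np.array([[3], [0], [9]])                  # made-up labels: Cat, Airplane, Truck
one_hot = np.eye(10, dtype="float32")[labels.reshape(-1)]  # shape (N, 10)

print(scaled)
print(one_hot)
```

Each row of one_hot contains a single 1 at the index of the true class, which is the target format categorical cross-entropy expects.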

5. Define the Model Architecture


model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

 A Sequential Model is used to stack layers linearly:


o Conv2D Layers: Extract features from images using convolutional filters.
o MaxPooling2D Layers: Reduce spatial dimensions while retaining key
features.
o Flatten Layer: Converts 2D feature maps to a 1D vector for the dense layer.
o Dense Layers: Fully connected layers for classification.
o Softmax Layer: Outputs probabilities for each of the 10 classes.
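
The feature-map sizes and parameter counts of this architecture can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults used in the model above (no padding and stride 1 for Conv2D, non-overlapping 2x2 windows for MaxPooling2D); the helper names conv_out and pool_out are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (floor division)."""
    return size // pool

size, channels = 32, 3   # CIFAR-10 input: 32x32 pixels, 3 colour channels
params = []

# (filters, followed_by_pooling) for the three Conv2D layers of the model
for filters, pooled in [(32, True), (64, True), (64, False)]:
    params.append(3 * 3 * channels * filters + filters)  # weights + biases
    size, channels = conv_out(size), filters
    if pooled:
        size = pool_out(size)

flat_units = size * size * channels      # length of the Flatten output
params.append(flat_units * 64 + 64)      # Dense(64)
params.append(64 * 10 + 10)              # Dense(10)
total_params = sum(params)

print(size, flat_units, params, total_params)
# prints: 4 1024 [896, 18496, 36928, 65600, 650] 122570
```

The spatial size shrinks 32 -> 30 -> 15 -> 13 -> 6 -> 4, so the Flatten layer hands 4 x 4 x 64 = 1024 values to the first dense layer. These figures can be compared against the output of model.summary().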

6. Compile the Model


model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])

 Optimizer: Adam optimizer adjusts weights to minimize the loss function.


 Loss Function: Categorical cross-entropy computes the difference between predicted
and actual class probabilities.
 Metrics: Accuracy is used to evaluate model performance.
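
For a single sample with one-hot label y and predicted probabilities p, categorical cross-entropy is -sum(y_i * log(p_i)), which reduces to the negative log of the probability assigned to the true class. The plain-Python sketch below illustrates this on a made-up 3-class case; it is not the Keras implementation, and the eps clipping value is an assumption added to avoid log(0).

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0, 0, 1]              # one-hot label: true class is index 2
confident = [0.05, 0.05, 0.90]  # made-up prediction, mostly correct
unsure = [0.40, 0.30, 0.30]     # made-up prediction, mostly wrong

loss_confident = categorical_cross_entropy(y_true, confident)
loss_unsure = categorical_cross_entropy(y_true, unsure)

print(round(loss_confident, 4), round(loss_unsure, 4))
```

The confident, correct prediction gives a much smaller loss than the unsure one, which is the signal the optimizer uses to adjust the weights.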

7. Train the Model


model.fit(x_train, y_train, epochs=10, batch_size=32,
validation_data=(x_test, y_test))

 Training Data: x_train and y_train.


 Validation Data: x_test and y_test.
 Epochs: Number of complete passes through the training data.
 Batch Size: Number of samples processed at a time during training.
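
The relationship between epochs, batch size and the number of weight updates can be checked with a little arithmetic. This sketch assumes the 50,000 CIFAR-10 training images and the fit() settings above, and that the final, smaller batch of each epoch is still processed.

```python
import math

num_train = 50_000   # CIFAR-10 training images
batch_size = 32
epochs = 10

steps_per_epoch = math.ceil(num_train / batch_size)   # batches per epoch
total_updates = steps_per_epoch * epochs               # weight updates overall
last_batch = num_train - (steps_per_epoch - 1) * batch_size  # size of the final batch

print(steps_per_epoch, total_updates, last_batch)
# prints: 1563 15630 16
```

The 1563 figure is the step count Keras shows in its progress bar for each epoch with these settings.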

8. Evaluate the Model


score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

 Computes the loss and accuracy on the test dataset.

9. Load and Preprocess an External Image


image_path = r"D:\pract4 mvs\airplane.jpeg"
img = tf.keras.preprocessing.image.load_img(image_path, target_size=(32, 32))
x = tf.keras.preprocessing.image.img_to_array(img)
x = x.astype('float32') / 255  # same scaling as the training data
x = np.expand_dims(x, axis=0)

 Load an external image and resize it to match the model's input dimensions (32x32).
 Convert the image to an array, scale it to [0, 1] as in step 4, and add a batch
dimension using np.expand_dims.
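
The effect of these steps on the array shape can be seen without an image file. The sketch below uses a random array as a stand-in for a loaded 32x32 RGB image (an assumption for illustration) and applies the same [0, 1] scaling as the training data in step 4, since the model was trained on scaled inputs.

```python
import numpy as np

# Stand-in for img_to_array(img): random 8-bit values, shape (height, width, channels).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(32, 32, 3)).astype("float32")

x = fake_image / 255           # match the training-time scaling
x = np.expand_dims(x, axis=0)  # add the batch dimension

print(fake_image.shape, x.shape)
```

model.predict expects a batch of images, so a single image has to be wrapped into a batch of size 1, giving shape (1, 32, 32, 3).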

10. Make Predictions


preds = model.predict(x)
pred_class = np.argmax(preds, axis=1)

 Predict: Use the trained model to get class probabilities for the input image.

 Class Label: Use np.argmax to extract the index of the highest probability,
corresponding to the predicted class.
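
To show how the softmax output maps to a class label, the following sketch applies np.argmax to a hand-written probability vector. The values in preds are made up for illustration and are not actual model output; the top-3 listing is an optional extra, not part of the original code.

```python
import numpy as np

class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
               'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

# Made-up softmax output for a batch of one image, shape (1, 10).
preds = np.array([[0.70, 0.02, 0.10, 0.03, 0.01,
                   0.02, 0.01, 0.01, 0.08, 0.02]])

pred_class = np.argmax(preds, axis=1)   # index of the highest probability per image
label = class_names[pred_class[0]]
confidence = float(preds[0, pred_class[0]])

# Optional: the three most probable classes, highest first.
top3 = [class_names[i] for i in np.argsort(preds[0])[::-1][:3]]

print(label, confidence, top3)
```

Looking at the top few classes and their probabilities is often more informative than the single argmax label, especially for low-resolution inputs.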

11. Visualize the Prediction


plt.imshow(img)
plt.axis('off')
plt.title(class_names[pred_class[0]])
plt.show()

 Display the input image and show the predicted class label as the plot title using Matplotlib.

Applications:

1. Object Detection: Real-time object detection for applications such as surveillance
and autonomous vehicles.
2. Image Reconstruction: Denoising and enhancing low-quality medical images to aid in
diagnosis and treatment planning.

Conclusion:
Code:
Output:
