Vehicle Detection and Classification Using Deep Neural Networks

Shuva Chowdhury, Shithi Chowdhury, Jeba Tahsin Ifty, Riasat Khan
Electrical and Computer Engineering, North South University, Dhaka, Bangladesh
2022 International Conference on Electrical and Information Technology (IEIT) | 978-1-6654-5303-5/22/$31.00 ©2022 IEEE | DOI: 10.1109/IEIT56384.2022.9967885

Abstract—Large urban cities face significant challenges with manual traffic control due to overpopulation, traffic congestion and land shortage. These cities have begun implementing intelligent transportation systems that employ automatic and efficient traffic monitoring and management. Poor traffic conditions lead to many minor incidents on the streets, e.g., violating traffic rules, overtaking, high-speed and intoxicated driving, exceeding the speed limit, and unsafe lane changes. Sometimes these incidents provoke collisions, accidents, crimes, loss of time and human fatalities. For this reason, automatic vehicle detection and classification are essential to predict traffic congestion levels, support lane control and enhance road safety and security. This automated process can address economic, social, and environmental concerns and affect everyday living. In this work, automatic vehicle detection and classification have been implemented using various deep neural network frameworks, i.e., VGG16, VGG19 and YOLOv5. We used transfer learning algorithms built on convolutional neural networks for native vehicle identification and classification. This research used a Bangladeshi vehicle image dataset containing 9,058 images. Next, we employed the Keras “ImageDataGenerator” data preprocessing and augmentation framework with the deep learning networks VGG16, VGG19, and YOLOv5. Finally, the proposed VGG16 and VGG19 models' accuracies for ten classes are 75.15% and 79.57%, respectively. The YOLOv5 architecture attained the best performance with 83.02% accuracy for fifteen vehicle classes.

Keywords— Convolutional neural network, vehicle detection, classification, deep learning, ImageDataGenerator, VGG16, VGG19, YOLOv5

I. INTRODUCTION

Almost every large urban city now has an intelligent traffic surveillance system to monitor transportation incidents and traffic [1]. Modern surveillance systems are utilized for various purposes, including real-time vehicle tracking, traffic and lane occupancy monitoring, and observing congestion levels [2]. Automating vehicle classification and recognition is among the most critical challenges in modern traffic management and intelligent transportation systems [7]. Various methods for vehicle categorization have been built on complex sensors, with some succeeding only under constrained circumstances. Recently, computer vision and artificial intelligence techniques have been used for detecting vehicles in camera images [8]. This study describes a deep learning-based automatic vehicle detection and classification system. These techniques often employ various deep learning approaches for effectively identifying and categorizing different modes of transportation [3]. One such method is the convolutional neural network (CNN) [4]. A CNN is a deep learning architecture belonging to the neural network family. Because of its effectiveness, the technique is well known in image recognition.

CNNs are widely used in image processing to detect and classify vehicles in images. Some recent works on automatic vehicle detection and classification are described briefly in the following paragraphs.

In [9], the authors applied a CNN structure as a classifier for both vehicle type and color classification. The suggested solution requires one input, i.e., a car image sent into the system. The original automobile image is resized to 32×32 pixels using TensorFlow’s resize function. The training set comprises 686 photos, and the testing set includes 228 pictures. The proposed method has a color classification accuracy of 70.09 percent and a vehicle classification accuracy of 81.62 percent.

In [10], Z. Zhang and his coauthors compare a deep neural network (DNN) and a conventional neural network (NN) for vehicle categorization. There are 150 photographs in the data collection, all of which have been preprocessed. The dataset is randomly partitioned into a training set with 80% and a testing set with 20% of the data. In the proposed deep neural network, sigmoid (S-shaped) functions are employed as activation functions in the hidden layers, and the backpropagation method is used in the training phase. The DNN has a clear benefit over traditional single-layer neural networks, including a lower error rate. Finally, the DNN and NN frameworks achieved vehicle classification accuracies of 96.66% and 93.33%, respectively.

In [11], the authors present a deep learning-based vehicle type classification system that uses their private data. Data augmentation and convolutional neural networks are the two strategies that make up the proposed system. There are 2,400 samples in this dataset, which is split into four groups, i.e., school bus, ambulance, police car, and public bus of the Moroccan Transport Company. The mean precision and recall of the CNN model for the four classes are 0.89 and 0.83, respectively. The authors intend to add automated vehicle number plate recognition, vehicle estimation, and traffic congestion sensing components.

In [12], the authors utilized a dataset of vehicle front-view pictures from surveillance cameras for automatic vehicle classification. This work used the ResNet-50 architecture as the foundation model for image classification. The authors divided the dataset into five categories of automobiles. The dataset is partitioned into 80 percent training and 20 percent validation. In their trials, the proposed system outperformed ResNet-50, the VGG16 network, and a CNN in vehicle type classification, with an accuracy of 96.26 percent.

In [5], the authors proposed a customized CNN method with high accuracy for automatic vehicle classification. The car images are split into two groups, i.e., training and validation datasets, with a ratio of 7:3. The proposed system uses data augmentation approaches and a shallow CNN framework. The implemented shallow CNN-based automobile model obtains 92.341 percent classification accuracy, whereas the decision tree (DT) and random forest (RF) models attain 78.380 and 78.821 percent accuracy, respectively.

In [6], the authors introduced a new vehicle dataset of six categories of local vehicles in Pakistan. All pre-trained models are fine-tuned using the self-constructed dataset. The dataset consists of 10,000 photos that have been divided into six categories, each with 1,670 images. On the self-constructed and VeRi datasets, the suggested customized CNN classification system reached accuracies of 99.68 percent and 97.66 percent, respectively.

We can conclude from the above literature review that many important works on vehicle detection and classification have been performed. The convolutional neural network has been successfully employed in many studies for automated vehicle detection and categorization. A broad spectrum of deep learning techniques, including CNN, VGG16, VGG19, ResNet and YOLO models, have been used for recognizing and classifying automobiles from photos and real-time video sequences.

In this work, an automatic vehicle detection and classification system has been developed using three deep neural network models, VGG16, VGG19, and YOLOv5. The dataset contains 9,058 photographs of primarily Bangladeshi vehicles. This dataset has 13 different classes, i.e., bicycle, boat, bus, car, CNG, easy-bike (special type of battery-powered vehicles for rural areas), horse cart, wheelbarrow, Leguna, motorcycle, rickshaw (human-operated vehicle with two wheels), van and truck or lorry. The Keras data preprocessing tool “ImageDataGenerator” has been used for image augmentation. The performance of the proposed multi-class deep learning models is evaluated in terms of classification accuracy and loss.

The proposed system is discussed in Section II with suitable tables, figures, and flowcharts. In Section III, the most important results of the research are presented. Finally, Section IV concludes the study with suggestions on how to improve this work in the future.

II. PROPOSED SYSTEM

A. Dataset
In this work, the Bangladeshi Poribohon-BD native vehicle dataset from Mendeley [13] and a Kaggle open-source dataset have been used. We combined the two datasets to create a larger dataset with more categories and variations. Next, the entire image set is divided into two sets, training and testing. The overall number of photos in the combined dataset is 8,059. There are 6,441 photos for training and 1,618 images for testing. This dataset has 13 different classes of vehicles.

TABLE I. NUMBER OF TRAINING AND TEST IMAGES FOR VARIOUS CLASSES OF THE EMPLOYED DATASET
Vehicle        Train images   Test images
Bicycle        565            142
Bike           691            173
Bus            361            91
Car            566            142
CNG            426            107
Easy bike      492            124
Horse cart     204            52
Leguna         174            44
Rickshaw       396            99
Truck          584            146
Van            491            123
Wheelbarrow    189            48
Total          6,441          1,618

Table I shows the number of vehicle photographs in the training and testing sets used in this work.

Fig. 1. Sample photographs of various vehicles of the used dataset.

Fig. 1 shows sample photographs of different types of vehicles from the dataset used in this work.

B. Data Preprocessing
Image data augmentation is a method of automatically generating modified versions of images that enhances the performance of deep learning models [14]. The Keras “ImageDataGenerator” package applies transformations to images in real time while the model is being trained [15]. While an image is being provided to the model, random alterations can be applied to it. Because the augmented images are generated on the fly, batch by batch, the augmented dataset does not have to be stored, which reduces memory consumption. The modified images also boost the fitted models’ ability to generalize to new photographs. Models can be fitted with image data augmentation using the ImageDataGenerator framework.
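A minimal sketch of how such an on-the-fly augmentation pipeline might be configured, using the parameter values reported in Table II. The function names, the directory arguments and the one-subfolder-per-class layout are assumptions made for illustration, as is the choice to apply only rescaling to the test images; the paper does not describe these details. `ImageDataGenerator` is assumed to be available, as it is in TensorFlow 2.x releases that still ship this now-deprecated class.

```python
# Augmentation settings taken from Table II of the paper.
AUGMENTATION_PARAMS = {
    "rescale": 1.0 / 255,      # map pixel values from [0, 255] to [0, 1]
    "shear_range": 0.20,
    "zoom_range": 0.20,
    "horizontal_flip": True,
}

# Settings passed when reading images from disk (also from Table II).
FLOW_PARAMS = {
    "target_size": (224, 224),
    "batch_size": 64,
    "class_mode": "categorical",  # one-hot labels for multi-class training
}


def make_train_iterator(train_dir):
    """Return an iterator yielding randomly augmented training batches.

    `train_dir` is a hypothetical folder with one subfolder per vehicle class.
    """
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION_PARAMS)
    return datagen.flow_from_directory(train_dir, **FLOW_PARAMS)


def make_test_iterator(test_dir):
    """Return an iterator over test images with rescaling only (an assumption)."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=AUGMENTATION_PARAMS["rescale"])
    return datagen.flow_from_directory(test_dir, shuffle=False, **FLOW_PARAMS)
```

Each call to the training iterator produces a fresh batch of 64 transformed images, so no augmented copies need to be written to disk.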

TABLE II. DATA AUGMENTATION PARAMETERS AND THEIR VALUES
Parameter Value
Rescale 1.0/255
Shear range 0.20
Zoom range 0.20
Horizontal flip True
Batch size 64
Target size (224,224)
Class mode categorical

Table II shows the values of the parameters used in the data augmentation process. The batch size is set to its largest value, 64, in this case. The target size is the same as the model input size. The shear and zoom range parameters are set to 0.20, a good starting point for most models. Horizontal flipping of the images is also enabled in this step.

Fig. 3. YOLOv5 architecture diagram.
C. Deep Learning Models
In this research, three deep learning frameworks, VGG16, VGG19 and YOLOv5, have been used for automatic vehicle detection and classification. All three are built on convolutional neural networks, a type of deep learning model designed primarily to interpret pixel input. These techniques are typically utilized in image processing and recognition. A CNN resembles a multilayer perceptron but is designed to keep processing requirements low. Convolutional neural networks are made up of many layers. Each neuron’s output is determined by its corresponding weights. When the pixel values are given, the artificial neurons in a convolutional neural network pick out various visual features. In the subsequent paragraphs, the employed deep learning techniques are briefly described.
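To illustrate how weighted neurons can pick out a visual feature from pixel values, the following is a minimal pure-Python sketch of a single convolution filter. The 4×4 image patch and the hand-written edge kernel are hypothetical; in a trained CNN the kernel weights are learned rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    As in most deep learning libraries, this is cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Hypothetical grayscale patch: dark left half (0), bright right half (1).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written kernel that responds to dark-to-bright vertical edges.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

response = conv2d_valid(patch, edge_kernel)
# The middle column of `response` is 2.0 (the edge); the others are 0.0.
```

The filter responds only where the brightness changes, which is the sense in which a convolutional layer extracts a feature such as an edge.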
VGG16: The VGG16 architecture achieved top results in the ILSVRC 2014 competition. It has 16 weight layers, i.e., 13 convolution layers and three dense layers, interleaved with max-pooling layers [16]. Weights pre-trained on ImageNet are used to initialize the model for transfer learning. Fig. 2 shows the VGG16 architecture.

VGG19: VGG19 is an architecture that was introduced in 2014 [17]. It has 19 weight layers. Weights pre-trained on ImageNet are likewise used in this framework.

Fig. 2. VGG16 and VGG19 architecture.

YOLOv5: It is a pre-trained object detection model which is entirely open-source for research and development. It uses a grid approach to divide images [18]. Each grid cell is in charge of detecting objects within that cell. Fig. 3 shows the YOLOv5 architecture diagram.

Fig. 4. Working sequences of the proposed method.

Lastly, the working sequence of the proposed automatic vehicle detection and classification system is shown in Fig. 4. After preprocessing, the images are fed to the classifier (VGG16, VGG19 or YOLOv5) to predict the classes. Finally, the output image is displayed with the predicted class.

III. RESULTS AND DISCUSSION
The VGG16 and VGG19 models are implemented using TensorFlow. The Keras functional API, which can handle shared layers, nonlinear topology, and multiple inputs and outputs, was used to create the VGG16 and VGG19 models. We also use ImageDataGenerator, an augmentation framework that applies random transformations to each training image as it is provided to the model. The Google Colab environment is used to implement all of the models in this research. It includes the Keras API, TensorFlow backend, and the essential Python 3 packages. Training uses a batch size of 64, the Adam optimizer, a categorical loss function, a shear range of 0.20, a zoom range of 0.20 and horizontal flipping. The four essential components of the employed YOLOv5 model are the input, backbone, neck, and output. Data preprocessing, such as mosaic data augmentation and adaptive image filling, is generally done at the input stage. YOLO divides an image into grid cells, each detecting objects within its own region. This approach can detect objects in real time from data streams.

VGG16 Results:

Fig. 5. Training and validation accuracies vs. epochs for the VGG16 model.

Fig. 5 displays the training and validation accuracy curves for the VGG16 model. As demonstrated in this figure, the training and validation accuracies gradually increase as the number of epochs increases.

Fig. 6. Training and validation loss curves for the VGG16 model.

Fig. 6 illustrates the training and validation loss curves for the VGG16 model. According to this figure, the training and validation losses gradually decrease as the number of epochs increases.

Fig. 7. Confusion matrix of VGG16 model.

Next, the confusion matrix of the VGG16 model for the various vehicle categories is illustrated in Fig. 7.

VGG19 Results:

Fig. 8. Training and validation accuracy curves for the VGG19 model.

Fig. 8 shows the training and validation accuracy curves for the VGG19 model.

Fig. 9. Training and validation losses curves for the VGG19 model.

Fig. 9 shows the training and validation loss curves for the VGG19 model. As expected, the training and validation accuracies gradually increase as the number of epochs increases. Conversely, the training and validation losses gradually decrease as the number of epochs increases.

Fig. 10. Confusion matrix of the VGG19 model.

Fig. 10 displays the confusion matrix of the VGG19 model.
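The loss curves in Figs. 6 and 9 track the categorical loss function named in the training setup. Assuming this is the standard categorical cross-entropy applied to softmax outputs (the paper does not spell out the formula), the quantity computed for a single image can be sketched as follows. The three-class scores are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Hypothetical scores for three classes; the true class is the first one.
logits = [2.0, 0.5, -1.0]
target = [1, 0, 0]

probs = softmax(logits)
loss = categorical_cross_entropy(target, probs)

# A prediction that favors the wrong class yields a larger loss.
wrong_loss = categorical_cross_entropy([0, 0, 1], probs)
```

The loss falls toward zero as the probability assigned to the correct class approaches 1, which is why a decreasing loss curve accompanies an increasing accuracy curve.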

YOLOv5 Results:

Fig. 11. Various performance metrics curves for the YOLOv5 model.

Fig. 11 shows the curves of various performance metrics (training loss, precision, recall, mAP, etc.) for the YOLOv5 model. As illustrated in Fig. 11, the losses gradually decrease and precision, recall and mAP improve as the number of epochs increases.

The confusion matrix of the YOLOv5 model is depicted in Fig. 12. The diagonal components indicate the correct predictions of the YOLOv5 model. For this architecture, the Easy-bike category produced the highest accuracy of 0.92. For ten classes, the accuracies of VGG16 and VGG19 are 75.15% and 79.57%, respectively. YOLOv5 achieved an accuracy of 83.02% for fifteen types of vehicles. Finally, Fig. 13 shows some test sample images of the proposed YOLOv5 model with the predicted classes and their corresponding confidence scores.

TABLE III. COMPARISON OF THE PERFORMANCE METRICS OF VARIOUS MODELS
Applied Model   Accuracy   Precision   F1 score
VGG16           75.15%     75.14%      74.54%
VGG19           79.57%     79.56%      78.86%
YOLOv5          83.02%     83%         79%

Table III shows various performance metrics of the different deep learning frameworks used in this work. According to this table, the YOLOv5 model accomplished the best performance with the highest accuracy and F1 score.
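Metrics such as the accuracy, precision and F1 score in Table III can be derived from a confusion matrix, whose diagonal holds the correct predictions. The sketch below shows one common way to do this. It assumes rows are true classes, columns are predicted classes, and per-class scores are macro-averaged; the paper does not state its averaging method, and the 3×3 matrix is a hypothetical example, not the paper's data.

```python
def metrics_from_confusion(cm):
    """Compute accuracy and macro-averaged precision, recall and F1.

    `cm[i][j]` counts samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))  # diagonal entries

    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Hypothetical confusion matrix for three classes with 10 test images each.
toy_matrix = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
scores = metrics_from_confusion(toy_matrix)
# 24 of the 30 images lie on the diagonal, so the accuracy is 0.8.
```

With imbalanced classes, macro averaging weights every class equally, so precision and F1 can differ noticeably from accuracy.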
Fig. 12. Confusion matrix of the YOLOv5 model.

Fig. 13. Test samples of proposed YOLOv5 model.

IV. CONCLUSIONS
In intelligent transportation systems, automatic vehicle detection and classification play an essential role. This study performed the automatic detection and classification of primarily Bangladeshi vehicles using three deep neural network models, VGG16, VGG19, and YOLOv5. Initially, two open-source datasets were merged to create the final dataset of various vehicle categories. Next, the Keras “ImageDataGenerator” augmentation package was used for data preprocessing. The proposed system is a multi-class deep learning classifier. To achieve adequate accuracy, we employed well-known convolutional neural network-based models, VGG16, VGG19, and YOLOv5. VGG16 has a 75.15 percent accuracy, VGG19 attains a 79.57 percent accuracy for ten categories, and YOLOv5 achieves an 83.02 percent accuracy for thirteen classes. In the future, license plate detection and recognition can be added to the proposed system. The performance of the deep learning approaches can be improved by applying adversarial and sequential learning techniques.

REFERENCES
[1] K. Shaaban, M. Elamin and M. Alsoub, “Intelligent Transportation Systems in a Developing Country: Benefits and Challenges of Implementation,” Transportation Research Procedia, vol. 55, pp. 1373-1380, 2021.
[2] A. Maimaris and G. Papageorgiou, “A review of intelligent transportation systems from a communications technology perspective,” International Conference on Intelligent Transportation Systems, pp. 54-59, 2016.
[3] Zulkarnain and T. D. Putri, “Intelligent transportation systems (ITS): A systematic review using a Natural Language Processing (NLP) approach,” Heliyon, vol. 7, pp. 1-15, 2021.
[4] I. Damaj et al., “Intelligent transportation systems: A survey on modern hardware devices for the era of machine learning,” Journal of King Saud University - Computer and Information Sciences, pp. 1-22, 2021.
[5] P. Ajitha and A. Sivasangari, “Vehicle Model Classification Using Deep Learning,” International Conference on Trends in Electronics and Informatics, pp. 1544-1548, 2021.

[6] M. A. Butt et al., “Convolutional neural network-based vehicle classification in adverse luminous conditions for intelligent transportation systems,” Complexity, 2021.
[7] P. N. Chowdhury, T. Chandra Ray and J. Uddin, “A Vehicle Detection Technique for Traffic Management using Image Processing,” International Conference on Computer, Communication, Chemical, Material and Electronic Engineering, 2018.
[8] A. Gomaa, T. Minematsu, M. M. Abdelwahab, M. A. Zahhad and R. Taniguchi, “Faster CNN-based vehicle detection and counting strategy for fixed camera scenes,” Multimedia Tools and Applications, vol. 81, pp. 25443-25471, 2022.
[9] W. Maungmai and C. Nuthong, “Vehicle Classification with Deep Learning,” International Conference on Computer and Communication Systems, pp. 294-298, 2019.
[10] Z. Zhang, C. Xu and W. Feng, “Road vehicle detection and classification based on Deep Neural Network,” International Conference on Software Engineering and Service Science, pp. 675-678, 2016.
[11] B. Hicham, A. Ahmed and M. Mohammed, “Vehicle Type Classification Using Convolutional Neural Network,” International Congress on Information Science and Technology, pp. 313-316, 2018.
[12] E. U. Armin, A. Bejo, and R. Hidayat, “Vehicle Type Classification in Surveillance Image based on Deep Learning Method,” International Conference on Information and Communications Technology, pp. 400-404, 2020.
[13] S. Tabassum, S. Ullah, N. H. Al-Nur and S. Shatabda, “Poribohon-BD: Bangladeshi local vehicle image dataset with annotation for classification,” Data in Brief, vol. 33, pp. 1-6, 2020.
[14] M. T. Islam, S. T. Mashfu, A. Faisal, S. C. Siam, I. T. Naheen and R. Khan, “Deep Learning-Based Glaucoma Detection With Cropped Optic Cup and Disc and Blood Vessel Segmentation,” IEEE Access, vol. 10, pp. 2828-2841, 2022.
[15] Y. S. Devi and S. P. Kumar, “A deep transfer learning approach for identification of diabetic retinopathy using data augmentation,” International Journal of Artificial Intelligence, vol. 11, pp. 1287-1296, 2022.
[16] Y. Mu et al., “Improved Model of Eye Disease Recognition Based on VGG Model,” Intelligent Automation & Soft Computing, vol. 28, pp. 729-737, 2021.
[17] S. Mascarenhas and M. Agarwal, “A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for Image Classification,” International Conference on Disruptive Technologies for Multi-Disciplinary Research and Applications, pp. 96-99, 2021.
[18] N. H. Tasnim, S. Afrin, B. Biswas, A. A. Anye and R. Khan, “Automatic classification of textile visual pollutants using deep learning networks,” Alexandria Engineering Journal, vol. 62, pp. 391-402, 2023.
