Vehicle Detection and Classification Using Deep Neural Networks
Shuva Chowdhury, Shithi Chowdhury, Jeba Tahsin Ifty, Riasat Khan
Electrical and Computer Engineering
North South University
Dhaka, Bangladesh
2022 International Conference on Electrical and Information Technology (IEIT) | 978-1-6654-5303-5/22/$31.00 ©2022 IEEE | DOI: 10.1109/IEIT56384.2022.9967885
Abstract—Large urban cities face significant challenges in manual traffic control due to overpopulation, traffic congestion and land shortage. These cities have begun implementing intelligent transportation systems that employ automatic and efficient traffic monitoring and management. Due to poor traffic conditions, many minor incidents occur on the streets, e.g., violation of traffic rules, overtaking, high-speed and intoxicated driving, exceeding speed limits, unsafe lane changes, etc. Sometimes these incidents lead to collisions, accidents, crimes, loss of time and human fatalities. For this reason, automatic vehicle detection and classification are essential to predict traffic congestion levels, control lanes and enhance road safety and security. This automated process can address economic, social, and environmental concerns and everyday living issues. In this work, automatic vehicle detection and classification have been implemented using various deep neural network frameworks, i.e., VGG16, VGG19 and YOLOv5. We used transfer learning algorithms built on a convolutional neural network for native vehicle identification and classification. This research used a Bangladeshi vehicle image dataset containing 9,058 images. Next, we employed the Keras “ImageDataGenerator” data preprocessing and augmentation framework with the deep learning networks VGG16, VGG19, and YOLOv5. Finally, the proposed VGG16 and VGG19 models' accuracies for ten classes are 75.15% and 79.57%, respectively. The YOLOv5 architecture attained the best performance with 83.02% accuracy for fifteen vehicle classes.

Keywords— Convolutional neural network, vehicle detection, classification, deep learning, ImageDataGenerator, VGG16, VGG19, YOLOv5

I. INTRODUCTION

Almost every large urban city now has an intelligent traffic surveillance system to monitor transportation incidents and traffic [1]. The modernized surveillance system is utilized for various purposes, including real-time vehicle tracking, traffic and lane occupancy monitoring, and observing congestion levels [2]. Automating vehicle classification and recognition is among the most critical challenges in modern traffic management and intelligent transportation systems [7]. Various methods for vehicle categorization have been implemented using complex sensors, with some succeeding under constrained circumstances. Recently, computer vision and artificial intelligence techniques have been used for detecting vehicles based on camera photos [8]. This study describes a deep learning-based automatic vehicle detection and classification system. These techniques often employ various deep learning approaches for effectively identifying and categorizing different transportation systems [3]. One such method is the convolutional neural network (CNN) [4]. CNN is a deep learning framework belonging to the neural network category. Because of its effectiveness, the technique is well-known in image recognition.

CNN is widely used in image processing to detect and classify vehicles from images. Some recent manuscripts on automatic vehicle detection and classification are described briefly in the subsequent paragraphs.

In [9], the authors applied a CNN structure as a classifier for both vehicle type and color classification. The suggested solution requires one input, i.e., a car image sent into the system. The original automobile image is resized to 32×32 pixels using TensorFlow’s resize function. The training set comprises 686 photos, and the testing set includes 228 pictures. The proposed method has a color classification accuracy of 70.09 percent and a vehicle classification accuracy of 81.62 percent.

In [10], Z. Zhang and his coauthors compare a deep neural network (DNN) and a neural network (NN) for vehicle categorization. There are 150 photographs in the data collection, all of which have been preprocessed. The dataset is randomly partitioned into a training set with 80% and a testing set with 20% of the data. In the proposed deep neural network, S-shaped (sigmoid) functions are employed as activation functions in the hidden layers, and the backpropagation method is used in the training phase. The DNN has a clear benefit over traditional single-layer neural networks, including a lower error rate. Finally, the DNN and NN frameworks achieved vehicle classification accuracies of 96.66% and 93.33%, respectively.

In [11], the authors present a deep learning-based vehicle type classification system that uses their private data. Data augmentation and convolutional neural networks are the two strategies that make up the proposed system. There are 2,400 samples in this dataset, which is split into four groups, i.e., school bus, ambulance, police car, and public bus of the Moroccan Transport Company. The mean precision and recall of the CNN model for the four classes are 0.89 and 0.83, respectively. The authors intend to add automated vehicle number plate recognition, vehicle estimating, and traffic congestion sensing components.
In [6], the authors introduced a new vehicle dataset of six categories of local vehicles in Pakistan. All pre-trained models are fine-tuned using the self-constructed dataset. This dataset consists of 10,000 photos that have been divided into six categories, each with 1,670 images. On the self-constructed and VeRi datasets, the suggested customized CNN classification system achieved accuracies of 99.68 percent and 97.66 percent, respectively.

We can conclude from the above literature studies that many important works on vehicle detection and classification have been performed. The convolutional neural network has been successfully employed in many studies for automated vehicle detection and categorization. A broad spectrum of deep learning techniques, e.g., CNN, VGG16, VGG19, ResNet and YOLO models, have been used for recognizing and classifying automobiles from photos and live real-time video sequences.

In this work, an automatic vehicle detection and classification system has been developed using three deep neural network models, VGG16, VGG19, and YOLOv5. The dataset contains 9,058 photographs of primarily Bangladeshi vehicles. This dataset has 13 different classes, i.e., bicycle, boat, bus, car, CNG, easy-bike (special type of battery-powered vehicle for rural areas), horse cart, wheelbarrow, Leguna, motorcycle, rickshaw (human-operated vehicle with two wheels), van and truck or lorry. The Keras data preprocessing tool “ImageDataGenerator” has been used for image augmentation. The proposed multi-class deep learning models’ performance is evaluated in terms of classification accuracy and loss.

The proposed system is discussed in Section II with suitable tables, figures, and flowcharts. In Section III, the most important results of this research are presented. Finally, Section IV concludes the study with suggestions on how to improve this work in the future.

II. PROPOSED SYSTEM

A. Dataset
In this work, the Poribohon-BD dataset of native Bangladeshi vehicles from Mendeley [13] and a Kaggle open-source dataset have been used. We combined the two datasets to create a larger dataset with more categories and variations. Next, the entire image set is divided into two sets, training and testing.

Class Training Testing
Truck 584 146
Van 491 123
Wheelbarrow 189 48
Total 6,441 1,618

Table I shows the number of vehicle photographs in the training and testing datasets used in this work.

Fig. 1. Sample photographs of various vehicles of the used dataset.

Fig. 1 shows sample photographs of different types of vehicles from the dataset used in this work.

B. Data Preprocessing
Image data augmentation is a method of automatically transforming images that enhances the performance of deep learning models [14]. The Keras “ImageDataGenerator” package allows users to modify photos in real-time while the model is being trained [15]. While an image is being provided to the model, random alterations can be applied to it. Because the augmented images are generated on the fly rather than stored, this process consumes fewer computational resources in terms of memory. It also provides modified versions of the images to boost the fitted models’ ability to generalize to new photographs. Models can be fitted with image data augmentation using the ImageDataGenerator framework.
uthorized licensed use limited to: AMRITA VISHWA VIDYAPEETHAM AMRITA SCHOOL OF ENGINEERING. Downloaded on July 30,2024 at 14:50:54 UTC from IEEE Xplore. Restrictions appl
TABLE II. DATA AUGMENTATION PARAMETERS AND THEIR VALUES
Parameter Value
Rescale 1.0/255
Shear range 0.20
Zoom range 0.20
Horizontal flip True
Batch size 64
Target size (224,224)
Class mode categorical
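The settings in Table II map onto the Keras API fairly directly. The following is a minimal sketch, assuming TensorFlow 2.x with the legacy `ImageDataGenerator` class available and a hypothetical directory layout (`data/train`, with one subfolder per vehicle class). The directory path and the function name `build_train_iterator` are illustrative and do not come from the paper.

```python
# Augmentation settings taken from Table II.
AUGMENTATION_PARAMS = {
    "rescale": 1.0 / 255,       # scale pixel values from [0, 255] to [0, 1]
    "shear_range": 0.20,        # random shear intensity
    "zoom_range": 0.20,         # random zoom factor in [0.8, 1.2]
    "horizontal_flip": True,    # randomly mirror images left to right
}

# Batch loading settings taken from Table II.
FLOW_PARAMS = {
    "target_size": (224, 224),  # every image is resized to 224x224
    "batch_size": 64,
    "class_mode": "categorical",  # one-hot labels for multi-class training
}


def build_train_iterator(train_dir="data/train"):
    """Return a Keras iterator that yields augmented image batches.

    `train_dir` is a hypothetical path; it must contain one subfolder
    per class. TensorFlow is imported lazily so the settings above can
    be inspected without it installed.
    """
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    generator = ImageDataGenerator(**AUGMENTATION_PARAMS)
    return generator.flow_from_directory(train_dir, **FLOW_PARAMS)
```

Random transformations such as shear, zoom and flip are normally applied to the training images only; a test iterator would typically keep just the `rescale` setting so that evaluation uses unaltered images.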
backbone, neck, and output. Data preprocessing, such as mosaic data augmentation and adaptive image filling, is generally done at the input terminal. YOLO separates an image into grid cells, each detecting objects within its own region. This approach can detect objects in real-time using data streams.

VGG16 Results:
Next, the confusion matrix of the VGG16 model for various categories of vehicles has been illustrated in Fig. 7.

VGG19 Results:
Fig. 8. Training and validation accuracy curves for the VGG19 model.
Fig. 9. Training and validation loss curves for the VGG19 model.
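The grid idea described for YOLO earlier in this section can be illustrated with a minimal sketch, assuming a square grid in which the cell containing the centre of an object is the one that detects it. The function name, the grid size and the image dimensions are illustrative assumptions and not values from the paper; YOLOv5 itself predicts at several grid resolutions and uses anchor boxes, which this sketch omits.

```python
def responsible_cell(x_center, y_center, image_width, image_height, grid_size):
    """Return the (row, col) of the grid cell that contains a box centre.

    Coordinates are in pixels. A centre lying exactly on the right or
    bottom image border is assigned to the last cell.
    """
    col = min(int(x_center / image_width * grid_size), grid_size - 1)
    row = min(int(y_center / image_height * grid_size), grid_size - 1)
    return row, col


# Hypothetical example: a 640x640 image divided into a 20x20 grid.
print(responsible_cell(320, 100, 640, 640, 20))  # prints (3, 10)
```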
YOLOv5 Results:
The confusion matrix of the YOLOv5 model is depicted in Fig. 12. The diagonal components indicate the correct predictions of the YOLOv5 model. For this architecture, the Easy-bike category produced the highest accuracy of 0.92. For ten classes, the accuracies of VGG16 and VGG19 are 75.15% and 79.57%, respectively, while YOLOv5 achieved an accuracy of 83.02% for fifteen types of vehicles.
Finally, Fig. 13 shows some test sample images of the proposed YOLOv5 model with the predicted classes and their corresponding confidence scores.
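As a minimal sketch of how such figures are read off a confusion matrix, the snippet below computes a per-class score (diagonal entry divided by its row sum) and the overall accuracy (sum of the diagonal divided by all predictions). It assumes rows index the true class and columns the predicted class; some tools plot the transposed layout, in which case the normalization runs over columns instead. The counts are made up for illustration and are not the paper's results.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class predicted correctly (diagonal / row sum).

    Assumes rows are true classes and columns are predicted classes.
    """
    scores = []
    for i, row in enumerate(confusion):
        total = sum(row)
        scores.append(row[i] / total if total else 0.0)
    return scores


def overall_accuracy(confusion):
    """Correct predictions (sum of the diagonal) over all predictions."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total if total else 0.0


# Made-up counts for three classes; these are NOT the paper's results.
example = [
    [46, 3, 1],   # true class 0
    [4, 40, 6],   # true class 1
    [2, 8, 40],   # true class 2
]
print(per_class_accuracy(example))
print(overall_accuracy(example))
```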
[6] M. A. Butt et al., “Convolutional neural network-based vehicle classification in adverse luminous conditions for intelligent transportation systems,” Complexity, 2021.
[7] P. N. Chowdhury, T. Chandra Ray and J. Uddin, “A Vehicle Detection Technique for Traffic Management using Image Processing,” International Conference on Computer, Communication, Chemical, Material and Electronic Engineering, 2018.
[8] A. Gomaa, T. Minematsu, M. M. Abdelwahab, M. A. Zahhad and R. Taniguchi, “Faster CNN-based vehicle detection and counting strategy for fixed camera scenes,” Multimedia Tools and Applications, vol. 81, pp. 25443–25471, 2022.
[9] W. Maungmai and C. Nuthong, “Vehicle Classification with Deep Learning,” International Conference on Computer and Communication Systems, pp. 294-298, 2019.
[10] Z. Zhang, C. Xu and W. Feng, “Road vehicle detection and classification based on Deep Neural Network,” International Conference on Software Engineering and Service Science, pp. 675-678, 2016.
[11] B. Hicham, A. Ahmed and M. Mohammed, “Vehicle Type Classification Using Convolutional Neural Network,” International Congress on Information Science and Technology, pp. 313-316, 2018.
[12] E. U. Armin, A. Bejo and R. Hidayat, “Vehicle Type Classification in Surveillance Image based on Deep Learning Method,” International Conference on Information and Communications Technology, pp. 400-404, 2020.
[13] S. Tabassum, S. Ullah, N. H. Al-Nur and S. Shatabda, “Poribohon-BD: Bangladeshi local vehicle image dataset with annotation for classification,” Data in Brief, vol. 33, pp. 1-6, 2020.
[14] M. T. Islam, S. T. Mashfu, A. Faisal, S. C. Siam, I. T. Naheen and R. Khan, “Deep Learning-Based Glaucoma Detection With Cropped Optic Cup and Disc and Blood Vessel Segmentation,” IEEE Access, vol. 10, pp. 2828-2841, 2022.
[15] Y. S. Devi and S. P. Kumar, “A deep transfer learning approach for identification of diabetic retinopathy using data augmentation,” International Journal of Artificial Intelligence, vol. 11, pp. 1287-1296, 2022.
[16] Y. Mu et al., “Improved Model of Eye Disease Recognition Based on VGG Model,” Intelligent Automation & Soft Computing, vol. 28, pp. 729-737, 2021.
[17] S. Mascarenhas and M. Agarwal, “A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for Image Classification,” International Conference on Disruptive Technologies for Multi-Disciplinary Research and Applications, pp. 96-99, 2021.
[18] N. H. Tasnim, S. Afrin, B. Biswas, A. A. Anye and R. Khan, “Automatic classification of textile visual pollutants using deep learning networks,” Alexandria Engineering Journal, vol. 62, pp. 391-402, 2023.