REAL TIME VEHICLE COUNTING SYSTEM

Tanveer Alam, Rahul Yadav, Shaunav Jadhav
Department of Computer Science and Engineering (Artificial Intelligence & Machine Learning)
G.L Bajaj Institute of Technology and Management, Greater Noida, UP, India
[email protected]

Abstract - In today's rapidly developing urban areas, vehicle congestion has become a persistent issue, making traffic monitoring, parking lot management, and many other tasks challenging. Traditional techniques such as loop detectors and ultrasonic sensors not only struggle to manage this congestion effectively but also contribute to increased costs. Therefore, a real-time vehicle counting system based on YOLOv7 has emerged as a crucial tool for monitoring traffic congestion and managing parking lots, highway traffic, smart city initiatives, and incident detection. This system serves two critical functions: detecting and counting vehicles. Additionally, it includes features such as sending alerts when the vehicle count surpasses a predetermined threshold and generating reports that detail the number of vehicles present at specific times after every interval. Key phases of this system involve computer vision, feature extraction, categorization, and counting algorithms.
The primary goal of this system is to provide real-time vehicle detection and counting, thereby enabling informed decision-making. The paper demonstrates the real-time calculation of vehicle detection and counting in videos using the YOLOv7 model, which offers high speed and accuracy. By leveraging this technology, it becomes possible to gain insights into traffic patterns and make well-informed decisions based on real-time data. This innovative approach shows great potential in tackling the challenges presented by urban traffic congestion, parking lot management, incident detection, and more, ultimately enhancing overall management efficiency.

Keywords—Deep Learning, YOLOv7, detection, counting, real time, MS COCO dataset.

I. INTRODUCTION

Deep Learning (DL) has demonstrated its superiority over traditional Machine Learning (ML) algorithms in numerous tasks, particularly within the domain of Computer Vision (CV) [1]. ML plays a pivotal role in CV by virtue of its capacity to discern patterns within images and categorize objects captured by cameras. In the past, a CV system necessitated a preprocessing and feature extraction step before it could effectively detect, classify, or recognize objects within an image using ML algorithms. Different objects or scenarios required distinct techniques in preprocessing and feature extraction, thereby constraining the capabilities of a conventional CV model to detect or recognize only specific objects. In contrast, DL, with its expansive and intricate networks, autonomously preprocesses and extracts image features within its networks, subsequently classifying the image class and even detecting the precise locations of individual objects within the image. However, it's important to note that DL demands high-specification hardware and substantial amounts of data to train the networks and optimize their performance.
Object detection not only classifies the objects present in an image but also precisely pinpoints the location of each object. Typically, object detection algorithms output the coordinates of a bounding box around the object, along with a label identifying the type of object detected. This technology finds applications in diverse fields, ranging from surveillance and security systems to autonomous vehicles and augmented
reality, serving as a foundational element in enabling machines to perceive and comprehend their surroundings.
Concurrently, the counting algorithm constitutes a set of computational steps designed to ascertain the quantity of specific items within a given dataset or scenario. This algorithmic process involves systematically analysing the elements of interest and incrementing a counter for each occurrence. Counting algorithms are employed across various domains, including computer science, mathematics, and data analysis, to derive accurate numerical values from extensive datasets. These algorithms encompass a spectrum of methods, from simple tallying techniques to more intricate statistical approaches, tailored to the nature of the items being counted and the specific requirements of the application. In the context of the vehicle counting system, the counting algorithm is utilized to accurately tally the number of vehicles present at a specific instance within a designated area. By integrating sophisticated algorithms and camera systems, this counting process yields valuable insights into traffic flow, parking lot occupancy, and overall transportation patterns. By accurately detecting and counting objects, computer vision systems can gather information to make informed decisions and take appropriate actions. These techniques find applications in various fields, including robotics and autonomous vehicles, enabling these systems to operate efficiently and effectively. YOLOv7 stands out as a popular algorithm for object detection [3] due to its speed, accuracy, flexibility, and user-friendly nature. However, other algorithms, such as Faster R-CNN and SSD, may be more suitable for specific applications. For instance, Faster R-CNN might be better suited for applications prioritizing higher accuracy at the expense of speed, while SSD could be more appropriate for real-time applications requiring low latency. Ultimately, the selection of an object detection algorithm hinges on the specific requirements of the application, encompassing factors such as desired speed, accuracy, and the categories of objects to be detected. Evaluating the performance of different algorithms on the target dataset and application requirements is crucial for making an informed choice.
Acknowledged as a pivotal addition to the arsenal of tools aimed at urban and highway traffic analysis and planning [1], computer vision's forte encompasses the realm of vehicle detection and recognition, offering a plethora of possibilities within the domain of automated driving. However, the challenges inherent in real-time image acquisition via onboard cameras on road vehicles are notably influenced by various factors, including camera angles and inter-object distances. The conventional trajectory of employing Traditional Machine Learning approaches [4] mandates preprocessing methods to fulfill the task's requisites. Techniques such as image greyscale conversion, binarization, and background subtraction, at times complemented by edge detection, are deployed. Yet, this approach encounters limitations; instances such as the presence of shadows cast by vehicles can impede precise detection. Similarly, alterations in road surfaces due to repairs, damage, or the presence of obstacles disrupt the image subtraction process, leading to inaccuracies in detection.
Contrarily, the advent of Deep Learning (DL) heralds a paradigm shift by affording a more adaptable performance sans the need for extensive image preprocessing or feature extraction through multiple methodologies [5]. Although computationally intensive, the DL approach transcends these preprocessing demands, albeit requiring substantial volumes of data for network training. Moreover, the evolution of DL architectures, trained on extensive datasets encompassing millions of instances, has significantly facilitated the development of Computer Vision (CV) systems, rendering the process more streamlined and accessible. The advent of Deep Learning has significantly reshaped the landscape, empowering computer vision systems with greater adaptability and resilience, thereby mitigating some of the traditional challenges encountered in vehicle detection and recognition within dynamic urban and highway environments.
The initial concept of the algorithm employed in this paper is rooted in the RCNN (Region-Based Convolutional Neural Network) algorithm [7]. This algorithm serves as the foundation for enhancing the YOLO (You Only Look Once) method by adopting the anchor mechanism of Faster R-CNN. In a manner similar to YOLOv3, the algorithm generates anchors
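The per-frame counting and threshold-alert behaviour described above can be sketched in a few lines. This is an illustrative fragment, not the paper's implementation: the detection-tuple format, the class set, and the function names are assumptions standing in for the post-processed output of a detector such as YOLOv7.

```python
from collections import Counter

# Hypothetical detection format: (class_name, confidence, bounding_box).
VEHICLE_CLASSES = {"car", "truck", "bus", "motorcycle"}

def count_vehicles(detections, min_confidence=0.5):
    """Tally detections whose class is a vehicle and whose score passes the cutoff."""
    counts = Counter()
    for cls, conf, _box in detections:
        if cls in VEHICLE_CLASSES and conf >= min_confidence:
            counts[cls] += 1
    return counts

def check_threshold(counts, threshold):
    """Return an alert message when the total vehicle count exceeds the threshold."""
    total = sum(counts.values())
    if total > threshold:
        return f"ALERT: {total} vehicles detected (threshold {threshold})"
    return None

detections = [
    ("car", 0.91, (10, 20, 50, 60)),
    ("car", 0.42, (15, 25, 55, 65)),       # below the confidence cutoff, ignored
    ("bus", 0.88, (100, 40, 180, 90)),
    ("person", 0.95, (200, 10, 220, 60)),  # not a vehicle class, ignored
]
counts = count_vehicles(detections)
print(dict(counts))                        # {'car': 1, 'bus': 1}
print(check_threshold(counts, threshold=1))
```

Running the same check once per frame (or once per reporting interval) yields both the alerting and the periodic-report behaviour the system describes.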
at multiple scales on the feature map and creates prior boxes on the feature map for making predictions [8]. Currently, the YOLO method stands out as the most widely used technique for target identification and finds extensive application in industrial manufacturing. Hence, in this paper, the YOLO algorithm framework is chosen as the basis for implementation. According to Wang et al. in 2022 [9], this algorithm has achieved remarkable performance with YOLOv7. However, there is still room for improvement in terms of the algorithm's detection precision. The primary objective of this study is to bridge the gaps and overcome the limitations observed in previous research by developing a real-time vehicle detection and counting system based on the YOLOv7 model. To achieve this, novel techniques or enhancements will be incorporated to improve the system's performance and robustness in real-world scenarios. The study aims to provide a comprehensive solution that integrates accurate vehicle detection and reliable counting in real-time video streams. The proposed system holds significant potential across various domains, including transportation management, traffic monitoring, and surveillance applications. By offering real-time insights into vehicle movement patterns and traffic congestion, as well as providing accurate vehicle counts, the system can greatly contribute to better decision-making and resource allocation. These insights can aid in identifying potential traffic violations and improving overall road safety. Furthermore, the outcomes of this study have the potential to enhance the efficiency of traffic management systems and facilitate urban planning. By accurately detecting and counting vehicles, the system can assist in optimizing traffic flow, implementing effective safety measures, and supporting the development of well-planned urban infrastructures. By critically assessing and identifying the gaps and limitations in previous research, our study aims to contribute to the field of real-time vehicle detection and counting based on the YOLOv7 algorithm. The proposed enhancements, novel techniques, and comprehensive evaluation of the system can potentially address the identified limitations and provide a more accurate and robust solution for real-world scenarios.

II. LITERATURE SURVEY

Before working on our research, we reviewed some works that have been done related to our research.
Numerous methodologies have been proposed for vehicle detection and counting, each offering distinct approaches to tackle this complex task. Li et al. [5] introduced a real-time system that encompasses several stages. Initially, an adaptive background subtraction technique identifies moving objects within video frames. Subsequent binarization and morphological operations refine the foreground area, eliminating noise and shadows. To prevent over-segmentation, the resulting foreground image is combined with the frame's edge image before undergoing hole filling. The system then employs a virtual road-based detector for vehicle identification and counting, followed by blob tracking across frames to monitor vehicle movement.
In a similar vein, Bhaskar and Yong [6] devised a method employing a Gaussian mixture model (GMM) and blob detection. The GMM models the background, extracting foreground pixels based on Mahalanobis distance. Morphological operations are applied for noise removal and blob aggregation, subsequently facilitating blob analysis for vehicle identification. Counting and tracking mechanisms are then employed for comprehensive vehicle monitoring.
Contrastingly, Kryjak et al. [7] engineered a hardware-software system tailored for road intersections. This system diverges from conventional techniques like background subtraction and optical flow, opting instead for similarity measurements in consecutive frames. Patch analysis is employed to detect vehicles at red signals, enabling effective vehicle counting.
Liu et al. [8] introduced a real-time counting method centered on virtual detection lines and spatio-temporal contour techniques. Leveraging GMM, moving foreground pixels along the detection line are detected, enabling the construction of vehicle contours across multiple frames in the spatio-temporal domain. These contours are meticulously analyzed to ascertain the number of vehicles present.
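The pipeline these surveyed systems share (subtract a background model, binarize, then query a virtual detector line) can be illustrated with a toy, pure-Python sketch. Real systems such as those in [5] and [8] use adaptive models like Gaussian mixtures plus morphological cleanup; the fixed background and tiny pixel grid below are simplifications for clarity only.

```python
def foreground_mask(frame, background, threshold=30):
    """Binarize the absolute difference between a frame and the background model."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

def line_occupied(mask, line_row):
    """A virtual detection line: is any foreground pixel present on that row?"""
    return any(mask[line_row])

# A 4x6 grey "image" with a uniform background of intensity 10.
background = [[10] * 6 for _ in range(4)]
frame = [row[:] for row in background]
frame[2][1] = frame[2][2] = 200   # a bright "vehicle" sitting on row 2

mask = foreground_mask(frame, background)
print(mask[2])                 # [0, 1, 1, 0, 0, 0]
print(line_occupied(mask, 2))  # True
print(line_occupied(mask, 0))  # False
```

Counting then amounts to incrementing a counter on each unoccupied-to-occupied transition of the detection line across successive frames.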
Additionally, shadow detection and removal algorithms have been proposed in [9,10], highlighting a specialized area of focus within the realm of vehicle detection. These algorithms contribute to refining the accuracy and precision of vehicle counting systems by addressing shadow-related challenges inherent in image processing.
Each of these methodologies represents a distinct approach, showcasing the diversity and innovation within the field of vehicle detection and counting. Their individual strengths and novel techniques contribute to the ongoing advancements in this domain.
In a research study conducted by Md Abdur Rouf, Qing Wu, and their team [11], they scrutinized the conventional methods used for vehicle detection, such as RADAR, LiDAR, RFID, or LASER. These methods have some serious drawbacks: they are not only slow but also quite expensive and require a lot of human effort. Moreover, these techniques have limitations when it comes to accurately classifying different types of vehicles or gathering detailed data about them, such as how many vehicles of each kind are on the road and in which direction they are moving. The researchers, instead of relying on these traditional methods, worked on enhancing a more sophisticated algorithm called RCNN (Region-based Convolutional Neural Network). They modified and improved the YOLO (You Only Look Once) technique by adopting some of the anchor mechanisms from Faster R-CNN, creating a more efficient and accurate approach for vehicle detection. YOLO, a prevalent method widely used in industrial settings for identifying targets, served as the primary algorithm framework in their study.
To achieve this goal, the team analyzed how the computer's memory functions, looking into ways to reduce the time it takes to process information. They explored aspects such as the number of connections in the computer's architecture and how it handles various tasks to make it faster and more efficient. One key aspect they noticed was the attention to activation, a process crucial for how the computer interprets and reacts to data.
While focusing on enhancing the algorithm, they also critically evaluated past studies. They discovered that some studies lacked comprehensive real-world testing, especially in scenarios with different lighting conditions and environments. Many of these studies concentrated solely on detecting vehicles and missed out on tracking and accurately counting them. This gap in research highlighted potential limitations in handling challenges like obscured views, changes in size, and coping with complex traffic scenarios.
By addressing these limitations, the researchers are striving to develop a smarter system that not only identifies vehicles but also effectively monitors, tracks, and precisely counts them. This innovative approach aims to make these smart systems more adaptable and reliable, functioning accurately across diverse real-world situations.
In the research conducted by Aman Preet Singh Gulati [12], a sophisticated vehicle detection and counting system was developed employing the robust OpenCV library and the efficient Haar cascade algorithm. This system was designed to accurately detect and count vehicles within both images and video streams. Leveraging the comprehensive capabilities of OpenCV, the study utilized a range of image processing operations crucial for this task. In particular, the implementation involved the use of specific car and bus Haar cascade classifiers, which proved instrumental in effectively identifying and enumerating cars and buses in the visual data. Through this methodical approach, the system demonstrated a commendable ability to discern and tally vehicles, showcasing its potential for diverse applications in traffic management, surveillance, and beyond.
The research outlined in [13] emphasizes the implementation of YOLOv3 (You Only Look Once) for the pivotal tasks of vehicle detection, classification, and counting. The primary objective was to accurately discern various vehicle types, specifically cars, Heavy Motor Vehicles (HMVs), and Light Motor Vehicles (LMVs), traversing a roadway. The authors meticulously employed the capabilities of YOLOv3, a state-of-the-art object detection system, to achieve this multifaceted goal. By leveraging this advanced neural network architecture, the study aimed to precisely identify and categorize diverse vehicles present on the road while concurrently tallying the total count of vehicles navigating through the given area. This
research marks a significant stride in enhancing automated surveillance, traffic monitoring, and transportation management systems by providing a robust methodology for real-time vehicle analysis and enumeration.
In the research conducted by Atharva Musale [14], the focus was on employing advanced algorithms for vehicle detection and tracking within visual data. Specifically, the study utilized the YOLOv3 algorithm, renowned for its precision in object detection tasks, to accurately identify vehicles within the given scenes. Furthermore, the implementation of the Deep SORT algorithm played a crucial role in tracking these detected vehicles over consecutive frames. The integration of YOLOv3 and the Deep SORT algorithm enabled not only the identification but also the continuous monitoring and tracking of vehicles, ensuring their trajectories could be followed across time and space within the video or image sequences. This methodological combination highlights a sophisticated approach towards comprehensive vehicle analysis, contributing significantly to enhanced surveillance systems, traffic management, and logistical tracking solutions.
Methods used in vehicle object detection [14] are primarily divided into conventional machine vision techniques and complex deep learning strategies. Traditional techniques use a vehicle's motion to separate it from the background image, employing three main methods: background subtraction, continuous video frame contrast, and optical flow. These methods detect moving foreground areas by analyzing differences between frames or the motion region in videos. Furthermore, vehicle detection methods utilizing features like SIFT and SURF have been widely used, including 3D models for classification. Deep Convolutional Neural Networks (CNNs) have significantly advanced vehicle detection by learning image features and performing tasks like classification and regression. The detection methods are broadly categorized into two: the two-stage method, involving candidate box generation and classification using CNNs, and the one-stage method, directly converting object positioning into a regression problem. Various models like R-CNN, SPP-Net, R-FCN, FPN, Mask RCNN, SSD, and YOLO, each with its unique approach, have improved the feature extraction, object selection, and classification capabilities of CNNs. However, while traditional machine vision techniques offer faster detection speeds, they struggle with changing image conditions and complex scenarios. On the other hand, advanced CNNs may struggle with scale changes and precise detection of small objects. The current methods also lack precision in recognizing objects of different sizes belonging to the same category, and the use of image pyramids or multi-scale input images, although effective, requires significant computational resources (Prof. Pallavi Hiwarkar, Damini Bambal, Rishabh Roy).
The research conducted by Ravula Arun Kumar, D. Sai Tharun Kumar, K. Kalyan, and B. Rohan Ram Reddy [15] focuses on developing an intelligent vehicle counting system to address increased traffic congestion. This study explores various methods, including blob analysis, background subtraction, image enhancement, sensor-based systems, image segmentation, and pedestrian detection using neural networks. They designed software that processes video input to count vehicles by performing tasks like image segmentation, vehicle tracking, detection, and blob analysis for traffic surveillance. Their approach also includes utilizing convolutional neural networks (CNNs) for real-time vehicle counting with high accuracy. Additionally, they explored techniques like virtual coils and CNNs for precise vehicle counting, especially on highways. The research emphasizes achieving high accuracy in vehicle counting using background subtraction alongside virtual collectors and morphological operations for tracking and counting vehicles on roads and highways.
Previous research predominantly focused on testing videos [16] from highways frequented solely by cars, buses, or trucks, omitting motorcycles. Notably, buses or trucks were generically categorized as 'cars' without finer distinctions as 'bus' or 'truck' during counting. However, certain traffic monitoring systems necessitate more detailed vehicle information, specifying car, truck, bus, or motorcycle types. The earlier studies referenced in [6]–[9] were mostly conducted in favorable traffic conditions with responsible driving behaviors to ensure accurate counting. Consequently, our focus in
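Tracking-by-detection systems such as the YOLOv3 + Deep SORT combination discussed above must associate each frame's detections with existing tracks. Deep SORT additionally uses appearance embeddings and Kalman-filter motion prediction; the sketch below shows only the simpler IoU-based greedy association idea, with invented box coordinates.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def associate(tracks, detections, min_iou=0.3):
    """Greedily match existing tracks to new detections by best IoU."""
    matches, unmatched = {}, list(range(len(detections)))
    for tid, tbox in tracks.items():
        best, best_iou = None, min_iou
        for di in unmatched:
            score = iou(tbox, detections[di])
            if score > best_iou:
                best, best_iou = di, score
        if best is not None:
            matches[tid] = best
            unmatched.remove(best)
    return matches, unmatched

tracks = {0: (10, 10, 50, 50), 1: (100, 100, 140, 140)}
detections = [(12, 11, 52, 51), (300, 300, 340, 340)]
matches, unmatched = associate(tracks, detections)
print(matches)    # {0: 0}  -> track 0 continues as detection 0
print(unmatched)  # [1]     -> detection 1 would spawn a new track
```

Counting each newly spawned track exactly once is what prevents the same vehicle from being tallied in every frame it appears in.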
this work is to develop a system that not only counts vehicles crossing roads but also categorizes them as car, bus, truck, or motorcycle, utilizing a Deep Learning algorithm employing the YOLOv3 architecture.
The Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago (KITTI) dataset [17], designed for self-driving cars, offers extensive traffic scene images aiding in 3D object detection. However, TRANCOS, another dataset capturing traffic jams through surveillance cameras, suffers from occlusions and lacks vehicle type records, limiting its broader application. Deep learning networks follow a two-step method involving candidate box generation and subsequent sample classification, seen in algorithms like RCNN, Fast R-CNN, and Faster R-CNN, known for improved feature extraction but slower detection. In contrast, one-step methods like SSD efficiently locate objects with default anchors at different resolutions, handling various scales, while the YOLO series divides images into grids for swift object prediction, with YOLOv3 utilizing logistic regression and multiple scales for rapid and accurate detection. Overall, training deep learning models using vehicle datasets leads to highly performing vehicle detection models.
The research conducted by Huansheng Song, Haoxiang Liang, Huaiyu Li, Zhe Dai, and Xu Yun [18] delves into the realm of vehicle object detection, exploring both traditional machine vision methods and the more intricate deep learning techniques.
The traditional approaches primarily rely on the motion exhibited by vehicles to distinguish them from a static background image. This methodology encompasses three primary categories: background subtraction, continuous video frame difference, and optical flow. Each of these methods operates by detecting variations in pixel values across consecutive frames or utilizing the motion regions in the video, allowing for the identification of moving objects. These techniques, while foundational, come with limitations. They might struggle with abrupt changes in lighting, scenes with periodic motion, or scenarios involving slow-moving vehicles.
Conversely, the study emphasizes the prowess of Deep Convolutional Neural Networks (CNNs) in the domain of vehicle detection. CNNs exhibit remarkable capabilities in learning intricate image features and performing various tasks like classification and bounding box regression. The research navigates through the landscape of detection methodologies, categorizing them into two primary approaches: two-stage methods (such as R-CNN) and one-stage methods (like YOLOv3).
Two-stage methods, represented by R-CNN, involve generating candidate boxes for objects through intricate algorithms, followed by classification via a convolutional neural network. Although these methods offer high precision, they are computationally intensive and demand significant storage memory. On the other hand, one-stage methods, exemplified by YOLOv3, directly convert object localization into a regression problem, resulting in faster detection speeds. However, they may sacrifice some precision, especially concerning smaller objects or intricate scenes.
The research underscores the trade-offs between these methodologies. Traditional methods might offer faster detection but struggle in challenging conditions, while advanced CNNs provide accuracy but face challenges in handling scale variations. It also highlights the need for more adaptable and precise approaches that can handle a diverse range of scenarios.
To address these challenges, the study suggests leveraging multi-scale input images, enabling the models to handle various object sizes and complexities. By incorporating this approach, the research aims to overcome the limitations posed by traditional methods and enhance the adaptability and precision of advanced CNNs in vehicle object detection.

III. METHODOLOGY

A. Data Collection and Preparation

For an accurate system, we start by collecting a diverse dataset of images or videos containing various traffic scenarios. These datasets should encompass different weather conditions, lighting variations, diverse vehicle types (cars, trucks, buses), and traffic densities. Each image or video segment in the dataset needs careful annotation, marking the
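The grid-based prediction idea attributed to the YOLO family above can be made concrete: the image is divided into an S×S grid, and the cell containing an object's centre is the one responsible for predicting that object. A minimal sketch follows; the grid size and image dimensions are arbitrary examples, not values from this paper.

```python
def responsible_cell(box, img_w, img_h, grid=7):
    """Return the (row, col) of the grid cell containing the box centre."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Scale the centre into grid units; clamp to the last cell at the border.
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col

# A 448x448 image with a vehicle box whose centre lies at (224, 64):
print(responsible_cell((192, 32, 256, 96), 448, 448))  # (1, 3)
```

One-stage detectors regress box offsets and class scores per cell in a single forward pass, which is why they avoid the separate proposal stage of two-stage methods.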
locations of vehicles. This annotated data helps the model learn to identify vehicles accurately.

1) Data Acquisition: Capturing real-world footage or images using cameras or sensors in locations with varying traffic densities is critical. These recordings should encapsulate different environmental conditions, such as daylight, nighttime, adverse weather, or diverse traffic patterns like congested urban roads or highways.

2) Annotation and Labeling: The collected video frames or images need meticulous annotation or labeling. Each frame requires a manual or automated process to identify vehicles and accurately count their numbers. This annotation includes marking vehicles and specifying the vehicle count per frame. Gathering such data trains the system to recognize scenarios with high vehicle density accurately. This approach significantly influences the system's performance, reducing false alarms and ensuring the system reacts precisely to threshold breaches. Ultimately, this robust training enables the AI system to reliably identify high-traffic instances in real time, enhancing the overall efficiency and reliability of the vehicle counting system.

B. Understanding YOLOv7

We analyze how YOLOv7 processes data and detects vehicles. This model identifies objects in an image and outputs bounding boxes around them, along with their class probabilities. The logic for vehicle counting is embedded within this process.
This involves grasping a clever way computers learn to recognize things like vehicles in pictures or videos. YOLO stands for "You Only Look Once," and the "v7" refers to its version number, which means it's a more advanced and improved version of the original YOLO.
Imagine a computer trying to spot and count different types of cars, trucks, or bikes in a busy street photo. YOLOv7 is like a smart detective: it looks at the entire picture once and quickly figures out where the vehicles are and what types they might be. It's fast, accurate, and efficient.
The "You Only Look Once" part means it doesn't need to take lots of glances or look repeatedly at different parts of the picture. Instead, it takes one glance and uses a network of patterns it learned from many other images to instantly recognize the vehicles and count them.
YOLOv7 is a bit like having a super-smart friend who's seen tons of cars, trucks, and bikes, and can spot them in a picture without needing to check the same spot multiple times.
This version, "v7," is an improved and updated version of this smart system. It's even better at recognizing vehicles accurately and quickly than the previous versions. It's learned from more pictures, refined its skills, and now it's faster and more accurate in identifying vehicles in pictures or videos.

C. Model Training

Model training is like teaching a computer to recognize when there are lots of vehicles in a picture or video. To do this, we're tweaking the computer's learning so that when it sees too many vehicles, it does something specific. This involves changing the computer's instructions so that it can notice when there's a high number of vehicles and then start the system that alerts or makes reports about it. It's like giving the computer a new skill: noticing crowded streets and taking action when there are too many vehicles.
Model training based on labeled data is a fundamental aspect of supervised learning in machine learning. Labeled data refers to a dataset where each input (such as an image or a video frame) is paired with its corresponding output label (like the count of vehicles in that image). For vehicle counting, this means having images or frames where the number of vehicles has been manually annotated or marked.
Model training is essentially teaching an AI system to recognize specific patterns or situations. In this case, it's about making the system understand what a scene looks like when there are a lot of vehicles present. To train the system, we use a lot of examples: pictures or video frames that show different scenarios with varying vehicle counts.
The computer learns from these examples. It figures out what details and features indicate a high number of vehicles. It might notice things like dense clusters of cars, long queues, or crowded intersections. These
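The annotation step described above typically produces labels in the plain-text format used by the YOLO family: one line per object, containing a class index and the box centre and size normalised to the image dimensions. A small conversion sketch follows; the pixel box, class index, and image size are invented examples.

```python
def to_yolo_label(class_id, box, img_w, img_h):
    """Convert a pixel box (x1, y1, x2, y2) into a YOLO-format label line:
    'class cx cy w h' with every coordinate normalised to [0, 1]."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / img_w   # normalised box centre
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w        # normalised box size
    h = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# One annotated car (class 0) in a 640x480 frame:
print(to_yolo_label(0, (100, 120, 300, 240), 640, 480))
# 0 0.312500 0.375000 0.312500 0.250000
```

One such `.txt` file per image, with one line per vehicle, is what the training stage consumes; the per-frame vehicle count is simply the number of lines.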
examples help the AI understand the context in which the vehicle count is considered high.
Once the computer learns these patterns, we adjust its instructions to make it respond when it recognizes this scenario. It's like saying, "Hey, when you see lots of vehicles, let me know." This involves tweaking the system's code so that it can detect these situations accurately.

D. Validation and Fine-Tuning

Validation is essentially a way to test the computer's skills after its learning phase. Imagine you're teaching someone to recognize different types of cars by showing them pictures of cars, trucks, and buses. After this teaching session, you show them a new set of car images they've never seen before and ask them to identify the vehicles. You already know the correct answers for these pictures.
Similarly, after the computer has been trained on various vehicle images, validation involves giving it fresh pictures of vehicles it hasn't encountered during its training. The computer then tries to count the vehicles in these new pictures. You, as the supervisor, compare its counts with what you know to be accurate. If the computer's count matches what you expected, it shows that it has learned properly from the training images.
The crucial part here is that the computer shouldn't simply memorize the vehicles it saw before. It must understand the general concept of vehicles, like knowing a car from a bus or a truck. Validation ensures that the computer can apply what it learned.
When a model is already trained on a large dataset (pre-trained), fine-tuning involves retraining or adjusting some of its layers to improve its performance on a more specialized dataset.
For instance, in a car detection model, fine-tuning might involve modifying the layers of a pre-trained model specialized in object recognition to better differentiate between different car types. This process might include adjusting the learning rate, optimizing specific layers, or modifying the training dataset to focus on certain features that help the model distinguish sedan cars from SUVs accurately. By fine-tuning, the model becomes more adept at detecting and classifying cars with increased precision and accuracy.

E. Real-Time Implementation

In the realm of real-time video and image analysis, the implementation involves utilizing a finely-tuned YOLOv7 model. This specialized model is designed to actively process live video feeds, recorded video streams, or sequences of images in real time. Its sophisticated configuration allows it to swiftly and accurately identify various objects, particularly vehicles, within these visual data sources.
Picture this: Imagine having a smart system that can instantly recognize and track vehicles as they appear in a live video feed. This system, powered by the YOLOv7 model, is trained to swiftly process and interpret these visual inputs. Whether it's a camera capturing the flow of traffic or a series of images showcasing vehicles on the road, the YOLOv7
to new situations accurately. model is like the detective with a keen eye, instantly
If it struggles during this validation phase, it's like spotting and classifying different types of vehicles.
the person you taught recognizing cars having What's fascinating is the flexibility this system
difficulty identifying new cars. To improve, you'd offers. It's not limited to a single type of visual input.
help the person by showing more examples or Whether it's a live camera feed from a busy
explaining different features to look for. Similarly, in intersection or a sequence of images from different
the computer's case, fine-tuning happens—making angles, the YOLOv7 model remains adept at
adjustments to how it understands vehicles so that it identifying vehicles in real time. Its fine-tuned nature
performs better with new, unseen images. This fine- ensures that it can precisely pinpoint vehicles despite
tuning ensures that the computer gets better at variations in lighting, angles, or environmental
recognizing and counting vehicles accurately in conditions.
various scenarios, not just the ones it trained on. In essence, this system's forte lies in its ability to
Fine-tuning in machine learning refers to the process actively process visual information, employing a
of adjusting the parameters or hyperparameters of a finely-tuned YOLOv7 model as its guiding
pre-trained model to adapt it to a specific task or intelligence. It's like having an expert observer
continuously analyzing incoming visuals, ensuring H. Integration and Deployment
that vehicles are swiftly and accurately identified in
any scenario. Integration and deployment mark the stage where all
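The real-time loop can be sketched as follows. This is a minimal sketch, not the paper's implementation: the detector callback stands in for YOLOv7 inference (in the real system, frames would come from something like cv2.VideoCapture and detections from the fine-tuned model), and the index-based memory is a crude stand-in for a proper object tracker.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Detection:
    label: str   # e.g. "car", "bus", "motorcycle"
    x: int       # bounding-box top-left corner
    y: int
    w: int       # bounding-box width and height
    h: int

def process_stream(frames,
                   detect: Callable[[object], List[Detection]],
                   count_line_y: int) -> int:
    """Count vehicles whose bounding-box centre crosses a horizontal line.

    `detect` stands in for running the YOLOv7 model on one frame.
    """
    total = 0
    prev_centres: Dict[Tuple[str, int], int] = {}
    for frame in frames:
        for i, det in enumerate(detect(frame)):
            cy = det.y + det.h // 2               # vertical centre of the box
            key = (det.label, i)                  # naive identity; a real
            prev = prev_centres.get(key)          # tracker would assign IDs
            # count when the centre moves from above the line to below it
            if prev is not None and prev < count_line_y <= cy:
                total += 1
            prev_centres[key] = cy
    return total
```

Counting by line-crossing rather than by raw per-frame detections is what keeps the same vehicle from being counted once in every frame it appears in.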
F. Design Alert System

Imagine having a watchful eye that keeps track of vehicle numbers in real time and sends out an alert when things get a bit crowded. The system works by setting a limit; if, for instance, we agree that 50 vehicles is our maximum, the system will continuously monitor the count against that threshold.

Whenever the number of vehicles crosses this set limit, it's like a red flag waving: the system immediately sends an email alert. This email isn't just a notification; it's a smart one. It contains crucial details about the situation, such as the number of vehicles surpassing the threshold, and even a snapshot of that particular moment, capturing evidence of the crowded scenario.

Think of it as a personal assistant for vehicle counting. It keeps a close eye on the count, and when things get too busy, it sends a detailed report with proof of the situation. This way, you're always in the loop about what's happening with the vehicle count in real time.

G. Performance Evaluation and Optimization

To ensure the system works flawlessly, it undergoes a series of trials: practice runs in which we deliberately create situations that cross the vehicle count limit. We meticulously examine how accurately and quickly the email alert and reporting system responds during these trials. This evaluation is crucial because it helps us fine-tune the system. If there are any hiccups or delays, we look into those areas and make improvements. For instance, if the alert doesn't go out quickly enough, or if the information in the email lacks detail, we tweak the system settings and make the necessary enhancements.

It's all about putting the system through its paces, testing it thoroughly, and then making it better based on the results. Just as a car undergoes a rigorous inspection before hitting the road, our system goes through extensive testing to ensure it's in top-notch condition and performs flawlessly when it counts the most.

H. Integration and Deployment

Integration and deployment mark the stage where all the parts come together. We take our well-tested email alert and reporting features and combine them into the actual live system, like assembling the pieces of a puzzle to create the complete picture. We carefully place the email alert and reporting mechanisms into the system's framework, making sure they fit seamlessly without causing any disruptions. This integration step is critical because it ensures that the new components work smoothly with the existing system.

Before the system goes live, we conduct thorough checks to ensure everything is compatible and functions as expected in the real-time environment, much like trying out a new component in a machine to see whether it works well with the rest without throwing off the entire system. The goal is to make sure that, once live, the email alert and reporting features operate harmoniously, enhancing the system's capabilities without causing glitches or slowdowns. This integration and compatibility check is essential to guarantee a seamless and efficient system once it's up and running.

I. Maintenance and Updates

Maintenance and updates are like taking care of a garden: once the system is running, it needs constant attention to stay in top shape. We keep an eye on how well it's working, much as you would check plants for signs of wilting. We regularly review the system's performance, checking that it's doing its job properly. If we notice any hiccups, such as the system missing an alert or failing to detect a vehicle, we step in and make tweaks, akin to adjusting watering times if the plants seem thirsty or flooded.

Also, just as new flowers are planted to keep a garden vibrant, we update the system regularly. This might involve adding new features or improving existing ones to keep it up to date and efficient. Maintenance and updates are crucial to keeping the system reliable and effective; it's about nurturing the system and making sure it runs smoothly day in and day out.
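The threshold-and-alert behaviour from the Design Alert System section can be sketched with Python's standard email tooling. The 50-vehicle limit is the example figure used above; the recipient address and attachment filename are illustrative placeholders, and actual delivery through an SMTP server (smtplib) is omitted from the sketch.

```python
from email.message import EmailMessage
from datetime import datetime
from typing import Optional

VEHICLE_LIMIT = 50  # illustrative threshold, as in the example above

def build_alert(count: int, snapshot: bytes,
                to_addr: str = "operator@example.com") -> Optional[EmailMessage]:
    """Compose the alert email once the vehicle count exceeds the limit.

    Returns None while the count is within bounds; actual delivery
    (e.g. smtplib.SMTP(...).send_message(msg)) is left out here.
    """
    if count <= VEHICLE_LIMIT:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Vehicle count alert: {count} vehicles (limit {VEHICLE_LIMIT})"
    msg["To"] = to_addr
    msg.set_content(
        f"At {datetime.now():%Y-%m-%d %H:%M:%S} the vehicle count reached "
        f"{count}, exceeding the configured limit of {VEHICLE_LIMIT}."
    )
    # attach the captured frame as evidence of the crowded scenario
    msg.add_attachment(snapshot, maintype="image", subtype="jpeg",
                       filename="snapshot.jpg")
    return msg
```

Returning None below the threshold keeps the monitoring loop simple: it can call build_alert on every count update and only send when a message comes back.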
IV. CONCLUSION AND RESULT

The real-time vehicle counting system was designed
with a strong emphasis on efficiency and
compatibility with commonly available computing
resources. The system's performance was evaluated
on a laptop equipped with an Intel Core i5 processor,
with clock speeds ranging from 1.6 GHz to 2.11
GHz, and varying RAM configurations from 4 GB to
12 GB.
During the testing phase, the proposed method
demonstrated an impressive accuracy rate, ranging
from 90% to 96%, highlighting its reliability and
robustness in accurately detecting and enumerating
vehicles. To further validate the system's
performance, it underwent rigorous testing on a
diverse set of video files, encompassing a total of 635
vehicles. The results were highly encouraging, with
the system successfully identifying and counting 617
vehicles, translating to an outstanding accuracy rate
of 97.16%. The experimental results, as illustrated in
Table 1, showcased the system's remarkable
capabilities.
Video No.  Actual vehicles  Counted vehicles  Accuracy (%)
1          113              110               97.34
2          105              103               98.09
3          107              104               97.19
4          55               53                96.36
5          100              97                97.00
6          85               83                97.64
7          70               67                95.71

Table 1. Experimental results
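The Table 1 figures can be reproduced by dividing counted by actual vehicles; the published values appear to be truncated, not rounded, to two decimals, which the sketch below assumes.

```python
# (actual, counted) pairs taken from Table 1
videos = [(113, 110), (105, 103), (107, 104), (55, 53),
          (100, 97), (85, 83), (70, 67)]

def accuracy(actual: int, counted: int) -> float:
    """Counting accuracy in percent, truncated (not rounded) to two
    decimals -- the convention that reproduces the Table 1 figures."""
    return (counted * 10000 // actual) / 100

per_video = [accuracy(a, c) for a, c in videos]
overall = accuracy(sum(a for a, _ in videos),
                   sum(c for _, c in videos))
```

The same formula applied to the totals (617 counted of 635 actual) gives the overall 97.16% quoted above.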
Fig.1 Flowchart of working model

The test images demonstrate the system's ability to detect and classify different types of vehicles, such as cars, buses, and motorcycles, in recorded video. Each detected vehicle is enclosed within a bounding box, with a label indicating the specific type of vehicle recognized, such as "car," "bus," or "motorcycle." This classification capability is crucial for intelligent transportation systems, as it allows granular analysis and separate counting of the various vehicle categories. The test images also display the real-time vehicle count overlaid on the detected frames; this count represents the total number of vehicles detected and tracked by the system up to that particular frame.

Fig.2 Test image of vehicle detection.

Fig.3 Test image of detecting and counting of vehicles.

The graph below displays the vehicle count and accuracy per video for the series of test videos. The y-axis represents the number of vehicles, while the x-axis shows the video numbers from 1 to 7. The graph has three sets of data, represented by bars and a line:

Actual Vehicles (blue bars): the true number of vehicles present in each video.
Counted Vehicles (green bars): the number of vehicles counted by the system for each video.
Accuracy (red line): the accuracy percentage of the vehicle counting system for each video, calculated by comparing the counted vehicles to the actual vehicles.

By comparing the blue (actual) and green (counted) bars, we can see how accurately the system counted the vehicles in each video; in every test video the system slightly undercounted (green bar shorter than blue). The red accuracy line fluctuates, indicating varying levels of accuracy across the different videos: for example, the accuracy is highest (around 98%) for Video 2, while it drops to around 96% for Video 7. This graph allows a visual comparison of the vehicle counting system's performance across multiple videos, highlighting its strengths and weaknesses in detecting and counting vehicles accurately.

Fig.4 Performance analysis of the model

To ensure comprehensive testing and assess the system's adaptability to various real-world scenarios, the model was evaluated under different lighting conditions and traffic densities. Table 2 presents the results of these tests, which included scenarios such as normal lighting, evening light, daylight, and congested traffic. The system exhibited consistent and reliable performance across all scenarios, further reinforcing its versatility and suitability for deployment in diverse environments.

Scenario      Night   Day     Normal  Congested  Rainy
Accuracy (%)  97.28   96.23   97.23   99.26      96.11

Table 2. Performance analysis of the model in different scenarios

The accuracy ranges from 96.11% to 99.26%. This small range indicates that the system's performance is consistent across varying conditions, with a difference of only about 3% between the highest and lowest values. The model is most accurate in the congested scenario, while its lowest accuracy occurs in the rainy scenario.
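This consistency claim can be checked directly against the Table 2 values:

```python
# Table 2 accuracies by scenario (percent)
scenarios = {"Night": 97.28, "Day": 96.23, "Normal": 97.23,
             "Congested": 99.26, "Rainy": 96.11}

best = max(scenarios, key=scenarios.get)    # scenario with highest accuracy
worst = min(scenarios, key=scenarios.get)   # scenario with lowest accuracy
spread = round(scenarios[best] - scenarios[worst], 2)
```

The spread works out to 3.15 percentage points, consistent with the "about 3%" observation, with the congested scenario best and the rainy scenario worst.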
References

[1] F. Li, C.-H. Lee, C.-H. Chen, and L. P. Khoo, "Hybrid data-driven vigilance model in traffic control center using eye-tracking data and context data," Adv. Eng. Inform., vol. 42, 2019.

[2] H. Zha, Y. Miao, T. Wang, Y. Li, J. Zhang, W. Sun, Z. Feng, and K. Kusnierek, "Improving unmanned aerial vehicle remote sensing-based rice nitrogen nutrition index prediction with machine learning," Remote Sens., 2020.

[3] A. Özlü, "Vehicle Detection, Tracking and Counting," May 5, 2018.

[4] P. K. Bhaskar and S. P. Yong, "Image processing based vehicle detection and tracking method," Proc. Int. Conf. on Computer and Information Sciences (ICCOINS), pp. 1–5, June 2014.

[5] D. Hein, "Traffic Light Detection with Convolutional Neural Networks and 2D Camera Data," PhD thesis, FU Berlin, 2020.

[6] D. Li, B. Liang, and W. Zhang, "Real-time moving vehicle detection, tracking, and counting system implemented with OpenCV," Proc. 4th IEEE Int. Conf. on Information Science and Technology (ICIST), pp. 631–634, April 2014.

[7] P. K. Bhaskar and S. P. Yong, "Image processing based vehicle detection and tracking method," Proc. 2014 Int. Conf. on Computer and Information Sciences (ICCOINS), pp. 1–5, June 2014.

[8] T. Kryjak, M. Komorkiewicz, and M. Gorgon, "Hardware-software implementation of vehicle detection and counting using virtual detection lines," Proc. 2014 Conf. on Design and Architectures for Signal and Image Processing (DASIP), pp. 1–8, October 2014.

[9] Y. Liu, G. Li, S. Hu, and T. Ye, "Real-time detection of traffic flow combining virtual detection-line and contour feature," Proc. 2011 Int. Conf. on Transportation, Mechanical, and Electrical Engineering (TMEE), pp. 408–413, December 2011.

[10] B. Jianyong, Y. Runfeng, and Y. Yang, "A novel vehicle's shadow detection and removal algorithm," Proc. 2nd Int. Conf. on Consumer Electronics, Communication and Networks (CECNet), pp. 822–826, April 2012.

[11] Y. Lu, H. Xin, J. Kong, B. Li, and Y. Wang, "Shadow removal based on shadow direction and shadow attributes," Proc. Int. Conf. on Computational Intelligence for Modelling, Control and Automation, 2006.

[12] H. Wang, Z. Li, X. Ji, and Y. Wang, "Face R-CNN," arXiv preprint arXiv:1706.01061, 2017.

[13] N. Ma, X. Zhang, H.-T. Zheng, and J. Sun, "ShuffleNet V2: Practical guidelines for efficient CNN architecture design," Proc. European Conf. on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018, pp. 116–131.

[14] P. Dollár, M. Singh, and R. Girshick, "Fast and accurate model scaling," Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021, pp. 924–932.

[15] S.-H. Wang, S. L. Fernandes, Z. Zhu, and Y.-D. Zhang, "AVNC: Attention-based VGG-style network for COVID-19 diagnosis by CBAM," IEEE Sens. J., vol. 22, pp. 17431–17438, 2021.

[16] A. Özlü, "Vehicle Detection, Tracking and Counting," May 5, 2018.

[17] K. Suneetha and K. Mounika Raj, "Automatic Vehicle Number Plate Recognition System (AVNPR) using OpenCV Python," Lecture Notes in Networks and Systems, vol. 215: Proc. 2nd Int. Conf. on Computational and BioEngineering, Springer Science and Business Media Singapore, ISBN 978-981-16-1940-3, September 2021.

[18] "A computer vision-based vehicle detection and counting system," Proc. 8th Int. Conf. on Knowledge and Smart Technology (KST), 3–6 February 2016.

[19] S.-H. Wang, S. L. Fernandes, Z. Zhu, and Y.-D. Zhang, "AVNC: Attention-based VGG-style network for COVID-19 diagnosis by CBAM," IEEE Sens. J., vol. 22, pp. 17431–17438, 2021.

[20] R. Girshick, "Fast R-CNN," Proc. 2015 IEEE Int. Conf. on Computer Vision (ICCV), pp. 1440–1448, 2015.