BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
Submitted by
H. Dinesh (319126512085)
A. Pavan Kalyan (319126512067)
V. Shanmukha Raju (319126512126)
Ch. Kumar Charukesh (319126512076)
We would like to express our deep gratitude to our project guide V. Shireesha, Assistant
Professor, Department of Electronics and Communication Engineering, ANITS, for her
guidance with unsurpassed knowledge and immense encouragement. We are grateful to
Dr. B. Jagadeesh, Head of the Department, Electronics and Communication
Engineering, for providing us with the required facilities for the completion of the
project work.
We are very much thankful to the Principal and Management, ANITS, Sangivalasa,
for their encouragement and cooperation to carry out this work.
We would like to thank our parents, friends, and classmates for their encouragement
throughout our project period. Last but not least, we thank everyone who supported us
directly or indirectly in completing this project successfully.
PROJECT STUDENTS
H. Dinesh (319126512085)
A. Pavan Kalyan (319126512067)
V. Shanmukha Raju (319126512126)
Ch. Kumar Charukesh (319126512076)
ABSTRACT
The proposed work addresses the problem of driver drowsiness detection and alerting.
The proposed algorithm uses features of deep convolutional neural networks, namely
ResNet (Residual Neural Network) and MC-KCF (Multi-Convolutional Neural
Network with KCF). These techniques detect driver drowsiness by measuring yawning,
head position, and eye rotation.
CONTENTS
ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
CHAPTER 1: Introduction
1.1 Introduction
1.2 Problem Statement
1.3 Project Requirements
1.3.1 Software
1.3.2 Hardware
CHAPTER 2: Literature Survey
2.1 Drowsiness Detection System Using Physiological Signals
2.2 Drowsiness Detection with OpenCV using EAR
2.3 Driver Drowsiness Detection using ANN Image Processing
CHAPTER 3: Methodology
3.1 Proposed Methodology
3.1.1 Existing System
3.1.2 Proposed System
3.2 Proposed Techniques
3.2.1 Artificial Intelligence (A.I)
3.2.2 Machine Learning
3.2.3 Deep Learning
3.2.4 Neural Networks
3.2.5 Convolution Neural Network (CNN)
3.2.5.1 Convolutional Layer
3.2.5.2 Pooling Layer
3.2.5.2.1 Max Pooling
3.2.5.2.2 Average Pooling
3.2.5.2.3 Global Pooling
3.2.5.3 Fully Connected Layer
3.2.5.4 Dropout Layer
3.2.5.5 RESNET
3.2.5.5.1 Residual Blocks
3.2.5.5.2 Architecture of RESNET
3.2.5.5.3 Using ResNet with Keras
3.2.6 Back Propagation
3.2.7 Activation Functions
3.2.8 Training
3.2.8.1 Test Loss
3.2.8.2 Test Accuracy
3.2.8.3 Validation Loss
3.2.8.4 Validation Accuracy
3.2.9 Train Dataset
3.2.10 Testing
CHAPTER 4: Software Used
4.1 Python
4.2 Jupyter Notebook
4.3 Libraries
4.3.1 Open Source Computer Vision Library
4.3.2 Numerical Python
4.3.3 TensorFlow
4.3.4 Keras
4.3.5 Matplotlib
4.3.6 OS module in Python
4.3.7 Dlib
CHAPTER 5: Experimental Results
5.1 Results through Live Face Tracking
5.1.1 Frames Recognized as Drowsy
5.1.2 Frames Recognized as Not Drowsy
CHAPTER 6: Conclusion and Future Scope
6.1 Conclusion
6.2 Future Scope
LIST OF FIGURES
LIST OF ABBREVIATIONS
AI : Artificial Intelligence
BSD : Berkeley Software Distribution
CNN : Convolution Neural Network
CUDA : Compute Unified Device Architecture
EAR : Eye Aspect Ratio
ECG : Electrocardiography
EEG : Electroencephalogram
EOG : Electrooculography
FC : Fully Connected
FFT : Fast Fourier Transform
GB : Gigabyte
GPU : Graphics Processing Unit
GUI : Graphical User Interface
HF : High Frequency
HRV : Heart Rate Variability
I/O : Input/Output
IDE : Integrated Development Environment
IT : Information Technology
KCF : Kernelized Correlation Filter
LF : Low Frequency
LSST : Large Synoptic Survey Telescope
MATLAB : Matrix Laboratory
MEMS : Micro-Electromechanical Systems
OS : Operating System
PPG : Photoplethysmography
PyPI : Python Package Index
RBF : Radial Basis Function
ReLU : Rectified Linear Unit
RESNET : Residual Neural Network
ROC : Receiver Operating Characteristic
SVM : Support Vector Machine
TV : Television
URL : Uniform Resource Locator
USB : Universal Serial Bus
VGG : Visual Geometry Group
YawDD : Yawning Detection Dataset
CHAPTER 1
Introduction
1.1 Introduction
Around 1.3 million people die each year in road accidents, which are primarily
caused by driver distraction and drowsiness. Many people travel long distances on
highways, which can lead to fatigue and stress. Drowsiness can arise unexpectedly,
resulting from sleep disorders, medication, or simply the boredom of driving for long
periods. It can therefore create hazardous situations and increase the likelihood of
accidents.
The project team has developed a solution to prevent such accidents. The system
involves utilizing a camera to capture the user's visual features, with the use of face
detection and CNN techniques to identify any signs of drowsiness in the driver. When
drowsiness is detected, an alarm will sound to alert the driver, prompting them to take
precautionary measures. The detection of driver drowsiness is instrumental in reducing
the number of fatalities caused by traffic accidents.
1.2 Problem Statement
Road accidents caused by human errors are responsible for numerous fatalities
and injuries worldwide. The primary reason behind such accidents is the driver's
drowsiness, which could result from sleep deprivation or prolonged driving hours. To
address this issue, it is imperative to develop a system that leverages the latest available
technologies to minimize the likelihood of accidents. The main objective of this system
is to create a model that can issue an alert in case the driver shows signs of drowsiness.
This alert will help the driver become aware of their condition and take the necessary
measures to prevent an accident.
1.3 Project Requirements
1.3.1 Software
The software required is:
• Windows, Linux, or macOS as the operating system.
• Python 3.10 (or a recent version) as the programming language.
• A Python IDE or Jupyter Notebook as the development environment.
1.3.2 Hardware
The minimum hardware required is:
• High computational processor
• Minimum 4 GB RAM
• Webcam which supports night vision
• Alarm
CHAPTER 2
LITERATURE SURVEY
2.1 Drowsiness Detection System Using Physiological Signals
2.1.1 EEG METHOD
A system has been suggested for identifying driver fatigue and exhaustion to
prevent car accidents caused by sleepy or drowsy drivers. This system uses
electroencephalogram (EEG) signals to determine the degree of sleepiness or
drowsiness experienced by a driver. The system first identifies an index that
corresponds to various levels of drowsiness. A low-cost neuro-signal acquisition device
with a single electrode is used to obtain the EEG signal from the driver. A dataset of
simulated car drivers experiencing different levels of drowsiness was collected locally
to evaluate the system. The findings indicated that the proposed system successfully
detected fatigue in all subjects.
2.1.2 PULSE SENSOR METHOD
In the past, mobile applications have been created to detect driver drowsiness.
However, these applications can distract drivers and lead to accidents. To address this
issue, Leng et al. developed a drowsiness detection system in the form of a
custom-designed wearable wristband featuring a PPG sensor and a galvanic skin
response sensor. The information gathered by these sensors is sent to a mobile device,
which functions as the primary assessment unit. Motion sensors in the mobile device
analyse the data, and five features (heart rate, breathing rate, stress level, pulse rate
variability, and the count of adjustments made) are extracted for computation. These
features are then used as input parameters for an SVM classifier, which assesses the
driver's level of drowsiness. The experiment indicated an accuracy of up to 98.02% for
the proposed system. When drowsiness is detected, the mobile device issues a warning
that uses both visual and vibrational alerts to notify the driver.
2.1.4 WIRELESS WEARABLES METHOD
2.2 Drowsiness Detection with OpenCV using EAR
This paper proposes an algorithm that can detect eye blinks in real time using
video footage from a standard camera. The algorithm relies on facial landmark
detectors trained on in-the-wild datasets (images of people in everyday settings), which
makes them highly robust to head orientation, varying illumination, and facial
expressions. These detectors can precisely locate facial landmarks, and the algorithm
uses them to estimate the degree of eye opening, which is essential for detecting eye
blinks accurately.
At present, there are facial landmark detectors available that can accurately
capture various key points on a human face image, including the corners of the eyes
and the eyelids, with high reliability in real-time. These landmark detectors are
advanced and use a regression approach, where a mapping is learned from an image to
the positions of the landmarks or another landmark parametrization. They are trained
on datasets that contain images taken in diverse settings, which makes them robust to
challenges like changes in illumination, various facial expressions, and moderate
non-frontal head rotations.
Proposed method:
A blink is a quick closing and reopening of the eyes, and the pattern of blinks
varies slightly from person to person, including differences in speed, degree of eye
squeezing, and blink duration. Typically, an eye blink lasts between 100 and
400 milliseconds. In this paper, it is proposed to use advanced facial landmark detectors
to locate the eyes and define the shape of the eyelids in an image. Based on the
landmarks detected, the eye aspect ratio (EAR) is computed as an indicator of the
degree of eye-opening. However, since the EAR value in each frame may not be able
to accurately detect eye blinks, a classifier is trained to analyse a longer sequence of
frames. When an eye is open, the EAR value remains relatively stable, but it gradually
decreases towards zero as the eye closes.
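For illustration, the EAR just described can be computed directly from the six eye
landmarks (p1 through p6); the following is a minimal Python sketch, assuming the
landmarks are given as (x, y) coordinates in a NumPy array:

```python
import numpy as np

def eye_aspect_ratio(eye):
    # eye: NumPy array of six (x, y) landmarks ordered p1..p6 around one eye
    a = np.linalg.norm(eye[1] - eye[5])  # vertical distance p2-p6
    b = np.linalg.norm(eye[2] - eye[4])  # vertical distance p3-p5
    c = np.linalg.norm(eye[0] - eye[3])  # horizontal distance p1-p4
    return (a + b) / (2.0 * c)           # stable when open, approaches 0 when closed
```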
2.3 Driver Drowsiness Detection using ANN Image Processing
EEG and EOG sensor electrodes need to be positioned on specific parts of the
body and connected through conductive gel or wires, causing discomfort to the user.
However, the problems associated with traditional EEG methods may be addressed by
advancements in materials science and MEMS technology, including the use of dry
electrodes for EEG.
The researchers used MATLAB Neural Network Toolbox and Deep Learning
Toolbox's autoencoder module to determine if these tools could be used to classify
driver drowsiness based on images. They acquired 200 images of a driver during normal
driving, with half showing open or half-open eyes and the other half showing closed
eyes. The hypothesis was that closed eyes would indicate drowsiness, while open or
half-open eyes would indicate an alert state. They used a one-layer artificial neural
network for analysis.
CHAPTER 3
Methodology
3.1 Proposed Methodology
3.1.1 Existing System
There are several existing systems for detecting driver drowsiness.
3.1.2 Proposed System
A Convolutional Neural Network (CNN) is the model used in this scenario;
CNNs are frequently used for image classification, including multi-class
classification. The CNN comprises convolution layers that contain adaptable filters. The
filters are moved across the input in a forward propagation process, with each
movement known as a stride. The CNN model enhances the accuracy of the system.
A camera captures continuous images of the driver's face, and a face detection
process identifies the driver's face. The driver's face is then classified as either drowsy
or not drowsy using a CNN-based classification model. The KCF and RESNET CNN
are utilized to construct the classification model.
3.2 Proposed Techniques
3.2.1 Artificial Intelligence (A.I):
There are different types of AI, including rule-based or symbolic AI, machine
learning, and deep learning. Rule-based or symbolic AI uses pre-defined rules and logic
to make decisions based on a set of if-then statements. Machine learning is a type of AI
that allows machines to learn from data without being explicitly programmed. It uses
algorithms to identify patterns in data and make predictions based on those patterns.
Deep learning is a type of machine learning that uses neural networks to simulate the
way the human brain works. It can process vast amounts of data and make predictions
with high accuracy.
AI has many applications. In computer vision, it is used to analyse images and videos,
allowing machines to recognize and identify objects, people, and other visual
information. In robotics, AI is used to create intelligent robots that can perform tasks
autonomously. In healthcare, AI is used to analyse medical data and make predictions
about diseases, allowing doctors to provide better diagnoses and treatments. In finance,
AI is used to analyse financial data and make predictions about markets, allowing
investors to make better decisions.
3.2.2 Machine Learning:
Although various types of machine learning algorithms are used for different
use-cases, the three primary techniques currently in use are:
• Supervised ML algorithms
• Unsupervised ML algorithms
• Reinforcement ML algorithms
Out of the various types of machine learning algorithms available, the one
utilized in this system is a supervised machine learning algorithm.
3.2.3 Deep Learning:
Deep learning is a form of artificial intelligence that emulates the human brain's
ability to analyse data and detect patterns to support decision-making. It is a subdivision
of machine learning in AI that employs networks capable of unsupervised learning from
unstructured or unlabelled data. This technique is also known as deep neural learning
or deep neural networks.
Deep learning is a subfield of artificial intelligence that emulates the way human
brains process information to perform various tasks, including speech recognition,
object detection, language translation, and decision-making. Unlike traditional machine
learning methods, deep learning can learn autonomously without human intervention
and can handle unstructured and unlabelled data. This technology has diverse
applications, such as preventing fraud and money laundering, among other use cases.
3.2.4 Neural Networks:
Neural networks are artificial systems that are modelled after the structure
and function of biological neural networks. These networks are capable of learning and
adapting to new information without the need for explicit instructions. Instead, they
analyse datasets and examples to identify patterns and relationships on their own.
In neural networks, units across multiple layers are connected to one another,
with each connection carrying a weight that determines the impact of one unit on the
other. The network receives data at the input layer, which is then transmitted through
various layers before producing the final output at the output layer. Along the way, the
network learns from the data and gains a deeper understanding of it, allowing it to make
accurate predictions or classifications.
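To make this layered structure concrete, here is a minimal Keras sketch of a small fully
connected network; the layer sizes and the two-class output are illustrative, not the
exact model used in this project:

```python
import tensorflow as tf

# Input layer -> two hidden layers -> output layer. Every connection
# between units carries a learned weight, and every unit a bias.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100,)),              # 100 input features
    tf.keras.layers.Dense(64, activation="relu"),     # hidden layer 1
    tf.keras.layers.Dense(32, activation="relu"),     # hidden layer 2
    tf.keras.layers.Dense(2, activation="softmax"),   # e.g. drowsy / not drowsy
])
model.summary()
```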
Figure 2 Layers in Neural Network
3.2.5 Convolution Neural Network (CNN):
3.2.5.1 Convolutional Layer:
The convolution layer consists of a set of filters, also called kernels or feature
maps, that slide over the input data and perform the convolution operation. Each filter
detects a specific feature or pattern in the input data, such as edges or corners. The
output of the convolution layer is a set of feature maps, where each map represents the
response of one filter to the input data. The feature maps are typically down sampled
using a pooling operation, such as max pooling or average pooling, to reduce their size
and computational complexity. The parameters of the convolution layer include the size
of the filters, the number of filters, and the padding and stride values used during the
convolution operation. These parameters are learned during training using
backpropagation, allowing the network to learn the best set of filters for a given task.
Figure 3 Operation of Convolution
Stride
Stride refers to the number of pixels by which the filter shifts across the input
matrix. The filter moves from left to right across the width of the image using a specified
stride value. Once the filter reaches the end of a row, it moves down by the same stride
value and returns to the left edge of the image, repeating the process until it has
traversed the entire image. If the stride value is 1, the filters move one pixel at a time.
If the stride value is 2, the filters move two pixels at a time. The diagram below
illustrates how convolution operates with a stride value of 2.
Figure 4 Stride
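As a rough sketch of how the filter count, kernel size, stride, and padding fit together
in Keras (the sizes here are illustrative, not this project's configuration):

```python
import tensorflow as tf

# 32 learnable 3x3 filters with stride 2; "same" padding pads the input
# so that a stride of 2 exactly halves the spatial size.
conv = tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=2,
                              padding="same", activation="relu")
x = tf.random.normal((1, 64, 64, 3))  # one 64x64 RGB image
print(conv(x).shape)                  # (1, 32, 32, 32)
```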
3.2.5.2 Pooling Layer:
3.2.5.2.1 Max Pooling:
In max pooling, the input feature map is divided into non-overlapping regions
or windows, typically of size 2x2 or 3x3. For each window, the maximum value within
that region is selected and placed in the output feature map, while the other values are
discarded. This process is repeated for each window, effectively down sampling the
feature map and retaining only the strongest activations.
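A tiny worked example of 2x2 max pooling on a 4x4 feature map (the values are
illustrative):

```python
import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 5, 7],
                 [1, 2, 3, 4]], dtype=float)

# Split the map into 2x2 windows and keep only the maximum of each.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6. 2.]
               #  [2. 7.]]
```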
Figure 6 Operation of Average Pooling
3.2.5.2.3 Global Pooling:
Global pooling is a specific type of pooling layer where the entire feature map
is reduced to a single value, instead of dividing the feature map into regions. The most
common type of global pooling is global average pooling, where the average value of
all the activations in the feature map is computed and placed in the output. Global
pooling is often used in the final layers of a CNN for classification tasks, where the
output of the network needs to be a fixed-size vector representing the probability
distribution over the possible classes.
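For instance, global average pooling collapses each channel of a feature map to a single
number; a short sketch with illustrative shapes:

```python
import tensorflow as tf

x = tf.random.normal((1, 7, 7, 512))  # e.g. a final 7x7 map with 512 channels
gap = tf.keras.layers.GlobalAveragePooling2D()
print(gap(x).shape)  # (1, 512): one averaged value per channel
```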
3.2.5.3 Fully Connected Layer:
In deep learning, a fully connected layer (FC layer) is a type of layer in a neural
network where all the neurons in one layer are connected to all the neurons in the next
layer. This means that every input neuron is connected to every output neuron, and each
connection has a corresponding weight and bias.
3.2.5.4 Dropout Layer:
The dropout layer randomly selects a subset of neurons in the previous layer
and sets their outputs to zero during training. This means that the information flow
through those neurons is temporarily removed from the network, and the remaining
neurons must learn to work together to compensate for the missing information. The
dropout layer is applied during the training phase only, and the full set of neurons is
used during testing.
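A minimal Keras sketch of this behaviour (the rate of 0.5 is illustrative); note that
Keras additionally rescales the surviving activations during training so that their
expected sum is unchanged:

```python
import tensorflow as tf

drop = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones((1, 8))
print(drop(x, training=True))   # about half the entries zeroed, the rest scaled by 2
print(drop(x, training=False))  # identity at test time
```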
3.2.5.5 RESNET
The key idea behind ResNet is the use of residual connections, which allow the
network to "skip" over layers and make it easier for gradients to flow back through the
network during training. In a standard neural network, each layer applies a set of
transformations to the input, but in a ResNet, some of the layers have a "shortcut"
connection that adds the input to the output of the layer. This creates a "residual" that
can be passed forward to the next layer, allowing the network to learn more complex
and deeper representations.
Figure 7 Residual Blocks
The residual block has been shown to be highly effective in enabling the training of
very deep neural networks. By adding shortcut connections between the layers, the
gradient can flow more easily through the network during backpropagation, which
reduces the vanishing gradients problem and enables the training of deeper networks.
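A sketch of a basic residual block in Keras, assuming the input already has `filters`
channels so the identity shortcut can be added directly (the exact block differs between
ResNet versions):

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x                                      # the skip connection
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])                   # add the input back in
    return layers.ReLU()(y)
```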
Figure 8 Architecture of ResNet
3.2.5.5.3 Using ResNet with Keras:
Keras is a deep-learning library that is available for free and can be used on top
of TensorFlow. Within Keras, there is a feature called Keras Applications, which offers
different versions of ResNet.
• ResNet50
• ResNet50V2
• ResNet101
• ResNet101V2
• ResNet152
• ResNet152V2
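For example, a pre-trained ResNet50 can be loaded from Keras Applications in a few
lines; the input size and the choice to freeze the weights below are illustrative
transfer-learning decisions, not this project's exact setup:

```python
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet",        # ImageNet pre-trained weights
                include_top=False,         # drop the 1000-class classifier head
                input_shape=(224, 224, 3))
base.trainable = False                     # freeze the backbone for fine-tuning
```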
3.2.6 Back Propagation:
Figure 9 Back Propagation
3.2.7 Activation Functions:
1. Sigmoid function
2. Rectified Linear Unit (ReLU) function
3. Hyperbolic Tangent (tanh) function
4. SoftMax function
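For reference, all four can be written directly in NumPy (a sketch):

```python
import numpy as np

x = np.array([-2.0, 0.0, 2.0])

sigmoid = 1.0 / (1.0 + np.exp(-x))     # squashes values into (0, 1)
relu    = np.maximum(0.0, x)           # zero for negatives, identity otherwise
tanh    = np.tanh(x)                   # squashes values into (-1, 1)
softmax = np.exp(x) / np.exp(x).sum()  # turns a vector into probabilities
```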
3.2.8 Training:
Deep learning neural networks are designed to learn how to map inputs to
outputs. This is accomplished by adjusting the weights of the network in response to
the errors made by the model on the training dataset. These adjustments are made
continuously to minimize the error until the learning process either comes to a stop or
an acceptable level of accuracy is achieved. In other words, the goal of the network is
to continually refine its mapping function to produce more accurate and precise results.
The optimization problem in deep learning neural networks is typically solved
using the stochastic gradient descent algorithm. This algorithm utilizes the
backpropagation algorithm to update the model's parameters in each iteration. In other
words, the stochastic gradient descent algorithm is responsible for adjusting the weights
of the network based on the errors calculated during the backpropagation process.
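A minimal Keras sketch of this setup; the architecture, input size, and hyperparameters
are illustrative, and `train_images`/`train_labels` are hypothetical arrays:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(24, 24, 1)),  # e.g. grayscale eye crops
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),    # drowsy / not drowsy
])

# Stochastic gradient descent adjusts the weights using the gradients
# that backpropagation computes for each batch.
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# history = model.fit(train_images, train_labels, epochs=10, validation_split=0.2)
```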
A neural network model learns how to map a particular set of input variables to
the output variable using examples. The goal is to ensure that this mapping works well
not only on the training dataset but also on new, unseen examples. This ability to
function effectively on seen as well as new examples is known as the model's ability
to generalize. Essentially, the model must be able to apply what it has learned to new,
unseen data while still producing accurate results.
3.2.8.1 Test Loss:
The test loss is computed using a loss function that compares the predicted output of
the model to the actual output for each input in the test dataset. The loss function used
depends on the type of problem being solved.
3.2.8.2 Test Accuracy:
To compute test accuracy, the model is first trained on a training dataset,
and its performance is evaluated on a validation dataset. Once the model is trained and
tuned to perform well on the validation dataset, it is then tested on a separate test dataset
to evaluate its performance on new, unseen data.
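In Keras this final step is a single call; continuing the hypothetical sketch above, with
`test_images`/`test_labels` as the held-out test arrays:

```python
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=0)
print(f"test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")
```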
3.2.10 Testing:
When it comes to Machine Learning models, the term "testing" typically refers
to evaluating the accuracy or precision of the model. This is different from the use of
the term in traditional software development.
CHAPTER 4
Software Used
4.1 Python
One of the key features of Python is its extensive library of modules, which
allow developers to perform a wide range of tasks without having to write code from
scratch. Python is also known for its versatility, with applications ranging from web
development and data analysis to scientific computing and machine learning.
Python's popularity has grown steadily over the years, due in part to its active
community of developers who have contributed to its open-source codebase. Today,
Python is widely used in academia, industry, and government, and it is considered one
of the most popular programming languages in the world. Python is supported on a
wide range of platforms, including Windows, Linux, and macOS, and it has become a
popular choice for scripting and automation tasks, as well as web development using
frameworks like Django and Flask. With its powerful and flexible syntax, rich library
of modules, and active community, Python continues to be a popular choice for
developers and businesses alike.
4.2 Jupyter Notebook
Jupyter Notebook provides an interactive computing environment, where users
can write and execute code in cells. Each cell can contain code, markdown text, or raw
text. The output of a code cell is displayed directly below the cell, allowing users to see
the results of their code immediately.
4.3 Libraries
4.3.1 Open Source Computer Vision Library:
The initial version of OpenCV was 1.0, and it is available under a BSD license,
making it free for commercial and academic purposes. OpenCV has interfaces for Java,
Python, C++, and C and is supported by various operating systems, including Mac OS,
Windows, Linux, iOS, and Android. Its primary objective was to support real-time
applications, which is why it is built with optimized C/C++ code to take advantage of
multi-core processing.
OpenCV Functionality:
• Image and video input/output: OpenCV can read and write image and video
files in various formats, such as JPEG, PNG, BMP, and MPEG.
• Image processing: OpenCV provides a wide range of image processing
functions such as filtering, thresholding, edge detection, morphology, and
many more.
• Feature detection and extraction: OpenCV includes algorithms for detecting
and extracting various features from images, such as corners, blobs, and
lines.
• Object detection and recognition: OpenCV provides several object detection
and recognition algorithms, such as face detection, pedestrian detection, and
object recognition.
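As a small sketch of the face-detection capability listed above (the file name and
detector parameters are illustrative):

```python
import cv2

img = cv2.imread("driver.jpg")               # hypothetical input frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Haar-cascade face detector shipped with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```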
Applications of OpenCV:
Many real-world applications are built using OpenCV.
4.3.2 Numerical Python:
NumPy (short for Numerical Python) is a popular Python library used for
numerical computing and scientific computing. It provides a powerful array computing
functionality and a wide range of mathematical functions for working with arrays,
matrices, and other numerical data structures.
NumPy provides an array object that is like a list or a Python array, but with
additional features such as fast and efficient indexing, slicing, and broadcasting. NumPy
arrays are also homogeneous, meaning that all elements in an array must have the same
data type, which allows for more efficient memory allocation and computation. In
addition to arrays, NumPy provides a wide range of mathematical functions for working
with arrays, including basic arithmetic operations, linear algebra, Fourier transforms,
random number generation, and more. It also integrates well with other scientific
computing libraries such as SciPy, Matplotlib, and Pandas.
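A brief sketch of the array features described above:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(a * 2)                       # fast elementwise arithmetic
print(a[:, 1])                     # slicing: the second column -> [2 5]
print(a + np.array([10, 20, 30]))  # broadcasting a row across the array
```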
4.3.4 Keras:
Keras is a high-level deep learning API for Python. It was developed by François
Chollet and is now a part of the TensorFlow library. Keras is known for its
user-friendliness, modular design, and flexibility, and it is widely used by data
scientists and machine learning practitioners.
4.3.5 Matplotlib:
Matplotlib can be used to create a wide range of plots, including line plots, scatter
plots, bar plots, histograms, heatmaps, and more. It provides a range of customization
options for creating plots that meet specific requirements, including the ability to
modify colours, fonts, labels, axes, and annotations. Matplotlib is an open-source
library and is actively maintained and developed by a community of contributors. It is
widely used in various fields, including scientific research, finance, engineering, and
machine learning.
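For instance, training curves like those discussed in Chapter 5 can be plotted in a few
lines (the numbers here are purely illustrative):

```python
import matplotlib.pyplot as plt

epochs = range(1, 6)
plt.plot(epochs, [0.71, 0.80, 0.86, 0.90, 0.92], label="training accuracy")
plt.plot(epochs, [0.68, 0.76, 0.82, 0.85, 0.86], label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()
```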
4.3.6 OS module in Python:
The OS module in Python provides a way to interact with the operating system
on which the Python interpreter is running. It provides a way to access file and directory
management functionality, system information, and process management. The module
includes functions for creating, deleting, moving, and renaming files and directories, as
well as executing shell commands, accessing environment variables, and creating child
processes. The os.path submodule provides functions for working with file paths and
manipulating file and directory names. The OS module is a powerful tool for interacting
with the operating system from within a Python program. It allows developers to create
scripts that can automate common tasks, such as file and directory management, system
administration, and process control.
• os.path: This submodule provides functions for working with file paths and
manipulating file and directory names.
• os.system: This function executes a shell command from within a Python
script and returns its exit status.
• os.environ: This mapping provides access to the environment variables on the
system, with a key-value pair for each environment variable.
• os.fdopen: This function opens a file descriptor using the file object interface,
returning a file object that can be used for reading and writing.
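A short sketch of these facilities in use:

```python
import os

print(os.getcwd())                   # current working directory
for name in os.listdir("."):         # directory contents
    print(os.path.join(os.getcwd(), name))

print(os.environ.get("PATH", ""))    # read an environment variable
status = os.system("echo hello")     # run a shell command; returns its exit status
```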
4.3.7 Dlib:
Dlib is a popular C++ library for developing machine learning and computer
vision applications. It is known for its high performance and flexibility and is widely
used by researchers and developers in the field. dlib includes several pre-trained models
for various tasks, such as face detection, facial landmark detection, object detection,
and image segmentation. These models can be easily integrated into applications and
used to quickly achieve state-of-the-art performance. In addition to pre-trained models,
dlib provides tools for training custom models. These include support for various types
of machine learning algorithms, such as SVMs, decision trees, and neural networks, as
well as efficient optimization algorithms for training these models.
dlib also includes several utility classes and functions for working with images,
matrices, and other data structures commonly used in machine learning and computer
vision. It is designed to be portable across platforms and can be used on a variety of
operating systems, including Windows, macOS, Linux, and Android.
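A sketch of dlib's face detection and 68-point landmark prediction, the building blocks
typically used for eye-region extraction in systems like this one; the image path is
illustrative, and the .dat file is dlib's standard pre-trained predictor, distributed
separately:

```python
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("driver.jpg")  # hypothetical input image
for face in detector(img):
    shape = predictor(img, face)
    # Points 36-41 outline the subject's right eye in the 68-point scheme.
    right_eye = [(shape.part(i).x, shape.part(i).y) for i in range(36, 42)]
    print(right_eye)
```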
CHAPTER 5
Experimental Results
Input:
The input to the system is a human face image. The system uses the MRL eye dataset,
consisting of 80,000 cropped eye-region images; the following are driver input images
that were given to the detection system.
Output:
The input is captured through a camera; face tracking and detection are then done
through MC-KCF. Once face detection and image resizing have been carried out, the
resulting images are as follows:
From the above images, features such as the eyes are extracted. The extracted features
are passed through the ResNet CNN, which triggers a buzzer alert if the driver is in a
drowsy state.
Figure 17 Person 5 in drowsy state
As the figures in this section show, the system detects the eye region within the whole
facial expression, and since the eyes are closed, the detection system alerts the driver
through a buzzer sound.
Figure 20 Person 3 in alert state Figure 21 Person 4 in alert state
In general, test loss is calculated by evaluating the model on a dataset that was not used during
training. A lower test loss indicates that the model is more accurate and better at
generalizing to new data. The test accuracy is calculated as the percentage of correctly
predicted labels in the test dataset. Validation loss is computed by evaluating the model
on the validation dataset using a loss function, such as mean squared error or cross-
entropy. The goal of training a model is to minimize the validation loss, which indicates
that the model is becoming more accurate at predicting outputs for new, unseen data.
Validation accuracy is computed by evaluating the model on the validation dataset
and measuring the percentage of correct predictions. The goal of training a model is to
maximize the validation accuracy and minimize the validation loss, which indicates
that the model is becoming more accurate at predicting outputs for new, unseen data.
From the above test results, the model shows minimal test and validation loss and
maximal validation accuracy, which indicates that the model is well trained and
produces better results than previous models in detecting drowsiness.
CHAPTER 6
Conclusion and Future Scope
6.1 Conclusion
This project proposes a new method for driver drowsiness detection using a
combination of MC-KCF and ResNet CNN. The proposed method utilizes the
advantages of both methods to track eye movements using MC-KCF and classify the
driver's face to determine the drowsiness through ResNet CNN in real-time.
Thus, in this project, an algorithm for fatigue detection using MC-KCF and ResNet
CNN is successfully designed and executed. Overall, this project provides a
promising approach to driver drowsiness detection that can help improve road safety
and prevent accidents caused by driver fatigue.
6.2 Future Scope
The future scope for driver drowsiness detection using MC-KCF and Residual
Neural Networks is vast and promising. Here are a few potential areas of development
and improvement for this project:
Multi-modal input: While the current implementation of the project relies solely
on visual cues to detect drowsiness, incorporating other modalities, such as audio or
physiological signals, could improve the accuracy and robustness of the system.