Accident Analysis and Prevention 126 (2019) 95–104


Detection and prediction of driver drowsiness using artificial neural network models

Charlotte Jacobé de Naurois (a,b,*), Christophe Bourdin (a), Anca Stratulat (b), Emmanuelle Diaz (b), Jean-Louis Vercher (a)

(a) Aix Marseille Univ, CNRS, ISM, Marseille, France
(b) Groupe PSA, Centre Technique de Vélizy, Vélizy-Villacoublay, Cedex, France

ARTICLE INFO

Keywords: Drowsiness; Prediction; Artificial neural network; Physiological measurement; Behavioral measurement; Driving performance and activity

ABSTRACT

Not just detecting but also predicting impairment of a car driver’s operational state is a challenge. This study aims to determine whether the standard sources of information used to detect drowsiness can also be used to predict when a given drowsiness level will be reached. Moreover, we explore whether adding data such as driving time and participant information improves the accuracy of detection and prediction of drowsiness. Twenty-one participants drove a car simulator for 110 min under conditions optimized to induce drowsiness. We measured physiological and behavioral indicators such as heart rate and variability, respiration rate, head and eyelid movements (blink duration, frequency and PERCLOS) and recorded driving behavior such as time-to-lane-crossing, speed, steering wheel angle and position in the lane. Different combinations of this information were tested against the real state of the driver, namely the ground truth, as defined from video recordings via the Trained Observer Rating. Two models using artificial neural networks were developed, one to detect the degree of drowsiness every minute, and the other to predict every minute the time required to reach a particular drowsiness level (moderately drowsy). The best performance in both detection and prediction is obtained with behavioral indicators and additional information. The model can detect the drowsiness level with a mean square error of 0.22 and can predict when a given drowsiness level will be reached with a mean square error of 4.18 min. This study shows that, in a controlled and very monotonous environment conducive to drowsiness in a driving simulator, the dynamics of driver impairment can be predicted.

* Corresponding author at: Aix Marseille Univ, CNRS, ISM, Marseille, France.
E-mail address: [email protected] (C. Jacobé de Naurois).

https://doi.org/10.1016/j.aap.2017.11.038
Received 27 July 2017; Received in revised form 12 October 2017; Accepted 27 November 2017
Available online 06 December 2017
0001-4575/ © 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/).

1. Introduction

Driving a car is a complex, multifaceted and potentially risky activity requiring full mobilization of physiological and cognitive resources to maintain performance over time. Any loss of these resources can have dramatic consequences, including accidents. Moreover, the promise of autonomous vehicles makes it even more important to determine the driver’s operational state. This has recently generated a large number of studies, both from the fundamental perspective and with a view to potential applications. The challenge is ambitious: not only detecting, but also predicting, degradation in the driver’s operational state.

A driver’s operational state while driving a car involves a complex set of psychological, physiological and physical parameters. During driving activities, several factors can be critical: in particular, fatigue and monotony may cause a loss of attention, drowsiness and even sleepiness (Dong et al., 2011). The present study focuses on a specific type of impaired operational state: drowsiness. Drowsiness is an intermediate state between alertness and sleep. In this article, we will consider drowsiness as a continuum, or scalar state. Unfortunately, drowsiness cannot be recorded directly but has to be estimated, and several estimation techniques have been proposed in the literature. These methods can be classified in different categories according to source of information: subjective assessment, sensorimotor indicators, physiological features and driving behavior and performance (Dong et al., 2011).

In the last few years, the Karolinska Sleepiness Scale (KSS), a 9-graded Likert scale (Shahid et al., 2011), has become the most commonly employed instrument for the subjective self-assessment of drowsiness (Alhazmi, 2013; Daza et al., 2014; Friedrichs and Yang, 2010; Krajewski et al., 2009a,b; Lee et al., 2016; Li et al., 2014; Murata and Naitoh, 2015). Nonetheless, although often used, this method raises three principal issues. Firstly, the driver’s state can only be assessed every 15 min, since greater frequency would probably keep the driver

awake. Secondly, according to Friedrichs and Yang (2010), when the experiment involves more than three hours of monotonous driving, the KSS becomes inadequate because drivers have difficulty judging their alertness. Lastly, subjective assessment clearly does not constitute an objective measure of drowsiness, and when the task is very monotonous, individual ratings of drowsiness differ from the person’s physiological alertness level (Brown, 1997).

Features extracted from eye and head movements, classified as sensorimotor indicators, are also promising parameters to detect the operational state and are now included in many research approaches (Chen and Ji, 2012; Liu et al., 2009). Video-oculography (VOG) is commonly used to study the following features: blink frequency, blink duration and PERCLOS (PERcentage of eye CLOSure). Changes in these features are considered to be under low-level control, offering an easy way to monitor the activity of the neurovegetative system (Caffier et al., 2003; Wierwille and Ellsworth, 1994). These features are generally extracted with image processing algorithms based on eye, head and gaze movement tracking. Thus, the quality of the estimation is highly dependent on this first signal-processing step.

Physiological features are also frequently used to assess drowsiness because they are continuously available and could be considered as an objective, more direct, measure of the functional state. The main recordings of signals related to drowsiness are the electroencephalogram (EEG), the electrocardiogram (EKG) and electro-dermal activity (EDA) (Borghini et al., 2014; Dong et al., 2011). The gold standard appears to be the EEG, the most direct indicator of central nervous system activity (De Gennaro et al., 2001). However, the EEG is quite intrusive, and proper installation of an extensive set of electrodes on the participant’s scalp requires expertise and time. It has been established that when a change in vigilance is observed, changes in psychophysiological arousal can also be observed, and these changes can be monitored by measures of central and autonomic nervous system activity (Haarmann and Boucsein, 2008). Concerning EKG, heart rate variability (HRV) is often used as an indicator of drowsiness because it is linked to the autonomic nervous system, so that changes in HRV can provide information about autonomic activity (Elsenbruch et al., 1999; Lal and Craig, 2001; Riemersma et al., 1977; Stein and Pu, 2012). Moreover, some studies on drowsiness, vigilance or workload also record and analyze respiration rate and amplitude (Besson et al., 2013; Ju et al., 2015; Reimer et al., 2009; Rodriguez Ibañez et al., 2011).

Yet a direct relationship between physiological features and cognitive state is hard to define, because these physiological features vary with other states (including, but not limited to, emotion, workload, physical fatigue) or with the context. These variations according to state also differ from one person to another. Thus, each physiological indicator has its own limits. Heart rate usually decreases during driving and when the driver is tired (Lal and Craig, 2001), but the opposite may also occur (Apparies et al., 1998). Peiris et al. (2005) showed that two independent experts analyzing EEGs to detect drowsiness may not make the same assessment for the same participant at the same time. On the other hand, EDA can be influenced by stress (Healey and Picard, 2005) and emotions (Rebolledo-Mendez et al., 2014). Taken alone, therefore, these indicators cannot be considered as adequate and exclusive indicators of drowsiness or fatigue.

Driving behavior and performance analyses have the main advantage of being non-intrusive. Some signals such as pressure on pedals or car movements are easily available. The standard deviation of car position relative to lane midline (also named standard deviation of lane position (SDLP)), and steering wheel movements, are the most common features used to detect drowsiness (Arnedt et al., 2001; De Valck et al., 2003; Liu et al., 2009; Philip et al., 2004). However, here again, driving performance and activity are not specific indicators of drowsiness. For example, driving performance can decrease with other factors such as distraction (Tango et al., 2009), or with a decline in attention (Marin-Lamellet et al., 2003).

Since none of these feature families is consensually considered as a specific indicator of drowsiness, various measures are often used jointly. Such a hybrid approach minimizes the number of false alarms while maintaining a high rate of recognition (essential for good acceptance of the system by the human operator, Dong et al., 2011), mainly because no signal emerges as the reference marker allowing real time measurement that is both relatively non-invasive and reliable. Moreover, there is no direct link between all these features and the “operational state”, which is why methods such as machine learning or statistical models are used, combining the different measures.

The different algorithms used include k-nearest neighbors (Chauhan et al., 2015), decision trees (Lee et al., 2010; Sukanesh and Vijayprasath, 2013), Bayesian classifiers (Lee and Chung, 2012; Yang et al., 2010), Support Vector Machines (Bhowmick and Chidanand Kumar, 2009; Krajewski et al., 2009a,b; Liang et al., 2007; Yeo et al., 2009), artificial neural networks (ANN) (Bundele and Banerjee, 2009; Eskandarian et al., 2007; Sayed and Eskandarian, 2001; Samiee et al., 2014), ensemble methods like random forest (Krajewski et al., 2009a,b; McDonald et al., 2013; Torkkola et al., 2008; Zhang et al., 2004) and, more recently, deep learning (Hajinoroozi et al., 2015). Most studies consider the problem of estimating the driver’s impaired operational state as a classification problem. Is the driver in an impaired state or not? Is the driver drowsy or not? However, the evolution of the state of the driver can also be considered as a regression problem, i.e. the driver goes through various continuous states, although regression models are rarely used in the literature (Murata and Naitoh, 2015). Nonlinear machine-learning models (such as ANNs) are also often used. With these techniques, the model can extract information from noisy data, and can avoid over-fitting, making it generally more robust (Dong et al., 2011). Since in the context of driving we expected over-fitting and noisy data, the present study uses machine-learning techniques based on artificial neural networks.

Most research focuses on the detection/estimation of an impaired state, rather than on its prediction, even though they adopt the term “prediction” (Chen, 2013; Hargutt and Kruger, 2001; Ji et al., 2004; Verwey and Zaidel, 2000). This is because in machine learning, the term “prediction” is used to infer the label of an object not seen during the learning phase. However, some studies try to predict what the ground truth will be in the subsequent few minutes: the ground truth was shifted for one epoch (Kaida et al., 2007), while different lags (+1, +2, +3, +4, +5, +7, +10 min) were tested by Larue (2010). Murata et al. (2016) obtained the highest prediction accuracy using the data between 20 and 120 s before the prediction. Watson and Zhou (2016) detect micro-sleep with 96% accuracy and are able to predict, between 15 s and 5 min in advance, the time when the next micro-sleep will occur. However, the time when the first micro-sleep occurs obviously cannot be predicted by such methods.

As explained above, using a single source of information does not seem to be an efficient way to accurately assess the state of the driver. Different sources of information and different models are used in the literature, and results are hard to generalize away from well-controlled laboratory conditions. In the present study, we collected information originating from different sources: physiological, behavioral, and psychological data from the driver, as well as performance information from the vehicle. The goal of this study is to develop and evaluate a model with an artificial neural network (ANN), so as to predict when a given impaired state will be reached in addition to detecting this impaired state. We deliberately chose unobtrusive recording techniques easily applicable in a car. Different datasets using different sources of information were tested, to determine which kind of information yields the most powerful model. We put forward two hypotheses. First, we hypothesized that it is possible to predict when the impaired state will arise by using the sensorimotor, physiological and performance indicators used to detect drowsiness. Second, we hypothesized that adding information such as driving time and participant information will improve the accuracy of the model.


2. Materials and methods

2.1. Participants

A total of 21 participants were included in the study (mean age ± SD: 24.09 ± 3.41 years; 11 men and 10 women). On the day of the experiment, the participants were not allowed to drink alcohol, coffee or tea. Inclusion criteria were: valid driver’s licence for at least 6 months, no visual correction needed to drive, not susceptible to simulator sickness (as assessed by the Motion Sickness Susceptibility Questionnaire, Short-form (MSSQ-Short, Golding, 1998)) and an Epworth scale score (assessing susceptibility to drowsiness) below 14 (Johns, 1991). A score below 8 on this scale means the person has no sleep debt. A score from 9 to 14 means the person shows signs of sleepiness, and if the score is above 15, the person shows signs of excessive sleepiness. Before the experiment, participants were questioned on their age, their quality of sleep (on a scale of 1–10), their caffeine consumption (never, rarely, one or two cups per day, more than two cups per day), driving frequency (occasionally, several times a month, a week or a day) and number of kilometers per year. To assess their circadian typology, their score on the Horne and Ostberg morning/evening questionnaire (Horne and Ostberg, 1975) was also noted. All these indicators concerning the participants were later considered as participant information, and used with a view to improving the performance of the model.

2.2. Protocol

The participants drove for between 100 and 110 min in a static driving simulator in an air-conditioned room with temperature control set at 24° Celsius, after lunchtime. According to the literature about circadian rhythms, the probability of falling asleep between 02:00 and 06:00 and between 14:00 and 16:00 is 3 times higher than at 10:00 or at 19:00, respectively (Horne and Reyner, 1999). We chose a period corresponding to an intermediate level between a low risk of drowsiness (in the morning) and the highest risk (end of the night). The road and traffic were generated with SCANeR Studio®. While driving, data on driving performance, eyelid and head movements, and physiological data were recorded using the following hardware and software: SCANeR Studio® for driving performance at 10 Hz, faceLAB® for sensorimotor signals at 60 Hz, and EKG, pulse plethysmography (PPG), EDA and respiration with the Biopac® MP150 system and Acqknowledge® software at 1000 Hz. In this study, EDA was also recorded but not used due to extensive signal loss. A webcam was placed on top of the central screen of the simulator to video-record the participants during the session.

At the beginning of the session, the participants drove along a highway for roughly 90 min, then turned off the highway and drove for around 5 min to reach a city. Finally, they drove in an urban environment for roughly 5 min. During most of the highway stretch, there was no traffic. Some 2/3 of the way along, 22 cars appeared from the right of the highway, disappearing a few kilometers later. This sudden addition of traffic was intended to change the driver’s level of drowsiness. Rossi et al. (2011) demonstrated that a driver is more susceptible to sleepiness in a simulator with a monotonous scenario, and during the afternoon.

2.3. Data analysis and modeling

The level of drowsiness used as a reference in this study, the so-called ground truth (indeed, the real state of the driver is not directly accessible and must be evaluated), is based on subjective assessment by video analysis, independently coded by two raters. Their evaluation was based on a method proposed by Wierwille and Ellsworth (1994), which used a scale between 0 and 100. For practical reasons in relation with the ANN, we decided to use a smaller scale (from 0 to 4 with a step of 0.5) as proposed by Belz et al. (2001). This ground-truth determination method was chosen because the assessment by video coding is reliable and allows a comprehensive assessment of the driver state. Other methods, such as questionnaires (e.g. KSS), reaction time to a double task or even EEG are quite invasive and may disturb the driver and thus influence his/her state. However, video analysis is long and requires several observers with a certain level of training. In order to be more reliable, this method can use criteria and a rating scale as a common basis for different observers. The Observer Rating of Drowsiness (ORD) relies on a continuous scale from “alert” to “extremely drowsy”, with a list of observable criteria characteristic of a drowsy driver (Wierwille and Ellsworth, 1994). The two trained raters evaluated each minute of video and rated each segment on a scale ranging from 0 (alert) to 4 (extremely drowsy). The mean of the two raters was taken as the drowsiness level. Inter-rater reliability was computed with Pearson's linear correlation (R = 0.71 and p < 0.01).

In order to synchronize data obtained at various sampling frequencies, we averaged data over periods of 1 min. Thus, the final sampling rate is 1/min for each feature, including ground truth.

The modeling process can be divided into two phases. First, one Artificial Neural Network (ANN) detects the level of drowsiness from a predetermined set of features (detection model). This ANN is used to detect the impaired state (level of drowsiness). Second, if drowsiness is under 1.5, a second ANN predicts (in min) when it will reach 1.5 and gives this time as its output; otherwise (for instance when the level has already been reached), its output is 0 (prediction model). The threshold was set at 1.5 for the following reason. McDonald et al. (2013) defined the limit between “not drowsy” and “drowsy” at a level between 1 and 2 (0 or 1, not drowsy; 2, 3, 4: moderately, very or extremely drowsy). We chose the level of 1.5 as a threshold for defining the impaired state because this level means that at a given time, one of the two raters has evaluated the state of the participant as moderately drowsy (level 2) while the other evaluated the state as 1. These two ANNs were trained independently.

The neural network toolbox (Beale et al., 1992) of Matlab R2013a was used to create the ANNs. Two feedforward neural networks were used with 2 hidden layers, and a back propagation training method was applied using the Levenberg-Marquardt algorithm (Levenberg, 1944). The error was validated by ten-fold cross-validation and a grid search. The performance function used for learning was the mean squared error (the average squared error between the network outputs and the target output). To avoid overfitting, the total dataset was divided into a training sub-dataset (70% of the total set, to learn the network’s node weights), a validation sub-set (15%, to stop learning and avoid over-training) and a testing sub-set (15%, to evaluate the model’s ability to work on previously unseen data, a property also called ‘generalization’).

In addition, three other metrics were used to evaluate the model: first, the percentage of absolute errors below a threshold (0.5 for detection of degree of impairment and 5 min for predictions), computed for the testing dataset (the higher this metric, the better the model performs); second, the range of errors containing 95% of the values; and third, the coefficient R of the correlation between outputs and targets.

Driving performance and driving behavior indicators (car dataset) used in the model were: lateral distance relative to the midline, time-to-line-crossing (Bergasa et al., 2006), steering wheel angle, accelerator pedal angle, shift relative to the lateral line, speed, and number of line crossings. Physiological features used in the model (physiological dataset) were the heart rate and its variability, and the respiration rate and its variability. Sensorimotor features (behavioral dataset) extracted from faceLAB data were blink duration and frequency, PERCLOS, head movement in translation and rotation, and saccade frequency. Participant information recorded consisted of score on circadian typology, score on Epworth scale, sleep quality, driving frequency, number of cups of coffee a day and age. Driving time (the time elapsed


since the beginning of the driving session, in minutes) was also used as an input feature for the model (see Table 1). To account for individual differences, we subtracted from each signal its mean over the first five minutes, so that the signal represents variation from an initial state. To optimize learning, each feature was normalized such that minimum and maximum values lie within [−1, 1].

Table 1
All the variables (grouped by source of information, in columns) computed for each participant for each minute, used as input for ANNs.

Physiological measurements | Behavioral measurements | Car measurements
HR: Heart Rate (average and standard deviation) (beat/min) | Blink duration (average and standard deviation) | Lateral distance between the closest lane and the center of the car, in m (average and standard deviation)
Svlf: HR signal Very Low Frequency Power (0.0-0.04 Hz) | Blink frequency (average and standard deviation) (per minute) | Time to lane crossing (average and standard deviation)
Slf: HR signal Low Frequency Power (0.04-0.15 Hz) | PERCLOS (average and standard deviation) (% of eye-closure time) | Steering angle (average and standard deviation)
Shf: HR signal High Frequency Power (0.15-0.4 Hz) | Head position x (average and standard deviation) | Steering angle velocity (average and standard deviation)
Svhf: HR signal Very High Frequency Power (0.4-3.0 Hz) | Head position y (average and standard deviation) | Steering entropy (computed from steering angle)
Sympathetic ratio (Slf/(Svlf + Slf + Shf)) | Head position z (average and standard deviation) | Number of direction changes (0-crossings) per minute (computed from steering angle)
Vagal ratio (Shf/(Svlf + Slf + Shf)) | Head rotation x (average and standard deviation) | Accelerator pedal angle (average and standard deviation)
Sympathetic-vagal ratio (Slf/Shf) | Head rotation y (average and standard deviation) | Lateral shift of the vehicle center relative to the lane center (average and standard deviation)
Respiration Rate (average and standard deviation) (per minute) | Head rotation z (average and standard deviation) | Vehicle speed (km/h) (average and standard deviation)
 | Saccade frequency (average and standard deviation) (per minute) | Number of out-of-road events per minute

3. Results

The ANNs were trained 16 times (4 × 2 × 2) with different datasets. Each dataset results from the combination of the following: the three sources of information tested alone or all together (thus 4 combinations), with or without elapsed time (2 cases) and with or without information about the participants (2 cases). Tables 2 and 3 present the performance obtained with each of the 16 datasets. In this section, the results are presented with the driving time (labeled '1' in Tables 2 and 3) and without (labeled '0'), and with the information about the participant (labeled '1') and without (labeled '0'). The grouping was decided according to how these variables were recorded in our experiment (and possibly in a real car), that is to say with which equipment. Indeed, the vehicle information can be recorded from the vehicle’s Controller Area Network (in our experiment with SCANeR® software), the behavioral measurements with a camera and a specific image processing system (in our experiment with faceLAB®) and physiological measurements with physiological sensors and an A/D system (in our experiment with Biopac®; in a real car it could be with a smart-watch).

3.1. Detection

In this section, we present model performance in detecting drowsiness level, as defined by the ORD scale (from 0 to 4, see Methods section). The error is the difference between the real state (as given by the subjective evaluation, the so-called ground truth) and the output, squared and averaged over epochs to provide the mean squared error of the trained model.

From an absolute point of view, the dataset configuration providing the best performance (lowest mean square error) in training the model contains driving time, participant information and behavioral features (# in Table 2). With this dataset, the mean square error is 0.22 ± 0.02 and more than 80% of the absolute errors on the testing data are under 0.5 (less than one-half of a state level, as defined by the ORD scale). Ninety-five percent of the absolute errors are under 0.87. In other words, the model is off by less than one drowsiness level on our scale in 95% of cases. Performance is similar with the car dataset combined with driving time and participant information: the mean square error is 0.23 ± 0.06, 86.34% of the absolute errors on the testing data are under 0.5, and ninety-five percent of the absolute errors are under 0.73, i.e. in 95% of cases the model is off by less than one drowsiness level on our scale.
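
The error measures quoted in this section can be illustrated with a short Python sketch. It is not the authors' Matlab code: the function name, the sample values and the choice of the 95th percentile of the absolute error as a reading of "the range of errors containing 95% of the values" are assumptions made for illustration.

```python
import numpy as np

def evaluation_metrics(targets, outputs, threshold=0.5):
    """Summarize errors between ground-truth targets and model outputs."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    errors = outputs - targets
    abs_errors = np.abs(errors)
    return {
        # Mean squared error over all epochs
        "mse": float(np.mean(errors ** 2)),
        # Assumed reading of the "95%" metric: 95th percentile of |error|
        "abs_error_95": float(np.percentile(abs_errors, 95)),
        # Fraction of epochs whose |error| is below the threshold
        "frac_below_threshold": float(np.mean(abs_errors < threshold)),
        # Pearson correlation between targets and outputs
        "r": float(np.corrcoef(targets, outputs)[0, 1]),
    }

# Hypothetical drowsiness levels (0 to 4 scale), one value per 1-min epoch
targets = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
outputs = [0.2, 0.4, 1.3, 1.1, 2.6, 2.4]
metrics = evaluation_metrics(targets, outputs)
```

For the prediction model, the same function could be called with `threshold=5`, since the threshold stated in the Methods is 5 min for predictions.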

Table 2
Model performance in detecting drowsiness level for the testing dataset: mean square error (MSE) and standard deviation (STD), according to the dataset used, with (1) or without (0) driving time, and with (1) or without (0) participant information. The worst performance (highest MSE) is marked with a * and the best performance (lowest MSE) with a #. The last column gives the proportion of absolute errors below 0.5.

Driving time  Participant information  Dataset  Source         MSE    STD   |Error| 95%  Proportion |Error| < 0.5
0             0                        Testing  All            0.43   0.04  1.16         0.63
0             0                        Testing  Behavioral     0.42   0.02  1.16         0.64
0             0                        Testing  Car            0.69   0.04  1.48         0.50
0             0                        Testing  Physiological  0.81*  0.05  1.51         0.43
0             1                        Testing  All            0.41   0.04  1.10         0.62
0             1                        Testing  Behavioral     0.39   0.04  1.14         0.69
0             1                        Testing  Car            0.62   0.03  1.34         0.54
0             1                        Testing  Physiological  0.76   0.03  1.52         0.44
1             0                        Testing  All            0.27   0.02  0.91         0.80
1             0                        Testing  Behavioral     0.23   0.02  0.80         0.83
1             0                        Testing  Car            0.40   0.05  1.20         0.66
1             0                        Testing  Physiological  0.38   0.05  1.06         0.70
1             1                        Testing  All            0.24   0.02  0.84         0.81
1             1                        Testing  Behavioral     0.22#  0.02  0.87         0.80
1             1                        Testing  Car            0.23   0.06  0.73         0.86
1             1                        Testing  Physiological  0.29   0.07  0.75         0.82
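
The 16 rows of the table above correspond to the 4 × 2 × 2 combinations of information source, driving time and participant information. As a minimal sketch of how such configurations could be enumerated and turned into input column lists, the Python below uses placeholder column names (`FEATURE_GROUPS`, `driving_time_min`, `age`, `epworth_score`) that are assumptions for illustration, not the study's actual variable names.

```python
from itertools import product

SOURCES = ("All", "Behavioral", "Car", "Physiological")

# Hypothetical column names, for illustration only
FEATURE_GROUPS = {
    "Behavioral": ["blink_duration_mean", "perclos_mean"],
    "Car": ["steering_angle_mean", "speed_mean"],
    "Physiological": ["hr_mean", "respiration_rate_mean"],
}

def dataset_configurations():
    """Return the 4 x 2 x 2 = 16 input configurations, in table order."""
    return [
        {"driving_time": bool(t), "participant_info": bool(p), "source": s}
        for t, p, s in product((0, 1), (0, 1), SOURCES)
    ]

def input_columns(cfg):
    """List the input columns selected by one configuration."""
    sources = list(FEATURE_GROUPS) if cfg["source"] == "All" else [cfg["source"]]
    cols = [c for s in sources for c in FEATURE_GROUPS[s]]
    if cfg["driving_time"]:
        cols.append("driving_time_min")       # hypothetical name
    if cfg["participant_info"]:
        cols += ["age", "epworth_score"]      # hypothetical subset
    return cols

configs = dataset_configurations()
```

Each configuration would then be used to train and evaluate one network, which gives one row of the table.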


Table 3
Performance of the model in predicting the time at which the target drowsiness level is reached, with the testing dataset: mean square error (MSE) and standard deviation (STD), according to whether the dataset is used with (1) or without (0) driving time, with (1) or without (0) participant information, and according to the source of recorded information. The * symbol indicates the worst performance and the # symbol the best performance. The last column gives the proportion of absolute errors below 5 min.

Driving time  Participant information  Dataset  Source         MSE     STD    |Error| 95%  Proportion |Error| < 5 min
0             0                        Testing  All            33.64   7.63   9.29         0.79
0             0                        Testing  Behavioral     23.61   3.15   8.12         0.86
0             0                        Testing  Car            60.09*  6.19   13.12        0.73
0             0                        Testing  Physiological  43.77   6.24   11.47        0.74
0             1                        Testing  All            28.26   2.82   8.79         0.82
0             1                        Testing  Behavioral     22.83   4.03   7.98         0.89
0             1                        Testing  Car            50.22   8.84   12.11        0.73
0             1                        Testing  Physiological  41.82   4.11   11.83        0.74
1             0                        Testing  All            10.64   3.39   4.26         0.97
1             0                        Testing  Behavioral     5.46    1.50   2.92         0.99
1             0                        Testing  Car            31.14   10.73  6.25         0.93
1             0                        Testing  Physiological  15.97   1.70   7.01         0.89
1             1                        Testing  All            7.69    2.17   3.12         0.98
1             1                        Testing  Behavioral     4.18#   1.17   1.98         0.99
1             1                        Testing  Car            4.67    1.33   2.43         0.99
1             1                        Testing  Physiological  5.51    1.84   2.62         0.98
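
The prediction model evaluated in the table above outputs, for each 1-min epoch, the time remaining before the drowsiness level reaches 1.5, and 0 once that level is reached. A possible way to derive such a target from the per-minute ground truth is sketched below in Python. This is an illustrative reconstruction, not the authors' procedure: the function name is hypothetical, and returning NaN for epochs after which the threshold is never reached again is an assumption, since the text does not describe that case.

```python
import math

def time_to_threshold(levels, threshold=1.5):
    """For each 1-min epoch, minutes until the level next reaches the threshold.

    Epochs at or above the threshold get 0. Epochs after which the
    threshold is never reached again get NaN (assumption, see lead-in).
    """
    targets = [math.nan] * len(levels)
    next_hit = None  # nearest epoch at or after i with level >= threshold
    # Scan backwards so the next crossing is known for every earlier epoch
    for i in range(len(levels) - 1, -1, -1):
        if levels[i] >= threshold:
            next_hit = i
        if next_hit is not None:
            targets[i] = float(next_hit - i)
    return targets

# Hypothetical ground-truth levels (mean of two raters), one per minute
levels = [0.0, 0.5, 1.0, 1.5, 2.0, 1.0, 0.5]
targets = time_to_threshold(levels)
```

With these sample levels, the first three epochs are 3, 2 and 1 min away from the threshold, the next two are at or above it, and the last two have no later crossing.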

When neither driving time nor participant information is used (line 0-0 in Table 2), or when only one of these is used (0-1 or 1-0), the model performs better with all datasets used together or with the behavioral dataset used alone; performance is slightly worse with the physiological or car datasets used alone. As stated above, the model performs best, for each dataset or for all three datasets used together, when both driving time and participant information are included (1-1).

Figs. 1 and 2 present, respectively with (Fig. 1) and without (Fig. 2) driving time and participant information, the frequency histogram of the error distribution (left panel, A) and the correlation (right panel, B) between real state (target, horizontal axis) and estimated state, i.e. the output of the ANN (vertical axis). The model is trained with behavioral data in Fig. 1 and with all datasets in Fig. 2; thus, Fig. 1 illustrates the best, and Fig. 2 the worst, performance for the training, validation and testing datasets. Linear regressions were applied to the outputs of the model to correlate them with the ground truth. With a perfect model, all data points would lie on the diagonal of the correlation graph. Fig. 1 shows that, for each of the three datasets, simulated values are well correlated with expected values (ground truth). The R-values are very close to unity (0.93, 0.91, 0.91 respectively for the training, validation and testing datasets). Moreover, the slopes of the regression lines are very close to unity (0.87, 0.88, 0.88 respectively for the training, validation and testing datasets) and the intercepts are close to zero (0.17 for all three datasets). Errors are calculated, at each 1 min epoch, as the difference between the output of the model and the ground truth. The graph on the left of Fig. 1 shows a peak at 0.05, meaning that most of the errors are close to 0. Also, more than 95% of the instances had an error of between −1.16 and 1.16. In Fig. 2, the correlations between output and target are still good but there is greater variability (R = 0.87, 0.74, 0.78 respectively for the training, validation and testing datasets). The model used for the results presented in Fig. 2 (behavior, physiology and car) is less accurate than the model whose results are presented in Fig. 1 (behavior, elapsed time and participant information). As for errors, the graph on the left shows a single but broader peak at 0.2 and −0.02, also meaning that most of the errors are close to 0.

3.2. Prediction

This section presents the performance of the second model, aimed at predicting when a driver will reach a given drowsiness level (here 1.5). The error, for each epoch, is the difference between the time remaining

Fig. 1. Frequency histogram of error distribution (left panel) and correlation (right panel) between real and estimated state, for a model trained with the behavioral dataset, driving time and participant information.
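As a rough illustration of the quantities reported for Figs. 1 and 2, the sketch below fits a least-squares line between target and model output, computes the R-value, and measures the share of epochs whose error falls inside a given bound. The function names and the sample values are hypothetical and are not taken from the study; the authors' own processing pipeline is not described at this level of detail.

```python
import math

def regression_summary(target, output):
    """Least-squares line output ~ slope * target + intercept, plus Pearson R."""
    n = len(target)
    mean_t = sum(target) / n
    mean_o = sum(output) / n
    s_tt = sum((t - mean_t) ** 2 for t in target)
    s_oo = sum((o - mean_o) ** 2 for o in output)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(target, output))
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r = s_to / math.sqrt(s_tt * s_oo)
    return slope, intercept, r

def share_within(target, output, bound):
    """Fraction of epochs whose error (output - target) lies in [-bound, bound]."""
    errors = [o - t for t, o in zip(target, output)]
    return sum(abs(e) <= bound for e in errors) / len(errors)

# Made-up per-epoch drowsiness levels, for illustration only (not study data).
target = [0.0, 1.0, 2.0, 3.0, 4.0]
output = [0.2, 1.1, 1.9, 3.2, 3.8]
slope, intercept, r = regression_summary(target, output)
print(slope, intercept, r, share_within(target, output, 0.15))
```

A perfect model would give a slope of 1, an intercept of 0 and R = 1, which corresponds to all points lying on the diagonal of the correlation graph.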

C. Jacobé de Naurois et al. Accident Analysis and Prevention 126 (2019) 95–104

Fig. 2. Frequency histogram of error distribution (left panel) and correlation (right panel) between real and estimated state, for a model trained with the behavioral, car and physiological datasets.

from the current epoch before the target level is actually reached (as per the subjective evaluation) and the time predicted by the trained model (squared and averaged over epochs to provide the mean squared error).

The best performance is achieved with a combination of driving time, participant information and the behavioral dataset. The mean square error is 4.18 ± 1.17 min. For 95% of the testing data, the absolute value of error is under 2 min and, for more than 99%, it is under 5 min. Similar, but not higher, accuracy is achieved with the car and physiological datasets (4.67 ± 1.33 and 5.51 ± 1.84). Ninety-five percent of the absolute values of error are under 2.43 and 2.62, respectively. For more than 97% of the testing data, the absolute value of error is under 5 min.

The worst model performance in predicting drowsiness is with the car dataset alone (60.09 ± 6.19 min). Performance improves with the addition of participant information (50.21 ± 8.84 min), or of driving time (31.14 ± 10.72 min). The model becomes very accurate when both driving time and participant information are included with the car dataset (4.67 ± 1.33 min).

For each source of information (all, behavioral, car and physiological datasets), the model is more accurate when both driving time and participant information are included in the dataset than with either driving time or participant information alone, or with no additional information.

Figs. 3 and 4 present the frequency histogram of the error distribution (left panel) and the correlation between real time (target, horizontal axis) and estimated time (vertical axis) of appearance of drowsiness, respectively with (Fig. 3) and without (Fig. 4) driving time and participant information. The model is trained with behavioral data in Fig. 3 and with all datasets in Fig. 4, so that Fig. 3 illustrates the best, and Fig. 4 the worst, performance. In Fig. 3, the graph on the right shows that the relation between target and output is very precise: the data are close to the diagonal (very high R, above 0.98 for the training, validation and testing datasets; the slopes are above 0.99). In the left part of Fig. 3, the main peak is at 0.3, meaning that the model typically has an error below 0.3.

4. Discussion

Detecting impairment of a driver’s operational state is a major safety issue, addressed in numerous studies. While recent car models go some way towards providing this detection capacity, it is clear that recent technological developments are not sufficient to meet the challenge of safety in modern vehicles. Predicting the degree of driver impairment, and when it will occur, remain important research objectives requiring more complex treatment of heterogeneous information from diverse sources. The objective of this study was to assess whether the time of occurrence of a given state of drowsiness could be predicted by using ANN models (one to detect drowsiness and a second one to predict drowsiness).

Overall, our results demonstrate that, using an ANN trained with the same information used to detect drowsiness, it is possible to predict when a driver’s impairment will appear to an accuracy of approximately 5 min. Moreover, to further improve accuracy, external information such as driving time or a driver profile can be added to the model. In his study, Larue (2010) accurately predicted a driver’s decreased vigilance up to five minutes in advance, and up to 10 min in advance with 70% to 80% accuracy. Under quite different conditions, and with different types of information, our model seems to be more accurate. In our worst case, for 95% of the test dataset, the model can predict when the impairment will appear to within 13.11 min. In our best case, for 95% of the test dataset, the model can predict the impairment to within 1.97 min.

As explained in the results section, model performance, both in detecting a drowsiness level and in predicting when this level will be reached, varies considerably according to the datasets used to train the model. This raises the question of the relevance of using physiological signals, behavioral features and driving activity, and of the respective roles of these different datasets in model performance. An important point highlighted by our results is how temporal (driving time) and idiosyncratic (participant information) data impact model performance. The limitations of our model with regard to generalization (i.e. the ability of the model to accurately treat previously unseen data) and, from a more general point of view, inter-individual variability will also be discussed.

4.1. Dataset comparison: behavioral/physiological/car

Our objective was to use the same information both to detect drowsiness and to predict the time when a given drowsiness level would be reached. Interestingly, when trained with all datasets, either singly or in combination, the model gave satisfactory results. The dataset giving the best performance is the behavioral dataset (followed by the


Fig. 3. Frequency histogram of error distribution (left panel) and correlation (right panel) between real and estimated times, for a model trained with the behavioral dataset, driving time and participant information.
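The prediction error described in Section 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the target is taken to be the number of 1 min epochs left until the drowsiness level first reaches the threshold (1.5 in the text), and it is set to 0 once the level has been reached, as the discussion suggests. The function names, the nearest-rank quantile and the sample values are hypothetical.

```python
import math

def time_remaining_targets(levels, threshold=1.5):
    """Minutes left, at each 1 min epoch, before the level first reaches the threshold.

    Epochs at or after that moment get 0 (an assumption of this sketch).
    Returns None if the threshold is never reached.
    """
    first = next((i for i, lv in enumerate(levels) if lv >= threshold), None)
    if first is None:
        return None
    return [max(first - i, 0) for i in range(len(levels))]

def mean_squared_error(target, predicted):
    """Squared error between predicted and real remaining time, averaged over epochs."""
    return sum((p - t) ** 2 for t, p in zip(target, predicted)) / len(target)

def abs_error_quantile(target, predicted, q=0.95):
    """Smallest absolute-error bound covering at least a fraction q of epochs (nearest rank)."""
    abs_err = sorted(abs(p - t) for t, p in zip(target, predicted))
    rank = max(math.ceil(q * len(abs_err)), 1)
    return abs_err[rank - 1]

# Made-up drowsiness levels and model predictions (not study data).
levels = [0.0, 0.5, 1.0, 1.5, 2.0, 1.0]
target = time_remaining_targets(levels)          # [3, 2, 1, 0, 0, 0]
predicted = [2.5, 2.0, 1.5, 0.0, 0.5, 0.0]
print(mean_squared_error(target, predicted), abs_error_quantile(target, predicted))
```

Statements such as "for 95% of the testing data, the absolute value of error is under 2 min" correspond to the second metric; how the ± spread of the mean square error was obtained is not specified in this section, so it is not reproduced here.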

physiological dataset and finally the car dataset), both in detecting the degree of drowsiness and in predicting when a given drowsiness level will occur. Similar results were previously reported. Samiee et al. (2014) showed that information about blinks leads to highly accurate detection (90.74% detection of a drowsy state), while lateral deviation of the car and steering wheel angle provide 85.37% and 87.22% accuracy, respectively. However, when all three sources of information (blinking, lateral position and steering angle) were used together, accuracy increased to 94.69%, although this was not borne out by our study. As in our study, Daza et al. (2014) obtained better results with features extracted from eyelid movement (such as PERCLOS) than with features extracted from driving behavior. In the literature, HRV data showed a correlation with drowsiness (Elsenbruch et al., 1999; Lal and Craig, 2001; Stein and Pu, 2012). Yet our model gave better results with ocular and head parameters than with physiological variables: the ORD scale showed a stronger correlation with the ocular parameters than with physiological variables such as EKG and Respiration (Rost et al., 2015). Wang and Xu (2016) consider eye features as the prime input for detection of drowsiness. However, since they are usually computed by image processing, these features cannot be considered fully reliable. Although techniques have progressed considerably in recent years, detecting face and gaze movements remains tricky in complex situations (for example, subjects with glasses, variable or low light conditions, Benoit and Caplier, 2005; Friedrichs and Yang, 2010).

Our behavioral, physiological and, to a lesser extent, car datasets led to the best model performance. With all sources of information in the same neural network, performance could be expected to improve because the neural network can better learn dependencies between different kinds of information. Unfortunately, our results do not bear this out. A single ANN-based model may not be the best way to take advantage of the dependencies between the different sources of information. An alternative, inspired by Samiee et al. (2014), might be to linearly combine the outputs of three ANNs, each trained with a different dataset: car, physiological or behavioral.
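One way such a linear combination could be set up is sketched below: the outputs of three already-trained models are fused with weights (plus a bias) fitted by ordinary least squares on a held-out set. This is only an assumption about how the idea might be realized; neither this study nor the description of Samiee et al. (2014) given here specifies the fitting procedure, and all names and numbers are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_combination(outputs, target):
    """Least-squares weights and bias for fusing per-model outputs.

    outputs: one list of per-epoch outputs per model, each as long as target.
    Returns [w_1, ..., w_k, bias].
    """
    rows = [list(vals) + [1.0] for vals in zip(*outputs)]  # one row per epoch
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def combine(weights, outputs):
    """Apply fitted weights and bias to the per-model outputs."""
    *w, bias = weights
    return [sum(wi * v for wi, v in zip(w, vals)) + bias for vals in zip(*outputs)]

# Made-up outputs of three hypothetical ANNs (car, physiological, behavioral).
car = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0]
physio = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0]
behav = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
target = [0.2 * c + 0.3 * p + 0.5 * b + 0.1 for c, p, b in zip(car, physio, behav)]
weights = fit_linear_combination([car, physio, behav], target)
fused = combine(weights, [car, physio, behav])
```

In practice the weights would need to be fitted on data not used to train the individual networks, otherwise the combination would favor whichever network overfits most.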

Fig. 4. Frequency histogram of error distribution (left panel) and correlation (right panel) between real and estimated times, for a model trained with the behavioral, car and physiological datasets.


Surprisingly, other information often included in the literature appears less relevant here. For instance, car deviations relative to the road and line crossings are often considered as signs of drowsiness (Philip et al., 2005). Yet our results unexpectedly show that a model trained with the car dataset alone is less accurate than models trained with other datasets. This may be due to the fact that driving activity and performance are non-linearly correlated with degree of drowsiness. Thus, they may be more useful to detect a critical state (very or extremely drowsy) than to assess a monotonic evolution of the driver’s state (alert, slightly, moderately, very, extremely drowsy). Ingre et al. (2006) showed that the SDLP score (Standard Deviation of the Lateral Position) dramatically increased with a subjective measure of drowsiness (KSS scale: from 1 to 9). Since our postulate was to consider drowsiness as a continuous variable, the car dataset was obviously not the most appropriate for training our model.

Finally, a potential bias in detection might be suspected from the fact that the ground truth is based on subjective evaluations from video recordings of the participant’s motor behavior, which could be thought to explain the superior performance of the behavioral dataset. However, it is worth noting that these features are consensually described in the literature as the most objective and pertinent indicators of drowsiness. It is therefore difficult to conclude whether the high performance of a model trained with behavioral data is due to the way ground truth is set or to the greater relevance of this particular set of data.

4.2. The role of driving time

Driving time (the time elapsed since the beginning of the driving session, in minutes) plays an important role here, greatly improving the performance of the model. Obviously, the longer a driver drives under monotonous conditions, the greater the probability of being drowsy (Philip et al., 1999a,b). This is why drivers travelling on highways are often reminded to take a rest break after two hours of driving (Philip et al., 1999a,b). Thus, the model can be considered to have learned a linear relationship between elapsed time and the remaining time before the occurrence of the critical state (naturally, only until the critical level is reached; after that, the predicted time will be 0). It could therefore be deduced that driving time is sufficient per se to predict impairment of the driver’s state. However, our experiment showed that participants reached a critical level at different times after the session began. Some participants reached the critical state as early as 10 min after the beginning of the driving session, and others after around 30 min. Moreover, we observed that some participants could be drowsy at a particular time and subsequently become alert again. It can therefore be concluded that there is not a simple linear relationship between driving time and the time before a given drowsiness level is reached. To determine the real weight of driving time, we consecutively trained two models with this sole feature, and then tested their detection (model 1) and prediction (model 2) capabilities. For the detection of the drowsiness level, the mean square error was 0.47 ± 0.54. For the prediction of the time before the drowsiness level is reached, the mean square error in the generalization phase was 17.77 ± 2.15 min. Interestingly, we find that the models trained with driving time alone perform better than models trained with car or physiological datasets alone, but worse than models trained with the behavioral dataset alone or with behavioral, car and physiological datasets combined. This shows that, while driving time is a good predictor of drowsiness, it is not the best.

Secondly, a model based on driving time alone would be unable to account for wakening events, such as a rest period or a traffic change. For instance, caffeine is reported to reverse time-on-task degradation of performance in sleep-deprived participants (Wesensten et al., 2004). A short nap or rest may counteract drowsiness (Anund et al., 2015). Thus, if the driver drinks a cup of coffee or takes a rest, a model based on driving time alone would need to be reinitialized. How and when this reset should be performed is an important question, requiring further experiments.

4.3. Generalization and inter-individual variability

Generalization is highly relevant in an industrial context. However, we cannot prove that our model can be generalized to new participants whose data have never been used to train the model. Inter-individual variability (sensitivity to drowsiness, behavioral, physiological or psychological idiosyncrasies) may be a limiting factor for generalization (how the model behaves with previously unseen data) and transfer (how knowledge acquired in a given domain can be adapted to another domain). In our study, the data subset used for the tests (i.e. to evaluate the model’s ability to treat previously unseen data, also called the ‘generalization process’) was randomly chosen among the full set of data from all subjects. Thus, at this stage, it is not possible to determine whether the algorithm would perform well with the full dataset for a given subject whose data were not used to train the model. To do so would require multiple replications of the experiment under the same conditions over a longer period.

It is a major challenge to find a general model which can be trained with a limited number of drivers and then applied to other drivers (Karrer et al., 2004), due to inter-individual variability. Many studies (for a review see Liu et al., 2009) reported great variability in how drowsiness affects performance and physiological parameters in general. It is now recognized that neurobehavioral and cognitive performances vary considerably from one individual to another (Van Dongen et al., 2004a, 2004b). For instance, Philip et al. (2004) studied cognitive performance after sleep deprivation. They found that performance was highly impaired, but more so in elderly participants than in younger participants. In car driving, according to Ingre et al. (2006), there is extensive inter-individual variability in driving behavior and eye behavior: under similar conditions, individuals can present differing profiles of drowsiness evolution over time, and for a given self-declared drowsiness level, markers such as eye blink duration also vary considerably. In our study, participant information (like age or circadian activity) significantly improved accuracy both in detection and in prediction. These results point in the same direction as those of Wang and Xu (2016), who found that including individual factors improved accuracy. Sensitivity to drowsiness is an idiosyncratic factor which may also impact generalization. According to Van Dongen et al. (2003), the high variability in individual performance following sleep deprivation can be explained by the cognitive performance observed when the individual is not sleep-deprived. Van Dongen et al. (2003) also showed that individuals probably differ in their vulnerability to sleep deprivation, and that this is partially predictable from individual cognitive performance without deprivation, i.e. from the individual cognitive profile. Indeed, in driving simulator studies, drowsiness is often observed to develop in differing ways (Thiffault and Bergeron, 2003). Situational and personality factors, sleeping habits and driving history can contribute to the understanding of why some people fall asleep at the wheel while others do not. This points to the need to take into account drivers’ traits or profiles when calibrating systems for the detection and prediction of driver fatigue.

5. Conclusion

In this study, different ANNs were used either to detect a drowsiness level or to predict when a driver’s state will become impaired. The best models (those whose rates of successful detection or prediction are the highest) used information about eyelid closure, gaze and head movements and driving time. Performance on prediction is very promising, since the model can predict to within 5 min when the driver’s state will become impaired. Moreover, modeling drowsiness as a continuum can lead to more precise detection systems offering refined results beyond simply detecting whether the driver is alert or drowsy. Future performance improvements could be achieved by using recurrent neural


networks or dynamic neural networks to add temporality to the model, or by adding other features such as context information (traffic, type of road, weather, etc.). These factors can influence the driver’s state. However, as eyelid and head movements are difficult to record in a real car, the focus should be on improving a model using only driving performance, driving behavior (based on data provided by sensors in the car) and physiological measurements. Finally, a larger and more realistic dataset (far more subjects, with a wider age range for example), recorded in real, on-road conditions (at different times of the day for example), would be required to validate these models.

Conflict of interests

None.

Acknowledgments

This research was funded by a PhD grant from PSA Group via the OpenLab agreement with Aix-Marseille University and CNRS entitled “Automotive Motion Lab”. We thank Marjorie Sweetko for correcting and improving the English manuscript, all the participants in this study, L. Marrou for helping to evaluate video recordings, P. Vars, V. Honnet, and M. Hing for their help with SCANeR® Software Developments.

References

Alhazmi, S., 2013. Towards Context-based Fatigue Detection System in Vehicular Area Network. University of Ottawa, Canada (Unpublished doctoral thesis).
Anund, A., Fors, C., Kecklund, G., Leeuwen, W.V., Åkerstedt, T., 2015. Countermeasures for Fatigue in Transportation: a Review of Existing Methods for Drivers on Road, Rail, Sea and in Aviation. Statens väg- Och Transportforskningsinstitut. (VTI report 852A).
Apparies, R.J., Riniolo, T.C., Porges, S.W., 1998. A psychophysiological investigation of the effects of driving longer-combination vehicles. Ergonomics 41 (5), 581–592.
Arnedt, J.T., Wilde, G.J., Munt, P.W., MacLean, A.W., 2001. How do prolonged wakefulness and alcohol compare in the decrements they produce on a simulated driving task? Accid. Anal. Prev. 33 (3), 337–344.
Beale, M., Hagan, M.T., Demuth, H.B., 1992. Neural Network Toolbox. Neural Network Toolbox, The Math Works 5. pp. 25.
Belz, S.M., Robinson, G.S., Casali, J.G., 2001. An on-road investigation of commercial motor vehicle operator self assessment of fatigue as an indicator of driver fatigue. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting Vol. 45. SAGE Publications, Los Angeles, CA, pp. 1576–1580.
Benoit, A., Caplier, A., 2005. Hypovigilence analysis: open or closed eye or mouth? Blinking or yawning frequency? In: IEEE Conference on Advanced Video and Signal Based Surveillance. AVSS 2005. pp. 207–212.
Bergasa, L.M., Nuevo, J., Sotelo, M.A., Barea, R., Lopez, M.E., 2006. Real-time system for monitoring driver vigilance. IEEE Trans. Intell. Transp. Syst. 7 (1), 63–77.
Besson, P., Bourdin, C., Bringoux, L., Dousset, E., Maiano, C., Marqueste, T., Vercher, J.-L., 2013. Effectiveness of physiological and psychological features to estimate helicopter pilots’ workload: a bayesian network approach. IEEE Trans. Intell. Transp. Syst. 14 (4), 1872–1881.
Bhowmick, B., Chidanand Kumar, K.S., 2009. Detection and classification of eye state in IR camera for driver drowsiness identification. 2009 IEEE International Conference on Signal and Image Processing Applications (ICSIPA) 340–345.
Borghini, G., Astolfi, L., Vecchiato, G., Mattia, D., Babiloni, F., 2014. Measuring neurophysiological signals in aircraft pilots and car drivers for the assessment of mental workload, fatigue and drowsiness. Neurosci. Biobehav. Rev. 44, 58–75.
Brown, I.D., 1997. Prospects for technological countermeasures against driver fatigue. Accid. Anal. Prev. 29 (4), 525–531.
Bundele, M.M., Banerjee, R., 2009. Detection of fatigue of vehicular driver using skin conductance and oximetry pulse: a neural network approach. In: Proceedings of the 11th International Conference on Information Integration and Web-based Applications & Services. New York, NY, USA: ACM. pp. 739–744.
Caffier, P.P., Erdmann, U., Ullsperger, P., 2003. Experimental evaluation of eye-blink parameters as a drowsiness measure. Eur. J. Appl. Physiol. 89 (3–4), 319–325.
Chauhan, A., Saroliya, A., Sharma, V., 2015. Design & Analysis of KNN algorithm for fatigue detection in vehicular drivers using Pulse Oximetry parameter. Int. J. Eng. Technol. Manage. 2 (3), 107–110.
Chen, J., Ji, Q., 2012. Drowsy driver posture, facial, and eye monitoring methods. In: Eskandarian, A. (Ed.), Handbook of Intelligent Vehicles. Springer, London, pp. 913–940.
Chen, R., 2013. Sitting Behaviour-based Pattern Recognition for Predicting Driver Fatigue. Deakin University, Australia (Unpublished doctoral thesis).
Daza, I.G., Bergasa, L.M., Bronte, S., Yebes, J.J., Almazán, J., Arroyo, R., 2014. Fusion of optimized indicators from Advanced Driver Assistance Systems (ADAS) for driver drowsiness detection. Sensors 14 (1), 1106–1131.
De Gennaro, L., Ferrara, M., Curcio, G., Cristiani, R., 2001. Antero-posterior EEG changes during the wakefulness–sleep transition. Clin. Neurophysiol. 112 (10), 1901–1911.
De Valck, E., De Groot, E., Cluydts, R., 2003. Effects of slow-release caffeine and a nap on driving simulator performance after partial sleep deprivation. Percept. Mot. Skills 96 (1), 67–78.
Dong, Y., Hu, Z., Uchimura, K., Murayama, N., 2011. Driver inattention monitoring system for intelligent vehicles: a review. IEEE Trans. Intell. Transp. Syst. 12 (2), 596–614.
Elsenbruch, S., Harnish, M.J., Orr, W.C., 1999. Heart rate variability during waking and sleep in healthy males and females. Sleep 22 (8), 1067–1071.
Eskandarian, A., Sayed, R., Delaigue, P., Blum, J., Mortazavi, A., 2007. Advanced Driver Fatigue Research. Federal Motor Carrier Safety Administration, Washington, DC Report: FMCSA-RRR-07–001.
Friedrichs, F., Yang, B., 2010. Drowsiness monitoring by steering and lane data based features under real driving conditions. In: Proceedings of the European Signal Processing Conference. Aalborg, Denmark. pp. 23–27.
Golding, J.F., 1998. Motion sickness susceptibility questionnaire revised and its relationship to other forms of sickness. Brain Res. Bull. 47 (5), 507–516.
Hajinoroozi, M., Mao, Z., Huang, Y., 2015. Prediction of driver’s drowsy and alert states from EEG signals with deep learning. 2015 IEEE 6th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP) 493–496.
Hargutt, V., Kruger, H.-P., 2001. Eyelid movements and their predictive value for fatigue stages. In: Presented at the International Conference on Traffic and Transport Psychology – ICTTP 2000. Held 4–7 September 2000, Bern, Switzerland.
Healey, J.A., Picard, R.W., 2005. Detecting stress during real-world driving tasks using physiological sensors. IEEE Trans. Intell. Transp. Syst. 6 (2), 156–166.
Horne, J.A., Ostberg, O., 1975. A self-assessment questionnaire to determine morningness-eveningness in human circadian rhythms. Int. J. Chronobiol. 4 (2), 97–110.
Horne, J., Reyner, L., 1999. Vehicle accidents related to sleep: a review. Occup. Environ. Med. 56 (5), 289–294.
Ingre, M., Åkerstedt, T., Peters, B., Anund, A., Kecklund, G., 2006. Subjective sleepiness, simulated driving performance and blink duration: examining individual differences. J. Sleep Res. 15 (1), 47–53.
Ji, Q., Zhu, Z., Lan, P., 2004. Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Trans. Veh. Technol. 53 (4), 1052–1068.
Johns, M.W., 1991. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep 14 (6), 540–545.
Ju, J.H., Park, Y.J., Park, J., Lee, B.G., Lee, J., Lee, J.Y., 2015. Real-Time driver’s biological signal monitoring system. Sens. Mater. 27 (1), 51–59.
Kaida, K., Åkerstedt, T., Kecklund, G., Nilsson, J.P., Axelsson, J., 2007. Use of subjective and physiological indicators of sleepiness to predict performance during a vigilance task. Ind. Health 45 (4), 520–526.
Karrer, K., Vöhringer-Kuhnt, T., Baumgarten, T., Briest, S., 2004. The role of individual differences in driver fatigue prediction. In: Third International Conference on Traffic and Transport Psychology. Nottingham, UK. pp. 5–9.
Krajewski, J., Batliner, A., Golz, M., 2009a. Acoustic sleepiness detection: framework and validation of a speech-adapted pattern recognition approach. Behav. Res. Methods 41 (3), 795–804.
Krajewski, J., Sommer, D., Trutschel, U., Edwards, D., Golz, M., 2009b. Steering wheel behavior based estimation of fatigue. Proceedings of the Fifth International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design 118–124.
Lal, S.K., Craig, A., 2001. A critical review of the psychophysiology of driver fatigue. Biol. Psychol. 55 (3), 173–194.
Larue, G.S., 2010. Predicting effects of monotony on driver’s vigilance. Centre for Accident Research and Road Safety. Queensland University of Technology, Australia (Unpublished doctoral thesis).
Lee, B.G., Chung, W.-Y., 2012. Driver alertness monitoring using fusion of facial features and bio-signals. IEEE Sens. J. 12 (7), 2416–2422.
Lee, J.D., Fiorentino, D., Reyes, M.L., Brown, T., Ahmad, O., Fell, J., Dufour, R., 2010. Assessing the Feasibility of Vehicle-based Sensors to Detect Alcohol Impairment 811. National Highway Traffic Safety Administration, Washington, DC, DOT HS, pp. 358.
Lee, B.L., Lee, B.G., Chung, W.Y., 2016. Standalone wearable driver drowsiness detection system in a smartwatch. IEEE Sens. J. 16 (13), 5444–5451.
Levenberg, K., 1944. A method for the solution of certain non-linear problems in least squares. Q. Appl. Math. 2 (2), 164–168.
Li, L., Werber, K., Calvillo, C.F., Dinh, K.D., Guarde, A., König, A., 2014. Multi-sensor soft-computing system for driver drowsiness detection. In: Snášel, V., Krömer, P., Köppen, M., Schaefer, G. (Eds.), Soft Computing in Industrial Applications. Springer International Publishing, pp. 129–140.
Liang, Y., Reyes, M.L., Lee, J.D., 2007. Real-time detection of driver cognitive distraction using support vector machines. IEEE Trans. Intell. Transp. Syst. 8 (2), 340–350.
Liu, C.C., Hosking, S.G., Lenné, M.G., 2009. Predicting driver drowsiness using vehicle measures: recent insights and future challenges. J. Saf. Res. 40 (4), 239–245.
Marin-Lamellet, C., Paire-Ficout, L., Lafont, S., Amieva, H., Laurent, B., Thomas-Antérion, C., Fabrigoule, C., 2003. Mise En Place d’un Outil d’évaluation Des déficits Attentionnels Affectant Les Capacités De Conduite Au Cours Du Vieillissement Normal Et Pathologique: L’étude SÉROVIE 81. Recherche – Transports – Sécurité, pp. 177–189.
McDonald, A.D., Lee, J.D., Schwarz, C., Brown, T.L., 2013. Steering in a random forest: ensemble learning for detecting drowsiness-related lane departures. Hum. Factors J. Hum. Factors Ergon. Soc (18720813515272).
Murata, A., Naitoh, K., 2015. Multinomial logistic regression model for predicting driver’s drowsiness using only behavioral measures. J. Traffic Trans. Eng. 3, 80–90.
Murata, A., Ohta, Y., Moriwaka, M., 2016. Multinomial logistic regression model by stepwise method for predicting subjective drowsiness using performance and behavioral measures. In: Goonetilleke, R., Karwowski, W. (Eds.), Advances in Physical Ergonomics and Human Factors 489. Springer International Publishing, Cham, pp.


665–674.
Peiris, M.T.R., Jones, R.D., Davidson, P.R., Carroll, G.J., Signal, T.L., Parkin, P.J., Bones, P.J., 2005. Identification of vigilance lapses using EEG/EOG by expert human raters. 2005 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society 1–7, 5735–5737.
Philip, P., Taillard, J., Guilleminault, C., Quera, S., Bioulac, B., Ohayon, M., 1999a. Long distance driving and self-induced sleep deprivation among automobile drivers. Sleep 22 (4), 475–480.
Philip, P., Taillard, J., Quera-Salva, M., Bioulac, B., Åkerstedt, T., 1999b. Simple reaction time, duration of driving and sleep deprivation in young versus old automobile drivers. J. Sleep Res. 8 (1), 9–14.
Philip, P., Taillard, J., Sagaspe, P., Valtat, C., Sanchez-Ortuno, M., Moore, N., Bioulac, B., 2004. Age, performance and sleep deprivation. J. Sleep Res. 13 (2), 105–110.
Philip, P., Sagaspe, P., Taillard, J., Valtat, C., Moore, N., Åkerstedt, T., Bioulac, B., 2005. Fatigue, sleepiness, and performance in simulated versus real driving conditions. Sleep 28 (12), 1511.
Rebolledo-Mendez, G., Reyes, A., Paszkowicz, S., Domingo, M.C., Skrypchuk, L., 2014. Developing a body sensor network to detect emotions during driving. IEEE Trans. Intell. Transp. Syst. 15 (4), 1850.
Reimer, B., Coughlin, J.F., Mehler, B., 2009. Development of a driver aware vehicle for monitoring, managing & motivating older operator behavior. Proceedings of the ITS-America 1–9.
Riemersma, J.B.J., Sanders, A.F., Wildervanck, C., Gaillard, A.W., 1977. Performance decrement during prolonged night driving. Vigilance. Springer, pp. 41–58.
Rodriguez Ibañez, N., García González, M.Á., Ramos Castro, J.J., Fernández Chimeno, M., 2011. Drowsiness detection by thoracic effort signal analysis with professional drivers in real environments. Driver Distraction & Inattention 2011: Program, Presentations & Reviewed Papers.
Rossi, R., Gastaldi, M., Gecchele, G., 2011. Analysis of driver task-related fatigue using driving simulator experiments. Proc. Soc. Behav. Sci. 20, 666–675.
Rost, M., Zilberg, E., Xu, Z.M., Feng, Y., Burton, D., Lal, S., 2015. Comparing contribution of algorithm based physiological indicators for characterisation of driver drowsiness. J. Med. Bioeng. 4 (5), 391–398.
Stein, P.K., Pu, Y., 2012. Heart rate variability, sleep and sleep disorders. Sleep Med. Rev. 16 (1), 47–66.
Sukanesh, R., Vijayprasath, S., 2013. Certain investigations on drowsiness alert system based on heart rate variability using LabVIEW. WSEAS Trans. Inf. Sci. Appl. 10 (11).
Tango, F., Calefato, C., Minin, L., Canovi, L., 2009. Moving attention from the road: a new methodology for the driver distraction evaluation using machine learning approaches. 2nd Conference on Human System Interactions 2009, 596–599 (HSI ’09).
Thiffault, P., Bergeron, J., 2003. Fatigue and individual differences in monotonous simulated driving. Personality Individual Differences 34 (1), 159–176.
Torkkola, K., Gardner, M., Schreiner, C., Zhang, K., Leivian, B., Zhang, H., Summers, J., 2008. Understanding driving activity using ensemble methods. In: Prokhorov, D. (Ed.), Computational Intelligence in Automotive Applications. Springer, Berlin Heidelberg, pp. 39–58.
Van Dongen, H.P.A., Rogers, N.L., Dinges, D.F., 2003. Sleep debt: theoretical and empirical issues. Sleep Biol. Rhythms 1 (1), 5–13.
Van Dongen, H.P.A., Baynard, M.D., Maislin, G., Dinges, D.F., 2004a. Systematic inter-individual differences in neurobehavioral impairment from sleep loss: evidence of trait-like differential vulnerability. Sleep 27 (3), 423–433.
Van Dongen, H.P.A., Maislin, G., Dinges, D.F., 2004b. Dealing with inter-individual differences in the temporal dynamics of fatigue and performance: importance and techniques. Aviat. Space Environ. Med. 75 (3), A147–A154.
Verwey, W.B., Zaidel, D.M., 2000. Predicting drowsiness accidents from personal attributes, eye blinks and ongoing driving behaviour. Personality Individual Differences 28 (1), 123–142.
Wang, X., Xu, C., 2016. Driver drowsiness detection based on non-intrusive metrics considering individual specifics. Accid. Anal. Prev. 95, 350–357 (Part B).
Watson, A., Zhou, G., 2016. Microsleep prediction using an EKG capable heart rate monitor. 2016 IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) 328–329.
Wesensten, N.J., Belenky, G., Thorne, D.R., Kautz, M.A., Balkin, T.J., 2004. Modafinil vs. caffeine: effects on fatigue during sleep deprivation. Aviat. Space Environ. Med. 75 (6), 520–525.
Wierwille, W.W., Ellsworth, L.A., 1994. Evaluation of driver drowsiness by trained raters.
Samiee, S., Azadi, S., Kazemi, R., Nahvi, A., Eichberger, A., 2014. Data fusion to develop a Accid. Anal. Prev. 26 (5), 571–581.
driver drowsiness detection system with robustness to signal loss. Sensors 14 (9), Yang, G., Lin, Y., Bhattacharya, P., 2010. A driver fatigue recognition model based on
17832 (14248220). information fusion and dynamic Bayesian network. Inf. Sci. 180 (10), 1942–1954.
Sayed, R., Eskandarian, A., 2001. Unobtrusive drowsiness detection by neural network Yeo, M.V.M., Li, X., Shen, K., Wilder-Smith, E.P.V., 2009. Can SVM be used for automatic
learning of driver steering. Proceedings of The Institution of Mechanical Engineers EEG detection of drowsiness during car driving? Saf. Sci. 47 (1), 115–124.
Part D-Journal of Automobile Engineering 215 (9), 969–975. Zhang, Y., Owechko, Y., Zhang, J., 2004. Driver cognitive workload estimation: a data-
Shahid, A., Wilkinson, K., Marcu, S., Shapiro, C.M., 2011. Karolinska sleepiness scale driven perspective. The 7th International IEEE Conference on Intelligent
(KSS). In: Shahid, A., Wilkinson, K., Marcu, S., Shapiro, C.M. (Eds.), STOP, THAT and Transportation Systems, 2004. Proceedings 642–647.
One Hundred Other Sleep Scales. Springer, New York, pp. 209–210 (ch 47).
