
International Journal of Injury Control and Safety Promotion

ISSN: 1745-7300 (Print) 1745-7319 (Online) Journal homepage: https://www.tandfonline.com/loi/nics20

Using statistical modelling to analyze risk factors for severe and fatal road traffic accidents

Katharine Reeves, Joht Singh Chandan & Siddhartha Bandyopadhyay

To cite this article: Katharine Reeves, Joht Singh Chandan & Siddhartha Bandyopadhyay
(2019): Using statistical modelling to analyze risk factors for severe and fatal road
traffic accidents, International Journal of Injury Control and Safety Promotion, DOI:
10.1080/17457300.2019.1635625

To link to this article: https://doi.org/10.1080/17457300.2019.1635625


Published online: 09 Jul 2019.

INTERNATIONAL JOURNAL OF INJURY CONTROL AND SAFETY PROMOTION
https://doi.org/10.1080/17457300.2019.1635625

Using statistical modelling to analyze risk factors for severe and fatal road traffic accidents
Katharine Reeves (a), Joht Singh Chandan (b) and Siddhartha Bandyopadhyay (a)
(a) Department of Economics, University of Birmingham, Birmingham, UK; (b) Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK

ABSTRACT
Road traffic accidents (RTAs) are still frequent events in the UK, with severe/fatal RTAs leading to significant morbidity and mortality. Therefore, this study aimed to explore clusters of risk factors which affect the severity of RTAs in the UK. A retrospective analysis of 76,334 driver-level records between 2005 and 2014 was conducted. Two methods were used: ‘partially constrained generalized logistic regression models’ and ‘classification and regression tree’ (CART) analysis in order to identify individual factors and combinations of risk factors relating to severity of accidents. Several established risk factors were confirmed which contribute to the severity of RTAs. Specific combinations of factors were identified which were more likely to lead to fatal accidents: the involvement of one older person, one or no cycles involved, speed limit over 40 mph in combination with several other factors. This study reaffirmed risk factors relating to severity of RTAs in the UK, but also established combinations of risk factors which led to the most severe outcomes, allowing for targeting of accident-prevention measures. In addition, this study demonstrates the use of CART analysis which can be used in wider public health evaluations where multiple risk factors are at play.

ARTICLE HISTORY
Received 24 February 2019
Revised 10 June 2019
Accepted 20 June 2019

KEYWORDS
Road traffic accidents; fatal accidents; morbidity; mortality; safety; transport; police

Introduction

In 2016, there were 181,384 accidents recorded on roads in Great Britain and of these 1,792 led to fatalities (Jackson & Cracknell, 2018). Globally, the World Health Organization (WHO) identified that more than 1.25 million people are killed in road traffic accidents (RTAs) and this remains the leading cause of death among people between 15 and 29 years (WHO, 2015b). Even after considering the reduction in number of RTAs in the UK and globally, morbidity and mortality still remain key issues (Ernstberger et al., 2015; Jackson & Cracknell, 2018). The costs of RTAs are multifaceted, ranging from lost output to the economy to medical and policing time (see Supplementary 1 which has been reproduced from the Department for Transport Statistics) (Department for Transport Statistics, 2012). The total cost of RTAs, both fatal and non-fatal, can be calculated using a willingness to pay approach, which gives the average cost per fatality as £1,548,105, per serious casualty as £173,964 and per slight casualty as £13,411 (Department for Transport, 2014). Therefore, considering the incidence of mortality and morbidity and the consequent economic cost, it is clear that managing RTAs poses a significant, international public health challenge (Racioppi, Eriksson, Tingvall, & Villaveces, 2004).

Globally, there has been a push to identify key risk factors responsible for fatal or severe accidents. The WHO report on road traffic injury prevention identified that lack of in-vehicle crash protection, non-use of crash helmets by two-wheeled vehicle users, non-use of seatbelts and the presence of roadside objects were risk factors influencing severity of injury caused due to RTAs (Peden et al., 2004). The risk of serious or fatal injuries was much higher in low- and middle-income settings (WHO, 2015a). Alongside other factors, this has been demonstrated in other studies, possibly due to inequality in access to health care (Sherafati, Homaie-Rad, Afkar, Gholampoor-Sigaroodi, & Sirusbakht, 2017).

Fewer studies recently have explored these risk factors relating to fatality and serious injury in high-income countries. Within high-income countries such as the UK, the risk factors relating to severity of RTAs are often complex and multifactorial, as some of the factors described in the above WHO report are mitigated within UK law and policy, such as the mandatory requirement to wear seatbelts (Road Safety Observatory, 2018). Therefore, within the less extensive literature on the UK and high-income countries, differing patterns of risk factors are often identified (Bachani, Peden, Gururaj, Norton, & Hyder, 2017). The evidence available in high-income countries suggests that young drivers have different risk factor profiles from their older counterparts: for young drivers, risk factors relate to inexperience, lack of skill and increased risk-taking behaviour, compared to visual, cognitive and mobility impairment in older drivers

CONTACT Siddhartha Bandyopadhyay [email protected]


Supplemental data for this article is available online at https://doi.org/10.1080/17457300.2019.1635625.
© 2019 Informa UK Limited, trading as Taylor & Francis Group

(Ball, Edwards, Ross, & McGwin Jr., 2010; Langford & Koppel, 2006; Rolison, Hanoch, Wood, & Liu, 2014; Rolison, Regev, Moutari, & Feeney, 2018). Surveys in the UK and Europe have explored drivers’ perceptions of factors which contribute to the severity of RTAs from their past experience, and these include driving behaviour (speeding, distraction, lapses of attention and aggression), risk taking and driving when fatigued (Antov et al., 2010; Smith, 2016). However, due to differing risk profiles in the UK compared to other global cohorts, it is important to continually explore possible risk factors for severe and fatal RTAs which could influence future policy making in the area. Also, it is clear that few global studies have utilized statistical modelling methods such as the classification and regression tree (CART) to allow for the identification of combinations of risk factors, as opposed to the individual effects of each risk factor.

The aim of the research in this article is to identify driver characteristics and environmental factors which affect the severity of RTAs in a population in two English counties (Norfolk and Suffolk) using statistical methods. This type of analysis would be beneficial to the police, health services and road safety agencies, as they can identify groups of drivers and circumstances which are most at risk of killed or seriously injured (KSI) accidents. Two methods have been used which can be applied in other public health analysis settings, particularly where the cause of a change in the dependent variable is multifactorial, such as in a complex service analysis.

Methods

Study population and period

This article uses a driver-level dataset of 76,334 records compiled by Norfolk and Suffolk police forces from RTAs during the years 2005 to 2014. Police reports are used to record the driver characteristics and environmental factors surrounding an accident and the details of these variables are described in Supplementary 2.

The dataset includes driver characteristics such as gender, age, ethnicity, breath test result, whether they were wearing a seatbelt and whether or not they hold a UK driving licence. Additionally, several variables are included to describe the environment and characteristics of the accident itself including the condition of the road, visibility, number of casualties, time of day, day of the week, road class and type, speed limit and weather conditions. The outcome variable for this analysis is the severity of an accident, which is recorded categorically as 1, 2 or 3 for the categories ‘slight’, ‘severe’ and ‘fatal’, respectively, as defined by the UK government (Department for Transport, 2017) (see Supplementary 3 for definitions of slight, severe and fatal).

Statistical analysis

Partially constrained generalized ordered logistic regression (PC-GOLOGIT)

The first model used in order to identify individual risk factors associated with severity is the PC-GOLOGIT model. The severity level of an accident (slight, severe or fatal) is recorded as a categorical variable. Further, the numerical value assigned to this variable (1, 2 or 3) is ordinal in nature. Ordered categorical variables are non-continuous, bounded and cannot be measured on an interval or ratio scale; therefore, an ordinary least squares approach would not be suitable in this circumstance to assess risk factors leading to such outcomes. It is necessary to use a group of models specifically designed for this type of dependent variable, known as ordered logistic regression models (OLOGIT) or proportional odds models. For this type of model (OLOGIT), the dependent variable has M discrete levels and M-1 binary logistic regressions are estimated using grouped values of the dependent variable, with the coefficients constrained to be equal across these regressions (the proportional odds assumption). Supplementary 4 illustrates an example where the dependent variable has three levels and the model estimates two logistic regressions. For the first regression, the dependent variable is equal to 0 when Y = 1 and 1 when Y = 2 or 3; for the second, it is equal to 0 when Y = 1 or 2 and 1 when Y = 3. Accordingly, the estimated coefficients are the effect of a change in the explanatory variables on the odds that Y = 2 or 3 in the first regression and Y = 3 in the second.

Equation 1 gives the model for the generalized ordered logit regression (GOLOGIT), of which OLOGIT is a special case. As can be seen in the model, the proportional odds assumption is relaxed, and a separate set of coefficients is estimated for each logistic regression. This model is suitable when tests indicate that the proportional odds assumption of OLOGIT is not met.

Equation 1:
$$P(Y_i > j) = \frac{\exp(\alpha_j + X_i \beta_j)}{1 + \exp(\alpha_j + X_i \beta_j)}, \quad j = 1, 2, \ldots, M-1$$

The disadvantage of this, however, is that there are now M-1 sets of coefficients to interpret and work with, rather than just one. So the most common approach is a mixture of the two above called the partially constrained generalized ordered logit regression (PC-GOLOGIT) (Long, 1997; Williams, 2016; Williams & Williams, 2006). This model is often the most appropriate in practice as it can accommodate both OLOGIT and GOLOGIT models where needed. It is given in Equation 2, in which the coefficients on X1 and X2 are constrained to be the same for every j while the coefficients on X3 vary with j, and it is the model used for the first part of the analysis.

Equation 2:
$$P(Y_i > j) = \frac{\exp(\alpha_j + X_{1i}\beta_1 + X_{2i}\beta_2 + X_{3i}\beta_{3j})}{1 + \exp(\alpha_j + X_{1i}\beta_1 + X_{2i}\beta_2 + X_{3i}\beta_{3j})}, \quad j = 1, 2, \ldots, M-1$$

Statistical significance will be set at p < 0.05.

Classification and regression tree analysis

The second method used is the CART model. This method enables us to identify groups of risk factors that are related to severity of an accident. This type of analysis lends itself well to datasets with several categorical variables and is also good at handling interactions in addition to heterogeneity. CART is used in data mining which seeks to predict the outcome of future events based on the characteristics of past

events. There are several algorithm options, all of which create splits in the group of observations and create a tree. There are several advantages to using this type of model, particularly when analyzing categorical variables, as is often the case when the data are derived from forms or surveys such as this road accident data. The repeated splitting of observations into groups is most natural for categorical variables which can be split into clearly defined categories. An additional benefit of CART over traditional regressions is that it works well with interactions; for example, a fatal accident may be more likely if a driver is both a certain age and driving at a certain speed. These interactions can be difficult to include in regressions since there are so many possibilities to consider. Adding in further interaction terms within a logistic regression model requires substantially more computing power and decreases the degrees of freedom. In a CART model, however, the nature of the tree means that all interactions are naturally included if they are optimal, without having to specify them and without having to reduce the degrees of freedom.

There are several versions of the CART model which use slightly different algorithms. The analysis in this article uses the C5.0 algorithm (Quinlan, 1993). Due to the versatility of this method of analysis and advancements in computing power, it can be applied to, for example, designing predictive models of medical outcomes such as mortality rates or in other healthcare settings (Morgan, 2014).

The C5.0 algorithm starts with a training dataset, a subset of the original dataset on which the model is built in order to predict the outcomes of the remaining observations.

For this analysis, driver-level data were identified in order to analyze the effect of driver characteristics in addition to factors surrounding the accident on the severity of an accident. The classification tree in this analysis is used to identify both groups and situations where the risk of KSI accidents is higher as well as predict periods of peak demand on services. This is done by building a model which has predictive power for a test dataset.

The driver-level dataset for Norfolk and Suffolk contains 76,334 records, for which the accident severity, driver characteristics and other confounding factors are recorded. In order to avoid selection bias, half of the dataset is selected at random to create the training set on which to build the model and the other half is reserved for testing when the model is complete. The training set, therefore, contains 38,167 observations which belong to one of three severity classes: slight, severe or fatal.

The first decision to make is the proportion of the data which should be used as the training set and consequently how large the test set should be, which is the set of observations on which the model will be tested. A percentage error is then calculated to show the proportion of observations for which the outcome variable class is incorrectly predicted.

As shown in Figure 1, the percentage of observations used in the training set affects both the size of the tree produced and the percentage of incorrectly predicted observation classes. As the size of the training set increases, so too does the size of the tree and the percentage error falls. There is, therefore, a trade-off between producing a small tree which is easy to interpret and over-fitting the model to the dataset in order to achieve a low percentage error. The majority of papers that use this analysis method settle on using 50% of the dataset in order to strike a balance between these two objectives and this is the training set size which is used in this article.

The test which maximizes the information gain is chosen at each stage in order to create the next set of branches. A test for a continuous variable often comes in the form of an inequality, and for a categorical variable, branches represent one or more classes within that category.

Repetitive splitting of observations in a non-trivial way will always result in single-class leaves eventually, but the aim of designing a model is to produce a tree which is small enough to interpret with the minimum percentage error possible (Quinlan, 1993). With this in mind, Supplementary 5 illustrates the tree size and the percentage error when the minimum number of cases per leaf is restricted. This is a form of ‘pre-pruning’, where a decision is made about the size of the tree before it is run. By increasing the number of minimum cases to eight, the size of the tree becomes manageable without much of an effect on the percentage error. The result of this process is the classification tree in Figure 3, which, when tested on the test dataset, correctly predicts the severity of accidents for 83.7% of observations.

Results

Individual variables influencing the severity of traffic accidents using PC-GOLOGIT

The results for the partially constrained model are presented in Table 1 as the change in log odds. Each variable has two estimated coefficients, one for changing from ‘slight’ to ‘severe or fatal’ and one for ‘slight or severe’ to ‘fatal’. Where there is no significant difference between them, only one is reported. In summary, the following variables have statistically significant estimated coefficients in terms of log odds (95% confidence intervals in parentheses):

- Female driver: -0.41 (-0.59, -0.23)
- Positive breath test: 1.04 (0.69, 1.39)
- No seatbelt: 1.42 (1.03, 1.81)
- No UK licence: -0.63 (-1.12, -0.14)
- Dark with no street lighting: 0.33 (0.13, 0.53) and 0.85 (0.42, 1.29)
- Casualties: 0.39 (0.32, 0.46)
- Slip road: -1.98 (-3.62, -0.33)
- Speed limit: 0.02 (0.01, 0.03)
- Raining with high winds: -0.94 (-1.72, -0.16)
- Old age pensioners (OAPs): 0.44 (0.30, 0.57) and 0.67 (0.43, 0.90)
- Pedestrians: 1.25 (0.63, 1.88)

As the outcomes are in log odds, this can be difficult to interpret; therefore, Supplementary 6 presents the marginal effects which give the probability of a particular outcome

Figure 1. The effect of the size of the training set.
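
The splitting rule described in the Methods, in which the test that maximizes the information gain is chosen at each stage, can be illustrated with a minimal sketch. The records, variable names and candidate tests below are hypothetical and are not taken from the Norfolk and Suffolk data; the C5.0 algorithm used in the article includes further refinements that this sketch does not attempt to reproduce.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, test):
    """Entropy reduction obtained by grouping rows by the outcome of test(row)."""
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(test(row), []).append(label)
    weighted = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return entropy(labels) - weighted


# Hypothetical toy records (assumption: not real accident data).
rows = [
    {"speed_limit": 30, "road_type": "roundabout"},
    {"speed_limit": 30, "road_type": "single"},
    {"speed_limit": 60, "road_type": "single"},
    {"speed_limit": 70, "road_type": "dual"},
    {"speed_limit": 60, "road_type": "dual"},
    {"speed_limit": 30, "road_type": "single"},
]
labels = ["slight", "slight", "severe", "fatal", "severe", "slight"]

# A continuous variable is tested with an inequality; a categorical
# variable gets one branch per category.
tests = {
    "speed_limit > 40": lambda r: r["speed_limit"] > 40,
    "road_type": lambda r: r["road_type"],
}

gains = {name: information_gain(rows, labels, t) for name, t in tests.items()}
best_test = max(gains, key=gains.get)
print(best_test, gains)
```

In this toy example the inequality on the speed limit separates all the slight cases into one branch, so it yields the larger gain and would be chosen for the first node.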

occurring for each value of a variable when the value of all other variables is held constant at the mean. Figure 2 shows that for any individual who has a road accident (excluding all individuals who had no injury), there is an 85% chance that it will be defined as a ‘slight’ accident, a 13% chance that it will be ‘severe’ and a 2% chance that it will be ‘fatal’.

Figure 2. Marginal effects model.

The robustness of the model is tested by estimating the PC-GOLOGIT model for the years 2005–2009 and 2010–2014 (Supplementary 7). The majority of estimates appear to be robust to the time sample and the estimated coefficients are similar to those estimated across the two time periods and for the entire sample. There are, however, a few, such as the number of vehicles, the driver’s age and black or mixed background ethnicity, which are not robust across time periods. This may suggest that their importance in explaining accidents has varied over time or that the number of observations is low in these categories when a shorter time period is analyzed and therefore the PC-GOLOGIT model does not perform well.

Combinations of variables affecting the risk of experiencing serious or fatal accidents using CART analysis

Figure 3 demonstrates the classification tree developed as a result of the CART analysis. At each node is a variable, either categorical or continuous, and it is followed by a set of branches, each with a result of the test which leads to the next node. For example, the first node is concerned with the number of OAPs involved in the accident and splits into two branches depending on whether there were fewer than or equal to one OAP involved or more than one. These tests continue along each branch until a leaf is reached, which is represented as an oval. Each leaf gives the class to which the majority of observations in the subset belong and the ratio to other classes. The interpretation for the first branch is as follows: in an accident where 0 or 1 OAPs are involved and there are more than 5 casualties, 37 out of 54 drivers were involved in a severe accident and therefore any drivers who meet these conditions in the future would also be predicted to have a severe accident rather than slight or fatal.

The accuracy of the decision tree is illustrated in Supplementary 8 which shows that, when used to predict the outcome of the observations in the test dataset, 16.3% are incorrectly predicted. The observations along the diagonal show cases which are correctly predicted; for example, there were 19 cases which were classified as fatal and were, in fact, fatal. Off the diagonal are the cases which are incorrectly predicted; for example, there are 13 cases which were classified as fatal but were really slight. The error percentage is calculated as the percentage of total predictions which are incorrect, which is illustrated in Equation 3.

Equation 3:
$$\frac{20 + 772 + 6 + 5465 + 13 + 61}{19 + 20 + 772 + 6 + 120 + 5465 + 13 + 61 + 32481} \times 100 = 16.3\%$$

As illustrated by the classification tree, there is only one leaf containing observations where the majority are fatal and these observations share the following characteristics: more than one OAP involved, one or no cycles involved, more than two casualties, zero pedestrians, speed limit over 40 mph, either a dual carriageway, one way street or a single carriageway, more than five casualties and more than five vehicles.

There are several important points to keep in mind when interpreting the results of a classification tree. The first is

Figure 3. Classification tree.
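
The percentage error in Equation 3 can be recomputed from the counts it lists. In this sketch the three counts that appear only in the denominator are treated as the correctly predicted (diagonal) cases, and the variable names are illustrative; the full layout of the confusion matrix is in Supplementary 8 and is not reproduced here.

```python
# Counts copied from Equation 3. Which cell of the confusion matrix each
# off-diagonal count belongs to is not needed for the error percentage.
correct_counts = [19, 120, 32481]                 # appear only in the denominator
incorrect_counts = [20, 772, 6, 5465, 13, 61]     # numerator of Equation 3

total_predictions = sum(correct_counts) + sum(incorrect_counts)
percentage_error = 100 * sum(incorrect_counts) / total_predictions
percentage_correct = 100 - percentage_error

print(f"error: {percentage_error:.1f}%  correct: {percentage_correct:.1f}%")
```

Rounded to one decimal place, the two figures correspond to the 16.3% error and 83.7% accuracy quoted in the text.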

that this is not the only combination of factors which will result in a fatal accident; there may also be drivers in other leaves who fall into the same class. The classification merely shows that the majority of this group were involved in a fatal accident and also share these characteristics. Another point to keep in mind is that the results refer to combinations of factors, which is different to the marginal effects of individual variables found in regression analysis. For example, a speed limit which is over 40 mph only increases the probability of an accident being fatal when combined with the other relevant factors.

Discussion

Main findings of the study

Using two statistical models, this study has been able to identify variables influencing severity of RTAs and also assess combinations of these factors with their effect on severity of RTAs. The results using a PC-GOLOGIT model show that several independent variables have a statistically significant effect on the severity of accidents, the largest of which are a positive breath test, not wearing a seatbelt and having an accident while not on a slip road.

Using CART analysis, we have identified several combinations of factors that are found to be associated with the risk of fatal accidents, such as an accident with more than one OAP involved, one or no cycles involved, zero pedestrians, speed limit over 40 mph, either a dual carriageway, one-way street or a single carriageway, more than five casualties and more than five vehicles. This indicates that there are specific groups which could be the focus of policies aimed at reducing the severity of future accidents.

What is already known on this topic

It is clear from previous literature that morbidity and mortality still remain key issues relating to RTAs (Ernstberger et al., 2015; Jackson & Cracknell, 2018), with their aetiology being multifaceted. Global bodies have published a fair amount of literature suggesting the impact of factors relating to severity of accidents, particularly in low- and middle-income settings (Peden et al., 2004; WHO, 2015b). The key factors affecting crash severity identified by WHO (2015b) are as follows: human tolerance

Table 1. PC-GOLOGIT results.

                                            Severe or fatal (baseline = slight)Fatal (baseline = slight or severe)
Variables                                   Log odds  Odds   95% CI            Log odds  Odds   95% CI
Vehicles                                    -0.02     0.98   (-0.1, 0.06)      0.12      1.13   (-0.01, 0.25)
Driver sex (baseline = male)                -0.41     0.66   (-0.59, -0.23)    -0.41     0.66   (-0.59, -0.23)
Driver age                                  0         1      (0, 0.01)         0         1      (0, 0.01)
Breath test (baseline = negative)           1.04      2.84   (0.69, 1.39)      1.04      2.84   (0.69, 1.39)
Hit and run (baseline = no)                 -0.67     0.51   (-1.76, 0.42)     1.07      2.9    (-0.81, 2.94)
Seatbelt (baseline = yes)                   1.42      4.14   (1.03, 1.81)      1.42      4.14   (1.03, 1.81)
UK licence (baseline = full licence)
  Provisional                               0.16      1.17   (-0.37, 0.68)     0.16      1.17   (-0.37, 0.68)
  Unlicensed                                -0.63     0.53   (-1.12, -0.14)    -0.63     0.53   (-1.12, -0.14)
Ethnicity (baseline = white)
  Asian                                     0.18      1.19   (-0.5, 0.85)      0.18      1.19   (-0.5, 0.85)
  Black                                     0.31      1.36   (-0.6, 1.22)      1.79      5.99   (0.58, 3)
  Mixed background                          0.42      1.52   (-0.43, 1.26)     0.42      1.52   (-0.43, 1.26)
  Oriental                                  -0.28     0.76   (-0.78, 0.23)     -0.28     0.76   (-0.78, 0.23)
Road condition (baseline = dry)
  Wet/damp                                  -0.03     0.97   (-0.24, 0.17)     -0.03     0.97   (-0.24, 0.17)
  Snow                                      -0.03     0.97   (-1.13, 1.07)     -0.03     0.97   (-1.13, 1.07)
  Frost/ice                                 -0.52     0.6    (-1.05, 0.02)     -0.52     0.6    (-1.05, 0.02)
  Flood                                     0.63      1.89   (-0.42, 1.69)     0.63      1.89   (-0.42, 1.69)
Visibility (baseline = daylight)
  Dark with streetlights                    0.31      1.36   (-0.02, 0.63)     -1.82     0.16   (-3.82, 0.17)
  Dark                                      0.33      1.39   (0.13, 0.53)      0.85      2.34   (0.42, 1.29)
Casualties                                  0.39      1.48   (0.32, 0.46)      0.39      1.48   (0.32, 0.46)
Road type (baseline = roundabout)
  One way street                            -0.12     0.89   (-1.44, 1.2)      -0.12     0.89   (-1.44, 1.2)
  Dual carriageway                          0         1      (-0.73, 0.73)     0         1      (-0.73, 0.73)
  Single carriageway                        0.59      1.81   (-0.11, 1.3)      0.59      1.81   (-0.11, 1.3)
  Slip road                                 -1.98     0.14   (-3.62, -0.33)    -1.98     0.14   (-3.62, -0.33)
Speed limit                                 0.02      1.02   (0.01, 0.03)      0.02      1.02   (0.01, 0.03)
Weather (baseline = fine, no high winds)
  Raining, no high winds                    -0.17     0.84   (-0.46, 0.11)     -0.17     0.84   (-0.46, 0.11)
  Snowing, no high winds                    0.24      1.27   (-0.72, 1.21)     0.24      1.27   (-0.72, 1.21)
  Fine + high winds                         0.33      1.39   (-0.2, 0.85)      0.33      1.39   (-0.2, 0.85)
  Raining + high winds                      -0.94     0.39   (-1.72, -0.16)    -0.94     0.39   (-1.72, -0.16)
  Snowing + high winds                      -0.17     0.85   (-1.9, 1.57)      -0.17     0.85   (-1.9, 1.57)
  Fog or mist                               -0.33     0.72   (-0.99, 0.33)     -0.33     0.72   (-0.99, 0.33)
Weekday (baseline = weekend)                0.1       1.1    (-0.09, 0.28)     0.1       1.1    (-0.09, 0.28)
OAPs                                        0.44      1.55   (0.3, 0.57)       0.67      1.95   (0.43, 0.9)
Pedestrians                                 1.25      3.51   (0.63, 1.88)      1.25      3.51   (0.63, 1.88)
Cycles                                      -0.36     0.7    (-0.92, 0.21)     -0.36     0.7    (-0.92, 0.21)
Constant                                    -4.54     0.01   (-5.44, -3.63)    -7.48     0      (-8.49, -6.48)

Baseline is the variable group of interest being treated as the reference value. Variables which met statistical significance set at p 0.1, p 0.05 and p 0.01.
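
To make the log odds in Table 1 easier to read, the sketch below converts a coefficient to an odds ratio and evaluates the form of Equation 2 for one illustrative driver. It rests on several assumptions: only a handful of rounded coefficients are used, the constants are entered as negative (consistent with the reported odds of 0.01 and 0), speed limit is assumed to enter in mph and the remaining variables as counts or 0/1 indicators, and all omitted variables are taken to be at their baseline. The function names are hypothetical, and the output is an illustration rather than a reproduction of the marginal effects in Supplementary 6.

```python
import math

# Rounded values from Table 1; index 1 is "severe or fatal" vs "slight",
# index 2 is "fatal" vs "slight or severe".
ALPHA = {1: -4.54, 2: -7.48}
BETA = {
    "speed_limit": {1: 0.02, 2: 0.02},
    "casualties": {1: 0.39, 2: 0.39},
    "no_seatbelt": {1: 1.42, 2: 1.42},
    "oaps": {1: 0.44, 2: 0.67},  # coefficient allowed to vary with j
}


def odds_ratio(log_odds):
    """Odds ratio implied by a log-odds coefficient."""
    return math.exp(log_odds)


def prob_exceeds(j, x):
    """P(Y > j) for covariate values x, following the form of Equation 2."""
    z = ALPHA[j] + sum(BETA[name][j] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-z))


def class_probabilities(x):
    """Slight/severe/fatal probabilities from the two cumulative ones.

    With coefficients that vary by j the middle term is not guaranteed
    to be positive for every x; it is positive for the example below.
    """
    p_gt1 = prob_exceeds(1, x)
    p_gt2 = prob_exceeds(2, x)
    return {"slight": 1 - p_gt1, "severe": p_gt1 - p_gt2, "fatal": p_gt2}


# Hypothetical driver: 60 mph limit, one casualty, no seatbelt, no OAPs.
example = {"speed_limit": 60, "casualties": 1, "no_seatbelt": 1, "oaps": 0}
probs = class_probabilities(example)

print(round(odds_ratio(1.42), 2))
print({k: round(v, 3) for k, v in probs.items()})
```

The odds ratio computed for the seatbelt coefficient matches the odds column of Table 1, which is a quick check that log odds and odds are related by the exponential.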

factors, inappropriate or excessive speed, not using seatbelts or child restraints, not using helmets on two-wheeled vehicles, insufficient crash protection, involvement of drugs or alcohol, roadside objects not crash protective. However, less research has been specifically conducted in high-income countries such as the UK to identify the importance of these risk factors leading to different severity levels of accidents. Using multivariate modelling methods, a recent paper assessed the impact of speed variation on severity of accidents, identifying that speed alone is not the only factor contributing to accident severity but acts in combination with several other factors such as weather, also identified in this current study (Choudhary, Imprialou, Velaga, & Choudhary, 2018).

What this study adds

Our study provides up to date analysis on a wide variety of risk factors, including those not previously clearly suggested in the literature relating to the severity of accidents such as possession of a UK licence and involvement of OAPs. By using two complementary methods, the strength of individual risk factors as well as the importance of a combination of risk factors which can be targeted was identified. Some of the notable risk factors for severity of accidents in this study have been demonstrated in the literature elsewhere. One of the most strongly associated risk factors, having a positive breath test, is a clear indicator as to the importance of alcohol impairing driving ability. Previous literature has indicated that the use of breath testing is still not necessarily routinely carried out at all RTAs within the UK (Rolison et al., 2018; Tunbridge & Harrison, 2017). Considering the substantially increased risk of serious injury and fatality associated with alcohol use, policy must certainly focus on educating drivers as to the dangers of drink driving (Tunbridge & Harrison, 2017). Not wearing a seatbelt, identified as an important risk factor in our study, has also been highlighted within the WHO report, and it is clear from evaluations conducted globally that encouraging seatbelt use is a good marker for reducing severity of RTAs (WHO, 2015a). Particularly interestingly, within our study we identified that unlicensed drivers were also a contributing factor to the severity of RTAs. A possible explanation for this is that unlicensed drivers are thought to engage in more high-risk driving behaviours and also have much decreased odds of utilizing safety features within vehicles such as seatbelts (Fu, Anderson, Dziura, Crowley, & Vaca, 2012). Within the UK, it is currently against the law to drive without a licence;

however, we hope this research highlights the significance of doing so as to further educate those who may consider engaging in such activities. Finally, another key result we identified was the negative impact of driving on unlit roads. The Cochrane collaboration has identified that street lighting is a low cost and cost-effective method for reducing RTAs (Beyer & Ker, 2009). However, more recent research indicates that amending street lighting alone is not an effective response, it must happen in conjunction with other policy measures to tackle other risk factors identified in our study and previous literature (Fotios & Price, 2017). In addition to this, this study has demonstrated the use of CART methodology in assessing risk factors related to a public health issue. Increasing the use of decision tree models in public health and epidemiological research can be useful in deepening our understanding of complex problems with multiple risk factors, as regression models may not always be adequately suited to these challenges (Lemon, Roy, Clark, Friedmann, & Rakowski, 2003; Venkatasubramaniam et al., 2017).

Limitations of the study

Although a sufficient number of records were collected, they only relate to Norfolk and Suffolk and therefore may not be generalizable to the rest of the country. Expansion of this work could be carried out if similar datasets are made available for other police force areas in England. The model can be tested on them to find whether it works well when applied to other geographical areas. Thus, a more generalizable model could be developed if the training set included observations from different areas. This method is not necessarily superior to traditional regression analysis; rather, it offers a different way of looking at the relationships by classifying them in a non-linear way. The main drawback specifically of the PC-GOLOGIT model is that it can estimate unusually high coefficients and negative p-values when a category has very few observations. This problem has occurred in the 2005–2009 model and has affected the estimates for several variables; for example, the second estimated coefficient for Ethnicity = Black is 5,870,000,000. However, the majority of estimates appear to be robust to the time sample. During the study period, we were unable to collect information on airbag use, which may independently affect the injury severity, as drivers in cars with airbags are less likely to be involved in severe or fatal accidents, given that they are involved in an accident. However, other variables were included, such as the driver’s age, which may implicitly be related to the likelihood of a car having an airbag, and it is important that future research attempts to include this as a variable of interest.

Conclusion

The purpose of this research is to identify potential risk factors or groups of risk factors which could translate to serious or fatal accidents. To our knowledge, this was the first study to use CART analysis in the application of predicting risk factors in RTAs. Through use of sophisticated statistical modelling, several risk factors have been identified which could be used to target policies to discourage drivers from acting in ways that increase the probability of being KSI if they are involved in an accident, as well as identify at-risk groups who could be targeted in peak demand. However, further work needs to be done across England to identify whether this model appropriately predicts risk of KSI in other settings.

Acknowledgements

We would like to thank Anindya Banerjee and Eddie Kane for comments on a draft of this article and Norfolk and Suffolk Constabulary for assisting and providing us with accident data.

Funding

This work was supported by funding from Norfolk and Suffolk constabulary. KR also received funding from ESRC for a doctoral studentship during which time this article was written.

Disclosure statement

The authors declare no competing interests.

ORCID

Joht Singh Chandan http://orcid.org/0000-0002-9561-5141

References

Antov, D., Banet, A., Barbier, C., Bellet, T., Bimpeh, Y., Boulanger, A., … Zavrides, N. (2010). European road users’ risk perception and mobility: The SARTRE 4 survey. Retrieved from https://ec.europa.eu/transport/road_safety/sites/roadsafety/files/pdf/projects_sources/sartre4_final_report.pdf
Bachani, A.M., Peden, M., Gururaj, G., Norton, R., & Hyder, A.A. (2017). Road traffic injuries. Injury Prevention and Environmental Health. The International Bank for Reconstruction and Development/the World Bank. Retrieved from https://doi.org/10.1596/978-1-4648-0522-6/CH3
Ball, K., Edwards, J.D., Ross, L.A., & McGwin, G., Jr. (2010). Cognitive training decreases motor vehicle collision involvement of older drivers. Journal of the American Geriatrics Society, 58(11), 2107–2113. doi:10.1111/j.1532-5415.2010.03138.x
Beyer, F.R., & Ker, K. (2009). Street lighting for preventing road traffic injuries. Cochrane Database of Systematic Reviews. Retrieved from https://doi.org/10.1002/14651858.CD004728.pub2
Choudhary, P., Imprialou, M., Velaga, N.R., & Choudhary, A. (2018). Impacts of speed variations on freeway crashes by severity and vehicle type. Accident Analysis & Prevention, 121, 213–222. doi:10.1016/j.aap.2018.09.015
Department for Transport. (2014). TAG UNIT A4.1: Social impact appraisal. Author. Retrieved from https://www.gov.uk/transport-analysis-guidance-webtag#a4-social-and-distributional-impacts
Department for Transport. (2017). Reported road casualties in Great Britain: Notes, definitions, symbols and conventions. Author, 1–6. Retrieved from https://www.gov.uk/transport-statistics-notes-
Department for Transport Statistics. (2012). A valuation of road accidents and casualties in Great Britain: Methodology note, 1993–1996. Retrieved from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/254720/rrcgb-valuation-methodology.pdf
Ernstberger, A., Joeris, A., Daigl, M., Kiss, M., Angerpointner, K., Nerlich, M., & Schmucker, U. (2015). Decrease of morbidity in road traffic accidents in a high income country – An analysis of 24,405 accidents in a 21 year period. Injury, 46, S135–S143. doi:10.1016/S0020-1383(15)30033-4
Fotios, S., & Price, T. (2017). Road lighting and accidents: Why lighting is not the only answer. Lighting Journal, 82(5), 22–26. Retrieved from http://eprints.whiterose.ac.uk/116229/
Fu, J., Anderson, C.L., Dziura, J.D., Crowley, M.J., & Vaca, F.E. (2012). Young unlicensed drivers and passenger safety restraint use in U.S. Fatal crashes: Concern for risk spillover effect? Paper presented at Annals of Advances in Automotive Medicine, Association for the Advancement of Automotive Medicine, Annual Scientific Conference, 56, 37–43. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/23169115
Jackson, L., & Cracknell, R. (2018). Road accident casualties in Britain and the world. London: House of Commons Library. Retrieved from https://researchbriefings.parliament.uk/ResearchBriefing/Summary/CBP-7615
Langford, J., & Koppel, S. (2006). Epidemiology of older driver crashes – Identifying older driver risk factors and exposure patterns. Transportation Research Part F: Traffic Psychology and Behaviour, 9(5), 309–321. doi:10.1016/j.trf.2006.03.005
Lemon, S.C., Roy, J., Clark, M.A., Friedmann, P.D., & Rakowski, W. (2003). Classification and regression tree analysis in public health: Methodological review and comparison with logistic regression. Annals of Behavioral Medicine, 26(3), 172–181. doi:10.1207/S15324796ABM2603_02
Long, J.S. (1997). Regression models for categorical and limited dependent variables. Sage Publications. Retrieved from https://uk.sagepub.com/en-gb/eur/regression-models-for-categorical-and-limited-dependent-variables/book6071
Morgan, J. (2014). Classification and regression tree analysis. Retrieved from https://www.bu.edu/sph/files/2014/05/MorganCART.pdf
Peden, M., Scurfield, R., Sleet, D., Mohan, D., Hyder, A.A., Jarawan, E., & Mathers, C. (2004). World report on road traffic injury prevention. Retrieved from http://www.who.int/violence_injury_prevention/publications/road_traffic/world_report/intro.pdf
Quinlan, J.R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann Publishers. Retrieved from https://books.google.co.uk/books?hl=en&lr=&id=b3ujBQAAQBAJ&oi=fnd&pg=PP1&dq=%5B5%5D+Ross+Quinlan.+C4.5:+Programs+for+Machine+Learning.+Morgan+Kaufmann+Publishers,+1993.&ots=sQ4vQLGoF5&sig=RrwZAxg4UU-I4DWfYyADv_vZX5A#v=onepage&q=%5B5%5D Ross Quinlan. C4.5%3A Programs for Machine Learning. Morgan Kaufmann Publishers%2C 1993.&f=false
Racioppi, F., Eriksson, L., Tingvall, C., & Villaveces, A. (2004). Preventing road traffic injury: A public health perspective for Europe. Copenhagen: World Health Organization. Retrieved from www.euro.who.int
Road Safety Observatory (2018). Seat belts – How effective? Retrieved June 9, 2019, from https://www.roadsafetyobservatory.com/HowEffective/vehicles/seat-belts
Rolison, J.J., Hanoch, Y., Wood, S., & Liu, P.-J. (2014). Risk-taking differences across the adult life span: A question of age and domain. The Journals of Gerontology: Series B, 69(6), 870–880. doi:10.1093/geronb/gbt081
Rolison, J.J., Regev, S., Moutari, S., & Feeney, A. (2018). What are the factors that contribute to road accidents? An assessment of law enforcement views, ordinary drivers’ opinions, and road accident records. Accident Analysis & Prevention, 115, 11–24. doi:10.1016/j.aap.2018.02.025
Sherafati, F., Homaie-Rad, E., Afkar, A., Gholampoor-Sigaroodi, R., & Sirusbakht, S. (2017). Risk factors of road traffic accidents associated mortality in Northern Iran; a single center experience utilizing Oaxaca blinder decomposition. Bulletin of Emergency and Trauma, 5(2), 116–121. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/28507999
Smith, A.P. (2016). A UK survey of driving behaviour, fatigue, risk taking and road traffic accidents. BMJ Open, 6(8), e011461. doi:10.1136/bmjopen-2016-011461
Tunbridge, R., & Harrison, K. (2017). Fifty years of the breathalyser – where now for drink driving? Retrieved from http://www.pacts.org.uk/wp-content/uploads/sites/2/129256_PACTS_50YearsBreathalyser_V5-1.pdf
Venkatasubramaniam, A., Wolfson, J., Mitchell, N., Barnes, T., JaKa, M., & French, S. (2017). Decision trees in epidemiological research. Emerging Themes in Epidemiology, 14(1), 11. doi:10.1186/s12982-017-0064-4
Williams, R. (2016). Understanding and interpreting generalized ordered logit models. The Journal of Mathematical Sociology, 40(1), 7–20. doi:10.1080/0022250X.2015.1112384
Williams, R., & Williams, R. (2006). Generalized ordered logit/partial proportional odds models for ordinal dependent variables. Stata Journal, 6(1), 58–82. Retrieved from https://econpapers.repec.org/article/tsjstataj/v_3a6_3ay_3a2006_3ai_3a1_3ap_3a58-82.htm
World Health Organization (WHO). (2015a). Global status report on road safety 2015. Injury Prevention. Retrieved from http://www.who.int/violence_injury_prevention/road_safety_status/2013/en/index.html
WHO. (2015b). Road traffic injuries: The facts. Global Status Report on Road Safety, 2015. Retrieved from http://www.who.int/news-room/fact-sheets/detail/road-traffic-injuries
