Flight Delay and Cancellation
ARTICLE INFO

Keywords: Strategic flight schedule; Delay prediction; Cancellation prediction; Machine learning; Schedule ranking

ABSTRACT

To mitigate air traffic demand-capacity imbalances, large European airports implement strategic flight schedules, where flights are assigned arrival/departure slots several months prior to execution. We propose a generic assessment of such strategic schedules using predictions about arrival/departure flight delays and cancellations. We demonstrate our approach for strategic flight schedules in the period 2013–2018 at London Heathrow Airport. Together with the development of dedicated strategic flight schedule optimization models, our proposed approach supports an integrated strategic flight schedule assessment, where schedules are evaluated with respect to flight delays and cancellations.
* Corresponding author.
E-mail address: [email protected] (M. Mitici).
https://doi.org/10.1016/j.jairtraman.2019.101737
Received 14 May 2019; Received in revised form 24 September 2019; Accepted 23 October 2019
Available online 6 November 2019
0969-6997/© 2019 Elsevier Ltd. All rights reserved.
M. Lambelho et al. Journal of Air Transport Management 82 (2020) 101737
delays or cancellations. The strategic flight schedules, currently obtained following optimization of the slot requests and the IATA guidelines up to 6 months prior to the day of the flight execution, give no indication of the associated potential flight delays and cancellations. In turn, the actual impact of the strategic schedules on the airport on-time performance is unknown at the moment of schedule generation. To address this, a methodology is needed to assess strategic flight schedules with respect to potential flight delays and cancellations and to provide airports with insights into potential performance bottlenecks. Such insights are particularly important to support the airport coordinators in developing strategic schedules that not only meet the IATA guidelines, but also enable smooth and robust air traffic that benefits airlines, airports and passengers.

In this paper we propose a machine learning-based approach to assess the impact of strategic, IATA guidelines-compliant flight schedules on the on-time performance at an airport. In particular, we propose classification algorithms to predict whether flights scheduled in the strategic phase (6 months prior to the day of the execution) are subject to arrival/departure delays and cancellations during execution. Using the obtained flight delay and cancellation results, we propose a generic methodology to rank the strategic schedules by comparing and contrasting the associated flight delay and cancellation predictions. This analysis provides a means to assess strategic schedules based on their predisposition to have flight delays and cancellations. We demonstrate our assessment methodology using 10 strategic flight schedules from 2013 to 2018 at London Heathrow Airport (LHR), which is one of the busiest airports in Europe. To the best of our knowledge, in this paper we address for the first time the assessment of strategic flight schedules with respect to potential flight delays and cancellations.

The main contribution of this paper is that it provides a generic assessment of strategic flight schedules, at the moment when they are generated, using KPIs derived from predictions on flight delays and cancellations. The generality of our proposed assessment relies on the fact that we use a relative performance comparison between the assessed strategic schedules, rather than assigning user-defined weights to the target KPIs. Thus, we assess, using a generic methodology, the robustness of strategic, IATA-compliant flight schedules with respect to delays and cancellations. Together with the development of dedicated optimization models that aim to satisfy airlines' requests for slots in the presence of airport capacity constraints, our approach provides the airport coordinators with an integrated assessment of the performance of the IATA-compliant airport slot allocation process.

The remainder of this paper is organized as follows. Section 2 discusses existing machine learning approaches for flight delay and cancellation predictions and their performance. Section 3 describes the flight schedules and the flight delay and cancellation data from LHR in the period 2013–2018. Section 4 presents our proposed machine learning approach for flight delay and cancellation classification. Section 5 describes a generic approach to assess strategic flight schedules based on KPIs that are derived in Section 4 using machine learning-based predictions for flight delay and cancellation. Section 6 discusses the implications of our results. Section 7 provides conclusions and outlines future research directions.

1.1. Related work

The analysis of flight delays has been extensively addressed in the literature. Mueller and Chatterji (2002), Wu (2014) and Tu et al. (2008) propose data-driven models to estimate flight delay distributions at non-European airports. Mueller and Chatterji (2002) determine flight delay statistics for 10 major US airports by analyzing historical flight data. Based on these statistics, departure and arrival delays have been modeled as a Poisson process and a normal distribution, respectively. Wu (2014) estimates the probability density function of departure and arrival delays at Beijing Capital International Airport using historical flight delay data and an optimal generalized extreme value model. Tu et al. (2008) propose a statistical model to estimate flight departure delay distributions and seasonal trends at Denver International Airport. The authors consider in their model a seasonal trend, daily propagation patterns and random residuals. Abdel-Aty et al. (2007) develop a frequency analysis to detect flight delay patterns at Orlando International Airport. To analyze the propagation of delays in a network of airports, Pyrgiotis et al. (2013) propose a network of queues, while Xu et al. (2005) make use of Bayesian networks.

In recent years, an increasing number of studies have analyzed flight delays using machine learning approaches (Sternberg et al., 2017). Several studies (Kim et al., 2016; Choi et al., 2017; Alonso and Loureiro, 2015; Klein et al., 2010; Rebollo and Balakrishnan, 2014; Chen and Li, 2019) consider a short prediction horizon of up to 1 day before the flight execution. Kim et al. (2016) classify delays at several US airports using recurrent neural networks and several weather-related features. The models achieve a classification accuracy of 0.874. Choi et al. (2017) employ random forests and weather-related features to classify flight delay with an accuracy of 0.828. Alonso and Loureiro (2015) develop a multi-class classification algorithm to predict flight departure delay at Porto airport, achieving an accuracy of 0.57. One of the most important features used for classification is the amount of delay experienced by the previous flight arrival. In Klein et al. (2010), airport delay predictions are determined using the weather conditions as explanatory feature. In Rebollo and Balakrishnan (2014), the authors develop prediction models for airport and network delays for a 2 h prediction horizon and achieve an average regression test error of 19%. The propagation of flight delays is analyzed in Chen and Li (2019) making use of a multi-label random forest classification algorithm. The authors achieve an accuracy of 0.8 or more for their prediction algorithms and show that departure and arrival delay have the main explanatory power. In contrast, for our case we cannot consider such features, as we assume strategic schedules, where flights are scheduled to arrive and depart on time.

Choi et al. (2016), Belcastro et al. (2016) and Horiguchi et al. (2017) propose machine learning approaches to classify flight delays with a prediction horizon of several days prior to the day of the flight execution. Choi et al. (2016) achieve an accuracy of 0.268 using weather forecasts available 5 days prior to the day of the flight execution. The authors employ random forests classifiers that are exclusively trained with weather-related features. Belcastro et al. (2016) propose a model to classify flights as being delayed exclusively as a result of unfavorable weather conditions. The authors use a balanced flight dataset, where a random under-sampling algorithm is used to decrease the number of delayed samples. The features considered are the scheduled departure/arrival time, the origin/destination airport and the weather conditions. The proposed random forests classifier obtains an accuracy of 0.858, with a recall of 0.869 with a 60 min delay threshold (a flight is considered to be delayed if it has a delay of 60 min or more relative to the scheduled arrival time). Horiguchi et al. (2017) consider a flight delay prediction horizon of 5 months before the day of the flight execution. An XGBoost classifier achieves an area under the ROC curve (AUC score) of 0.534 when predicting flight delay for 20 airports in Asia for a low-cost airline.

Several studies analyze flight cancellations at an airport. Sridhar et al. (2009) propose a neural network approach that aims at predicting the total aggregate number of flight cancellations. The accuracy of the predictions obtained is 0.79. Many studies also propose logit models to explain the influence of several variables on a flight being cancelled (Rupp and Holmes, 2006; Xiong and Hansen, 2009).

This paper expands this previous work on flight delay and cancellation prediction by developing machine learning classifiers to predict flight delays and cancellations with a 6-month prediction horizon at a large European airport. These flight delay and cancellation predictions are further used to assess strategic flight schedules on their impact on the airport's on-time performance.
Table 1
Example of strategic flight schedule with flights scheduled to arrive at/depart from LHR.
FlightID | Arrival/Departure | Time | Start Date | End Date | Days of the week | Terminal | Origin/Destination | Aircraft
KL1031 | Arr. | 1755 | April 01, 2013 | July 01, 2013 | 123456⋅ | 4 | AMS | 73W
BA830 | Dep. | 0930 | April 01, 2013 | June 20, 2013 | 12⋅⋅567 | 1 | DUB | 320
DL100 | Arr. | 0800 | April 01, 2013 | May 13, 2013 | 1⋅⋅45⋅7 | 3 | JFK | 764
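Each row of Table 1 stands for a series of flights rather than a single one. As an illustration only, the sketch below expands one row into the calendar dates on which the flight would operate. The function name `operating_dates` is hypothetical, and the reading of the "Days of the week" pattern (digit 1 is Monday, digit 7 is Sunday, a dot marks a non-operating day) is an assumption that is not stated in the table.

```python
from datetime import date, timedelta


def operating_dates(start, end, days_of_week):
    """Expand one schedule row into the dates on which the flight operates.

    days_of_week is a 7-character pattern such as "123456." in which a digit
    marks an operating weekday and any other character a non-operating one.
    ASSUMPTION: digit 1 denotes Monday and digit 7 denotes Sunday.
    """
    active = {int(ch) for ch in days_of_week if ch.isdigit()}
    dates = []
    day = start
    while day <= end:
        # isoweekday() returns 1 for Monday through 7 for Sunday.
        if day.isoweekday() in active:
            dates.append(day)
        day += timedelta(days=1)
    return dates


# Row KL1031 of Table 1: April 01, 2013 to July 01, 2013, pattern 123456.
kl1031 = operating_dates(date(2013, 4, 1), date(2013, 7, 1), "123456.")
```

Under the stated assumption, each resulting date corresponds to one scheduled flight for which a delay or cancellation label can later be predicted.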
Table 3
Feature selection for flight delay and cancellation classifiers with a 6-month prediction horizon.
Classifier | Features
Departure Delay | Airline^a, Terminal^a, Aircraft^a, Distance, Airport^c, Country^a, Seats, Year, Month^b, Hour^b, Day of year^b, Day of month^b, Day of week^b
Arrival Delay | Airline^a, Terminal^a, Aircraft^a, Distance, Airport^c, Country^a, Seats, Year, Month^b, Hour^b, Day of year^b, Day of month^b, Day of week^b
Flight Cancellation | Airline^a, Terminal^a, Aircraft^a, Distance, Airport^a, Country^a, Year, Hour^b, Day of year^b, Day of month^b, Day of week^b
a Feature preprocessed with the target encoding method.
b Feature transformed by trigonometric functions.
c Categorical feature encoded using geographic coordinates.

Table 4
Description of features used for flight delay and cancellation classification algorithms - 6 months prediction horizon. C = Categorical, N = Numerical, T = trigonometric transform function.
Features | Feature type | Feature description
Airline | C | airline operating the flight
Terminal | C | arrival/departure airport terminal assigned to a flight
Aircraft | C | aircraft type
Distance | N | distance between origin and destination airport (km)
Airport | C | origin/destination airport of the flight
Country | C | country of origin/destination airport
Seats | N | number of seats of the aircraft assigned to a flight
Year | N | scheduled year of flight arrival/departure
Month | T | scheduled month of flight arrival/departure
Hour | T | scheduled hour of the day of flight arrival/departure
Day of year | T | scheduled day of the year of flight arrival/departure
Day of month | T | scheduled day of the month of flight arrival/departure
Day of week | T | scheduled day of the week of flight arrival/departure
Arrival ATFM delay | N | daily average ATFM arrival delay (min)

Table 5
Example of encoding methods for feature Airline and classifier departure delay. Here we consider 3 airlines, i.e., TAP, KLM, BA, and a total of 6 flights.
Airline | Delayed | One-hot encoding | Ordinal encoding | Binary encoding | Target encoding
TAP | Yes | 100 | 1 | 00 | 0.5
KLM | No | 010 | 2 | 01 | 0
BA | Yes | 001 | 3 | 10 | 0.67
TAP | No | 100 | 1 | 00 | 0.5
BA | Yes | 001 | 3 | 10 | 0.67
BA | No | 001 | 3 | 10 | 0.67

used either since it assumes an unnecessary ordering within a feature. For example, an ordinal encoding of the airlines such as 1, 2, 3, … would mean that an airline encoded as 1 is more similar to an airline encoded as 2 than to an airline encoded as 8. Table 5 gives an example for each of the mentioned encoding methods: one-hot encoding uses strings of bits with only one high bit (1) and the rest low bits (0) for each airline type, ordinal encoding uses ordered integers for each airline type, and binary encoding uses binary strings of bits for each airline type. Lastly, the target encoding method (Micci-Barreca, 2001), taking the case of departure delay classifiers, encodes an airline type based on the probability that a flight from this airline is delayed (the target variable). As an example, in Table 5 there are a total of 6 flights and the airline BA is encoded as 2/3 = 0.67 since there are 2 BA flights delayed from a total of 3 BA flights.

The features Hour, Day of year and Month have been transformed by trigonometric functions to account for periodicity (Horiguchi et al., 2017). For example, for a specific hour of the day t, we use the trigonometric functions $\sin(2\pi t/24)$ and $\cos(2\pi t/24)$ to ensure a 24-hour periodicity. As a consequence, t = 24:00 and t = 1:00 become sequential hours. Similarly, we ensure a periodicity of 365 days for the feature Day of the year and a periodicity of 12 for the feature Month.

2.2. Flight delay and cancellation classification algorithms

In this Section we present three machine learning classification algorithms to classify flight delays and cancellations 6 months in advance of the day of the flight execution: LightGBM, multilayer perceptron (MLP) and random forests (RF). These three algorithms belong to different types of machine learning algorithms: gradient boosting decision trees, neural networks and random decision forests, respectively. We make use of three different classification algorithms to cross-check our results.

LightGBM (Ke et al., 2017) is a tree-based machine learning algorithm where ensembles of decision trees are trained in sequence by fitting negative gradients of the loss. LightGBM uses Gradient-based One-Side Sampling, which excludes data instances with small gradients, and Exclusive Feature Bundling, which bundles mutually exclusive variables, thus reducing the number of features. To estimate the hyperparameters that yield the best performance, we use the Python library hyperopt (Bergstra et al., 2013) to optimize the f1-score metric (Duda et al., 2012), i.e., the harmonic mean between precision and recall, by performing Bayesian optimization. Table 6 shows the hyperparameters of the LightGBM classifiers. The best performance is achieved with a high learning rate and with a relatively small number of decision trees.

Multilayer perceptron (MLP) (Hinton, 1990) is a feed-forward neural network that has consecutive layers with adaptive weights. The input vector of the MLP is normalized to $N(0, 1)$. The initialization of the weights follows a normal distribution $N(0, 0.01)$. To increase the stability of the neural network, all the hidden layers have batch normalization. Table 7 shows the hyperparameters of the MLP classifiers. All classifiers produced superior results when trained with two hidden layers.
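The target encoding of Table 5 and the trigonometric transform of the feature Hour described in this section can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the helper names `target_encode`, `cyclic_encode` and `circular_gap` are hypothetical, and the encoding is computed directly on the six example flights without the smoothing or out-of-fold estimation that a production pipeline would typically add.

```python
import math
from collections import defaultdict


def target_encode(categories, targets):
    """Map each category to the mean of its binary target (fraction delayed)."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cat, y in zip(categories, targets):
        totals[cat] += 1
        positives[cat] += y
    return {cat: positives[cat] / totals[cat] for cat in totals}


def cyclic_encode(value, period):
    """Return (sin, cos) of the angle 2*pi*value/period."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def circular_gap(a, b, period):
    """Euclidean distance between two encoded values on the unit circle."""
    sa, ca = cyclic_encode(a, period)
    sb, cb = cyclic_encode(b, period)
    return math.hypot(sa - sb, ca - cb)


# The six flights of Table 5: airline and whether the departure was delayed.
airlines = ["TAP", "KLM", "BA", "TAP", "BA", "BA"]
delayed = [1, 0, 1, 0, 1, 0]
encoding = target_encode(airlines, delayed)  # BA maps to 2/3, TAP to 0.5, KLM to 0
```

With the sine/cosine pair, `circular_gap(24, 1, 24)` equals `circular_gap(12, 13, 24)`, which is the sense in which 24:00 and 1:00 become sequential hours.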
Table 6
Hyperparameters of LightGBM flight delay and cancellation classifiers with a 6-month prediction horizon.
Classifier | Number input features | Learning rate | Max depth of tree | Trees | Subsample | Weight positive class
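The f1-score used as the tuning objective for the classifiers is the harmonic mean of precision and recall. The sketch below computes it from binary labels; the function name `f1_score` and the six example labels are illustrative assumptions, not data from the paper.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = delayed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical labels: 3 delayed flights, of which 2 are found, plus 1 false alarm.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)  # precision = recall = 2/3
```

Unlike accuracy, this metric ignores true negatives, which makes it a common choice when the positive class (delayed or cancelled flights) is the minority.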
Table 9 shows the performance of LightGBM, MLP and RF in classifying arrival/departure flights as delayed or cancelled. We note that the prediction horizon is 6 months prior to the day of the flight execution. A 5-fold cross validation is performed using the data on the flights arriving at and departing from LHR in the period 2013–2018. Among the 3

Table 8
Hyperparameters of RF flight delay and cancellation classifiers with a 6-month prediction horizon.
Classifier | Number input features | Number trees generated | Max depth of tree | Percentage features for each split
Dep. Delay | 18 | 500 | 11 | 0.60
Arr. Delay | 18 | 1000 | 12 | 0.55
Cancellation | 14 | 500 | 10 | 0.60

where F is the set of all features considered for the classification algorithm, S ⊆ F is a subset of features obtained from the set F except feature i, and f(S) is the expected classification output given by the set S of features.

The SHAP values show which features have a significant positive or negative impact on the delay/cancellation flight classification and what the magnitude of the impact is, i.e., how much a specific feature value drives the classification of a flight as delayed/cancelled. For a specific flight, a large positive (large negative) SHAP value of a feature indicates that this feature has a large contribution for the flight to be classified as delayed/cancelled (not delayed/cancelled). In this paper, the SHAP values are expressed in log odds, where the log odds of a variable A are defined as:
$$\log\left(\frac{P(A)}{1 - P(A)}\right), \quad \text{with } P(A) < 1.$$

Table 9
5-fold cross validation results for machine learning models with 6-month prediction horizon.
Classifier | LightGBM | MLP | RF

Fig. 4. ROC curves of flight cancellation classifiers.
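As a small illustration of the log-odds scale in which the SHAP values are reported, the sketch below converts between probabilities and log odds. The function names `log_odds` and `probability` are hypothetical. The code additionally requires P(A) > 0, an assumption needed for the logarithm to be defined.

```python
import math


def log_odds(p):
    """Log odds of an event with probability p, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def probability(z):
    """Inverse transform: probability corresponding to log odds z."""
    return 1.0 / (1.0 + math.exp(-z))


even = log_odds(0.5)       # equal chances give log odds of zero
shifted = probability(-1)  # a shift of -1 from even odds, roughly 0.27
```

Read this way, a feature with a SHAP value of −1 for a flight whose baseline is even odds would, taken alone, lower the delay probability from 0.5 to roughly 0.27.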
We note that, for a given feature, a SHAP value in log odds close to zero indicates that this feature does not help in classifying a flight as delayed/cancelled or not.

Table 10
Computational time for LightGBM, MLP and RF classifiers.
Classifier | LightGBM (sec) | MLP (sec) | RF (sec)

Figs. 5–7 are summary plots that show the SHAP values for all features for all flights considered for classification, i.e., these figures show an aggregation of dots, where each dot corresponds to a flight. For a given feature, each dot corresponds to a flight and an associated SHAP value. For a given feature, the color blue of a dot (flight) indicates that, for this flight, the value of the feature is small, while the color red indicates that the value of the feature is large. For example, in Fig. 5, for the feature Arrival ATFM Delay, the dots (flights) colored red have large Arrival ATFM delays, while the dots (flights) colored blue have small Arrival ATFM delays. For a given feature, an accumulation of dots indicates that there is a large number of flights that have similar SHAP values. As an example, in Fig. 5, for the feature Arrival ATFM Delay, there is a significant number of flights where this feature has a SHAP value between −1 and 0, i.e., an accumulation of blue dots corresponding to a SHAP value between −1 and 0. Again, the color blue indicates that all these flights have a low Arrival ATFM delay. The blue dots (flights) that have a negative SHAP value are those flights with a low Arrival ATFM delay (blue color) and that are classified as not delayed (negative SHAP). In particular, for the blue dots (flights) where the SHAP value is close to zero, the Arrival ATFM delay is low (blue color), but it does not significantly impact the classification of these
flights (SHAP value in log odds close to zero).

Fig. 5. SHAP values (log odds) of the features used for delayed departure flight classification using LightGBM - 6 months prediction horizon.

Fig. 6. SHAP summary plot (log odds) of the features used for delayed arrival flight classification using LightGBM - 6 months prediction horizon.

Fig. 7. SHAP values (log odds) of the features used for flight cancellation classification - LightGBM, 6 months prediction horizon.

In Figs. 5–7, the features are sorted by the sum of the SHAP value magnitudes over all samples, such that the feature at the top of the graph has the highest impact on the flight classification, whereas the feature at the bottom of the graph has the lowest impact. For example, in Fig. 5, the feature Arrival ATFM delay is at the top of the graph since it has the highest impact on the flight delay classification. The features Hour and Airline have the second and third largest impact on the flight delay classification. Similarly, Fig. 6 shows that the features Arrival ATFM Delay, Airline and Hour have the highest impact on the arrival flight delay classification. Both Figs. 5 and 6 show that the feature Seats also has a high importance for both arrival and departure delay classifiers. The feature Terminal has the lowest feature importance for departure flight delay classification. For the arrival flight delay classification, however, the feature Terminal has a larger importance, whereas the features Month and Day of the month have the lowest feature importance. Fig. 7 shows that the features origin/destination Airport and Airline have the highest feature importance for the flight cancellation classification algorithm. The feature Aircraft also has a high importance in the cancellation classifier when compared with the flight delay classifiers. The feature Day of the month shows the lowest feature importance for the cancellation classifier.

Figs. 5–7 also allow for a detailed analysis of the impact of each feature. For a given feature, the color red of a dot, i.e., flight, and a corresponding large SHAP value show that large values (red) of this feature are very significant (large SHAP value) for the classification. For example, Fig. 5 shows that for the feature Arrival ATFM Delay, there are several dots, i.e., flights, that are red and that have a positive and large SHAP value. This means that a large (red) value for the Arrival ATFM Delay is very significant (large SHAP value) and drives the classification of a departure flight as being delayed (positive SHAP value). In Fig. 6 there is a larger accumulation of blue dots (flights), with SHAP values between −1 and zero. The color blue indicates that these departure flights have low Arrival ATFM Delays. Here, the blue dots (flights) with negative SHAP values away from zero indicate that, for these flights, small (blue) Arrival ATFM Delays drive the classification of these departure flights as being not delayed (negative SHAP value). Also, the blue dots (flights) with negative SHAP values close to zero indicate that, for these departure flights, the small (blue) Arrival ATFM Delays do not significantly impact the classification of these flights. In Fig. 5, for the feature Arrival ATFM Delay, we also note that for the dots (flights) with SHAP values around zero, i.e., where the feature does not significantly drive the classification of a flight as being delayed or not delayed, the values of the Arrival ATFM Delay are low (blue color). Thus, low Arrival ATFM Delays have a low classification importance.

A similar analysis can be made for the other features in Figs. 5–7. We note that for the categorical features Airline, Country, Aircraft and Terminal, which are encoded using the target encoding method, high feature values mean high probabilities of delay. Here, it can be seen that, for these features, high values, i.e., high probabilities of delay, correspond to high SHAP values. For the features encoded with trigonometric functions (see also Section 4.1), we note that a detailed analysis of the summary plots is not straightforward as we apply sin and cos transformations. As such, for these features, we make use of the summary plots to determine their feature importance relative to the other features, as discussed above.
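The ordering rule used in the summary plots, i.e., sorting features by the sum of SHAP value magnitudes over all samples, can be sketched as follows. The function name `rank_features` is hypothetical, and the SHAP values below are made-up numbers for three flights, not results from the paper.

```python
def rank_features(shap_rows, feature_names):
    """Sort features by the sum of absolute SHAP values over all samples."""
    totals = [0.0] * len(feature_names)
    for row in shap_rows:  # one row of SHAP values per flight
        for j, value in enumerate(row):
            totals[j] += abs(value)
    order = sorted(range(len(feature_names)), key=lambda j: totals[j], reverse=True)
    return [(feature_names[j], totals[j]) for j in order]


# Hypothetical SHAP values (log odds) for three flights and three features.
names = ["Terminal", "Arrival ATFM delay", "Hour"]
shap_rows = [
    [0.0, -0.8, 0.3],
    [0.1, 1.2, -0.4],
    [-0.1, -0.5, 0.2],
]
ranking = rank_features(shap_rows, names)
```

Because magnitudes are summed, a feature that pushes some flights strongly towards "delayed" and others strongly towards "not delayed" still ranks high, even though its signed SHAP values partly cancel.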
The numerical examples in this section are based on strategic flight schedules from LHR in the period 2013–2018.

2.5. Generic schedule assessment methodology

In this Section we apply an assessment methodology for strategic flight schedules using a set of general KPIs. This assessment can be done both before and after the execution of the flights, as long as the values of the KPIs are known. When assessing the flight schedules, we make use of the notion of schedule domination, which we define below, rather than assuming user-defined weights for the KPIs considered. Thus, we propose a generic assessment methodology that does not depend on the weights of the KPIs, which are user-specific.

We characterize a strategic flight schedule i by a set of n KPIs, i.e., $S_i : (KPI^i_1, \ldots, KPI^i_n)$. We are interested in those schedules where the values of all n KPIs are minimal. To this end, we define the concept of schedule domination as follows. We say that schedule i, $S_i : (KPI^i_1, KPI^i_2, \ldots, KPI^i_n)$, dominates schedule j, $S_j : (KPI^j_1, KPI^j_2, \ldots, KPI^j_n)$, if $\forall u \in \{1, 2, \ldots, n\}$, $KPI^i_u \le KPI^j_u$ and there exists at least one KPI $K_l$, $l \in \{1, 2, \ldots, n\}$, such that $KPI^i_l < KPI^j_l$ (Boyd and Vandenberghe, 2004).

We consider the set $\mathcal{S} = \{S_1, \ldots, S_N\}$ of N schedules. The Pareto front of the schedules $i \in \mathcal{S}$, $1 \le i \le N$, with respect to the KPIs $KPI_1, \ldots, KPI_n$, is the subset $\mathcal{S}_1$ of schedules that are not dominated by any other schedule (Boyd and Vandenberghe, 2004), where $\mathcal{S}_1 \subset \mathcal{S}$. We say that layer 1, which we denote by $L_1$, consists of all the schedules in $\mathcal{S}_1$. We next partition the set of remaining schedules $\mathcal{S} \setminus \mathcal{S}_1$ into additional layers. We define layer 2, i.e., $L_2$, of the schedules $i \in \mathcal{S}$, $1 \le i \le N$, as the set of schedules that are in the Pareto front of the schedules $\mathcal{S} \setminus \mathcal{S}_1$, with $\mathcal{S} \setminus \mathcal{S}_1 \neq \emptyset$. In general, we define layer m, denoted by $L_m$, of the schedules $i \in \mathcal{S}$, $1 \le i \le N$, as the set of schedules in the Pareto front of the schedules $\mathcal{S} \setminus (\mathcal{S}_1 \cup \mathcal{S}_2 \cup \ldots \cup \mathcal{S}_{m-1})$, with $\mathcal{S} \setminus (\mathcal{S}_1 \cup \mathcal{S}_2 \cup \ldots \cup \mathcal{S}_{m-1}) \neq \emptyset$.

Fig. 8 shows an example of dominance relationships between 6 schedules $S_1, S_2, \ldots, S_6$. Schedules $S_1, S_2, S_3$ form layer 1. Schedules $S_4, S_5$ form layer 2. Schedule $S_6$ forms layer 3. Fig. 8 also shows the dominance boundaries for each schedule, i.e., the bounds of the set of points that a schedule dominates. All the schedules i, $1 \le i \le 6$, located within the dominance boundaries of a given schedule $j \neq i$, $1 \le j \le 6$, are dominated by schedule j. Here, schedules $S_1$, $S_6$ and $S_3$ do not dominate any other schedule, schedules $S_4$ and $S_5$ dominate schedule $S_6$, and schedule $S_2$ dominates schedules $S_4$, $S_5$ and $S_6$.

We next define the dominance power of a schedule $i \in \mathcal{S}$, $1 \le i \le N$, as introduced in Valkanas et al. (2014). We consider a total of m > 0 layers. We say that the dominance power of schedule i, $i \in \mathcal{S}$, which we denote by $D(S_i)$, is as follows:

$$D(S_i) = \sum_{\substack{k=1 \\ k \neq i}}^{N} \frac{1}{j}\, \mathbb{1}\left(S_i \text{ dominates } S_k,\ S_k \in L_j\right),$$

where $1 \le j \le m$ is the layer of schedule $S_k$, and $\mathbb{1}(A)$ is an indicator function that takes value 1 if A is true and zero otherwise.

As an example, in Fig. 8, $D(S_2) = \frac{1}{2} + \frac{1}{2} + \frac{1}{3}$ since $S_2$ dominates $S_4$ from layer 2, $S_5$ from layer 2 and $S_6$ from layer 3; $D(S_4) = D(S_5) = \frac{1}{3}$ because both $S_4$ and $S_5$ dominate $S_6$ from layer 3; and $D(S_1) = D(S_3) = D(S_6) = 0$ since $S_1, S_3, S_6$ do not dominate other schedules.

Lastly, we rank the strategic flight schedules $i \in \mathcal{S}$, $1 \le i \le N$, based on their dominance power. We are interested in those schedules with the highest dominance power. We assign a ranking position of 1 to the schedule(s) with the highest dominance power, a ranking position of 2 to the schedule(s) with the second highest dominance power, and so on.

2.6. Assessing strategic flight schedules - results

In this Section we assess 10 strategic flight schedules using the methodology introduced in Section 5.1. In doing so, we consider 5 KPIs, which are based on flight delay and cancellation predictions with a horizon of 6 months prior to the flight execution day (see Section 4): 1) the predicted percentage of flights cancelled, 2) the predicted percentage of departure delayed flights, 3) the predicted percentage of arrival flights delayed, 4) the predicted percentage of departure yellow days, 5)

Fig. 9. Layers of the strategic flight schedules when considering the percentage of predicted flights cancelled and the percentage of predicted delayed arrival flights.
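The layering and dominance-power ranking of the assessment methodology can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the weighting of each dominated schedule by one over its layer index follows the worked example for Fig. 8, and the KPI coordinates are invented solely so that the dominance relationships described for Fig. 8 hold.

```python
def dominates(a, b):
    """True if KPI vector a dominates b: no KPI worse, at least one strictly better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_layers(kpis):
    """Peel off successive Pareto fronts; returns {schedule: layer index from 1}."""
    remaining = dict(kpis)
    layer_of = {}
    layer = 1
    while remaining:
        front = [s for s in remaining
                 if not any(dominates(remaining[t], remaining[s])
                            for t in remaining if t != s)]
        for s in front:
            layer_of[s] = layer
            del remaining[s]
        layer += 1
    return layer_of


def dominance_power(kpis):
    """Sum 1/layer over every schedule that a given schedule dominates."""
    layer_of = pareto_layers(kpis)
    return {s: sum(1.0 / layer_of[t] for t in kpis
                   if t != s and dominates(kpis[s], kpis[t]))
            for s in kpis}


def rank(power):
    """Position 1 for the highest dominance power; ties share a position."""
    levels = sorted(set(power.values()), reverse=True)
    return {s: levels.index(p) + 1 for s, p in power.items()}


# Hypothetical two-KPI values (lower is better) mimicking the layout of Fig. 8.
kpis = {"S1": (1, 6), "S2": (2, 2), "S3": (6, 1),
        "S4": (3, 4), "S5": (4, 3), "S6": (5, 5)}
layers = pareto_layers(kpis)
power = dominance_power(kpis)
positions = rank(power)
```

No KPI weights appear anywhere in the sketch: the ranking depends only on pairwise comparisons between schedules, which is what makes the assessment independent of user preferences.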
Table 11
Dominance power of the 10 strategic schedules S1, …, S10, when considering the percentage of cancelled flights and the percentage of delayed arrival flights.
S3 S1 S4 S2 S6 S8 S10 S5 S7 S9
Table 12
Schedule ranking with respect to percentage of flights cancelled and percentage
of arrival delays under schedule Si .
Ranking position 1 2 3 4 5 6 7 8 9 10
Fig. 11. Layers of the strategic flight schedules when considering the percentage of predicted delayed departure and arrival flights.
Table 13
Dominance power of the 10 strategic schedules S1, …, S10, when considering the percentage of cancelled flights and the percentage of delayed departure flights.
S3 S1 S4 S6 S8 S7 S2 S10 S5 S9
Table 14
Schedule ranking with respect to percentage of flights cancelled and percentage of departure delays under schedule Si .
Ranking Position 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th
Table 15
Dominance power of the 10 strategic schedules S1, …, S10, when considering the percentage of delayed departure and arrival flights.
S3 S7 S9 S1 S5 S8 S6 S4 S10 S2
Table 16
Schedule ranking with respect to percentage of delayed departure and arrival flights under schedule Si .
Ranking position 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th
3. Discussion
cancellations. This insight can support the airport management in identifying mitigation actions for these performance bottlenecks, such as assigning more resources to a specific part of the airport.
A second implication is that, when airport performance bottlenecks are expected, airport coordinators are supported in proposing changes to the schedule, such as alternative arrival/departure time slots or alternative aircraft types for the flight execution, within the limits of the IATA slot allocation guidelines and following negotiations with the airlines that operate the flights associated with the performance bottlenecks. Thus, the results of this assessment provide a quantified motivation for potential schedule alternatives during the slot negotiation between airport coordinators and airlines.
Last, but not least, we note that when evaluating the strategic flight schedules using flight delay and cancellation-based KPIs, we do not assume user-specific weights. As such, our strategic schedule assessment is generic and can be applied to any large European airport where the slot allocation process is in place.

4. Conclusion

To support the slot allocation process at airports, in this paper we have developed a machine learning approach to evaluate the resulting flight schedules in terms of predicted flight delays and cancellations. Based on these predictions, we have developed a generic ranking of the strategic schedules. We have implemented our proposed approach for strategic flight schedules at London Heathrow Airport in the period 2013–2018.
In practice, our methodology can provide airport coordinators with insight into potential delays and cancellations associated with the strategic schedules. When on-time performance bottlenecks are identified, our approach provides the airport coordinators with quantitative support to motivate potential actions to mitigate flight delays, such as changes to the flight schedule. Together with the development of dedicated flight schedule optimization models, our approach supports an integrated strategic flight schedule assessment, where strategic flight schedules are evaluated with respect to on-time airport performance.
As future work, we consider extending the set of features for the prediction algorithms to improve the accuracy of the predictions. In addition, we will evaluate the impact of considering flight delay and cancellation predictions in the flight scheduling optimization models at the strategic phase.

References

Abdel-Aty, M., Lee, C., Bai, Y., Li, X., Michalak, M., 2007. Detecting periodic patterns of arrival delay. J. Air Transp. Manag. 13 (6), 355–361.
Alonso, H., Loureiro, A., 2015. Predicting flight departure delay at Porto airport: a preliminary study. In: Proceedings of the 7th International Joint Conference on Computational Intelligence (IJCCI), vol. 3. IEEE, pp. 93–98.
Belcastro, L., Marozzo, F., Talia, D., Trunfio, P., 2016. Using scalable data mining for predicting flight delays. ACM Trans. Intell. Syst. Technol. (TIST) 8 (1), 5.
Bergstra, J., Yamins, D., Cox, D.D., 2013. Hyperopt: A Python library for optimizing the hyperparameters of machine learning algorithms. In: Proceedings of the 12th Python in Science Conference. Citeseer, pp. 13–20.
Boyd, S., Vandenberghe, L., 2004. Convex Optimization. Cambridge University Press.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32.
Castelli, L., Pellegrini, P., Pesenti, R., 2012. Airport slot allocation in Europe: economic efficiency and fairness. Int. J. Revenue Manag. 6 (1–2), 28–44.
Chen, J., Li, M., 2019. Chained predictions of flight delay using machine learning. In: AIAA Scitech 2019 Forum, p. 1661.
Choi, S., Kim, Y.J., Briceno, S., Mavris, D., 2016. Prediction of weather-induced airline delays based on machine learning algorithms. In: Digital Avionics Systems Conference (DASC), 2016 IEEE/AIAA 35th. IEEE, pp. 1–6.
Choi, S., Kim, Y.J., Briceno, S., Mavris, D., 2017. Cost-sensitive prediction of airline delays using machine learning. In: IEEE/AIAA 36th Digital Avionics Systems Conference (DASC), pp. 1–8.
Corolli, L., Lulli, G., Ntaimo, L., 2014. The time slot allocation problem under uncertain capacity. Transp. Res. C Emerg. Technol. 46, 16–29.
Duda, R.O., Hart, P.E., Stork, D.G., 2012. Pattern Classification. John Wiley & Sons.
EUROCONTROL, 2017. Performance Review Report - an Assessment of Air Traffic Management in Europe during the Calendar Year 2017. Performance Review Commission.
Granitto, P.M., Furlanello, C., Biasioli, F., Gasperi, F., 2006. Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products. Chemometr. Intell. Lab. Syst. 83 (2), 83–90.
Guyon, I., Weston, J., Barnhill, S., Vapnik, V., 2002. Gene selection for cancer classification using support vector machines. Mach. Learn. 46 (1–3), 389–422.
Hinton, G.E., 1990. Connectionist learning procedures. In: Machine Learning, vol. III. Elsevier, pp. 555–610.
Horiguchi, Y., Baba, Y., Kashima, H., Suzuki, M., Kayahara, H., Maeno, J., 2017. Predicting fuel consumption and flight delays for low-cost airlines. In: Innovative Applications of Artificial Intelligence (IAAI) Conference, pp. 4686–4693.
Hossin, M., Sulaiman, M., 2015. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manag. Process 5 (2), 1.
International Air Transport Association, 2017. Worldwide Slot Guidelines, eighth ed. Montreal.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., Liu, T.Y., 2017. LightGBM: A highly efficient gradient boosting decision tree. In: Advances in Neural Information Processing Systems, pp. 3146–3154.
Kim, Y.J., Choi, S., Briceno, S., Mavris, D., 2016. A deep learning approach to flight delay prediction. In: Digital Avionics Systems Conference (DASC), 2016 IEEE/AIAA 35th. IEEE, pp. 1–6.
Kingma, D.P., Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Klein, A., Craun, C., Lee, R.S., 2010. Airport delay prediction using weather-impacted traffic index (WITI) model. In: 29th Digital Avionics Systems Conference. B. IEEE, 2.
Lundberg, S.M., Lee, S.-I., 2017. A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, pp. 4765–4774.
Marsland, S., 2011. Machine Learning: an Algorithmic Perspective. Chapman and Hall/CRC.
Micci-Barreca, D., 2001. A preprocessing scheme for high-cardinality categorical attributes in classification and prediction problems. ACM SIGKDD Explor. Newsl. 3 (1), 27–32.
Mueller, E., Chatterji, G., 2002. Analysis of aircraft arrival and departure delay characteristics. In: AIAA's Aircraft Technology, Integration, and Operations (ATIO) 2002 Technical Forum, vol. 5866.
Pellegrini, P., Bolić, T., Castelli, L., Pesenti Sosta, R., 2017. An effective model for the simultaneous optimisation of airport slot allocation. Transp. Res. E Logist. Transp. Rev. 99, 34–53.
Pyrgiotis, N., Malone, K.M., Odoni, A., 2013. Modelling delay propagation within an airport network. Transp. Res. C Emerg. Technol. 27, 60–75.
Rebollo, J.J., Balakrishnan, H., 2014. Characterization and prediction of air traffic delays. Transp. Res. C Emerg. Technol. 44, 231–241.
Ribeiro, N.A., Jacquillat, A., Antunes, A.P., Odoni, A.R., Pita, J.P., 2018. An optimization approach for airport slot allocation under IATA guidelines. Transp. Res. Part B Methodol. 112, 132–156.
Rupp, N.G., Holmes, G.M., 2006. An investigation into the determinants of flight cancellations. Economica 73 (292), 749–783.
Sridhar, B., Wang, Y., Klein, A., Jehlen, R., 2009. Modeling flight delays and cancellations at the national, regional and airport levels in the United States. In: 8th USA/Europe ATM R&D Seminar, Napa, California (USA).
Sternberg, A., Soares, J., Carvalho, D., Ogasawara, E., 2017. A review on flight delay prediction. arXiv preprint arXiv:1703.06118.
Tu, Y., Ball, M.O., Jank, W.S., 2008. Estimating flight departure delay distributions: a statistical approach with long-term trend and short-term pattern. J. Am. Stat. Assoc. 103 (481), 112–125.
Valkanas, G., Papadopoulos, A.N., Gunopulos, D., 2014. Skyline ranking a la IR. In: EDBT/ICDT Workshops, pp. 182–187.
Wu, Q., 2014. A stochastic characterization based data mining implementation for airport arrival and departure delay data. In: Applied Mechanics and Materials, vol. 668. Trans Tech Publ, pp. 1037–1040.
Xiong, J., Hansen, M., 2009. Value of flight cancellation and cancellation decision modeling: ground delay program postoperation study. Transp. Res. Rec.: J. Transp. Res. Board 2106, 83–89.
Xu, N., Donohue, G., Laskey, K.B., Chen, C.-H., 2005. Estimation of delay propagation in the national aviation system using Bayesian networks. In: 6th USA/Europe Air Traffic Management Research and Development Seminar. FAA and Eurocontrol, Baltimore, MD.
Zografos, K.G., Salouras, Y., Madas, M.A., 2012. Dealing with the efficient allocation of scarce resources at congested airports. Transp. Res. C Emerg. Technol. 21 (1), 244–256.
Zografos, K.G., Madas, M.A., Androutsopoulos, K.N., 2017. Increasing airport capacity utilisation through optimum slot scheduling: review of current developments and identification of future needs. J. Sched. 20 (1), 3–24.