
Journal of Sports Analytics 7 (2021) 77–97

DOI 10.3233/JSA-200462
IOS Press

Forecasting football matches by predicting match statistics
Edward Wheatcroft∗
London School of Economics and Political Science, Houghton Street, London, United Kingdom, WC2A 2AE

Abstract. This paper considers the use of observed and predicted match statistics as inputs to forecasts for the outcomes of
football matches. It is shown that, were it possible to know the match statistics in advance, highly informative forecasts of the
match outcome could be made. Whilst, in practice, match statistics are clearly never available prior to the match, this leads to a
simple philosophy. If match statistics can be predicted pre-match, and if those predictions are accurate enough, it follows that
informative match forecasts can be made. Two approaches to the prediction of match statistics are demonstrated: Generalised
Attacking Performance (GAP) ratings and a set of ratings based on the Bivariate Poisson model which are named Bivariate
Attacking (BA) ratings. It is shown that both approaches provide a suitable methodology for predicting match statistics in
advance and that they are informative enough to provide information beyond that reflected in the odds. A long term and
robust gambling profit is demonstrated when the forecasts are combined with two betting strategies.

Keywords: Probability forecasting, sports forecasting, football forecasting, football predictions, soccer predictions

∗ Corresponding author: Edward Wheatcroft, London School of Economics and Political Science, Houghton Street, London, United Kingdom, WC2A 2AE. E-mail: e.d.wheatcroft@lse.ac.uk.

ISSN 2215-020X © 2021 – The authors. Published by IOS Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (CC BY-NC 4.0).

1. Introduction

Quantitative analysis of sports is a rapidly growing discipline with participants, coaches, owners, as well as gamblers, increasingly recognising its potential in gaining an edge over their opponents. This has naturally led to a demand for information that might allow better decisions to be made. Association football (hereafter football) is the most popular sport globally and, although, historically, the use of quantitative analysis has lagged behind that of US sports, this is slowly changing. Gambling on football matches has also grown significantly in popularity in recent decades and this has contributed to an increased demand for informative quantitative analysis.

Today, in the most popular football leagues globally, a great deal of match data are collected. Data on the location and outcome of every match event can be purchased, whilst free data are available including match statistics such as the numbers of shots, corners and fouls by each team. This creates huge potential for those able to process the data in an informative way.

This paper focuses on probabilistic prediction of the outcomes of football matches, i.e. whether the match ends with a home win, a draw or an away win. A probabilistic forecast of such an event simply consists of estimated probabilities placed on each of the three possible outcomes. Statistical models can be used to incorporate information into probabilistic forecasts.

The basic philosophy of this paper is as follows. Suppose, somehow, that certain match statistics, such as the number of shots or corners achieved by each team, were available in advance of kickoff. In such a case, it would be reasonable to expect to be able to use this information to create informative forecasts and it is shown that this is the case. Obviously, in reality, this information would never be available in advance. However, if one can use statistics from past matches to predict the match statistics before the match begins, and those predictions are accurate enough, they can be used to create informative forecasts of the match outcome. The quality of the forecast is then dependent
both on the importance of the match statistic itself and the accuracy of the pre-match prediction of that statistic.

In this paper, observed and predicted match statistics are used as inputs to a simple statistical model to construct probabilistic forecasts of match outcomes. First, observed match statistics in the form of the number of shots on target, shots off target and corners, are used to build forecasts and are shown to be informative. The observed match statistics are then replaced with predicted statistics calculated using (i) Generalised Attacking Performance (GAP) Ratings, a system which uses past data to estimate the number of defined measures of attacking performance a team can be expected to achieve in a given match (Wheatcroft, 2020), and (ii) Bivariate Attacking (BA) ratings which are introduced here and are a slightly modified version of the Bivariate Poisson model which has demonstrated favourable results in comparison to other parametric approaches (Ley et al., 2019). Whilst, unsurprisingly, it is found that predicted match statistics are less informative than observed statistics, they can still provide useful information for the construction of the forecasts. It is shown that a robust profit can be made by constructing forecasts based on predicted match statistics and using them alongside two different betting strategies.

For much of the history of sports prediction, rating systems in a similar vein to the GAP rating system used in this paper have played a key role. Probably the most well known is the Elo rating system which was originally designed to produce rankings for chess players but has a long history in other sports (Elo et al., 1978). The Elo system assigns a rating to each player or team which, in combination with the rating of the opposition, is used to estimate the probability of each possible outcome. The ratings are updated after each game in which a player or team is involved. A weakness of the original Elo rating system is that it does not estimate the probability of a draw. As such, in sports such as football, in which draws are common, some additional methodology is required to estimate that probability.

Elo ratings are in widespread use in football and have been demonstrated to perform favourably with respect to other rating systems (Hvattum and Arntzen, 2010). Since 2018, Fifa has used an Elo rating system to produce its international football world rankings (Fifa, 2018). Elo ratings have also been applied to a wide range of other sports including, among others, Rugby League (Carbone et al., 2016) and video games (Suznjevic et al., 2015). The website fivethirtyeight.com produces probabilities for NFL (FiveThirtyEight, 2020a) and NBA (FiveThirtyEight, 2020b) based on Elo ratings. A limitation of the Elo rating system is that it does not account for the size of a win. This means that a team’s ranking after a match would be the same after either a narrow or convincing victory. Some authors have adapted the system to account for the margin of victory (see, for example, Lasek et al. (2013) and Sullivan and Cronin (2016)).

The original Elo rating system assigns a single rating to each participating team or player, reflecting its overall ability. This does not directly allow for a distinction between the performance of a team in its home or away matches. Typically, some adjustment to the estimated probabilities is made to account for home advantage. Other rating systems distinguish between home and away performances. One system that does this is the pi-rating system in which a separate home and away rating is assigned to each team (Constantinou and Fenton, 2013). The pi-rating system also takes into account the winning margin of each team, but this is tapered such that the impact of additional goals on top of already large winning margins is lower than that of goals in close matches.

The GAP rating system, introduced in Wheatcroft (2020) and used in this paper, differs from both the Elo rating and the pi-rating systems in that, rather than producing a single rating, each team is assigned a separate attacking and defensive rating both for its home and away matches. This results in a total of 4 ratings per team. The approach of assigning attacking and defensive ratings has been taken by a large number of authors. An early example is Maher (1982) who assigned fixed ratings to each team and combined them with a Poisson model to estimate the number of goals scored. They did not use their ratings to estimate match probabilities but Dixon and Coles (1997) did so using a similar approach. Combined with a value betting strategy, they were able to demonstrate a significant profit for matches with a large discrepancy between the estimated probabilities and the probabilities implied by the odds. Dixon and Pope (2004) modified the Dixon and Coles model and were able to demonstrate a profit using a wider range of published bookmaker odds. Rue and Salvesen (2000) defined a Bayesian model for attacking and defensive ratings, allowing them to vary over time. Other examples of systems that use attacking and defensive ratings can be found in Karlis and Ntzoufras (2003), Lee (1997) and Baker and McHale (2015). Ley et al. (2019) compared ten different parametric models
(with the parameters estimated using maximum likelihood) and found the Bivariate Poisson model to give the most favourable results. Koopman and Lit (2015) used a Bivariate Poisson model alongside a Bayesian approach to demonstrate a profitable betting strategy.

The use of rating systems naturally leads to the question of how to translate them into probabilistic forecasts. One of two approaches is generally taken. The first is to model the number of goals scored by each team using Poisson or Negative Binomial regression with the ratings of each team used as predictor variables. These are then used to estimate match probabilities. The second approach is to predict the probability of each match outcome directly using methods such as logistic regression. There is little evidence to suggest a major difference in the performance of the two approaches (Goddard, 2005). In this paper, the latter approach is taken, specifically in the form of ordinal logistic regression.

The idea that match statistics might be more informative than goals in terms of making match predictions has become more widespread in recent years. The rationale behind this view is that, since it is difficult to score a goal and luck often plays an important role, the number of goals scored by each team might be a poor indicator of the events of the match. It was shown by Wheatcroft (2020) that, in the over/under 2.5 goals market, the number of shots and corners provide a better basis for probabilistic forecasting than goals themselves. Related to this is the concept of ‘expected goals’ which is playing a more and more important role in football analysis. The idea is that the quality of a shot can be measured in terms of its likelihood of success. The expected goals from a particular shot corresponds to the number of goals one would ‘expect’ to score by taking that shot. The number of expected goals by each team in a match then gives an indication of how the match played out in terms of efforts at goal. Several academic papers have focused on the construction of expected goals models that take into account the location and nature of a shot (Eggels, 2016; Rathke, 2017).

This paper is organised as follows. In section 2, background information is given on betting odds and the data set used in this paper. The Bivariate Poisson model, which is used for comparison purposes in the results section and forms the basis of the Bivariate Attacking (BA) rating system is also described. In section 3, the GAP and BA rating systems are described along with the approach used for constructing forecasts of match outcomes. The two betting strategies used in the results section are also described. In section 4, the accuracy of predicted match statistics in terms of how close they get to observed statistics under the GAP and BA rating systems is compared. Match forecasts formed using different combinations of observed and predicted statistics are then compared using model selection techniques. Next, the performance of forecasts formed using combinations of predicted statistics is compared. Finally, the profitability of two betting strategies is compared when used alongside forecasts formed using different combinations of predicted match statistics. Section 6 is used for discussion.

2. Background

2.1. Betting odds

In this paper, betting odds are used both as potential inputs to models and as a tool with which to demonstrate profit making opportunities. Decimal, or ‘European Style’, betting odds are considered throughout. Decimal odds simply represent the number by which the gambler’s stake is multiplied in the event of success. For example, if the decimal odds are 2, a £10 bet on said event would result in a return of 2 × £10 = £20.

Another useful concept is that of the ‘odds implied’ probability. Let the odds for the i-th outcome of an event be $O_i$. The odds implied probability is simply defined as the multiplicative inverse, i.e. $r_i = 1/O_i$. For example, if the odds on two possible outcomes of an event (e.g. home or away win) are $O_1 = 3$ and $O_2 = 1.4$, the odds implied probabilities are $r_1 = 1/3 \approx 0.33$ and $r_2 = 1/1.4 \approx 0.71$. Note how, in this case, $r_1$ and $r_2$ add to more than one. This is because, whilst, conventionally, probabilities over a set of exhaustive events should add to one, this need not be the case for odds implied probabilities. In fact, usually, the sum of odds implied probabilities for an event will exceed one. The excess represents the bookmaker’s profit margin or the ‘overround’ which is formally defined as

$$ \pi = \left( \sum_{i=1}^{m} \frac{1}{O_i} \right) - 1. \tag{1} $$

Generally, the larger the overround, the more difficult it is for a gambler to make a profit since the return from a winning bet is reduced.
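
The definitions above can be illustrated with a short Python sketch. It computes the odds implied probabilities $r_i = 1/O_i$ and the overround of equation (1) for the two-outcome example in the text. The function names are illustrative choices and are not taken from the paper.

```python
def implied_probabilities(decimal_odds):
    """Return the odds implied probability r_i = 1 / O_i for each outcome."""
    if any(o <= 0 for o in decimal_odds):
        raise ValueError("decimal odds must be positive")
    return [1.0 / o for o in decimal_odds]


def overround(decimal_odds):
    """Equation (1): sum of the odds implied probabilities minus one."""
    return sum(implied_probabilities(decimal_odds)) - 1.0


# Example from the text: O_1 = 3 and O_2 = 1.4.
odds = [3.0, 1.4]
probs = implied_probabilities(odds)
margin = overround(odds)
print([round(p, 2) for p in probs], round(margin, 3))
```

For these odds the implied probabilities sum to slightly more than one, so the overround is positive.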
Table 1
Data used in this paper

League                    No. matches  Match data available  Excluding burn-in
Belgian Jupiler League           5090                   480                384
English Premier League           9120                  7220               5759
English Championship            13248                 10484               8641
English League One              13223                 10460               8608
English League Two              13223                 10459               8613
English National League          7040                  5352               4642
French Ligue 1                   8718                  4907               4126
French Ligue 2                   7220                   760                639
German Bundesliga                7316                  5480               3502
German 2.Bundesliga              5670                  1057                753
Greek Super League               6470                   477                381
Italian Serie A                  8424                  5275               4439
Italian Serie B                  8502                   803                680
Netherlands Eredivisie           5814                   612                504
Portuguese Primeira Liga         5286                   612                504
Scottish Premier League          5208                  4305               3427
Scottish Championship            3334                   524                297
Scottish League One              3335                   527                298
Scottish League Two              3328                   525                297
Spanish Primera Liga             8330                  5290               4449
Spanish Segunda Division         8757                   903                771
Turkish Super Lig                5779                   612                504
Total                          162435                 77124              62218

2.2. Data

This paper makes use of the large repository of data available at www.football-data.co.uk, which supplies free match-by-match data for 22 European Leagues. For each match, statistics are given including, among others, the number of shots, shots on target, corners, fouls and yellow cards. Odds data from multiple bookmakers are also given for the match outcome market, the over/under 2.5 goal market and the Asian Handicap match outcome market. For some leagues, match statistics are available from the 2000/2001 season onwards. For others, these are available for later seasons. Therefore, since the focus of this paper is forecasting using match statistics, only matches from the 2000/2001 season onwards are considered. The data used in this paper are summarised in Table 1 in which, for each league, the total number of matches since 2000/2001, the number of matches in which shots and corner data are available and the number of these excluding a ‘burn-in’ period for each season are shown. The ‘burn-in’ period is explained in more detail in section 4.1; in short, it omits the first six matches of the season played by the home team. All leagues include data up to and including the end of the 2018/19 season.

2.3. Bivariate Poisson model

Poisson models are forecasting models that use the Poisson distribution to model the number of goals scored by each team in a football match. Whilst many variants of the Poisson model have been proposed, in this paper, we consider the Bivariate Poisson model proposed by Ley et al. (2019), who compared it with nine other models and found it to achieve the most favourable forecast performance (according to the ranked probability score).

The aim of a Poisson model is to estimate the Poisson parameter for each team, which can then be used to determine a forecast probability for each outcome of a match. Whilst Poisson models typically make the assumption that the number of goals scored by each team in a match is independent, there is some evidence that this is not the case. The Bivariate Poisson includes an additional parameter that removes this assumption.

In the context of this paper, the Bivariate Poisson model has two purposes. Firstly, since it has been shown to perform favourably with respect to a number of other models, it provides a powerful benchmark for comparison in section 5.3. Secondly, it provides the basis for the Bivariate Attacking (BA) rating system described in section 3.1.2.
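
As a rough illustration of how a pair of Poisson parameters translates into a forecast for the three match outcomes, the sketch below assumes independent Poisson goal counts, which is the simplifying assumption that the Bivariate Poisson model relaxes. The function names, the truncation at 20 goals and the example parameter values are illustrative assumptions, not part of the paper.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_home, lam_away, max_goals=20):
    """Home win, draw and away win probabilities under independent Poisson goals.

    The score grid is truncated at max_goals per team (an assumption of this
    sketch) and the three totals are renormalised to sum to one.
    """
    p_home = p_draw = p_away = 0.0
    for a in range(max_goals + 1):
        pa = poisson_pmf(a, lam_home)
        for b in range(max_goals + 1):
            p = pa * poisson_pmf(b, lam_away)
            if a > b:
                p_home += p
            elif a == b:
                p_draw += p
            else:
                p_away += p
    total = p_home + p_draw + p_away
    return p_home / total, p_draw / total, p_away / total


# Hypothetical Poisson parameters for the home and away team.
p_h, p_d, p_a = outcome_probabilities(1.5, 1.1)
print(round(p_h, 3), round(p_d, 3), round(p_a, 3))
```

Summing over a score grid is used here only because it is easy to follow; it does not reproduce the dependency parameter of the Bivariate Poisson model.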
Let $G_{i,m}$ and $G_{j,m}$ be random variables for the number of goals scored in the m-th match by teams i and j, respectively, where team i is at home and team j is away. In a match between the two teams, a Poisson model can be written as

$$ P(G_{i,m} = \alpha, G_{j,m} = \beta) = \frac{\lambda_{i,m}^{\alpha} \exp(-\lambda_{i,m})}{\alpha!} \cdot \frac{\lambda_{j,m}^{\beta} \exp(-\lambda_{j,m})}{\beta!}, \tag{2} $$

where $\lambda_{i,m}$ and $\lambda_{j,m}$ are the means of $G_{i,m}$ and $G_{j,m}$, respectively.

The Bivariate Poisson model is an extension of another model, also described by Ley et al. (2019), called the Independent Poisson model and it is useful to define this first. The Independent Poisson Model parametrises the Poisson parameters for a home team i against an away team j as $\lambda_{i,m} = \exp(c + (r_i + h) - r_j)$ and $\lambda_{j,m} = \exp(c + r_j - (r_i + h))$, respectively, where c is a constant parameter, h is a home advantage parameter and $r_1, ..., r_T$ are strength parameters for each team.

The Bivariate Poisson model closely resembles the independent model but introduces an extra parameter to account for potential dependency between the number of goals scored by each team. Under the Bivariate Poisson model, the joint distribution for the number of goals in a match between teams i and j is given by

$$ P(G_{i,m} = \alpha, G_{j,m} = \beta) = \frac{\lambda_{i,m}^{\alpha} \lambda_{j,m}^{\beta}}{\alpha! \, \beta!} \exp(-(\lambda_{i,m} + \lambda_{j,m} + \lambda_c)) \sum_{k=0}^{\min(\alpha,\beta)} \binom{\alpha}{k} \binom{\beta}{k} k! \left( \frac{\lambda_c}{\lambda_{i,m} \lambda_{j,m}} \right)^{k} \tag{3} $$

where $\lambda_c$ is a parameter that introduces a dependency in the number of goals scored by each team and $\lambda_{i,m}$ and $\lambda_{j,m}$ are parametrised in the same way as the Independent Poisson model. For the Bivariate Poisson model, the Poisson parameter for the home and away team is $\lambda_c + \lambda_{i,m}$ and $\lambda_c + \lambda_{j,m}$, respectively.

Both the Independent and Bivariate Poisson models are parametric models in which the parameters are estimated using maximum likelihood. However, in both cases, a slight adjustment is made to the likelihood function such that matches that happened more recently are given more weight than those that happened longer ago. To do this, the weight placed on match m is given by

$$ w_{time,m}(x_m) = \left( \frac{1}{2} \right)^{x_m / H}, \tag{4} $$

where $x_m$ is the number of days since the match was played and H is the half life (e.g. if the half life is two years, a match played two years ago receives half the weight of a match played today). The adjusted likelihood to be maximised is then given by

$$ L = \prod_{m=1}^{M} P(G_{h_m,m} = \alpha_m, G_{a_m,m} = \beta_m)^{w_{time,m}(x_m)} \tag{5} $$

where, for the m-th match, $\alpha_m$ denotes the number of goals scored by the home team $h_m$, and $\beta_m$ the number scored by the away team $a_m$.

Performing maximum likelihood estimation with a large number of parameters is, in general, difficult and there is a risk of falling into local optima. We follow the approach used by Ley et al. (2019) who use the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, a quasi-Newton method known for its robust properties, implemented with the ‘fmincon’ function in Matlab. Strictly positive parameters are initialised at one and each of the other parameters is initialised at zero. The sum of the team ratings $r_1, ..., r_T$ is constrained to zero.

A convenient property of the Poisson model is that the difference between two independent Poisson random variables follows a Skellam distribution and therefore match outcome probabilities can be estimated from the Poisson parameters for each team. For more details, see Karlis and Ntzoufras (2009).

3. Methodology

3.1. Ratings systems

In this paper, two different approaches are used to produce predictions for the number of goals, shots on target, shots off target and corners achieved by each team in a given football match. Each approach is described below.

3.1.1. GAP ratings

The Generalised Attacking Performance (GAP) rating system, introduced by Wheatcroft (2020), is a rating system for assessing the attacking and defensive strength of a sports team with relation to a
particular measure of attacking performance such as the number of shots or corners in football. For a particular given measure of attacking performance, each team in a league is given an attacking and a defensive rating, both for its home and away matches. An attacking GAP rating can be interpreted as an estimate of the number of defined attacking plays the team can be expected to achieve against an average team in the league, whilst its defensive rating can be interpreted as an estimate of the number of attacking plays it can be expected to concede against an average team. The ratings for each team are updated each time it plays a match. The GAP ratings of the i-th team in a league who have played k matches are denoted as follows:

• $H^{a}_{i,k}$ - Home attacking GAP rating of the i-th team in a league after k matches.
• $H^{d}_{i,k}$ - Home defensive GAP rating of the i-th team in a league after k matches.
• $A^{a}_{i,k}$ - Away attacking GAP rating of the i-th team in a league after k matches.
• $A^{d}_{i,k}$ - Away defensive GAP rating of the i-th team in a league after k matches.

The ratings are updated as follows. Consider a match in which the i-th team in the league is at home to the j-th team. The i-th team have played $k_1$ previous matches and the j-th team $k_2$. Let $S_{i,k_1}$ and $S_{j,k_2}$ be the number of defined attacking plays by teams i and j in the match (note in many cases, both teams will have played the same number of matches and $k_1$ and $k_2$ will be equal). The GAP ratings for the i-th team (the home team) are updated in the following way

$$
\begin{aligned}
H^{a}_{i,k_1+1} &= \max\left(H^{a}_{i,k_1} + \lambda \phi_1 \left(S_{i,k_1} - (H^{a}_{i,k_1} + A^{d}_{j,k_2})/2\right),\, 0\right), \\
A^{a}_{i,k_1+1} &= \max\left(A^{a}_{i,k_1} + \lambda (1 - \phi_1) \left(S_{i,k_1} - (H^{a}_{i,k_1} + A^{d}_{j,k_2})/2\right),\, 0\right), \\
H^{d}_{i,k_1+1} &= \max\left(H^{d}_{i,k_1} + \lambda \phi_1 \left(S_{j,k_2} - (A^{a}_{j,k_2} + H^{d}_{i,k_1})/2\right),\, 0\right), \\
A^{d}_{i,k_1+1} &= \max\left(A^{d}_{i,k_1} + \lambda (1 - \phi_1) \left(S_{j,k_2} - (A^{a}_{j,k_2} + H^{d}_{i,k_1})/2\right),\, 0\right).
\end{aligned}
\tag{6}
$$

The GAP ratings for the j-th team (the away team) are updated as follows:

$$
\begin{aligned}
A^{a}_{j,k_2+1} &= \max\left(A^{a}_{j,k_2} + \lambda \phi_2 \left(S_{j,k_2} - (A^{a}_{j,k_2} + H^{d}_{i,k_1})/2\right),\, 0\right), \\
H^{a}_{j,k_2+1} &= \max\left(H^{a}_{j,k_2} + \lambda (1 - \phi_2) \left(S_{j,k_2} - (A^{a}_{j,k_2} + H^{d}_{i,k_1})/2\right),\, 0\right), \\
A^{d}_{j,k_2+1} &= \max\left(A^{d}_{j,k_2} + \lambda \phi_2 \left(S_{i,k_1} - (H^{a}_{i,k_1} + A^{d}_{j,k_2})/2\right),\, 0\right), \\
H^{d}_{j,k_2+1} &= \max\left(H^{d}_{j,k_2} + \lambda (1 - \phi_2) \left(S_{i,k_1} - (H^{a}_{i,k_1} + A^{d}_{j,k_2})/2\right),\, 0\right),
\end{aligned}
\tag{7}
$$

where $\lambda > 0$, $0 < \phi_1 < 1$ and $0 < \phi_2 < 1$ are parameters to be estimated. Here, $\lambda$ determines the overall influence of a match on the ratings of each team. The parameter $\phi_1$ governs how the adjustments are spread over the home and away ratings of the i-th team (the home team), whilst $\phi_2$ governs how the adjustments are spread over the home and away ratings of the j-th team (the away team). After any given match, a home team is said to have outperformed expectations in an attacking sense if its attacking performance is higher than the mean of its attacking rating and the opposition’s defensive rating. In this case, its home attacking rating is increased (or decreased, if its attacking performance is lower than expected). If the parameter $\phi_1 > 0$, a team’s away ratings will be impacted by a home match, whilst a team’s home ratings will be impacted by an away match if $\phi_2 > 0$.

In this paper, GAP ratings are used to estimate the attacking performance of each team. For a match involving the i-th team at home to the j-th team, where the teams have played $k_1$ and $k_2$ previous matches in that season, respectively, the predicted numbers of defined attacking plays for the home and away teams are given by

$$ \hat{S}_h = \frac{H^{a}_{i,k_1} + A^{d}_{j,k_2}}{2}, \qquad \hat{S}_a = \frac{A^{a}_{j,k_2} + H^{d}_{i,k_1}}{2}. \tag{8} $$

The predicted number of attacking plays by the home team is therefore the average of the home team’s home attacking rating and the away team’s away defensive rating whilst the predicted number of attacking plays by the away team is given by the average of the away team’s away attacking rating and the home team’s home defensive rating. The predicted difference in the number of defined attacking plays made by the two teams is given by $\hat{S}_h - \hat{S}_a$ and it is this quantity that is of interest in the match prediction model later in this paper.

GAP ratings are determined by three parameters which are estimated by minimising the mean absolute error between the estimated number of attacking plays and the observed number. The function to be minimised is therefore

$$ f(\lambda, \phi_1, \phi_2) = \frac{1}{N} \sum_{m=1}^{N} \left( |S_{h,m} - \hat{S}_{h,m}| + |S_{a,m} - \hat{S}_{a,m}| \right) \tag{9} $$

where, for the m-th match, $S_{h,m}$ and $S_{a,m}$ are the observed numbers of attacking plays for the home and away team, respectively, and $\hat{S}_{h,m}$ and $\hat{S}_{a,m}$ are the predicted numbers from the GAP rating system.

In this paper, optimisation is performed using the fminsearch function in Matlab which implements the
Nelder-Mead simplex algorithm. The small number of parameters required to be optimised makes the risk of falling into local minima small.

Note that the approach to parameter estimation in this paper, in which the parameters are based purely on the prediction accuracy of the GAP ratings with relation to the observed match statistics, differs from the approach taken in Wheatcroft (2020), in which the parameters are optimised with respect to the performance of the probabilistic forecasts for which the ratings are predictor variables (in that paper, the forecasts predict the probability that the total number of goals will exceed 2.5). Whilst a similar approach could be taken here, our chosen approach is selected to simplify the forecasting process and allow us to use as predictor variables GAP ratings based on multiple measures of attacking performance. For example, this allows for both predicted shots on target and predicted corners to be used as predictor variables without requiring simultaneous optimisation of the GAP rating parameters.

3.1.2. Bivariate attacking ratings

We present an alternative approach to the GAP rating system for predicting match statistics which we call the Bivariate Attacking (BA) rating system. The approach is similar to the Bivariate Poisson model described in section 2.3 but differs in a number of ways. Firstly, whilst the Bivariate Poisson model is typically used to model the number of goals scored by each team, it is just as straightforward to extend this to match statistics of attacking performance such as shots and corners and this is the approach taken here. The second adjustment is the cost function used to select the parameters. Whilst the Bivariate Poisson model defined by Ley et al. (2019) uses maximum likelihood estimation, here we aim to minimise the mean absolute error (MAE) between the estimated number of defined match statistics and the observed number. This is done because the predicted number of shots or corners cannot directly be used to model the match outcome. The aim is therefore to make deterministic predictions of a chosen match statistic and use this as an input to a statistical model of the match outcome. The MAE loss function also has the added advantage that it is relatively robust with respect to outliers.

Similarly to the Bivariate Poisson model, let c be a constant parameter, h a home advantage parameter, $r_1, ..., r_T$ strength parameters for each team and $\lambda_c$ a parameter that determines the dependency between the number of defined attacking plays by each team. For a match in which team i is at home against team j, the estimated number of defined attacking plays for the home team in match m is given by $\hat{S}_{h,m} = \lambda_c + \exp(c + (r_i + h) - r_j)$ and for the away team $\hat{S}_{a,m} = \lambda_c + \exp(c + r_j - (r_i + h))$. The function to be minimised is

$$ \mathrm{MAE} = \frac{1}{M} \sum_{m=1}^{M} w_{time,m}(x_m) \left( |S_{h,m} - \hat{S}_{h,m}| + |S_{a,m} - \hat{S}_{a,m}| \right), \tag{10} $$

where M is the number of matches over which the parameters are optimised, $S_{h,m}$ and $\hat{S}_{h,m}$ are the observed and predicted numbers of attacking plays for the home team in the m-th match and $S_{a,m}$ and $\hat{S}_{a,m}$ are the same but for the away team. The inclusion of $w_{time,m}(x_m)$, defined in equation (4), means that more weight is placed on more recent matches. As for the Bivariate Poisson model, the half life is determined by the chosen value of H and $x_m$ is the number of days between match m and the present day.

It is useful to note that, whilst the above approach is based on the Bivariate Poisson model, the switch from maximum likelihood estimation to the minimisation of the mean absolute error removes the use of the Poisson distribution entirely since, here, we are interested in single valued point predictions rather than probability distributions.

Similarly to the Bivariate Poisson model, parameter estimation for BA ratings is somewhat difficult as there are a large number of parameters and therefore the risk of falling into local optima is high. In the results section, we consider a large number of past matches and several different values of the half life parameter and we therefore need an algorithm that is both accurate and fast. Here, we use the ‘fmincon’ function in Matlab, selecting the ‘active-set’ algorithm which provides a compromise between speed and accuracy. To initialise the optimisation algorithm at the beginning of the season, each team’s ratings are set to zero. Under this initialisation, the algorithm requires a large number of iterations and is therefore relatively slow to converge. Therefore, subsequently (i.e. once the first match of the season has been played), the optimisation algorithm is initialised with the optimised parameter values from the previous run. This speeds up the process considerably because a team’s previous ratings are expected to be similar to its new ratings, reducing the required number of iterations for convergence. The sum of $r_1, ..., r_T$ is constrained to zero whilst all other parameters are initialised at zero.
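
The BA objective can be sketched in Python as follows. This is a minimal illustration of the time weight of equation (4), the BA point predictions and the weighted MAE of equation (10), evaluated for one fixed set of parameters; it does not reproduce the constrained optimisation performed with ‘fmincon’. The function names, the dictionary layout of a match and all numerical values are hypothetical.

```python
import math


def time_weight(days_ago, half_life_days):
    """Equation (4): the weight halves every half_life_days."""
    return 0.5 ** (days_ago / half_life_days)


def ba_predictions(c, h, lambda_c, r_home, r_away):
    """BA point predictions of attacking plays for the home and away team."""
    s_home = lambda_c + math.exp(c + (r_home + h) - r_away)
    s_away = lambda_c + math.exp(c + r_away - (r_home + h))
    return s_home, s_away


def weighted_mae(matches, ratings, c, h, lambda_c, half_life_days):
    """Equation (10): time-weighted mean absolute error over past matches."""
    total = 0.0
    for m in matches:
        pred_h, pred_a = ba_predictions(
            c, h, lambda_c, ratings[m["home"]], ratings[m["away"]]
        )
        error = abs(m["s_home"] - pred_h) + abs(m["s_away"] - pred_a)
        total += time_weight(m["days_ago"], half_life_days) * error
    return total / len(matches)


# Hypothetical data: two teams whose strength parameters sum to zero.
ratings = {"A": 0.1, "B": -0.1}
matches = [
    {"home": "A", "away": "B", "s_home": 6, "s_away": 3, "days_ago": 10},
    {"home": "B", "away": "A", "s_home": 4, "s_away": 5, "days_ago": 400},
]
loss = weighted_mae(matches, ratings, c=1.4, h=0.1, lambda_c=0.2, half_life_days=730)
print(round(loss, 3))
```

In practice this loss would be passed to a numerical optimiser of the reader's choice, with the sum of the team ratings constrained to zero as described above.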
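
For the GAP rating system of section 3.1.1, the prediction in equation (8) and the updates in equations (6) and (7) can be sketched in the same way. The class and function names below are hypothetical, and the ratings, match statistics and parameter values are purely illustrative; in the paper the three parameters are chosen by minimising equation (9).

```python
from dataclasses import dataclass


@dataclass
class GapRatings:
    """The four GAP ratings held by one team (hypothetical container)."""
    home_att: float = 0.0
    home_def: float = 0.0
    away_att: float = 0.0
    away_def: float = 0.0


def predict_plays(home, away):
    """Equation (8): predicted attacking plays for the home and away team."""
    s_home = (home.home_att + away.away_def) / 2
    s_away = (away.away_att + home.home_def) / 2
    return s_home, s_away


def update_gap(home, away, s_home, s_away, lam, phi1, phi2):
    """Equations (6) and (7): update both teams after one match."""
    pred_home, pred_away = predict_plays(home, away)
    err_home = s_home - pred_home  # home team's attacking over-performance
    err_away = s_away - pred_away  # away team's attacking over-performance
    new_home = GapRatings(
        home_att=max(home.home_att + lam * phi1 * err_home, 0.0),
        away_att=max(home.away_att + lam * (1 - phi1) * err_home, 0.0),
        home_def=max(home.home_def + lam * phi1 * err_away, 0.0),
        away_def=max(home.away_def + lam * (1 - phi1) * err_away, 0.0),
    )
    new_away = GapRatings(
        away_att=max(away.away_att + lam * phi2 * err_away, 0.0),
        home_att=max(away.home_att + lam * (1 - phi2) * err_away, 0.0),
        away_def=max(away.away_def + lam * phi2 * err_home, 0.0),
        home_def=max(away.home_def + lam * (1 - phi2) * err_home, 0.0),
    )
    return new_home, new_away


# Hypothetical ratings before the match and observed shots in the match.
home_team = GapRatings(home_att=10.0, home_def=8.0, away_att=9.0, away_def=9.0)
away_team = GapRatings(home_att=9.0, home_def=9.0, away_att=8.0, away_def=10.0)
new_home, new_away = update_gap(
    home_team, away_team, s_home=14, s_away=6, lam=0.5, phi1=0.75, phi2=0.75
)
print(new_home)
print(new_away)
```

The same prediction error drives a team's attacking update and its opponent's defensive update, which is why only two error terms are needed per match.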

3.2. Constructing probabilistic forecasts

The nature of football matches is that the three possible outcomes can be considered to be ‘ordered’. Clearly, a home win is ‘closer’ to a draw than it is to an away win. As such, an appropriate model for predicting the probability of each outcome is ordinal logistic regression and this is the approach taken here.

Define an event with $J$ ordered potential outcomes $1, \ldots, J$. Let $Y$ be a random variable such that $p(Y = i) = p_i$ and $\sum_{i=1}^{J} p_i = 1$. The ordinal logistic regression model is parametrised as

$$\log\left(\frac{p(Y \geq i)}{p(Y < i)}\right) = \alpha_i + \sum_{j=1}^{K} \beta_j V_j + \epsilon \tag{11}$$

where $V_1, \ldots, V_K$ are predictor variables and $\alpha_i$ and $\beta_1, \ldots, \beta_K$ are parameters to be selected. In football matches, since, in some sense, a home win is ‘greater’ than a draw which is ‘greater’ than an away win, from equation (11), the model can be parameterised as

$$\log\left(\frac{p_h}{p_d + p_a}\right) = \alpha_1 + \sum_{j=1}^{K} \beta_j V_j + \epsilon, \tag{12}$$

and

$$\log\left(\frac{p_h + p_d}{p_a}\right) = \alpha_2 + \sum_{j=1}^{K} \beta_j V_j + \epsilon \tag{13}$$

where $p_h$, $p_d$ and $p_a$ are the probabilities of a home win, a draw and an away win respectively. These are easily estimated by solving equations (12) and (13). Throughout this paper, least squares parameter estimates are used to select the regression parameters $\alpha_1$, $\alpha_2$ and $\beta_1, \ldots, \beta_K$.

Combinations of the following predictor variables are used:

• The home team’s odds-implied probability of winning.
• Observed differences in the number of shots on target, shots off target and corners achieved by each team.
• Differences in the predicted number of shots on target, shots off target, corners and goals for each team.

The home team’s odds-implied probability is included in order to assess the importance of match statistics both individually and when used alongside the other information reflected in the odds.

3.3. Betting strategies

Following Wheatcroft (2020), in this paper, forecasts are constructed and used alongside two betting strategies: a simple level stakes value betting strategy and a strategy based on the Kelly Criterion. These are both described below.

Under the Level Stakes betting strategy, a unit bet is placed on the $i$-th outcome of an event when $\hat{p}_i > r_i$, where $\hat{p}_i$ and $r_i$ are the predicted probability and the odds-implied probability, respectively. The simple idea here is that, if the true probability is higher than the odds-implied probability, the bet offers ‘value’, that is, the statistical expectation of the net return from the bet is positive. The idea is to use the forecast probabilities to try and find these value bets. Of course, the success of the strategy depends on the performance of the forecast probabilities in terms of uncovering such opportunities.

The Kelly strategy is based on the Kelly Criterion (Kelly Jr, 1956) and has been used in, for example, Wheatcroft (2020) and Boshnakov et al. (2017). Under this approach, the amount staked on a bet is dependent on the difference between the forecast probability and the odds-implied probability. When the discrepancy between the forecast probability and the odds-implied probability is high, a greater amount of money is staked. Under the Kelly Criterion, bets are placed as a proportion of one’s wealth. For a particular outcome, the proportion of wealth staked is given by

$$f_i = \max\left(\frac{r_i \hat{p}_i - 1}{r_i - 1},\ 0\right) \tag{14}$$

where $\hat{p}_i$ is the estimated probability of the outcome and $r_i$ here represents the decimal odds on offer. Under the Kelly strategy used in this paper, we take a slightly different approach in that the stake does not depend on the bank but is given by $s_i = k f_i$ where $k$ is a normalising constant set such that $\frac{1}{m} \sum_{i=1}^{m} k f_i = 1$, where $f_i$ is calculated from equation (14) and $m$ is the total number of bets placed. The normalising constant is included purely so that the average stake is 1, making the profit/loss from the Kelly strategy directly comparable with that of the Level Stakes strategy.

Both the Level Stakes and Kelly betting strategies focus on the concept of ‘value’ in which bets are only taken if the forecast implies a positive expected return. It should be noted, however, that the two strategies are only guaranteed to find bets with value if the estimated probability and the true probability

coincide. In practice, due to model error in the forecasts, this can never be expected to be the case and the performance of the strategies must therefore be assessed empirically.

4. Results

4.1. Calculation of ratings

In the following experiment, we assess the performance of differences in observed and predicted numbers of shots on target, shots off target, corners and goals as potential predictor variables for the outcomes of football matches. Different combinations of observed and predicted match statistics are then assessed both with and without the odds-implied probability of the home team (calculated using the maximum odds over all bookmakers) included as an extra predictor variable.

The experiment aims to assess the performance of observed and predicted match statistics in the forecasting of match outcomes. This is done in the context of (i) traditional variable selection (using model selection techniques), (ii) assessment of forecast performance, and (iii) betting performance. In cases (i) and (ii), observed and predicted match statistics are used as inputs to an ordinal regression model whilst, in (iii), only predicted statistics are considered. Whilst extra details of the experiment are given under the following headings, here we describe the process of producing sets of predicted match statistics using GAP and BA ratings.

We look to test forecast performance over as large a number of matches as possible. However, since we plan to use match statistics to build our forecasts and we look to assess betting performance, we are limited to those matches in which both match statistics and betting odds are available. In addition, whilst we use all matches that have this information available for the calculation of ratings, we exclude from the analysis all matches within a ‘burn-in’ period in which the home team has played six or fewer matches so far in that season to give the ratings sufficient time to ‘learn’ about the relative strengths of the teams.

For the GAP rating system, parameter estimation is performed simultaneously over all leagues and takes place between seasons such that, at the beginning of each season, optimisation is performed over all previous seasons in which the relevant statistics are available. Those parameters are then used for the entirety of the season. The first season in which match statistics are available for any of the considered leagues (2000/2001) is used only to optimise the GAP rating parameters for the following seasons, and therefore is not considered in the assessment of the performance of the forecasts or in variable selection. A team’s GAP ratings are updated each time it plays a match. However, this leaves open the question of how to initialise the ratings for each team. Whilst there are a number of approaches that could be taken, in the first season in which match statistics are available in a particular league, all GAP ratings are initialised at zero. For subsequent seasons, a team’s ratings are retained from one season to the next if they remain in the same league. Teams relegated to a league are assigned the average ratings of those teams that were promoted in the previous season and teams that are promoted are assigned the average ratings of those teams that were relegated in the previous season (note that promoted teams tend to outperform relegated teams. In the English Premier League, promoted teams have been found to achieve an average of around 8 more points than the teams they replaced (Constantinou and Fenton, 2017)). Despite this, we consider our approach to be reasonable whilst noting that more sophisticated approaches might be more effective.

For Bivariate Attacking ratings, optimisation is performed on each day in which at least one match occurs in a given league and the ratings are used for all matches on that day.

4.2. Evaluating predicted match statistics

Before assessing the performance of probabilistic match forecasts, we assess the performance of the predicted match statistics in terms of how well they predict the observed statistics.

To provide a benchmark for the performance of the forecasts, a very simple alternative prediction for each match statistic is given by the sample mean of that statistic over all matches played by all teams in the data set previous to the day on which the match occurs. For the $j$-th match, this is given for the home and away team, respectively, by

$$f_{h,j} = \frac{1}{N_{prev}} \sum_{i=1}^{N_{prev}} S_{h,i}, \tag{15}$$

and

$$f_{a,j} = \frac{1}{N_{prev}} \sum_{i=1}^{N_{prev}} S_{a,i}, \tag{16}$$

where $S_{h,i}$ and $S_{a,i}$ are the number of defined attacking plays in the $i$-th match by the home and away teams, respectively, and $N_{prev}$ is the number of matches played prior to the present day and in which that match statistic is available. We refer to this approach as the mean-benchmark model.

To assess the performance of the predicted match statistics as predictors of observed statistics, we compare the mean absolute error with that achieved with the mean-benchmark model. The mean absolute error over $N$ forecasts (predicted match statistics) and outcomes (observed match statistics) is given by

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left( |S_{h,i} - \hat{S}_{h,i}| + |S_{a,i} - \hat{S}_{a,i}| \right). \tag{17}$$

The ratio of the MAE for each approach is given by

$$R = \frac{MAE_m}{MAE_b} \tag{18}$$

where $MAE_m$ and $MAE_b$ are the mean absolute error for the predicted statistics and for the mean-benchmark model, respectively. When $R < 1$, the model produces forecasts closer to the true value than the mean-benchmark model.

The performance of the two approaches (GAP ratings and BA ratings) in terms of the prediction of match statistics is assessed by comparing the value of $R$. The values of $R$ for both GAP and BA ratings are shown in Fig. 1 for each of the four measures of attacking performance (goals, corners, shots on target and shots off target). For BA ratings, $R$ is shown as a function of the chosen ‘half life’. In all cases, the GAP ratings are able to outperform the mean-benchmark model and this is generally also the case for BA ratings. Note that, due to high computational intensity, $R$ is not shown for values of the half life longer than 135 days. However, as described in the next section, we are primarily interested in relatively short values of the half life that reflect a team’s recent performances and are able to augment the information contained in the match odds. We therefore find that the half life that maximises the performance of forecasts of the match outcome is relatively short compared with that which minimises $R$.

Fig. 1. Values of $R$ for GAP ratings (straight lines) and BA ratings (curves with open circles) for each match statistic. The latter is shown as a function of half life.

There is a notably high degree of variation in the performance of the predicted statistics. Under the GAP rating system, the value of $R$ is smallest for shots off target, whilst for goals and corners, $R$ is not much smaller than 1. This is likely explained by the fact that there are typically a larger number of shots off target in a game than the other statistics and therefore there is more information on which to base the forecasts. BA ratings do not outperform GAP ratings for match statistics other than goals for any tested half life.

5. Variable selection

Our next focus is on variable selection and the aim is to find the combination of (i) observed and (ii) predicted match statistics that explain the match outcomes most effectively. Variable selection is performed using Akaike’s Information Criterion (AIC), which weighs up the fit of the model to the data with the number of parameters selected in-sample (see appendix A for details). As required for the calculation of information criteria, the ordinal regression parameters are selected in-sample and therefore, in order to calculate the likelihood, a single set of parameters is selected over all available matches.

To provide further context to the calculated AIC values, we make use of the confidence set approach described by Anderson and Burnham (2004). Here, the Akaike weights for each model (which can be thought of as the probability that each one represents the best approximating model) are calculated and sorted from largest to smallest. Models are then added to the confidence set in order of their Akaike weights (largest first) until the sum of the weights exceeds 0.95. The confidence set then represents the set in which the best approximating model falls with at least 95 percent probability.
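
The confidence set procedure described above can be sketched in a few lines. The example below uses the standard definition of Akaike weights, $w_i = \exp(-\Delta_i / 2) / \sum_r \exp(-\Delta_r / 2)$, where $\Delta_i$ is the difference between the AIC of model $i$ and the lowest AIC. The model names and AIC values are hypothetical and are not taken from the paper's tables.

```python
import math


def akaike_weights(aic):
    """Map {model name: AIC} to {model name: Akaike weight}."""
    best = min(aic.values())
    relative = {m: math.exp(-0.5 * (a - best)) for m, a in aic.items()}
    total = sum(relative.values())
    return {m: v / total for m, v in relative.items()}


def confidence_set(aic, level=0.95):
    """Add models in order of decreasing weight until the sum exceeds `level`."""
    weights = akaike_weights(aic)
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for model, weight in ranked:
        chosen.append(model)
        cumulative += weight
        if cumulative > level:
            break
    return chosen


# Hypothetical AIC values for four candidate models.
example_aic = {"M1": 100.0, "M2": 101.4, "M3": 104.0, "M4": 115.0}
print(akaike_weights(example_aic))
print(confidence_set(example_aic))
```

With these illustrative values the two best models together carry less than 95 percent of the weight, so a third model is needed to complete the set.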

Table 2
AIC of each combination of observed match statistics with and without the home odds-implied probability included as a predictor variable.
Variables that are included are denoted with a star and, in each case, AIC is given with that of model A0 subtracted. The combination of
variables with the lowest AIC is highlighted in bold and each one that falls into the 95 percent confidence set is highlighted in italic (which
is only combination A1 in this case)
Combination of Shots on Shots off Corners AIC w/o odds AIC w. odds
variables Target Target
A1 ∗ ∗ ∗ −15125.4 −19473.6
A3 ∗ ∗ −14804.3 −18572.7
A2 ∗ ∗ −13530.9 −17124.8
A4 ∗ −12239.9 −14643.5
A5 ∗ ∗ −18.5 −9150.4
A6 ∗ −18.3 −8658.7
A7 ∗ −9.2 −8598.3
A0 0 −5619.1
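
Each AIC value in Table 2 comes from a fitted ordinal regression of the form given in equations (12) and (13). As a reminder of how such a model turns predictor variables into outcome probabilities, the sketch below solves the two equations for the home win, draw and away win probabilities. It assumes the error term is ignored and that $\alpha_2 \geq \alpha_1$; the coefficient and predictor values are hypothetical and are not fitted to any data.

```python
import math


def logistic(x):
    """Inverse of the log-odds (logit) transform."""
    return 1.0 / (1.0 + math.exp(-x))


def outcome_probabilities(alpha1, alpha2, betas, predictors):
    """Return (p_home, p_draw, p_away) implied by equations (12) and (13)."""
    eta = sum(b * v for b, v in zip(betas, predictors))
    p_home = logistic(alpha1 + eta)          # equation (12): odds of a home win
    p_home_or_draw = logistic(alpha2 + eta)  # equation (13): odds of not losing
    return p_home, p_home_or_draw - p_home, 1.0 - p_home_or_draw


# Hypothetical parameters with one predictor, e.g. a difference in shots on target.
p_home, p_draw, p_away = outcome_probabilities(
    alpha1=-0.2, alpha2=1.0, betas=[0.1], predictors=[2.0]
)
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3))
```

Because both equations share the same coefficients $\beta_j$, a larger value of a predictor with a positive coefficient raises the home win probability and lowers the away win probability.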

5.1. Variable selection: observed match statistics

The results of variable selection when using observed match statistics are shown in Table 2. Here, the AIC for different combinations of statistics is shown both with and without the home odds-implied probability included as an additional predictor variable. Note that the AIC in each case is expressed with that of model A0 (fitted without the odds-implied probability) subtracted such that negative values imply better support for a particular combination of predictor variables than that of the model fitted without any predictor variables. The lower the AIC, the more support for that particular combination of variables.

The results yield a number of conclusions. The best AIC is achieved when the model includes all three observed match statistics both when the home odds-implied probability is included as an additional predictor variable and when it is not. That the number of shots on target should have an impact on the match result should not come as a surprise, since all goals other than own goals and highly unusual events (such as the ball deflecting off the referee or, in one case in 2009, a beachball) result from a shot on target. Interestingly, however, the inclusion of the number of corners and shots off target, which don’t usually directly result in goals, improves the model even once shots on target are considered.

It is also interesting to compare the effects of each observed match statistic as an individual predictor variable. Unsurprisingly, the number of shots on target provides the most information, followed by corners and shots off target. Interestingly, shots off target and corners do not provide much information when considered individually but add a great deal of information when combined with the number of shots on target and/or the home odds-implied probability. It is a property of generalised linear models that some predictor variables are only informative in combination with other predictor variables and this appears to be the case here.

Finally, all three match statistics add information even when the odds-implied probability is included in the model. This is perhaps not surprising since match statistics give an indication of how the match actually went.

In practice, of course, observed statistics are never available pre-match. Despite this, the results shown here have important implications. Match statistics can be predicted and, if those predictions are informative enough, it stands to reason that informative forecasts of the outcome of the match can be made.

5.2. Variable selection: predicted match statistics

In section 4.2, the results of predicting match statistics using GAP and BA ratings were presented. It was shown that, in the latter case, the choice of half life has an important impact on the MAE of the predictions. Although, typically, longer half lives tend to provide better predictions for the match statistics, it may not be the case that they provide a more useful input for probabilistic forecasts of the match outcome. This is because a consistently strong team like, say, Manchester United will be expected to take a larger number of shots and corners than a weaker side over a long period of time and this will be reflected in the ratings. However, we are looking for information that is not reflected in the odds and thus to augment the information the odds provide. For example, if a team’s recent results have not reflected their performances, we look to identify that this is the case from their match

Fig. 2. AIC as a function of half life for forecasts produced using different combinations of (i) BA ratings (lines with points) and (ii) GAP
ratings (straight horizontal lines). In both cases, the home odds-implied probability is used as an additional predictor variable.

statistics in recent matches. It therefore seems reasonable to expect that a shorter half life should be more useful in this case. On the other hand, looking only at more recent matches gives us a less robust reflection of a team’s strength and we therefore have a trade-off. Here, for simplicity, we choose a single half life for use in the rest of the paper based on the following fairly ad-hoc approach. Looking at the results in Fig. 2, since a half life of 45 days gives the lowest AIC for the case in which predictions of all match statistics are used in the model (bottom right panel), this value is used for all further results shown in this paper.

The results of variable selection with predicted match statistics are shown in Table 3. Unsurprisingly, the AIC is generally higher than for the observed case, implying that the information content is lower. Despite this, predicted match statistics are able to provide information regarding match outcomes, even when the home odds-implied probability is included in the model. This means that, on average, both sets of predicted match statistics (from GAP and BA ratings) provide information beyond that contained in the odds-implied probabilities. However, given the universally lower AIC values, the GAP rating approach appears to be more effective.

It is of interest to note the relative importance of the different predicted match statistics. Consistent with the findings of Wheatcroft (2020), the predicted number of goals provides relatively little information when combined with the odds-implied probabilities whilst predictions of other match statistics are much more effective in improving the forecast model. It is also notable that whilst, in the observed case, the number of shots on target provides the most information about the outcome of the match, in the predicted case, shots off target is the most informative. At first, this seems counterintuitive. However, it should be noted that the information in the prediction is dependent both on the impact of the observed statistic on the

Table 3
AIC of each combination of predicted match statistics under both GAP and BA ratings with and without the home odds-implied probability
included as a predictor variable. Included variables are denoted with a star and each AIC value is given relative to that of the regression
model with only a constant term. The combination of variables with the lowest AIC is highlighted in bold and each one that falls into the 95
percent confidence set is highlighted in italic
Combination of Goals Shots on Shots off Corners GAP:AIC GAP:AIC BA:AIC BA:AIC
variables Target Target w/o odds w. odds w/o odds w. odds
B1 ∗ ∗ ∗ −5453.6 −7619.9 −4405.5 −7595.0
B9 ∗ ∗ ∗ ∗ −6365.0 −7618.5 −5363.4 −7593.2
B2 ∗ ∗ −5359.5 −7604.3 −4176.2 −7578.5
B5 ∗ ∗ −4124.4 −7604.1 −2959.1 −7573.7
B10 ∗ ∗ ∗ −6309.5 −7602.9 −5153.1 −7576.5
B13 ∗ ∗ ∗ −6268.3 −7602.7 −4914.3 −7573.0
B11 ∗ ∗ ∗ −6245.6 −7596.1 −5072.4 −7555.9
B3 ∗ ∗ −5357.5 −7596.0 −4072.2 −7557.9
B7 ∗ −3286.5 −7573.5 −2185.0 −7549.2
B15 ∗ ∗ −6146.9 −7573.3 −4481.4 −7547.8
B6 ∗ −3499.6 −7566.5 −2063.6 −7527.8
B14 ∗ ∗ −6051.3 −7564.8 −4405.6 −7526.2
B12 ∗ ∗ −6087.3 −7557.9 −4631.3 −7520.7
B4 ∗ −5146.7 −7556.5 −3583.2 −7521.8
B0 0.0 −7473.9 0.0 −7473.9
B8 ∗ −5573.3 −7473.9 −3342.7 −7471.9

match and the quality of the prediction of that statistic. Recall that Fig. 1 suggests GAP and BA rating predictions of shots off target improve more on the mean-benchmark model than those of the other match statistics and this superior prediction accuracy is the likely explanation.

Finally, it is notable that, when considered as individual predictor variables, the predicted numbers of shots off target and corners outperform the equivalent observed statistics. Again, this seems counterintuitive but can probably be explained by the fact that the predicted values consider the performances of the teams over multiple past matches, gaining some information about the relative strengths of the two teams.

5.3. Forecast performance

We now turn our focus onto the question of forecast performance. Though closely related to model selection, this allows us to assess the relative performance of the forecasts out-of-sample and therefore as if they were produced in real time. In order to produce the forecasts, new regression parameters are selected on each day in which at least one match is played and are calculated based on all past matches which fall outside of the ‘burn-in’ period and which have shots and corner data as well as match odds available.

We compare forecast performance using two commonly used scoring rules: the Ignorance Score (Roulston and Smith, 2002; Good, 1952) and the Ranked Probability Score (Constantinou and Fenton, 2012). The ignorance score, also commonly known as the log-loss, is given by

$$S(p, Y) = -\log_2(p(Y)), \tag{19}$$

where $p(Y)$ is the probability placed on the outcome $Y$.

To define the Ranked Probability Score, for an event with $r$ possible outcomes, let $p_j$ and $o_j$ be the forecast probability and outcome at position $j$ where the ordering of the positions is preserved. The Ranked Probability Score (RPS) is given by

$$S(p, Y) = \sum_{i=1}^{r-1} \left( \sum_{j=1}^{i} (p_j - o_j) \right)^2. \tag{20}$$

The RPS is often considered appropriate for evaluating forecasts of football matches because it takes into account the ordering of the outcomes, i.e. a home win is ‘closer’ to a draw than it is to an away win (Constantinou and Fenton, 2012). However, it has also been argued that the ordered nature of the RPS provides little practical benefit and that only the probability placed on the outcome should be taken into account, as per the ignorance score (Wheatcroft, 2019). Here, we consider it useful to evaluate the forecasts using both approaches.

To provide some context regarding the performance of the forecasts, we compare the performance with that of an alternative, strongly performing approach to forecasting football matches. The

Table 4
Mean RPS for each combination of variables and, for comparison, that of the Bivariate Poisson model. Included variables are denoted with a star. The combination with the highest performance is highlighted in bold and each one that falls into the Model Confidence Set is highlighted in italic
Combination of Goals Shots on Shots off Corners GAP:RPS GAP:RPS BA:RPS BA:RPS
variables Target Target w/o odds w. odds w/o odds w. odds
B5 ∗ ∗ 0.2149 0.2058 0.2191 0.2059
B9 ∗ ∗ ∗ ∗ 0.2090 0.2058 0.2128 0.2059
B2 ∗ ∗ 0.2116 0.2058 0.2161 0.2059
B1 ∗ ∗ ∗ 0.2113 0.2058 0.2154 0.2059
B13 ∗ ∗ ∗ 0.2093 0.2058 0.2140 0.2059
B10 ∗ ∗ ∗ 0.2092 0.2058 0.2135 0.2059
B11 ∗ ∗ ∗ 0.2093 0.2058 0.2136 0.2060
B7 ∗ 0.2171 0.2059 0.2212 0.2060
B3 ∗ ∗ 0.2116 0.2058 0.2163 0.2060
B6 ∗ 0.2166 0.2059 0.2214 0.2060
B14 ∗ ∗ 0.2099 0.2059 0.2153 0.2060
B15 ∗ ∗ 0.2096 0.2059 0.2152 0.2060
B12 ∗ ∗ 0.2098 0.2059 0.2150 0.2061
B4 ∗ 0.2121 0.2059 0.2178 0.2061
B0 0.2264 0.2062 0.2264 0.2062
B8 ∗ 0.2111 0.2062 0.2182 0.2062
Bivariate Poisson ∗ 0.2121 0.2121

Bivariate Poisson model, described in Section 2.3, has been shown to perform favourably with respect to 9 other forecast models (Ley et al., 2019). We apply the model to our data set using the optimal half life parameter of 390 days determined by Ley et al. (2019).

As with the Akaike weights confidence set used in section 5, we identify a set of best performing models, here using the Model Confidence Set (MCS) methodology proposed by Hansen et al. (2011). The aim is to identify the set of models in which there is a 95 percent probability that the ‘best’ model falls, given the chosen measure of performance. We highlight the combinations of variables that fall into this set.

The mean RPS and Ignorance of each combination of variables as well as the Bivariate Poisson model are shown in Tables 4 and 5, respectively. In the latter case, the scores are given with that of model B0 subtracted such that negative scores imply better performance than the model applied with no predictor variables. The 95 percent Model Confidence Set in each case is highlighted in italic. Note that, since the Bivariate Poisson model does not make use of match odds, a fair comparison is only provided by comparing those combinations of variables in which the odds-implied probabilities are not included.

Similarly to the variable selection results in section 5.2, including predictions of match statistics other than goals in the model improves overall predictive performance of the match outcomes according to both scoring rules. Also consistent with the model selection results is that the model performs consistently better when match statistics are predicted using GAP ratings rather than BA ratings.

When considering the performance of the Bivariate Poisson model, it is worth noting that it only takes goals into consideration. In terms of the information used, its performance can be compared with model B8 for the case in which the odds-implied probability is not included. Here, the Bivariate Poisson model does slightly worse though the difference is small. It is when predictions of other match statistics are included that there is a large increase in performance over the Bivariate Poisson model. This suggests that much of the improvement results from the additional information in the match statistics rather than the structure of the model.

5.4. Betting performance

In this section, the performance of the forecasts in section 5.3 when used alongside the Level Stakes and Kelly betting strategies described in section 3.3 is assessed. Here, it is assumed that a gambler is able to ‘shop around’ different bookmakers and take advantage of the highest odds offered on each outcome. The maximum odds over all available bookmakers are thus assumed to be obtainable (note that the actual bookmakers included in the data set vary over time). Note that bets placed on draws are not considered due to the inherent difficulty of predicting them and therefore only bets on home or away wins are allowed.
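
The stake calculations for the two strategies of section 3.3 can be sketched as follows. The sketch makes several assumptions that are not stated in the paper: the odds-implied probability is taken to be the reciprocal of the (maximum) decimal odds, the Kelly fraction uses the standard decimal-odds form of the Kelly Criterion, the normalising constant is computed over bets with a positive stake, and mean percentage profit is defined as net return divided by total stake. The forecast probabilities and odds are hypothetical.

```python
def level_stake(p_hat, decimal_odds):
    """Unit stake when the forecast exceeds the odds-implied probability."""
    return 1.0 if p_hat > 1.0 / decimal_odds else 0.0


def kelly_fraction(p_hat, decimal_odds):
    """Kelly Criterion fraction of wealth for decimal odds, floored at zero."""
    return max((decimal_odds * p_hat - 1.0) / (decimal_odds - 1.0), 0.0)


def normalised_kelly_stakes(bets):
    """Scale Kelly fractions so that the average stake over placed bets is 1."""
    fractions = [kelly_fraction(p, r) for p, r in bets]
    placed = [f for f in fractions if f > 0.0]
    if not placed:
        return [0.0] * len(bets)
    k = len(placed) / sum(placed)  # normalising constant
    return [k * f for f in fractions]


def mean_percentage_profit(stakes, bets, won):
    """Net return as a percentage of the total amount staked."""
    staked = sum(stakes)
    returned = sum(s * r for s, (_, r), w in zip(stakes, bets, won) if w)
    return 100.0 * (returned - staked) / staked


# Hypothetical (forecast probability, maximum decimal odds) pairs and results.
example_bets = [(0.55, 2.10), (0.30, 3.00), (0.25, 5.00)]
example_won = [True, False, False]

level_stakes = [level_stake(p, r) for p, r in example_bets]
kelly_stakes = normalised_kelly_stakes(example_bets)
print(level_stakes, [round(s, 3) for s in kelly_stakes])
print(round(mean_percentage_profit(level_stakes, example_bets, example_won), 2))
```

Both strategies decline the second bet because the forecast probability is below the odds-implied probability; the Kelly stakes differ from the level stakes only in how the total stake is shared between the remaining bets.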

Table 5
Mean ignorance scores for each combination of variables and, for comparison, that of the Bivariate Poisson model. Included variables are denoted with a star. The combination with the highest performance is highlighted in bold and each one that falls into the Model Confidence Set is highlighted in italic
Combination of Goals Shots on Shots off Corners GAP:IGN GAP:IGN BA:IGN BA:IGN
variables Target Target w/o odds w. odds w/o odds w. odds
B9 ∗ ∗ ∗ ∗ −0.0739 −0.0888 −0.0626 −0.0887
B1 ∗ ∗ ∗ −0.0635 −0.0888 −0.0516 −0.0887
B2 ∗ ∗ −0.0624 −0.0887 −0.0490 −0.0886
B10 ∗ ∗ ∗ −0.0733 −0.0886 −0.0602 −0.0886
B5 ∗ ∗ −0.0480 −0.0887 −0.0345 −0.0885
B13 ∗ ∗ ∗ −0.0728 −0.0886 −0.0572 −0.0885
B11 ∗ ∗ ∗ −0.0727 −0.0887 −0.0592 −0.0883
B7 ∗ −0.0382 −0.0883 −0.0257 −0.0883
B3 ∗ ∗ −0.0625 −0.0887 −0.0477 −0.0883
B15 ∗ ∗ −0.0714 −0.0883 −0.0522 −0.0882
B6 ∗ −0.0410 −0.0884 −0.0241 −0.0880
B14 ∗ ∗ −0.0704 −0.0884 −0.0513 −0.0880
B12 ∗ ∗ −0.0709 −0.0883 −0.0541 −0.0880
B4 ∗ −0.0601 −0.0883 −0.0421 −0.0880
B0 0.0000 −0.0875 0.0000 −0.0875
B8 ∗ −0.0650 −0.0874 −0.0388 −0.0875
Bivariate Poisson * −0.0614 −0.0614
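
The scores reported in Tables 4 and 5 are averages of per-match values of the two scoring rules in equations (19) and (20). A minimal sketch of both rules for a single forecast is given below. The outcomes are assumed to be ordered as (home win, draw, away win) and the forecast probabilities are hypothetical. Some definitions of the RPS additionally divide by $r - 1$; that normalisation is omitted here, which is an assumption about the intended form.

```python
import math


def ignorance(probs, outcome_index):
    """Ignorance score: minus log base 2 of the probability of the outcome."""
    return -math.log2(probs[outcome_index])


def ranked_probability_score(probs, outcome_index):
    """Sum of squared differences between cumulative forecast and outcome."""
    cumulative_forecast = 0.0
    cumulative_outcome = 0.0
    score = 0.0
    for j in range(len(probs) - 1):
        cumulative_forecast += probs[j]
        cumulative_outcome += 1.0 if j == outcome_index else 0.0
        score += (cumulative_forecast - cumulative_outcome) ** 2
    return score


# Hypothetical forecast ordered as (home win, draw, away win).
forecast = [0.5, 0.3, 0.2]
HOME, DRAW, AWAY = 0, 1, 2

print(ignorance(forecast, HOME))
print(round(ranked_probability_score(forecast, HOME), 4))
print(round(ranked_probability_score(forecast, AWAY), 4))
```

For this forecast the RPS penalises an away win far more heavily than a home win, reflecting the ordering of the outcomes, whereas the ignorance score depends only on the probability placed on the outcome that occurred.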

Table 6
Mean percentage profit of Level Stakes strategy with each combination of predicted match statistics with and without odds-implied
probabilities included as a predictor variable. Included variables are denoted with a star
Combi- Goals Shots Shots Cor- GAP:Profit GAP:Profit BA:Profit BA:Profit
nation of on off ners w/o odds w. odds w/o odds w. odds
variables Target Target
B5 ∗ ∗ +0.54(−0.83, +1.98) +1.85(+0.45, +3.34) −0.29(−1.68, +1.15) +1.41(−0.10, +3.09)
B9 ∗ ∗ ∗ ∗ +0.60(−0.89, +2.09) +1.55(+0.32, +3.12) +0.23(−1.37, +1.73) +1.24(+0.01, +2.59)
B2 ∗ ∗ +0.36(−1.00, +1.76) +1.73(+0.23, +3.18) +0.07(−1.51, +1.32) +1.28(−0.30, +2.85)
B1 ∗ ∗ ∗ +0.67(−1.07, +1.88) +1.48(−0.11, +2.79) +0.25(−1.02, +1.68) +1.30(−0.11, +2.80)
B13 ∗ ∗ ∗ +0.33(−1.23, +2.07) +1.77(+0.20, +3.01) −0.18(−1.67, +1.41) +1.26(−0.16, +2.78)
B10 ∗ ∗ ∗ +0.02(−1.42, +1.71) +1.60(+0.07, +3.12) −0.63(−2.18, +0.78) +1.21(+0.05, +2.83)
B11 ∗ ∗ ∗ +0.00(−1.31, +1.58) +0.93(−0.80, +2.32) −0.43(−1.88, +0.89) +0.76(−0.54, +2.53)
B7 ∗ −0.44(−2.05, +0.79) +1.15(−0.52, +2.78) −0.89(−2.17, +0.67) +0.85(−0.51, +2.38)
B3 ∗ ∗ +0.37(−1.20, +1.88) +1.00(−0.28, +2.49) −0.23(−1.45, +1.22) +0.81(−0.60, +2.42)
B6 ∗ −0.74(−2.26, +0.69) +1.16(−0.23, +2.67) −1.15(−2.66, +0.27) +0.43(−1.17, +2.04)
B14 ∗ ∗ −0.62(−2.00, +0.82) +0.83(−0.40, +2.15) −1.02(−2.53, +0.49) +0.33(−1.49, +1.60)
B15 ∗ ∗ −0.41(−1.67, +1.09) +0.83(−0.40, +2.15) −1.03(−2.39, +0.33) +0.84(−0.45, +2.42)
B12 ∗ ∗ −1.07(−2.63, +0.26) +0.46(−0.88, +2.01) −1.08(−2.77, +0.25) −0.34(−1.49, +1.81)
B4 ∗ −0.44(−1.89, +1.04) +0.13(−1.42, +1.89) −0.74(−2.25, +0.95) −0.36(−1.66, +1.36)
B0 −2.33(−3.84, −0.73) −1.26(−3.06, +0.48) −2.33(−3.55, −1.02) −1.26(−3.20, +0.20)
B8 ∗ −2.69(−4.22, −1.32) −1.70(−3.41, −0.34) −2.84(−4.28, −1.55) −1.37(−2.94, +0.48)
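
The intervals shown in brackets in Table 6 are 95 percent bootstrap resampling intervals. The paper does not state which bootstrap variant is used, so the sketch below assumes a simple percentile bootstrap of the mean percentage profit per unit bet; the function name, the number of resamples and the example returns are hypothetical.

```python
import random


def bootstrap_interval(profits, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `profits`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(profits)
    # Resample the bets with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(profits, k=n)) / n for _ in range(n_resamples)
    )
    lower_index = int((1.0 - level) / 2.0 * n_resamples)
    upper_index = int((1.0 + level) / 2.0 * n_resamples) - 1
    return means[lower_index], means[upper_index]


# Hypothetical percentage returns of unit bets: +110 for a win at decimal
# odds of 2.10 and -100 for a losing bet.
example_profits = [110.0] * 50 + [-100.0] * 50

low, high = bootstrap_interval(example_profits)
print(round(low, 2), round(high, 2))
```

Under this approach, a profit would be regarded as significant when the lower end of the interval is above zero.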

The mean percentage profit obtained from the Level Stakes betting strategy when used alongside forecasts derived from each combination of predicted match statistics is shown in Table 6, along with 95 percent bootstrap resampling intervals. The resampling intervals are presented to demonstrate the robustness of the profit and, if the interval does not contain zero, the profit can be considered to be statistically significant.

It is clear from the results that including combinations of predicted match statistics as predictor variables tends to yield a profit. In addition, for all combinations, including the home odds-implied probability as an additional predictor variable yields an increase in profit. In some cases, when the home odds-implied probability is included, the profit is significant, i.e. the bootstrap resampling interval does not include zero. Whilst caution is advised in comparing the precise rankings of different combinations of variables, the best performing combinations tend to include the predicted number of shots off target. The predicted number of goals, on the other hand, tends to have limited value. When individual predicted statistics are considered, the ranking of the results is consistent with the variable selection results

Table 7
Mean percentage profit from the Kelly strategy using forecasts based on each combination of predicted match statistics with and without the
home odds-implied probability included as a predictor variable. Included variables are denoted with a star
Combination of variables   Goals   Shots on Target   Shots off Target   Corners   GAP: Profit w/o odds   GAP: Profit w. odds   BA: Profit w/o odds   BA: Profit w. odds
B1 ∗ ∗ ∗ +3.72(+1.61, +5.48) +4.88(+3.22, +6.39) +3.13(+1.27, +5.01) +4.27(+2.61, +5.85)
B9 ∗ ∗ ∗ ∗ +2.33(+0.20, +4.15) +4.87(+3.41, +6.45) +2.46(+0.58, +4.27) +4.24(+2.73, +5.84)
B10 ∗ ∗ ∗ +2.14(+0.45, +3.93) +4.66(+3.05, +6.21) +1.87(+0.04, +3.68) +3.90(+2.12, +5.45)
B2 ∗ ∗ +3.45(+1.51, +5.33) +4.67(+3.11, +6.11) +2.48(+0.60, +4.60) +3.94(+2.26, +5.58)
B5 ∗ ∗ +2.93(+1.04, +5.06) +4.56(+3.06, +6.12) +2.10(+0.03, +4.20) +3.93(+2.37, +5.65)
B13 ∗ ∗ ∗ +1.79(−0.01, +3.67) +4.52(+2.97, +6.14) +1.71(−0.20, +3.54) +3.89(+2.22, +5.53)
B11 ∗ ∗ ∗ +1.36(−0.57, +3.38) +4.02(+2.39, +5.67) +0.90(−0.98, +2.78) +2.55(+1.00, +4.18)
B7 ∗ +2.02(+0.27, +4.01) +4.09(+2.44, +5.66) +0.66(−1.56, +2.76) +3.25(+1.64, +4.99)
B3 ∗ ∗ +2.97(+1.09, +4.90) +4.00(+2.25, +5.67) +1.71(−0.27, +3.82) +2.58(+0.93, +4.22)
B15 ∗ ∗ +1.26(−0.60, +3.13) +4.07(+2.45, +5.75) +0.54(−1.36, +2.31) +3.23(+1.62, +4.84)
B12 ∗ ∗ +0.52(−1.42, +2.60) +2.92(+1.19, +4.64) −0.22(−2.15, +1.73) +1.35(−0.47, +3.15)
B6 ∗ +1.18(−0.84, +3.31) +2.96(+1.38, +4.62) +0.16(−1.89, +2.25) +1.78(−0.12, +3.53)
B14 ∗ ∗ +0.05(−1.87, +2.01) +2.97(+1.31, +4.62) −0.36(−2.19, +1.55) +1.74(+0.07, +3.41)
B4 ∗ +2.14(+0.29, +4.16) +2.85(+1.30, +4.44) +0.58(−1.48, +2.63) +1.33(−0.45, +3.13)
B8 ∗ −2.64(−4.77, −0.75) −1.36(−3.31, +0.73) −3.07(−5.29, −0.80) −1.11(−3.17, +0.91)
B0 −3.07(−5.51, −0.66) −1.06(−3.17, +0.99) −3.12(−5.60, −0.59) −1.06(−3.27, +1.04)

of Table 3 in that the best performing predicted variable is shots off target, followed by corners, shots on target and goals. It is also notable that forecasts built using BA ratings do not perform as well as those formed using GAP ratings.

The mean profit obtained from using the forecasts alongside the Kelly strategy is shown in Table 7. Here, under both the GAP and BA rating systems, the mean profit is generally substantially higher than that achieved using the Level Stakes strategy. Again, including the home odds-implied probability as an additional predictor variable yields improved results for all combinations of variables. In fact, the profit is significant in all cases in which at least one predicted match statistic other than the number of goals is included alongside the home odds-implied probability. Again, the results obtained from the GAP rating approach are almost always better than under the BA rating approach.

For the remainder of this section, given the superior performance of GAP ratings relative to the BA ratings, we focus on the betting performance of forecasts formed using predicted shots on target, shots off target and corners simultaneously under this approach. We do this both with and without the home odds-implied probability as an additional predictor variable.

The cumulative profit achieved with each of the two betting strategies is shown in Fig. 3. As already shown in Tables 6 and 7, a substantial profit is made in all four cases. The figure, however, shows how each strategy performs over time and an interesting feature is that there appears to be a downturn in profit in recent seasons. Whilst this could conceivably be explained by random chance, it is perhaps more likely that something fundamental changed over that time. That predicted match statistics provide information additional to that contained in the odds suggests that, in general, the odds do not adequately account for the ability of teams to create shots and corners. However, as more data have become available and quantitative analysis has become more sophisticated, it seems a reasonable claim that such information is now more likely to be reflected in the odds on offer and it may therefore be the case that the betting opportunities available in earlier seasons simply don’t exist anymore.

Fig. 3. Cumulative profit from using the Kelly strategy (solid lines) and the Level Stakes strategy (dashed lines) with forecasts formed using GAP rating predictions of shots on target, shots off target and corners both when the home odds-implied probability is included as a predictor variable in the model (blue) and when it is excluded (red).

It is worth considering how the profits from each betting strategy are distributed between the different leagues and whether losses in any particular subset of leagues can explain the observed downturn. Focusing on the case in which the home odds-implied probability is included as a predictor variable, in Fig. 4 the cumulative profit made in each league is shown as a function of time. Here, the decline in profit appears to be fairly consistent over all leagues considered and therefore, if the information reflected in the odds really has increased over time, this appears to be fairly universal over the different leagues.

Fig. 4. Cumulative profit as a function of time in each league for the case in which predicted shots on target, shots off target and corners along with the home odds-implied probability are included as predictor variables.

Finally, it is important to assess the impact of the overround on the profitability of the betting strategies. In this experiment, it is assumed that the gambler is able to find the best odds on offer on each possible outcome, over a range of bookmakers. Due to increased competition, there has been a trend towards reduced profit margins in recent years. This can have a knock-on effect on the overround of the best odds. A histogram of the overround of the best odds for all matches deemed eligible for betting is shown in Fig. 5. Whilst, in the majority of cases, the overround is positive, in around 18 percent of cases, it is negative. This gives rise to arbitrage opportunities, which means that a guaranteed profit can be made, without any need for a model. It is therefore important to distinguish cases in which profits are made due to the performance of the forecasts from those in which a profit could be guaranteed through arbitrage.
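As a rough illustration of why a negative overround implies an arbitrage opportunity, the following Python sketch assumes decimal odds and takes the overround to be the sum of the inverse odds minus one. The odds values and the function names are hypothetical and are not taken from the data used in this paper. Staking in proportion to the inverse odds returns the same amount whichever outcome occurs, and that amount exceeds the total stake exactly when the overround is negative.

```python
def overround(odds):
    """Overround of a set of decimal odds: sum of inverse odds minus one (assumed definition)."""
    return sum(1.0 / o for o in odds) - 1.0


def arbitrage_stakes(odds, total_stake=1.0):
    """Split total_stake across outcomes so the payout is identical whichever outcome occurs."""
    inverse = [1.0 / o for o in odds]
    normaliser = sum(inverse)
    return [total_stake * x / normaliser for x in inverse]


# Hypothetical best available decimal odds for home win, draw and away win.
best_odds = [2.9, 3.6, 3.1]

ov = overround(best_odds)
stakes = arbitrage_stakes(best_odds)
# Payout received if each outcome occurs (stake on that outcome times its odds).
payouts = [s * o for s, o in zip(stakes, best_odds)]

print(f"overround: {ov:.4f}")
print("payout per unit staked, by outcome:", [round(p, 4) for p in payouts])
```

With these illustrative odds the overround is negative, so every payout is greater than the unit total stake regardless of the result. Repeating the calculation with odds whose overround is positive gives a common payout below one, which is why a forecast model is needed to find value in that case.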
To assess the importance of the overround, five different intervals are defined and the mean profit from matches whose overround falls into each one is calculated under both betting strategies. The first interval contains all matches with an overround less than zero, whilst, for matches with a positive overround, intervals with a width of 2.5 percent are defined. The interval containing matches with the largest overrounds considers those in which the overround is greater than 7.5 percent. In Fig. 6, the mean overround for matches contained in each interval is plotted against the mean profit under each of the two betting strategies. The error bars correspond to 95 percent bootstrap resampling intervals of the mean profit. In all five intervals, and under both betting strategies, the mean profit is positive. Under the Kelly strategy, three out of the five intervals yield a significant profit, whilst this is true in one interval for the Level Stakes strategy. Interestingly, the mean profit is not significantly different from zero when the overround is negative. This, however, is consistent with the decline in profit in recent seasons that has tended to coincide with lower overrounds. Overall, the fact that significant profits can be made for matches in which the overround is positive suggests that, over the course of the dataset, the forecasts in combination with the two betting strategies would have been successful in identifying profitable betting opportunities.

Fig. 5. Histogram of overrounds under the maximum odds.

Fig. 6. Mean overround against mean profit under the Kelly strategy (blue) and the Level Stakes strategy (red) for each considered interval. The error bars represent 95 percent bootstrap resampling intervals of the mean.

6. Discussion

In this paper, relationships between observed and predicted match statistics and the outcomes of football matches have been assessed. Unsurprisingly, the observed number of shots on target is a strong predictor of the match outcome whilst the observed numbers of shots off target and corners also provide some predictive value, once the number of shots on target and/or the match odds are taken into account. With this in mind, the key claim of this paper is that predictions of match statistics, if accurate enough, can be informative about the outcome of the match and, crucially, since the predictions are made in advance, this can aid betting decisions.

Both GAP and BA ratings have been demonstrated to provide a convenient and straightforward approach to the prediction of match statistics. The former, however, has been shown to perform consistently better in terms of predicting match outcomes. A number of other interesting, and perhaps surprising, conclusions have been revealed. Notably, in the prediction of match results, the most informative observed statistics do not coincide with the most

informative predicted statistics. Whilst the number of shots on target was found to be the most informative observed statistic, the most informative predicted statistic was found to be the number of shots off target. As pointed out earlier in the paper, this can likely be explained by the fact that the information in the predicted statistics reflects both the importance of the statistic itself, in terms of the match outcome, and the accuracy of the prediction of that statistic. That there is agreement on this between GAP and BA ratings provides further evidence for this claim.

The observation above has interesting implications for the philosophy of sports prediction. The importance of match statistics and, in particular, statistics such as expected goals that are derived from match events is becoming clear. The aim of expected goals can broadly be considered to be to estimate the expected number of goals a team ‘should’ score, given the location and nature of the shots it has taken. A shot taken close to the goal and at a favourable angle has a high chance of being successful and therefore contributes more to a team’s expected goals than a shot that is far away and from which it is difficult to score. As such, expected goals ought to reflect the likelihood of each match outcome better than traditional statistics like the number of shots on target. The results in this paper, however, suggest that it is not necessarily the case that predictions of the number of expected goals by each team would outperform predictions of, or ratings based on, other statistics. Interesting future work would therefore be to predict the number of expected goals in a similar way to that demonstrated in this paper to assess the effect on the forecasting of match outcomes.

The results in this paper inspire a number of future avenues for research. There is a wide and growing range of betting markets available for football matches and GAP ratings may be useful in informing such bets. This has already been shown by Wheatcroft (2020) in the over/under 2.5 goal market but could also be applied to other markets such as Asian Handicap, the number of shots taken in a match, half time results and many more. The philosophy demonstrated in this paper could also be applied to other sports. For example, in ice hockey, GAP ratings could be used to estimate the number of shots at goal, whilst, in American Football, they could be used to predict the number of yards gained by each team in the match.

Another interesting feature of the results presented in this paper is the decline in profit over the last few seasons. This was briefly discussed in the results section and it was suggested that betting odds may now incorporate more information than at the beginning of the data set. It would be interesting to investigate this further.

This paper demonstrates a new way of thinking about match statistics and their relationship with the outcomes of football matches and sporting events in general. It is hoped that this can help provide a better understanding of the role of match statistics in sports prediction and GAP ratings provide a straightforward and intuitive way in which to do this.

References

Anderson, D. and Burnham, K. 2004, Model selection and multi-model inference, Second. NY: Springer-Verlag, 63(2020), 10.
Baker, R. D. and McHale, I. G. 2015, Time varying ratings in association football: the all-time greatest team is.., Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(2), 481-492.
Boshnakov, G., Kharrat, T. and McHale, I. G. 2017, A bivariate Weibull count model for forecasting association football scores, International Journal of Forecasting, 33(2), 458-466.
Carbone, J., Corke, T. and Moisiadis, F. 2016, The Rugby League Prediction Model: Using an Elo-based approach to predict the outcome of National Rugby League (NRL) matches, International Educational Scientific Research Journal, 2(5), 26-30.
Constantinou, A. C. and Fenton, N. E. 2012, Solving the problem of inadequate scoring rules for assessing probabilistic football forecast models, Journal of Quantitative Analysis in Sports, 8(1).
Constantinou, A. C. and Fenton, N. E. 2013, Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries, Journal of Quantitative Analysis in Sports, 9(1), 37-50.
Constantinou, A. and Fenton, N. 2017, Towards smart-data: Improving predictive accuracy in long-term football team performance, Knowledge-Based Systems, 124, 93-104.
Dixon, M. J. and Coles, S. G. 1997, Modelling association football scores and inefficiencies in the football betting market, Journal of the Royal Statistical Society: Series C (Applied Statistics), 46(2), 265-280.
Dixon, M. J. and Pope, P. F. 2004, The value of statistical forecasts in the UK association football betting market, International Journal of Forecasting, 20(4), 697-711.
Eggels, H. 2016, Expected goals in soccer: Explaining match results using predictive analytics, in The Machine Learning and Data Mining for Sports Analytics workshop, pp. 16.
Elo, A. E. 1978, The rating of chessplayers, past and present, Arco Pub.
Fifa 2018, Revision of the FIFA / Coca-Cola World Ranking, https://resources.fifa.com/image/upload/fifa-world-ranking-technical-explanation-revision.pdf?cloudid=edbm045h0udbwkqew35a. Accessed: 27/04/2019.

FiveThirtyEight 2020a, The complete history of the NFL, https://projects.fivethirtyeight.com/complete-history-of-the-nfl/. Accessed: 16/01/2020.
FiveThirtyEight 2020b, NBA Elo Ratings, https://fivethirtyeight.com/tag/nba-elo-ratings/. Accessed: 16/01/2020.
Goddard, J. 2005, Regression models for forecasting goals and match results in Association Football, International Journal of Forecasting, 21(2), 331-340.
Good, I. J. 1952, Rational Decisions, Journal of the Royal Statistical Society. Series B (Methodological), 14(1), 107-114.
Hansen, P. R., Lunde, A. and Nason, J. M. 2011, The model confidence set, Econometrica, 79(2), 453-497.
Hvattum, L. M. and Arntzen, H. 2010, Using ELO ratings for match result prediction in association football, International Journal of Forecasting, 26(3), 460-470.
Karlis, D. and Ntzoufras, I. 2003, Analysis of sports data by using bivariate Poisson models, Journal of the Royal Statistical Society: Series D (The Statistician), 52(3), 381-393.
Karlis, D. and Ntzoufras, I. 2009, Bayesian modelling of football outcomes: using the Skellam's distribution for the goal difference, IMA Journal of Management Mathematics, 20(2), 133-145.
Kelly Jr, J. 1956, A new interpretation of the information rate, Bell System Technical Journal, 35, 917-926.
Koopman, S. J. and Lit, R. 2015, A dynamic bivariate Poisson model for analysing and forecasting match results in the English Premier League, Journal of the Royal Statistical Society. Series A (Statistics in Society), pp. 167-186.
Lasek, J., Szlávik, Z. and Bhulai, S. 2013, The predictive power of ranking systems in association football, International Journal of Applied Pattern Recognition, 1(1), 27-46.
Lee, A. J. 1997, Modeling scores in the premier league: is Manchester United really the best?, Chance, 10(1), 15-19.
Ley, C., Van de Wiele, T. and Van Eetvelde, H. 2019, Ranking soccer teams on the basis of their current strength: A comparison of maximum likelihood approaches, Statistical Modelling, 19(1), 55-73.
Maher, M. J. 1982, Modelling association football scores, Statistica Neerlandica, 36(3), 109-118.
Rathke, A. 2017, An examination of expected goals and shot efficiency in soccer.
Roulston, M. S. and Smith, L. A. 2002, Evaluating probabilistic forecasts using Information Theory, Monthly Weather Review, 130(6), 1653-1660.
Rue, H. and Salvesen, O. 2000, Prediction and retrospective analysis of soccer matches in a league, Journal of the Royal Statistical Society: Series D (The Statistician), 49(3), 399-418.
Sullivan, C. and Cronin, C. 2016, Improving Elo rankings for sports experimenting on the English Premier League, Virginia Tech CSx824/ECEx424 technical report.
Suznjevic, M., Matijasevic, M. and Konfic, J. 2015, Application context based algorithm for player skill evaluation in MOBA games, in 2015 International Workshop on Network and Systems Support for Games (NetGames), IEEE, pp. 1-6.
Wheatcroft, E. 2019, Evaluating probabilistic forecasts of football matches: The case against the Ranked Probability Score, arXiv preprint arXiv:1908.08980.
Wheatcroft, E. 2020, A profitable model for predicting the over/under market in football, International Journal of Forecasting.

A. Akaike’s Information Criterion (AIC)

Akaike’s Information Criterion (AIC) weighs the likelihood of a model against the number of estimated parameters to provide an indication of the fit of the model out-of-sample. In the context of predicting football match outcomes, AIC is given by

$$\mathrm{AIC} = -2\log(\hat{L}) + 2k \qquad (21)$$

where $k$ is the number of estimated parameters and $\hat{L}$ is the maximised likelihood given by

$$\hat{L} = \prod_{i=1}^{n} p_i(Y_i) \qquad (22)$$

where $p_i(Y_i)$ is the probability placed on the outcome $Y_i$ in game $i$.
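The calculation in equations (21) and (22) can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the paper: the function name and the probabilities are hypothetical, and the likelihood is taken to be the product of the probabilities placed on the observed outcomes, accumulated as a sum of logarithms to avoid numerical underflow over many games.

```python
import math


def aic(outcome_probs, k):
    """AIC = -2 log(L_hat) + 2k, where L_hat is the product of the
    probabilities the model placed on the outcomes that occurred."""
    # Summing logs is equivalent to taking the log of the product.
    log_likelihood = sum(math.log(p) for p in outcome_probs)
    return -2.0 * log_likelihood + 2.0 * k


# Hypothetical probabilities two models placed on the observed outcomes of four games.
probs_simple = [0.40, 0.35, 0.50, 0.30]  # model with 1 estimated parameter
probs_rich = [0.45, 0.40, 0.52, 0.33]    # model with 3 estimated parameters

aic_simple = aic(probs_simple, k=1)
aic_rich = aic(probs_rich, k=3)

print(f"AIC (simple): {aic_simple:.3f}")
print(f"AIC (rich):   {aic_rich:.3f}")
```

In this made-up example the richer model places slightly more probability on every observed outcome, yet its AIC is higher because the penalty of 2 per extra parameter outweighs the gain in likelihood. The model with the lower AIC would be preferred.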
