Soccer Goals Assessment for Betting
in soccer matches
A Master’s Thesis
by
Rasmus B. Olesen
Resume
This report documents the research and results made during a master’s thesis in Machine
Intelligence. The topic of the report is sports betting and the automatic assessment of
the total number of goals in soccer matches.
The goal of the project is to develop, examine and evaluate proposed assessors, in order
to determine whether it is possible to create a probability assessor which, at a minimum,
can match the bookmakers’ assessments of the total number of goals in soccer
matches. Secondarily, it has been examined whether it is possible, using defined betting
strategies and a probability assessor, to bet at bookmakers and earn a profit.
This project proposes a total of three different probability assessors. The gamblers’
approach uses the empirical probability in historical matches to assess the probability that
a soccer match will have more or fewer than 2.5 goals. The Poisson approach uses a calculated
expected number of goals for a match as the mean of a Poisson distribution, which
forms a probability distribution over the number of goals. The third approach is that
of Dixon-Coles, which in the past has shown good results in predicting the outcome of
matches. It utilizes historical match data to form offensive and defensive strength measures
to determine a probability distribution for the possible results of a match. These three
approaches are measured and compared to the assessments of the bookmakers. In this
report, formulas have been derived for determining the bookmakers’ probability assessment
for over or under 2.5 goals, using either the odds for the two total goal outcomes or
a combination of odds data for other over/under lines.
The assessors are in turn evaluated based on the scores achieved using an absolute scoring
rule, where each assessment is assigned a score equal to the logarithm of the probability
assessed for the observed outcome of the event. An assessor’s total score is its average
log score over a total set of matches.
The secondary part of the project is to evaluate two different betting strategies. The
first uses the expected value of a bet to determine if a bet should be placed, using the
known historical odds data and a probability assessor’s assessment of a match. The second
is a rule-based approach which uses the distance between the expected number
of goals and the offered odds to determine if a bet should be placed. The strategies
are evaluated on the basis of their ability to generate a profit and the total return on
investment over a set of bets.
The parameters for each of the assessors have been tuned using a training data set
containing a total of four and a half seasons of matches. Using the average log score as a
measure, the best parameter settings for each of the assessors have been found. These
settings were used to evaluate the assessors on a test data set containing half a season.
The results show that the bookmakers’ assessments are better than those of the proposed
assessors. Of the three proposed assessors, the gamblers’ and Poisson approaches were, a bit
surprisingly, the better ones. The Dixon-Coles approach was the worst of the four in the
larger part of the tests. In order to establish the statistical significance of the results
found, hypothesis testing using the Wilcoxon Signed-Rank Test has been used. These tests
showed that none of the three proposed assessors were significantly better than the bookmaker,
nor were any of them significantly better than the other assessors. In one out of three
tests, it was determined that the bookmakers’ assessments were significantly better than
those of the Poisson and Dixon-Coles approaches.
The evaluations of the betting strategies gave irregular results. There was no consistent
performance by any of the value betting strategies (using the proposed assessors), nor
by the threshold strategy. In some of the strategy runs, some of the strategies, primarily
value betting using the Dixon-Coles assessor and the threshold strategy, showed
very promising results with a very high net profit. However, the inconsistency, with strongly
fluctuating net results and returns on investment, leads to the conclusion that none of the
strategies would be able to create a profit over a longer period of time. If an even better
probability assessor could be modeled, perhaps the value betting strategy could return
a profit.
This report concludes that, with the approaches taken, it was not possible to create
probability assessments which were better than those of the bookmakers. However, the
results show that it is possible to almost match them. This leads to a possible discussion
as to whether a cheap automatic probability assessor, with assessments almost as good
as those of a human bookmaker, could replace an expensive human bookmaker. This report
proposes possible additions to the assessors evaluated in the project, in order for them
to get even closer to the bookmakers’ evaluations. Although it was not possible to fully
match or beat the bookmaker, the results of this report indicate that it may be
possible to create an automatic assessor which, possibly with some additions, could be
used as a replacement for a human bookmaker when setting the odds on sporting events.
Department of Computer Science
Selma Lagerlöfs Vej 300
DK-9220 Aalborg Ø
Telephone: (45) 9635 8080
Telefax: (45) 96359798
http://www.cs.aau.dk
Title: Assessing the number of goals in soccer matches
Topic: Machine Intelligence
Project period: DAT6, spring 2008, February 4th - June 12th
Project group: d531a
Member of the group: Rasmus B. Olesen
Supervisor: Manfred Jaeger
Number of copies: 3
Pages: 75
Synopsis:
This project proposes a number of models for assessing the number of goals scored in a
soccer match. The motivation lies in the challenge of automating the task of setting odds
for certain types of goal bets in soccer betting. The proposed models use historical result
data only, each in its own way, to assess the probabilities for the number of goals. The
models use empirical probability, Poisson distributions and the concepts of offensive
strength and defensive weakness. The proposed assessors are measured against each other
and compared to the assessments made by actual bookmakers. The evaluation is based on
hypothesis testing using scoring rules. The results of this report lead to the conclusion
that it is very difficult to create automatic probability assessors which can outperform
the bookmakers. However, it is possible, through rather simple methods, to create
assessments which are very close to those of the bookmakers.
Preface
This project is a master’s thesis, made at Aalborg University in the Machine Intelligence
department. The topic of the project is automatic assessment of the probabilities of the
number of goals occurring in a soccer match.
In the course of this project, knowledge and data have been provided by external sources.
I therefore give great thanks to:
• Klaus Rasmussen for the initial inspiration for the project, through the threshold
betting strategy concept.
• Frederik Skov, Scandic Bookmakers, and Søren Hansen, Danbook, for their input
on assessing the number of goals in soccer matches.
A CD containing the code and implementation has been enclosed with the report.
Contents
1 Introduction
  1.1 Motivation
  1.2 Goals
  1.3 Report Structure
2 Sports Betting
  2.1 Calculating Odds
    2.1.1 European Odds
    2.1.2 Asian Odds
    2.1.3 Betvalue
  2.2 Candidate Assessors
    2.2.1 Gamblers’ Assessment
    2.2.2 Poisson Distribution Assessment
    2.2.3 Dixon-Coles Assessment
    2.2.4 Bookmakers’ Assessment
    2.2.5 Assessor Comparison
  2.3 Betting Strategies
    2.3.1 Value Betting Strategy
    2.3.2 Threshold Strategy
  2.4 Summary
3 Theoretical Concepts
  3.1 Expected Value and Loss
  3.2 Poisson Distribution
  3.3 Scoring Assessors
    3.3.1 Scoring Rules
  3.4 Hypothesis Testing
    3.4.1 Wilcoxon Signed-Rank Testing
4 Data
  4.1 Result Data Set
  4.2 Odds Data Set
6 Bookmakers Prediction
  6.1 Calculating Probabilities
    6.1.1 Asian Lines
    6.1.2 Verifying Formulas
  6.2 Bookmaker Scoring
  6.3 Test Results
7 Gamblers’ Approach
  7.1 Gamblers’ Assessment
  7.2 Evaluating Gamblers Assessment
  7.3 Optimal Number of Games
    7.3.1 Test Results
8 Poisson Assessment
  8.1 Goal Histograms
    8.1.1 Poisson Assessment
    8.1.2 Optimal Number of Matches
  8.2 Test Results
9 Dixon-Coles Approach
  9.1 Dixon-Coles Assessment
  9.2 Parameter Calculation
    9.2.1 Optimizing Local Parameters
    9.2.2 Fade-out Factor
  9.3 Test Results
11 Results
  11.1 Assessor Evaluation
    11.1.1 Test Settings
    11.1.2 Preliminary Results
    11.1.3 Assessor Scores
    11.1.4 Significance Test
12 Conclusion
  12.1 Project Evaluation
    12.1.1 Assessor Performance
    12.1.2 Betting Strategy Results
    12.1.3 Assessors as Bookmakers
  12.2 Future Work
A Interviews
  A.1 Gambler Interview
  A.2 Bookmaker Interview
    A.2.1 Frederik Skov, Scandic Bookmakers
    A.2.2 Søren Hansen, Danbook.com
Chapter 1
Introduction
Since the betting industry went online in the 1990s, there has been tremendous growth
in all areas. More and more bookmakers emerge, now with a total of several hundred
providers. As the turnover grows and the competition increases, the betting market is
constantly evolving and many attempts are made towards lowering costs and maximizing
profit.
1.1 Motivation
The concept of gambling has been around for centuries, evolving from chance games
with dice to modern day casinos. In the late 19th century the predecessors of present
day bookmakers saw the light of day, being the first step on the way to the multi-billion
dollar industry that the betting industry is today. For the remainder of this report,
the term betting industry refers to the market of bookmakers with focus on gambling
on sporting events, while the term betting refers to sports gambling. The betting industry
rests on the conflicting interests of bookmakers and customers, who both want to earn
money. In Chapter 2 the structure of sporting event betting is introduced, and the
mathematical conflict between bookmakers and customers is explained in detail. Needless to
say, customers seek to win money away from bookmakers, while bookmakers want
to create a steady return in the long run. Both sides will look for ways to improve their
possibilities of achieving their goals, which motivates investigating how this
could be done. From a customer’s point of view, it would most certainly be welcome if
a “machine” for finding good bets existed, which could guarantee a constant profit. To
believe that it is possible to create such a mechanism is naive. However, it is nonetheless
possible to generate a positive return on investment by betting at the bookmakers (see
Appendix A.1). It is interesting to look into formalizing the methods of a professional
gambler, to see if algorithms can be devised for evaluating odds offers and deciding
whether a bet should be made.
From the bookmakers’ point of view, a tool for determining (close to) correct probability
distributions and setting odds on sporting events would be a very strong instrument.
Bookmakers have employees who are constantly alert for news from the sports world, to
always be a step ahead when offering odds. Being well informed is a time-consuming
task, and expensive for a bookmaker with a large team of odds setters. If the employees’
task of creating the odds could be eased, then time and money could be saved. If it were
possible to create applications strong and good enough, the odds setters could be out of
a job. It is not as simple as that, but it is interesting to see how far one can
go towards creating tools for determining probabilities and setting odds.
1.2 Goals
In sports betting the largest sport is soccer, and the best known bet type is outcome
betting: “Who wins the match between Arsenal and Manchester United?”. The intense
competition between the great number of bookmakers has over the last decades caused
the development of new betting types. For soccer a very popular betting type is “Total
goals”, which is betting on the total number of goals in a specific match, e.g. “Over or
under 2.5 goals?”. Any soccer team can in some way be said to have a tendency regarding
the goals scored in its matches. For the top teams it can (in most cases) be said that they
are good at scoring goals, and at keeping the opponents from scoring. For a bottom team
the opposite can be said: they are poor at scoring goals and at defending. If two
teams which are good at scoring goals, and mediocre at defending, are to play each other,
it would be a fair assumption that there is a good chance of a lot of goals in the
match. The goal of this project is to investigate, through a number of different approaches,
the correlation between a soccer team’s historical results and the probability distribution
for the number of goals in a given soccer match. The examination of soccer result and
odds data is to lead to the establishment of a model for assessing the probability of the
number of goals scored in a soccer match. The model will use only historical match result
data, and has no other prior information on the match which it is assessing. The goal
of this project is to put forth a number of candidate assessors and, through evaluation,
establish the possibility of creating an automated assessment of the number of goals in
a soccer match which can at least match those of the bookmakers.
1.3 Report Structure
and implementation of the four assessors used in the project, while chapter 10 defines
the betting strategies. In all of these chapters, the training of parameter settings
has been carried out, in order to optimize the performance of the assessors and strategies
for the final tests. Chapter 11 presents the results found for both the assessors and betting
strategies, and discusses which of these have shown the best performance. Chapter 12
reflects on the project, puts forth possible future additions to the models to improve
the predictions, and concludes on the project goals.
Chapter 2
Sports Betting
The goal of this project is primarily to create a prediction model which can assess the
probabilities of the number of goals in a soccer match as well as the bookmakers do.
Secondarily the goal is to devise a betting strategy which, in the long run, can at least
break even when betting at bookmakers. In order to create such a model and strategy, one
must fully understand the mathematics behind sports betting odds and the mechanisms
which influence them. In this chapter the basic theoretical background for understanding
sports betting is presented, and various concepts related to this are accounted for.
Finally, previous work on the subject of odds assessment and betting is discussed, with
regard to its possible utilization in this project.
Bookmakers are basically companies trying to make money through sports wagers.
Therefore they operate with a theoretical payback percentage when offering odds to
their customers. The theoretical payback is set by the bookmaker and is the percentage
of the turnover on a betting event which is expected to be paid back to the customers.
The payback is less than 100%, normally around 90-95%. The higher the percentage,
the lower the margin of theoretical profit for the bookmaker and the higher the odds.
The odds for an outcome can be calculated as:
$$\mathit{Odds}_i = \mathit{tpb} \cdot \frac{1}{P(\mathit{Outcome} = i)},$$
where $\mathit{Odds}_i$ is the odds for outcome $i$, and $\mathit{tpb}$ is the bookmaker’s
theoretical payback. Odds calculated with a payback of 1 (100%) are called fair odds,
since there is no theoretical advantage.
For a soccer match with the outcomes home, draw and away, with a probability distribution
of 60%, 25% and 15% respectively, the odds can be calculated. Using the odds formula
above with a theoretical payback of 100%, the odds for the respective outcomes would be
1.67, 4.00 and 6.67. If instead a payback of 92% is used, the odds would be 1.53, 3.68
and 6.13, which is a rather big difference. If one knows the theoretical payback for an
event and the odds of one of the outcomes, the bookmaker’s assessment of the probability
of the outcome can be found:
$$P(\mathit{Outcome} = i) = \frac{\mathit{tpb}}{\mathit{Odds}_i}$$
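If one has the odds for all possible outcomes for an event, the theoretical payback can
be calculated:
$$\mathit{tpb} = \frac{1}{\sum_{i} \frac{1}{\mathit{Odds}_i}},$$
where the sum runs over all possible outcomes $i$ of the event.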
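The relations between probabilities, odds and theoretical payback can be checked numerically. The following is a minimal Python sketch using the 60%/25%/15% example from the text; the function names are illustrative assumptions and are not taken from the project's implementation.

```python
def odds_from_probability(p, tpb=1.0):
    """Odds for an outcome with probability p under theoretical payback tpb."""
    return tpb / p


def theoretical_payback(odds):
    """Theoretical payback implied by the odds of all outcomes of an event."""
    return 1.0 / sum(1.0 / o for o in odds)


def probability_from_odds(o, tpb):
    """Bookmaker's assessed probability for an outcome offered at odds o."""
    return tpb / o


probs = [0.60, 0.25, 0.15]  # home, draw, away (example from the text)
fair = [odds_from_probability(p) for p in probs]
offered = [odds_from_probability(p, tpb=0.92) for p in probs]

print([round(o, 2) for o in fair])     # [1.67, 4.0, 6.67]
print([round(o, 2) for o in offered])  # [1.53, 3.68, 6.13]

# The payback and the probabilities can be recovered from the offered odds.
tpb = theoretical_payback(offered)
recovered = [probability_from_odds(o, tpb) for o in offered]
```

Note that recovering probabilities this way assumes the bookmaker spreads the margin proportionally over all outcomes, which is the assumption built into the formulas above.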
In this report the focus is on the total number of goals in a soccer match and the bets
possible in this area. A very popular variety, which is offered by almost any bookmaker
on almost any match, is “over/under 2.5 goals”. If the total number of goals in a match
is 0, 1 or 2, the outcome is “under 2.5”, and if there are three goals or more,
the outcome is “over 2.5”.
2.1. CALCULATING ODDS
the result was 1-1, the bet result is 0.50-1 and the bet is lost. The Asian variety of odds
always has two outcomes, and the application of the handicap ensures the possibility
of making a more “even” match-up. Suppose team A is a very big favorite, with a 75% chance
of winning; with a 92% payback percentage this gives odds of 1.23. By instead assigning A
a handicap of -1.5, and setting $P(A - 1.5)$ to 48%, $\mathit{Odds}_{A-1.5}$ is 1.92, while
$\mathit{Odds}_{B+1.5}$ is 1.77.
If the handicap is -0.00, this means that the team has no actual handicap. However, the
event is still regarded as having only two deciding outcomes. If the match ends in a draw,
the bet is paid back as a win with odds 1.00. Similarly, if A has a handicap of -1.00 and
they win by exactly one goal, the bet is “voided” and paid back.
A very interesting, and complicated, aspect of the Asian handicaps is the possibility of
quarter handicaps. Instead of a handicap of -0.5, it is possible to have a handicap of -0.25,
which is called a split bet, where the stake is divided into two and placed on two separate
bets: one with handicap -0.00 and one with handicap -0.50. When the match result is
in, the two bets are evaluated separately.
The Asian variety is also very common in the over/under market. Instead of a line of 2.5
goals, it is not unusual to see 2.25 or 2.00. In the latter case, a score of exactly two goals
would result in a void bet, while in the case of two goals in an over 2.25 line bet, the
result would be a half loss (refund on the half of the bet on over 2.00 goals, and loss on
over 2.50 goals). In an under 2.25 line bet, a result of two goals would yield a half win
(refund on under 2.00 goals, and win on under 2.50 goals). Table 2.1 shows a win/loss
explanation for the most commonly used lines for the possible goal outcomes.
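The settlement of whole, half and quarter lines described above can be sketched in a few lines of Python. The function names and the numeric encoding of the results (1 for a win, 0.5 for a half win, 0 for a void bet, -0.5 for a half loss, -1 for a loss) are assumptions made for illustration only.

```python
def settle_single_line(total_goals, line, side):
    """Settle a bet on a whole or half goal line: 1 win, 0 void, -1 loss."""
    margin = total_goals - line if side == "over" else line - total_goals
    if margin > 0:
        return 1.0
    if margin < 0:
        return -1.0
    return 0.0  # only possible on whole lines such as 2.00


def settle_asian_total(total_goals, line, side):
    """Settle an over/under bet on an Asian line, including quarter lines.

    A quarter line (e.g. 2.25) is a split bet: half the stake goes on the
    line 0.25 below and half on the line 0.25 above, settled separately.
    """
    quarters = round(line * 4)
    if quarters % 2 == 0:  # whole or half line: a single bet
        return settle_single_line(total_goals, line, side)
    lower = settle_single_line(total_goals, line - 0.25, side)
    upper = settle_single_line(total_goals, line + 0.25, side)
    return 0.5 * (lower + upper)


print(settle_asian_total(2, 2.00, "over"))   # 0.0  (void)
print(settle_asian_total(2, 2.25, "over"))   # -0.5 (half loss)
print(settle_asian_total(2, 2.25, "under"))  # 0.5  (half win)
```

The return value only classifies the result; turning it into a payout would additionally require the stake and the odds of the bet.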
When placing a bet where several lines are offered, choosing the “correct” line can be
crucial, as can be seen from the table. More important, however, are the odds at which
you bet. A bet must have value, in order for a gambler to win in the long run.
2.1.3 Betvalue
The term betvalue is often used in discussions about bets between gamblers. The reason
for bookmakers making fortunes is that, due to the margin achieved through the theoretical
payback, most gamblers place bets which are under value, meaning that in the
long run the bookmaker wins. At a casino, for instance, all games have rules for how
the game plays out and how winnings are paid. These rules are carefully set, so that
the casino in the long run will make money. A single gambler can very well get lucky
and score a big win, but were he to carry on playing he would in the end have less
money than when he started. We define the betvalue for the outcome $i$, $BV_i$, as:
$$BV_i = \mathit{Odds}_i \cdot P(\mathit{outcome} = i) \tag{2.2}$$
Here $\mathit{Odds}_i$ is the odds for outcome $i$, and $P(\mathit{outcome} = i)$ is the
assessed probability that the outcome will happen. A bet is then said to have value if the
odds and probability together yield a betvalue $BV > 1$.
Roulette is a good example of this. The board is a spinning wheel with 37 numbers
on it: 18 are marked as black numbers, 18 are marked as red and one is green. It is the
single green number that does the trick. If a gambler plays “red”, and the color comes
out, he wins an amount equal to the stake placed. He doubles up. If the color comes out
black, he loses. If the color comes out green, he loses. The chance of winning is therefore
$\frac{18}{37}$ and the odds are 2.00. The betvalue is calculated:
$$BV = \frac{18}{37} \cdot 2.00 = \frac{36}{37}$$
Since $BV < 1$, the bet can be said to be under value, and should not be placed. At least
not if the intention is to win.
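As a small illustration of equation 2.2, the roulette example can be reproduced with exact fractions. This is a sketch only; the helper names `betvalue` and `has_value` are assumptions, not names from the project.

```python
from fractions import Fraction


def betvalue(odds, probability):
    """Betvalue BV = odds * assessed probability (equation 2.2)."""
    return odds * probability


def has_value(odds, probability):
    """A bet has value when its betvalue exceeds 1."""
    return betvalue(odds, probability) > 1


# Roulette: 18 red numbers out of 37, paid at odds 2.00.
roulette_bv = betvalue(Fraction(2), Fraction(18, 37))
print(roulette_bv)                              # 36/37
print(has_value(Fraction(2), Fraction(18, 37)))  # False
```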
2.2. CANDIDATE ASSESSORS
matches. If, e.g., a team has played 11 matches, with 7 of them having more than 2.5
goals, the probability is assumed to be $\frac{7}{11} = 0.636$. The gamblers’ assessment
approach uses results of prior matches to count the instances of matches with a certain
number of goals, and uses the empirical probabilities as the probability assessment of the
number of goals in a match.
This very simple assessment is the common approach of a lot of gamblers for identifying
over/under bets, hence the name of this approach. The bookmaker is aware of this, and
the odds set for a match take this into account (see Appendix A.2 for interview).
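The counting step of the gamblers' assessment can be sketched as follows. The match history is hypothetical data chosen to reproduce the 7-out-of-11 example, and the function name is an assumption; how the histories of the two teams in a match are combined is not covered by this sketch.

```python
def empirical_over_probability(total_goals_history, line=2.5):
    """Fraction of past matches whose total number of goals exceeded the line."""
    if not total_goals_history:
        raise ValueError("at least one past match is required")
    over = sum(1 for goals in total_goals_history if goals > line)
    return over / len(total_goals_history)


# Hypothetical total goals in a team's last 11 matches (7 of them over 2.5).
history = [3, 4, 1, 3, 5, 2, 3, 0, 4, 3, 2]
p_over = empirical_over_probability(history)
p_under = 1.0 - p_over
print(round(p_over, 3))  # 0.636
```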
$$P(\mathit{Outcome} = i) = \frac{\mathit{tpb}}{\mathit{Odds}_i}$$
In this way, from a data set containing odds for soccer matches, the bookmakers’ prob-
ability assessment can be calculated. For this project, the bookmakers’ assessment is
an important part of the evaluation process, to see if the other proposed probability
assessors are better than or match the bookmakers’ assessments.
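For a two-way over/under market, the bookmaker's assessment follows from combining the payback formula with the probability formula. The sketch below uses the Pinnacle closing odds for the 2.5 line from the odds data example later in the report (1.971 for over, 1.935 for under); the function name is an assumption made for illustration.

```python
def implied_over_under(odds_over, odds_under):
    """Return (tpb, P(over), P(under)) implied by a pair of over/under odds."""
    tpb = 1.0 / (1.0 / odds_over + 1.0 / odds_under)
    return tpb, tpb / odds_over, tpb / odds_under


tpb, p_over, p_under = implied_over_under(1.971, 1.935)
print(f"payback {tpb:.3f}, over {p_over:.3f}, under {p_under:.3f}")
```

By construction the two probabilities sum to 1, since the bookmaker's margin is removed proportionally from both outcomes.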
2.4 Summary
The foundations for the project have been set, introducing the basic concepts of bookmaking
and sports betting. Based on knowledge about the sports betting industry and
research into prior attempts at predicting soccer scores, four assessment approaches have
been proposed, and will in turn be examined and evaluated. The concept of betting strategies
has been accounted for, and two different strategy types have been presented. In the
following chapters the approaches will individually be implemented and examined, and
the parameters tuned to maximize the performance. Finally the assessors and strategies
will be evaluated on a test data set, and compared to establish the better probability
assessor and the better betting strategy.
Table 2.2: Overview of the four assessors, showing what estimates are made and how
the probability assessment is made for a single match.
Chapter 3
Theoretical Concepts
This chapter serves the purpose of introducing theoretical concepts and terminology used
in this project. Firstly the concept of expected value is defined, followed up by a definition
of the Poisson distribution, while the rest of this chapter is devoted to the evaluation
of predictions by the proposed assessors. Here scoring rules and testing methods are
accounted for.
Definition 1. (Expected Value). For an event with discrete outcomes, the expected value
is [DS]:
$$E[X] = \sum_i p_i x_i,$$
where $p_i$ is the probability of outcome $i$, and $x_i$ is the reward given for outcome $i$.
An example can be a soccer match between two soccer teams, where the probabilities of
the three possible outcomes (home, draw, away) are (0.30, 0.31, 0.39) and the corresponding
odds are (3.00, 3.10, 2.65). The expected values of the three possible bets are:
$$E[\text{Bet} = \text{Home win}] = \sum_i p_i x_i = 0.30 \cdot 3.00 + 0.31 \cdot 0 + 0.39 \cdot 0 = 0.90$$
$$E[\text{Bet} = \text{Draw}] = \sum_i p_i x_i = 0.30 \cdot 0 + 0.31 \cdot 3.10 + 0.39 \cdot 0 = 0.961$$
$$E[\text{Bet} = \text{Away win}] = \sum_i p_i x_i = 0.30 \cdot 0 + 0.31 \cdot 0 + 0.39 \cdot 2.65 = 1.034$$
Of the three possible bets on the outcome, with the mentioned probability assessment,
the only bet with an expected value above the stake of 1, and thereby a positive expected
profit, is the bet on “Away win”.
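The example can be reproduced with a short Python sketch. It assumes a stake of 1 on a single outcome, so the reward is the odds if that outcome occurs and 0 otherwise; the function name is illustrative.

```python
def expected_return(probabilities, odds, bet_on):
    """Expected payout of a unit stake on outcome index bet_on."""
    return sum(
        p * (o if i == bet_on else 0.0)
        for i, (p, o) in enumerate(zip(probabilities, odds))
    )


probabilities = [0.30, 0.31, 0.39]  # home, draw, away
odds = [3.00, 3.10, 2.65]
returns = [expected_return(probabilities, odds, i) for i in range(3)]

# Bets whose expected payout exceeds the unit stake.
value_bets = [i for i, r in enumerate(returns) if r > 1.0]
print(value_bets)  # [2], i.e. only the bet on "Away win"
```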
$$F : (\vec{P}, \vec{D}) \rightarrow \mathbb{R} \tag{3.1}$$
For this project two scoring rules are taken into consideration; quadratic scoring and
logarithmic scoring.
Before presenting the two rules, some notation should be in place. Let $E$ be an event
under assessment, with $n$ mutually exclusive outcomes $(E_1, ..., E_n)$. Let the vector
$\vec{R} = \langle r_1, ..., r_n \rangle$ be a probability assessor’s assessment,
$\vec{P} = \langle p_1, ..., p_n \rangle$ be the true probability distribution and
$\vec{D} = \langle d_1, ..., d_n \rangle$ represent an observation of the event, where
$d_i = 1$ if $E_i$ occurs, and zero otherwise. Here $r_i \geq 0$, $p_i \geq 0$,
$\sum_{i=1}^{n} r_i = 1$ and $\sum_{i=1}^{n} p_i = 1$.
3.3. SCORING ASSESSORS
With a proper scoring rule, the expected score is maximized if, and only if, the assessor’s
assessment $\vec{R}$ is set to the true probability distribution $\vec{P}$. We define the
properness property:
Definition 4. (Properness). Let $\vec{P}$ be the true probability distribution for an event
with $n$ outcomes, and $\vec{R}$ be a probability assessor’s assessment. Let $\vec{D}_i$ be
an observation of the $i$’th outcome, where the $i$’th entry is 1 and all other entries are 0.
A scoring rule $S$ is said to be proper if the following holds:
$$E[S(\vec{P}, \vec{D})] \geq E[S(\vec{R}, \vec{D})]$$
In the following sections, two scoring rules will be presented, both of them being proper [WM].
If the $j$’th outcome is observed, then $d_j$ is 1, and all other entries in $\vec{D}$ are 0.
This yields:
$$Q_j(\vec{R}, \vec{D}) = 1 - (r_j - 1)^2 - \sum_{i \neq j} (r_i - 0)^2 = 2 r_j - \sum_{i} r_i^2 \tag{3.3}$$
From 3.3 it can be seen that setting $r_j$ to 1 maximizes the score. The expected
score for the quadratic scoring rule is:
$$E(Q) = \sum_j p_j \Big( 2 r_j - \sum_i r_i^2 \Big) \tag{3.4}$$
or
$$E(Q) = \sum_j p_j^2 - \sum_j (r_j - p_j)^2 \tag{3.5}$$
The expected score is therefore maximized if $\vec{R}$ is set to $\vec{P}$, and the quadratic
scoring rule is therefore proper.
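The step from 3.4 to 3.5 is not spelled out in the text. A possible intermediate derivation, using only $\sum_j p_j = 1$, is:

$$
\begin{aligned}
E(Q) &= \sum_j p_j \Big( 2 r_j - \sum_i r_i^2 \Big)
      = 2 \sum_j p_j r_j - \sum_i r_i^2 \\
     &= \sum_j p_j^2 - \sum_j \left( r_j^2 - 2 r_j p_j + p_j^2 \right)
      = \sum_j p_j^2 - \sum_j (r_j - p_j)^2 .
\end{aligned}
$$

Since the first term does not depend on $\vec{R}$ and the second term is a sum of squares, the expression is largest exactly when $r_j = p_j$ for all $j$.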
The quadratic scoring rule takes the assessment of all outcomes into consideration when
scoring. Two assessors $A_1$ and $A_2$ have the probability assessments
$\langle 0.4, 0.3, 0.3 \rangle$ and $\langle 0.45, 0.5, 0.05 \rangle$
respectively. The first outcome is then observed, yielding a quadratic score of 0.46 for
$A_1$ and 0.445 for $A_2$. One would expect $A_2$, which has the higher probability for
the observed outcome, to receive the higher score. However, the results are due to the fact
that the quadratic scoring rule penalizes the probabilities assigned to the unobserved
outcomes. If an assessor has an uneven assessment for the unobserved outcomes, it is
penalized more than if it had an even distribution over the unobserved
outcomes. A significantly uneven distribution over the unobserved outcomes will blur
the ability to evaluate the score without looking at the single assessment.
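The two quadratic scores in the example can be verified with a minimal Python sketch of equation 3.3. The function name is an assumption; outcomes are indexed from 0, so the first outcome has index 0.

```python
def quadratic_score(assessment, observed):
    """Quadratic score 2*r_j - sum_i r_i^2 when outcome index `observed` occurs."""
    return 2.0 * assessment[observed] - sum(r * r for r in assessment)


a1 = [0.4, 0.3, 0.3]
a2 = [0.45, 0.5, 0.05]
score_a1 = quadratic_score(a1, 0)
score_a2 = quadratic_score(a2, 0)
print(round(score_a1, 3), round(score_a2, 3))  # 0.46 0.445
```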
The fact that only $r_j$ is taken into account means that the higher the probability assessed
for the observed outcome, the higher the score. Reusing the example from before, the
logarithmic scoring rule gives $A_1$ a score of -0.916 and $A_2$ a score of -0.799. Here
$A_2$ is, as one would expect, regarded as the better assessor.
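The logarithmic scores, and the average log score used to compare assessors over a set of matches, can be sketched as follows. The natural logarithm is assumed, which reproduces the figures in the example; the function names are illustrative.

```python
import math


def log_score(assessment, observed):
    """Logarithmic score: log of the probability assessed for the observed outcome."""
    return math.log(assessment[observed])


def average_log_score(assessments, observations):
    """Average log score of an assessor over a set of assessed events."""
    scores = [log_score(a, o) for a, o in zip(assessments, observations)]
    return sum(scores) / len(scores)


a1 = [0.4, 0.3, 0.3]
a2 = [0.45, 0.5, 0.05]
print(round(log_score(a1, 0), 3))  # -0.916
print(round(log_score(a2, 0), 3))  # -0.799
```

An assessment of probability 0 for an outcome that occurs gives a score of minus infinity (here a `ValueError` from `math.log`), so an assessor should never assign probability 0 to a possible outcome.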
3.4. HYPOTHESIS TESTING
Among other things, hypothesis testing can be used for validating the significance of the
difference in performance between two classification or prediction models.
In hypothesis testing two contrasting hypotheses are often used: the null hypothesis and
the alternative hypothesis. The procedure of testing hypotheses can be seen as a four
step procedure:
3. For the data, compute the value of θ, and determine the p-value using the probability
distribution of the test statistic.
4. Define a significance level, which is used for determining in which range the
θ values lead to rejection of the null hypothesis.
The null hypothesis is often formulated as an unwanted result, while the
alternative hypothesis is the result one is actually looking for. The objective of the test
is then to reject the null hypothesis.
When performing hypothesis testing, there are two types of errors which one can make.
A type 1 error is rejecting a true null hypothesis, while a type 2 error is accepting a
false null hypothesis. For this project the hypotheses which are to be tested regard one
assessor’s performance against another’s (see Section 5.1.2). The testing and evaluations
in this report must therefore keep the risk of type 1 and type 2 errors low. The Wilcoxon
Signed-Rank Test [Wil] is used for this purpose.
null hypothesis µ = 0, the Wilcoxon rank test can be used to determine whether the
hypothesis should be rejected or not. By summing the ranks for each sign, a positive
rank sum and a negative rank sum are found. Under the null hypothesis the two rank sums
are expected to be equal. If the positive rank sum is the smaller, the null hypothesis will
be rejected at a predetermined level of significance, in favor of the alternative hypothesis
µ > 0. If the negative rank sum is the smaller, µ < 0 would be the alternative if the null
hypothesis was rejected.
In statistics the confidence of a conclusion is highly related to the estimated significance.
If a result is said to be significant, it means that, with a high probability, the result is not
faulty, i.e. the result is correct with a high probability. In order to decide if
this is so, a significance level, α, is used. This value could be 0.05, meaning that the
observation is significant, or 0.01, meaning that the observation is highly significant [Kee95].
These are commonly used significance levels, but normally α should be chosen to suit
the test at hand. For determining if a test result is significant, the p-value is found
and compared to α. The p-value is the smallest level of significance for which the
null hypothesis would be rejected.
[Kee95] states that the minimum rank sum, T, is approximately normally distributed
with mean
$$\mu = \frac{N(N+1)}{4},$$
and variance
$$\sigma^2 = \frac{N(N+1)(2N+1)}{24}.$$
From this, Z can be calculated as the minimum rank sum, T, fitted to a standard
normal distribution:
$$Z = \frac{T - \mu}{\sigma}$$
From the Z value, the cumulative standard normal distribution can be used to find
the exact probability. The test is then said to reject the null hypothesis if
$|Z| \geq \Phi^{-1}(1 - \alpha/2)$. The p-value is the smallest value of α for which this
statement is true.
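The normal approximation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `wilcoxon_normal_approx` is hypothetical, zero differences are dropped, tied absolute differences receive their average rank, and the p-value is taken as two-sided.

```python
import math

def wilcoxon_normal_approx(differences):
    """Return (T, Z, two-sided p-value) for a list of paired differences."""
    # Assumption: zero differences carry no sign information and are dropped.
    d = [x for x in differences if x != 0]
    n = len(d)
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(d[order[end + 1]]) == abs(d[order[pos]]):
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = avg_rank
        pos = end + 1
    positive_sum = sum(r for r, x in zip(ranks, d) if x > 0)
    negative_sum = sum(r for r, x in zip(ranks, d) if x < 0)
    t = min(positive_sum, negative_sum)           # minimum rank sum T
    mu = n * (n + 1) / 4                          # mean of T
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t - mu) / sigma
    # Standard normal CDF expressed through the error function.
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return t, z, 2 * (1 - phi)

t, z, p = wilcoxon_normal_approx([1, 2, 3, 4, 5, 6, 7, 8, 9, -10])
```

The null hypothesis would be rejected at level α when the returned p-value is below α.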
Chapter 4
Data
In the task of creating and examining prediction models for assessing soccer matches in
this project, two data sets are used; a data set containing soccer match results, and a
data set containing the odds corresponding to the result data set.
Table 4.1: Example of database entries for the Danish SAS League
Date        Home   Away  Bookmaker   Over   Under  Line  Offset  Change Date
02-08-2006  Vejle  OB    sbobet.com  1.68   2.282  2.25  0       02-08-2006
02-08-2006  Vejle  OB    sbobet.com  2      1.9    2.75  13      31-07-2006
02-08-2006  Vejle  OB    Pinnacle    1.971  1.935  2.5   0       02-08-2006
02-08-2006  Vejle  OB    Pinnacle    2.01   1.901  2.75  15      31-07-2006
02-08-2006  Vejle  OB    Mansion88   1.952  1.935  2.5   0       02-08-2006
02-08-2006  Vejle  OB    Mansion88   1.971  1.87   2.75  36      31-07-2006
02-08-2006  Vejle  OB    IBCBET      2      1.9    2.5   0       02-08-2006
02-08-2006  Vejle  OB    IBCBET      2      1.9    2.75  42      31-07-2006
Table 4.2: Example of odds database entries for the SAS League
Table 4.2 shows the entries in the odds data set for a single match. For the match between
Vejle and OB, played on August 2nd 2006, four bookmakers offered odds. Notice that
each of the four bookmakers has two entries. Each entry has an Offset value,
which indicates the time at which the data was collected. A value of "0" means that this
is the last odds collected, while "13" is an offset used by Tip-Ex to distinguish different
collected data. In this project the values are used to distinguish opening
from closing odds: "0" means closing odds, and all other values mean opening odds. The last
4.2. ODDS DATA SET
attribute in the table, ChangeDate, is of no importance to this project. For each
entry in the data set there is a Line, and the odds for over and under this
line. For the match in the example each bookmaker has only one line at a given time,
and each of the four has lowered the line and changed the odds accordingly.
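The Offset convention can be illustrated with a small Python sketch. The record type `OddsEntry` and the function `split_opening_closing` are hypothetical names introduced here; only the rule itself (Offset 0 means closing odds, anything else opening odds) comes from the text.

```python
from collections import namedtuple

# Hypothetical record holding the columns of Table 4.2 that matter here.
OddsEntry = namedtuple("OddsEntry", "bookmaker over under line offset")

def split_opening_closing(entries):
    """Group entries per bookmaker into opening and closing odds."""
    result = {}
    for entry in entries:
        slot = result.setdefault(entry.bookmaker, {"opening": [], "closing": []})
        # Offset 0 marks the last collected (closing) odds.
        slot["closing" if entry.offset == 0 else "opening"].append(entry)
    return result

entries = [
    OddsEntry("Pinnacle", 1.971, 1.935, 2.5, 0),
    OddsEntry("Pinnacle", 2.01, 1.901, 2.75, 15),
]
grouped = split_opening_closing(entries)
```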
Chapter 5
The goal of this project is to find the best assessor for determining the probabilities of the
number of goals in a soccer match, and to see if it is possible to create a probability assessor
which can match the bookmakers' assessment with regards to setting odds. The second
part is to examine how the proposed assessors and betting strategies perform on actual
offered odds. In the following sections the evaluation plans for these two evaluations are
presented.
CHAPTER 5. ASSESSOR AND STRATEGY EVALUATION
Figure 5.1: Division of the data set for learning and evaluating the probability assessors.
Table 5.1: The average log score of the bookmakers’ assessment over the fall season of
2006, for each of the leagues.
5.1. ASSESSOR EVALUATION
Figure 5.2: Two individual matches from the test set under evaluation. In (a) the line
before match A indicates the border of the training data matches. In (b) match B is
under evaluation, again with the line drawing the training set border. Notice that match
A has crossed over, from the test set to the training set.
assessment, the local training set is used for calculating the local parameters. In the
assessments made in this project, the local training set will always be a subset of the
global training set, where the match date is prior to that of the match under assessment.
In (b), match B is now under assessment and constitutes the local test set. Notice
that the local training set now contains match A. When a match has been assessed
in the evaluation process, it simply crosses over and becomes a part of the local training
set for later matches.
For the final tests in this report, the spring 2007 season is the global test set, and the
training set in Figure 5.1 is the global training set. In Figure 5.3 an assessment scenario
from the final test is shown. Here the local test set is match C, and the local training
set is the global training set and the matches from the test set which have crossed over
after assessment.
Figure 5.3: Test scenario with a global test set containing the spring 2007 season, and
a global training set containing fall 2002 to fall 2006. Notice how the already assessed
matches in the test set become a part of the local training set for later assessments.
5.1.1 Overfitting
When dealing with probability assessor evaluation using a data set partition into training
and test data, there are certain things one must be aware of. Training data is used for
adjusting the parameters of an assessor to create the best probability assessments. A
probability assessor, however, must not only fit the training data well, but also the test
data [TSK].
Figure 5.4 plots two graphs showing the score of a probability assessor as a function of
the number of historic matches used for prediction. Notice that with a low
number of matches, the assessor performs poorly over both the training and test data,
a concept known as underfitting. As more matches are used, the performance over the
training data improves, as does the performance over the test data. However, if the
number of matches is increased even more, the performance over the test data becomes
worse. Mitchell [Mit97] discusses overfitting with respect to classifying instances of
observations, and defines overfitting as follows:
In the setting of soccer match prediction, the concept of overfitting is relevant. In this
project, the probability assessors base their predictions on historical data, i.e. observations
of soccer match results. However, a soccer match played in 1980 cannot
be said to bear as much influence on the prediction of a present-day match as a
match played last week. The parameters of some of the assessors therefore
need to be investigated. For the Dixon-Coles and the bookmakers' assessments, the aspect of
over/underfitting is not deemed relevant and will not be examined. However, it is relevant for the
Figure 5.4: The error rate of a probability assessor over the training and test data
respectively. The error rate is plotted against the number of parameters.
two naive approaches, the Poisson and the gamblers' assessments, where the number of prior
matches on which the prediction is based needs to be examined in order to find a suitable
number. Using too few matches will make the predictions adapt too
closely to the training data, overfitting the model. Using too many
matches, on the other hand, ties the prediction to the bulk of the training data and does not
allow a single new observation a significant impact; the prediction becomes overly general.
For example, a Poisson prediction over 150 soccer matches with an average of 2.51 goals will
not differ much from the prediction using 151 matches, where a new match with three goals has
been added to the training data. The model is then said to be underfit. For both approaches,
the best number of matches to use will be examined. The results of this examination can
be found in Sections 8.1.2 and 7.3.
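As a worked version of the 150-match example, and assuming the mean is a plain average over all matches used, adding one match with three goals changes the average only in the third decimal:

$$\frac{150 \cdot 2.51 + 3}{151} = \frac{379.5}{151} \approx 2.513$$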
In order to say that a probability assessor is better than the bookmakers' assessment, the
null hypothesis needs to be rejected at a specified significance level. The significance
level α chosen for the tests is 0.05, based on [Kee95], which states that a 0.05 significance
level enables the conclusion that a difference is significant.
Chapter 6
Bookmakers Prediction
When trying to establish the quality of a probability assessor, the log score is a good
method of measurement. However, it is not enough to say that the assessor with the
highest average log score is a good assessor for creating soccer match odds. If an assessor
is to be used for setting odds, it must at the very least almost match the quality of the
bookmakers' assessments. In this chapter, the bookmakers' odds data are examined and
used to establish the bookmakers' prediction.
of goals. Where it is quite simple to calculate the odds for over/under 2.5 goals, a
combination of two bets is needed to calculate odds for over/under 2.75. We will use
under 2.75 as an example, which, as mentioned, is a combination of under 2.50 and under
3.00. Equation 6.1 shows the calculation of the two lines respectively.
$$\begin{aligned}
Odds_{Under2.50} &= tpb \cdot \frac{1}{P(\leq 2)} \\
Odds_{Under3.00} &= tpb \cdot \frac{1 - P(3)}{P(\leq 2)}
\end{aligned} \tag{6.1}$$
In the case of exactly three goals, the under 3.00 bet will be void and the bet is refunded.
Therefore the probability of exactly three goals is subtracted from the numerator. The
odds for under 2.75 can now be made, as the average of these two:
$$Odds_{Under2.75} = \frac{1}{2}\left(tpb \cdot \frac{1}{P(\leq 2)} + tpb \cdot \frac{1 - P(3)}{P(\leq 2)}\right) = tpb \cdot \frac{2 - P(3)}{2 \cdot P(\leq 2)} = tpb \cdot \frac{1 - \frac{1}{2}P(3)}{P(\leq 2)}$$
In the same way the odds equations can be found for all other combination lines. The
following table denotes the equations for calculating the odds for different lines based on
the probability distribution over the number of goals. Let tpb be the theoretical payback,
and P (x) the assessed probability for x goals scored in the match, then the odds for the
lines can be calculated as such:
Line        Odds
Under 2.00  $\frac{tpb \cdot (1 - P(2))}{P(0) + P(1)}$
Over 2.00   $\frac{tpb \cdot (1 - P(2))}{P(\geq 3)}$
Under 2.25  $\frac{tpb \cdot (1 - \frac{1}{2}P(2))}{P(0) + P(1) + \frac{1}{2}P(2)}$
Over 2.25   $\frac{tpb \cdot (1 - \frac{1}{2}P(2))}{P(\geq 3)}$
Under 2.75  $\frac{tpb \cdot (1 - \frac{1}{2}P(3))}{P(\leq 2)}$
Over 2.75   $\frac{tpb \cdot (1 - \frac{1}{2}P(3))}{P(\geq 4) + \frac{1}{2}P(3)}$
Under 3.00  $\frac{tpb \cdot (1 - P(3))}{P(\leq 2)}$
Over 3.00   $\frac{tpb \cdot (1 - P(3))}{P(\geq 4)}$
Table 6.1: Using a probability assessment and a theoretical payback one can set odds
for Asian line over/under bets.
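The rows of Table 6.1 can be sketched as a single Python function. This is a minimal illustration, not the implementation used in the project: the names `asian_line_odds` and `poisson_pmf` are hypothetical, the Poisson distribution only serves as an example goal distribution, and the over 2.50 entry assumes the same odds = tpb / probability rule as under 2.50.

```python
import math

def poisson_pmf(k, mean):
    """Illustrative goal distribution; any assessor's P(x) could be used instead."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

def asian_line_odds(p, tpb):
    """p[x] is the assessed probability of exactly x goals, for x = 0..3."""
    p_le1 = p[0] + p[1]
    p_le2 = p_le1 + p[2]
    p_ge3 = 1.0 - p_le2
    p_ge4 = p_ge3 - p[3]
    half2, half3 = 0.5 * p[2], 0.5 * p[3]
    return {
        ("under", 2.00): tpb * (1 - p[2]) / p_le1,
        ("over", 2.00): tpb * (1 - p[2]) / p_ge3,
        ("under", 2.25): tpb * (1 - half2) / (p_le1 + half2),
        ("over", 2.25): tpb * (1 - half2) / p_ge3,
        ("under", 2.50): tpb / p_le2,
        ("over", 2.50): tpb / p_ge3,   # assumption: mirrors the under 2.50 rule
        ("under", 2.75): tpb * (1 - half3) / p_le2,
        ("over", 2.75): tpb * (1 - half3) / (p_ge4 + half3),
        ("under", 3.00): tpb * (1 - p[3]) / p_le2,
        ("over", 3.00): tpb * (1 - p[3]) / p_ge4,
    }

probs = [poisson_pmf(k, 2.8) for k in range(4)]
odds = asian_line_odds(probs, tpb=0.95)
```

With these formulas the under 2.75 odds equal the average of the under 2.50 and under 3.00 odds, as in the derivation preceding the table.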
For a given match, if the line 2.50 is not present, the odds for either of the line pairs
2.75/3.00 or 2.00/2.25 can be used to calculate the probabilities for over/under 2.50. If
6.1. CALCULATING PROBABILITIES
the lines 2.75 and 3.00 are offered, P(3) can be calculated using the
formulas for under 2.75 and under 3.00. Notice that the theoretical paybacks for the two
offered lines are not necessarily the same, and need to be calculated individually. See
Section 2.1 for how to do so. In the following, P(≤ 2) means P(0) + P(1) + P(2):
By isolating P (≤ 2) in each equation, Equation 6.2 and 6.3 can be set equal to each
other. By doing so, P (3) can be found:
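$$P(\leq 2) = \frac{tpb_{Under3.00} \cdot \dfrac{\frac{1}{2} \cdot Odds_{Under3.00} \cdot tpb_{Under2.75}}{Odds_{Under2.75} \cdot tpb_{Under3.00} - \frac{1}{2} \cdot Odds_{Under3.00} \cdot tpb_{Under2.75}}}{Odds_{Under3.00}} \tag{6.5}$$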
Here P(≤ 2) is the probability of under 2.5 goals. The probability for over 2.5 goals can
be found by P(≥ 3) = 1 − P(≤ 2). In the same way, using the odds for over 2.00 and
over 2.25, P(≥ 3) can be found:
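$$P(\geq 3) = \frac{tpb_{Over2.00} \cdot \dfrac{\frac{1}{2} \cdot Odds_{Over2.00} \cdot tpb_{Over2.25}}{Odds_{Over2.25} \cdot tpb_{Over2.00} - \frac{1}{2} \cdot Odds_{Over2.00} \cdot tpb_{Over2.25}}}{Odds_{Over2.00}}$$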
If the over/under odds for 2.50 are not present, and there is no pair of 2.00/2.25 or
2.75/3.00 odds present, then it is not possible to calculate the bookmakers' assessment,
and the match can therefore not be included in the evaluation.
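A minimal Python sketch of recovering the 2.50 probabilities from a line pair is given below. It assumes the odds follow the Under 2.75/Under 3.00 and Over 2.00/Over 2.25 rows of Table 6.1, each with its own theoretical payback; the function names are hypothetical. The round trip at the end builds odds from known probabilities and recovers them again.

```python
def p_under_25_from_under_lines(odds_275, tpb_275, odds_300, tpb_300):
    """P(<=2) from the under 2.75 and under 3.00 odds."""
    # 1 - P(3), found by setting the two expressions for P(<=2) equal.
    one_minus_p3 = (0.5 * odds_300 * tpb_275) / (
        odds_275 * tpb_300 - 0.5 * odds_300 * tpb_275)
    return tpb_300 * one_minus_p3 / odds_300

def p_over_25_from_over_lines(odds_200, tpb_200, odds_225, tpb_225):
    """P(>=3) from the over 2.00 and over 2.25 odds."""
    # 1 - P(2), found in the same way from the two over lines.
    one_minus_p2 = (0.5 * odds_200 * tpb_225) / (
        odds_225 * tpb_200 - 0.5 * odds_200 * tpb_225)
    return tpb_200 * one_minus_p2 / odds_200

# Round trip with made-up numbers: P(<=2) = 0.45 and P(3) = 0.22.
p_le2, p3 = 0.45, 0.22
odds_275 = 0.96 * (1 - 0.5 * p3) / p_le2
odds_300 = 0.95 * (1 - p3) / p_le2
recovered_under = p_under_25_from_under_lines(odds_275, 0.96, odds_300, 0.95)

# Round trip for the over lines: P(>=3) = 0.55 and P(2) = 0.25.
p_ge3, p2 = 0.55, 0.25
odds_200 = 0.95 * (1 - p2) / p_ge3
odds_225 = 0.96 * (1 - 0.5 * p2) / p_ge3
recovered_over = p_over_25_from_over_lines(odds_200, 0.95, odds_225, 0.96)
```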
Table 6.2: The bookmaker IBCBET's odds for over/under bets on Brøndby-FC Nordsjælland
on July 21st 2007, along with the corresponding theoretical payback, for use in the
calculations
By inserting the odds and theoretical paybacks into Formula 6.5, the probability
for under 2.50 goals is found to be P(≤ 2) = 0.43143. Using Formula 2.2, the
bookmaker's odds for under 2.5 can be found from this probability and the theoretical
payback:
$$Odds_{Under2.50} = \frac{tpb_{2.50}}{P(\leq 2)} = \frac{0.9614}{0.43143} = 2.228$$
According to the odds data set, the odds for under 2.5 goals are 2.23, which corresponds
to the found value.
6.3. TEST RESULTS
hold data ranging from fall 2006 to spring 2008, and due to the fact that the spring 2007
data is to be used for testing, it is only possible to examine the bookmakers’ assessment
on the fall season of 2006.
For each of the three leagues, a single bookmaker has been chosen from the odds data set.
The bookmaker has been picked based on the odds provided for the given league.
For the SAS League the bookmaker chosen is 188bet, which is the only bookmaker with
at least two lines for all matches in the SAS League season 2006-07. For the Segunda
Division and Premier League, the bookmaker 10Bet has been chosen. When the assessors for
any of the leagues are compared in the remainder of this report, the chosen bookmaker for
the respective league will be referred to as the bookmaker.
Table 6.3: The average log score of the bookmakers' assessment over the fall season of
2006, for each of the leagues.
Since the results presented in Table 6.3 are only for the fall season of 2006, they cannot
be used in direct comparison with the results for the entire training data (2002 to 2006).
The average log score for a half season is not necessarily representative of several
seasons, since results can vary.
The results found in this section can, however, be used as an indication of the bookmakers'
level. In order to enable comparison, average log scores will be found for the fall season
of 2006 for the other assessors.
Chapter 7
Gamblers’ Approach
The gamblers' approach is the empirical probability of the number of goals in a match
being higher or lower than a specified line. With the results of prior matches known, a
count of the instances of over 2.5 and under 2.5 can be made, in order to establish the
probability of the match ending with a low or high score. In this chapter, the gamblers'
approach will be examined and evaluated. First the approach is explained in detail,
followed by the determination of the best parameter setting. Lastly, preliminary
results for the assessor are presented based on the training data set.
4: Select the latest k away matches for the away team from Matches
5: Count the number of matches, x_awayunder, where the number of goals is less than Line
a score, just as was the case in Algorithm 1 with the bookmakers' assessment scoring.
The calculation of the score for the gamblers' assessment is made in the same way, only
with the minor change that the CalculateOver() and CalculateUnder() functions are
replaced by the GamblersAssess() function from Algorithm 2.
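The counting idea behind the gamblers' approach can be sketched in Python as follows. This is not a transcription of Algorithm 2: the function name `gamblers_assess` is hypothetical, and pooling the home team's and away team's counts with equal weight is an assumption made for the illustration.

```python
def gamblers_assess(home_totals, away_totals, k, line=2.5):
    """Empirical (P(over), P(under)) from the latest k matches of each team.

    home_totals: total goals in the home team's home matches, most recent last.
    away_totals: total goals in the away team's away matches, most recent last.
    """
    recent = home_totals[-k:] + away_totals[-k:]
    # Count the matches where the number of goals is below the line.
    x_under = sum(1 for goals in recent if goals < line)
    p_under = x_under / len(recent)
    return 1 - p_under, p_under

p_over, p_under = gamblers_assess([3, 1, 2, 4], [0, 5, 3, 2], k=4)
```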
7.3. OPTIMAL NUMBER OF GAMES
Figure 7.1: The log score of the gamblers' approach, on all three league training data
sets, plotted against the number of matches used to create the assessment.
again is poor, showing low average log scores. This is due to a low number of contributing
matches, caused by a high number of history matches used. For m=75, there are in fact only
two matches in the training set which contribute to the log score. Notice that the curve
is relatively smooth in the interval from 1 to 60. For larger values the curve becomes
more irregular. This is due to the lack of contributing matches, which is why values over 60
are not considered as possible optimal values.
For the Premier League the curve is smoother over the entire interval. It does not
become irregular for high numbers of history matches. However, it does show similar
behavior, with a low log score for values higher than 72.
For the Segunda Division the curve becomes very irregular for values higher than 55.
The log score varies considerably in the interval from 55 to 90, which is why these values
are not considered as optimal values.
Table 7.1: The optimal value for the number of history matches to base the Gamblers’
assessment upon. For each league the optimal number and corresponding log score is
presented.
better result. The result data set does, however, not hold more matches than those used,
and it has therefore not been possible to evaluate larger numbers. For the comparison
tests performed in later chapters of this report, the values presented in the above table
have been used.
Chapter 8
Poisson Assessment
As mentioned in Section 2.2.2, this approach was inspired by an initial betting strategy
introduced by [Ras08]. The idea of using an expected number of goals for a match,
calculated from the average goals in historic matches, inspired the investigation of the
distribution of the number of goals in a soccer match. Dixon-Coles introduced the use
of a Poisson distribution as a part of predicting the probability of a given result of a
match, based on the participating teams' offensive and defensive skills. Instead of using
the individual skill measures, the expected number of goals is viewed as a representation
of the combination of the two teams' skills. It is therefore examined if the expected number
of goals can be used as the mean of a Poisson distribution to predict the number of goals.
Before doing so, the expected number of goals is defined:
Definition 6. (Expected number of goals) In a match between home team i and away
team j, the expected number of goals, Avr, based on i’s last n home matches and j’s last
n away matches is:
$$Avr(i, j, n) = \frac{\sum_{k=1}^{n} Goals_{i,k} + \sum_{k=1}^{n} Goals_{j,k}}{2n},$$
where $Goals_{i,k}$ and $Goals_{j,k}$ are the total number of goals in the k'th home and away
match respectively.
number of goals for the SAS League 2006/07 season. The red columns show the Poisson
distribution with mean value 2.80, which is the average number of goals scored per match
in the 2006/07 season.
Figure 8.1: Plot of histogram and Poisson distribution for the SAS League season 2006/07
Figure 8.2: Plot of histogram and Poisson distribution for the Premier League and
Segunda Division season 2006/07
There is a strong resemblance between the two distributions for the SAS League season. This
is also the case for the English Premier League and the Spanish Segunda Division, see
Figure 8.2. The use of the expected number of goals as the mean of a Poisson distribution
therefore seems to be a good approximation to the distribution of goals in soccer matches,
and therefore can be viewed as a candidate for assessing the number of goals in a given
soccer match.
8.1. GOAL HISTOGRAMS
In lines 2-5 the total number of goals scored in the latest k home matches for the home
team and k away matches for the away team are found. In lines 6-7 these are used for
calculating the expected number of goals in m_assess. Lines 8-12 calculate the probabilities
P(> Line) and P(< Line) using a Poisson distribution with a mean value equal
to the expected number of goals. The result is returned in line 13.
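The steps described above can be sketched in Python. The names `expected_goals` and `poisson_assess` are hypothetical; the sketch assumes each team has at least n matches on record and that the line is a half-goal line such as 2.5, so that no outcome lands exactly on the line.

```python
import math

def expected_goals(home_totals, away_totals, n):
    """Avr(i, j, n): mean total goals over the last n home and n away matches."""
    return (sum(home_totals[-n:]) + sum(away_totals[-n:])) / (2 * n)

def poisson_assess(home_totals, away_totals, n, line=2.5):
    """Return (P(> line), P(< line)) with the expected goals as Poisson mean."""
    mean = expected_goals(home_totals, away_totals, n)
    # Sum the Poisson probabilities of every goal count below the line.
    p_under = sum(mean ** g * math.exp(-mean) / math.factorial(g)
                  for g in range(math.floor(line) + 1))
    return 1 - p_under, p_under

p_over, p_under = poisson_assess([3, 2], [4, 1], n=2)
```

In this usage example the expected number of goals is (3 + 2 + 4 + 1) / 4 = 2.5.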
Figure 8.3: The log score of the Poisson assessment, for all three leagues, plotted against
the number of matches used to create the assessment.
The Segunda Division data shows a more irregular behavior over the entire interval.
Notable irregularities are present from 60 and upwards. In the interval from 1 to 60, the
curve is smoother; however, it is not as steady as was the case with the SAS and Premier
League.
8.2. TEST RESULTS
Table 8.1: The optimal value for the number of history matches to base the Poisson
assessment upon. For each league the optimal number and corresponding log score is
presented.
is somewhat expected, since a higher frequency of over 2.5 matches in the gamblers’
approach will influence the expected number of goals to be higher. With a high number
of history matches with more than 2.5 goals, the chance of the expected number of goals
being relatively high is larger. This could give reason to believe that the gamblers’ and
Poisson approach will give rather similar predictions.
Figure 8.4: The log score of both the Poisson assessment and the gamblers’ assessment,
for all three leagues, plotted against the number of matches used to create the assessment.
Chapter 9
Dixon-Coles Approach
The Dixon-Coles approach [DC97] is a predictive model which uses only the goals scored
in previous matches to predict the probability of scores of a soccer match. Historic results
are considered a measure of the teams' offensive and defensive qualities, since a team
that scores a lot of goals is assumed to be offensively potent and a team which concedes
a lot of goals is considered defensively weak. This chapter presents the fundamentals of
the Dixon-Coles approach, and fits it to the problem domain of this project. The model
was originally used for predicting match outcomes, while this project seeks to predict the
total number of goals. The model settings are presented and the parameters determined,
which are to be used for evaluating the Dixon-Coles performance. Finally, the
results for the assessor on the training set are presented.
$$X_{i,j} \sim Poisson(\alpha_i \beta_j \gamma)$$
$$Y_{i,j} \sim Poisson(\alpha_j \beta_i)$$
In soccer, the home team often has an advantage of playing games at their home field.
The support from the crowd and familiar surroundings give an advantage, which is clear
by viewing any soccer league results. In the 2006/07 season in the SAS League, 43%
of the matches ended in a home win, while 24% and 33% ended in draw and away win
respectively [Bet]. The model implements this advantage, by introducing the home team
advantage factor, γ, which is multiplied to αi βj when calculating the mean value for the
Poisson distribution for the home team goals.
In the above, $X_{i,j}$ and $Y_{i,j}$ are independent and α, β > 0. The independence implies
that the probability of a match result is given by the product of the probability of the
home team goals and the away team goals:
$$P(X_{i,j} = x, Y_{i,j} = y) = \tau_{\lambda,\mu}(x, y) \frac{\lambda^x \exp(-\lambda)}{x!} \frac{\mu^y \exp(-\mu)}{y!}, \tag{9.2}$$
The Dixon-Coles approach can be used for assessing probabilities of outcome of a soccer
match, based solely on statistical data on scores of previous matches. By estimating the
probability of all results, the probabilities of home win, draw and away win can be found.
For example, the probability of a home win is:
$$P(X_{i,j} = x, Y_{i,j} = y \mid x > y) = \sum_{x > y} \tau_{\lambda,\mu}(x, y) \frac{\lambda^x \exp(-\lambda)}{x!} \frac{\mu^y \exp(-\mu)}{y!},$$
In this project the goal is to assess the probability of the total number of goals in a
match, and not the actual outcome. The winner of the match is not important to bets
9.2. PARAMETER CALCULATION
on over/under 2.5 goals. So instead of summing the probabilities for outcomes where the
number of home goals is larger than the number of away goals to find the probability of a
home win, the probabilities where the sum of the number of home and away goals is less
or greater than 2.5 are summed to find the probabilities for under and over respectively.
For over 2.5 goals, the equation is:
$$P(X_{i,j} = x, Y_{i,j} = y \mid x + y > 2.5) = \sum_{x + y > 2.5} \tau_{\lambda,\mu}(x, y) \frac{\lambda^x \exp(-\lambda)}{x!} \frac{\mu^y \exp(-\mu)}{y!},$$
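The summation over scores can be sketched in Python as below. The τ function used is the low-score adjustment from [DC97], which also matches the cases of Equation 9.10; the function names are hypothetical. Since the over sum is infinite, the sketch sums the finite set of scores under the line and takes the complement, which assumes a half-goal line.

```python
import math

def tau(x, y, lam, mu, rho):
    """Dixon-Coles dependency adjustment for low-scoring results."""
    if x == 0 and y == 0:
        return 1 - lam * mu * rho
    if x == 0 and y == 1:
        return 1 + lam * rho
    if x == 1 and y == 0:
        return 1 + mu * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0

def score_prob(x, y, lam, mu, rho):
    """P(X = x, Y = y) as in Equation 9.2."""
    return (tau(x, y, lam, mu, rho)
            * lam ** x * math.exp(-lam) / math.factorial(x)
            * mu ** y * math.exp(-mu) / math.factorial(y))

def over_under(lam, mu, rho, line=2.5):
    """Return (P(over), P(under)) for the total number of goals."""
    top = math.floor(line) + 1
    p_under = sum(score_prob(x, y, lam, mu, rho)
                  for x in range(top) for y in range(top)
                  if x + y < line)
    return 1 - p_under, p_under

p_over, p_under = over_under(lam=1.5, mu=1.1, rho=0.0)
```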
In order to create the assessments for a given match, the parameters must be determined.
These are the global parameter, the fade factor ξ, and the local parameters: the home
advantage γ, the dependency factor ρ, and the offensive strength and defensive weakness,
α and β.
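$$L(\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n, \gamma, \rho) = \prod_{k=1}^{N} \tau_{\lambda_k,\mu_k}(x_k, y_k) \, e^{-\lambda_k} \lambda_k^{x_k} e^{-\mu_k} \mu_k^{y_k} \tag{9.3}$$
where
$$\lambda_k = \alpha_{i(k)} \beta_{j(k)} \gamma,$$
$$\mu_k = \alpha_{j(k)} \beta_{i(k)},$$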
Here i(k) and j(k) denote the indices of the home and away team in match k respectively,
while $x_k$ and $y_k$ denote the number of goals scored by each team in the match. By making
a maximum likelihood estimate of (9.3), the local parameters can be found.
In soccer, teams change over time. Players move around, and teams can hit winning
or losing streaks. A lot of factors affect a team's quality, and in general recent form is
one of the most important factors when assessing a soccer match. The above approach
does not take these changes into account, and weights all matches used in the estimation
equally. A modification is made to (9.3), introducing a fade factor which downgrades the
importance of older matches:
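$$L(\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n, \gamma, \rho) = \prod_{k=1}^{N} \left( \tau_{\lambda_k,\mu_k}(x_k, y_k) \, e^{-\lambda_k} \lambda_k^{x_k} e^{-\mu_k} \mu_k^{y_k} \right)^{\phi(t - t_k)}$$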
Here t is the time at which the assessment is made, and $t_k$ is the time at which match k
was played. The fade function should yield a smaller value the farther apart t and $t_k$ are.
In this way, older matches are given a smaller significance, while more recent matches
are given a higher significance. The function φ can be chosen in many ways.
Dixon-Coles have examined some of the possibilities and
suggest the use of:
$$\phi(t - t_k) = e^{-\xi (t - t_k)}$$
Using this φ will downgrade the history matches exponentially. With ξ = 0 all matches
will be weighted equally, while increasing the value will weight recent matches higher.
The nature of the fade function makes it impossible to optimize ξ using the maximum
likelihood measure. Instead it will be estimated deterministically, with regards to the
assessments made by the model on over/under outcomes. The estimation is presented
later in this chapter.
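$$\begin{aligned}
\ln \prod_{k=1}^{N} \left( \tau_{\lambda_k,\mu_k}(x_k, y_k) \, e^{-\lambda_k} \lambda_k^{x_k} e^{-\mu_k} \mu_k^{y_k} \right)^{\phi(t - t_k)}
&= \sum_{k=1}^{N} \phi(t - t_k) \ln\left( \tau_{\lambda_k,\mu_k}(x_k, y_k) \, e^{-\lambda_k} \lambda_k^{x_k} e^{-\mu_k} \mu_k^{y_k} \right) \\
&= \sum_{k=1}^{N} \phi(t - t_k) \left( \ln(\tau_{\lambda_k,\mu_k}(x_k, y_k)) - \lambda_k + x_k \ln(\lambda_k) - \mu_k + y_k \ln(\mu_k) \right)
\end{aligned} \tag{9.5}$$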
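The final line of (9.5) translates almost directly into code. The following Python sketch is an illustration only: the match tuple layout, the dictionaries `alpha` and `beta`, and the symbol name `xi` for the fade factor are assumptions, and the τ function is the low-score adjustment from [DC97].

```python
import math

def tau(x, y, lam, mu, rho):
    """Dixon-Coles dependency adjustment for low-scoring results."""
    if x == 0 and y == 0:
        return 1 - lam * mu * rho
    if x == 0 and y == 1:
        return 1 + lam * rho
    if x == 1 and y == 0:
        return 1 + mu * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0

def log_likelihood(matches, alpha, beta, gamma, rho, xi, t):
    """Faded log-likelihood; each match is (home, away, x, y, t_k)."""
    total = 0.0
    for home, away, x, y, t_k in matches:
        lam = alpha[home] * beta[away] * gamma   # expected home goals
        mu = alpha[away] * beta[home]            # expected away goals
        weight = math.exp(-xi * (t - t_k))       # fade function
        total += weight * (math.log(tau(x, y, lam, mu, rho))
                           - lam + x * math.log(lam)
                           - mu + y * math.log(mu))
    return total

alpha = {"A": 1.0, "B": 1.0}
beta = {"A": 1.0, "B": 1.0}
ll = log_likelihood([("A", "B", 2, 1, 0)], alpha, beta,
                    gamma=1.0, rho=0.0, xi=0.0, t=10)
```

As in (9.5), the factorial terms are omitted since they do not depend on the parameters.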
Remember that $\lambda_k = \alpha_{i(k)} \beta_{j(k)} \gamma$ and $\mu_k = \alpha_{j(k)} \beta_{i(k)}$, where i(k) and j(k) are the ids of the
home and away teams of match k. In order to find the gradient, it is necessary to
find the partial derivatives with respect to all variables. The value vector, $\vec{values}$,
holds the values which need to be found to maximize the likelihood:
$$\vec{values} = (\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n, \gamma, \rho)^{T} \tag{9.6}$$
It is necessary to find the partial derivative of the offensive strength αi and the defensive
weakness βi for any team i of the total n teams, along with the home advantage factor
γ and the dependency factor ρ. The partial derivative for the offensive strength of team
i is:
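$$\frac{\partial LL}{\partial \alpha_i} = \sum_{k=1}^{N} \phi(t - t_k)
\begin{cases}
0 & \text{if } i \neq i(k) \text{ and } i \neq j(k) \\
\frac{-\beta_{j(k)} \mu_k \gamma \rho}{1 - \lambda_k \mu_k \rho} - \beta_{j(k)} \gamma + \frac{x_k}{\alpha_{i(k)}} & \text{if } i = i(k) \text{ and } x_k = 0 \text{ and } y_k = 0 \\
\frac{\beta_{j(k)} \gamma \rho}{1 + \lambda_k \rho} - \beta_{j(k)} \gamma + \frac{x_k}{\alpha_{i(k)}} & \text{if } i = i(k) \text{ and } x_k = 0 \text{ and } y_k = 1 \\
-\beta_{j(k)} \gamma + \frac{x_k}{\alpha_{i(k)}} & \text{if } i = i(k) \text{ and } (x_k, y_k) \notin \{(0,0), (0,1)\} \\
\frac{-\beta_{i(k)} \lambda_k \rho}{1 - \lambda_k \mu_k \rho} - \beta_{i(k)} + \frac{y_k}{\alpha_{j(k)}} & \text{if } i = j(k) \text{ and } x_k = 0 \text{ and } y_k = 0 \\
\frac{\beta_{i(k)} \rho}{1 + \mu_k \rho} - \beta_{i(k)} + \frac{y_k}{\alpha_{j(k)}} & \text{if } i = j(k) \text{ and } x_k = 1 \text{ and } y_k = 0 \\
-\beta_{i(k)} + \frac{y_k}{\alpha_{j(k)}} & \text{if } i = j(k) \text{ and } (x_k, y_k) \notin \{(0,0), (1,0)\}
\end{cases} \tag{9.7}$$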
Notice how the τ function imposes constraints on the number of goals scored in the k’th
match. Therefore low scoring games do not contribute to the derivative in the same way
as high scoring matches. The derivative takes the dependence into account. Similarly to
the offensive strength, the partial derivative of the defensive weakness βi is:
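$$\frac{\partial LL}{\partial \beta_i} = \sum_{k=1}^{N} \phi(t - t_k)
\begin{cases}
0 & \text{if } i \neq i(k) \text{ and } i \neq j(k) \\
\frac{-\lambda_k \alpha_{j(k)} \rho}{1 - \lambda_k \mu_k \rho} - \alpha_{j(k)} + \frac{y_k}{\beta_{i(k)}} & \text{if } i = i(k) \text{ and } x_k = 0 \text{ and } y_k = 0 \\
\frac{\alpha_{j(k)} \rho}{1 + \mu_k \rho} - \alpha_{j(k)} + \frac{y_k}{\beta_{i(k)}} & \text{if } i = i(k) \text{ and } x_k = 1 \text{ and } y_k = 0 \\
-\alpha_{j(k)} + \frac{y_k}{\beta_{i(k)}} & \text{if } i = i(k) \text{ and } (x_k, y_k) \notin \{(0,0), (1,0)\} \\
\frac{-\alpha_{i(k)} \mu_k \gamma \rho}{1 - \lambda_k \mu_k \rho} - \alpha_{i(k)} \gamma + \frac{x_k}{\beta_{j(k)}} & \text{if } i = j(k) \text{ and } x_k = 0 \text{ and } y_k = 0 \\
\frac{\alpha_{i(k)} \gamma \rho}{1 + \lambda_k \rho} - \alpha_{i(k)} \gamma + \frac{x_k}{\beta_{j(k)}} & \text{if } i = j(k) \text{ and } x_k = 0 \text{ and } y_k = 1 \\
-\alpha_{i(k)} \gamma + \frac{x_k}{\beta_{j(k)}} & \text{if } i = j(k) \text{ and } (x_k, y_k) \notin \{(0,0), (0,1)\}
\end{cases} \tag{9.8}$$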
For the home advantage factor γ, the partial derivative is:
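$$\frac{\partial LL}{\partial \gamma} = \sum_{k=1}^{N} \phi(t - t_k)
\begin{cases}
\frac{-\alpha_{i(k)} \beta_{j(k)} \mu_k \rho}{1 - \lambda_k \mu_k \rho} - \alpha_{i(k)} \beta_{j(k)} + \frac{x_k}{\gamma} & \text{if } x_k = 0 \text{ and } y_k = 0 \\
\frac{\alpha_{i(k)} \beta_{j(k)} \rho}{1 + \lambda_k \rho} - \alpha_{i(k)} \beta_{j(k)} + \frac{x_k}{\gamma} & \text{if } x_k = 0 \text{ and } y_k = 1 \\
-\alpha_{i(k)} \beta_{j(k)} + \frac{x_k}{\gamma} & \text{if } (x_k, y_k) \notin \{(0,0), (0,1)\}
\end{cases} \tag{9.9}$$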
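For the dependency factor ρ, used to infer dependence in low scoring games, the partial
derivative is:
$$\frac{\partial LL}{\partial \rho} = \sum_{k=1}^{N} \phi(t - t_k)
\begin{cases}
\frac{-\lambda_k \mu_k}{1 - \lambda_k \mu_k \rho} & \text{if } x_k = 0 \text{ and } y_k = 0 \\
\frac{\lambda_k}{1 + \lambda_k \rho} & \text{if } x_k = 0 \text{ and } y_k = 1 \\
\frac{\mu_k}{1 + \mu_k \rho} & \text{if } x_k = 1 \text{ and } y_k = 0 \\
\frac{-1}{1 - \rho} & \text{if } x_k = 1 \text{ and } y_k = 1 \\
0 & \text{otherwise}
\end{cases} \tag{9.10}$$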
By setting the value vector presented in Equation 9.6 to a starting point, the calculation
of the derivatives based on the initial values gives a vector which points toward the
maximum likelihood. In order to carry out these calculations, a vector class has been
implemented in C# on .NET, making the necessary vector operations possible.
Algorithm 4 shows how the parameter optimization has been implemented according to
the Dixon-Coles approach.
In lines 2-3 the necessary vectors are initialized and set to values recommended by [DC97].
Lines 4 to 26 form a while loop, which breaks when no further improvements are made. In
this loop, the gradient vector is calculated using the functions described previously.
From the starting point (the initial settings for the values vector), steps are taken along
the gradient vector until a point is reached where no improvements are made to the
value returned by the likelihood function. When the best point is found, the process is
repeated, calculating a new gradient vector and finding a new best value. When the best
values have been found, the values vector is returned.
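The optimization loop described above can be illustrated with a generic Python sketch. It is not a transcription of Algorithm 4: the function names, the fixed step size and the tolerance are assumptions, and a numeric central-difference gradient stands in for the analytic derivatives (9.7) to (9.10). A toy function with a known maximum replaces the likelihood.

```python
def numeric_gradient(f, values, h=1e-6):
    """Central-difference stand-in for the analytic partial derivatives."""
    grad = []
    for i in range(len(values)):
        up = values[:i] + [values[i] + h] + values[i + 1:]
        down = values[:i] + [values[i] - h] + values[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def maximise(f, start, step=0.01, tol=1e-9, max_rounds=10_000):
    """Repeatedly walk along the gradient until f stops improving."""
    values = list(start)
    best = f(values)
    for _ in range(max_rounds):
        grad = numeric_gradient(f, values)
        improved = False
        # Take steps along the current gradient while the value improves.
        while True:
            candidate = [v + step * g for v, g in zip(values, grad)]
            score = f(candidate)
            if score <= best + tol:
                break
            values, best, improved = candidate, score, True
        if not improved:
            break   # a fresh gradient gave no improvement: stop
    return values

# Toy objective with its maximum at (1, -2).
toy = lambda v: -(v[0] - 1) ** 2 - (v[1] + 2) ** 2
result = maximise(toy, [0.0, 0.0])
```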
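8: $\vec{gradient}$[team id + noOfTeams] = $\frac{\partial LL}{\partial \beta_{team\ id}}$
9: end for
10: $\vec{gradient}$[noOfTeams · 2 + 1] = $\frac{\partial LL}{\partial \gamma}$
11: $\vec{gradient}$[noOfTeams · 2 + 2] = $\frac{\partial LL}{\partial \rho}$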
Algorithm 5 presents the approach for finding the best ξ value for a set of matches.
Provided with a minimum, maximum and step-size increment for ξ (and a set of matches),
the best ξ is returned. In order to find the best ξ for use on the test data in the tests later
in this report, an examination is made on all three leagues using the above algorithm.
The algorithm simply creates an assessment for each match in the match data set, and
calculates the log score, based on the assessment and the observed result of the match.
The best ξ is the one with the highest total log score over the entire training set.
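The search can be sketched as a simple grid search in Python. This is an illustration rather than Algorithm 5 itself: the function `best_fade_factor` and the callable `assess` are hypothetical, where `assess(match, xi)` is assumed to return the probability the model assigns to the observed over/under outcome of the match, so that its logarithm is the log score.

```python
import math

def best_fade_factor(matches, assess, xi_min, xi_max, xi_step):
    """Return the fade factor with the highest total log score."""
    best_xi, best_score = None, -math.inf
    steps = int(round((xi_max - xi_min) / xi_step))
    for s in range(steps + 1):
        xi = xi_min + s * xi_step
        # Total log score over all matches for this candidate value.
        score = sum(math.log(assess(match, xi)) for match in matches)
        if score > best_score:
            best_xi, best_score = xi, score
    return best_xi

# Toy assessor whose assigned probability peaks at xi = 0.007.
toy_assess = lambda match, xi: 1 / (1 + 100 * abs(xi - 0.007))
chosen = best_fade_factor(range(5), toy_assess, 0.0, 0.02, 0.001)
```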
9.3. TEST RESULTS
mining the best ξ for the SAS League. The estimation of the ξ values has been made
deterministically, in accordance with the settings proposed by [DC97]. The time unit
used is half weeks, and the search has been narrowed to the interval 0.00 to 0.02 for the
ξ values. Values higher than 0.02 have shown worse results than values within
the interval. By examining the interval, and finding average log scores for every 0.001,
an initial 20 runs of the algorithm have been made for each league. Based on the initial
runs, a curve for the average log score as a function of ξ has been created for the SAS
League.
Figure 9.1 shows the plot of the average log score as a function of the ξ values. The figure
shows a graph over the connected points, as well as a fitted function. The irregularity
in the graph indicates that it would be down to chance whether any value in the 0.004-0.007
interval is better than any of those already measured. A further increase in granularity
has therefore not been made. Instead the fitted function is used. It indicates that 0.007 is
a suitable value, which is why this value is chosen. The estimation of the best ξ values for
each separate league has not been made. This is due to the fact that the estimation is very
time-consuming, with a single run taking 10-15 hours, depending on the number of matches
in the data set. It is also because the difference between using a ξ value of 0.004 instead
of 0.005 does not
seem to yield that large a change in the average log score. Therefore, the ξ value of 0.007
is used for all leagues in the Dixon-Coles assessments for the remainder of this report.
Table 9.1 shows the Dixon-Coles estimates of the teams’ offensive strengths (α) and defensive
weaknesses (β). These are presented along with the final position in the league for the
2005/06 season and the goals scored and conceded by each team. At the top of the league,
Recreativo is the team with the best offensive strength, with a value of 1.15. They are
also the team with the lowest defensive weakness, at a value of 0.85. This is in accordance
with the actual results, since they are the highest-scoring team and the team with the
fewest conceded goals. Malaga B in 21st place is the team with the most conceded goals,
and also the one with the highest defensive weakness. There is a clear correlation between
the number of goals a team scores and concedes and the strength values.
Position  Team            α     β     Scored  Conceded
1         Recreativo      1.15  0.85  67      32
2         Gimnastic       1.00  0.89  48      38
3         Levante         1.04  0.90  53      39
4         Ciudad Murcia   1.04  0.93  53      42
5         Lorca           1.06  0.91  56      39
6         Almeria         1.05  0.94  54      43
7         Xerez           1.09  0.97  60      46
8         Numancia        1.02  1.04  50      55
9         Gijon           0.94  0.87  41      34
10        Valladolid      1.05  1.03  54      54
11        Real Madrid B   1.05  0.99  55      50
12        Castellon       0.99  0.99  46      50
13        Albacete        0.97  1.05  44      57
14        Elche           0.99  1.02  47      54
15        Poli Ejido      0.96  0.99  43      50
16        Murcia          0.95  0.92  41      40
17        Hercules        0.92  0.99  39      49
18        Tenerife        1.04  1.07  53      60
19        Lleida          0.96  1.03  43      53
20        Ferrol          0.97  1.10  44      63
21        Malaga B        0.96  1.15  42      68
22        Eibar           0.83  0.95  28      45
Table 9.1: Offensive strengths and defensive weaknesses estimated using the Dixon-Coles
approach for the Segunda Division at the end of the 2005/06 season.
Over the entire season, over 2.5 goals were scored in 40.48% of the matches and under
2.5 goals in 59.52%, and on average 2.30 goals were scored per match [Bet]. A match
between Valladolid and Tenerife, both teams with strength values
in the top half of the table, would be expected to be a match-up with a high
probability of over 2.5 goals (in comparison with any other match in the league). A
Dixon-Coles assessment predicts a 52% chance of under 2.5 goals and a 48% chance
of over 2.5 goals. Although these probabilities do not indicate a match with many goals,
the probability of over 2.5 goals is higher than the league frequency. For a league with
relatively few goals, the probability assessment seems plausible.
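As a rough illustration of how such an assessment can come about, the sketch below assumes the standard parametrisation of [DC97], in which the expected home goals are α_home · β_away · γ and the expected away goals are α_away · β_home, and treats the two goal counts as independent Poisson variables, so that the total is Poisson with the summed mean. The home advantage γ = 1.4 is a made-up placeholder, not the value estimated in this project, and Valladolid is arbitrarily taken as the home team. The low-score correction of [DC97] is left out, since it only shifts probability among 0-0, 1-0, 0-1 and 1-1, all of which are under 2.5 goals.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def over_under_25(alpha_home, beta_home, alpha_away, beta_away, gamma):
    """Return (P(under 2.5), P(over 2.5)) for independent Poisson goal counts."""
    lam = alpha_home * beta_away * gamma   # expected goals for the home team
    mu = alpha_away * beta_home            # expected goals for the away team
    total_mean = lam + mu                  # a sum of independent Poissons is Poisson
    p_under = sum(poisson_pmf(k, total_mean) for k in range(3))
    return p_under, 1.0 - p_under

# Strengths from Table 9.1; gamma=1.4 is an assumed, illustrative home advantage.
p_under, p_over = over_under_25(alpha_home=1.05, beta_home=1.03,
                                alpha_away=1.04, beta_away=1.07, gamma=1.4)
print(f"under 2.5: {p_under:.2f}, over 2.5: {p_over:.2f}")
```

With a different γ, or with the home and away roles swapped, the split changes, so the numbers printed here should not be read as a reproduction of the 52%/48% assessment.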
Figure 9.2: Brøndby’s offensive strength and defensive weakness in the period January
2005 to December 2006.
Figure 9.2 shows the offensive strength and defensive weakness of the Danish team
Brøndby over time. The period runs from January 2005 to December 2006, thus
covering the second half of the 04/05 season, the whole 05/06 season and the first half of
the 06/07 season. It is clear that the Dixon-Coles model adjusts the offense and defense
parameters over time. It is worth noticing that the two parameters do not follow the same
curve, meaning that a team can improve its offense without this influencing its
defensive qualities. Looking at the two curves, two points are of particular interest. In
July-August 2005, just after the end of the season, the offensive strength is high. This
is natural, since this was the season in which Brøndby won the championship. Another
interesting thing to notice is the stable, high level of offensive strength and the low defensive
weakness from late 2005 to mid 2006. In this season Brøndby performed well,
scoring 60 goals and conceding 34, and finished in second place.
Table 9.2 shows the average log scores for the Dixon-Coles approach for each of the three
leagues for the fall 2006 season. The suggested ξ value of 0.007 has been used.
Table 9.2: The average log score of the Dixon-Coles assessor using the estimated ξ value.
The average log scores for the Dixon-Coles approach are not as good as the bookmakers’
log scores presented in Section 6.3. On all three leagues, the Dixon-Coles assessor performs
worse than the bookmaker; however, its average is not far behind. The preliminary results
thus indicate assessments not far from those of the bookmakers.
Chapter 10
Betting Strategy Evaluation
As introduced in Section 2.3, two different betting strategies are under evaluation in this
project. One is the value betting strategy, which decides whether or not to bet based on a
prediction by an assessor and the offered odds. The other is the threshold betting
strategy, which, as the name implies, uses a threshold to decide if the distance between
the offered odds line and the expected number of goals in a match is sufficiently large
for a bet to be placed. This chapter maps out the details of the two betting strategies,
and examines and presents the settings to be used in the final tests.
CHAPTER 10. BETTING STRATEGY EVALUATION
several formulas for calculating the value are needed. These are constructed in a similar
way to the formulas in Section 6.1.1 and can be seen in Table 10.1.
For all three leagues in the match and odds data sets, a value betting strategy run will
be made for each of the three assessors on the final test data. A strategy run on the test
data takes all matches in the spring season of 2007 and simulates the placement of a bet
whenever the BestValueBet() function returns a bet for the match. Before performing strategy
runs on the test data, the minimum value parameter is tuned. Each assessor will be run on
each league using three different settings for the
minimum value parameter: 1.00, 1.10 and 1.20. The parameter tuning is performed on
the fall season of 2006. This is done in order to determine if raising the demands on
the expected value will yield better betting results. For both parameter tuning and final
testing, the global parameter settings found in Chapters 7, 8 and 9 are used for the
respective assessors. In the parameter tuning, each league will be subjected to a total of
fifteen strategy runs: three different minimum value settings for each of
the three assessors, and an additional three minimum value settings for each of two of the
assessors run on a limited data set. For the final test strategy runs, the best minimum
value setting is used.
For a strategy run, a number of bets are found and placed in simulation. The result data
set is then used to pay out the simulated bets using the observed results. By paying out
all bets found in a betting strategy run, an overall result can be found. By comparing
these results, it should be possible to draw conclusions about which of the three assessors
is the best bettor.
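A strategy run of this kind can be sketched as follows. The sketch is deliberately restricted to the over/under 2.5 line and assumes unit stakes; best_value_bet is a hypothetical stand-in for the BestValueBet() function, whose actual implementation is not shown here, and the example matches are made up.

```python
def best_value_bet(p_over, odds_over, odds_under, min_value):
    """Pick the over/under 2.5 bet with the highest value, or None if it is too low.

    The value of a bet is taken as odds times assessed probability,
    i.e. the expected return per unit staked.
    """
    candidates = [("over", odds_over, odds_over * p_over),
                  ("under", odds_under, odds_under * (1.0 - p_over))]
    side, odds, value = max(candidates, key=lambda c: c[2])
    return (side, odds) if value >= min_value else None

def strategy_run(matches, min_value):
    """Simulate unit-stake bets and return (number of bets, net result in units)."""
    bets, net = 0, 0.0
    for p_over, odds_over, odds_under, total_goals in matches:
        bet = best_value_bet(p_over, odds_over, odds_under, min_value)
        if bet is None:
            continue                      # no bet on this match
        side, odds = bet
        won = total_goals >= 3 if side == "over" else total_goals <= 2
        bets += 1
        net += (odds - 1.0) if won else -1.0
    return bets, net

# Made-up data: (assessed P(over 2.5), odds over, odds under, observed total goals)
example = [(0.60, 1.95, 1.85, 4), (0.50, 1.90, 1.90, 1), (0.35, 2.10, 1.80, 3)]
bets, net = strategy_run(example, min_value=1.10)
```

Here the first match gives a winning bet on over, the second gives no bet, and the third gives a losing bet on under, so two bets are placed in total.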
10.1. VALUE BETTING STRATEGY
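Bet          Expected value
Under 1.75   $Odds_{Under1.75} \cdot P(\le 1) + \tfrac{1}{2} \cdot P(2)$
Over 1.75    $Odds_{Over1.75} \cdot P(\ge 3) + \tfrac{1}{2} \cdot (Odds_{Over1.75} + 1) \cdot P(2)$
Under 2.00   $Odds_{Under2.00} \cdot P(\le 1) + P(2)$
Over 2.00    $Odds_{Over2.00} \cdot P(\ge 3) + P(2)$
Under 2.25   $Odds_{Under2.25} \cdot P(\le 1) + \tfrac{1}{2} \cdot (Odds_{Under2.25} + 1) \cdot P(2)$
Over 2.25    $Odds_{Over2.25} \cdot P(\ge 3) + \tfrac{1}{2} \cdot P(2)$
Under 2.50   $Odds_{Under2.50} \cdot P(\le 2)$
Over 2.50    $Odds_{Over2.50} \cdot P(\ge 3)$
Under 2.75   $Odds_{Under2.75} \cdot P(\le 2) + \tfrac{1}{2} \cdot P(3)$
Over 2.75    $Odds_{Over2.75} \cdot P(\ge 4) + \tfrac{1}{2} \cdot (Odds_{Over2.75} + 1) \cdot P(3)$
Under 3.00   $Odds_{Under3.00} \cdot P(\le 2) + P(3)$
Over 3.00    $Odds_{Over3.00} \cdot P(\ge 4) + P(3)$
Under 3.25   $Odds_{Under3.25} \cdot P(\le 2) + \tfrac{1}{2} \cdot (Odds_{Under3.25} + 1) \cdot P(3)$
Over 3.25    $Odds_{Over3.25} \cdot P(\ge 4) + \tfrac{1}{2} \cdot P(3)$
Table 10.1: Formulas for calculating the expected value of any over/under bet encountered
in the odds data used for this project. A bet on a quarter line is settled as two half-stake
bets on the two neighbouring lines, and a stake is refunded when the number of goals lands
exactly on a whole line.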
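The expected values for all of these lines can also be computed by one general routine instead of one formula per line. The sketch below assumes the usual settlement of such bets: a quarter line is split into two half-stake bets on the neighbouring lines, and a bet landing exactly on a whole line is refunded. The function name and the example distribution (a Poisson distribution with mean 2.3) are illustrative assumptions, not taken from the project.

```python
import math

def expected_value(side, line, odds, pmf):
    """Expected return per unit staked on an over/under bet.

    side -- "over" or "under"
    line -- goal line in quarter steps, e.g. 2.5, 2.75 or 3.0
    odds -- decimal odds offered for the bet
    pmf  -- pmf[k] is the assessed probability of exactly k goals; the list
            should be long enough to hold practically all probability mass
    """
    if side not in ("over", "under"):
        raise ValueError("side must be 'over' or 'under'")
    # Quarter lines (x.25, x.75) are two half-stake bets on the neighbouring lines.
    quarter = (line * 4) % 2 == 1
    sub_lines = [line - 0.25, line + 0.25] if quarter else [line, line]

    def payout(goals, sub_line):
        if goals == sub_line:
            return 1.0                    # push: the stake is refunded
        wins = goals > sub_line if side == "over" else goals < sub_line
        return odds if wins else 0.0

    return sum(p * 0.5 * (payout(g, sub_lines[0]) + payout(g, sub_lines[1]))
               for g, p in enumerate(pmf))

# Made-up assessment: Poisson distribution with mean 2.3, truncated at 20 goals.
pmf = [math.exp(-2.3) * 2.3 ** k / math.factorial(k) for k in range(21)]
value = expected_value("over", 2.25, 2.05, pmf)   # bet only if value >= minimum value
```

A bet would then be considered a value bet when the returned value is at least the minimum value parameter.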
Table 10.2: The betting results for the SAS League, fall season 2006, containing a total
of 90 matches.
purposes. For the SAS League, the Dixon-Coles approach using the full odds data set
has the best performance, while it actually has a negative return on the limited odds data
set. Looking at the return on investment, the Poisson assessor is the best, presenting
a very high percentage on both the 1.10 and the 1.20 setting. In most cases (4 out of 5), the
1.10 setting shows the best net result, while the 1.00 setting shows
the best return on investment (3 out of 5).
Table 10.3: The betting results for the Premier League, fall season 2006.
Comparing Table 10.2 with Tables 10.3 and 10.4, there is no consistency across the
leagues. For the Premier League, the 1.10 minimum value seems to yield the best results,
with very good results for both of the Dixon-Coles based strategies. The 1.10 minimum
value also yields good results for the Segunda Division; here, however, the gamblers’ and
Poisson assessments show good results, and the Dixon-Coles assessment less so.
In general, it is interesting to see that an increase of the minimum value does not lead to
an increase in the return on investment, nor in the net return. This raises a suspicion
10.2. THRESHOLD BETTING STRATEGY
Table 10.4: The betting results for the Segunda Division, fall season 2006.
about what can be regarded as borderline values. For the parameter setting of 1.00,
it is hard to say if an accepted bet with a value of 1.01 in reality has a value of 0.99 and
therefore should have been discarded. On the other hand, bets with a value higher than
1.20 are perhaps, in some sense, overestimated, leading to too high values. Perhaps the
best bets are in fact found in a middle interval of 1.10 to 1.20, which could explain why
the 1.10 minimum value shows the best performance. Any further investigation of the
minimum value parameter is left for future work, and could be part of further tuning of
the model parameters. For the remainder of this project, the value betting strategy will
use a minimum value of 1.10.
could be tuned to achieve a better strategy result. Three test runs have been made on
the 2006 fall season matches, with three different threshold settings. The minimum odds
is fixed at 1.70.
League            Threshold 0.20            Threshold 0.25             Threshold 0.30
SAS League        69 bets, +8,2 (111,9%)    58 bets, +10,104 (117,4%)  51 bets, +5,61 (111,0%)
Premier League    181 bets, -16,14 (91,1%)  159 bets, -12,80 (91,9%)   140 bets, -17,31 (87,6%)
Segunda Division  158 bets, +3,39 (102,1%)  126 bets, +6,64 (105,3%)   105 bets, +5,01 (110,0%)
Table 10.5: The number of bets placed, the net result and, in parentheses, the return on
investment of the threshold betting strategy on the fall 2006 season for the three leagues.
Table 10.5 shows the results of the threshold betting strategy, using the
three settings for the threshold parameter. For both the SAS League and the Segunda
Division, the strategy shows a positive net result and a good return on investment. For the
Premier League, the net result is negative on all settings, although the threshold
value of 0.25 performs least badly. The positive results for the SAS League
and the Segunda Division can be due either to random behavior or to the differences
in league behavior. The SAS League is regarded as a high-scoring league, while the
Segunda Division is regarded as a low-scoring league. Therefore, the line for the SAS
League is often higher than 2.5, and for the Segunda Division often lower. Looking closer at
the bets placed by the threshold strategy, it was noticed that a big part of the profit came
from bets placed on over 1.75 and over 2.00 for the Segunda Division, and on under 3.25,
3.00 and 2.75 for the SAS League. Although no firm conclusions can be drawn, it is interesting
to notice that the trends for the leagues are not as pronounced as the bookmakers’ odds
suggest. The SAS League is not as high-scoring as one might think, and the Segunda Division
not as goalless. All in all, the best value for the threshold is 0.25, giving the best net
result for all three leagues and the best return on investment on two of the three.
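The decision rule of the threshold strategy can be sketched as below. This is a minimal illustration, assuming that the rule is applied symmetrically to over and under bets and that, when several offered lines qualify, the one furthest from the expected number of goals is chosen; both points are assumptions of this sketch, as are the function and parameter names.

```python
def threshold_bet(expected_goals, offers, threshold=0.25, min_odds=1.70):
    """Return the qualifying (side, line, odds) offer with the largest gap, or None.

    expected_goals -- expected total number of goals for the match
    offers         -- list of (side, line, odds), with side "over" or "under"
    """
    best, best_gap = None, threshold
    for side, line, odds in offers:
        if odds < min_odds:
            continue                      # odds too low to be worth a bet
        # Under is attractive when the line lies well above the expectation,
        # over when the line lies well below it.
        gap = line - expected_goals if side == "under" else expected_goals - line
        if gap > best_gap:
            best, best_gap = (side, line, odds), gap
    return best

# Made-up offers for a match with an expected 2.40 goals.
offers = [("under", 2.75, 1.75), ("over", 2.75, 2.05)]
bet = threshold_bet(2.40, offers)
```

In the made-up example, the line of 2.75 lies 0.35 above the expected 2.40 goals, so the under bet at odds 1.75 is returned.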
Chapter 11
Results
In this chapter, the performance of the proposed assessors is first examined and evaluated.
The evaluations are based on tests made with a test data set containing the matches
for spring 2007 for the respective leagues. Secondly, the betting strategies are run on
an odds data set, also for the spring 2007 season. In the first section, the test settings
are presented along with the means for evaluating the significance of the results. This
is followed by the results and evaluation for the assessors, and finally the results for the
betting strategies.
CHAPTER 11. RESULTS
Table 11.1: Preliminary results for each of the assessors on the three leagues. The table
shows the average log score on the matches in the fall season of 2006
As could be expected, the bookmaker shows the highest average log score. However,
on the SAS League the Poisson assessor is the best and the bookmaker the second
best. It is interesting to see that, on all three leagues, the Dixon-Coles assessor has the
lowest average log score.
Table 11.2: For all four assessors, the average log score is shown for each league for the
spring season of 2007.
The Dixon-Coles assessor has the lowest average log score for the SAS League and
Premier League, but is the assessor which comes closest to the bookmaker on the Segunda
Division. In all, the bookmaker shows the best average log score.
11.1. ASSESSOR EVALUATION
The three leagues have different result behavior, which makes it interesting to examine
the assessors’ predictions for each league separately. Perhaps the Poisson approach is
best for the Premier League, and the gamblers’ approach best suited for the SAS League.
Table 11.3 shows the average assessment for over/under 2.5, corresponding to the log
scores above.
Table 11.3: For all four assessors, the average assessment for over/under 2.5 goals is shown
for each league for the spring season of 2007.
It is interesting to notice that both the Poisson and the gamblers’ approach seem well
adapted to the known behavior of the leagues. For the high-scoring SAS League, they
have over as a clear favorite, and for the low-scoring Segunda Division they have under as
a clear favorite. For the medium-scoring Premier League, under is the slight favorite. For the
bookmaker, it is noticed that, for all leagues, the average prediction is closer to 0.5/0.5
than those of the other assessors. Most interesting is that the Dixon-Coles approach
holds under as the favorite for all three leagues, even for the SAS League, where the
other assessors have over as the favorite. This explains the poor performance on the SAS
League and Premier League, and the rather good performance on the Segunda Division.
This raises the question of whether the Dixon-Coles model is indeed fit as an assessor of
over/under outcomes. Recall that the Dixon-Coles model was initially designed to assess the
probability of results, and has shown good performance in predicting the outcome of
soccer matches. In this report, the assumption was made that, if the model has shown
good performance in predicting the outcome of a match (which is simply a combination
of predicting the number of goals by each team), the model would also be plausible for
predicting the total number of goals in a match. With the above results, this assumption
does not seem to hold, except for leagues with a low average number of goals.
each assessor versus each of the other assessors. The assessor in the horizontal header
is assessor A, and the assessor in the vertical header is assessor B. The Wilcoxon test
has been made by, for each match in the test set, subtracting the log score for the
assessment of A from the log score of the assessment of B. The null hypothesis would
then be that the mean µ of the distribution of the differences is less than or equal to 0.
The differences in scores have then been sorted by absolute value and assigned
ranks. The ranks of the positive differences are summed, as are the ranks of the negative
differences. The goal is to reject the null hypothesis, so the positive rank sum is used to
calculate the p-value: if the p-value for the positive rank sum is lower
than the significance level α, the null hypothesis can be rejected. If the null hypothesis
is rejected, it means that the assessor in the vertical header is significantly better
than the one in the horizontal header.
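The procedure described above can be sketched as follows. The sketch computes the positive rank sum and a one-sided p-value using the normal approximation, without continuity or tie correction; whether the p-values reported in this chapter were obtained from the exact distribution or from an approximation is not stated, so this choice is an assumption of the sketch, as is the function name.

```python
import math

def wilcoxon_one_sided(scores_a, scores_b):
    """One-sided Wilcoxon signed-rank test of H0: assessor B is not better than A.

    Returns (positive rank sum, approximate p-value). Zero differences are
    dropped and tied absolute differences share their average rank.
    """
    diffs = [b - a for a, b in zip(scores_a, scores_b) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1    # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z >= z) for a standard normal
    return w_plus, p_value
```

A small p-value means that the positive rank sum is larger than would be expected if B were no better than A, which is the case in which the null hypothesis is rejected.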
Table 11.4: The Wilcoxon Signed-Rank Test p-values for each combination of the
assessors on the SAS League 2007 spring season.
Table 11.4 shows the p-values found for all combinations of assessors. The average
log scores indicated that the bookmaker was the best of the four assessors, with the
gamblers’ approach not far behind. The p-values indicate that the bookmaker is not
significantly better than the gamblers’ approach. However, with a p-value of 0.0317, the
bookmaker can be said to be significantly better than both the Poisson and the Dixon-Coles
approach at significance level 0.05.
Table 11.5: The Wilcoxon Signed-Rank Test p-values for each combination of the
assessors on the Segunda Division 2007 spring season.
For the Segunda Division, the bookmaker is again the best assessor. However, it is
not significantly better than any of the other assessors. It is better than the gamblers’
approach at a significance level of 0.17, and has its best performance against the Poisson
approach, but a p-value of 0.138 cannot be said to be significant, since a difference at
least this large would occur with probability 0.138 even if the null hypothesis were true.
11.2. BETTING STRATEGY EVALUATION
For the Premier League, the Poisson and the gamblers’ approach are better than the
bookmaker and the Dixon-Coles approach. However, the difference with respect to the
bookmaker is not significant, while the Dixon-Coles approach shows poor performance. The
gamblers’ approach is in fact significantly better than the Dixon-Coles approach at a
significance level of 0.10, which, however, does not pass the test at the 0.05 significance level.
Table 11.6: The Wilcoxon Signed-Rank Test p-values for each combination of the
assessors on the Premier League 2007 spring season.
In all, none of the assessors are significantly better than the bookmaker. In general
the bookmaker shows the best performance, but it is only significantly better than the
Poisson and Dixon-Coles approaches on the SAS League. The gamblers’ and Poisson
approaches show acceptable performance on all three leagues, being closest to
the predictions of the bookmaker. The Dixon-Coles approach fails the test, with rather
poor performance. On the SAS League it is significantly worse than the bookmaker.
The reason for this seems to be an overestimation of the probability of under 2.5 goals,
which leads to the Dixon-Coles approach having its best performance on the low-scoring
league, the Segunda Division.
Table 11.7: Betting strategy results for the 2007 spring season, using an odds data set
containing only over/under 2.5 odds.
For the limited data set with only 2.5 line odds, the net results differ greatly. The
value betting strategy using the gamblers’ assessment shows a very high profit of +7,55
on only 14 bets on the SAS League. However, it suffers a large loss on the
Segunda Division in particular. The opposite is the case for the value betting strategy using
the Poisson assessment. It achieves a positive return on the Premier League and
Segunda Division, but suffers a big loss on the SAS League. Of all the strategies run
on the limited data set, the value betting strategy that achieves a positive net
result in total is the one using the Dixon-Coles approach. However, the profit gained comes
solely from the Premier League matches, while the other two leagues yield a loss. There
is no sign of consistency in the performance results. The threshold strategy yields a loss
on all three leagues. However, this approach is originally intended for use on an odds data
set containing multiple lines.
Table 11.8: Betting strategy results for the 2007 spring season, using a full odds data
set.
On the full odds data set, the performance is similar. Here the value betting strategy
using the gamblers’ assessment has been left out, since it cannot decide to bet on lines
other than 2.5. The Poisson approach again shows differing performance, with a profit on
the Premier League and a loss on the two other leagues. The Dixon-Coles approach shows
rather promising results, with a very high profit for both the Premier League and the
Segunda Division. Of the total 526 matches, bets have been placed on 235 for a
net result of +28.97 units, which is a 112.3% return on investment, a result which would
impress any bettor. The threshold approach also shows a profit over all three leagues,
with 218 bets for a profit of 4.27 units, a 102% return on investment.
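The return on investment figures can be checked with a few lines of arithmetic. The sketch assumes that every bet is placed with a stake of one unit, so that the amount staked equals the number of bets; under this assumption the reported percentages are reproduced.

```python
def return_on_investment(n_bets, net_result, stake=1.0):
    """Total amount returned divided by total amount staked (unit stakes assumed)."""
    staked = n_bets * stake
    return (staked + net_result) / staked

roi_dixon_coles = return_on_investment(235, 28.97)   # value betting, Dixon-Coles
roi_threshold = return_on_investment(218, 4.27)      # threshold strategy
print(f"{roi_dixon_coles:.1%}")   # 112.3%
print(f"{roi_threshold:.1%}")     # 102.0%
```

A return on investment above 100% corresponds to a profit and one below 100% to a loss.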
Of the four strategies, value betting using the Poisson and the gamblers’ assessments
shows too little stability in the results and too large a loss for one to conclude
that the strategy could be used for betting. This is the case for both the limited and the
full odds data set. The threshold betting strategy performs better on the full odds data
set, and yields a slight profit for the Premier League and Segunda Division. Remembering
that these leagues do not have a high average number of goals, it would be
interesting to test the threshold strategy on other low- or medium-scoring leagues. The
value betting strategy using Dixon-Coles showed the best betting results of all. For both
the limited and the full odds data set, a profit was attained over the total matches.
The best results were seen for the Premier League and Segunda Division on the full odds
data set. Remembering that the Dixon-Coles assessments were prone to overestimating
the probability of a low number of goals, there is reason to doubt whether they can be used
for placing bets. However, the results attained for the Premier League and Segunda Division
indicate that the value betting strategy using the Dixon-Coles assessor might be good
for placing bets on low-scoring leagues. This is, however, at this point unsubstantiated, and
would call for further research and tests.
Chapter 12
Conclusion
Having implemented and evaluated the assessors proposed in the report, it is now possible
to draw conclusions about the results. The goal of the project was to examine if it is
possible to create automatic probability assessments which can at least match those
made by a human bookmaker.
CHAPTER 12. CONCLUSION
gamblers’ approach showed the best performance, being the one of the proposed assessors
that came closest to both the predictions and the average log scores of the bookmaker.
Using the Wilcoxon test, it was determined that none of the proposed assessors showed
significantly better performance than the bookmaker. However, only in one case, on the
SAS League, did the bookmaker show significantly better performance than the Poisson
and Dixon-Coles approaches. It can therefore not be concluded that any of the proposed
assessors was better than the bookmaker, nor can it be said that they are in general
significantly worse. It has been shown that the assessors cannot beat the bookmakers,
but can almost match them with regard to predicting the number of goals.
12.2. FUTURE WORK
league and market). By monitoring the odds and the stakes placed, the odds can
be changed accordingly to minimize the risk on an event. In time the odds will adjust to
the market and find their natural level, which they would also do if the odds had been
compiled by a human bookmaker from the beginning. In this sense, the assessors proposed in
this project would be candidates for such a system. A refinement of the Dixon-Coles model
would, it is assumed, also be suited for assessing the result at half time or perhaps the
probability of a team winning by a larger margin. Such a model would be able to create several
markets for a single match, which would be desirable for any bookmaker.
Appendix A
Interviews
To gain information on how probabilities and odds for over/under goals in soccer matches
are made, two bookmakers and one professional gambler have been consulted. In the
following, their views on the matter are accounted for.
APPENDIX A. INTERVIEWS
But investigating takes a lot of time, and in many cases the game is a no bet. I want
a high bet value before I place a bet. The time spent on investigating is often too
much compared to the overall winnings. Therefore I have been thinking about trying
to quantify the decision to bet, to save time. I do not expect that it is possible to get as
great a return on investment as by doing it manually, but if it is possible to make just
a small profit by having an automatic system, a lot of time is saved which can be spent
elsewhere, on areas where larger winnings are possible. I have been testing an approach
where, for a match, I take the two teams and calculate the average number of goals in
their matches so far in the season. Then I take the average number and compare it to the
market odds. If the average is more than 0.25 lower than the line, for example under 2.75
goals, it is a possible bet. I have found that the odds should be 1.70 or higher to be a
possible bet. I have been testing this on several leagues in the spring season, and for some
of the leagues there could be some interesting areas. Primarily in leagues where the
average number of goals is very low, like 2.28 in the French league for example, there are
often good bets on over 2.00 and over 2.25 goals. Some lines are better than others, but
as of now it is not possible to say if I have been lucky or if there actually is a possibility
of making money here.
A.2. BOOKMAKER INTERVIEW
If a team normally plays matches with significantly few or many goals. Here we also
look at previous matches between the two opposing teams - are they prone to play 0-0
or do they explode in goals. A relevant factor is of course the players available to a team.
If a normally high-scoring team is missing its best attacker, the chance of them scoring
goals is lower than if he plays. The same can be said for defensive players and conceding
goals. Another relevant factor is, in fact, the weather. If it is snowing or raining heavily,
it becomes harder for the teams to create good attacking football, and the chance of many
goals is lowered. The last thing we look at is whether there are any circumstances surrounding
the match that can affect the result. For example, in a cup tie where the teams play
both home and away, the away team in the second match may progress to the next
round if they keep the opponent from scoring. Then they concentrate on defending, and
attacking becomes secondary. These special circumstances are normally only present in
cup games or at the end of a season.
OB and FCK are playing this weekend (March 2008). Can you describe how
you have assessed this match?
For both teams it is the case that they are very
strong at keeping the opponents from scoring. At the same time they are not teams that
score a lot of goals. They are teams “that get the job done”. They go for the 1-0 win.
In their past mutual games the tendency has been low-scoring games, perhaps partly
due to the matches being important top games where neither team can afford to lose.
Besides that, we are in the late winter, at the start of the spring season, and the pitch is
likely to be a bit hard due to frost. This game is likely to end with few goals, and under
2.5 goals is a clear favorite here. Our odds are 1.72 for under and 2.12 for over.
How do you create multiple line over/under odds for a match?
In our case, we have created a model that takes the input of our employees on the
above-mentioned factors and returns a probability distribution for 0, 1, 2 and so on goals.
From these it is simple to create over/under 2.5 goals, over/under 2.75 and so on.
Alternatively, it is possible to make probabilities for over/under 1.5, 2.5, 3.5 and combine
these to make over/under 2.75, for example.
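How odds for several lines can follow from one goal distribution can be sketched as below. The sketch computes fair odds, meaning odds at which the expected return per unit staked is exactly 1, and leaves out any bookmaker margin. It assumes that half of the stake on a 2.75 line is refunded when exactly three goals are scored; the function name and the example distribution are made up for illustration and do not come from the interviewed bookmaker.

```python
def fair_odds(pmf):
    """Fair decimal odds for the 2.5 and 2.75 lines from a goal distribution.

    pmf[k] is the probability of exactly k goals for k = 0..3; any further
    entries are only used through the remaining probability mass.
    """
    p_le2 = sum(pmf[:3])
    p3 = pmf[3]
    p_ge3 = 1.0 - p_le2
    p_ge4 = p_ge3 - p3
    return {
        "under 2.5": 1.0 / p_le2,
        "over 2.5": 1.0 / p_ge3,
        # Solving Odds * P(<=2) + 1/2 * P(3) = 1 for the odds
        "under 2.75": (1.0 - 0.5 * p3) / p_le2,
        # Solving Odds * P(>=4) + 1/2 * (Odds + 1) * P(3) = 1 for the odds
        "over 2.75": (1.0 - 0.5 * p3) / (p_ge4 + 0.5 * p3),
    }

# Made-up distribution for 0, 1, 2, 3, 4 and "5 or more" goals.
odds = fair_odds([0.10, 0.23, 0.27, 0.20, 0.12, 0.08])
```

A bookmaker would offer somewhat lower odds than these in order to build in a margin.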
References
[Bet] BetXpert.com.
[CH] Tobias Christensen and Rasmus Hansen. Odds assessment on football matches.
Master’s thesis, Aalborg University.
[DC97] Mark J. Dixon and Stuart G. Coles. Modelling association football scores and
inefficiencies in the football betting market. Journal of the Royal Statistical Society,
Series C, Vol. 46, No. 2, p. 265-280, 1997.
[DS] Morris H. DeGroot and Mark J. Schervish. Probability and Statistics. Addison-Wesley.
[Te] Tip-ex.com.
[TSK] Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Introduction to Data
Mining. Pearson Education.
[WM] Robert L. Winkler and Allan H. Murphy. Good probability assessors. Journal
of Applied Meteorology, Vol. 7, No. 4, p. 751-758.