Aisyah Ariana Hamdan - Interim Report

Formula One Races’ Prediction using Machine Learning

Aisyah Ariana Binti Hamdan

19000476

Interim Report submitted in partial fulfilment of

the requirements for
Bachelor of Technology (Hons)
(Information Technology)

JANUARY 2023

Universiti Teknologi PETRONAS

Bandar Seri Iskandar
31750 Tronoh
Perak Darul Ridzuan
CERTIFICATION OF APPROVAL

Formula One Races’ Prediction using Machine Learning

Aisyah Ariana Binti Hamdan

19000476

An interim report submitted to the

Information Technology Programme

Universiti Teknologi PETRONAS

in partial fulfilment of the requirement for the

BACHELOR OF TECHNOLOGY (Hons)

(INFORMATION TECHNOLOGY)

Approved by,

_____________________

(Ts. Dr. Kamaluddeen Usman Danyaro)

UNIVERSITI TEKNOLOGI PETRONAS

TRONOH, PERAK

January 2023

CERTIFICATION OF ORIGINALITY

This is to certify that I am responsible for the work submitted in this project, that the
original work is my own except as specified in the references and acknowledgements,
and that the original work contained herein has not been undertaken or done by
unspecified sources or persons.

Aisyah Ariana H
______________________________

AISYAH ARIANA BINTI HAMDAN

ABSTRACT

The purpose of this paper is to use machine learning to make predictions for
Formula One races. The predictions and their results are then displayed in an
interactive dashboard using data visualisation tools. Formula One is a cutting-edge,
high-tech racing competition that generates massive volumes of data, which makes it
an ideal testing ground for new machine learning algorithms and approaches. The
sport is unpredictable because many variables affect the likelihood of a driver
winning. A structured and integrated machine learning model would therefore be
able to assist team builders in reviewing the team's performance and in using this
information to produce accurate predictions and better strategies. Furthermore, this
paper offers a comprehensive analysis and assessment of the literature on Machine
Learning (ML) and sport outcome prediction, emphasising crucial elements such as
the ML algorithms that achieved the highest prediction accuracy. The project follows
CRISP-DM, the Cross-Industry Standard Process for Data Mining, which contains six
key phases. The methods that will be incorporated into the model are supervised
learning algorithms, namely Gradient Boosting Tree, Random Forest, Support
Vector Machine, and Neural Network, and they will be compared in terms of
prediction accuracy. Finally, this report aims to inform future studies in the field
and to suggest future expansions of the project's applications, because this
framework may also be used to predict outcomes in other sports to the benefit of
their teams.

ACKNOWLEDGEMENT

I would like to express my sincere appreciation to all the individuals who have
supported me throughout the course of this project.

First and foremost, it is with great pleasure that I express my sincere
thanks to Universiti Teknologi PETRONAS (UTP) for providing me with this
opportunity to apply the knowledge gained and interest developed during my
undergraduate studies in a professional setting. Additionally, I am beyond grateful to
Dr Kamaluddeen Usman Danyaro for the invaluable guidance, encouragement, and
inspiration throughout the project. The insights, knowledge, and constructive comments
provided have been essential in shaping my ideas and improving the quality of my
work.

Also, I want to express my gratitude to my friends and colleagues who have
helped with this project in a variety of capacities, be it through brainstorming sessions,
feedback, or resource sharing. This project has been more enlightening and fun
thanks to their cooperation.

To my supportive and loving family and friends, my deepest gratitude. Without
their constant encouragement and assurance, I would not be where
I am right now. I am grateful to have such people in my life, whom no distance could
stop from showering me with their love and support.

Last but not least, I want to thank myself. A big pat on the back for never
quitting, and being able to express the interest I have in this field in a structured
manner. I am so proud to see how far I have come and grown into a better person
throughout this project journey.

TABLE OF CONTENTS

CERTIFICATIONS
ABSTRACT
ACKNOWLEDGEMENT
CHAPTER 1: INTRODUCTION
1.1 Background of Study
1.2 Problem Statement
1.3 Research Questions
1.4 Objectives
1.5 Scope of Study
1.6 Project Relevancy and Significance
CHAPTER 2: LITERATURE REVIEW AND THEORY
2.1 Introduction to Sports Analytics and Machine Learning
2.2 Factors Influencing the Machine Learning Model
2.3 Machine Learning Algorithms
CHAPTER 3: METHODOLOGY AND PROJECT WORK
3.1 Research Framework
3.1.1 Business Understanding
3.1.2 Data Understanding
3.1.3 Data Preparation
3.1.4 Modelling
3.1.5 Evaluation
3.1.6 Deployment
3.2 Machine Learning Algorithms
3.3 Tools
3.3.1 Kaggle
3.3.2 Google Colab
3.3.3 Microsoft Power BI
3.4 Project Milestones
CHAPTER 4: CONCLUSION AND FUTURE WORK
4.1 Conclusion
4.2 Future Work
REFERENCES
APPENDICES
CHAPTER 1

INTRODUCTION

1.1 Background of Study

Formula One (F1) is one of the most well-known and thrilling motorsports in
the world, pitting expert drivers against one another in the latest and
quickest racing cars. F1 and sports analytics have grown considerably
closer in recent years. The racing teams and the organisation that oversees
the sport produce a large amount of data that can be analysed for insights.
Because of the complexity of the sport and the numerous variables that can
influence the results, such as weather conditions, circuit configuration, car
performance, and driver ability, predicting the winner of an F1 race has
always been difficult.

To give teams a competitive edge, sports analytics
techniques such as machine learning and data visualisation are being
deployed. Sports result prediction is one area where machine learning (ML)
has shown promising potential, and incorporating ML algorithms can
produce a high-accuracy prediction model. However, there is still a
need for further research in this area, particularly in the development of
more accurate prediction models.

The goal of this project is therefore to use ML approaches to create a
reliable prediction model for F1 race outcomes. The most suitable ML
algorithms will be selected, and extensive research will be done on the
different factors that affect the prediction model. To give a better grasp of
the predictions, the outcome will be visualised using a dashboard. The
engineers of an F1 team will be able to use this information to improve
their tactics and understand the variables affecting performance.

1.2 Problem Statement

Sports are intricate and dynamic activities with many variables that
can affect the result, including the weather, faulty equipment, and
individual abilities. If these factors are not properly considered, a team's
standing may be in jeopardy. Predicting the outcome from an educated
guess, without careful evaluation of the data and calculation, is therefore
likely to be inaccurate. Given the potential for these factors to shift
the outcome of a single event, the development of a prediction model using
machine learning is necessary. This model is designed for the Formula One
team principal and engineers, and it uses algorithms trained on historical
data to create a strategic, well-analyzed framework to predict each race's
outcome.

1.3 Research Questions

Based on the problem statement in 1.2, the following research questions
were derived to serve as a guide in completing this project:

i. Which machine learning model will be employed, and how will it
influence the accuracy of the results?
ii. Which machine learning algorithms will be adopted in this
project model?
iii. What data visualization method will be suitable for the findings
of the prediction model?

1.4 Objectives

The goal of this research is to aid in predicting the winners of
Formula One races. The prediction model can help the team
constructors analyse the team's performance and use this data to
come up with a better strategy. Hence, the objectives of this research are as
follows:

i. To develop a machine learning model that will accurately predict
the winner of Formula One races.
ii. To utilise the data science lifecycle methods in developing the
prediction model.
iii. To employ data visualization methods in presenting the outcome of
the prediction.

1.5 Scope of Study

The project's scope revolves around the application of a machine learning
prediction model in sports. The scope of the study
defines what the research covers and what it focuses on.
This project focuses on the four areas below:

i. Deploying Data Science Lifecycle methods.
ii. Achieving the highest possible machine learning model accuracy.
iii. Employing data visualization methods for the project findings.
iv. Formulating supervised learning algorithms in the context of
predicting race winners and provisional standings.

1.6 Project Relevancy and Significance

The work on applying machine learning (ML) to predict Formula One
(F1) race winners is significant for a number of
reasons. First, F1 is a fiercely competitive and technologically
sophisticated sport, and being able to correctly predict race winners can
give teams a big competitive edge. Accurate projections can help with
optimising car performance and adjusting fuel and tyre strategies.
Second, the use of ML methods in F1 has broader effects on the sports
analytics industry. By creating precise prediction models for F1 race
winners, researchers can learn how to apply ML approaches to other sports
and domains, such as predicting the results of other motorsports, team
sports, or individual sports. Last but not least, research into predicting the
outcome of an F1 race using ML approaches can advance the discipline of
machine learning. F1 produces enormous volumes of data, which makes it a
suitable testbed for creating and analysing novel ML algorithms and
methodologies. As a result, new and more sophisticated ML algorithms may
be created and used in other subjects and domains. Overall, the work
on using ML approaches to predict F1 race winners is significant for F1
teams, the field of sports analytics, and the larger field of machine learning.

CHAPTER 2

LITERATURE REVIEW AND THEORY

2.1 Introduction to Sports Analytics and Machine Learning

The term "sports analytics" describes the application of data analysis
and statistical modelling to create predictions and learn more about a sport.
Each race or game generates enormous amounts of data, which can be
combined in a variety of useful ways. Bunker and Susnjak (2022) noted
that sport result prediction is also potentially useful to players, team
management and performance analysts in identifying the most important
factors that help to achieve winning outcomes, upon which appropriate
tactics can be identified. Hence, one way to handle this data effectively is
to feed it into a machine learning model that predicts race outcomes
(e.g. results, standings).

According to Lotfi and Rebbouj (2021), the use of machine learning
(ML) technology allows researchers to build models and simulate systems
to predict results. ML, one of the artificial intelligence techniques, enables
computers to learn without explicit programming and has demonstrated
encouraging results in classification and prediction tasks. Demir and
Barman (2021) examine ML under three main headings, supervised
machine learning, unsupervised machine learning, and reinforcement
machine learning, which differ from each other in the way
they obtain the required results. Moreover,
machine learning is regarded as a key field in the discovery of hidden
patterns in datasets, concentrating on the development of smart models for
rapid and accurate prediction. As a result, an ML model generally becomes
more accurate as more data is fed into it.

Sport prediction is usually treated as a classification problem, with one
class (win, lose, draw) to be predicted, as mentioned by Prasetio and Harlili
(2016). Hence, researchers employ a variety of features, such
as historical team performance, historical match results, player data, and
other data that has been gathered. There are numerous sports prediction models
available that use various algorithms to accomplish the forecast's objective.
These include Bunker and Thabtah (2019), who constructed a
theoretical framework for general sports winner prediction using
unlabelled data. Gu, Foster, Shang, and Wei (2019) highlight multiple ML
solutions, used as ensemble learning methods, for match winner prediction
in the NHL (National Hockey League). Ofoghi et
al. (2010) conducted a study on a machine learning approach to predicting
winning patterns in track cycling omnium that used unsupervised,
supervised and statistical analysis methods. Last but not least, Sicoie (2022)
proposed a machine learning framework for predicting the race winner and
championship standings using three different machine learning algorithms.

2.2 Factors Influencing the Machine Learning Model

Sports are undoubtedly unpredictable due to the multiple internal and
external factors that might affect the outcome of a single event. According
to Bunker and Susnjak (2022), sport result prediction is an interesting and
challenging problem due to the inherently unpredictable nature of sport,
and the seemingly endless number of potential factors that can affect
results. In the Formula One context, the constant flow of telemetry data,
external factors such as weather and track condition, track information,
tyre compound performance and unique race events such as unexpected DNFs
(did not finish, due to crashes, mechanical failures or disqualification)
come into play and can shift the outcome of a race event to a considerable
degree, claimed Sicoie (2022). These factors can also turn an otherwise
positive result into a negative one. As mentioned by Constantinou and
Fenton (2013), incorporating other key factors, such as player transfers, the
availability of key players, participation in international competitions, a
new coach, level of injuries, attack and defence ratings, and even team
motivation/psychology, in the form of expert knowledge could lead to
improved results.

Generally, many internal and external
factors could influence a sport's machine learning model. These factors
can be found in any sport, and teams must examine them closely in
order to win. Leard and Doyle (2011) tested home advantage,
momentum, and the likelihood of winning as factors in
National Hockey League (NHL) game outcomes. Other studies
focus on individual player performances instead. Voyer and
Wright (1998) examined 740 players' performance in scoring,
shooting, getting the puck, etc. Additionally, a study by Gómez et al. (2017)
on the sport of basketball stated the importance of situational variables that
may have an interactive effect on teams' performance, such as game type,
game location, and game pace. From these studies, it can be deduced that the
location of the game or race is an important factor. It can
be related to the emotions of players representing their country. This
reasoning is backed up by Haghighat et al. (2013), who noted that emotions
influence whether a player performs well.

2.3 Machine Learning Algorithms

In building the ML prediction model, numerous supervised machine
learning algorithms can be used to get the most accurate outcome.
Nasteski (2017) claimed that supervised learning is one of the dominant
methodologies in machine learning. He also added that supervised
techniques are even more successful than unsupervised techniques
because the ability to label training data provides clearer criteria for model
optimization. Besides, according to Gómez et al. (2017), the Classification
and Regression Tree (CRT) proved to be a powerful and effective technique in
explaining winning and losing teams' performances, which allowed them to
measure the direct effect and interdependency of teams' performances in a
visual tree. The use of tree-based methods such as CRT, on which the
Random Forest algorithm is built, to achieve the best accuracy is backed up
by Passi and Pandey (2018), whose
work on increasing prediction accuracy in the game of cricket using
machine learning found that Random Forest turned out to be the most
accurate classifier for both of their datasets, with an accuracy of 90.74%.

Not all machine learning models depend on the Random Forest
algorithm to get the highest prediction accuracy. For instance, the study
"A game-predicting expert system using Big Data and machine learning"
reported that Support Vector Machine (SVM) outperformed other ML
algorithms. The high prediction accuracy (i.e. >90%) confirms that the SVM
and ensemble machine learning algorithms are valuable tools that can
accurately predict game outcomes, as stated by Gu et al. (2019). Furthermore,
a study by Lotfi and Rebbouj (2021) on machine learning for sports results
prediction claimed that their work proved that the algorithms are effective
in deriving highly accurate models by utilizing Neural Networks (NN).
Lastly, according to Bunker and Susnjak (2022), a wide set of candidate
algorithms and ensembles should be used in experimentation in sports
result prediction. One of the algorithms they considered was gradient
boosting, even though research in sports domains shows a higher propensity
to use artificial neural networks.

CHAPTER 3

METHODOLOGY AND PROJECT WORK

3.1 Research Framework

The Cross-Industry Standard Process for Data Mining, or
CRISP-DM, is a popular method for directing data mining and machine
learning initiatives. It includes all of the project's phases, their associated
tasks, and how these activities relate to one another. The framework
provides a structured method for creating predictive models using machine
learning. Additionally, CRISP-DM is iterative and offers
frequent chances to assess the project's development.
Data scientists and machine learning experts can create precise and
trustworthy prediction models by following the steps shown in Figure 1
below. These models can offer useful insights, and the process helps ensure
that the project's business goals are kept at the forefront at all times.

FIGURE 1. Overview of CRISP-DM

Based on Figure 1, Table 1 below shows a brief explanation of the six
CRISP-DM phases.

TABLE 1. CRISP-DM Explanation

CRISP-DM Phase          Explanation
Business Understanding  Understand the business problem to be solved and how a
                        predictive model can help. Identify the objectives,
                        success criteria, constraints, and risks associated
                        with the project.
Data Understanding      Collect and explore the relevant data that will be
                        used to build the predictive model. Identify any
                        missing or erroneous data, and consider how to
                        address the issues.
Data Preparation        Clean, transform, and prepare the data for modelling.
                        This process may involve tasks such as selecting,
                        cleaning, constructing, integrating, and formatting
                        data.
Modelling               Select a proper machine learning algorithm and train
                        the model on the prepared data. Experimenting with
                        different algorithms and hyperparameters helps to
                        find the best-performing model.
Evaluation              Focuses on technical model assessment. This phase
                        looks broadly at which model best meets the business
                        objectives and what to do next. This includes
                        evaluating results, reviewing the process, and
                        determining the next step.
Deployment              Deploy the model into production, integrating it with
                        appropriate systems and processes. Ensure that it is
                        well documented and accessible to relevant
                        stakeholders.

3.1.1 Business Understanding

Business understanding in the context of this project means
getting a grasp of the project objectives and requirements from
a business perspective. Since this project is designed to be used
by Formula One team constructors or engineers, the model
will be based on their point of view. The problems and
objectives were addressed in Chapter 1, where they were
examined from the business standpoint.

3.1.2 Data Understanding

In data understanding, the process of gathering and
exploring relevant data to be used to train the machine learning
model begins. This covers past results from Formula One races
as well as additional data, such as weather information for race day,
team statistics, and other significant details. Based on my research, I
have located a dataset that is considered appropriate for this project
(Vopani, 2023). It is from Kaggle, a website where users can design
and share data-driven solutions and access datasets, tools, and
resources. The data file includes
information about the Formula One World Championship from
1950 through the current season of 2023. A snapshot of the
collected data, which consists of 12 properties, is shown in
Figure 2 below.

FIGURE 2. Properties of Gathered Data
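
As a rough illustration of this exploration step, the sketch below reads a small CSV sample and counts the missing cells in each column. The column names (raceId, driverId, grid, position), the sample rows, and the "\N" missing-value marker are assumptions made for illustration only; they are not a confirmed description of the Kaggle file.

```python
import csv
import io

# Hypothetical sample rows; column names and the "\N" marker are assumed.
SAMPLE_CSV = """raceId,driverId,grid,position
1,hamilton,1,1
1,vettel,3,\\N
2,alonso,,4
"""

# Cell contents treated as "missing" in this sketch.
MISSING_MARKERS = {"", "\\N"}


def count_missing(csv_text):
    """Return {column: number of missing cells} for a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value is None or value.strip() in MISSING_MARKERS:
                counts[name] += 1
    return counts


missing = count_missing(SAMPLE_CSV)
print(missing)
```

A summary like this helps decide, before the data preparation phase, which columns need cleaning and which could be dropped.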

3.1.3 Data Preparation

To make sure that the data is ready for use in the
machine learning model, it will be cleaned and pre-processed
during this stage. This could involve dealing with missing
values, correcting improper formatting, and removing extraneous
or unrelated data from the dataset. For instance, the dataset
obtained from Kaggle requires reformatting, such as
converting string attributes to integer values so that they
can be fitted inside a prediction model.
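
The string-to-integer conversion described above could look roughly like the following sketch. The helper names (encode_labels, to_int_or_default), the sample values, and the choice of -1 as a placeholder for missing cells are all hypothetical; the report does not specify how the conversion will actually be done.

```python
def encode_labels(values):
    """Map each distinct string to an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


def to_int_or_default(text, default=-1):
    """Convert a numeric string to int; use `default` for missing or bad cells."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return default


# Hypothetical raw columns as they might appear after reading a CSV file.
constructors = ["Ferrari", "Mercedes", "Ferrari", "Red Bull"]
grid_positions = ["1", "3", "\\N", ""]

constructor_ids, mapping = encode_labels(constructors)
grid_clean = [to_int_or_default(g) for g in grid_positions]

print(constructor_ids)  # [0, 1, 0, 2]
print(grid_clean)       # [1, 3, -1, -1]
```

Integer codes imply an ordering between categories that does not really exist, so for some algorithms a one-hot encoding may be a better fit; which encoding suits this project is left open here.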

3.1.4 Modelling

Machine learning techniques are incorporated during the
modelling phase. To determine the model that most accurately
predicts the results of the races, the prepared dataset will be
split into training and testing sets, and several
machine learning algorithms will be trained and compared. The
algorithms used will be supervised machine learning algorithms
that cover regression and classification techniques. The following
are the suggested supervised learning algorithms to be used in the
model, based on the literature review:

i. Neural Network
ii. Random Forest
iii. Kernel SVM
iv. Gradient Boosting Tree
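
The split-train-compare procedure for the four algorithms listed above can be sketched as follows. This is only an illustration under stated assumptions: scikit-learn is assumed as the library (the report does not name one), the data is synthetic rather than real Formula One data, and the hyperparameters are arbitrary defaults, not tuned choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the prepared race data (NOT real Formula One data).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hold out 25% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One candidate per algorithm named in the list above; settings are illustrative.
models = {
    "Neural Network": MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=1000, random_state=0
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Kernel SVM": SVC(kernel="rbf"),
    "Gradient Boosting Tree": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data

# Print the candidates from most to least accurate on the test set.
for name, acc in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

For race data ordered by season, a chronological split (training on earlier seasons, testing on later ones) may be more realistic than a random split; that choice is not settled in this report.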

3.1.5 Evaluation

Upon completing the modelling part, the evaluation phase
assesses each prediction model. Each model will be
evaluated to determine how well it meets the project goals.
The evaluation will involve metrics such as accuracy,
precision, and F1 score.
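
To make these metrics concrete, the sketch below computes accuracy, precision, recall, and the F1 score (the harmonic mean of precision and recall, unrelated to Formula One) from a list of true and predicted labels. The function name and the toy labels are hypothetical and serve only as an example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 1 = driver won the race, 0 = driver did not win.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the accuracy is 0.75 while precision, recall, and F1 are each 2/3. Because only one driver wins each race, the winning class is rare, so accuracy alone can look high even for a model that rarely identifies the winner; precision and F1 help expose that.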

3.1.6 Deployment

After the evaluation phase is finished, the project's insights
and conclusions will be made available through an interactive
visualisation dashboard that gives users or
decision-makers a high-level view of the most crucial
indicators. It combines data with graphs, charts, and other visuals
in order to draw the audience's attention to the crucial metrics.
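
One simple way to hand model output to a dashboard tool is a CSV file, which tools such as Power BI can load as a data source. The sketch below serialises prediction rows to CSV text. The field names (race, driver, win_probability), the sample rows, and the file name in the comment are hypothetical; the actual hand-off format for this project has not been decided in this report.

```python
import csv
import io


def predictions_to_csv(rows, fieldnames):
    """Serialise prediction rows (a list of dicts) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical model output: one row per driver with a predicted win probability.
predictions = [
    {"race": "Example GP", "driver": "Driver A", "win_probability": 0.62},
    {"race": "Example GP", "driver": "Driver B", "win_probability": 0.27},
]

csv_text = predictions_to_csv(predictions, ["race", "driver", "win_probability"])
print(csv_text)

# To save the text for the dashboard tool (file name is illustrative):
# with open("predictions.csv", "w", newline="") as f:
#     f.write(csv_text)
```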

3.2 Machine Learning Algorithms

The machine learning algorithms that will be integrated in the model are
adopted from supervised learning. In supervised learning, a labelled
dataset is used to teach the computer to make predictions or decisions.
Each input in the training data is paired with the appropriate output or
label, and the algorithm learns the relationship between the input and the
output. Depending on the algorithm employed, the output's level of
accuracy varies. Supervised learning algorithms fall into two types:
classification models and regression models. Regression models
concentrate on predicting continuous variables like price and wage,
whereas classification models concentrate on predicting discrete values
like true or false. The choice of technique and algorithm can affect both
the accuracy and the speed of learning.

Hence, the proposed algorithms to be integrated inside the machine
learning model are as follows:

i. Neural Network
ii. Random Forest
iii. Gradient Boosting Tree
iv. Kernel SVM

3.3 Tools
3.3.1 Kaggle

Kaggle is an online platform that offers access to a large
collection of datasets as well as data analysis, machine learning,
and data visualisation tools. Kaggle also provides courses and
tutorials to help beginners learn data science and machine
learning. This tool is used in this project to collect and explore
available Formula One race data.

FIGURE 3. Kaggle Logo

3.3.2 Google Colab

Google Colab is a free cloud-based platform that allows
users to write and run arbitrary Python code, and it also supports
additional programming languages such as Julia and R. It is
well suited to machine learning and data analysis work.
Additionally, Google Colab gives users access to powerful
computing resources such as GPUs and TPUs that can be used to
speed up the training of machine learning models. In this project,
the tool will be used to write the code for the machine
learning prediction model.

FIGURE 4. Google Colab Logo

3.3.3 Microsoft Power BI

Power BI is a Microsoft product that enables users to
analyse data and communicate insights in real time. Users can
connect to different data sources through it, transform and clean
data, and build interactive reports and dashboards. Moreover,
Power BI offers a number of features and tools that are
intended to help users understand their data. This tool is
used during the deployment phase, when an interactive
dashboard will be used to visualise the insights and results of
the prediction model.

FIGURE 5. Power BI Logo

3.4 Project Milestones

Project milestones are specific, measurable and significant events
or accomplishments in a project's timeline that can be represented in a
Gantt chart to help track progress towards the project's completion. Using
this planning tool, the project can proceed in an organized and
efficient manner.

The Gantt charts below show the significant events to be
covered in FYP I as well as FYP II.

FYP I:

FIGURE 6. FYP I Gantt Chart

FYP II:

FIGURE 7. FYP II Gantt Chart

CHAPTER 4

CONCLUSION AND FUTURE WORK

4.1 Conclusion

In conclusion, this interim report provides a structured progress report
on the first part of the final year project, which aims to develop a machine
learning model that accurately predicts the outcome of Formula One races.
Measured against the CRISP-DM framework, the project is 20%
complete, and the next phases are currently being executed.

Machine learning in Formula One can be used not only to predict the
winners of a race but also in other areas of the sport, which
would be beneficial for the team. For example, sports analysts and
engineers can use machine learning to detect damage to the systems and
cars, analyze drivers' performance and the areas where they need to improve,
and predict the constructors' overall standings.

Overall, I have high hopes for this project and think that the model will
offer useful information for Formula One racing and the sports
analytics industry.

4.2 Future Work

The future work of this project includes executing the next steps of the
CRISP-DM framework, which will be presented in FYP II. The next steps
in this project include data preparation and analysis, data modelling, testing
and evaluation, as well as visualizing the findings and outcomes in an
interactive manner.

REFERENCES

[1] Bunker, R. P., & Thabtah, F. (2019). A machine learning framework for sport
result prediction. Applied Computing and Informatics, 15(1), 27–33.
https://doi.org/10.1016/j.aci.2017.09.005

[2] Bunker, R., & Susnjak, T. (2022). The application of machine learning
techniques for predicting match results in Team Sport: A Review. Journal of
Artificial Intelligence Research, 73, 1285–1322.
https://doi.org/10.1613/jair.1.13509

[3] Demir, İ., & Barman, İ. (2021). Modelling sport events with supervised machine
learning. Fundamental Journal of Mathematics and Applications, 232–244.
https://doi.org/10.33401/fujma.951665

[4] Gu, W., Foster, K., Shang, J., & Wei, L. (2019). A game-predicting expert
system using Big Data and machine learning. Expert Systems with
Applications, 130, 293–305. https://doi.org/10.1016/j.eswa.2019.04.025

[5] Gómez, M. A., Ibáñez, S. J., Parejo, I., & Furley, P. (2017). The use of
classification and regression tree when classifying winning and losing basketball
teams. Kinesiology, 49(1), 47. https://doi.org/10.26582/k.49.1.9

[6] Haghighat, M., Rastegari, H., & Nourafza, N. (2013). A Review of Data Mining
Techniques for Result Prediction in Sports, 2(5), 7–12.

[7] Lotfi, S., & Rebbouj, M. (2021). Machine learning for sport results prediction
using algorithms. International Journal of Information Technology and Applied
Sciences (IJITAS), 3(3), 148–155. https://doi.org/10.52502/ijitas.v3i3.114

[8] Nasteski, V. (2017). An overview of the supervised machine learning

methods. HORIZONS.B, 4, 51–62.
https://doi.org/10.20544/horizons.b.04.1.17.p05

[9] Ofoghi, B., Zeleznikow, J., MacMahon, C., & Dwyer, D. (2010). A machine
learning approach to predicting winning patterns in track cycling

18
omnium. Artificial Intelligence in Theory and Practice III, 67–76.
https://doi.org/10.1007/978-3-642-15286-3_7

[10] Passi, K., & Pandey, N. (2018). Increased prediction accuracy in the game of
cricket using Machine Learning. International Journal of Data Mining &
Knowledge Management Process, 8(2), 19–36.
https://doi.org/10.5121/ijdkp.2018.8203

[11] Sicoie, H. (2022). Machine Learning Framework for Formula 1 race winner and
championship standings predictor (thesis). Tilburg University. Cognitive
Science and Artificial Intelligence.

[12] Vopani, V. (2023, March 7). Formula 1 World Championship (1950 - 2023).
Kaggle. Retrieved March 30, 2023, from
https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020/code?resource=download

APPENDICES

Snapshot of dataset retrieved from Kaggle
