BTP Project Report
By
1. Abstract
Football has developed considerably over the past century, and so has the technology involved in the game. The Video Assistant Referee (VAR) is one such technology and has had a large impact on the sport.
The role of VAR is simple yet complex: to intervene during play when the on-field referees make a wrong decision or cannot make one at all. A specific scenario arises when they must decide whether a sliding tackle inside the box was a clean tackle or warrants a penalty for the opposition. The technology exists to replay the moment at which the tackle took place, but the decisions are still made by humans and can therefore be biased. We developed a CNN-based foul detector that is theoretically grounded in the principle of the initial point of contact.
2. Introduction
Football was codified in England in 1863, over 160 years ago. Since then, it has become one of the biggest entertainment sports in sporting history. In the 2012–13 season, the Dutch league introduced the Video Assistant Referee (VAR), which, like any other machine or technology adopted by humans to make their lives easier, was brought in to make the referees' job in the game easier.
In this project, we concentrate on one use case in particular: penalty decisions. Although these decisions are taken after a thorough check of repeat video recordings of the moment the tackle takes place, reviewed from different angles, the task still depends on human judgment and may contain bias. To automate this process, we propose a Convolutional Neural Network that takes an image of the initial point of contact as input and predicts whether a foul has been committed. Penalty decisions can then be automated rather than resting on human investigation.
3. Motivation
With the advent of technology, we have witnessed remarkable advancements in the game,
and one of the most significant innovations is the introduction of the Video Assistant
Referee (VAR) system. While VAR has enhanced the accuracy of refereeing decisions,
there remains room for improvement, particularly in the crucial area of penalty and foul
decisions.
The motivation for this project stems from the recognition that despite the availability of
advanced technology to review plays, the final decision still rests with human referees.
These referees, like all of us, are susceptible to human error, biases, and the pressures of
making split-second judgments that can profoundly impact the outcome of a match. This
inherent subjectivity in penalty decisions calls for a more objective and technologically
advanced solution.
The aim of this project is to harness the power of modern technology, specifically
Convolutional Neural Networks (CNNs), to revolutionize the way we approach penalty
decisions in football. By focusing on the initial point of contact during a tackle inside the
penalty box, we seek to automate the process of foul detection. This approach offers
several compelling motivations:
● Enhanced Objectivity: By employing a CNN-based system, we eliminate the
potential for human bias in penalty decisions. The technology operates solely
based on data and algorithms, ensuring a fair and impartial assessment of each
situation.
● Consistency: The consistency of decisions is crucial in football. This system
promises to deliver consistent results, reducing disputes and controversies
surrounding penalty calls. Players, coaches, and fans can have greater confidence
in the fairness of the game.
● Efficiency: The automation of penalty decisions through AI technology will
expedite the decision-making process. It will reduce the need for lengthy video
reviews, allowing the game to flow more smoothly and eliminating unnecessary
disruptions.
● Reduced Errors: Human referees are prone to errors, especially in high-pressure
situations. This project aims to minimize errors in penalty decisions, leading to
fairer outcomes and less frustration among stakeholders.
● Advancement of the Sport: Football has always evolved with the times, embracing
technological innovations to improve the game. Implementing AI for penalty
decisions is a natural progression that can enhance the sport's appeal and
competitiveness.
4. Literature Survey
The VAR-CNN
The model we are working with is a Convolutional Neural Network that takes images of the initial point of contact and classifies them. In this section, we discuss the data, the model architecture used, and the results and inferences drawn from our model. The model is a small stand-in for an actual video assistant referee; hence we named it VAR-CNN.
Data Collection:
Collecting the data has been an arduous task; there are no open-source datasets of this kind for any league. The only available sources are video clips of European matches and YouTube compilations of tackles and fouls. A small portion of the data was also acquired from the paper Soccer Event Detection Using Deep Learning.
The variations in the data can be observed above. A total of 1200+ images were scraped for two classes, namely Clean Tackles and Fouls. A clean tackle, as the name suggests, is when the defender gets to the ball first, so the initial contact is with the ball. A foul, on the contrary, is when the defender makes contact with the player first. This distinction was the basis of both the study and the data collection: the dataset records the moment of initial contact and the moment just after it.
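The two-class labelling convention described above (one set of clean-tackle images, one set of foul images) can be sketched as a small directory-walking helper. The folder names `clean_tackle` and `foul`, the `.jpg` extension, and the helper function itself are illustrative assumptions, not the report's actual dataset layout:

```python
from pathlib import Path
import tempfile

# Assumed layout: one sub-folder per class under a dataset root.
CLASS_LABELS = {"clean_tackle": 0, "foul": 1}

def list_labeled_images(root):
    """Pair every image path with its class label, inferred from its folder name."""
    pairs = []
    for class_name, label in CLASS_LABELS.items():
        for path in sorted((Path(root) / class_name).glob("*.jpg")):
            pairs.append((path, label))
    return pairs

# Tiny demonstration on a throwaway directory tree.
demo_root = Path(tempfile.mkdtemp())
for class_name in CLASS_LABELS:
    (demo_root / class_name).mkdir()
    (demo_root / class_name / "example_001.jpg").touch()

pairs = list_labeled_images(demo_root)
```

Keeping the label implicit in the folder name means new scraped images only need to be dropped into the right folder, with no separate annotation file to maintain.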
5. Problem Statement
Despite the introduction of the Video Assistant Referee (VAR) system, which aims to improve the accuracy of refereeing decisions, significant challenges remain in determining whether a sliding tackle inside the penalty box should be judged a clean tackle or a penalty for the opposition team. This project addresses the key problems arising from this dependence on human judgment: subjectivity, inconsistency, and the time cost of manual video review.
6. Implementation
Model structure:
The last three CNN blocks use dilated convolutions, the dense layers use ReLU activations, and the output layer uses a sigmoid activation, since this is a binary classification model. Input images were resized to 256×256 using nearest-neighbour interpolation. For data augmentation, we applied techniques such as rotation, horizontal flip, vertical flip, and brightness variation. During training, an early-stopping callback was used with a patience of 10 epochs and the best weights restored; it monitored validation accuracy. The training accuracy achieved was ~68% and the validation accuracy ~74%. These accuracies are low but acceptable given the size and complexity of the dataset.
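A rough sketch of the setup described above in Keras: dilated convolutions in the last three blocks, ReLU dense layers, a sigmoid output for binary classification, 256×256 inputs, augmentation, and early stopping with patience 10 monitoring validation accuracy. The exact filter counts, layer widths, and augmentation parameters are assumptions, since the report does not list them:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_var_cnn(input_shape=(256, 256, 3)):
    """Assumed VAR-CNN sketch: plain conv blocks, then three dilated blocks."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        # Last three blocks use dilated convolutions, per the report.
        layers.Conv2D(64, 3, dilation_rate=2, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, dilation_rate=2, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, dilation_rate=2, activation="relu", padding="same"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                     # dropout delayed overfitting most
        layers.Dense(1, activation="sigmoid"),   # binary: clean tackle vs foul
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Augmentation techniques named in the report (ranges are assumed).
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20, horizontal_flip=True, vertical_flip=True,
    brightness_range=(0.7, 1.3))

# Early stopping as described: patience 10, best weights restored,
# monitoring validation accuracy.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy", patience=10, restore_best_weights=True)

model = build_var_cnn()
```

Training would then call `model.fit(...)` with the augmented generator and `callbacks=[early_stop]`.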
Training Logs:
Overfitting was observed in every model across the different regularization combinations we tried, but it appeared latest when dropout was used, and since early stopping was employed, the best weights were restored.
7. Predictions
The predictions are made by converting videos into frames using OpenCV and generating
predictions for each frame.
8. Work Done So Far and Future Work
To date, we have successfully implemented the model in a Jupyter Notebook. The next steps will be model fine-tuning and deployment.
9. References
● Gerke, Sebastian, Antje Linnemann, and Karsten Müller. "Soccer player recognition using spatial
constellation features and jersey number recognition." Computer Vision and Image
Understanding 159 (2017): 105-115.
● Baysal, Sermetcan, and Pınar Duygulu. "Sentioscope: a soccer player tracking system using
model field particles." IEEE Transactions on Circuits and Systems for Video Technology 26, no.
7 (2015): 1350-1362.
● Kamble, P. R., A. G. Keskar, and K. M. Bhurchandi. "A deep learning ball tracking system in
soccer videos." Opto-Electronics Review 27, no. 1 (2019): 58-69.
● Choi, Kyuhyoung, and Yongduek Seo. "Automatic initialization for 3D soccer player tracking."
Pattern Recognition Letters 32, no. 9 (2011): 1274-1282.
● Kim, Wonjun. "Multiple object tracking in soccer videos using topographic surface analysis."
Journal of Visual Communication and Image Representation 65 (2019): 102683.
● Liu, Jia, Xiaofeng Tong, Wenlong Li, Tao Wang, Yimin Zhang, and Hongqi Wang. "Automatic
player detection, labeling and tracking in broadcast soccer video." Pattern Recognition Letters 30,
no. 2 (2009): 103-113.
● Komorowski, Jacek, Grzegorz Kurzejamski, and Grzegorz Sarwas. "BallTrack: Football ball
tracking for real-time CCTV systems." In 2019 16th International Conference on Machine Vision
Applications (MVA), pp. 1-5. IEEE, 2019.
● Hurault, Samuel, Coloma Ballester, and Gloria Haro. "Self-Supervised Small Soccer Player
Detection and Tracking." In Proceedings of the 3rd International Workshop on Multimedia
Content Analysis in Sports, pp. 9-18. 2020.
● Kamble, Paresh R., Avinash G. Keskar, and Kishor M. Bhurchandi. "A convolutional neural
network-based 3D ball tracking by detection in soccer videos." In Eleventh International
Conference on machine vision (ICMV 2018), vol. 11041, p. 110412O. International Society for
Optics and Photonics, 2019.
● Naidoo, Wayne Chelliah, and Jules Raymond Tapamo. "Soccer video analysis by ball, player and
referee tracking." In Proceedings of the 2006 annual research conference of the South African
institute of computer scientists and information technologists on IT research in developing
countries, pp. 51-60. 2006.
● Liang, Dawei, Yang Liu, Qingming Huang, and Wen Gao. "A scheme for ball detection and
tracking in broadcast soccer video." In Pacific-Rim Conference on Multimedia, pp. 864-875.
Springer, Berlin, Heidelberg, 2005.
● Mazzeo, Pier Luigi, Marco Leo, Paolo Spagnolo, and Massimiliano Nitti. "Soccer ball detection
by comparing different feature extraction methodologies." Advances in Artificial Intelligence
2012 (2012).
● Garnier, Paul, and Théophane Gregoir. "Evaluating Soccer Player: from Live Camera to Deep
Reinforcement Learning." arXiv preprint arXiv:2101.05388 (2021).