football-tda

The purpose of this project is to show a possible application of topological data analysis (TDA). Our use case is football, and the goal (pun intended) is to forecast the outcome of a match.

You can find our blog post at this link.

Data

The dataset we used can be found here. It is a collection of more than 25,000 European football matches played between 2008 and 2016. For each match, the starting eleven of both teams are available, along with match statistics and bookmaker odds.

The dataset also contains the attributes of more than 10,000 players taken from EA Sports' FIFA video game series, including weekly updates.

Feature Creation

We assume that each match can be modelled by the attributes of the starting eleven of both teams. Because this yields too many features, an additional aggregation step is required (see the notebook for further details).

Thus, each match can be considered as a vector in a vector space and the totality of matches can be viewed as a point cloud.
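As a rough illustration of how a match could be turned into a vector, here is a minimal sketch. It assumes the aggregation is a per-team average of each attribute, which may differ from what the notebook actually does; the attribute names and the `team_vector` and `match_vector` helpers are hypothetical.

```python
from statistics import mean

# Hypothetical attribute names; the real dataset has many more per player.
ATTRIBUTES = ("pace", "passing", "defending")


def team_vector(players):
    """Average each attribute over a team's starting eleven (assumed aggregation)."""
    return [mean(player[name] for player in players) for name in ATTRIBUTES]


def match_vector(home_players, away_players):
    """Concatenate the aggregated home and away vectors into one match vector."""
    return team_vector(home_players) + team_vector(away_players)


# Toy line-ups: eleven identical players per team, just to show the shapes.
home = [{"pace": 80, "passing": 70, "defending": 60}] * 11
away = [{"pace": 60, "passing": 75, "defending": 90}] * 11
vector = match_vector(home, away)
print(len(vector), vector)
```

With this choice the vector length depends only on the number of attributes, not on the 22 players, which is the point of aggregating.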

To capture local information around a match, we computed the persistent homology of its k-nearest neighbours and used it as a feature.
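The idea can be sketched in pure Python for the simplest case. This is not the notebook's implementation, which relies on giotto-tda: the sketch below computes only 0-dimensional persistence of a Vietoris-Rips filtration, taking the filtration value of an edge to be its length. In that setting the finite death times are exactly the edge lengths of a minimum spanning tree. Including the match itself in its neighbourhood and summing the death times into one feature are both assumptions made for illustration.

```python
import math


def k_nearest(points, index, k):
    """Return the k points closest to points[index] (Euclidean), excluding itself."""
    target = points[index]
    others = [p for i, p in enumerate(points) if i != index]
    return sorted(others, key=lambda p: math.dist(p, target))[:k]


def h0_death_times(points):
    """Finite death times of the 0-dimensional classes of a Vietoris-Rips filtration.

    Every point is born at scale 0. A connected component dies when an edge
    merges it into another one, so the death times are the edge lengths
    selected by Kruskal's minimum spanning tree algorithm.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(len(points))
        for j in range(i + 1, len(points))
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Toy point cloud standing in for match vectors.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0), (5.0, 6.0)]
neighbourhood = [cloud[0]] + k_nearest(cloud, 0, k=2)
deaths = h0_death_times(neighbourhood)
# One possible scalar feature: total persistence of the finite H0 classes.
feature = sum(deaths)
```

Higher-dimensional homology (loops, voids) needs a dedicated library; the sketch only shows how a local neighbourhood becomes a topological summary.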

Model

We cross-validated a random forest classifier and trained it to predict the outcome of a match. To validate our results, we used an Elo rating system and the market odds as baselines.
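For readers unfamiliar with the two baselines, the following sketch shows the standard Elo update and a common way to turn bookmaker odds into probabilities. The K-factor, the starting ratings, the example odds, and the proportional rescaling of the bookmaker margin are all assumptions; the project may configure its baselines differently.

```python
def elo_expected(rating_home, rating_away):
    """Expected score of the home team under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((rating_away - rating_home) / 400.0))


def elo_update(rating_home, rating_away, score_home, k=20.0):
    """Update both ratings after a match.

    score_home is 1.0 for a home win, 0.5 for a draw, 0.0 for an away win.
    k=20 is an illustrative choice, not a value taken from the project.
    """
    delta = k * (score_home - elo_expected(rating_home, rating_away))
    return rating_home + delta, rating_away - delta


def implied_probabilities(decimal_odds):
    """Turn decimal odds into probabilities, rescaling away the bookmaker margin."""
    inverse = [1.0 / odd for odd in decimal_odds]
    total = sum(inverse)
    return [value / total for value in inverse]


# Two evenly rated teams; the home side wins.
home, away = elo_update(1500.0, 1500.0, score_home=1.0)

# Hypothetical decimal odds for (home win, draw, away win).
market = implied_probabilities([2.0, 3.5, 4.0])
print(home, away, market)
```

Both baselines produce outcome probabilities or ratings that can be compared with a classifier's predictions on the same matches.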

Results

Our results show that our model outperforms the Elo rating system and is comparable to the market.

Notebook overview

Given the promising results, we tried to simulate an entire championship, with the ultimate aim of evaluating the impact a player would have had if signed by our favorite team. The notebook therefore lets you select both a favorite player and the lucky team he joins. You can then simulate the championship and check whether your player improves the final ranking of his new team (little spoiler: Messi does!).
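The simulation loop might look roughly like the sketch below. It is not the notebook's code: `toy_predict` is a hypothetical stand-in for the trained classifier, the team strengths are invented, and the transfer is modelled as a simple strength bonus. Only the scoring (3 points for a win, 1 for a draw) and the home-and-away round-robin follow the usual league format.

```python
import itertools
import random


def simulate_season(teams, predict, rng):
    """Play a double round-robin and return (team, points) pairs ranked by points.

    predict(home, away) is a placeholder for a trained model: it must return
    weights for (home win, draw, away win).
    """
    points = {team: 0 for team in teams}
    for home, away in itertools.permutations(teams, 2):
        outcome = rng.choices(("home", "draw", "away"), weights=predict(home, away))[0]
        if outcome == "home":
            points[home] += 3
        elif outcome == "away":
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    return sorted(points.items(), key=lambda item: item[1], reverse=True)


# Hypothetical team strengths standing in for aggregated player attributes.
strength = {"Team A": 80.0, "Team B": 75.0, "Team C": 70.0, "Team D": 65.0}


def toy_predict(home, away):
    """Toy stand-in for the classifier: stronger teams win more often."""
    edge = (strength[home] - strength[away]) / 100.0
    home_win = min(max(0.40 + edge, 0.05), 0.70)
    draw = 0.25
    away_win = 1.0 - home_win - draw
    return home_win, draw, away_win


before = simulate_season(list(strength), toy_predict, random.Random(0))
strength["Team D"] += 10.0  # hypothetical effect of signing a star player
after = simulate_season(list(strength), toy_predict, random.Random(0))
print(before)
print(after)
```

Because a single simulated season is noisy, comparing average rankings over many seeds would give a fairer picture of a player's impact.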

Enjoy!

Requirements

To run the notebook, the following Python packages are required:

giotto-tda 0.1.4
pandas 0.25.3
pyarrow 0.15.1
tqdm 4.38.0
wget 3.2
openml

