Sports Analytics Weekly by kubeia.io - 44/2025

Your weekly serving of sports analytics insights.

📝 Sports Analytics

Expecting Goals Premier League Team Ratings

expectinggoals.com - Michael Caley

If there were no Expecting Goals team ratings model, people could still get insights from soccer analytics about their favorite teams. And while I have worked hard to optimize this model and find new ways to use the statistical record to evaluate clubs, these are surely marginal gains. I decided to build this ratings system for two reasons. The first is that it is fun. And the second is that building a ratings system opens up a variety of possible new studies and new ways of approaching studies.

You may not like Arsenal, but this is what peak performance looks like

thetransferflow.com - Neel Shelat

Arsenal are now four points clear at the top of the Premier League and certainly look like the best team in the division.

Why We Still Don't Really Know Who's Good at Football?

youtube.com - Michael MacKelvie

We know that players are influenced by their environments, but how do we separate them? How does this vary positionally?

Team Ratings and Power Rankings for European Soccer

expectinggoals.com - Michael Caley

One decision I made in constructing the Expecting Goals Team Ratings system was not to optimize just for the English Premier League. The training and testing set included equal numbers of seasons from the top divisions in Spain, Italy, Germany and France. It is possible that, at the margins, this decision will make the ratings and projections a little less precise for the Premier League. But it also means that for future studies, these methods have been tested on more data and can be used on a more expansive set of data.

Can a Common Format Fix Football’s Data Chaos?

substack.com - Alex Marin Felices

The following summary critically reviews the research paper titled "Common Data Format (CDF): A Standardized Format for Match-Data in Football (Soccer)" by Gabriel Anzer, Kilian Arnsmeyer, Pascal Bauer, Joris Bekkers, Ulf Brefeld, Jesse Davis, Nicolas Evans, Matthias Kempe, Samuel J Robertson, Joshua Wyatt Smith and Jan Van Haaren. All data, figures, and analysis presented here are drawn from their original work; I do not claim any authorship or ownership of the content. This summary has been written to provide a concise and technically informed synthesis of the paper’s findings, methodologies, and implications, while maintaining fidelity to the authors’ intellectual contributions.

💰 Quantitative Finance

Features Selection in the Age of Generative AI

substack.com - Ernest Chan

Features are inputs to machine learning algorithms. Sometimes also called independent variables, covariates, or just X, they can be used for supervised or unsupervised learning, or for optimization. For example, at QTS, we use more than 100 of them as inputs to dynamically calibrate the allocation between our Tail Reaper strategy and E-mini S&P 500 futures. In general, modelers have no idea which features are useful a priori, or if they are redundant, for a particular application. Using all of the features can result in overfitting and poor out-of-sample performance, or worse, numerical instability and singularities during matrix inversion. Hence the need for a process called “features selection”.
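
To make the redundancy problem concrete, here is a minimal sketch of one simple filter: drop any feature that is almost perfectly correlated with a feature already kept. This is not the method described in the linked article; the feature names, the sample values, and the 0.95 threshold are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if its absolute correlation with every
    previously kept feature is below the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy features: "momentum_x2" is an exact multiple of
# "momentum", so it carries no additional information.
features = {
    "momentum": [1.0, 2.0, 3.0, 4.0, 5.0],
    "momentum_x2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "volatility": [3.0, 1.0, 4.0, 1.0, 5.0],
}

selected = drop_redundant(features)
print(selected)
```

A pair of perfectly correlated columns like these is also what makes a matrix such as X'X singular in a linear regression, which is one reason to remove redundant features before fitting. A pairwise filter of this kind only catches linear redundancy between two features; it says nothing about which of the remaining features are actually useful for prediction.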

This newsletter is brought to you by κυβεῖα. Kubeia is an innovative startup revolutionizing sports predictions with its user-friendly, no-code machine learning platform.
