Sports Analytics Weekly by kubeia.io - 26/2023

Your weekly serving of sports analytics insights.

🎲 Betting

Highlights from my meeting with the Indiana Gaming Commission

twitter.com - HarryDCrane

Sports betting industry is focused on providing a profitable environment for operators so states can collect revenue. There is no place for winning bettors. They are unwanted and unwelcome.

📝 Sports Analytics

Position-less Football: Receiver Alignment in the NFL

statsbomb.com - Will Morgan

As a follower of multiple sports, it's an interesting exercise to consider what broad trends can be observed from sport to sport. One such shift has been the upending of the traditional roles of players and how they are deployed strategically, which has subverted how "the game is supposed to be played".

The 2015/16 Big 5 Leagues Free Data Release

statsbomb.com - StatsBomb

StatsBomb is celebrating a 10th anniversary this summer: a decade since the forming of the website to share and host work from the analytics community. The foundation of the website and the business are community-based, and we've always been keen to pay it back.
So this summer, we're going to release the 2015/16 Big 5 League seasons, on our industry-leading data spec, for free.
1,826 matches, 98 teams, ~2,500 players, and ~6,000,000 rows of event data to work with.

A New Metric for Pitch Control based on an Intuitive Motion Model

sfu.ca - Lucas Wu and Tim B. Swartz

With the availability of tracking data, the determination of pitch control (field ownership) is an increasingly important topic in sports analytics. This paper reviews various approaches for the determination of pitch control and introduces a new field ownership metric that takes into account associated sporting dynamics. The methods that are proposed utilize the movement of the ball and players. Specifically, physical characteristics such as current velocity, acceleration and maximum velocity are considered. The determination of pitch control is based on the time that it takes the ball and the players to reach a given location. The main result of our investigation concerns the validation of the resultant pitch control diagram. Based on a sample of 5887 passes, the team identified as having pitch control was the observed recipient of the pass with 91% accuracy. The approach is generally applicable to invasion sports and is illustrated in the context of soccer. Various parameters are introduced that allow a user to modify the methods to alternative sports and to introduce player-specific maximum velocities and player-specific accelerations.
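
The time-to-location idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' motion model: it assumes each player runs in a straight line toward the target, starts at a given speed along that line, and accelerates at a constant rate up to a maximum speed. The function names, the player tuple layout, and the default acceleration and maximum speed are all hypothetical.

```python
import math


def time_to_reach(distance, speed, accel=3.0, max_speed=8.0):
    """Seconds needed to cover `distance` metres in a straight line.

    Assumed model (not the paper's): start at `speed` m/s toward the target,
    accelerate at constant `accel` m/s^2 until `max_speed`, then hold it.
    """
    if distance <= 0:
        return 0.0
    speed = min(max(speed, 0.0), max_speed)
    # Time and distance spent accelerating up to the maximum speed.
    t_acc = (max_speed - speed) / accel
    d_acc = speed * t_acc + 0.5 * accel * t_acc ** 2
    if distance <= d_acc:
        # Target reached while still accelerating:
        # solve 0.5 * accel * t^2 + speed * t - distance = 0 for t >= 0.
        return (-speed + math.sqrt(speed ** 2 + 2.0 * accel * distance)) / accel
    # Otherwise cover the remaining distance at maximum speed.
    return t_acc + (distance - d_acc) / max_speed


def controlling_team(players, target):
    """Team whose fastest player reaches `target` first.

    `players` is a list of (team, x, y, speed) tuples; `target` is (x, y).
    """
    best_team, best_time = None, math.inf
    for team, x, y, speed in players:
        dist = math.hypot(target[0] - x, target[1] - y)
        t = time_to_reach(dist, speed)
        if t < best_time:
            best_team, best_time = team, t
    return best_team
```

Evaluating `controlling_team` over a grid of target locations would give a crude ownership map. Player-specific accelerations and maximum speeds, which the abstract mentions, could be passed per player instead of using the defaults.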

💰 Quantitative Finance

BloombergGPT: A Large Language Model for Finance (Hudson & Thames reading group)

youtube.com - Michael Struwig

This week's reading group video focuses on the paper entitled 'BloombergGPT: A Large Language Model for Finance'. The paper has attracted considerable attention, and we are thrilled to explore it further with everyone on Friday.
We will discuss various aspects of the model, including its architecture and origins, the distinctive dataset used to train it, its assessment for financial tasks, and the extent to which it lives up to the media's hype regarding its relevance and groundbreaking nature.

The inner workings of a top HFT firm

twitter.com - Christina Qi

If you want to learn about the inner workings of a top HFT firm, read this case where KCG is suing a former employee.

🤖 Machine Learning

LLM Powered Autonomous Agents

github.io - Lilian Weng

Building agents with an LLM (large language model) as their core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potential of LLMs extends beyond generating well-written copy, stories, essays and programs; an LLM can be framed as a powerful general problem solver.

Large language models, Stanford CS324

github.io - Percy Liang

Welcome to CS324! This is a new course on understanding and developing large language models.

🚀 Engineering

When NumPy is too slow

pythonspeed.com - Itamar Turner-Trauring

If you’re doing numeric calculations, NumPy is a lot faster than plain Python, but sometimes that’s not enough. What should you do when your NumPy-based code is too slow?

🕰️ Blast From the Past

Green All Over: A Priori, Patterns and Sceptics

blogspot.com

Aren't ALL games of "uncertain outcome"? And what does a 'small streak without drawing' mean? Basically nothing, except that a pattern has been found that has no predictive value at all. I guess a small fortune was lost this year on Tottenham Hotspur, chasing that overdue Draw!

Sharp Books, Soft Books: Inside the Sportsbook Ecosystem

wordpress.com

Today I’m going to tackle a topic that is somewhat controversial. Myths, misconceptions and disinformation are everywhere, there is passionate debate from all directions, and there is a lot of complex stuff going on. Let’s talk about the different ways that a centralized sportsbook can be run; specifically, how they set their odds and how they manage risk.

Modified Kelly Criteria

sfu.ca - Dani Chu, Yifan Wu and Tim B. Swartz

This paper considers an extension of the Kelly criterion used in sports wagering. By recognizing that the probability p of placing a correct wager is unknown, modified Kelly criteria are obtained that take the uncertainty into account. Estimators are proposed that are developed from a decision theoretic framework. We observe that the resultant betting fractions can differ markedly based on the choice of loss function. In the cases that we study, the modified Kelly fractions are smaller than original Kelly. Journal of Quantitative Analysis in Sports, 14, 1-11.
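
For reference, the original Kelly fraction that the paper extends can be sketched as below. The classic formula is f = (p*b - (1 - p)) / b, where b is the net profit per unit staked. The `shrunk_kelly` helper is a hypothetical fixed shrinkage, included only to illustrate a stake smaller than full Kelly; it is not one of the paper's estimators, which come from a decision-theoretic treatment of the uncertainty in p.

```python
def kelly_fraction(p, decimal_odds):
    """Classic Kelly stake as a fraction of bankroll.

    p: estimated probability that the wager wins.
    decimal_odds: total payout per unit staked, stake included.
    """
    b = decimal_odds - 1.0  # net profit per unit staked
    if b <= 0:
        return 0.0
    f = (p * b - (1.0 - p)) / b
    return max(f, 0.0)  # stake nothing when the edge is negative


def shrunk_kelly(p, decimal_odds, shrink=0.5):
    """Hypothetical fixed shrinkage of the Kelly stake (an assumption,
    not the paper's modified Kelly criteria)."""
    return shrink * kelly_fraction(p, decimal_odds)
```

With an estimated win probability of 0.55 at even money (decimal odds 2.0), full Kelly stakes about 10% of the bankroll and the half-Kelly variant about 5%.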

Simplified Kalman filter for online rating: one-fits-all approach

arxiv.org - Leszek Szczecinski and Raphaelle Tihon

In this work, we deal with the problem of rating in sports, where the skills of the players/teams are inferred from the observed outcomes of the games. Our focus is on the online rating algorithms which estimate the skills after each new game by exploiting the probabilistic models of the relationship between the skills and the game outcome. We propose a Bayesian approach which may be seen as an approximate Kalman filter and which is generic in the sense that it can be used with any skills-outcome model and can be applied in the individual- as well as in the group-sports. We show how the well-known algorithms (such as the Elo, the Glicko, and the TrueSkill algorithms) may be seen as instances of the one-fits-all approach we propose. In order to clarify the conditions under which the gains of the Bayesian approach over the simpler solutions can actually materialize, we critically compare the known and the new algorithms by means of numerical examples using the synthetic as well as the empirical data.
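
As a point of comparison, the simplest of the algorithms named in the abstract, Elo, updates ratings after each game as sketched below. This is the standard Elo rule, not the paper's Kalman-filter approach; the function names are hypothetical, and the step size `k=20` and scale of 400 are conventional defaults assumed here.

```python
def elo_expected(rating_a, rating_b, scale=400.0):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    """
    expected_a = elo_expected(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # Elo is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta
```

Unlike the Bayesian algorithms the paper discusses, this update tracks only a point estimate of skill and carries no measure of its uncertainty.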

The Triple Jeopardy of Ke Xu, a Chinese Hedge Fund Quant

bloomberg.com - Kit Chellel and Jeremy Hodges

A secretive hedge fund used the British court system to punish an IP thief, even though he was already in jail.

Defensive Metrics: Measuring the Intensity of a High Press

statsbomb.com - Colin Trainor

In an article published last October I took my first look at some defensive metrics. That piece was very much an introductory one as I offered up a few of my initial ideas for consideration. I’m now going to take the opportunity to expand on one of the ideas that I wrote about in that initial article, Passes Allowed Per Defensive Action (PPDA).
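
A rough sketch of how PPDA could be computed from event data follows. It divides the opponent's passes by the pressing team's defensive actions, counting only events inside a pressing zone, so a lower value indicates a more intense press. The event layout (dicts with `team`, `type` and `x` keys), the set of defensive action types, and the `zone_start` threshold are assumptions for illustration and are not taken from the article.

```python
# Assumed set of event types that count as defensive actions.
DEFENSIVE_ACTIONS = {"tackle", "interception", "challenge", "foul"}


def ppda(events, pressing_team, zone_start=40.0):
    """Passes allowed per defensive action for `pressing_team`.

    `events` is an iterable of dicts with keys "team", "type" and "x",
    where "x" runs from 0 (pressing team's own goal line) to 100.
    Only events with x >= zone_start are counted (assumed pressing zone).
    """
    passes_allowed = 0
    defensive_actions = 0
    for ev in events:
        if ev["x"] < zone_start:
            continue
        if ev["team"] != pressing_team and ev["type"] == "pass":
            passes_allowed += 1
        elif ev["team"] == pressing_team and ev["type"] in DEFENSIVE_ACTIONS:
            defensive_actions += 1
    if defensive_actions == 0:
        return float("inf")  # no defensive actions recorded in the zone
    return passes_allowed / defensive_actions
```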

🎙️ Podcast

Episode 162: Exploring the Zen of Python & pandas Features for Finance

realpython.com - Christopher Bailey, Christopher Trudeau

We cover a recent post by previous guest Matt Harrison about using Python and pandas for finance. Matt’s article covers methods in the pandas library for aggregation, resampling, and rolling averages.

This newsletter is brought to you by κυβεῖα. Kubeia is an innovative startup revolutionizing sports predictions with its user-friendly, no-code machine learning platform.
