obrhubr.org - Niklas Oberhuber
In a fictional casino which offers even odds on a fair coin toss game, how much of your money should you invest? If you said anything other than 0, you're leaving broke at the end of the night.
If you want to know how much to invest every flip, you should apply the Kelly Criterion. It's a way of calculating the optimal fraction of your capital to invest in order to maximise growth over a long series of bets.
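As a rough illustration of the blurb above, here is a minimal sketch of the textbook Kelly formula for a simple win-or-lose bet, f* = p - (1 - p) / b. The function name `kelly_fraction` and its parameters are assumptions made for this example and do not come from the linked post: `p` is the probability of winning and `b` is the profit per unit staked (so `b = 1` means even odds).

```python
def kelly_fraction(p: float, b: float) -> float:
    """Kelly stake as a fraction of bankroll for a simple binary bet.

    p: probability of winning the bet.
    b: net odds, i.e. profit per unit staked on a win (b = 1 is even odds).
    Assumes the whole stake is lost on a losing bet.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if b <= 0.0:
        raise ValueError("b must be positive")
    edge = p * b - (1.0 - p)   # expected profit per unit staked
    return max(0.0, edge / b)  # stake nothing when there is no edge


if __name__ == "__main__":
    print(kelly_fraction(0.5, 1.0))  # fair coin at even odds
    print(kelly_fraction(0.6, 1.0))  # coin biased in your favour, even odds
```

For the fair coin at even odds the edge is zero, so the suggested stake is 0, which matches the answer given above. A coin that lands in your favour 60% of the time at even odds gives a stake of roughly 0.2 of the bankroll.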
arxiv.org - Haoyu Liu, Carl Donovan, Valentin Popov
Abstract:Financial and gambling markets are ostensibly similar and hence strategies from one could potentially be applied to the other. Financial markets have been extensively studied, resulting in numerous theorems and models, while gambling markets have received comparatively less attention and remain relatively undocumented. This study conducts a comprehensive comparison of both markets, focusing on trading rather than regulation. Five key aspects are examined: platform, product, procedure, participant and strategy. The findings reveal numerous similarities between these two markets. Financial exchanges resemble online betting platforms, such as Betfair, and some financial products, including stocks and options, share speculative traits with sports betting. We examine whether well-established models and strategies from financial markets could be applied to the gambling industry, which lacks comparable frameworks. For example, statistical arbitrage from financial markets has been effectively applied to gambling markets, particularly in peer-to-peer betting exchanges, where bettors exploit odds discrepancies for risk-free profits using quantitative models. Therefore, exploring the strategies and approaches used in both markets could lead to new opportunities for innovation and optimization in trading and betting activities.
youtube.com - Peter Webb
In this video, we look at the world of sports betting and explore the often-overlooked biases that can significantly impact your betting strategies. Whether you're a seasoned bettor or just getting started, understanding these hidden biases is crucial for making informed decisions and improving your chances of success. We discuss the psychological and statistical factors that influence betting markets, shedding light on how these biases can skew odds and affect outcomes. By the end of this video, you'll have a clearer understanding of how to identify and mitigate these biases, allowing you to approach sports betting with a more strategic mindset.
thetransferflow.com
We have a newsletter where folks who have worked in analytics for clubs and data companies cover all the biggest news and rumours in football. https://www.thetransferflow.com/
statsbomb.com - Hudl Statsbomb
Since our inception, Hudl Statsbomb has been at the forefront of educating the next generation of football analysts. We're committed to providing materials and resources to develop the skills needed to enter the industry, from free datasets and code to industry-standard education and training courses. We've opened the doors to our extensive knowledge base, all to help aspiring analysts hone their craft.
statsbomb.com - Akedjou Achraff Adjileye
In this paper, I introduce RisingBALLER, the first public transformer-based model trained on football match data to learn match-specific player representations. Inspired by advancements in language modeling, RisingBALLER treats each football match as a unique sequence where players act as tokens, with their embeddings conditioned by the specific context of the match.
statsbomb.com - Hudl Statsbomb
Analysis was once performed using limited video and basic data resources. Now teams have access to video and analysis software that can dramatically enhance the quality and speed of their work, whether in player recruitment or opposition scouting. StatsBomb's data and tools are used by 200 teams around the world to answer important football questions, one of which we'll answer today: how can we profile and analyse a player's passing using data? Focusing on the Swedish Allsvenskan, Norwegian Eliteserien, and Danish Superliga, we'll harness three state-of-the-art models from the StatsBomb data science team to look at the types of passes players typically make, which passes add the most value to their team's play, and which players are completing them effectively.
statsbomb.com - Edoardo Ghezzi and Hadi Sotudeh
Match phases are the primary means by which match time is indexed and divided into
valuable units for coaches. For each phase, coaches instruct their teams to deploy a set
of customized principles and arrangements (Pioli 2003, 7–24). Moreover, these indices
can have various applications such as trawling through video footage in match analysis,
fan engagement in media, and trend analysis for soccer governing bodies. Last but not
least, football analysts have focused for a long time on counting events without
phase-related context, providing a blurry image of what happened on the pitch. This paper
proposes a framework for match phase identification and its possible applications based
on StatsBomb's 360 data (event and partial tracking data), which is also publicly available
online (StatsBomb 2023).
arxiv.org - Ryan S. Brill, Ryan Yee, Sameer K. Deshpande, Abraham J. Wyner
Abstract:Expected points is a value function fundamental to player evaluation and strategic in-game decision-making across sports analytics, particularly in American football. To estimate expected points, football analysts use machine learning tools, which are not equipped to handle certain challenges. They suffer from selection bias, display counter-intuitive artifacts of overfitting, do not quantify uncertainty in point estimates, and do not account for the strong dependence structure of observational football data. These issues are not unique to American football or even sports analytics; they are general problems analysts encounter across various statistical applications, particularly when using machine learning in lieu of traditional statistical models. We explore these issues in detail and devise expected points models that account for them. We also introduce a widely applicable novel methodological approach to mitigate overfitting, using a catalytic prior to smooth our machine learning models.
arxiv.org - Gian-Gabriel P. Garcia, J. Carlos Martínez Mori
Abstract:Given a collection of historical sports rankings, can one tell which player is the greatest of all time (i.e., the GOAT)? In this work, we design a data-driven random walk on the symmetric group to obtain a stationary distribution over player rankings, spanning across different time periods in sports history. We combine this distribution with a notion of stochastic dominance to obtain a partial order over the players. We implement our methods using publicly available data from the Association of Tennis Professionals (ATP) and the Women's Tennis Association (WTA) to find the GOATs in the respective categories.
arxiv.org - Jane Shaw MacDonald, Rafael Ordoñez Cardales, John M. Stockie
Abstract:Nordic skiing provides fascinating opportunities for mathematical modelling studies that exploit methods and insights from physics, applied mathematics, data analysis, scientific computing and sports science. A typical ski course winds over varied terrain with frequent changes in elevation and direction, and so its geometry is naturally described by a three-dimensional space curve. The skier travels along a course under the influence of various forces, and their dynamics can be described using a nonlinear system of ordinary differential equations (ODEs) that are derived from Newton's laws of motion. We develop an algorithm for solving the governing equations that combines Hermite spline interpolation, numerical quadrature and a high-order ODE solver. Numerical simulations are compared with measurements of skiers on actual courses to demonstrate the effectiveness of the model.
arxiv.org - Eduardo Alves Baratela, Felipe Jordão Xavier, Thomas Peron, Paulino Ribeiro Villas-Boas, Francisco Aparecido Rodrigues
Abstract:Soccer attracts the attention of many researchers and professionals in the sports industry. Therefore, the incorporation of science into the sport is constantly growing, with increasing investments in performance analysis and sports prediction industries. This study aims to (i) highlight the use of complex networks as an alternative tool for predicting soccer match outcomes, and (ii) show how the combination of structural analysis of passing networks with match statistical data can provide deeper insights into the game patterns and strategies used by teams. In order to do so, complex network metrics and match statistics were used to build machine learning models that predict the wins and losses of soccer teams in different leagues. The results showed that models based on passing networks were as effective as "traditional" models, which use general match statistics. Another finding was that by combining both approaches, more accurate models were obtained than when they were used separately, demonstrating that the fusion of such approaches can offer a deeper understanding of game patterns, allowing the comprehension of tactics employed by teams, relationships between players, their positions, and interactions during matches. It is worth mentioning that both network metrics and match statistics were important and impactful for the mixed model. Furthermore, the use of networks with a lower granularity of temporal evolution (such as creating a network for each half of the match) performed better than a single network for the entire game.
arxiv.org - Emiliano Seri, Roberto Rocci, Thomas Brendan Murphy
Abstract:The standard mixture modelling framework has been widely used to study heterogeneous populations, by modelling them as being composed of a finite number of homogeneous sub-populations. However, the standard mixture model assumes that each data point belongs to one and only one mixture component, or cluster, but when data points have fractional membership in multiple clusters this assumption is unrealistic. It is in fact conceptually very different to represent an observation as partly belonging to multiple groups instead of belonging to one group with uncertainty. For this purpose, various soft clustering approaches, or individual-level mixture models, have been developed. In this context, Heller et al. (2008) formulated the Bayesian partial membership model (PM) as an alternative structure for individual-level mixtures, which also captures partial membership in the form of attribute-specific mixtures, but does not assume a factorization over attributes. Our work proposes using the PM for soft clustering of count data arising in football performance analysis and compares the results with those achieved with the mixed membership model and finite mixture model. Learning and inference are carried out using Markov chain Monte Carlo methods. The method is applied to Serie A football player data from the 2022/2023 football season, to estimate the positions on the field where the players tend to play, in addition to their primary position, based on their playing style. The application of the partial membership model to football data could have practical implications for coaches, talent scouts, team managers and analysts. These stakeholders can utilize the findings to make informed decisions related to team strategy, talent acquisition, and statistical research, ultimately enhancing performance and understanding in the field of football.
github.com
Eagle converts football broadcast data from television feeds to tracking data useful for analysis and visualisation. It uses a collection of custom trained models and a variety of computer vision techniques to identify, track and obtain player and ball coordinates from each frame of broadcast data.
arxiv.org - Nikolay S. Falaleev, Ruilong Chen
Abstract:Accurate camera calibration is essential for transforming 2D images from camera sensors into 3D world coordinates, enabling precise scene geometry interpretation and supporting sports analytics tasks such as player tracking, offside detection, and performance analysis. However, obtaining a sufficient number of high-quality point pairs remains a significant challenge for both traditional and deep learning-based calibration methods. This paper introduces a multi-stage pipeline that addresses this challenge by leveraging the structural features of the football pitch. Our approach significantly increases the number of usable points for calibration by exploiting line-line and line-conic intersections, points on the conics, and other geometric features. To mitigate the impact of imperfect annotations, we employ data fitting techniques. Our pipeline utilizes deep learning for keypoint and line detection and incorporates geometric constraints based on real-world pitch dimensions. A voter algorithm iteratively selects the most reliable keypoints, further enhancing calibration accuracy. We evaluated our approach on the largest football broadcast camera calibration dataset available, and secured the top position in the SoccerNet Camera Calibration Challenge 2023 [arXiv:2309.06006], which demonstrates the effectiveness of our method in real-world scenarios. The project code is available at this https URL .
arxiv.org - Arjun Raj, Lei Wang, Tom Gedeon
Abstract:Accurately detecting and tracking high-speed, small objects, such as balls in sports videos, is challenging due to factors like motion blur and occlusion. Although recent deep learning frameworks like TrackNetV1, V2, and V3 have advanced tennis ball and shuttlecock tracking, they often struggle in scenarios with partial occlusion or low visibility. This is primarily because these models rely heavily on visual features without explicitly incorporating motion information, which is crucial for precise tracking and trajectory prediction. In this paper, we introduce an enhancement to the TrackNet family by fusing high-level visual features with learnable motion attention maps through a motion-aware fusion mechanism, effectively emphasizing the moving ball's location and improving tracking performance. Our approach leverages frame differencing maps, modulated by a motion prompt layer, to highlight key motion regions over time. Experimental results on the tennis ball and shuttlecock datasets show that our method enhances the tracking performance of both TrackNetV2 and V3. We refer to our lightweight, plug-and-play solution, built on top of the existing TrackNet, as TrackNetV4.
arxiv.org - Anthony Cioppa, Silvio Giancola, Vladimir Somers, Victor Joos, Floriane Magera, Jan Held, Seyed Abolfazl Ghasemzadeh, ...
Abstract:The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team. These challenges aim to advance research across multiple themes in football, including broadcast video understanding, field understanding, and player understanding. This year, the challenges encompass four vision-based tasks. (1) Ball Action Spotting, focusing on precisely localizing when and which soccer actions related to the ball occur, (2) Dense Video Captioning, focusing on describing the broadcast with natural language and anchored timestamps, (3) Multi-View Foul Recognition, a novel task focusing on analyzing multiple viewpoints of a potential foul incident to classify whether a foul occurred and assess its severity, (4) Game State Reconstruction, another novel task focusing on reconstructing the game state from broadcast videos onto a 2D top-view map of the field. Detailed information about the tasks, challenges, and leaderboards can be found at this https URL, with baselines and development kits available at this https URL.
arxiv.org - Silvio Giancola, Anthony Cioppa, Bernard Ghanem, Marc Van Droogenbroeck
Abstract:The task of action spotting consists in both identifying actions and precisely localizing them in time with a single timestamp in long, untrimmed video streams. Automatically extracting those actions is crucial for many sports applications, including sports analytics to produce extended statistics on game actions, coaching to provide support to video analysts, or fan engagement to automatically overlay content in the broadcast when specific actions occur. However, before 2018, no large-scale datasets for action spotting in sports were publicly available, which impeded benchmarking action spotting methods. In response, our team built the largest dataset and the most comprehensive benchmarks for sports video understanding, under the umbrella of SoccerNet. Particularly, our dataset contains a subset specifically dedicated to action spotting, called SoccerNet Action Spotting, containing more than 550 complete broadcast games annotated with almost all types of actions that can occur in a football game. This dataset is tailored to develop methods for automatic spotting of actions of interest, including deep learning approaches, by providing a large amount of manually annotated actions. To engage with the scientific community, the SoccerNet initiative organizes yearly challenges, during which participants from all around the world compete to achieve state-of-the-art performances. Thanks to our dataset and challenges, more than 60 methods were developed or published over the past five years, improving on the first baselines and making action spotting a viable option for the sports industry. This paper traces the history of action spotting in sports, from the creation of the task back in 2018, to the role it plays today in research and the sports industry.
hulltactical.com
There have been many proposed market indicators that seem crazy or at least simplistic "cool stories", including: whether an NFC team or AFC team won the Super Bowl; the length of cigarette butts; how full parking lots have been; and the length of skirts. Some such indicators can be dismissed. The sample size might be too small. Or the measurement might be too subjective. But then it gets tricky. What if the statistics are inarguable but the effect still seems weird and implausible? Which leads us to consider moon phases as an indicator. This was studied by Ilia Dichev and Troy Janes in the paper "Lunar Cycle Effects in Stock Returns."
arxiv.org - Shiva Maharaj, Nick Polson, Vadim Sokolov
Abstract:We provide a statistical analysis of the recent controversy between Vladimir Kramnik (ex chess world champion) and Hikaru Nakamura. Hikaru Nakamura is a chess prodigy and a five-time United States chess champion. Kramnik called into question Nakamura's 45.5 out of 46 win streak in an online blitz contest at this http URL. We assess the weight of evidence using a priori assessment of Viswanathan Anand and the streak evidence. Based on this evidence, we show that Nakamura has a 99.6 percent chance of not cheating. We study the statistical fallacies prevalent in both their analyses. On the one hand, Kramnik bases his argument on the probability of such a streak being very small. This falls precisely into the Prosecutor's Fallacy. On the other hand, Nakamura tries to refute the argument using a cherry-picking argument. This violates the likelihood principle. We conclude with a discussion of the relevant statistical literature on the topic of fraud detection and the analysis of streaks in sports data.
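The Prosecutor's Fallacy named in the abstract above is the confusion of P(evidence | innocent) with P(innocent | evidence). A small sketch using Bayes' rule may help make the gap concrete. All numbers below are invented purely for illustration and are not the paper's inputs or results; the function name `posterior_cheating` is likewise an assumption of this example.

```python
def posterior_cheating(prior: float,
                       p_streak_if_cheating: float,
                       p_streak_if_honest: float) -> float:
    """P(cheating | streak) from Bayes' rule.

    prior: P(cheating) before seeing the streak.
    p_streak_if_cheating: P(streak | cheating).
    p_streak_if_honest: P(streak | honest).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    numerator = prior * p_streak_if_cheating
    denominator = numerator + (1.0 - prior) * p_streak_if_honest
    return numerator / denominator


# Hypothetical inputs, chosen only to illustrate the fallacy.
prior = 0.0001               # assumed: 1 in 10,000 top players cheats
p_streak_if_cheating = 1.0   # assumed: a cheater always produces the streak
p_streak_if_honest = 0.01    # assumed: an honest player does so 1% of the time

posterior = posterior_cheating(prior, p_streak_if_cheating, p_streak_if_honest)

if __name__ == "__main__":
    print(f"P(streak | honest)   = {p_streak_if_honest:.4f}")
    print(f"P(cheating | streak) = {posterior:.4f}")
```

With these made-up inputs the streak is unlikely for an honest player (1%), yet the probability of cheating given the streak is also only about 1%, because the prior probability of cheating is so small. Reading the small P(streak | honest) as if it were P(honest | streak) is exactly the error the abstract describes.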