nypost.com - Alex Mitchell
The recent rollout of legalized sports betting across 36 states has fueled a surge in the gambling industry, but experts say it is coming at the cost of young men's mental health. In particular, easy access to online betting through sportsbooks, which often lure new customers with bet credits and first-bet loss forgiveness, has taken hold of the Gen Z crowd.
arxiv.org - Yonathan Sarmiento, Debraj Das, Édgar Roldán
Abstract: Using martingale theory, we compute, in very few lines, exact analytical expressions for various first-exit-time statistics associated with one-dimensional biased diffusion. Examples include the distribution for the first-exit time from an interval, moments for the first-exit site, and functionals of the position, which involve memory and time integration. As a key example, we compute analytically the mean area swept by a biased diffusion until it escapes an interval that may be asymmetric and have arbitrary length. The mean area allows us to derive the hitherto unexplored cross-correlation function between the first-exit time and the first-exit site, which vanishes only for exit problems from symmetric intervals. As a colophon, we explore connections of our results with gambling, showing that, by betting on the time-integrated value of a losing game, it is possible to design a strategy that leads to a net average win.
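The classical exit statistics for a biased diffusion dX = μ dt + σ dW follow from exactly the optional-stopping arguments the authors use, which makes them easy to sanity-check numerically. A minimal Monte Carlo sketch (all parameter values are illustrative assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not taken from the paper)
mu, sigma = 0.5, 1.0        # drift and noise strength
a, b, x0 = -1.0, 2.0, 0.0   # asymmetric interval and starting point
dt, n_paths = 1e-3, 20_000

# Classical results via optional stopping on the martingales
# exp(-2*mu*X_t/sigma^2) and X_t - mu*t:
theta = 2.0 * mu / sigma**2
p_b = (1.0 - np.exp(-theta * (x0 - a))) / (1.0 - np.exp(-theta * (b - a)))
mean_T = (b * p_b + a * (1.0 - p_b) - x0) / mu

# Euler-Maruyama simulation of the first exit from (a, b)
x = np.full(n_paths, x0)
t = np.zeros(n_paths)
exit_site = np.zeros(n_paths)
alive = np.ones(n_paths, dtype=bool)
while alive.any():
    x[alive] += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(alive.sum())
    t[alive] += dt
    hit = alive & ((x <= a) | (x >= b))
    exit_site[hit] = np.where(x[hit] >= b, b, a)
    alive &= ~hit

print(f"P(exit at b): MC {np.mean(exit_site == b):.3f} vs exact {p_b:.3f}")
print(f"E[first-exit time]: MC {t.mean():.3f} vs exact {mean_T:.3f}")
```

Both printed pairs should agree to within Monte Carlo and discretization error.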
arxiv.org - David J. Aldous, F. Thomas Bruss
Abstract: We give elementary examples within a framework for studying decisions under uncertainty where probabilities are only roughly known. The framework, in gambling terms, is that the size of a bet is proportional to the gambler's perceived advantage based on their perceived probability, and their accuracy in estimating true probabilities is measured by mean squared error. Within this framework one can study the cost of estimation errors, and seek to formalize the "obvious" notion that in competitive interactions between agents whose actions depend on their perceived probabilities, those who are more accurate at estimating probabilities will generally be more successful than those who are less accurate.
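To make the framework concrete, here is a toy simulation under assumed settings: two agents estimate the same true probability with different mean squared errors and bet against each other at the midpoint of their perceived probabilities, with stake proportional to perceived advantage. The midpoint-price mechanism, noise levels, and stake rule are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
n_events = 200_000
sd_a, sd_b = 0.02, 0.10          # estimation noise: agent A is more accurate

p = rng.uniform(0.05, 0.95, n_events)                 # true probabilities
pa = np.clip(p + sd_a * rng.standard_normal(n_events), 0.01, 0.99)
pb = np.clip(p + sd_b * rng.standard_normal(n_events), 0.01, 0.99)

price = (pa + pb) / 2            # they trade at the midpoint of their views
stake = np.abs(pa - pb) / 2      # stake proportional to perceived advantage
outcome = rng.random(n_events) < p
a_backs = pa > pb                # A backs the event when A thinks it likelier

# The backer of the event wins stake*(1-price)/price if it occurs and loses
# the stake otherwise; the other agent takes the opposite side (zero-sum).
win_mult = (1 - price) / price
a_profit = np.where(a_backs,
                    np.where(outcome, stake * win_mult, -stake),
                    np.where(outcome, -stake * win_mult, stake))

print(f"A (low MSE) mean profit per event:  {a_profit.mean():+.5f}")
print(f"B (high MSE) mean profit per event: {-a_profit.mean():+.5f}")
```

On average the lower-MSE agent wins, illustrating the notion the authors set out to formalize.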
theathletic.com - Gregg Evans
When Kevin De Bruyne used a deep dive into his own data to highlight his worth to Manchester City during talks over a new deal in 2021, the contract renewal process in football changed forever.
kuleuven.be - Jesse Davis, Lorenzo Cascioli and Maaike Van Roy
From an analysis perspective, there are two key types of data for soccer:

- Event data, which records information about on-the-ball actions. It provides semantic information about what is happening in the match. However, it ignores what happens off the ball (e.g., positions of other players).
- Optical tracking data, which records the locations of all players and the ball multiple times per second. This provides important context about off-the-ball positioning. However, it lacks the semantic information about which actions are being performed on the pitch.

Consequently, the richest analyses require integrating information from both types of data sources.
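As a minimal illustration of that integration (column names, frame rate, and coordinates below are hypothetical; real providers differ), a time-based join can attach the nearest tracking frame to each on-ball event:

```python
import pandas as pd

# Hypothetical schemas: real event and tracking vendors use different
# column names and frame rates.
events = pd.DataFrame({
    "t": [12.4, 33.0, 47.8],                # seconds into the match
    "action": ["pass", "carry", "shot"],
    "player": [10, 10, 9],
})
tracking = pd.DataFrame({
    "t": [i / 25 for i in range(25 * 60)],  # 25 Hz positions for one minute
})
tracking["x_player_9"] = 50.0               # stand-in coordinates
tracking["x_player_10"] = 30.0

# Attach the nearest tracking frame to each event, giving every on-ball
# action the off-ball context that event data alone lacks.
enriched = pd.merge_asof(events.sort_values("t"), tracking.sort_values("t"),
                         on="t", direction="nearest")
print(enriched)
```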
arxiv.org - Mehmet S. Ismail
Abstract: In this note, I introduce Estimated Performance Rating (PR^e), a novel system for evaluating player performance in sports and games. PR^e addresses a key limitation of the Tournament Performance Rating (TPR) system, which is undefined for zero or perfect scores in a series of games. PR^e is defined as the rating that solves an optimization problem related to scoring probability, making it applicable for any performance level. The main theorem establishes that the PR^e of a player is equivalent to the TPR whenever the latter is defined. I then apply this system to historically significant win streaks in association football, tennis, and chess. Beyond sports, PR^e has broad applicability in domains where Elo ratings are used, from college rankings to the evaluation of large language models.
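For context on the limitation PR^e fixes: TPR is the rating whose total Elo expected score over the series equals the achieved score, and no finite rating attains a perfect score. The sketch below implements only this standard TPR definition by bisection; it is not the paper's PR^e construction.

```python
from math import inf

def elo_expected(p: float, opp: float) -> float:
    """Standard Elo expected score for a player rated p against opp."""
    return 1.0 / (1.0 + 10 ** ((opp - p) / 400))

def tpr(opponents: list[float], score: float) -> float:
    """Tournament Performance Rating: the rating whose total expected
    score equals the achieved score; no finite value exists for zero
    or perfect scores."""
    if score >= len(opponents):
        return inf
    if score <= 0:
        return -inf
    lo, hi = -4000.0, 8000.0
    for _ in range(100):                       # bisection
        mid = (lo + hi) / 2
        if sum(elo_expected(mid, r) for r in opponents) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(tpr([2700, 2750, 2720], 2.0))   # finite TPR
print(tpr([2700, 2750, 2720], 3.0))   # perfect score -> inf (undefined)
```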
arxiv.org - Zhe Wang, Petar Veličković, Daniel Hennes, Nenad Tomašev, Laurel Prince, Michael Kaisers, Yoram Bachrach...
Abstract: Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing corner kicks, as they offer coaches the most direct opportunities for interventions and improvements. TacticAI incorporates both a predictive and a generative component, allowing the coaches to effectively sample and explore alternative player setups for each corner kick routine and to select those with the highest predicted likelihood of success. We validate TacticAI on a number of relevant benchmark tasks: predicting receivers and shot attempts and recommending player position adjustments. The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC. We show that TacticAI's model suggestions are not only indistinguishable from real tactics, but also favoured over existing tactics 90% of the time, and that TacticAI offers an effective corner kick retrieval system. TacticAI achieves these results despite the limited availability of gold-standard data, achieving data efficiency through geometric deep learning.
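The abstract implies each corner-kick routine is encoded as a graph over players for geometric deep learning. A minimal sketch of one plausible encoding (every feature choice and shape below is an assumption for illustration, not TacticAI's specification):

```python
import numpy as np

# Hypothetical encoding of one corner-kick frame as a graph: each player is
# a node carrying position, velocity, and a team flag; edges connect all
# pairs so a GNN can reason about any player interaction.
n_players = 22
rng = np.random.default_rng(42)
pos = rng.uniform([0, 0], [105, 68], size=(n_players, 2))  # pitch coords (m)
vel = rng.normal(0, 1.5, size=(n_players, 2))              # velocities (m/s)
team = np.repeat([0.0, 1.0], 11)[:, None]                  # attack / defence

node_features = np.hstack([pos, vel, team])                # shape (22, 5)
adjacency = np.ones((n_players, n_players)) - np.eye(n_players)
print(node_features.shape, adjacency.shape)
```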
arxiv.org - Andrei Shelopugin, Alexander Sirotkin
Abstract: One of the key problems in the field of soccer analytics is predicting how a player's performance changes when transitioning from one league to another. One potential solution to address this issue lies in the evaluation of the respective league strengths. This article endeavors to compute club ratings for the first and second European and South American leagues. In order to calculate these ratings, the authors have designed a Glicko-2 rating system-based approach which overcomes some of Glicko-2's limitations. In particular, the authors took into consideration the probability of a draw, the home-field advantage, and the tendency of teams to become stronger or weaker following their league transitions. Furthermore, the authors have constructed a predictive model for forecasting match results based on the number of goals scored in previous matches. The metrics utilized in the analysis reveal that the Glicko-2 based approach exhibits a marginally superior level of accuracy when compared to the commonly used Poisson regression-based approach. In addition, Glicko-2 based ratings offer greater interpretability and can find application in various analytics tasks, such as predicting soccer player metrics for forthcoming seasons or the detailed analysis of a player's performance in preceding matches. The implementation of the approach is available on this http URL.
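The authors' Glicko-2 extension is more involved than this, but a standard way to fold a draw probability and home advantage into a rating-based forecast is the Davidson (1970) model; the sketch below uses illustrative, unfitted parameter values:

```python
def match_probs(r_home: float, r_away: float,
                home_adv: float = 60.0, nu: float = 0.7):
    """Win/draw/loss forecast from Elo-style ratings using the Davidson
    (1970) draw model. home_adv and nu are illustrative values here,
    not fitted parameters."""
    ph = 10 ** ((r_home + home_adv) / 400)   # home strength, boosted
    pa = 10 ** (r_away / 400)                # away strength
    d = nu * (ph * pa) ** 0.5                # draw mass grows when evenly matched
    z = ph + pa + d
    return ph / z, d / z, pa / z             # P(home win), P(draw), P(away win)

print(match_probs(1600, 1550))
```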
arxiv.org - Luiz Fernando G. N. Maia, Teemu Pennanen, Moacyr A. H. B. da Silva, Rodrigo S. Targino
Abstract: This paper develops a general framework for stochastic modeling of goals and other events in football (soccer) matches. The events are modelled as Cox processes (doubly stochastic Poisson processes) where the event intensities may depend on all the modeled events as well as external factors. The model has a strictly concave log-likelihood function which facilitates its fitting to observed data. Besides event times, the model describes the random lengths of stoppage times, which can have a strong influence on the final score of a match. The model is illustrated on eight years of data from Campeonato Brasileiro de Futebol Série A. We find that dynamic regressors significantly improve the in-game predictive power of the model. In particular, a) when a team receives a red card, its goal intensity decreases more than 30%; b) the goal rate of a team increases by 10% if it is losing by one goal and by 20% if it is losing by two goals; and c) when the goal difference at the end of the second half is less than or equal to one, the stoppage time is on average more than one minute longer than in matches with a difference of two goals.
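The reported effects are easy to explore in a toy simulation. The sketch below approximates the goal point process in one-minute Bernoulli steps; only the 30%, 10%, and 20% adjustments come from the paper, while the baseline intensities and the sending-off rate are assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_match(base=(0.016, 0.013), minutes=90):
    """Toy per-minute Bernoulli approximation of a goal point process with
    the paper's dynamic effects. Base intensities (goals per minute) and
    the sending-off rate are assumptions, not fitted values."""
    goals, sent_off = [0, 0], [False, False]
    for _ in range(minutes):
        for i in (0, 1):
            lam = base[i]
            diff = goals[i] - goals[1 - i]
            if diff == -1:
                lam *= 1.10          # losing by one: goal rate +10%
            elif diff <= -2:
                lam *= 1.20          # losing by two or more: +20%
            if sent_off[i]:
                lam *= 0.70          # red card: intensity drops >30%
            if rng.random() < 0.0005:
                sent_off[i] = True   # rare sending-off
            if rng.random() < lam:
                goals[i] += 1
    return goals

print([simulate_match() for _ in range(5)])
```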
arxiv.org - Ryan S. Brill, Ronald Yurko, Abraham J. Wyner
Abstract: The standard mathematical approach to fourth-down decision making in American football is to make the decision that maximizes estimated win probability. Win probability estimates arise from machine learning models fit from historical data. These models, however, are overfit, high-variance estimators, a problem exacerbated by the highly correlated nature of football play-by-play data. We develop a machine learning framework that accounts for this auto-correlation and knits uncertainty quantification into our decision making. In particular, we recommend a fourth-down decision only when we are confident it has higher win probability than all other decisions. Our final product is a major advance in fourth-down strategic decision making: far fewer fourth-down decisions are as obvious as analysts claim.
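In that spirit, a decision rule with uncertainty quantification can be sketched with bootstrap samples of each decision's win probability: recommend a decision only if it beats every alternative in a large fraction of paired bootstrap draws. The dominance threshold and the synthetic samples below are illustrative, not the paper's procedure:

```python
import numpy as np

def recommend(wp_samples: dict[str, np.ndarray], level: float = 0.9):
    """Return the decision that beats every alternative in at least
    `level` of paired bootstrap draws, or abstain."""
    for cand, s in wp_samples.items():
        if all(np.mean(s > t) >= level
               for other, t in wp_samples.items() if other != cand):
            return cand
    return "no confident recommendation"

rng = np.random.default_rng(0)
samples = {  # hypothetical bootstrap win-probability draws per decision
    "go":   rng.normal(0.52, 0.03, 2000),
    "punt": rng.normal(0.49, 0.02, 2000),
    "kick": rng.normal(0.47, 0.02, 2000),
}
print(recommend(samples))   # abstains unless one decision clearly dominates
```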
arxiv.org - Albert Cohen, Jimmy Risk
Abstract: This paper presents a new framework for player valuation in European football by fusing principles from financial mathematics and network theory. The valuation model leverages a "passing matrix" to encapsulate player interactions on the field, utilizing centrality measures to quantify individual influence. Unlike traditional approaches, this model is both metric-driven and cohort-free, providing a dynamic and individualized framework for ascertaining a player's fair market value. The methodology is empirically validated through a case study in European football, employing real-world match and financial data. The paper advances the disciplines of sports analytics and financial mathematics by offering a cross-disciplinary mechanism for player valuation, and also links together two well-known econometric methods in marginal revenue product and expected present valuation.
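A small example of the centrality ingredient (the passing counts are made up): given a matrix whose entry (i, j) counts passes from player i to player j, eigenvector centrality via power iteration scores each player's influence in the passing network.

```python
import numpy as np

# Toy 5-player passing matrix: entry [i, j] counts passes from i to j.
P = np.array([[0, 12,  3,  0,  1],
              [ 8,  0, 15,  4,  2],
              [ 2, 10,  0,  9,  5],
              [ 0,  3,  7,  0, 11],
              [ 1,  2,  4,  6,  0]], dtype=float)

v = np.ones(len(P))
for _ in range(200):        # power iteration on P^T weights received passes
    v = P.T @ v
    v /= np.linalg.norm(v)
print(np.round(v, 3))       # each player's passing-network influence score
```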
learnopencv.com
What is YOLO? You Only Look Once (YOLO): Unified, Real-Time Object Detection is a single-stage object detection model published at CVPR 2016 by Joseph Redmon, famous for its very low latency and high accuracy. The entire YOLO series of models is a collection of pioneering concepts that have shaped today's object detection methods. YOLO models have emerged as an industry de facto standard, achieving high detection precision with minimal computational demands. Some YOLO models are tailored to the specific processing capabilities of the device, whether a CPU or a GPU. Most YOLO models come in different scales, such as small, medium, and large, and can be easily serialized to ONNX, TensorRT, OpenVINO, etc. This gives users the liberty to choose whichever variant is best suited to their application.
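A minimal inference sketch using the Ultralytics package, which ships the YOLOv8 family; the weights name is a published checkpoint, while the image path is a placeholder:

```python
# pip install ultralytics
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # nano variant: lowest latency
results = model("match_frame.jpg")    # placeholder image path
for box in results[0].boxes:          # detected objects in the frame
    print(int(box.cls), float(box.conf), box.xyxy.tolist())

model.export(format="onnx")           # serialize for ONNX runtimes
```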
arxiv.org - Matthew J. Penn, Christl A. Donnelly, Samir Bhatt
Abstract: Player tracking data remains out of reach for many professional football teams as their video feeds are not of sufficiently high quality for computer vision technologies to be used. To help bridge this gap, we present a method that can estimate continuous full-pitch tracking data from discrete data derived from broadcast footage. Such data could be collected by clubs or players at a similar cost to event data, which is widely available down to semi-professional level. We test our method using open-source tracking data, and include a version that can be applied to a large set of over 200 games with such discrete data.
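The paper's estimation method is more sophisticated than this, but the simplest baseline it improves on is direct interpolation of a player's sparse broadcast-derived positions into a continuous track; the times and coordinates below are made up:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Sparse observations of one player: times in seconds, pitch coords in
# metres (made-up values standing in for broadcast-derived positions).
t_obs = np.array([0.0, 2.1, 5.4, 8.0, 11.3])
xy_obs = np.array([[50.0, 34.0], [52.3, 36.1], [57.9, 40.0],
                   [60.2, 38.5], [63.0, 35.2]])

spline = CubicSpline(t_obs, xy_obs)   # vector-valued spline over (x, y)
t_dense = np.arange(0.0, 11.3, 0.04)  # 25 Hz, like commercial tracking feeds
track = spline(t_dense)               # shape (n_frames, 2)
print(track[:3])
```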
arxiv.org - Konstantinos Moutselos, Ilias Maglogiannis
Abstract: Player and ball detection is among the first required steps on a football analytics platform. Until recently, the existing open datasets on which the evaluations of most models were based were not sufficient. In this work, we point out their weaknesses, and, with the advent of SoccerNet v3, we propose and deliver to the community an edited part of its dataset, in YOLO normalized annotation format, for training and evaluation. The code of the methods and metrics is provided so that they can be used as a benchmark in future comparisons. The recent YOLOv8n model proves better than FootAndBall in long-shot real-time detection of the ball and players on football fields.
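For reference, the YOLO normalized annotation format the dataset is released in stores one line per object: a class id followed by the box center, width, and height, all as fractions of the image size. A small conversion sketch (the class mapping is an assumption):

```python
def to_yolo(box_px, img_w, img_h, cls_id):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) into a
    YOLO-format line: 'class x_center y_center width height' in [0, 1]."""
    x0, y0, x1, y1 = box_px
    xc, yc = (x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h
    w, h = (x1 - x0) / img_w, (y1 - y0) / img_h
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A player box in a 1920x1080 frame (assumed classes: 0 = player, 1 = ball)
print(to_yolo((860, 400, 920, 560), 1920, 1080, 0))
```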
arxiv.org - Changjian Chen, Jiashu Chen, Weikai Yang, Haoze Wang, Johannes Knittel, Xibin Zhao, Steffen Koch, Thomas Ertl, Shixia Liu
Abstract: Temporal action localization aims to identify the boundaries and categories of actions in videos, such as scoring a goal in a football match. Single-frame supervision has emerged as a labor-efficient way to train action localizers as it requires only one annotated frame per action. However, it often suffers from poor performance due to the lack of precise boundary annotations. To address this issue, we propose a visual analysis method that aligns similar actions and then propagates a few user-provided annotations (e.g., boundaries, category labels) to similar actions via the generated alignments. Our method models the alignment between actions as a heaviest path problem and the annotation propagation as a quadratic optimization problem. As the automatically generated alignments may not accurately match the associated actions and could produce inaccurate localization results, we develop a storyline visualization to explain the localization results of actions and their alignments. This visualization facilitates users in correcting wrong localization results and misalignments. The corrections are then used to improve the localization results of other actions. The effectiveness of our method in improving localization performance is demonstrated through quantitative evaluation and a case study.
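The alignment step is cast as a heaviest path problem; on a DAG this is solvable with a simple dynamic program. The sketch below shows only that generic computation, with toy nodes and weights rather than the paper's action-alignment graph:

```python
def heaviest_path(n, edges):
    """Heaviest path in a DAG whose nodes 0..n-1 are in topological order
    and whose edges (u, v, w) satisfy u < v. Returns (total weight, path)."""
    best = [0.0] * n          # best[v] = weight of heaviest path ending at v
    prev = [-1] * n
    for u, v, w in sorted(edges):      # increasing u = topological order
        if best[u] + w > best[v]:
            best[v], prev[v] = best[u] + w, u
    end = max(range(n), key=best.__getitem__)
    total, path = best[end], []
    while end != -1:
        path.append(end)
        end = prev[end]
    return total, path[::-1]

# Toy alignment graph: nodes are candidate action matches, edge weights
# are similarity scores (made-up values).
edges = [(0, 1, 0.9), (0, 2, 0.7), (1, 2, 0.4), (1, 3, 0.3), (2, 3, 0.8)]
print(heaviest_path(4, edges))         # -> (2.1, [0, 1, 2, 3])
```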
github.io - Omar Sanseviero
Understand how transformers work by demystifying all the math behind them
projecteuclid.org - Leo Breiman
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.