theguardian.com - Elias Visontay and Luca Ittimani
NSW police allege Ulises Dávila, Kearyn Baccus and Clayton Lewis were involved in a scheme “for yellow cards to occur during certain games” in late 2023
nytimes.com - David Ornstein
Premier League clubs are set to hold a vote at their annual general meeting next month on a proposal to abolish the video assistant referee (VAR) system from the start of next season. VAR has been used in the English top-flight since 2019, helping improve decision-making but also generating persistent controversy.
medium.com - ML Soccer Betting
The Generalised Attacking Performance (GAP) rating system, introduced by Wheatcroft (2020), is a dynamic ratings system that aims to assess the attacking and defensive strengths of football teams. This article aims to explain and offer a Pythonic implementation of the GAP ratings, and assess their effectiveness in predicting outcomes in the Over/Under 2.5 goals betting market.
18skaters.com
This document describes an expected goals model for the PWHL. Important features: the model is a Multivariate Adaptive Regression Splines model (a “MARS model”); and the variables used to predict goals are shot distance, shot angle, and shot rebound status. That’s not many variables, obviously. There isn’t much PWHL data available (72 regular season games) so I used only the most important variables for an xG model.
statsbomb.com - StatsBomb
The 2024 edition of the StatsBomb Conference will be held at Old Trafford Stadium in Manchester on Friday 11th October 2024. The event is known for hosting expert names from the football analytics industry, with insights and experiences shared from some of the most forward-thinking sports organisations from around the world every year. StatsBomb has always had a close connection to the wider analytics community and it’s important to us that this group is represented at this conference. For that reason, we’re once again inviting you to send us your research proposals to present at the event. The best submissions will be offered the chance to write a paper using exclusive StatsBomb data and present it to an audience of industry professionals.
statsbomb.com - Abi Williams
“Pressure” occurs when one or more defensive players start to track down the QB, causing the QB to get rid of the ball or attempt to scramble or move in the pocket to avoid getting sacked. Previously at StatsBomb, our pressure counts have been derived from a range of variables regarding defender proximity to the QB and their involvement in engagements with blockers. While we believe this initial approach gave a good overall metric for pressures, we can generate additional insights using a model-based approach. Here we’ll briefly outline that in-progress model and provide a sneak peek at some of the results.
degruyter.com - Nicholas G. Hall and Zhixin Liu
We propose an alternative design for tournaments that use a preliminary stage, followed by several rounds of single elimination play. The conventional “bracket” design of these tournaments suffers from several deficiencies. Specifically, various reasonable performance criteria for the tournament are not satisfied, there is an unnecessary element of luck in the matchups of players, and there are situations where players have an incentive to shirk. To address all these issues, we allow higher ranked players at the single elimination stage to choose their next opponent sequentially at each round. We allow each player’s ranking either to remain static, or to improve by beating a higher ranked player (Guyon, J. 2022. “Choose your opponent”: a new knockout design for hybrid tournaments. J. Sports Anal. 8: 9–29). Using data from 2215 men’s professional tennis tournaments from 1991 to 2017, we demonstrate the reasonableness of the results obtained. We also perform sensitivity analysis for the effect of increasing irregularity in the pairwise win probability matrix on three traditional performance measures. Finally, we consider strategic shirking behavior at both the individual and group levels, and show how our opponent choice design can mitigate such behavior. Overall, the opponent choice design provides higher probabilities that the best player wins and also that the two best players meet, and reduces shirking, compared to the conventional bracket design.
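The sequential opponent-choice step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `pair_by_choice`, the win-probability table `p`, and the rule that each chooser picks the available opponent they are most likely to beat are all assumptions made for the example.

```python
def pair_by_choice(ranked, p):
    """Form one round of matchups by sequential opponent choice.

    ranked: player ids ordered from highest to lowest rank.
    p: p[a][b] is the (assumed) probability that a beats b.
    Hypothetical choice rule: the highest-ranked unpaired player picks
    the unpaired opponent they are most likely to beat.
    """
    remaining = list(ranked)
    matches = []
    while len(remaining) >= 2:
        chooser = remaining.pop(0)  # highest-ranked player still unpaired
        # max() keeps the first maximal element, so ties go to the
        # higher-ranked candidate opponent.
        opponent = max(remaining, key=lambda b: p[chooser][b])
        remaining.remove(opponent)
        matches.append((chooser, opponent))
    return matches


# Made-up win probabilities for four players (0 is the top rank).
p = {
    0: {1: 0.60, 2: 0.75, 3: 0.70},
    1: {0: 0.40, 2: 0.65, 3: 0.50},
    2: {0: 0.25, 1: 0.35, 3: 0.60},
    3: {0: 0.30, 1: 0.50, 2: 0.40},
}
matches = pair_by_choice([0, 1, 2, 3], p)
print(matches)
```

With these illustrative numbers the top-ranked player chooses player 2 rather than the lowest-ranked player 3, so the pairing differs from a conventional 1-v-4, 2-v-3 bracket.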
arxiv.org - Daniel Frees, Pranav Ravella, Charlie Zhang
Abstract: This paper presents a groundbreaking model for forecasting English Premier League (EPL) player performance using convolutional neural networks (CNNs). We evaluate Ridge regression, LightGBM and CNNs on the task of predicting upcoming player FPL score based on historical FPL data over the previous weeks. Our baseline models, Ridge regression and LightGBM, achieve solid performance and emphasize the importance of recent FPL points, influence, creativity, threat, and playtime in predicting EPL player performances. Our optimal CNN architecture achieves better performance with fewer input features and even outperforms the best previous EPL player performance forecasting models in the literature. The optimal CNN architecture also achieves very strong Spearman correlation with player rankings, indicating its strong implications for supporting the development of FPL artificial intelligence (AI) Agents and providing analysis for FPL managers. We additionally perform transfer learning experiments on soccer news data collected from The Guardian, for the same task of predicting upcoming player score, but do not identify a strong predictive signal in natural language news texts, achieving worse performance compared to both the CNN and baseline models. Overall, our CNN-based approach marks a significant advancement in EPL player performance forecasting and lays the foundation for transfer learning to other EPL prediction tasks such as win-loss odds for sports betting and the development of cutting-edge FPL AI Agents.
arxiv.org - Wankang Zhai, Yuhan Wang
Abstract: Understanding the dynamics of momentum and game fluctuation in tennis matches is crucial for predicting match outcomes and enhancing player performance. In this study, we present a comprehensive analysis of these factors using a dataset from the 2023 Wimbledon final. Initially, we develop a sliding-window-based scoring model to assess player performance, accounting for the influence of serving dominance through a serve decay factor. Additionally, we introduce a novel approach, Lasso-Ridge-based XGBoost, to quantify momentum effects, leveraging the predictive power of XGBoost while mitigating overfitting through regularization. Through experimentation, we achieve an accuracy of 94% in predicting match outcomes, identifying key factors influencing winning rates. Subsequently, we propose a Derivative of the winning rate algorithm to quantify game fluctuation, employing an LSTM_Deep model to predict fluctuation scores. Our model effectively captures temporal correlations in momentum features, yielding mean squared errors ranging from 0.036 to 0.064. Furthermore, we explore meta-learning using MAML to transfer our model to predict outcomes in ping-pong matches, though results indicate a comparative performance decline. Our findings provide valuable insights into momentum dynamics and game fluctuation, offering implications for sports analytics and player training strategies.
arxiv.org - Sushant Gautam, Mehdi Houshmand Sarkhoosh, Jan Held, Cise Midoglu, Anthony Cioppa, Silvio Giancola...
Abstract: The application of Automatic Speech Recognition (ASR) technology in soccer offers numerous opportunities for sports analytics. Specifically, extracting audio commentaries with ASR provides valuable insights into the events of the game, and opens the door to several downstream applications such as automatic highlight generation. This paper presents SoccerNet-Echoes, an augmentation of the SoccerNet dataset with automatically generated transcriptions of audio commentaries from soccer game broadcasts, enhancing video content with rich layers of textual information derived from the game audio using ASR. These textual commentaries, generated using the Whisper model and translated with Google Translate, extend the usefulness of the SoccerNet dataset in diverse applications such as enhanced action spotting, automatic caption generation, and game summarization. By incorporating textual data alongside visual and auditory content, SoccerNet-Echoes aims to serve as a comprehensive resource for the development of algorithms specialized in capturing the dynamics of soccer games. We detail the methods involved in the curation of this dataset and the integration of ASR. We also highlight the implications of a multimodal approach in sports analytics, and how the enriched dataset can support diverse applications, thus broadening the scope of research and development in the field of sports analytics.
apple.com - Seth Partnow
This week on The StatsBomb Football Podcast, host Seth Partnow is joined by CEO Ted Knutson and Head of Football Analysis Matt Edwards. They round off our first season by discussing the best resources and practices for aspiring analysts to break into the industry.
arxiv.org - Drew Prinster, Samuel Stanton, Anqi Liu, Suchi Saria
Abstract: As machine learning (ML) gains widespread adoption, practitioners are increasingly seeking means to quantify and control the risk these systems incur. This challenge is especially salient when ML systems have autonomy to collect their own data, such as in black-box optimization and active learning, where their actions induce sequential feedback-loop shifts in the data distribution. Conformal prediction has emerged as a promising approach to uncertainty and risk quantification, but existing variants either fail to accommodate sequences of data-dependent shifts, or do not fully exploit the fact that agent-induced shift is under our control. In this work we prove that conformal prediction can theoretically be extended to any joint data distribution, not just exchangeable or quasi-exchangeable ones, although it is exceedingly impractical to compute in the most general case. For practical applications, we outline a procedure for deriving specific conformal algorithms for any data distribution, and we use this procedure to derive tractable algorithms for a series of agent-induced covariate shifts. We evaluate the proposed algorithms empirically on synthetic black-box optimization and active learning tasks.
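For context, the standard split conformal procedure that this line of work builds on can be sketched as below. This is the textbook version for exchangeable data, not any of the algorithms derived in the paper; the function name `split_conformal_interval` and the toy calibration numbers are assumptions made for illustration.

```python
import math


def split_conformal_interval(cal_preds, cal_targets, new_pred, alpha=0.1):
    """Prediction interval with roughly (1 - alpha) coverage.

    Assumes calibration and test points are exchangeable, which is the
    assumption the paper sets out to relax.
    """
    # Nonconformity scores: absolute residuals on held-out calibration data.
    scores = sorted(abs(y - f) for f, y in zip(cal_preds, cal_targets))
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (new_pred - q, new_pred + q)


# Made-up calibration predictions and observed targets.
cal_preds = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
cal_targets = [1.5, 1.0, 3.25, 4.5, 3.0, 6.25, 7.75, 8.0, 9.5]
interval = split_conformal_interval(cal_preds, cal_targets, 10.0, alpha=0.2)
print(interval)
```

The interval width comes only from the calibration residuals, so the guarantee depends on the new point resembling the calibration data; feedback-loop settings such as active learning are where that assumption tends to fail.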