racingpost.com - Chris Cook
BHA chair Joe Saumarez Smith has spoken out about his recent experiences of gambling checks imposed by bookmakers, having had his Betfair account suspended for nine days and others heavily restricted. "It has been extremely frustrating," said the man who has chaired British racing's ruling body since 2022, adding that he hoped to prove to influential figures that the obstacles facing punters are real and operating as a major deterrent.
arxiv.org - Eugenio Clerico
Abstract:Confidence sequences are sequences of confidence sets that adapt to incoming data while maintaining validity. Recent advances have introduced an algorithmic formulation for constructing some of the tightest confidence sequences for bounded real random variables. These approaches use a coin-betting framework, where a player sequentially bets on differences between potential mean values and observed data. This letter establishes that such a coin-betting formulation is optimal among all possible algorithmic frameworks for constructing confidence sequences that build on e-variables and sequential hypothesis testing.
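To make the coin-betting idea concrete, here is a minimal sketch of a betting-style confidence sequence for data in [0, 1]. It is not the construction analysed in the letter: the fixed bet size, the grid of candidate means and the function name are all assumptions made for illustration, and the tightest published methods choose their bets adaptively. For each candidate mean m, a bettor's wealth grows when the data disagree with m; a candidate is dropped for good once its wealth reaches 1/alpha.

```python
def betting_confidence_set(xs, alpha=0.05, bet=0.5, grid_size=101):
    """Return the candidate means in [0, 1] not yet rejected after seeing xs.

    Assumptions (illustrative, not from the letter): observations lie in
    [0, 1], the bet size is fixed with 0 < bet <= 1, and candidate means
    are taken from an evenly spaced grid.
    """
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    up = [1.0] * grid_size      # wealth from betting that the mean is above m
    down = [1.0] * grid_size    # wealth from betting that the mean is below m
    alive = [True] * grid_size  # False once a candidate has been rejected

    for x in xs:
        for k, m in enumerate(grid):
            up[k] *= 1.0 + bet * (x - m)
            down[k] *= 1.0 - bet * (x - m)
            # The average of the two wealths is a non-negative martingale
            # when m is the true mean, so by Ville's inequality it reaches
            # 1 / alpha with probability at most alpha.
            if 0.5 * (up[k] + down[k]) >= 1.0 / alpha:
                alive[k] = False

    return [m for m, ok in zip(grid, alive) if ok]


if __name__ == "__main__":
    import random

    rng = random.Random(1)
    data = [rng.random() for _ in range(500)]  # uniform on [0, 1), mean 0.5
    kept = betting_confidence_set(data)
    print(min(kept), max(kept))
```

Because a rejected candidate never returns, the set can only shrink as more data arrive, which is what lets it be read off at any time.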
youtube.com
In the latest SBC Podcast I am joined by Dylan, the co-founder and co-owner of Pinnacle Odds Dropper (or POD for short!). POD is the feature review and focus of SBC's latest magazine (Issue 143) and with a detailed analysis (and SBC member discount) now available, this podcast explores everything about Dylan's operation and expertise. 'Chasing steam', 'top down betting', 'market movers' and 'warm money' - all of these terms broadly mean the same thing - following in smart money to make a profit. With sharp bookmakers (like Pinnacle) focusing their resources on keeping their prices up to date and 'soft bookmakers' often lagging behind, services like POD help followers to make profit when the former cut prices and the latter are slow to follow them. Automation drives this kind of service and provides 'everyday bettors' with the opportunity to beat soft bookmakers with a steady stream of sharp information. In this podcast, Dylan and I discuss the concept, the software he and his team have built to help execute it and much more besides. The tipping landscape is constantly changing and this type of service is relatively new - I learnt a lot listening to Dylan and I'm sure you will too!
americansocceranalysis.com - Paul Harvey
In a recent-ish episode of The Transfer Flow podcast, host Ravi Ramineni made an offhand comment that while he was working for the Seattle Sounders, the team had noticed that they could switch out up to three of their regular starters for bench players in a given game without causing too much of a problem. Any more than that and they began to run into trouble as a team.
arxiv.org - Steven J. Brams, Mehmet S. Ismail, D. Marc Kilgour
Abstract:Tennis, like other games and sports, is governed by rules, including the rules that determine the winner of points, games, sets, and matches. If the two players are equally skilled -- each has the same probability of winning a point when serving or when receiving -- we show that each has an equal chance of winning games, sets, and matches, whether or not sets go to a tiebreak. However, in a women's match that is decided by 2 out of 3 sets, and a men's match that is decided by 3 out of 5 sets, it is possible that the player who wins the most games may not be the player who wins the match. We calculate the probability that this happens and show that it has actually occurred -- most notably, in the 2019 men's Wimbledon final between Novak Djokovic and Roger Federer, which took almost five hours to complete and is considered one of the greatest tennis matches ever (Djokovic won). We argue that the discrepancy between the game winner and the match winner, when it occurs, should be resolved by a Grand Tiebreak (GT) -- played according to the rules of tiebreaks in sets -- because each player has a valid claim to being called the rightful winner. A GT would have the salutary effect of -- even every point -- lest he/she win in sets but lose more games. This would make competition keener throughout a match and probably decrease the need for a GT, because the game and set winner would more likely coincide when the players fight hard for every point.
arxiv.org - Bhaskar Lalwani, Aniruddha Mukherjee
Abstract:Kabaddi, a contact team sport of Indian origin, has seen a dramatic rise in global popularity, highlighted by the upcoming Kabaddi World Cup in 2025 with over sixteen international teams participating, alongside flourishing national leagues such as the Indian Pro Kabaddi League (230 million viewers) and the British Kabaddi League. We present the first open-source Python module to make Kabaddi statistical data easily accessible from multiple scattered sources across the internet. The module was developed by systematically web-scraping and collecting team-wise, player-wise and match-by-match data. The data has been cleaned, organized, and categorized into team overviews and player metrics, each filterable by season. The players are classified as raiders and defenders, with their best strategies for attacking, counter-attacking, and defending against different teams highlighted. Our module enables continuous monitoring of exponentially growing data streams, aiding researchers to quickly start building upon the data to answer critical questions, such as the impact of player inclusion/exclusion on team performance, scoring patterns against specific teams, and break down opponent gameplay. The data generated from Kabaddi tournaments has been sparsely used, and coaches and players rely heavily on intuition to make decisions and craft strategies. Our module can be utilized to build predictive models, craft uniquely strategic gameplays to target opponents and identify hidden correlations in the data. This open source module has the potential to increase time-efficiency, encourage analytical studies of Kabaddi gameplay and player dynamics and foster reproducible research. The data and code are publicly available: this https URL
arxiv.org - Virgilio Gómez-Rubio, Jesús Lagos, Francisco Palmí-Perales
Abstract:Finding players with similar profiles is an important problem in sports such as football. Scouting for new players requires a wealth of information about the available players so that similar profiles to that of a target player can be identified. However, information about the position of the players in the field is seldom used. For this reason, a novel approach based on spatial data analysis is introduced to produce a spatial similarity index that can help to identify similar players. The use of this new spatial similarity index is illustrated to identify similar players using spatial data from the Spanish competition "La Liga", season 2019-2020.
arxiv.org - Joris Bekkers
Abstract:Pressing is a fundamental defensive strategy in football, characterized by applying pressure on the ball owning team to regain possession. Despite its significance, existing metrics for measuring pressing often lack precision or comprehensive consideration of positional data, player movement and speed. This research introduces an innovative framework for quantifying pressing intensity, leveraging advancements in positional tracking data and components from Spearman's Pitch Control model. Our method integrates player velocities, movement directions, and reaction times to compute the time required for a defender to intercept an attacker or the ball. This time-to-intercept measure is then transformed into probabilistic values using a logistic function, enabling dynamic and intuitive analysis of pressing situations at the individual frame level. The model captures how every player's movement influences pressure on the field, offering actionable insights for coaches, analysts, and decision-makers. By providing a robust and interpretable metric, our approach facilitates the identification of pressing strategies, advanced situational analyses, and the derivation of metrics, advancing the analytical capabilities for modern football.
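The step from a time-to-intercept to a probability can be sketched as follows. This is a simplified illustration, not the paper's model: the straight-line travel time, the 0.7 s reaction time, the 1.5 s threshold and the spread parameter are all hypothetical values chosen only to show the shape of the logistic mapping.

```python
import math


def time_to_intercept(distance_m, closing_speed_ms, reaction_time_s=0.7):
    """Reaction time plus straight-line travel time (a deliberate simplification).

    All parameter values are hypothetical; a full model would also account
    for the defender's current velocity and direction of movement.
    """
    if closing_speed_ms <= 0:
        return math.inf  # a defender who is not closing never arrives
    return reaction_time_s + distance_m / closing_speed_ms


def pressure_probability(tti_s, threshold_s=1.5, sigma=0.45):
    """Map a time-to-intercept onto (0, 1) with a logistic curve.

    Short times give values near 1, long times give values near 0, and the
    value is exactly 0.5 at threshold_s. Both threshold_s and sigma are
    assumed values for illustration.
    """
    k = math.pi / (math.sqrt(3.0) * sigma)  # logistic slope for spread sigma
    z = k * (tti_s - threshold_s)
    if z > 700:  # avoid overflow in math.exp for very long times
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


if __name__ == "__main__":
    tti = time_to_intercept(distance_m=4.0, closing_speed_ms=6.0)
    print(round(tti, 3), round(pressure_probability(tti), 3))
```

Evaluating this for every defender against the ball carrier in a single tracking frame would give a per-frame picture of who is applying pressure, which is the kind of frame-level analysis the abstract describes.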
arxiv.org - Tiago Mendes-Neves, Luís Meireles, João Mendes-Moreira, Nuno de Almeida
Abstract:Portugal's prominent role as a global exporter of football talent is primarily driven by youth academies. Notably, Portugal leads the global ranking in terms of net transfer balance. This study aims to uncover and understand the recruitment strategies of Portuguese clubs for sourcing young talent and evaluate the relative success of different strategies. A comprehensive dataset spanning recent decades of Portuguese youth and professional football provides granular insights, including information such as players' birthplaces and the initial grassroots clubs where they developed. The initial findings suggest a correlation between a club's prominence and the geographic reach of its youth scouting operations, with larger clubs able to cast their net wider. Analysis of the correlation between players' birthplace and high-tier football club location suggests that the performance of senior teams acts as a catalyst for investment in youth teams. Regions without professional clubs are often left underserved. That said, certain clubs have made significant gains by focusing on player recruitment outside their district, such as the Algarve region, demonstrating how geographically targeted strategies can deliver substantial returns on investment. This study underscores data's role in sharpening youth player recruitment operations at football clubs. Clubs have access to in-depth and comprehensive datasets that can be used for resource allocation, territorial coverage planning, and identifying strategic partnerships with other clubs, potentially influencing their future success both on the field and financially. This offers opportunities for growth for individual clubs and holds implications for the continued strength of Portuguese football.
arxiv.org - Francisco Pedroche
Abstract:In this paper we analyze the FIA Formula One World Championships from 2012 to 2022 taking into account the drivers classifications and the constructors teams classifications of each Grand Prix. The needed data consisted of 22 matrices of sizes ranging from 25 × 20 to 10 × 19 that have been elaborated from the GP classifications extracted from the official FIA site. We have used the Kendall corrected evolutive coefficient, recently introduced, as a measure of Competitive Balance (CB) to study the evolution of the competitiveness along the years in both drivers and teams championships. In addition, we have compared the CB of F1 championships and two major European football leagues from the seasons 2012-2013 to 2022-2023.
arxiv.org - Kuangzhi Ge, Lingjun Chen, Kevin Zhang, Yulin Luo, Tianyu Shi, Liaoyuan Fan, Xiang Li, Guanqun Wang, Shanghang Zhang
Abstract:Recently, significant advances have been made in Video Large Language Models (Video LLMs) in both academia and industry. However, methods to evaluate and benchmark the performance of different Video LLMs, especially their fine-grained, temporal visual capabilities, remain very limited. On one hand, current benchmarks use relatively simple videos (e.g., subtitled movie clips) where the model can understand the entire video by processing just a few frames. On the other hand, their datasets lack diversity in task format, comprising only QA or multi-choice QA, which overlooks the models' capacity for generating in-depth and precise texts. Sports videos, which feature intricate visual information, sequential events, and emotionally charged commentary, present a critical challenge for Video LLMs, making sports commentary an ideal benchmarking task. Inspired by these challenges, we propose a novel task, sports video commentary generation, and develop SCBench for Video LLMs. To construct such a benchmark, we introduce (1) SCORES, a six-dimensional metric specifically designed for our task, upon which we propose a GPT-based evaluation method, and (2) CommentarySet, a dataset consisting of 5,775 annotated video clips and ground-truth labels tailored to our metric. Based on SCBench, we conduct comprehensive evaluations on multiple Video LLMs (e.g. VILA, Video-LLaVA, etc.) and chain-of-thought baseline methods. Our results found that InternVL-Chat-2 achieves the best performance with 5.44, surpassing the second-best by 1.04. Our work provides a fresh perspective for future research, aiming to enhance models' overall capabilities in complex visual understanding tasks. Our dataset will be released soon.
arxiv.org - Shashikanta Sahoo
Abstract:In competitive combat sports like boxing, analyzing a boxer's performance statistics is crucial for evaluating the quantity and variety of punches delivered during bouts. These statistics provide valuable data and feedback, which are routinely used for coaching and performance enhancement. We introduce BoxMAC, a real-world boxing dataset featuring 15 professional boxers and encompassing 13 distinct action labels. Comprising over 60,000 frames, our dataset has been meticulously annotated for multiple actions per frame with inputs from a boxing coach. Since two boxers can execute different punches within a single timestamp, this problem falls under the domain of multi-label action classification. We propose a novel architecture for jointly recognizing multiple actions in both individual images and videos. We investigate baselines using deep neural network architectures to address both tasks. We believe that BoxMAC will enable researchers and practitioners to develop and evaluate more efficient models for performance analysis. With its realistic and diverse nature, BoxMAC can serve as a valuable resource for the advancement of boxing as a sport.
arxiv.org - Aleksandr Podkopaev, Darren Xu, Kuang-Chih Lee
Abstract:Conformal prediction is a valuable tool for quantifying predictive uncertainty of machine learning models. However, its applicability relies on the assumption of data exchangeability, a condition which is often not met in real-world scenarios. In this paper, we consider the problem of adaptive conformal inference without any assumptions about the data generating process. Existing approaches for adaptive conformal inference are based on optimizing the pinball loss using variants of online gradient descent. A notable shortcoming of such approaches is in their explicit dependence on and sensitivity to the choice of the learning rates. In this paper, we propose a different approach for adaptive conformal inference that leverages parameter-free online convex optimization techniques. We prove that our method controls long-term miscoverage frequency at a nominal level and demonstrate its convincing empirical performance without any need of performing cumbersome parameter tuning.
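For context, the learning-rate-dependent baseline that the abstract contrasts itself with can be sketched in a few lines. This is the plain online-gradient-descent update on the pinball loss, not the parameter-free method the paper proposes; the function name, the default learning rate and the assumption that conformity scores lie in [0, 1] are illustrative choices.

```python
import random


def online_quantile_tracker(scores, alpha=0.1, lr=0.05, q0=0.0):
    """Track a threshold q so that scores exceed it about alpha of the time.

    Each step is one online-gradient-descent update on the pinball loss at
    level 1 - alpha: q moves up by lr * (1 - alpha) after a miss (score
    above q) and down by lr * alpha after a cover. The learning rate lr is
    exactly the tuning parameter the abstract describes as a shortcoming.
    """
    q = q0
    thresholds, errors = [], []
    for s in scores:
        err = 1 if s > q else 0   # 1 means the prediction set missed
        thresholds.append(q)      # threshold in force when s was observed
        errors.append(err)
        q += lr * (err - alpha)   # negative pinball-loss gradient step
    return thresholds, errors


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical conformity scores in [0, 1) whose distribution shifts
    # halfway through the stream.
    stream = [rng.random() * (0.5 if t < 2500 else 1.0) for t in range(5000)]
    _, errs = online_quantile_tracker(stream)
    print(sum(errs) / len(errs))
```

When the scores are bounded, the threshold stays within a bounded range, so the long-run miss frequency is pinned near alpha whatever the data-generating process; how quickly it gets there depends on lr.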
arxiv.org - Sebastian Morel-Balbi, Alec Kirkley
Abstract:A common task arising in various domains is that of ranking items based on the outcomes of pairwise comparisons, from ranking players and teams in sports to ranking products or brands in marketing studies and recommendation systems. Statistical inference-based methods such as the Bradley-Terry model, which extract rankings based on an underlying generative model of the comparison outcomes, have emerged as flexible and powerful tools to tackle the task of ranking in empirical data. In situations with limited and/or noisy comparisons, it is often challenging to confidently distinguish the performance of different items based on the evidence available in the data. However, existing inference-based ranking methods overwhelmingly choose to assign each item to a unique rank or score, suggesting a meaningful distinction when there is none. Here, we address this problem by developing a principled Bayesian methodology for learning partial rankings -- rankings with ties -- that distinguishes among the ranks of different items only when there is sufficient evidence available in the data. Our framework is adaptable to any statistical ranking method in which the outcomes of pairwise observations depend on the ranks or scores of the items being compared. We develop a fast agglomerative algorithm to perform Maximum A Posteriori (MAP) inference of partial rankings under our framework and examine the performance of our method on a variety of real and synthetic network datasets, finding that it frequently gives a more parsimonious summary of the data than traditional ranking, particularly when observations are sparse.
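As a reference point for the Bradley-Terry model mentioned in the abstract, here is a minimal sketch of fitting it with the classical iterative (minorise-maximise) update. This is the standard full-ranking model, not the paper's Bayesian partial-rankings method; the function names are illustrative, and the sketch assumes every item has at least one win and one loss so that the estimate exists.

```python
def fit_bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths from a matrix of pairwise win counts.

    wins[i][j] is the number of times item i beat item j. Strengths are
    normalised to sum to 1. Assumes every item has at least one win and at
    least one loss; otherwise the maximum-likelihood estimate is degenerate.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Each pairing contributes (games played) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(new_p)
        p = [x / norm for x in new_p]
    return p


def win_probability(p, i, j):
    """Model probability that item i beats item j."""
    return p[i] / (p[i] + p[j])


if __name__ == "__main__":
    # Hypothetical results for three teams, ten games per pairing.
    results = [[0, 7, 9], [3, 0, 6], [1, 4, 0]]
    strengths = fit_bradley_terry(results)
    print(strengths, win_probability(strengths, 0, 2))
```

Note that this fit always returns distinct scores for items with different records, however thin the evidence, which is the behaviour the abstract argues against when comparisons are sparse.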
arxiv.org - Shamik Bhattacharjee, Kamlesh Marathe, Hitesh Kapoor, Nilesh Patil
Abstract:Fantasy sports, particularly fantasy cricket, have garnered immense popularity in India in recent years, offering enthusiasts the opportunity to engage in strategic team-building and compete based on the real-world performance of professional athletes. In this paper, we address the challenge of optimizing fantasy cricket team selection using reinforcement learning (RL) techniques. By framing the team creation process as a sequential decision-making problem, we aim to develop a model that can adaptively select players to maximize the team's potential performance. Our approach leverages historical player data to train RL algorithms, which then predict future performance and optimize team composition. This not only represents a huge business opportunity by enabling more accurate predictions of high-performing teams but also enhances the overall user experience. Through empirical evaluation and comparison with traditional fantasy team drafting methods, we demonstrate the effectiveness of RL in constructing competitive fantasy teams. Our results show that RL-based strategies provide valuable insights into player selection in fantasy sports.