statsbomb.com - StatsBomb
The 2023 StatsBomb Conference was held at Wembley Stadium just over a month ago. As part of the event, we invited the winners of our Research Competition to showcase their work to an audience of industry experts and professionals.
statsbomb.com - Will Morgan
With the increasing emphasis on the passing game over recent years, defenses have needed to respond in terms of their coverages and the pre- and post-snap looks they give the offense. Two-high vs single-high safety concepts are a topic du jour when it comes to analyzing defense in the modern game, particularly with regard to limiting explosive passing plays.
plos.org - Jassim AlMulla, Mohammad Tariqul Islam, Hamada R. H. Al-Absi, Tanvir Alam
Winning football matches is the major goal of all football clubs in the world. Football being the most popular game in the world, many studies have been conducted to analyze and predict match winners based on players’ physical and technical performance. In this study, we analyzed matches from Qatar's professional football league, the Qatar Stars League (QSL), held over the last ten seasons. We incorporated the highest number of professional matches from these seasons, covering 2011 to 2022, and proposed SoccerNet, a Gated Recurrent Unit (GRU)-based deep learning model that predicts match winners with over 80% accuracy. We considered match- and player-related information captured by the STATS platform in 15-minute time slots. Then we analyzed players’ performance at different positions on the field at different stages of the match. Our results indicated that in QSL, the defenders’ role in matches is more dominant than that of midfielders and forwards. Moreover, our analysis suggests that the last 15–30 minutes of QSL matches have a more significant impact on the match result than other match segments. To the best of our knowledge, the proposed model is the first DL-based model for predicting match winners in any professional football league in the Middle East and North Africa (MENA) region. We believe the results will support the coaching staff and team management of QSL in designing game strategies and improving the overall quality of the players' performance.
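The 15-minute time slots mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the event tuples, the `slot_index` and `aggregate_by_slot` helpers, and the choice to fold stoppage time into the last slot are all assumptions made for illustration. The per-slot counts it produces are the kind of sequence a recurrent model such as a GRU could consume.

```python
from collections import defaultdict

SLOT_MINUTES = 15  # slot width taken from the abstract


def slot_index(minute, match_length=90):
    """Map a match minute to a 0-based slot; stoppage time goes to the last slot (assumption)."""
    last_slot = match_length // SLOT_MINUTES - 1
    return min(int(minute) // SLOT_MINUTES, last_slot)


def aggregate_by_slot(events, match_length=90):
    """Count (team, event_type) occurrences per slot.

    `events` is a hypothetical list of (minute, team, event_type) tuples.
    """
    n_slots = match_length // SLOT_MINUTES
    features = [defaultdict(int) for _ in range(n_slots)]
    for minute, team, event_type in events:
        features[slot_index(minute, match_length)][(team, event_type)] += 1
    return [dict(f) for f in features]


# Made-up events, for illustration only.
events = [
    (3, "home", "pass"),
    (14.5, "home", "shot"),
    (15, "away", "tackle"),
    (92, "away", "shot"),  # stoppage time, folded into the final slot
]
slots = aggregate_by_slot(events)
```

Each entry of `slots` is one time step; a real feature set would include far more than event counts.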
nih.gov - Markel Rico-González, José Pino-Ortega, Amaia Méndez, Filipe Manuel Clemente and Arnold Baca
Due to the chaotic nature of soccer, predictive statistical models have become a current challenge for decision-making based on scientific evidence. The aim of the present study was to systematically identify original studies that applied machine learning (ML) to soccer data, highlighting current possibilities in ML and future applications. A systematic review of PubMed, SPORTDiscus, and FECYT (Web of Sciences, CCC, DIIDW, KJD, MEDLINE, RSCI, and SCIELO) was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. From the 145 studies initially identified, 32 were fully reviewed, and their outcome measures were extracted and analyzed. In summary, all articles were clustered into three groups: injury (n = 7); performance (n = 21), which was classified into match/league outcomes forecasting, physical/physiological forecasting, and technical/tactical forecasting; and talent forecasting (n = 5). The development of technology, and subsequently the large amount of data available, has made ML an important strategy to help team staff members in decision-making by predicting dose-response relationships, reducing the chaotic nature of this team sport. However, since ML models depend on the amount of data, further studies should analyze the amount of input data needed to make a relevant predictive attempt that yields accurate predictions.
arxiv.org - Yiming Ren, Teo Susnjak
In this work, a machine learning approach is developed for predicting the outcomes of football matches. The novelty of this research lies in the utilisation of the Kelly Index to first classify matches into categories, each of which denotes a different level of predictive difficulty. Classification models using a wide suite of algorithms were developed for each category of matches in order to determine the efficacy of the approach. In conjunction with this, a set of previously unexplored features was engineered, including Elo-based variables. The dataset originated from Premier League match data covering the 2019-2021 seasons. The findings indicate that the process of decomposing the predictive problem into sub-tasks was effective and produced results competitive with prior works, while the ensemble-based methods were the most effective. The paper also devised an investment strategy in order to evaluate its effectiveness by benchmarking against bookmaker odds. An approach was developed that minimises risk by combining the Kelly Index with the predefined confidence thresholds of the predictive models. The experiments found that the proposed strategy can return a profit when following a conservative approach that focuses primarily on easy-to-predict matches where the predictive models display a high confidence level.
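The abstract does not say how its Elo-based variables are built. As a rough illustration only, the sketch below shows the standard Elo update on which such features are commonly based. The `k` factor of 20, the starting rating of 1500, and the function names are assumptions, not values from the paper.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_home, rating_away, home_result, k=20.0):
    """Return updated (home, away) ratings after one match.

    `home_result` is 1.0 for a home win, 0.5 for a draw, 0.0 for a home loss.
    `k` controls how fast ratings move; 20 is an illustrative choice.
    """
    exp_home = expected_score(rating_home, rating_away)
    delta = k * (home_result - exp_home)
    # The update is zero-sum: what one side gains, the other loses.
    return rating_home + delta, rating_away - delta


# Two evenly rated teams, home side wins.
new_home, new_away = update_elo(1500.0, 1500.0, home_result=1.0)
```

Pre-match ratings, or the difference between them, could then serve as model inputs, provided they are computed only from matches played before the one being predicted.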
hexmos.com - sreedeep, Rijul Rajesh
Implementing AI for object detection isn't hard. Using YOLO, we can learn how to use AI and set up object detection with ease. In this article we will learn to set up table detection using the new YOLOv8 model. Follow the tutorial through to the end to get it working in practice.
arxiv.org - Alan Jeffares, Tennison Liu, Jonathan Crabbé, Mihaela van der Schaar
Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model. Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of optimizing their joint performance. In the case of deep ensembles of neural networks, we are provided with the opportunity to directly optimize the true objective: the joint performance of the ensemble as a whole. Surprisingly, however, directly minimizing the loss of the ensemble appears to rarely be applied in practice. Instead, most previous research trains individual models independently with ensembling performed post hoc. In this work, we show that this is for good reason - joint optimization of ensemble loss results in degenerate behavior. We approach this problem by decomposing the ensemble objective into the strength of the base learners and the diversity between them. We discover that joint optimization results in a phenomenon in which base learners collude to artificially inflate their apparent diversity. This pseudo-diversity fails to generalize beyond the training data, causing a larger generalization gap. We proceed to comprehensively demonstrate the practical implications of this effect on a range of standard machine learning tasks and architectures by smoothly interpolating between independent training and joint optimization.
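The split of the ensemble objective into base-learner strength and diversity can be illustrated with the classic squared-error case, where the loss of an averaged ensemble equals the average individual loss minus the spread of the members around the ensemble mean. The sketch below checks that identity numerically. It assumes uniform averaging and squared error, which may be narrower than the decomposition used in the paper, and the function name and toy predictions are made up.

```python
import math


def ambiguity_decomposition(predictions, target):
    """Return (ensemble_loss, avg_individual_loss, diversity) for squared error.

    For a uniformly averaged ensemble:
        ensemble_loss == avg_individual_loss - diversity
    """
    n = len(predictions)
    ensemble_pred = sum(predictions) / n
    ensemble_loss = (ensemble_pred - target) ** 2
    # "Strength" term: mean squared error of the individual members.
    avg_individual_loss = sum((p - target) ** 2 for p in predictions) / n
    # "Diversity" term: mean squared deviation of members from the ensemble.
    diversity = sum((p - ensemble_pred) ** 2 for p in predictions) / n
    return ensemble_loss, avg_individual_loss, diversity


# Three toy base learners predicting a target of 2.0.
ens, avg, div = ambiguity_decomposition([1.0, 2.0, 4.0], target=2.0)
identity_holds = math.isclose(ens, avg - div)
```

Because diversity enters with a negative sign, directly minimizing the ensemble loss rewards members for disagreeing on the training data, which is consistent with the pseudo-diversity effect the abstract describes.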
arxiv.org - Luca Pappalardo, Paolo Cintia, Paolo Ferragina, Emanuele Massucco, Dino Pedreschi, Fosca Giannotti
The problem of evaluating the performance of soccer players is attracting the interest of many companies and the scientific community, thanks to the availability of massive data capturing all the events generated during a match (e.g., tackles, passes, shots, etc.). Unfortunately, there is no consolidated and widely accepted metric for measuring performance quality in all of its facets. In this paper, we design and implement PlayeRank, a data-driven framework that offers a principled multi-dimensional and role-aware evaluation of the performance of soccer players. We build our framework on a massive dataset of soccer logs consisting of millions of match events pertaining to four seasons of 18 prominent soccer competitions. By comparing PlayeRank to known algorithms for performance evaluation in soccer, and by exploiting a dataset of players' evaluations made by professional soccer scouts, we show that PlayeRank significantly outperforms the competitors. We also explore the ratings produced by PlayeRank and discover interesting patterns about the nature of excellent performances and what distinguishes the top players from the others. Finally, we explore some applications of PlayeRank (i.e., searching players and player versatility), showing its flexibility and efficiency, which make it worth using in the design of a scalable platform for soccer analytics.