theathletic.com - Raphael Honigstein
Bayer Leverkusen won German football’s first title of the season on Saturday: they are Herbstmeister (autumn champions), top of the table midway through the season. The accolade is weirdly named and purely unofficial, but it does carry symbolic meaning as a good omen. A team that can do half the job are widely seen as capable of going all the way, not least because in two-thirds of all seasons since the Bundesliga’s foundation in 1963, they actually did.
learnopencv.com - Soumyadip
The YOLO (You Only Look Once) series of models, renowned for its real-time object detection capabilities, owes much of its effectiveness to its specialized loss functions. In this article, we delve into the various YOLO loss functions integral to YOLO's evolution, focusing on their implementation in PyTorch. Our aim is to provide a clear, technical understanding of these functions, which are crucial for optimizing model training and performance.
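To give a flavor of the kind of loss the article discusses, here is a minimal pure-Python sketch of an IoU (intersection-over-union) box regression loss for a single pair of boxes. This is an illustrative simplification, not the article's PyTorch implementation, which handles batches of anchors and more refined variants (GIoU, CIoU):

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes contribute no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_loss(pred_box, target_box):
    # 1 - IoU: 0 for a perfect match, 1 for fully disjoint boxes.
    return 1.0 - iou(pred_box, target_box)
```

Identical boxes give a loss of 0, disjoint boxes give 1, and partial overlap falls in between, which is what makes the loss a useful training signal for box regression.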
arxiv.org - Yu Bai, Song Mei, Huan Wang, Caiming Xiong
Modern machine learning models with high accuracy are often miscalibrated -- the predicted top probability does not reflect the actual accuracy, and tends to be over-confident. It is commonly believed that such over-confidence is mainly due to over-parametrization, in particular when the model is large enough to memorize the training data and maximize the confidence.
In this paper, we show theoretically that over-parametrization is not the only reason for over-confidence. We prove that logistic regression is inherently over-confident, in the realizable, under-parametrized setting where the data is generated from the logistic model, and the sample size is much larger than the number of parameters. Further, this over-confidence happens for general well-specified binary classification problems as long as the activation is symmetric and concave on the positive part. Perhaps surprisingly, we also show that over-confidence is not always the case -- there exists another activation function (and a suitable loss function) under which the learned classifier is under-confident at some probability values. Overall, our theory provides a precise characterization of calibration in realizable binary classification, which we verify on simulations and real data experiments.
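The kind of miscalibration the paper characterizes can be checked empirically with a binned reliability computation: group predictions by confidence and compare each bin's mean confidence against its empirical accuracy. The sketch below is a generic illustration of that check, not code from the paper; the function name and binning scheme are my own:

```python
def reliability_bins(probs, labels, n_bins=10):
    """For binary predictions, bin the top-class probabilities and
    return (mean confidence, empirical accuracy) per non-empty bin.
    Over-confidence shows up as mean confidence > accuracy."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p, 1 - p)              # predicted top probability
        correct = (p >= 0.5) == (y == 1)  # top class matched the label?
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    return [
        (sum(c for c, _ in b) / len(b), sum(ok for _, ok in b) / len(b))
        for b in bins if b
    ]
```

For example, a model that always reports 0.9 but is right only 80% of the time yields a bin with mean confidence 0.9 and accuracy 0.8, i.e. the over-confidence the paper proves can arise even in the well-specified, under-parametrized regime.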
fharrell.com - Frank Harrell
It is important to distinguish prediction and classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying costs of wrong decisions. The classification rule must be reformulated if costs/utilities or sampling criteria change. Predictions are separate from decisions and can be used by any decision maker.

Classification is best used with non-stochastic/deterministic outcomes that occur in, say, 0.3 - 0.7 of the observations, and not when the simplest classifier (always outputting "positive" or always outputting "negative") is highly accurate, or when two individuals with identical inputs can easily have different outcomes. For these situations, modeling tendencies (i.e., probabilities) is key. Classification should be used when outcomes are distinct and predictors are strong enough to provide, for all subjects, a probability near 1.0 for one of the outcomes.
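Harrell's point that costs belong in the decision, not the prediction, can be made concrete with a simple expected-cost argument: given a predicted probability p, calling "positive" costs (1 - p) * cost_fp in expectation and calling "negative" costs p * cost_fn, so "positive" wins exactly when p exceeds cost_fp / (cost_fp + cost_fn). A minimal sketch (the cost values below are illustrative, not from the post):

```python
def decision_threshold(cost_fp, cost_fn):
    """Probability above which acting 'positive' has lower expected cost.
    Derived from: (1 - p) * cost_fp < p * cost_fn."""
    return cost_fp / (cost_fp + cost_fn)

def decide(p, cost_fp, cost_fn):
    # The prediction p is fixed; only the decision rule changes with costs.
    return "positive" if p > decision_threshold(cost_fp, cost_fn) else "negative"
```

With equal costs the threshold is 0.5, but if a false negative is nine times as costly as a false alarm it drops to 0.1: the same predicted probability yields different decisions for different decision makers, which is exactly why the prediction should be delivered as a probability rather than a pre-baked classification.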
mic-journal.no - Rudolf E. Kalman
'Roughly speaking, what we know is science and what we don't know is philosophy.' - Bertrand Russell, ca. 1968.
argmin.net - Ben Recht
No one can explain why or when statistics generalize and transfer.