Mortier, Thomas (ORCID: https://orcid.org/0000-0001-9650-9263); Bengs, Viktor (ORCID: https://orcid.org/0000-0001-6988-6186); Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108); Luca, Stijn and Waegeman, Willem (ORCID: https://orcid.org/0000-0002-5950-3003) (April 2023): On the Calibration of Probabilistic Classifier Sets. 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023), Valencia, Spain, 25-27 April 2023. In: Ruiz, Francisco; Dy, Jennifer and van de Meent, Jan-Willem (eds.): Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, Vol. 206, PMLR, pp. 8857-8870

Full text not available from 'Open Access LMU'.


Multi-class classification methods that produce sets of probabilistic classifiers, such as ensemble learning methods, are able to model aleatoric and epistemic uncertainty. Aleatoric uncertainty is then typically quantified via the Bayes error, and epistemic uncertainty via the size of the set. In this paper, we extend the notion of calibration, which is commonly used to evaluate the validity of the aleatoric uncertainty representation of a single probabilistic classifier, to assess the validity of an epistemic uncertainty representation obtained by sets of probabilistic classifiers. Broadly speaking, we call a set of probabilistic classifiers calibrated if one can find a calibrated convex combination of these classifiers. To evaluate this notion of calibration, we propose a novel nonparametric calibration test that generalizes an existing test for single probabilistic classifiers to the case of sets of probabilistic classifiers. Making use of this test, we empirically show that ensembles of deep neural networks are often not well calibrated.
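
To make the idea of a "calibrated convex combination" concrete, the following is a minimal Python sketch, not the method of the paper. It assumes constant (instance-independent) mixing weights, and it scores calibration with a simple top-label binned expected calibration error (ECE) rather than the nonparametric test proposed by the authors. The function names `convex_combination` and `expected_calibration_error` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Probs = List[List[float]]  # one probability vector per sample


def convex_combination(member_probs: Sequence[Probs],
                       weights: Sequence[float]) -> Probs:
    """Mix the members' predicted class probabilities with convex weights."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_samples = len(member_probs[0])
    n_classes = len(member_probs[0][0])
    mixed = [[0.0] * n_classes for _ in range(n_samples)]
    for w, probs in zip(weights, member_probs):
        for i in range(n_samples):
            for k in range(n_classes):
                mixed[i][k] += w * probs[i][k]
    return mixed


def expected_calibration_error(probs: Probs, labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """Top-label binned ECE: a descriptive proxy, NOT the paper's test."""
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[b].append((conf, 1.0 if pred == y else 0.0))
    ece = 0.0
    for items in bins:
        if items:
            avg_conf = sum(c for c, _ in items) / len(items)
            accuracy = sum(a for _, a in items) / len(items)
            ece += len(items) / len(labels) * abs(avg_conf - accuracy)
    return ece


# Toy data (made up for illustration): two overconfident members that
# always predict class 0 and class 1, respectively, on balanced labels.
labels = [0, 1, 0, 1]
member_a = [[1.0, 0.0] for _ in labels]
member_b = [[0.0, 1.0] for _ in labels]

ece_a = expected_calibration_error(member_a, labels)
mixed = convex_combination([member_a, member_b], [0.5, 0.5])
ece_mixed = expected_calibration_error(mixed, labels)
print(ece_a, ece_mixed)
```

In this toy setting each member on its own is poorly calibrated, while the equal-weight mixture predicts 0.5 for both classes and matches the label frequencies. That is the sense in which a set can be calibrated even though none of its members is. A low ECE on a finite sample is only descriptive evidence and does not replace a statistical test.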
