Abstract
Multi-class classification methods that produce sets of probabilistic classifiers, such as ensemble learning methods, are able to model aleatoric and epistemic uncertainty. Aleatoric uncertainty is then typically quantified via the Bayes error, and epistemic uncertainty via the size of the set. In this paper, we extend the notion of calibration, which is commonly used to evaluate the validity of the aleatoric uncertainty representation of a single probabilistic classifier, to assess the validity of an epistemic uncertainty representation obtained by sets of probabilistic classifiers. Broadly speaking, we call a set of probabilistic classifiers calibrated if one can find a calibrated convex combination of these classifiers. To evaluate this notion of calibration, we propose a novel nonparametric calibration test that generalizes an existing test for single probabilistic classifiers to the case of sets of probabilistic classifiers. Making use of this test, we empirically show that ensembles of deep neural networks are often not well calibrated.
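As a rough illustration of the idea that a set of classifiers is calibrated when some convex combination of them is calibrated, the following minimal sketch forms a convex combination of ensemble predictions and scores it with a simple binned confidence calibration error. It is not the nonparametric test proposed in the paper. The function names `convex_combination` and `confidence_ece`, the use of fixed (instance-independent) weights, and the binned error measure are all assumptions made for this example.

```python
import numpy as np

def convex_combination(probs, weights):
    """Mix the predictions of M classifiers with fixed convex weights.

    probs:   array of shape (M, N, K), M classifiers, N samples, K classes.
    weights: array of shape (M,), nonnegative and summing to 1.
    Fixed weights are an assumption of this sketch.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    # Contract the classifier axis: result has shape (N, K).
    return np.tensordot(weights, probs, axes=1)

def confidence_ece(p, labels, n_bins=10):
    """Binned calibration error of the top-label confidence (a simple proxy)."""
    conf = p.max(axis=1)                      # predicted confidence per sample
    correct = (p.argmax(axis=1) == labels).astype(float)
    # Assign each confidence to one of n_bins equal-width bins on [0, 1].
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # Weight each bin's accuracy/confidence gap by its share of samples.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

Under these assumptions, one could search over weight vectors on the simplex and ask whether any of them yields a combination with a small calibration error; a statistical test, as in the paper, would replace the raw error value with a proper accept/reject decision.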
Document type: | Conference contribution (Paper) |
---|---|
Faculty: | Mathematics, Computer Science and Statistics > Computer Science > Artificial Intelligence and Machine Learning |
Subjects: | 000 Computer science, information science, general works > 000 Computer science, knowledge, systems |
URN: | urn:nbn:de:bvb:19-epub-107492-6 |
Document ID: | 107492 |
Date of publication on Open Access LMU: | 23 Oct 2023 10:44 |
Last modified: | 11 Oct 2024 13:42 |