ORCID: https://orcid.org/0009-0009-8422-299X and Hüllermeier, Eyke
ORCID: https://orcid.org/0000-0002-9944-4108
(25 October 2025):
Random forest calibration.
In: Knowledge-Based Systems, Vol. 328: pp. 114-143
Abstract
The Random Forest (RF) classifier is often claimed to be relatively well calibrated when compared with other machine learning methods. Moreover, the existing literature suggests that traditional calibration methods, such as isotonic regression, do not substantially enhance the calibration of RF probability estimates unless supplied with extensive calibration data sets, which can represent a significant obstacle in cases of limited data availability. Nevertheless, there seems to be no comprehensive study validating such claims and systematically comparing state-of-the-art calibration methods specifically for RF. To close this gap, we investigate a broad spectrum of calibration methods tailored to or at least applicable to RF, ranging from simple scaling techniques to more advanced algorithms. Our results based on synthetic as well as real-world data unravel the intricacies of RF probability estimates, scrutinize the impact of hyper-parameters, and compare calibration methods in a systematic way. We demonstrate that a well-optimized RF matches or outperforms state-of-the-art calibration methods. In particular, statistical tests on metrics such as accuracy, ECE, Brier score, and log-loss consistently place the optimized RF among the top-performing group.
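The abstract names the Brier score, log-loss, and ECE as evaluation metrics. As a rough illustration of what these quantities measure, the following is a minimal sketch for the binary case in plain Python. It is not the authors' evaluation code: the function names, the equal-width binning with `n_bins=10`, and the choice to bin on the positive-class probability are assumptions made here for illustration, and the paper may define ECE differently (for example, on top-label confidence).

```python
import math


def brier_score(probs, labels):
    """Mean squared difference between predicted P(y=1) and the 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def log_loss(probs, labels, eps=1e-12):
    """Average negative log-likelihood; eps clips probabilities away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width binned ECE (assumed variant, binned on P(y=1)).

    Returns the sample-weighted mean of |observed positive rate - mean prediction|
    over all non-empty bins.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / len(labels) * abs(pos_rate - mean_pred)
    return ece


if __name__ == "__main__":
    # Hypothetical toy predictions, e.g. fractions of trees voting for class 1.
    toy_probs = [0.1, 0.2, 0.8, 0.9, 0.6, 0.4]
    toy_labels = [0, 0, 1, 1, 0, 1]
    print("Brier:", brier_score(toy_probs, toy_labels))
    print("Log-loss:", log_loss(toy_probs, toy_labels))
    print("ECE:", expected_calibration_error(toy_probs, toy_labels))
```

Lower values are better for all three metrics. Note that a binned ECE of zero does not imply good discrimination: a model that always predicts the base rate can be perfectly calibrated in this sense while carrying no information about individual instances, which is one reason the metrics are usually reported together with accuracy.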
| Item Type: | Journal article |
|---|---|
| Form of publication: | Publisher's Version |
| Faculties: | Mathematics, Computer Science and Statistics > Computer Science > Artificial Intelligence and Machine Learning |
| Subjects: | 000 Computer science, information and general works > 004 Data processing computer science |
| URN: | urn:nbn:de:bvb:19-epub-128362-5 |
| ISSN: | 0950-7051 |
| Language: | English |
| Item ID: | 128362 |
| Date Deposited: | 09 Sep 2025 16:41 |
| Last Modified: | 09 Sep 2025 16:41 |
