Abstract
This work introduces interpretable regional descriptors, or IRDs, for local, model-agnostic interpretations. IRDs are hyperboxes that describe how an observation’s feature values can be changed without affecting its prediction. They justify a prediction by providing a set of “even if” arguments (semi-factual explanations), and they indicate which features affect a prediction and whether pointwise biases or implausibilities exist. A concrete use case shows that this is valuable for both machine learning modelers and persons subject to a decision. We formalize the search for IRDs as an optimization problem and introduce a unifying framework for computing IRDs that covers desiderata, initialization techniques, and a post-processing method. We show how existing hyperbox methods can be adapted to fit into this unified framework. A benchmark study compares the methods based on several quality measures and identifies two strategies to improve IRDs.
Dokumententyp: | Konferenzbeitrag (Paper) |
---|---|
Fakultät: | Mathematik, Informatik und Statistik > Statistik |
Themengebiete: | 000 Informatik, Informationswissenschaft, allgemeine Werke > 004 Informatik
500 Naturwissenschaften und Mathematik > 510 Mathematik |
ISBN: | 978-3-031-43417-4 ; 978-3-031-43418-1 |
Ort: | Cham |
Sprache: | Englisch |
Dokumenten ID: | 121921 |
Datum der Veröffentlichung auf Open Access LMU: | 04. Nov. 2024 14:12 |
Letzte Änderungen: | 04. Nov. 2024 14:12 |