
Belaid, Mohamed Karim; Bornemann, Richard; Rabus, Maximilian; Krestel, Ralf and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (July 2023): Compare-xAI: Toward Unifying Functional Testing Methods for Post-hoc XAI Algorithms into a Multi-dimensional Benchmark. World Conference on Explainable Artificial Intelligence (xAI 2023), Lisboa, Portugal, 26-28 July 2023. In: Longo, Luca (Ed.). Cham: Springer Nature Switzerland, pp. 88-109.

Abstract

In recent years, Explainable AI (xAI) has attracted considerable attention as various countries turned explanations into a legal right. xAI algorithms enable humans to understand the underlying models and explain their behavior, leading to insights through which the models can be analyzed and improved beyond the accuracy metric, e.g., by debugging the learned patterns and reducing unwanted biases. However, the widespread use of xAI and the rapidly growing body of published research in xAI have brought new challenges. The large number of xAI algorithms can be overwhelming and makes it difficult for practitioners to choose the correct xAI algorithm for their specific use case. This problem is further exacerbated by the different approaches used to assess novel xAI algorithms, which makes it difficult to compare them to existing methods. To address this problem, we introduce Compare-xAI, a benchmark that allows for a direct comparison of popular xAI algorithms across a variety of use cases. We propose a scoring protocol employing a range of functional tests from the literature, each targeting a specific end-user requirement in explaining a model. To make the benchmark results easily accessible, we group the tests into four categories (fidelity, fragility, stability, and stress tests). We present results for 13 xAI algorithms based on 11 functional tests. After analyzing the findings, we derive potential solutions for data science practitioners as workarounds to the practical limitations we found. Finally, Compare-xAI is an attempt to unify systematic evaluation and comparison methods for xAI algorithms with a focus on the end-user's requirements. The code is made available at:
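
The scoring protocol described above aggregates per-test results into category-level scores for each xAI algorithm. The following is a minimal, hypothetical sketch of such an aggregation step; the algorithm names, test names, and averaging scheme are illustrative assumptions and do not reproduce the paper's actual implementation.

```python
# Hypothetical sketch: averaging functional-test scores per (algorithm, category).
# The four categories follow the abstract (fidelity, fragility, stability, stress);
# the concrete tests, algorithms, and scores below are placeholders.
from collections import defaultdict

# Each entry: (xai_algorithm, test_name, category, score in [0, 1])
results = [
    ("shap", "feature_importance_fidelity", "fidelity",  1.0),
    ("shap", "adversarial_perturbation",    "fragility", 0.5),
    ("lime", "feature_importance_fidelity", "fidelity",  0.0),
    ("lime", "repeated_run_stability",      "stability", 1.0),
]

def category_scores(results):
    """Average the per-test scores of each algorithm within each category."""
    sums = defaultdict(lambda: [0.0, 0])  # (algorithm, category) -> [total, count]
    for algo, _test, category, score in results:
        entry = sums[(algo, category)]
        entry[0] += score
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

if __name__ == "__main__":
    for (algo, category), score in sorted(category_scores(results).items()):
        print(f"{algo:5s} {category:10s} {score:.2f}")
```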
