Testification of Condorcet Winners in dueling bandits

Haddenhorst, Björn ORCID: https://orcid.org/0000-0002-4023-6646; Bengs, Viktor; Brandt, Jasmin and Hüllermeier, Eyke ORCID: https://orcid.org/0000-0002-9944-4108 (27 July 2021): Testification of Condorcet Winners in dueling bandits. Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, Virtual, July 27-30, 2021. In: de Campos, Cassio and Maathuis, Marloes H. (eds.): Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, Vol. 161, PMLR, pp. 1195-1205 [PDF, 875kB]

Published version: Creative Commons Attribution 4.0 (CC-BY)

Supplementary material: Creative Commons Attribution 4.0 (CC-BY)

External full text: https://proceedings.mlr.press/v161/haddenhorst21a.html

Abstract

Several algorithms for finding the best arm in the dueling bandits setting assume the existence of a Condorcet winner (CW), that is, an arm that uniformly dominates all other arms. Yet, by simply relying on this assumption without verifying it, such algorithms may produce doubtful results in cases where it actually fails to hold. Even worse, the problem may go unnoticed, and an alleged CW may still be produced. In this paper, we therefore address the problem as a "testification" task, by which we mean a combination of testing and identification: the online identification of the CW is combined with the statistical testing of the CW assumption. Thus, instead of returning a supposed CW at some point, the learner has the possibility to stop sampling and refuse an answer in case it feels confident that the CW assumption is violated. Analyzing the testification problem formally, we derive lower bounds on the expected sample complexity of any online algorithm solving it. Moreover, a concrete algorithm is proposed, which achieves the optimal sample complexity up to logarithmic terms.
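
To make the Condorcet winner definition concrete, the following is a minimal Python sketch, not the algorithm proposed in the paper. It assumes the pairwise preference probabilities are fully known, whereas the learner in the dueling bandits setting only observes noisy duel outcomes. The function name `condorcet_winner`, the matrix layout `P[i][j]` (probability that arm i beats arm j), and both example matrices are illustrative assumptions. The sketch shows the two possible outcomes of a testification task: naming a CW, or reporting that none exists.

```python
from typing import Optional, Sequence


def condorcet_winner(P: Sequence[Sequence[float]]) -> Optional[int]:
    """Return the index of the Condorcet winner of P, or None if there is none.

    Assumption: P[i][j] is the known probability that arm i beats arm j.
    """
    n = len(P)
    for i in range(n):
        # Arm i is a Condorcet winner if it beats every other arm
        # with probability strictly greater than 1/2.
        if all(P[i][j] > 0.5 for j in range(n) if j != i):
            return i
    return None


# Hypothetical example: arm 0 beats both other arms with probability above 1/2.
with_cw = [
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.8],
    [0.3, 0.2, 0.5],
]

# Hypothetical cyclic example: 0 beats 1, 1 beats 2, 2 beats 0, so no CW exists.
cyclic = [
    [0.5, 0.6, 0.4],
    [0.4, 0.5, 0.6],
    [0.6, 0.4, 0.5],
]

print(condorcet_winner(with_cw))  # 0
print(condorcet_winner(cyclic))   # None
```

In the cyclic case every arm loses to some other arm, which is the situation in which an algorithm that simply assumes a CW would still return some arm, while a testification procedure may instead refuse to answer.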

Document type:	Conference contribution (Paper)
Form of publication:	Publisher's Version
Faculty:	Mathematics, Computer Science and Statistics > Computer Science > Artificial Intelligence and Machine Learning
Subjects:	000 Computer science, information and general works > 000 Computer science, knowledge, and systems
URN:	urn:nbn:de:bvb:19-epub-93056-9
Document ID:	93056
Date published on Open Access LMU:	09 Aug 2022 18:06
Last modified:	27 Nov 2024 16:25
