Identification of the Generalized Condorcet Winner in Multi-dueling Bandits

www.lmu.de | UB | Blättern | Hilfe

Zur erweiterten Suche

English

Zur erweiterten Suche

Haddenhorst, Björn ORCID: https://orcid.org/0000-0002-4023-6646; Bengs, Viktor und Hüllermeier, Eyke ORCID: https://orcid.org/0000-0002-9944-4108 (7. Dezember 2021): Identification of the Generalized Condorcet Winner in Multi-dueling Bandits. Advances in Neural Information Processing Systems, Virtual, December 7 2021. Ranzato, M.; Beygelzimer, A.; Dauphin, Y.; Liang, P.S. und Vaughan, J. Wortman (Hrsg.): Bd. 34 Curran Associates, Inc.. S. 25904-25916 [PDF, 516kB]

[thumbnail of NeurIPS-2021-identification-of-the-generalized-condorcet-winner-in-multi-dueling-bandits-Paper.pdf]

Vorschau

Creative Commons: Namensnennung 4.0 (CC-BY)

Veröffentlichte Version

Vorschau

Creative Commons: Namensnennung 4.0 (CC-BY)

Ergänzendes Material

Externer Volltext: https://proceedings.neurips.cc/paper/2021/file/d9de6a144a3cc26cb4b3c47b206a121a-Paper.pdf

Abstract

The reliable identification of the “best” arm while keeping the sample complexity as low as possible is a common task in the field of multi-armed bandits. In the multi-dueling variant of multi-armed bandits, where feedback is provided in the form of a winning arm among as set of k chosen ones, a reasonable notion of best arm is the generalized Condorcet winner (GCW). The latter is an the arm that has the greatest probability of being the winner in each subset containing it. In this paper, we derive lower bounds on the sample complexity for the task of identifying the GCW under various assumptions. As a by-product, our lower bound results provide new insights for the special case of dueling bandits (k = 2). We propose the Dvoretzky–Kiefer–Wolfowitz tournament (DKWT) algorithm, which we prove to be nearly optimal. In a numerical study, we show that DKWT empirically outperforms current state-of-the-art algorithms, even in the special case of dueling bandits or under a Plackett-Luce assumption on the feedback mechanism.

Dokumententyp:	Konferenzbeitrag (Paper)
Publikationsform:	Publisher's Version
Fakultät:	Mathematik, Informatik und Statistik > Informatik > Künstliche Intelligenz und Maschinelles Lernen
Themengebiete:	000 Informatik, Informationswissenschaft, allgemeine Werke > 000 Informatik, Wissen, Systeme
URN:	urn:nbn:de:bvb:19-epub-93052-6
Dokumenten ID:	93052
Datum der Veröffentlichung auf Open Access LMU:	09. Aug. 2022 17:34
Letzte Änderungen:	27. Nov. 2024 14:45

Dokument bearbeiten