Identifying Copeland Winners in Dueling Bandits with Indifferences

Bengs, Viktor ORCID: https://orcid.org/0000-0001-6988-6186; Haddenhorst, Björn and Hüllermeier, Eyke ORCID: https://orcid.org/0000-0002-9944-4108 (May 2024): Identifying Copeland Winners in Dueling Bandits with Indifferences. 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), Valencia, Spain, 2-4 May 2024. In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research Vol. 238, PMLR. pp. 226-234 [PDF, 522kB]

Creative Commons: Attribution 4.0 (CC-BY)

Published version

Creative Commons: Attribution 4.0 (CC-BY)

Presentation

DOI: 10.5282/ubm/epub.121726

External full text: https://proceedings.mlr.press/v238/bengs24a.html

Abstract

We consider the task of identifying the Copeland winner(s) in a dueling bandits problem with ternary feedback. This is an underexplored but practically relevant variant of the conventional dueling bandits problem, in which, in addition to strict preference between two arms, one may observe feedback in the form of an indifference. We provide a lower bound on the sample complexity for any learning algorithm finding the Copeland winner(s) with a fixed error probability. Moreover, we propose POCOWISTA, an algorithm with a sample complexity that almost matches this lower bound, and which shows excellent empirical performance, even for the conventional dueling bandits problem. For the case where the preference probabilities satisfy a specific type of stochastic transitivity, we provide a refined version with an improved worst case sample complexity.
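To make the target of the identification task concrete, the following minimal Python sketch computes Copeland scores and the set of Copeland winners when the ternary preference probabilities are already known. It is an illustration under stated assumptions, not the POCOWISTA algorithm and not the authors' code: the function name `copeland_winners`, the example probabilities, the rule that a pair is decided by its most likely outcome, and the weight of 0.5 credited to each arm for an indifference are all assumptions made here; the paper's exact definitions may differ.

```python
from typing import Dict, List, Tuple

# For each unordered pair (i, j) with i < j:
# (P(i preferred), P(indifference), P(j preferred)).
Ternary = Tuple[float, float, float]


def copeland_winners(
    n_arms: int,
    probs: Dict[Tuple[int, int], Ternary],
    tie_weight: float = 0.5,  # assumed credit per arm for an indifference
) -> Tuple[List[float], List[int]]:
    """Return (scores, winners) for known ternary preference probabilities."""
    scores = [0.0] * n_arms
    for (i, j), triple in probs.items():
        # Assumption: the pair is decided by its most likely outcome,
        # and that outcome is unique.
        outcome = max(range(3), key=lambda k: triple[k])
        if outcome == 0:
            scores[i] += 1.0
        elif outcome == 2:
            scores[j] += 1.0
        else:
            scores[i] += tie_weight
            scores[j] += tie_weight
    best = max(scores)
    winners = [arm for arm, s in enumerate(scores) if s == best]
    return scores, winners


# Made-up probabilities for three arms, for illustration only.
example_probs: Dict[Tuple[int, int], Ternary] = {
    (0, 1): (0.6, 0.3, 0.1),  # arm 0 most likely preferred over arm 1
    (0, 2): (0.2, 0.5, 0.3),  # indifference is the most likely outcome
    (1, 2): (0.1, 0.2, 0.7),  # arm 2 most likely preferred over arm 1
}

if __name__ == "__main__":
    scores, winners = copeland_winners(3, example_probs)
    print(scores, winners)
```

In this made-up instance, arms 0 and 2 share the highest score, which illustrates why the abstract speaks of Copeland winner(s) in the plural. A learning algorithm does not see these probabilities and has to infer the winner set from sampled duels.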

Document type:	Conference contribution (Paper)
Faculty:	Mathematics, Computer Science and Statistics > Computer Science > Artificial Intelligence and Machine Learning
Subjects:	000 Computer science, information and general works > 000 Computer science, knowledge, and systems
URN:	urn:nbn:de:bvb:19-epub-121726-0
ISSN:	2640-3498
Language:	English
Document ID:	121726
Date deposited on Open Access LMU:	09 Oct 2024 09:29
Last modified:	25 Nov 2024 06:47
