Grouped by: Document type | Publication date
Number of publications: 7

Journal article

Busa-Fekete, Róbert; Szörényi, Balázs; Weng, Paul; Cheng, Weiwei and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (2014): Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm. In: Machine Learning, Vol. 97, No. 3: pp. 327-351

Conference contribution

Feng, Xuening; Jiang, Zhaohui; Kaufmann, Timo (ORCID: https://orcid.org/0000-0001-5193-8574); Xu, Puchen; Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108); Weng, Paul and Zhu, Yifei (April 2025): DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback. The 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, Pennsylvania, USA, 25 February - 4 March 2025. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 39, No. 16, pp. 16604-16612

Szörényi, Balázs; Busa-Fekete, Róbert; Weng, Paul and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (July 2015): Qualitative Multi-Armed Bandits: A Quantile-Based Approach. 32nd International Conference on Machine Learning, Lille, France, July 6 - 11, 2015. In: Proceedings of the 32nd International Conference on Machine Learning, Vol. 37, pp. 1660-1668 [PDF, 449kB]

Busa-Fekete, Róbert; Szörényi, Balázs; Cheng, Weiwei; Weng, Paul and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (2013): Top-k Selection based on Adaptive Sampling of Noisy Preferences. ICML'13: 30th International Conference on Machine Learning, Atlanta GA USA, June 16 - 21, 2013. Dasgupta, Sanjoy and McAllester, David (eds.): In: Proceedings of the 30th International Conference on Machine Learning, Vol. 28, No. 3, pp. 1094-1102 [PDF, 434kB]

Weng, Paul; Busa-Fekete, Róbert and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (2013): Interactive Q-Learning with Ordinal Rewards and Unreliable Tutor. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2013). Reinforcement Learning with Generalized Feedback, Prague, 23rd September 2013. pp. 1-13

Weng, Paul; Busa-Fekete, Róbert and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (2013): Preference-based Evolutionary Direct Policy Search. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2013). Reinforcement Learning with Generalized Feedback, Prague, 23rd September 2013. pp. 1-8

Other

Kaufmann, Timo (ORCID: https://orcid.org/0000-0001-5193-8574); Weng, Paul; Bengs, Viktor (ORCID: https://orcid.org/0000-0001-6988-6186) and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (30 April 2024): A Survey of Reinforcement Learning from Human Feedback. [PDF, 1MB]

This list was generated on Sat May 31 23:36:28 2025 CEST.