Publications by Weng, Paul

Number of publications: 7

Journal articles

Busa-Fekete, Róbert; Szörényi, Balázs; Weng, Paul; Cheng, Weiwei and Hüllermeier, Eyke

(2014): Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm. In: Machine Learning, Vol. 97, No. 3: pp. 327-351

Conference contributions

Feng, Xuening; Jiang, Zhaohui; Kaufmann, Timo; Xu, Puchen; Hüllermeier, Eyke; Weng, Paul and Zhu, Yifei

(April 2025): DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback. The 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, Pennsylvania, USA, February 25 - March 4, 2025. Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 39, No. 16, pp. 16604-16612 [PDF, 856kB]

Szörényi, Balázs; Busa-Fekete, Róbert; Weng, Paul and Hüllermeier, Eyke

(July 2015): Qualitative Multi-Armed Bandits: A Quantile-Based Approach. 32nd International Conference on Machine Learning, Lille, France, July 6 - 11, 2015. In: Proceedings of the 32nd International Conference on Machine Learning, Vol. 37, pp. 1660-1668 [PDF, 449kB]

Busa-Fekete, Róbert; Szörényi, Balázs; Cheng, Weiwei; Weng, Paul and Hüllermeier, Eyke

(2013): Top-k Selection based on Adaptive Sampling of Noisy Preferences. ICML'13: 30th International Conference on Machine Learning, Atlanta GA USA, June 16 - 21, 2013. Dasgupta, Sanjoy and McAllester, David (eds.): In: Proceedings of the 30th International Conference on Machine Learning, Vol. 28, No. 3, pp. 1094-1102 [PDF, 434kB]

Weng, Paul; Busa-Fekete, Róbert and Hüllermeier, Eyke

(2013): Interactive Q-Learning with Ordinal Rewards and Unreliable Tutor. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2013). Reinforcement Learning with Generalized Feedback, Prague, 23rd September 2013. pp. 1-13

Weng, Paul; Busa-Fekete, Róbert and Hüllermeier, Eyke

(2013): Preference-based Evolutionary Direct Policy Search. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2013). Reinforcement Learning with Generalized Feedback, Prague, 23rd September 2013. pp. 1-8

Other

Kaufmann, Timo; Weng, Paul; Bengs, Viktor and Hüllermeier, Eyke

(April 30, 2024): A Survey of Reinforcement Learning from Human Feedback. [PDF, 1MB]

This list was generated on Sat Jan 3 23:53:56 2026 CET.