Number of publications: 1
Conference paper
Feng, Xuening; Jiang, Zhaohui; Kaufmann, Timo (ORCID: https://orcid.org/0000-0001-5193-8574); Xu, Puchen; Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108); Weng, Paul and Zhu, Yifei
(April 2025):
DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback.
The 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, Pennsylvania, USA, 25 February - 4 March 2025.
Proceedings of the AAAI Conference on Artificial Intelligence.
Vol. 39, No. 16
pp. 16604-16612
This list was generated on Sat May 31 20:57:46 2025 CEST.