Abstract
In this paper, we advocate for the potential of reinforcement learning from human feedback (RLHF) combined with self-supervised pretraining to increase the viability of reinforcement learning (RL) for real-world tasks, especially in the context of cyber-physical systems (CPS). We identify potential benefits of self-supervised pretraining in terms of query sample complexity, safety, robustness, reward exploration, and transfer. We believe that exploiting these benefits, combined with the generally improving sample efficiency of RL, will likely enable RL and RLHF to play an increasing role in CPS in the future.
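As a concrete illustration of the combination the abstract describes, the sketch below is a minimal PyTorch example, not the paper's implementation: it trains a reward model from pairwise human preferences (the standard RLHF ingredient, via a Bradley-Terry loss) on top of a frozen self-supervised pretrained encoder. Reusing pretrained features so that only a small head is fit to human labels is one way the claimed reduction in query sample complexity could materialize. All module names, dimensions, and the stand-in encoder are hypothetical.

```python
# Minimal sketch (assumptions: hypothetical names, a stand-in encoder,
# 32-dimensional observations): preference-based reward learning for RLHF
# with a frozen self-supervised pretrained encoder.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, encoder: nn.Module, feat_dim: int):
        super().__init__()
        self.encoder = encoder              # assumed pretrained via self-supervision
        for p in self.encoder.parameters():
            p.requires_grad = False         # reuse representations as-is
        self.head = nn.Linear(feat_dim, 1)  # only this small head is trained

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Per-step scalar reward for each observation in a segment.
        return self.head(self.encoder(obs)).squeeze(-1)

def preference_loss(model, seg_a, seg_b, prefers_a):
    """Bradley-Terry loss on trajectory segments: the segment the human
    preferred should receive the higher summed predicted reward."""
    r_a = model(seg_a).sum(dim=-1)          # total predicted reward, segment A
    r_b = model(seg_b).sum(dim=-1)          # total predicted reward, segment B
    logits = r_a - r_b
    return nn.functional.binary_cross_entropy_with_logits(
        logits, prefers_a.float())

# Usage with a stand-in encoder (all shapes are illustrative assumptions).
enc = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
rm = RewardModel(enc, feat_dim=64)
seg_a = torch.randn(8, 10, 32)              # batch of 8 segments, 10 steps each
seg_b = torch.randn(8, 10, 32)
prefers_a = torch.randint(0, 2, (8,))       # human labels: 1 if A preferred
loss = preference_loss(rm, seg_a, seg_b, prefers_a)
loss.backward()                             # updates only the reward head
```

Because the encoder stays frozen, each human preference query only has to constrain the small linear head rather than a full representation, which is the intuition behind the query-efficiency benefit named in the abstract.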
Document type: | Conference contribution (paper) |
---|---|
Faculty: | Mathematics, Informatics and Statistics > Informatics > Artificial Intelligence and Machine Learning |
Subject areas: | 000 Computer science, information and general works > 004 Data processing, computer science |
URN: | urn:nbn:de:bvb:19-epub-122930-3 |
ISBN: | 978-3-031-47061-5 |
Place of publication: | Cham |
Document ID: | 122930 |
Date deposited on Open Access LMU: | 05 Dec 2024 11:48 |
Last modified: | 05 Dec 2024 12:25 |