
Kaufmann, Timo (ORCID: https://orcid.org/0000-0001-5193-8574); Bengs, Viktor (ORCID: https://orcid.org/0000-0001-6988-6186) and Hüllermeier, Eyke (ORCID: https://orcid.org/0000-0002-9944-4108) (2024): Reinforcement Learning from Human Feedback for Cyber-Physical Systems: On the Potential of Self-Supervised Pretraining. International Conference on Machine Learning For Cyber-Physical Systems (ML4CPS 2023), Hamburg, Germany, 29-31 March 2023. In: Niggemann, Oliver; Beyerer, Jürgen; Krantz, Maria and Kühnert, Christian (eds.): Technologien für die intelligente Automation, Vol. 18. Cham: Springer Nature Switzerland. pp. 11-18 [PDF, 380kB]

Abstract

In this paper, we advocate for the potential of reinforcement learning from human feedback (RLHF) with self-supervised pretraining to increase the viability of reinforcement learning (RL) for real-world tasks, especially in the context of cyber-physical systems (CPS). We identify potential benefits of self-supervised pretraining in terms of the query sample complexity, safety, robustness, reward exploration and transfer. We believe that exploiting these benefits, combined with the generally improving sample efficiency of RL, will likely enable RL and RLHF to play an increasing role in CPS in the future.
