
Kölle, Michael; Tochtermann, Johannes; Schönberger, Julian; Stenzel, Gerhard; Altmann, Philipp (ORCID: https://orcid.org/0000-0003-1134-176X) and Linnhoff-Popien, Claudia (ORCID: https://orcid.org/0000-0001-6284-9286) (2025): PIMAEX: Multi-Agent Exploration Through Peer Incentivization. ICAART 2025: 17th International Conference on Agents and Artificial Intelligence, Porto, Portugal, 23-25 February 2025. In: Rocha, Ana Paula; Steels, Luc and Herik, H. Jaap van den (Eds.): Proceedings of the 17th International Conference on Agents and Artificial Intelligence, ICAART 2025 (Volume 1), Setúbal: SciTePress. pp. 572-579

Abstract

While exploration in single-agent reinforcement learning has been studied extensively in recent years, considerably less work has focused on its counterpart in multi-agent reinforcement learning. To address this issue, this work proposes a peer-incentivized reward function inspired by previous research on intrinsic curiosity and influence-based rewards. The PIMAEX reward, short for Peer-Incentivized Multi-Agent Exploration, aims to improve exploration in the multi-agent setting by encouraging agents to exert influence over each other to increase the likelihood of encountering novel states. We evaluate the PIMAEX reward in conjunction with PIMAEX-Communication, a multi-agent training algorithm that employs a communication channel for agents to influence one another. The evaluation is conducted in the Consume/Explore environment, a partially observable environment with deceptive rewards, specifically designed to challenge the exploration vs. exploitation dilemma and the credit-assignment problem. The results empirically demonstrate that agents using the PIMAEX reward with PIMAEX-Communication outperform those that do not.
