Abstract
Sequential recommendations play a crucial role in many real-world applications. Due to their sequential nature, reinforcement learning has been employed to iteratively produce recommendations based on an observed stream of user behavior. In this setting, a recommendation agent interacts with the environment (users) by sequentially recommending items (actions) to maximize the users' long-term cumulative reward. However, most reinforcement learning-based recommendation models focus only on extrinsic rewards derived from user feedback, which leads to sub-optimal policies when user-item interactions are sparse and fails to capture dynamic rewards that reflect users' evolving preferences. As a remedy, we propose a dynamic intrinsic reward signal integrated with a contrastive discriminator-augmented reinforcement learning framework. Concretely, our framework contains two modules: (1) a contrastive learning module that learns representations of item sequences; and (2) an intrinsic reward function that imitates the user's internal dynamics. Furthermore, we combine the static extrinsic reward and the dynamic intrinsic reward to train a sequential recommender system based on double Q-learning. We integrate our framework with five representative sequential recommendation models. Specifically, our framework augments these recommendation models with two output layers: one supervised layer that applies a cross-entropy loss for ranking, and one layer for reinforcement learning. Experimental results on two real-world datasets demonstrate that the proposed framework outperforms several sequential recommendation baselines as well as intrinsic-reward-based exploration baselines.
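The full formulation is given in the paper itself; the following is only a minimal sketch of the three ingredients the abstract names — a contrastive loss over sequence representations, a combined extrinsic-plus-intrinsic reward, and a double Q-learning target feeding a two-headed (ranking + RL) loss. It assumes a PyTorch-style setup, and all function and parameter names (`info_nce`, `combined_reward`, `double_q_target`, `joint_loss`, `beta`, `alpha`) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def info_nce(z_i, z_j, temperature=0.2):
    """Contrastive (InfoNCE-style) loss over two augmented views of each
    item sequence; matching views sit on the diagonal. Sketch only — the
    paper's contrastive discriminator may differ."""
    z_i = F.normalize(z_i, dim=-1)
    z_j = F.normalize(z_j, dim=-1)
    logits = z_i @ z_j.t() / temperature                  # (B, B) similarities
    labels = torch.arange(z_i.size(0), device=z_i.device)
    return F.cross_entropy(logits, labels)

def combined_reward(r_ext, state, action, intrinsic_fn, beta=0.1):
    """Total reward = static extrinsic feedback + weighted dynamic intrinsic
    signal. `intrinsic_fn` and the additive weighting are assumptions."""
    r_int = intrinsic_fn(state, action)                   # learned, preference-based
    return r_ext + beta * r_int

def double_q_target(r, next_state, q_net, target_net, gamma=0.99, done=None):
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it."""
    with torch.no_grad():
        next_action = q_net(next_state).argmax(dim=-1, keepdim=True)
        next_q = target_net(next_state).gather(-1, next_action).squeeze(-1)
        if done is not None:
            next_q = next_q * (1.0 - done.float())        # zero out terminal states
    return r + gamma * next_q

def joint_loss(logits, q_values, target, positive_item, action, alpha=1.0):
    """Two output heads: a supervised cross-entropy ranking loss plus a
    temporal-difference loss on the chosen action's Q-value."""
    ce = F.cross_entropy(logits, positive_item)           # supervised ranking head
    q_sa = q_values.gather(-1, action.unsqueeze(-1)).squeeze(-1)
    td = F.mse_loss(q_sa, target)                         # reinforcement learning head
    return ce + alpha * td
```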
Document type: | Journal article
---|---
Faculty: | Mathematics, Informatics and Statistics > Computer Science
Subject areas: | 000 Computer science, information science, general works > 004 Computer science
ISSN: | 0219-1377
Language: | English
Document ID: | 111084
Date published on Open Access LMU: | 02 Apr 2024, 07:23
Last modified: | 02 Apr 2024, 07:23