Abstract
We evaluate the asymptotic performance of boundedly-rational strategies in multi-armed bandit problems, where performance is measured in terms of the tendency (in the limit) to play optimal actions in either (i) isolation or (ii) networks of other learners. We show that, for many strategies commonly employed in economics, psychology, and machine learning, performance in isolation and performance in networks are essentially unrelated. Our results suggest that the performance of various common boundedly-rational strategies depends crucially upon the social context (if any) in which such strategies are to be employed.
Item Type: | Journal article |
---|---|
Keywords: | Bandit problems; Networks; Reinforcement learning; Simulated annealing; Epsilon greedy |
Faculties: | Philosophy, Philosophy of Science and Religious Science > Munich Center for Mathematical Philosophy (MCMP); Philosophy, Philosophy of Science and Religious Science > Munich Center for Mathematical Philosophy (MCMP) > Philosophy of Science |
Subjects: | 100 Philosophy and Psychology > 100 Philosophy |
ISSN: | 0020-7276 |
Language: | English |
Item ID: | 18393 |
Date Deposited: | 02. Mar 2014, 10:22 |
Last Modified: | 04. Nov 2020, 12:59 |