Abstract
We evaluate the asymptotic performance of boundedly-rational strategies in multi-armed bandit problems, where performance is measured in terms of the tendency (in the limit) to play optimal actions in either (i) isolation or (ii) networks of other learners. We show that, for many strategies commonly employed in economics, psychology, and machine learning, performance in isolation and performance in networks are essentially unrelated. Our results suggest that the performance of various common boundedly-rational strategies depends crucially upon the social context (if any) in which such strategies are to be employed.
Document type: | Journal article |
---|---|
Keywords: | Bandit problems; Networks; Reinforcement learning; Simulated annealing; Epsilon greedy |
Faculty: | Philosophy, Philosophy of Science and the Study of Religion > Munich Center for Mathematical Philosophy (MCMP); Philosophy, Philosophy of Science and the Study of Religion > Munich Center for Mathematical Philosophy (MCMP) > Philosophy of Science |
Subject areas: | 100 Philosophy and psychology > 100 Philosophy |
ISSN: | 0020-7276 |
Language: | English |
Document ID: | 18393 |
Date of publication on Open Access LMU: | 02 Mar. 2014, 10:22 |
Last modified: | 04 Nov. 2020, 12:59 |