Skip to content

Latest commit

 

History

History
48 lines (48 loc) · 1.77 KB

2021-07-21-bubeck21b.md

File metadata and controls

48 lines (48 loc) · 1.77 KB
title abstract layout series publisher issn id month tex_title firstpage lastpage page order cycles bibtex_author author date address container-title volume genre issued pdf extras
Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions
We consider the cooperative multi-player version of the stochastic multi-armed bandit problem. We study the regime where the players cannot communicate but have access to shared randomness. In prior work by the first two authors, a strategy for this regime was constructed for two players and three arms, with regret $\tilde{O}(\sqrt{T})$, and with no collisions at all between the players (with very high probability). In this paper we show that these properties (near-optimal regret and no collisions at all) are achievable for any number of players and arms. At a high level, the previous strategy heavily relied on a 2-dimensional geometric intuition that was difficult to generalize in higher dimensions, while here we take a more combinatorial route to build the new strategy.
inproceedings
Proceedings of Machine Learning Research
PMLR
2640-3498
bubeck21b
0
Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions
821
822
821-822
821
false
Bubeck, Sebastien and Budzinski, Thomas and Sellke, Mark
given family
Sebastien
Bubeck
given family
Thomas
Budzinski
given family
Mark
Sellke
2021-07-21
Proceedings of Thirty Fourth Conference on Learning Theory
134
inproceedings
date-parts
2021
7
21