| field | value |
|---|---|
| abstract | This joint extended abstract introduces and compares the results of Auer et al. (2019) and Chen et al. (2019), both of which resolve the problem of achieving optimal dynamic regret for non-stationary bandits without prior information on the non-stationarity; the target rate is recalled below the table. Specifically, Auer et al. (2019) resolve the problem for the traditional multi-armed bandit setting, while Chen et al. (2019) give a solution for the more general contextual bandit setting. Both works extend the key idea of Auer et al. (2018), originally developed for a simpler two-armed setting. |
| section | contributed |
| title | Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information |
| layout | inproceedings |
| series | Proceedings of Machine Learning Research |
| id | auer19b |
| month | 0 |
| tex_title | Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information |
| firstpage | 159 |
| lastpage | 163 |
| page | 159-163 |
| order | 159 |
| cycles | false |
| bibtex_author | Auer, Peter and Chen, Yifang and Gajane, Pratik and Lee, Chung-Wei and Luo, Haipeng and Ortner, Ronald and Wei, Chen-Yu |
| author | |
| date | 2019-06-25 |
| address | |
| publisher | PMLR |
| container-title | Proceedings of the Thirty-Second Conference on Learning Theory |
| volume | 99 |
| genre | inproceedings |
| issued | |
| extras | |
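For concreteness, the optimal dynamic regret referred to in the abstract can be stated as a minimal sketch, assuming the standard notation of $K$ arms, $S$ stationary segments (i.e., $S-1$ abrupt distribution changes), and horizon $T$; this notation is assumed here and does not appear in the metadata above:

$$
R_T \;=\; \tilde{O}\!\left(\sqrt{S K T}\right).
$$

Auer et al. (2019) attain this rate in the multi-armed setting without prior knowledge of $S$, and Chen et al. (2019) attain the analogous rate in the contextual setting, with additional dependence on the policy class.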