2021-07-21-lattimore21b.md

title: Mirror Descent and the Information Ratio
abstract: 'We establish a connection between the stability of mirror descent and the information ratio by Russo and Van Roy (2014). Our analysis shows that mirror descent with suitable loss estimators and exploratory distributions enjoys the same bound on the adversarial regret as the bounds on the Bayesian regret for information-directed sampling. Along the way, we develop the theory for information-directed sampling and provide an efficient algorithm for adversarial bandits for which the regret upper bound matches exactly the best known information-theoretic upper bound. Keywords: Bandits, partial monitoring, mirror descent, information theory.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: lattimore21b
month: 0
tex_title: Mirror Descent and the Information Ratio
firstpage: 2965
lastpage: 2992
page: 2965-2992
order: 2965
cycles: false
bibtex_author: Lattimore, Tor and Gyorgy, Andras
author:
- given: Tor
  family: Lattimore
- given: Andras
  family: Gyorgy
date: 2021-07-21
address:
container-title: Proceedings of Thirty Fourth Conference on Learning Theory
volume: 134
genre: inproceedings
issued:
  date-parts:
  - 2021
  - 7
  - 21