title

abstract

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_author

author

date

address

container-title

volume

genre

issued

pdf

extras

Optimal Thompson Sampling strategies for support-aware CVaR bandits

In this paper we study a multi-arm bandit problem in which the quality of each arm is measured by the Conditional Value at Risk (CVaR) at some level alpha of the reward distribution. While existing works in this setting mainly focus on Upper Confidence Bound algorithms, we introduce a new Thompson Sampling approach for CVaR bandits on bounded rewards that is flexible enough to solve a variety of problems grounded on physical resources. Building on a recent work by Riou & Honda (2020), we introduce B-CVTS for continuous bounded rewards and M-CVTS for multinomial distributions. On the theoretical side, we provide a non-trivial extension of their analysis that enables to theoretically bound their CVaR regret minimization performance. Strikingly, our results show that these strategies are the first to provably achieve asymptotic optimality in CVaR bandits, matching the corresponding asymptotic lower bounds for this setting. Further, we illustrate empirically the benefit of Thompson Sampling approaches both in a realistic environment simulating a use-case in agriculture and on various synthetic examples.

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

baudry21a

0

Optimal Thompson Sampling strategies for support-aware CVaR bandits

716

726

716-726

716

false

Baudry, Dorian and Gautron, Romain and Kaufmann, Emilie and Maillard, Odalric

given	family
Dorian	Baudry

given	family
Romain	Gautron

given	family
Emilie	Kaufmann

given	family
Odalric	Maillard

2021-07-01

Proceedings of the 38th International Conference on Machine Learning

139

inproceedings

date-parts

2021

7

1

http://proceedings.mlr.press/v139/baudry21a/baudry21a.pdf

label	link
Supplementary PDF	http://proceedings.mlr.press/v139/baudry21a/baudry21a-supp.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2021-07-01-baudry21a.md

2021-07-01-baudry21a.md

Files

2021-07-01-baudry21a.md

Latest commit

History

2021-07-01-baudry21a.md

File metadata and controls