2024-10-09-shala24a.md

title: "HPO-RL-Bench: A Zero-Cost Benchmark for HPO in Reinforcement Learning"
openreview: MlB61zPAeR
abstract: "Despite the undeniable importance of optimizing the hyperparameters of RL algorithms, existing state-of-the-art Hyperparameter Optimization (HPO) techniques are not frequently utilized by RL researchers. To catalyze HPO research in RL, we present a new large-scale benchmark that includes pre-computed reward curve evaluations of hyperparameter configurations for six established RL algorithms (PPO, DDPG, A2C, SAC, TD3, DQN) on 22 environments (Atari, Mujoco, Control), repeated for multiple seeds. We exhaustively computed the reward curves of all possible combinations of hyperparameters for the considered hyperparameter spaces for each RL algorithm in each environment. As a result, our benchmark permits zero-cost experiments for deploying and comparing new HPO methods. In addition, the benchmark offers a set of integrated HPO methods, enabling plug-and-play tuning of the hyperparameters of new RL algorithms, while pre-computed evaluations allow a zero-cost comparison of a new RL algorithm against the tuned RL baselines in our benchmark."
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: shala24a
month: 0
tex_title: "HPO-RL-Bench: A Zero-Cost Benchmark for HPO in Reinforcement Learning"
firstpage: 18/1
lastpage: 31
page: 18/1-31
order: 18
cycles: false
bibtex_author: Shala, Gresa and Arango, Sebastian Pineda and Biedenkapp, Andr\'e and Hutter, Frank and Grabocka, Josif
author:
- given: Gresa
  family: Shala
- given: Sebastian Pineda
  family: Arango
- given: André
  family: Biedenkapp
- given: Frank
  family: Hutter
- given: Josif
  family: Grabocka
date: 2024-10-09
address:
container-title: Proceedings of the Third International Conference on Automated Machine Learning
volume: '256'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 10
  - 9
pdf:
extras: