| Field | Value |
|---|---|
| title | Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards |
| section | Poster |
| openreview | i84V7i6KEMd |
| abstract | Preference-based reinforcement learning (PbRL) aligns a robot behavior with human preferences via a reward function learned from binary feedback over agent behaviors. We show that encoding environment dynamics in the reward function improves the sample efficiency of PbRL by an order of magnitude. In our experiments we iterate between: (1) encoding environment dynamics in a state-action representation |
| layout | inproceedings |
| series | Proceedings of Machine Learning Research |
| publisher | PMLR |
| issn | 2640-3498 |
| id | metcalf23a |
| month | 0 |
| tex_title | Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards |
| firstpage | 1484 |
| lastpage | 1532 |
| page | 1484-1532 |
| order | 1484 |
| cycles | false |
| bibtex_author | Metcalf, Katherine and Sarabia, Miguel and Mackraz, Natalie and Theobald, Barry-John |
| author | |
| date | 2023-12-02 |
| address | |
| container-title | Proceedings of The 7th Conference on Robot Learning |
| volume | 229 |
| genre | inproceedings |
| issued | |
| extras | |