title | abstract | layout | series | publisher | issn | id | month | tex_title | firstpage | lastpage | page | order | cycles | bibtex_author | author | date | address | container-title | volume | genre | issued | extras |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Sample Complexity Characterization for Linear Contextual MDPs | Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time, with different MDPs indexed by a context variable. While CMDPs serve as an important framework for modeling many real-world applications with time-varying environments, they are largely unexplored from a theoretical perspective. In this paper, we study CMDPs under two linear function approximation models: Model I, with context-varying representations and common linear weights for all contexts; and Model II, with common representations for all contexts and context-varying linear weights. For both models, we propose novel model-based algorithms and show that they enjoy guaranteed | inproceedings | Proceedings of Machine Learning Research | PMLR | 2640-3498 | deng24a | 0 | Sample Complexity Characterization for Linear Contextual {MDP}s | 1693 | 1701 | 1693-1701 | 1693 | false | Deng, Junze and Cheng, Yuan and Zou, Shaofeng and Liang, Yingbin | | 2024-04-18 | | Proceedings of The 27th International Conference on Artificial Intelligence and Statistics | 238 | inproceedings | | |