Safe Reinforcement Learning: the process of learning policies that maximize the expected return in problems where it is important to ensure reasonable system performance and/or respect safety constraints during learning and/or deployment.
Contributed by Chunyang Zhang.
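Most of the papers collected below formalize this setting as a constrained Markov decision process (CMDP). As a minimal LaTeX sketch of that objective (the cost signal c, discount factor γ, and budget d are generic placeholders, not the notation of any particular paper in this list):

```latex
% Safe RL as a CMDP: maximize expected discounted return while keeping
% the expected discounted cumulative cost below a budget d.
\max_{\pi}\; \mathbb{E}_{\tau \sim \pi}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big]
\quad \text{subject to} \quad
\mathbb{E}_{\tau \sim \pi}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\Big] \le d.
```

The Lagrangian/primal-dual, trust-region, shielding, and barrier-function entries below differ mainly in how they enforce this constraint during learning and deployment.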
-
A comprehensive survey on safe reinforcement learning. JMLR, 2015. paper
Javier García and Fernando Fernández.
-
Policy learning with constraints in model-free reinforcement learning: A survey. IJCAI, 2021. paper
Yongshuai Liu, Avishai Halev, and Xin Liu.
-
Safe learning in robotics: From learning-based control to safe reinforcement learning. Annual Review of Control, Robotics, and Autonomous Systems, 2022. paper
Lukas Brunke, Melissa Greeff, Adam W. Hall, Zhaocong Yuan, Siqi Zhou, Jacopo Panerati, and Angela P. Schoellig.
-
A review of safe reinforcement learning: Methods, theory and applications. arXiv, 2022. paper
Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, and Alois Knoll.
-
Deep reinforcement learning for autonomous driving: A survey. IEEE Transactions on Intelligent Transportation Systems, 2021. paper
B Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, and Patrick Pérez.
-
State-wise safe reinforcement learning: A survey. arXiv, 2023. paper
Weiye Zhao, Tairan He, Rui Chen, Tianhao Wei, and Changliu Liu.
-
Modeling risk in reinforcement learning: A literature mapping. arXiv, 2023. paper
Leonardo Villalobos-Arias, Derek Martin, Abhijeet Krishnan, Madeleine Gagné, Colin M. Potts, and Arnav Jhala.
-
Safe and robust reinforcement-learning: Principles and practice. arXiv, 2024. paper
Taku Yamagata and Raul Santos-Rodriguez.
-
Provably efficient reinforcement learning with linear function approximation. ICML, 2020. paper
Chi Jin, Zhuoran Yang, Zhaoran Wang, and Michael I Jordan.
-
Model-based safe deep reinforcement learning via a constrained proximal policy optimization algorithm. NIPS, 2022. paper
Ashish Kumar Jayant and Shalabh Bhatnagar.
-
Conservative and adaptive penalty for model-based safe reinforcement learning. AAAI, 2022. paper
Yecheng Jason Ma, Andrew Shen, Osbert Bastani, and Dinesh Jayaraman.
-
DOPE: Doubly optimistic and pessimistic exploration for safe reinforcement learning. NIPS, 2022. paper
Archana Bura, Aria Hasanzadezonuzy, Dileep Kalathil, Srinivas Shakkottai, and Jean-Francois Chamberland.
-
Risk sensitive model-based reinforcement learning using uncertainty guided planning. NIPS, 2021. paper
Garrett Thomas, Yuping Luo, and Tengyu Ma.
-
Safe reinforcement learning by imagining the near future. NIPS, 2021. paper
Stefan Radic Webster and Peter Flach.
-
Approximate model-based shielding for safe reinforcement learning. ECAI, 2023. paper
Alexander W. Goodall and Francesco Belardinelli.
-
Model-free safe control for zero-violation reinforcement learning. CoRL, 2022. paper
Weiye Zhao, Tairan He, and Changliu Liu.
-
More for less: Safe policy improvement with stronger performance guarantees. IJCAI, 2023. paper
Patrick Wienhöft, Marnix Suilen, Thiago D. Simão, Clemens Dubslaff, Christel Baier, and Nils Jansen.
-
Model-free safe reinforcement learning through neural barrier certificate. RAL, 2023. paper
Yujie Yang, Yuxuan Jiang, Yichen Liu, Jianyu Chen, and Shengbo Eben Li.
-
Provably efficient model-free constrained RL with linear function approximation. NIPS, 2022. paper
Arnob Ghosh, Xingyu Zhou, and Ness Shroff.
-
Provably efficient model-free algorithms for non-stationary CMDPs. arXiv, 2023. paper
Honghao Wei, Arnob Ghosh, Ness Shroff, Lei Ying, and Xingyu Zhou.
-
Anytime-competitive reinforcement learning with policy prior. NIPS, 2023. paper
Jianyi Yang, Pengfei Li, Tongxin Li, Adam Wierman, and Shaolei Ren.
-
Anytime-constrained reinforcement learning. arXiv, 2023. paper
Jeremy McMahan and Xiaojin Zhu.
-
Constrained policy optimization. ICML, 2017. paper
Joshua Achiam, David Held, Aviv Tamar, and Pieter Abbeel.
-
Reward constrained policy optimization. ICLR, 2019. paper
Chen Tessler, Daniel J. Mankowitz, and Shie Mannor.
-
Projection-based constrained policy optimization. ICLR, 2020. paper
Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, and Peter J. Ramadge.
-
CRPO: A new approach for safe reinforcement learning with convergence guarantee. ICML, 2021. paper
Tengyu Xu, Yingbin Liang, and Guanghui Lan.
-
When to update your model: Constrained model-based reinforcement learning. NIPS, 2022. paper
Tianying Ji, Yu Luo, Fuchun Sun, Mingxuan Jing, Fengxiang He, and Wenbing Huang.
-
Constraints penalized Q-learning for safe offline reinforcement learning. AAAI, 2022. paper
Haoran Xu, Xianyuan Zhan, and Xiangyu Zhu.
-
Exploring safer behaviors for deep reinforcement learning. AAAI, 2022. paper
Enrico Marchesini, Davide Corsi, and Alessandro Farinelli.
-
Constrained proximal policy optimization. arXiv, 2023. paper
Chengbin Xuan, Feng Zhang, Faliang Yin, and Hak-Keung Lam.
-
Safe policy improvement for POMDPs via finite-state controllers. AAAI, 2023. paper
Thiago D. Simão, Marnix Suilen, and Nils Jansen.
-
Policy regularization with dataset constraint for offline reinforcement learning. ICML, 2023. paper
Yuhang Ran, Yichen Li, Fuxiang Zhang, Zongzhang Zhang, and Yang Yu.
-
Constrained update projection approach to safe policy optimization. NIPS, 2022. paper
Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, and Gang Pan.
-
Constrained variational policy optimization for safe reinforcement learning. ICML, 2022. paper
Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Steven Wu, Bo Li, and Ding Zhao.
-
Towards safe reinforcement learning with a safety editor policy. NIPS, 2022. paper
Haonan Yu, Wei Xu, and Haichao Zhang.
-
CUP: A conservative update policy algorithm for safe reinforcement learning. arXiv, 2022. paper
Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, and Gang Pan.
-
State-wise constrained policy optimization. arXiv, 2023. paper
Weiye Zhao, Rui Chen, Yifan Sun, Tianhao Wei, and Changliu Liu.
-
Scalable safe policy improvement via Monte Carlo tree search. ICML, 2023. paper
Alberto Castellini, Federico Bianchi, Edoardo Zorzi, Thiago D. Simão, Alessandro Farinelli, and Matthijs T. J. Spaan.
-
Towards robust and safe reinforcement learning with Benign off-policy data. ICML, 2023. paper
Zuxin Liu, Zijian Guo, Zhepeng Cen, Huan Zhang, Yihang Yao, Hanjiang Hu, and Ding Zhao.
-
Constraint-conditioned policy optimization for versatile safe reinforcement learning. arXiv, 2023. paper
Yihang Yao, Zuxin Liu, Zhepeng Cen, Jiacheng Zhu, Wenhao Yu, Tingnan Zhang, and Ding Zhao.
-
Quantile constrained reinforcement learning: A reinforcement learning framework constraining outage probability. NIPS, 2022. paper
Whiyoung Jung, Myungsik Cho, Jongeui Park, and Youngchul Sung.
-
Reinforcement learning in a safety-embedded MDP with trajectory optimization. arXiv, 2023. paper
Fan Yang, Wenxuan Zhou, Zuxin Liu, Ding Zhao, and David Held.
-
Recursively-constrained partially observable Markov decision processes. arXiv, 2023. paper
Qi Heng Ho, Tyler Becker, Ben Kraske, Zakariya Laouar, Martin Feather, Federico Rossi, Morteza Lahijanian, and Zachary N. Sunberg.
-
Trust region-based safe distributional reinforcement learning for multiple constraints. NIPS, 2024. paper
Dohyeong Kim, Kyungjae Lee, and Songhwai Oh.
-
TRC: Trust region conditional value at risk for safe reinforcement learning. arXiv, 2023. paper
Dohyeong Kim and Songhwai Oh.
-
Transition constrained Bayesian optimization via Markov decision processes. arXiv, 2024. paper
Jose Pablo Folch, Calvin Tsay, Robert M Lee, Behrang Shafei, Weronika Ormaniec, Andreas Krause, Mark van der Wilk, Ruth Misener, and Mojmír Mutný.
-
Safety optimized reinforcement learning via multi-objective policy optimization. ICRA, 2024. paper
Homayoun Honari, Mehran Ghafarian Tamizi, and Homayoun Najjaran.
-
Double duality: Variational primal-dual policy optimization for constrained reinforcement learning. arXiv, 2024. paper
Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, and Mengdi Wang.
-
ACPO: A policy optimization algorithm for average MDPs with constraints. ICML, 2024. paper
Akhil Agnihotri, Rahul Jain, and Haipeng Luo.
-
Spectral-risk safe reinforcement learning with convergence guarantees. arXiv, 2024. paper
Dohyeong Kim, Taehyun Cho, Seungyub Han, Hojun Chung, Kyungjae Lee, and Songhwai Oh.
-
Enhancing efficiency of safe reinforcement learning via sample manipulation. arXiv, 2024. paper
Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, and Ming Jin.
-
Confident natural policy gradient for local planning in qπ-realizable constrained MDPs. arXiv, 2024. paper
Tian Tian, Lin F. Yang, and Csaba Szepesvári.
-
CVaR-constrained policy optimization for safe reinforcement learning. TNNLS, 2024. paper
Qiyuan Zhang, Shu Leng, Xiaoteng Ma, Qihan Liu, Xueqian Wang, Bin Liang, Yu Liu, and Jun Yang.
-
Safe reinforcement learning using finite-horizon gradient-based estimation. ICML, 2024. paper
Juntao Dai, Yaodong Yang, Qian Zheng, and Gang Pan.
-
Exterior penalty policy optimization with penalty metric network under constraints. IJCAI, 2024. paper
Shiqing Gao, Jiaxin Ding, Luoyi Fu, Xinbing Wang, and Chenghu Zhou.
-
Solving truly massive budgeted monotonic POMDPs with oracle-guided meta-reinforcement learning. arXiv, 2024. paper
Manav Vora, Michael N Grussing, and Melkior Ornik.
-
Absolute state-wise constrained policy optimization: High-probability state-wise constraints satisfaction. arXiv, 2024. paper
Weiye Zhao, Feihan Li, Yifan Sun, Yujie Wang, Rui Chen, Tianhao Wei, and Changliu Liu.
-
Probably anytime-safe stochastic combinatorial semi-bandits. ICML, 2023. paper
Yunlong Hou, Vincent Tan, and Zixin Zhong.
-
Lyapunov-based safe policy optimization for continuous control. ICML, 2019. paper
Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duéñez-Guzmán, and Mohammad Ghavamzadeh.
-
Lyapunov design for safe reinforcement learning. JMLR, 2002. paper
Theodore J. Perkins and Andrew G. Barto.
-
Value functions are control barrier functions: Verification of safe policies using control theory. arXiv, 2023. paper
Daniel C.H. Tan, Fernando Acero, Robert McCarthy, Dimitrios Kanoulas, and Zhibin Li.
-
Safe exploration in model-based reinforcement learning using control barrier functions. Automatica, 2023. paper
Max H. Cohen and Calin Belta.
-
Enforcing hard constraints with soft barriers: Safe reinforcement learning in unknown stochastic environments. ICML, 2023. paper
Yixuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, and Qi Zhu.
-
State-wise safe reinforcement learning with pixel observations. arXiv, 2023. paper
Simon Sinong Zhan, Yixuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, and Qi Zhu.
-
Safe and efficient reinforcement learning using disturbance-observer-based control barrier functions. ICML, 2023. paper
Yikun Cheng, Pan Zhao, and Naira Hovakimyan.
-
NLBAC: A neural ordinary differential equations-based framework for stable and safe reinforcement learning. arXiv, 2024. paper
Liqun Zhao, Keyan Miao, Konstantinos Gatsis, and Antonis Papachristodoulou.
-
Log barriers for safe black-box optimization with application to safe reinforcement learning. JMLR, 2024. paper
Ilnura Usmanova, Yarden As, Maryam Kamgarpour, and Andreas Krause.
-
WCSAC: Worst-case soft actor critic for safety-constrained reinforcement learning. AAAI, 2021. paper
Qisong Yang, Thiago D. Simao, Simon H. Tindemans, and Matthijs T. J. Spaan.
-
Finite time analysis of constrained actor critic and constrained natural actor critic algorithms. arXiv, 2023. paper
Prashansa Panda and Shalabh Bhatnagar.
-
DSAC-C: Constrained maximum entropy for robust discrete soft-actor critic. arXiv, 2023. paper
Dexter Neo and Tsuhan Chen.
-
SCPO: Safe reinforcement learning with safety critic policy optimization. arXiv, 2023. paper
Jaafar Mhamed and Shangding Gu.
-
Adversarially trained actor critic for offline CMDPs. arXiv, 2024. paper
Honghao Wei, Xiyue Peng, Xin Liu, and Arnob Ghosh.
-
Parameter-efficient tuning helps language model alignment. arXiv, 2023. paper
Tianci Xue, Ziqi Wang, and Heng Ji.
-
Confronting reward model overoptimization with constrained RLHF. arXiv, 2023. paper
Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, and Stephen McAleer.
-
Safe RLHF: Safe reinforcement learning from human feedback. ICLR, 2024. paper
Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, and Yaodong Yang.
-
Safe reinforcement learning with free-form natural language constraints and pre-trained language models. arXiv, 2024. paper
Xingzhou Lou, Junge Zhang, Ziyan Wang, Kaiqi Huang, and Yali Du.
-
Progressive safeguards for safe and model-agnostic reinforcement learning. arXiv, 2024. paper
Nabil Omi, Hosein Hasanbeig, Hiteshi Sharma, Sriram K. Rajamani, and Siddhartha Sen.
-
SaFormer: A conditional sequence modeling approach to offline safe reinforcement learning. arXiv, 2023. paper
Qin Zhang, Linrui Zhang, Haoran Xu, Li Shen, Bowen Wang, Yongzhe Chang, Xueqian Wang, Bo Yuan, and Dacheng Tao.
-
Model-free, regret-optimal best policy identification in online CMDPs. arXiv, 2023. paper
Zihan Zhou, Honghao Wei, and Lei Ying.
-
Scalable and efficient continual learning from demonstration via hypernetwork-generated stable dynamics model. arXiv, 2023. paper
Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, and Justus Piater.
-
SafeDreamer: Safe reinforcement learning with world models. ICLR, 2024. paper
Weidong Huang, Jiaming Ji, Borong Zhang, Chunhe Xia, and Yaodong Yang.
-
Dynamic model predictive shielding for provably safe reinforcement learning. arXiv, 2024. paper
Arko Banerjee, Kia Rahmani, Joydeep Biswas, and Isil Dillig.
-
Verified safe reinforcement learning for neural network dynamic models. arXiv, 2024. paper
Junlin Wu, Huan Zhang, and Yevgeniy Vorobeychik.
-
Evaluating model-free reinforcement learning toward safety-critical tasks. AAAI, 2023. paper
Linrui Zhang, Qin Zhang, Li Shen, Bo Yuan, Xueqian Wang, and Dacheng Tao.
-
Safe evaluation for offline learning: Are we ready to deploy? arXiv, 2022. paper
Hager Radi, Josiah P. Hanna, Peter Stone, and Matthew E. Taylor.
-
Safe offline reinforcement learning with real-time budget constraints. ICML, 2023. paper
Qian Lin, Bo Tang, Zifan Wu, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, and Dong Wang.
-
Efficient off-policy safe reinforcement learning using trust region conditional value at risk. arXiv, 2023. paper
Dohyeong Kim and Songhwai Oh.
-
Provable safe reinforcement learning with binary feedback. AISTATS, 2023. paper
Andrew Bennett, Dipendra Misra, and Nathan Kallus.
-
Long-term safe reinforcement learning with binary feedback. AAAI, 2024. paper
Akifumi Wachi, Wataru Hashimoto, and Kazumune Hashimoto.
-
Off-policy primal-dual safe reinforcement learning. ICLR, 2024. paper
Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, and Dong Wang.
-
OASIS: Conditional distribution shaping for offline safe reinforcement learning. arXiv, 2024. paper
Yihang Yao, Zhepeng Cen, Wenhao Ding, Haohong Lin, Shiqi Liu, Tingnan Zhang, Wenhao Yu, and Ding Zhao.
-
Learning-aware safety for interactive autonomy. arXiv, 2023. paper
Haimin Hu, Zixu Zhang, Kensuke Nakamura, Andrea Bajcsy, and Jaime F. Fisac.
-
Robust safe reinforcement learning under adversarial disturbances. arXiv, 2023. paper
Zeyang Li, Chuxiong Hu, Shengbo Eben Li, Jia Cheng, and Yunan Wang.
-
Inverse constrained reinforcement learning. ICML, 2021. paper
Shehryar Malik, Usman Anwar, Alireza Aghasi, and Ali Ahmed.
-
Benchmarking constraint inference in inverse reinforcement learning. ICLR, 2023. paper
Guiliang Liu, Yudong Luo, Ashish Gaurav, Kasra Rezaee, and Pascal Poupart.
-
Maximum causal entropy inverse constrained reinforcement learning. arXiv, 2023. paper
Mattijs Baert, Pietro Mazzaglia, Sam Leroux, and Pieter Simoens.
-
Identifiability and generalizability in constrained inverse reinforcement learning. ICML, 2023. paper
Andreas Schlaginhaufen and Maryam Kamgarpour.
-
FP-IRL: Fokker-Planck-based inverse reinforcement learning -- A physics-constrained approach to Markov decision processes. ICML, 2023. paper
Chengyang Huang, Siddhartha Srivastava, Xun Huan, and Krishna Garikipati.
-
On the robustness of safe reinforcement learning under observational perturbations. ICLR, 2023. paper
Zuxin Liu, Zijian Guo, Zhepeng Cen, Huan Zhang, Jie Tan, Bo Li, and Ding Zhao.
-
Detecting adversarial directions in deep reinforcement learning to make robust decisions. ICML, 2023. paper
Ezgi Korkmaz and Jonah Brown-Cohen.
-
Efficient trust region-based safe reinforcement learning with low-bias distributional actor-critic. arXiv, 2023. paper
Dohyeong Kim, Kyungjae Lee, and Songhwai Oh.
-
Don't do it: Safer reinforcement learning with rule-based guidance. arXiv, 2022. paper
Ekaterina Nikonova, Cheng Xue, and Jochen Renz.
-
Saute RL: Almost surely safe reinforcement learning using state augmentation. ICML, 2022. paper
Aivar Sootla, Alexander I Cowen-Rivers, Taher Jafferjee, Ziyan Wang, David H Mguni, Jun Wang, and Haitham Ammar.
-
Provably efficient exploration in constrained reinforcement learning: Posterior sampling is all you need. arXiv, 2023. paper
Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, and Maurits Kaptein.
-
Safe exploration in reinforcement learning: A generalized formulation and algorithms. NIPS, 2023. paper
Akifumi Wachi, Wataru Hashimoto, Xun Shen, and Kazumune Hashimoto.
-
Sample-efficient and safe deep reinforcement learning via reset deep ensemble agents. NIPS, 2023. paper
Woojun Kim, Yongjae Shin, Jongeui Park, and Youngchul Sung.
-
Imitate the good and avoid the bad: An incremental approach to safe reinforcement learning. AAAI, 2024. paper
Huy Hoang, Tien Mai, and Pradeep Varakantham.
-
GUARD: A safe reinforcement learning benchmark. arXiv, 2023. paper
Weiye Zhao, Rui Chen, Yifan Sun, Ruixuan Liu, Tianhao Wei, and Changliu Liu.
-
OmniSafe: An infrastructure for accelerating safe reinforcement learning research. arXiv, 2023. paper
Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, and Yaodong Yang.
-
Datasets and benchmarks for offline safe reinforcement learning. arXiv, 2023. paper
Zuxin Liu, Zijian Guo, Haohong Lin, Yihang Yao, Jiacheng Zhu, Zhepeng Cen, Hanjiang Hu, Wenhao Yu, Tingnan Zhang, Jie Tan, and Ding Zhao.
-
InterCode: Standardizing and benchmarking interactive coding with execution feedback. arXiv, 2023. paper
John Yang, Akshara Prabhakar, Karthik Narasimhan, and Shunyu Yao.
-
Safety-Gymnasium: A unified safe reinforcement learning benchmark. arXiv, 2023. paper
Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, and Yaodong Yang.
-
Controlgym: Large-scale safety-critical control environments for benchmarking reinforcement learning algorithms. arXiv, 2023. paper
Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, and Tamer Başar.
-
Near-optimal conservative exploration in reinforcement learning under episode-wise constraints. ICML, 2023. paper
Donghao Li, Ruiquan Huang, Cong Shen, and Jing Yang.
-
Near-optimal sample complexity bounds for constrained MDPs. NIPS, 2022. paper
Sharan Vaswani, Lin F. Yang, and Csaba Szepesvári.
-
Learning policies with zero or bounded constraint violation for constrained MDPs. NIPS, 2021. paper
Tao Liu, Ruida Zhou, Dileep Kalathil, Panganamala Kumar, and Chao Tian.
-
Provably learning Nash policies in constrained Markov potential games. arXiv, 2023. paper
Pragnya Alatur, Giorgia Ramponi, Niao He, and Andreas Krause.
-
Provably safe reinforcement learning: A theoretical and experimental comparison. arXiv, 2022. paper
Hanna Krasowski, Jakob Thumm, Marlon Müller, Lukas Schäfer, Xiao Wang, and Matthias Althoff.
-
Shielded reinforcement learning for hybrid systems. CoRL, 2023. paper
Asger Horn Brorholt, Peter Gjøl Jensen, Kim Guldstrand Larsen, Florian Lorber, and Christian Schilling.
-
Safe reinforcement learning in tensor reproducing kernel Hilbert space. arXiv, 2023. paper
Xiaoyuan Cheng, Boli Chen, Liz Varga, and Yukun Hu.
-
Joint chance-constrained Markov decision processes. Annals of Operations Research, 2022. paper
V Varagapriya, Vikas Vikram Singh, and Abdel Lisser.
-
Approximate solutions to constrained risk-sensitive Markov decision processes. European Journal of Operational Research, 2023. paper
Uday M Kumar, Sanjay P. Bhat, Veeraruna Kavitha, and Nandyala Hemachandra.
-
Nearly minimax optimal reinforcement learning for linear Markov decision processes. ICML, 2023. paper
Jiafan He, Heyang Zhao, Dongruo Zhou, and Quanquan Gu.
-
Truly no-regret learning in constrained MDPs. arXiv, 2024. paper
Adrian Müller, Pragnya Alatur, Volkan Cevher, Giorgia Ramponi, and Niao He.
-
Achieving Õ(1/ε) sample complexity for constrained Markov decision process. arXiv, 2024. paper
Jiashuo Jiang and Yinyu Ye.
-
Sampling-based safe reinforcement learning for nonlinear dynamical systems. arXiv, 2024. paper
Wesley A. Suttle, Vipul K. Sharma, Krishna C. Kosaraju, S. Sivaranjani, Ji Liu, Vijay Gupta, and Brian M. Sadler.
-
Learning adversarial MDPs with stochastic hard constraints. arXiv, 2024. paper
Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, and Nicola Gatti.
-
ConstrainedZero: Chance-constrained POMDP planning using learned probabilistic failure surrogates and adaptive safety constraints. IJCAI, 2024. paper
Robert J. Moss, Arec Jamgochian, Johannes Fischer, Anthony Corso, and Mykel J. Kochenderfer.
-
Efficient exploration in average-reward constrained reinforcement learning: Achieving near-optimal regret with posterior sampling. ICML, 2024. paper
Danil Provodin, Maurits Kaptein, and Mykola Pechenizkiy.
-
Achieving tractable minimax optimal regret in average reward MDPs. arXiv, 2024. paper
Victor Boone and Zihan Zhang.
-
Improved regret bound for safe reinforcement learning via tighter cost pessimism and reward optimism. arXiv, 2024. paper
Kihyun Yu, Duksang Lee, William Overman, and Dabeen Lee.
-
Redeeming intrinsic rewards via constrained optimization. AAAI, 2023. paper
Tairan He, Weiye Zhao, and Changliu Liu.
-
Anchor-changing regularized natural policy gradient for multi-objective reinforcement learning. NIPS, 2022. paper
Ruida Zhou, Tao Liu, Dileep Kalathil, P. R. Kumar, and Chao Tian.
-
ROSARL: Reward-only safe reinforcement learning. arXiv, 2023. paper
Geraud Nangue Tasse, Tamlin Love, Mark Nemecek, Steven James, and Benjamin Rosman.
-
Solving richly constrained reinforcement learning through state augmentation and reward penalties. arXiv, 2023. paper
Hao Jiang, Tien Mai, Pradeep Varakantham, and Minh Huy Hoang.
-
State augmented constrained reinforcement learning: Overcoming the limitations of learning with rewards. TAC, 2023. paper
Miguel Calvo-Fullana, Santiago Paternain, Luiz F. O. Chamon, and Alejandro Ribeiro.
-
Balance reward and safety optimization for safe reinforcement learning: A perspective of gradient manipulation. AAAI, 2024. paper
Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Ming Jin, and Alois Knoll.
-
AutoCost: Evolving intrinsic cost for zero-violation reinforcement learning. AAAI, 2023. paper
Tairan He, Weiye Zhao, and Changliu Liu.
-
Semi-infinitely constrained Markov decision processes and efficient reinforcement learning. NIPS, 2022. paper
Liangyu Zhang, Yang Peng, Wenhao Yang, and Zhihua Zhang.
-
Policy-based primal-dual methods for convex constrained Markov decision processes. AAAI, 2023. paper
Donghao Ying, Mengzi Amy Guo, Yuhao Ding, Javad Lavaei, and Zuo-Jun Max Shen.
-
Provably efficient primal-dual reinforcement learning for CMDPs with non-stationary objectives and constraints. AAAI, 2023. paper
Yuhao Ding and Javad Lavaei.
-
Achieving zero constraint violation for constrained reinforcement learning via primal-dual approach. AAAI, 2022. paper
Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, and Vaneet Aggarwal.
-
Probabilistic constraint for safety-critical reinforcement learning. arXiv, 2023. paper
Weiqin Chen, Dharmashankar Subramanian, and Santiago Paternain.
-
Last-iterate convergent policy gradient primal-dual methods for constrained MDPs. arXiv, 2023. paper
Dongsheng Ding, Chen-Yu Wei, Kaiqing Zhang, and Alejandro Ribeiro.
-
Safe reinforcement learning with dual robustness. arXiv, 2023. paper
Zeyang Li, Chuxiong Hu, Yunan Wang, Yujie Yang, and Shengbo Eben Li.
-
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm. AAAI, 2022. paper
Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, and Vaneet Aggarwal.
-
Distributionally safe reinforcement learning under model uncertainty: A single-level approach by differentiable convex programming. arXiv, 2023. paper
Alaa Eddine Chriat and Chuangchuang Sun.
-
A policy gradient primal-dual algorithm for constrained MDPs with uniform PAC guarantees. arXiv, 2024. paper
Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, and Yutaka Matsuo.
-
Adaptive primal-dual method for safe reinforcement learning. arXiv, 2024. paper
Weiqin Chen, James Onyejizu, Long Vu, Lan Hoang, Dharmashankar Subramanian, Koushik Kar, Sandipan Mishra, and Santiago Paternain.
-
Learning general parameterized policies for infinite horizon average reward constrained MDPs via primal-dual policy gradient algorithm. arXiv, 2024. paper
Qinbo Bai, Washim Uddin Mondal, and Vaneet Aggarwal.
-
POCE: Primal policy optimization with conservative estimation for multi-constraint offline reinforcement learning. CVPR, 2024. paper
Jiayi Guan, Li Shen, Ao Zhou, Lusong Li, Han Hu, Xiaodong He, Guang Chen, and Changjun Jiang.
-
Towards deployment-efficient reinforcement learning: Lower bound and optimality. ICLR, 2022. paper
Jiawei Huang, Jinglin Chen, Li Zhao, Tao Qin, Nan Jiang, and Tie-Yan Liu.
-
A CMDP-within-online framework for meta-safe reinforcement learning. ICLR, 2023. paper
Vanshaj Khattar, Yuhao Ding, Bilgehan Sel, Javad Lavaei, and Ming Jin.
-
Reinforcement learning by guided safe exploration. ECAI, 2023. paper
Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, and Matthijs T. J. Spaan.
-
Meta SAC-Lag: Towards deployable safe reinforcement learning via metagradient-based hyperparameter tuning. IROS, 2024. paper
Homayoun Honari, Amir Mehdi Soufi Enayati, Mehran Ghafarian Tamizi, and Homayoun Najjaran.
-
Trajectory generation, control, and safety with denoising diffusion probabilistic models. arXiv, 2023. paper
Nicolò Botteghi, Federico Califano, Mannes Poel, and Christoph Brune.
-
DiffCPS: Diffusion model based constrained policy search for offline reinforcement learning. arXiv, 2023. paper
Longxiang He, Linrui Zhang, Junbo Tan, and Xueqian Wang.
-
Feasibility-guided safe offline reinforcement learning. ICLR, 2024. paper
Longxiang He, Linrui Zhang, Junbo Tan, and Xueqian Wang.
-
Constrained decision Transformer for offline safe reinforcement learning. ICML, 2023. paper
Zuxin Liu, Zijian Guo, Yihang Yao, Zhepeng Cen, Wenhao Yu, Tingnan Zhang, and Ding Zhao.
-
TransDreamer: Reinforcement learning with Transformer world models. arXiv, 2022. paper
Chang Chen, Yi-Fu Wu, Jaesik Yoon, and Sungjin Ahn.
-
Temporal logic specification-conditioned decision Transformer for offline safe reinforcement learning. ICML, 2024. paper
Zijian Guo, Weichao Zhou, and Wenchao Li.
-
Policy learning for robust Markov decision process with a mismatched generative model. AAAI, 2022. paper
Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, and Jun Zhu.
-
Sim-to-Lab-to-Real: Safe reinforcement learning with shielding and generalization guarantees. Artificial Intelligence, 2023. paper
Kai-Chieh Hsu, Allen Z. Ren, Duy P. Nguyen, Anirudha Majumdar, and Jaime F. Fisac.
-
Responsive safety in reinforcement learning by PID Lagrangian methods. ICML, 2020. paper
Adam Stooke, Joshua Achiam, and Pieter Abbeel.
-
Robust Lagrangian and adversarial policy gradient for robust constrained Markov decision processes. arXiv, 2023. paper
David M. Bossens.
-
Safe reinforcement learning as Wasserstein variational inference: Formal methods for interpretability. arXiv, 2023. paper
Yanran Wang and David Boyle.
-
Gradient shaping for multi-constraint safe reinforcement learning. arXiv, 2023. paper
Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, and Ding Zhao.
-
Causal temporal reasoning for Markov decision processes. arXiv, 2022. paper
Milad Kazemi and Nicola Paoletti.
-
Can agents run relay race with strangers? Generalization of RL to out-of-distribution trajectories. ICLR, 2023. paper
Li-Cheng Lan, Huan Zhang, and Cho-Jui Hsieh.
-
Safe reinforcement learning via curriculum induction. NIPS, 2020. paper
Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, and Alekh Agarwal.
-
Concurrent learning of policy and unknown safety constraints in reinforcement learning. arXiv, 2024. paper
Lunet Yifru and Ali Baheri.
-
Experience replay for continual learning. NIPS, 2019. paper
David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, and Gregory Wayne.
-
Safe model-based multi-agent mean-field reinforcement learning. arXiv, 2023. paper
Matej Jusup, Barna Pásztor, Tadeusz Janik, Kenan Zhang, Francesco Corman, Andreas Krause, and Ilija Bogunovic.
-
Continual learning as computationally constrained reinforcement learning. arXiv, 2023. paper
Saurabh Kumar, Henrik Marklund, Ashish Rao, Yifan Zhu, Hong Jun Jeon, Yueyang Liu, and Benjamin Van Roy.
-
Safe reinforcement learning in constrained Markov decision processes. ICML, 2020. paper
Akifumi Wachi and Yanan Sui.
-
Reachability constrained reinforcement learning. ICML, 2022. paper
Dongjie Yu, Haitong Ma, Shengbo Li, and Jianyu Chen.
-
A near-optimal algorithm for safe reinforcement learning under instantaneous hard constraints. ICML, 2023. paper
Ming Shi, Yingbin Liang, and Ness Shroff.
-
Iterative reachability estimation for safe reinforcement learning. NIPS, 2023. paper
Milan Ganai, Zheng Gong, Chenning Yu, Sylvia Herbert, and Sicun Gao.
-
Risk-sensitive inhibitory control for safe reinforcement learning. ACC, 2023. paper
Armin Lederer, Erfaun Noorani, John S. Baras, and Sandra Hirche.
-
Progressive adaptive chance-constrained safeguards for reinforcement learning. arXiv, 2023. paper
Zhaorun Chen, Binhao Chen, Tairan He, Liang Gong, and Chengliang Liu.
-
Learn with imagination: Safe set guided state-wise constrained policy optimization. AAAI, 2024. paper
Weiye Zhao, Yifan Sun, Feihan Li, Rui Chen, Tianhao Wei, and Changliu Liu.
-
Safe reinforcement learning via shielding under partial observability. AAAI, 2023. paper
Steven Carr, Nils Jansen, Sebastian Junges, and Ufuk Topcu.
-
Safe reinforcement learning with learned non-Markovian safety constraints. arXiv, 2024. paper
Siow Meng Low and Akshat Kumar.
-
Feasibility consistent representation learning for safe reinforcement learning. ICML, 2024. paper
Zhepeng Cen, Yihang Yao, Zuxin Liu, and Ding Zhao.
-
Safe reinforcement learning in black-box environments via adaptive shielding. arXiv, 2024. paper
Daniel Bethell, Simos Gerasimou, Radu Calinescu, and Calum Imrie.
-
Realizable continuous-space shields for safe reinforcement learning. arXiv, 2024. paper
Kyungmin Kim, Davide Corsi, Andoni Rodriguez, JB Lanier, Benjami Parellada, Pierre Baldi, Cesar Sanchez, and Roy Fox.
-
Safe reinforcement learning from pixels using a stochastic latent representation. ICLR, 2023. paper
Yannick Hogewind, Thiago D. Simao, Tal Kachman, and Nils Jansen.
-
Coaching a teachable student. CVPR, 2023. paper
Jimuyang Zhang, Zanming Huang, and Eshed Ohn-Bar.
-
Provably efficient generalized Lagrangian policy optimization for safe multi-agent reinforcement learning. JMLR, 2023. paper
Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, and Mihailo R. Jovanovic.
-
Learning adaptive safety for multi-agent systems. arXiv, 2023. paper
Luigi Berducci, Shuo Yang, Rahul Mangharam, and Radu Grosu.
-
Safe multi-agent reinforcement learning with natural language constraints. arXiv, 2024. paper
Ziyan Wang, Meng Fang, Tristan Tomilin, Fei Fang, and Yali Du.
-
Learning shared safety constraints from multi-task demonstrations. arXiv, 2023. paper
Konwoo Kim, Gokul Swamy, Zuxin Liu, Ding Zhao, Sanjiban Choudhury, and Zhiwei Steven Wu.
-
Safe and balanced: A framework for constrained multi-objective reinforcement learning. arXiv, 2024. paper
Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Alois Knoll, and Ming Jin.
-
GenSafe: A generalizable safety enhancer for safe reinforcement learning algorithms based on reduced order Markov decision process model. arXiv, 2024. paper
Zhehua Zhou, Xuan Xie, Jiayang Song, Zhan Shu, and Lei Ma.
-
Langevin policy for safe reinforcement learning. ICML, 2024. paper
Fenghao Lei, Long Yang, Shiting Wen, Zhixiong Huang, Zhiwang Zhang, and Chaoyi Pang.
-
Near-optimal policy identification in robust constrained Markov decision processes via epigraph form. arXiv, 2024. paper
Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai, Kenta Hoshino, Yohei Hosoe, Kazumi Kasaura, Masashi Hamaya, Paavo Parmas, and Yutaka Matsuo.
-
Dense reinforcement learning for safety validation of autonomous vehicles. Nature, 2023. paper
Shuo Feng, Haowei Sun, Xintao Yan, Haojie Zhu, Zhengxia Zou, Shengyin Shen, and Henry X. Liu.
-
Enhancing system-level safety in mixed-autonomy platoon via safe reinforcement learning. arXiv, 2024. paper
Jingyuan Zhou, Longhao Yan, and Kaidi Yang.
-
Multi-constraint safe RL with objective suppression for safety-critical applications. arXiv, 2024. paper
Zihan Zhou, Jonathan Booher, Wei Liu, Aleksandr Petiushko, and Animesh Garg.
-
Do no harm: A counterfactual approach to safe reinforcement learning. arXiv, 2024. paper
Sean Vaskov, Wilko Schwarting, and Chris L. Baker.
-
Safe multi-agent reinforcement learning with bilevel optimization in autonomous driving. arXiv, 2024. paper
Zhi Zheng and Shangding Gu.
-
Online 3D bin packing with constrained deep reinforcement learning. AAAI, 2021. paper
Hang Zhao, Qijin She, Chenyang Zhu, Yin Yang, and Kai Xu.
-
Spatiotemporally constrained action space attacks on deep reinforcement learning agents. AAAI, 2020. paper
Xian Yeow Lee, Sambit Ghadai, Kai Liang Tan, Chinmay Hegde, and Soumik Sarkar.
-
Evaluation of constrained reinforcement learning algorithms for legged locomotion. arXiv, 2023. paper
Joonho Lee, Lukas Schroth, Victor Klemm, Marko Bjelonic, Alexander Reske, and Marco Hutter.
-
Learning safe control for multi-robot systems: Methods, verification, and open challenges. arXiv, 2023. paper
Kunal Garg, Songyuan Zhang, Oswin So, Charles Dawson, and Chuchu Fan.
-
Constrained reinforcement learning for dexterous manipulation. IJCAI, 2022. paper
Abhineet Jain, Jack Kolb, and Harish Ravichandar.
-
Safe multi-agent reinforcement learning for formation control without individual reference targets. arXiv, 2023. paper
Murad Dawood, Sicong Pan, Nils Dengler, Siqi Zhou, Angela P. Schoellig, and Maren Bennewitz.
-
Safe reinforcement learning in uncertain contexts. TRO, 2024. paper
Dominik Baumann and Thomas B. Schön.
-
Offline goal-conditioned reinforcement learning for safety-critical tasks with recovery policy. ICRA, 2024. paper
Chenyang Cao, Zichen Yan, Renhao Lu, Junbo Tan, and Xueqian Wang.
-
Safe reinforcement learning on the constraint manifold: Theory and applications. arXiv, 2024. paper
Puze Liu, Haitham Bou-Ammar, Jan Peters, and Davide Tateo.
-
SRL-VIC: A variable stiffness-based safe reinforcement learning for contact-rich robotic tasks. RA-L, 2024. paper
Heng Zhang, Gokhan Solak, Gustavo J. G. Lahr, and Arash Ajoudani.
-
Safe reinforcement learning of robot trajectories in the presence of moving obstacles. RA-L, 2024. paper
Jonas Kiemel, Ludovic Righetti, Torsten Kröger, and Tamim Asfour.
-
District cooling system control for providing operating reserve based on safe deep reinforcement learning. TPS, 2023. paper
Peipei Yu, Hongcai Zhang, Yonghua Song, Hongxun Hui, and Ge Chen.
-
Safe reinforcement learning for power system control: A review. arXiv, 2024. paper
Peipei Yu, Zhenyi Wang, Hongcai Zhang, and Yonghua Song.
-
A review of safe reinforcement learning methods for modern power systems. arXiv, 2024. paper
Tong Su, Tong Wu, Junbo Zhao, Anna Scaglione, and Le Xie.
-
Offline inverse constrained reinforcement learning for safe-critical decision making in healthcare. arXiv, 2024. paper
Nan Fang, Guiliang Liu, and Wei Gong.