An on-policy MARL algorithm for the highway on-ramp merging problem, featuring parameter sharing, action masking, local reward design, and a priority-based safety supervisor.
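To illustrate the idea of action masking, here is a minimal sketch in plain Python. It is not taken from this repository: the function `masked_softmax`, the action names, and the example logits are all hypothetical. It only shows the general technique of giving invalid actions zero probability before sampling.

```python
import math


def masked_softmax(logits, valid_mask):
    """Return action probabilities with invalid actions forced to zero."""
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Replace the logits of invalid actions with -inf so exp() yields 0.0.
    masked = [l if ok else float("-inf") for l, ok in zip(logits, valid_mask)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical discrete action space (names are illustrative only).
ACTIONS = ["lane_left", "idle", "lane_right", "faster", "slower"]
logits = [0.2, 1.0, -0.5, 0.7, 0.1]
# Assume the vehicle is already in the leftmost lane: "lane_left" is invalid.
valid = [False, True, True, True, True]

probs = masked_softmax(logits, valid)
for name, p in zip(ACTIONS, probs):
    print(f"{name}: {p:.3f}")
```

Masking the logits rather than the sampled action keeps the probabilities normalized over the valid actions only, so the policy never assigns probability mass to a masked action.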
All MARL algorithms are extended from their single-agent RL counterparts, with parameters shared among agents.
- MAA2C: TBD.
- MAPPO.
- MAACKTR.
- MADQN: Does not work well.
- MASAC: TBD.
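Parameter sharing among agents can be sketched as follows. This is a simplified illustration in plain Python, not the repository's implementation: the class `SharedPolicy`, the observation size, and the agent names are assumptions. The point is that every agent queries the same set of weights with its own local observation.

```python
import random


class SharedPolicy:
    """A tiny linear policy whose weights are shared by all agents."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # One weight row per action; this single table serves every agent.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
            for _ in range(n_actions)
        ]

    def scores(self, obs):
        """Return one score per action for a local observation."""
        return [sum(w * x for w, x in zip(row, obs)) for row in self.weights]

    def act(self, obs):
        """Pick the action with the highest score (greedy, for illustration)."""
        scores = self.scores(obs)
        return scores.index(max(scores))


policy = SharedPolicy(obs_dim=3, n_actions=2)

# Hypothetical local observations; each agent uses the same policy object.
local_obs = {"agent_0": [0.5, -1.0, 0.2], "agent_1": [1.5, 0.3, -0.7]}
actions = {name: policy.act(obs) for name, obs in local_obs.items()}
print(actions)
```

Because there is only one set of weights, experience collected by any agent updates the policy used by all of them, and the number of parameters does not grow with the number of agents.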
- create a Python virtual environment:
conda create -n marl_cav python=3.6 -y
- activate the virtual environment:
conda activate marl_cav
- install PyTorch (torch>=1.2.0):
pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
- install the requirements:
pip install -r requirements.txt
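After installation, a quick check like the one below can confirm that the installed PyTorch satisfies the torch>=1.2.0 requirement. The helper `meets_minimum` is a hypothetical name introduced for this sketch, and it only compares the first three numeric components of the version string.

```python
import re


def meets_minimum(version, minimum=(1, 2, 0)):
    """Return True if a version string such as '1.7.0+cu110' is >= minimum."""
    # Drop any local build suffix (the part after '+'), then read the numbers.
    numbers = re.findall(r"\d+", str(version).split("+")[0])
    parsed = tuple(int(n) for n in numbers[:3])
    # Pad short versions such as "1.2" to three components before comparing.
    parsed = parsed + (0,) * (3 - len(parsed))
    return parsed >= minimum


try:
    import torch

    print("torch", torch.__version__, "meets minimum:",
          meets_minimum(torch.__version__))
except ImportError:
    print("torch is not installed in this environment")
```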
Fig.1 Illustration of the considered on-ramp merging traffic scenario. CAVs (blue) and HDVs (green) coexist on both ramp and through lanes.
To run the code, execute python run_xxx.py. The config files contain the parameters for the MARL policies.
Fig.2 Performance comparison between the proposed method and 3 state-of-the-art MARL algorithms.
To reproduce the results, train each algorithm with three random seeds: 0, 2000, and 2021. For example, set both torch_seed and seed to 0 to run seed 0. To plot the comparison curves, run: python common/plot_benchmark_safety.py
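Setting the two seed options for each run can be scripted. The sketch below assumes the config files are INI-style and readable with Python's `configparser`; the section names `MODEL_CONFIG` and `ENV_CONFIG`, the helper `make_seeded_config`, and the minimal config text are assumptions for illustration, so adapt them to the actual config file.

```python
import configparser

SEEDS = [0, 2000, 2021]  # the three seeds used for the comparison

# Hypothetical minimal config; a real config file has many more options.
BASE = """
[MODEL_CONFIG]
torch_seed = 0

[ENV_CONFIG]
seed = 0
"""


def make_seeded_config(base_text, seed):
    """Return a ConfigParser built from base_text with both seeds set."""
    config = configparser.ConfigParser()
    config.read_string(base_text)
    # Section placement of the two options is an assumption of this sketch.
    config["MODEL_CONFIG"]["torch_seed"] = str(seed)
    config["ENV_CONFIG"]["seed"] = str(seed)
    return config


configs = {seed: make_seeded_config(BASE, seed) for seed in SEEDS}
for seed, config in configs.items():
    print(seed,
          config["MODEL_CONFIG"]["torch_seed"],
          config["ENV_CONFIG"]["seed"])
```

Each resulting config can then be written out with `config.write(file_handle)` and used for one training run, keeping torch_seed and seed identical within a run.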
@misc{chen2021deep,
title={Deep Multi-agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic},
author={Dong Chen and Zhaojian Li and Yongqiang Wang and Longsheng Jiang and Yue Wang},
year={2021},
eprint={2105.05701},
archivePrefix={arXiv},
primaryClass={eess.SY}
}