This is an improved version of IQ-Learn, originally proposed at NeurIPS 2021.
Our modifications include:
- ✅ Added KL divergence and reward-based baselines
- ✅ Extended support for Gym Atari and MuJoCo environments
- ✅ Optimized training pipeline for better stability
📄 IQ-Learn: Inverse Soft-Q Learning for Imitation
➡️ arXiv Link
IQ-Learn is a state-of-the-art imitation learning framework that directly learns soft Q-functions from expert data. Unlike traditional adversarial approaches (e.g., GAIL, AIRL), IQ-Learn provides a simple, stable, and data-efficient alternative for both offline and online imitation learning.
1️⃣ Introduced KL divergence and reward-based baselines to improve performance.
2️⃣ Adapted the method for Gym Atari and MuJoCo environments.
3️⃣ Optimized the training pipeline for better efficiency and generalization.
✔️ Drop-in replacement for Behavior Cloning
✔️ Non-adversarial online imitation learning (successor to GAIL & AIRL)
✔️ Performs well with very sparse expert data
✔️ Scales to complex environments (Atari, MuJoCo)
✔️ Can recover the environment's reward function from the learned soft Q-function
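The reward-recovery feature follows from the inverse soft Bellman relation used in the IQ-Learn paper, r(s, a) = Q(s, a) - γ E[V(s')], with V(s) = α log Σ_a exp(Q(s, a) / α) for discrete actions. The snippet below is a minimal NumPy sketch of that relation, not the code in this repository. The function names `soft_value` and `recover_reward`, the single-sample estimate of the expectation over s', and the choice to zero out V(s') on terminal transitions are all assumptions made for illustration.

```python
import numpy as np


def soft_value(q_values, alpha=1.0):
    """Soft value V(s) = alpha * logsumexp(Q(s, .) / alpha), computed stably."""
    q = np.asarray(q_values, dtype=np.float64) / alpha
    q_max = q.max(axis=-1, keepdims=True)
    return alpha * (q_max.squeeze(-1) + np.log(np.exp(q - q_max).sum(axis=-1)))


def recover_reward(q_sa, q_next, done, gamma=0.99, alpha=1.0):
    """Single-sample estimate of r(s, a) = Q(s, a) - gamma * V(s').

    q_sa:   shape (batch,), Q-values of the actions actually taken
    q_next: shape (batch, num_actions), Q-values at the next state
    done:   shape (batch,), 1 where the transition ended the episode
    Assumption: V(s') is treated as 0 on terminal transitions.
    """
    v_next = soft_value(q_next, alpha)
    not_done = 1.0 - np.asarray(done, dtype=np.float64)
    return np.asarray(q_sa, dtype=np.float64) - gamma * not_done * v_next


if __name__ == "__main__":
    rewards = recover_reward(
        q_sa=[1.0, 1.0],
        q_next=[[0.0, 0.0], [0.0, 0.0]],
        done=[0, 1],
        gamma=0.5,
    )
    print(rewards)
```

For continuous-action environments such as MuJoCo, V(s') is usually estimated from sampled policy actions instead of a log-sum-exp over a finite action set, so this sketch applies to the discrete (Atari-style) case only.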
Please refer to the iq_learn directory for installation and usage instructions.
To keep Weights & Biases logging offline, set `WANDB_MODE` before training (PowerShell syntax):
$env:WANDB_MODE = "offline"
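The `WANDB_MODE` setting above uses PowerShell syntax. If you launch training from another shell or from a Python entry point, the same environment variable can be set in Python before `wandb` is initialized. The helper name `enable_wandb_offline` is hypothetical and is not part of this repository.

```python
import os


def enable_wandb_offline():
    """Set WANDB_MODE=offline for the current process and return the value.

    Call this before wandb.init() so the library picks up the setting.
    """
    os.environ["WANDB_MODE"] = "offline"
    return os.environ["WANDB_MODE"]


if __name__ == "__main__":
    print(enable_wandb_offline())
```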
We provide a utility script, convert_transitions.py, to convert expert trajectories into the format required by IQ-Learn.
This is useful when you have custom environments or datasets and want to apply IQ-Learn directly.
python convert_transitions.py --env_name
Make sure your expert data includes state, action, next_state, reward, and done fields.
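Before running the conversion, it can help to check that every transition carries the five required fields. The sketch below assumes the expert data is a list of dictionaries, one per transition. That layout, the `check_transitions` helper, and the state/next_state shape check are illustrative assumptions, since the input format expected by convert_transitions.py is not specified here.

```python
import numpy as np

# Field names taken from the requirement stated above.
REQUIRED_FIELDS = ("state", "action", "next_state", "reward", "done")


def check_transitions(transitions):
    """Validate a list of transition dicts and return how many were checked.

    Raises ValueError on the first transition that lacks a required field
    or whose state and next_state shapes differ (an assumed sanity check).
    """
    for i, t in enumerate(transitions):
        missing = [k for k in REQUIRED_FIELDS if k not in t]
        if missing:
            raise ValueError(f"transition {i} is missing fields: {missing}")
        if np.shape(t["state"]) != np.shape(t["next_state"]):
            raise ValueError(f"transition {i}: state/next_state shape mismatch")
    return len(transitions)


if __name__ == "__main__":
    demo = [
        {"state": [0.0, 1.0], "action": 1, "next_state": [1.0, 0.0],
         "reward": 0.5, "done": False},
    ]
    print(check_transitions(demo))
```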
IQ-Learn achieving human-level imitation in various Atari games:
If you use this code, please cite the original IQ-Learn paper:
@inproceedings{garg2021iqlearn,
title={IQ-Learn: Inverse soft-Q Learning for Imitation},
author={Divyansh Garg and Shuvam Chakraborty and Chris Cundy and Jiaming Song and Stefano Ermon},
booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
year={2021},
url={https://openreview.net/forum?id=Aeo-xqtb5p}
}
For any questions or discussions, feel free to open an issue or reach out! 🚀