Reproducing DeepSeek R1 Zero with Qwen2.5-0.5B on two 4090 GPUs
Create a virtual environment and install dependencies with
pip install -r requirements.txt
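The full setup step might look like the sketch below. It assumes python3 with the standard venv module is available; the directory name .venv is an arbitrary choice and is not prescribed by this repository.

```shell
# Create an isolated environment in .venv (directory name is an assumption).
python3 -m venv .venv
# Activate it for the current shell session.
. .venv/bin/activate
# Install the project's dependencies into the environment.
pip install -r requirements.txt
```

The activation only lasts for the current shell, so repeat it in every new terminal used for the commands that follow.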
For a two-GPU setup, start the vLLM server first on the second GPU (device 1).
CUDA_VISIBLE_DEVICES=1 trl vllm-serve --model Qwen/Qwen2.5-0.5B
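The server can take a while to load the model, so it may help to wait until it responds before launching training. The helper below is a minimal sketch: wait_for_url is a hypothetical function name, and the URL to poll (host, port, and health path) depends on your TRL version and server flags, so treat any specific address as an assumption.

```shell
# Poll a URL until it returns an HTTP success status or the timeout expires.
# Usage: wait_for_url URL [TIMEOUT_SECONDS]   (default timeout: 300)
# Returns 0 once the server responds, 1 on timeout.
wait_for_url() {
  url=$1
  timeout=${2:-300}
  elapsed=0
  # --fail makes curl exit non-zero on HTTP errors as well as connection errors.
  until curl --silent --fail --max-time 5 --output /dev/null "$url"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Example call (address is an assumption, check your server's startup log):
# wait_for_url http://localhost:8000/health/ 600
```

The elapsed counter only counts the one-second sleeps, so the real wait can run somewhat longer than the timeout when individual requests are slow.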
Then, in a separate terminal, run training with
accelerate launch --config_file configs/deepspeed/zero3.yaml --num_processes 1 train.py
For evaluation, follow the instructions in eval/README.md.