fix config name

huggingface · Feb 7, 2025 · b581598
1 parent 250ab46
commit b581598

Showing 3 changed files with 3 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -112,7 +112,7 @@ Here `{model}` and `{dataset}` refer to the model and dataset IDs on the Hugging
 To train via the GRPO trainer, we use one GPU to run vLLM for faster generation and the remaining GPUs for training. For example, one a node with 8 GPUs, use the `recipes/accelerate_configs/zero3.yaml` config and then overwrite `num_processes` to run on 7 devices:
 
 ```shell
-ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml --num_processes=7 src/open_r1/grpo.py --config recipes/qwen/Qwen2.5-1.5B-Instruct/grpo/confg_full.yaml
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml --num_processes=7 src/open_r1/grpo.py --config recipes/qwen/Qwen2.5-1.5B-Instruct/grpo/config_full.yaml
 ```
 
 We provide a minimal reproducible experiment using GRPO for mathematical reasoning, referencing the approach from [SimpleRL-Reason](https://hkust-nlp.notion.site/simplerl-reason) which uses a 7B model trained on 8K examples. Running this on 8 H100 80G GPU takes about 3 hours:

diff --git a/...wen2.5-1.5B-Instruct/grpo/confg_full.yaml → ...en2.5-1.5B-Instruct/grpo/config_full.yaml b/...wen2.5-1.5B-Instruct/grpo/confg_full.yaml → ...en2.5-1.5B-Instruct/grpo/config_full.yaml
@@ -35,7 +35,7 @@ max_steps: -1
 num_train_epochs: 1
 output_dir: data/Qwen2.5-1.5B-Open-R1-GRPO
 overwrite_output_dir: true
-per_device_eval_batch_size: 4   
+per_device_eval_batch_size: 4
 per_device_train_batch_size: 1
 push_to_hub: true
 report_to:

diff --git a/recipes/qwen/README.md b/recipes/qwen/README.md
@@ -20,5 +20,5 @@ You can find the configuration files for different model sizes in this folder an
 ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml src/open_r1/sft.py --config recipes/qwen/Qwen2.5-1.5B-Instruct/sft/config_full.yaml
 
 # GRPO
-ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml src/open_r1/grpo.py --config recipes/qwen/Qwen2.5-1.5B-Instruct/grpo/confg_full.yaml
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml src/open_r1/grpo.py --config recipes/qwen/Qwen2.5-1.5B-Instruct/grpo/config_full.yaml
 ```