When I train Qwen2.5-1.5B-Instruct, my training dataset has 324,892 examples, but the trainer reports Num examples = 10,797.

Tokenizing train dataset: 100%|██████████| 324892/324892 [02:39<00:00, 2043.00 examples/s]
Packing train dataset: 100%|██████████| 324892/324892 [01:23<00:00, 3885.00 examples/s]

Config:
model_name_or_path: Qwen2.5-1.5B-Instruct
model_revision: main
torch_dtype: bfloat16
attn_implementation: flash_attention_2

# Data training arguments
dataset_name: /data/open-r1/dataset/conversations.jsonl
dataset_configs:
- all
preprocessing_num_workers: 8

# SFT trainer config
bf16: true
do_eval: false
eval_strategy: "no"
eval_steps: 100
gradient_accumulation_steps: 1
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
hub_model_id: Qwen2.5-1.5B-Open-R1-Distill
hub_strategy: every_save
learning_rate: 2.0e-04
log_level: info
logging_steps: 5
logging_strategy: steps
lr_scheduler_type: cosine_with_min_lr
lr_scheduler_kwargs:
  min_lr_rate: 0.1
packing: true
max_seq_length: 4096
max_steps: -1
num_train_epochs: 1
output_dir: /data/open-r1/output/Qwen2.5-1.5B-Open-R1
overwrite_output_dir: true
per_device_eval_batch_size: 1
per_device_train_batch_size: 1
push_to_hub: false
report_to:
- none
save_strategy: "epoch"
save_steps: 100
save_total_limit: 5
seed: 42
warmup_ratio: 0.1
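For context, open-r1's SFT script builds a TRL `SFTTrainer` from a config like this. A minimal standalone sketch of the packing-relevant part is below; the exact argument names (e.g. `max_seq_length` vs. `max_length`) vary across `trl` versions, and the model/dataset paths are simply copied from the config above, so treat this as illustrative rather than the project's actual entry point:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Raw conversations (same file as dataset_name in the config above).
dataset = load_dataset(
    "json",
    data_files="/data/open-r1/dataset/conversations.jsonl",
    split="train",
)

# Only the packing-related options are shown here.
config = SFTConfig(
    output_dir="/data/open-r1/output/Qwen2.5-1.5B-Open-R1",
    packing=True,          # concatenate examples into fixed-length blocks
    max_seq_length=4096,   # block size used for packing (name may differ by trl version)
    per_device_train_batch_size=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen2.5-1.5B-Instruct",
    args=config,
    train_dataset=dataset,
)

print(len(dataset))                # 324892 raw conversations
print(len(trainer.train_dataset))  # far fewer packed 4096-token sequences
```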
I found the cause: packing: true. With packing enabled, the tokenized examples are concatenated into fixed-length sequences of max_seq_length (4096) tokens, so the 324,892 raw conversations are packed into 10,797 training sequences, which is what Num examples reports.
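A quick sanity check of the numbers (the average length is inferred from the reported counts, not measured on the dataset):

```python
packed_sequences = 10_797
max_seq_length = 4096
raw_examples = 324_892

total_tokens = packed_sequences * max_seq_length        # ~44.2M tokens after packing
avg_tokens_per_example = total_tokens / raw_examples    # ~136 tokens per raw conversation

print(f"{total_tokens:,} tokens -> ~{avg_tokens_per_example:.0f} tokens per example")
```

So the drop from 324,892 to 10,797 is expected whenever the conversations are much shorter than 4096 tokens; the total token count, and hence the optimizer steps per epoch, is unchanged.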