
Commit c3f7600

Author: Shubhra Pandit
Message: fix model_path and batch_size for sparse case
Parent: 965b31a

File tree

2 files changed: +4 −4 lines

docs/llms/guides/sparse-finetuning-llm-gsm8k-with-sparseml.md (+2 −2)

@@ -224,7 +224,7 @@ accelerate launch \
     --learning_rate 0.00005 \
     --lr_scheduler_type "linear" \
     --max_seq_length 1024 \
-    --per_device_train_batch_size 32 \
+    --per_device_train_batch_size 16 \
     --max_grad_norm None \
     --warmup_steps 20 \
     --distill_teacher PATH_TO_TEACHER \
@@ -331,7 +331,7 @@ MODEL_PATH=<MODEL_PATH>
 TASK=gsm8k
 python main.py \
     --model sparseml \
-    --model_args pretrained=MODEL_PATH,trust_remote_code=True \
+    --model_args pretrained=${MODEL_PATH},trust_remote_code=True \
     --tasks $TASK \
     --batch_size 48 \
     --no_cache \
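The second hunk is a shell-expansion fix: writing `pretrained=MODEL_PATH` passes the literal text `MODEL_PATH` to `main.py`, while `pretrained=${MODEL_PATH}` substitutes the variable's value before the command runs. A minimal sketch of the difference, using a hypothetical path for illustration:

```shell
#!/bin/sh
# Hypothetical model path, for illustration only.
MODEL_PATH=/models/sparse-llama-gsm8k

# Without $ and braces, the shell does no substitution:
echo "pretrained=MODEL_PATH"            # prints: pretrained=MODEL_PATH

# With ${MODEL_PATH}, the value is expanded before echo sees it:
echo "pretrained=${MODEL_PATH}"         # prints: pretrained=/models/sparse-llama-gsm8k
```

This is why the pre-fix command would have tried to load a model literally named `MODEL_PATH` regardless of what the variable was set to.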

versioned_docs/version-1.7.0/llms/guides/sparse-finetuning-llm-gsm8k-with-sparseml.md (+2 −2)

@@ -224,7 +224,7 @@ accelerate launch \
     --learning_rate 0.00005 \
     --lr_scheduler_type "linear" \
     --max_seq_length 1024 \
-    --per_device_train_batch_size 32 \
+    --per_device_train_batch_size 16 \
     --max_grad_norm None \
     --warmup_steps 20 \
     --distill_teacher PATH_TO_TEACHER \
@@ -331,7 +331,7 @@ MODEL_PATH=<MODEL_PATH>
 TASK=gsm8k
 python main.py \
     --model sparseml \
-    --model_args pretrained=MODEL_PATH,trust_remote_code=True \
+    --model_args pretrained=${MODEL_PATH},trust_remote_code=True \
     --tasks $TASK \
     --batch_size 48 \
     --no_cache \
