Add HuggingFace arg so that arch is automatic #39
If the user passes in a value that conflicts with the transformer passed in, do we ignore it or take it into consideration? For example:
python transformer_mem.py \
--hf_model_name_or_path meta-llama/Llama-2-7b-hf \
--num-gpus 8 \
--zero-stage 3 \
--batch-size-per-gpu 2 \
--sequence-length 4096 \
--num_attention_heads 16
In the above example the |
Currently,
Calculating memory with training configuration: {'hf_model_name_or_path': 'NousResearch/Hermes-2-Pro-Llama-3-8B', 'num_gpus': 8, 'tensor_parallel_size': 1, 'pipeline_parallel_size': 1, 'partition_activations': False, 'zero_stage': 3, 'zero_allgather_bucket_size': 500000000.0, 'zero3_max_live_params': 1000000000.0, 'checkpoint_activations': False, 'batch_size_per_gpu': 2, 'sequence_length': 4096, 'vocab_size': 128288, 'hidden_size': 4096, 'num_attention_heads': 32, 'num_layers': 32, 'ffn_expansion_factor': 3.5, 'infer': False, 'kv_size_ratio': 0.25, 'is_mixed_precision': True, 'high_prec_bytes_per_val': 4, 'low_prec_bytes_per_val': 2, 'bytes_per_grad_ele': 4, 'num_experts': 0, 'expert_parallelism': 1, 'misc_mem_gib': 0}
Number of Parameters: 6.17 B |
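For readers following along, here is a rough sketch of how the derived values in the configuration above (ffn_expansion_factor 3.5, kv_size_ratio 0.25) could be computed from a Hugging Face config. The function name `args_from_hf_config` is hypothetical and not taken from this PR. The input is a plain dict, which in practice might come from `transformers.AutoConfig.from_pretrained(name).to_dict()`. The `intermediate_size` and `num_key_value_heads` values below are assumed for illustration, chosen to be consistent with the log output.

```python
def args_from_hf_config(cfg: dict) -> dict:
    """Map Hugging Face config fields to the calculator's argument names.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    hidden = cfg["hidden_size"]
    heads = cfg["num_attention_heads"]
    # Configs without grouped-query attention may omit num_key_value_heads;
    # fall back to one KV head per attention head in that case.
    kv_heads = cfg.get("num_key_value_heads", heads)
    return {
        "vocab_size": cfg["vocab_size"],
        "hidden_size": hidden,
        "num_attention_heads": heads,
        "num_layers": cfg["num_hidden_layers"],
        "ffn_expansion_factor": cfg["intermediate_size"] / hidden,
        "kv_size_ratio": kv_heads / heads,
    }


# Assumed config values for illustration, matching the log output above.
example_cfg = {
    "vocab_size": 128288,
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "intermediate_size": 14336,
    "num_key_value_heads": 8,
}

derived = args_from_hf_config(example_cfg)
print(derived)
```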
I think if the user provides an arg, it should override the HF config for that value. All overridden values should get a print (e.g. |
How do we check if the value is user-provided or a default value? Say, the user gave Instead, maybe we could keep the default values in another dictionary and have the What do you think? @Quentin-Anthony |
But this would mean that we have no default values and that the user needs to pass everything? If I'm misunderstanding, maybe just implement what you're describing real quick and we can iterate. |
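One possible way to tell user-provided values apart from defaults, sketched under assumptions: register every argparse flag with `default=None`, keep the real defaults in a separate dictionary, and resolve each value in the order user value, then HF config, then default. The names `DEFAULTS`, `build_parser`, and `resolve`, as well as the default numbers, are hypothetical and only illustrate the idea discussed above; this is not necessarily what the PR ended up doing.

```python
import argparse

# Hypothetical defaults, kept outside argparse so "flag not passed" is detectable.
DEFAULTS = {"num_attention_heads": 64, "hidden_size": 6144, "num_layers": 44}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    for name in DEFAULTS:
        # default=None means "the user did not pass this flag".
        parser.add_argument(f"--{name}", type=int, default=None)
    return parser


def resolve(cli: argparse.Namespace, hf_values: dict) -> dict:
    """Pick each value with precedence: user arg > HF config > default."""
    resolved = {}
    for name, default in DEFAULTS.items():
        user_value = getattr(cli, name)
        if user_value is not None:
            if name in hf_values and hf_values[name] != user_value:
                print(
                    f"Overriding HF config {name}={hf_values[name]} "
                    f"with user-provided value {user_value}"
                )
            resolved[name] = user_value
        elif name in hf_values:
            resolved[name] = hf_values[name]
        else:
            resolved[name] = default
    return resolved


# Values that would come from the HF config (assumed for illustration).
hf_values = {"num_attention_heads": 32, "hidden_size": 4096}
cli = build_parser().parse_args(["--num_attention_heads", "16"])
final = resolve(cli, hf_values)
print(final)
```

With this approach the user does not need to pass everything: anything left unset falls back to the HF config first and to the defaults dictionary last.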
Hi @Quentin-Anthony, |
Hi @Quentin-Anthony, wanted to check in about this PR? Is this still required? Is something missing here? |
Yep still needed! Reviewing now. |
I rebased, and for some reason this PR's "files changed" view is now showing all the rebase changes? Gonna try closing and reopening to see if that fixes it. EDIT: That did it! |
I cleaned things up a bit, rebased, and tested. All looks great to me. Thank you!
Thank You @Quentin-Anthony! |
This pull request adds automated parameter calculations for all Hugging Face models.
Expected Behaviour:
Ref: #1