Add HuggingFace arg so that arch is automatic #1

Quentin-Anthony · 2023-12-20T23:21:58Z

Stas Bekman had the idea of supporting a HuggingFace model as input, so that the model architecture settings don't all need to be dug up manually. We'd like something like:

python transformer_mem.py --hf_model_name_or_path meta-llama/Llama-2-7b-hf --num-gpus 8 --zero-stage 3 --batch-size-per-gpu 2 --sequence-length 4096
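
For anyone picking this up, here is a rough sketch of the idea, not an actual implementation from this repo. It assumes the model's `config.json` is already available as a dict (for example via `transformers.AutoConfig.from_pretrained(name).to_dict()`, or by reading the file from disk). The names `HF_TO_ARCH` and `arch_from_hf_config`, and the argument names on the right-hand side of the mapping, are hypothetical and would need to match whatever the calculator script really uses. The example values are illustrative ones in the shape of a Llama-style config.

```python
import json
from pathlib import Path

# Hypothetical mapping: HF config.json key -> calculator argument name.
# The right-hand names are assumptions, not the script's real flags.
HF_TO_ARCH = {
    "hidden_size": "hidden_size",
    "num_hidden_layers": "num_layers",
    "num_attention_heads": "num_attention_heads",
    "vocab_size": "vocab_size",
    "intermediate_size": "ffn_hidden_size",
}


def arch_from_hf_config(config: dict) -> dict:
    """Pick architecture settings out of a HuggingFace-style config dict."""
    arch = {ours: config[theirs] for theirs, ours in HF_TO_ARCH.items() if theirs in config}
    # Models without grouped-query attention usually omit num_key_value_heads,
    # so fall back to the number of attention heads.
    if "num_attention_heads" in config:
        arch["num_kv_heads"] = config.get("num_key_value_heads", config["num_attention_heads"])
    return arch


def load_arch(config_path: str) -> dict:
    """Read a local config.json and return the extracted settings."""
    return arch_from_hf_config(json.loads(Path(config_path).read_text()))


if __name__ == "__main__":
    # Illustrative values only; a real run would read them from the model's config.
    example = {
        "hidden_size": 4096,
        "num_hidden_layers": 32,
        "num_attention_heads": 32,
        "vocab_size": 32000,
        "intermediate_size": 11008,
    }
    print(arch_from_hf_config(example))
```

Settings that are not part of the model config (number of GPUs, ZeRO stage, batch size, sequence length) would still come from the command line, as in the example invocation above.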

bhavnicksm · 2024-05-07T20:32:48Z

Hey @Quentin-Anthony,
Is someone working on this? If not, I can try to make a PR for it.

Quentin-Anthony · 2024-05-08T08:28:55Z

Nobody is. I'd love a PR!

bhavnicksm · 2024-05-08T15:14:13Z

@Quentin-Anthony, I've created a draft PR. We can continue the conversation there.

haileyschoelkopf mentioned this issue Feb 20, 2024

Add simple inference FLOP counter to calc_transformer_flops.py #31

Merged

bhavnicksm mentioned this issue May 8, 2024

Add HuggingFace arg so that arch is automatic #39

Merged
