Update colab examples #86
Conversation
The `from optimum.tpu` version imports models that are specifically optimized for inference.
You are right @wenxindongwork, at some point I added some improvements for tuning, but with FSDP the models are essentially the same in `transformers` and in `optimum.tpu`, so it might well be easier just to import the `transformers` version.

Note that, on the other hand, this means the model might be loaded in `float32` and end up using more memory, whereas the `optimum.tpu` models load in `bfloat16` by default. If you do this, you can end up with an OOM on some configurations.
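A minimal sketch of how the `transformers` version can still be kept out of `float32`, assuming a Gemma checkpoint (the checkpoint name and the explicit dtype argument are illustrative, not taken from this PR):

```python
import torch
from transformers import AutoModelForCausalLM

# Request bfloat16 explicitly so the plain transformers model matches the
# default dtype of the optimum-tpu variants and avoids float32 OOM on TPU.
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",          # example checkpoint only
    torch_dtype=torch.bfloat16,
)
```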
"cls_to_wrap = \"GemmaDecoderLayer\"\n", | ||
"fsdp_training_args = {\n", | ||
" \"fsdp\": \"full_shard\",\n", | ||
" \"fsdp_config\": fsdp_v2.get_fsdp_config(cls_to_wrap),\n", | ||
"}\n", |
Well, that was the point of using `get_fsdp_training_args`: you do not need to know which classes to wrap for supported models. I would revert this bit.
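For reference, a sketch of the usage that comment describes, assuming `model` is an already-loaded causal LM and the `fsdp_v2` helpers shown in the existing example:

```python
from optimum.tpu import fsdp_v2

fsdp_v2.use_fsdp_v2()
# Let optimum-tpu pick the decoder layer class to wrap for the given model,
# instead of hard-coding e.g. "GemmaDecoderLayer" in the notebook.
fsdp_training_args = fsdp_v2.get_fsdp_training_args(model)
```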
Thanks for the quick review! `get_fsdp_training_args` accepts only the `optimum.tpu` model class, not the `transformers` one. I updated the `get_fsdp_training_args` function, so it should now work.
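Purely as an illustration of the kind of change being described (the real implementation in optimum-tpu is not shown in this thread), such a helper could dispatch on the `transformers` model type instead of on the `optimum.tpu` class:

```python
from optimum.tpu import fsdp_v2

def get_fsdp_training_args(model):
    # Hypothetical mapping from model family to the decoder layer FSDP should
    # wrap; it only covers two families and is not the actual optimum-tpu code.
    layer_by_type = {"gemma": "GemmaDecoderLayer", "llama": "LlamaDecoderLayer"}
    cls_to_wrap = layer_by_type[model.config.model_type]
    return {
        "fsdp": "full_shard",
        "fsdp_config": fsdp_v2.get_fsdp_config(cls_to_wrap),
    }
```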
Nice, with your change it looks fine. 🤗

The last question I had was about using the `AutoModelForCausalLM` class from `transformers` rather than from `optimum.tpu`. As I mentioned in a comment, in Optimum TPU models we set the dtype to `bfloat16`, because TPUs are capable of that and it uses less memory, e.g. here. I haven't tested the scripts with `float32`, but if you do and it works fine please let me know, so we can just merge this.
Just updated the examples to load the models in `bf16` instead, hope that works!
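A rough sketch of how the updated notebook presumably ties this together (argument names and values are illustrative, not copied from the example):

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=1,
    bf16=True,                    # keep compute in bfloat16 on TPU
    **fsdp_training_args,         # dict from fsdp_v2.get_fsdp_training_args(model)
)
trainer = Trainer(
    model=model,                  # the bfloat16 transformers model loaded earlier
    args=training_args,
    train_dataset=train_dataset,  # assumed to be prepared earlier in the notebook
)
trainer.train()
```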
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
The code style workflow is failing, can you run it locally (`make style`)?
Just installed ruff and ran `make style`. Thanks!
Use `transformers`' `AutoModelForCausalLM` instead of `optimum-tpu`'s `AutoModelForCausalLM` for fine-tuning.

The `from optimum.tpu` version imports models that are specifically optimized for inference. While the colab example works for smaller models, it fails with an HBM OOM error for llama3-70b (on a v4-256). Changing the following import statement solved the problem.
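In short, the swap described above (the class name is the same in both libraries, only the import path changes):

```python
# Before: inference-oriented models
# from optimum.tpu import AutoModelForCausalLM

# After: plain transformers models, which work with FSDP v2 fine-tuning
from transformers import AutoModelForCausalLM
```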