Update colab examples #86

Merged 7 commits on Aug 22, 2024
Changes from 6 commits
5 changes: 3 additions & 2 deletions examples/language-modeling/gemma_tuning.ipynb
@@ -249,8 +249,9 @@
"metadata": {},
"outputs": [],
"source": [
-"from optimum.tpu import AutoModelForCausalLM\n",
-"model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=False)"
+"from transformers import AutoModelForCausalLM\n",
+"import torch\n",
+"model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=False, torch_dtype=torch.bfloat16)"
]
},
{
5 changes: 2 additions & 3 deletions examples/language-modeling/llama_tuning.md
@@ -47,14 +47,13 @@ Then, the tokenizer and model need to be loaded. We will choose [`meta-llama/Met

```python
import torch
-from transformers import AutoTokenizer
-from optimum.tpu import AutoModelForCausalLM
+from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Add custom token for padding Llama
tokenizer.add_special_tokens({'pad_token': tokenizer.eos_token})
-model = AutoModelForCausalLM.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```

To tune the model with the [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes) dataset, you can load it and obtain the `quote` column:
6 changes: 4 additions & 2 deletions optimum/tpu/fsdp_v2.py
@@ -83,8 +83,9 @@ def get_fsdp_training_args(model: PreTrainedModel) -> Dict:
     matched_model = False
     if model_type == "gemma":
         from .modeling_gemma import GemmaForCausalLM
+        from transformers import GemmaForCausalLM as HFGemmaForCausalLLM

-        if isinstance(model, GemmaForCausalLM):
+        if isinstance(model, GemmaForCausalLM) or isinstance(model, HFGemmaForCausalLLM):
             logger = logging.get_logger(__name__)
             from torch_xla import __version__ as xla_version
             if xla_version == "2.3.0":
@@ -95,8 +96,9 @@ def get_fsdp_training_args(model: PreTrainedModel) -> Dict:
             matched_model = True
     elif model_type == "llama":
         from .modeling_llama import LlamaForCausalLM
+        from transformers import LlamaForCausalLM as HFLlamaForCausalLLM

-        if isinstance(model, LlamaForCausalLM):
+        if isinstance(model, LlamaForCausalLM) or isinstance(model, HFLlamaForCausalLLM):
             cls_to_wrap = "LlamaDecoderLayer"
             matched_model = True
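
The widened check in this hunk accepts either the optimum-tpu class or the stock transformers class. Below is a minimal sketch of the same matching idea, not the code in this PR. It assumes stand-in classes (`TpuLlamaForCausalLM`, `HFLlamaForCausalLM`, `OtherModel`) and a hypothetical helper `find_cls_to_wrap`, none of which exist in the repository, so that it runs without `transformers` or `torch_xla`. It passes a tuple to `isinstance` rather than chaining two calls with `or`.

```python
# Stand-in classes (hypothetical): placeholders for the optimum-tpu and
# transformers Llama classes, so the sketch runs without either library.
class TpuLlamaForCausalLM:
    pass


class HFLlamaForCausalLM:
    pass


class OtherModel:
    pass


# Hypothetical lookup table: model_type -> (accepted classes, layer name to wrap).
# Only the "llama" entry is filled in, using the layer name shown in the diff.
WRAP_TABLE = {
    "llama": ((TpuLlamaForCausalLM, HFLlamaForCausalLM), "LlamaDecoderLayer"),
}


def find_cls_to_wrap(model, model_type):
    """Return the layer class name to wrap, or None if the model is not matched."""
    entry = WRAP_TABLE.get(model_type)
    if entry is None:
        return None
    accepted, cls_to_wrap = entry
    # isinstance accepts a tuple of classes; this is equivalent to
    # isinstance(model, A) or isinstance(model, B).
    return cls_to_wrap if isinstance(model, accepted) else None
```

With this shape, supporting another implementation of the same architecture would mean adding a class to the tuple rather than extending the condition.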
