TypeError: 'str' object is not callable error with Llama 3.1 8B instruct model (4 bit quantization) #1570

Open
BijanProjects opened this issue Jan 21, 2025 · 2 comments

BijanProjects commented Jan 21, 2025

I keep getting TypeError: 'str' object is not callable when calling model.generate(). The same code worked a few days ago, but it no longer does.

Here is the code I am running on a Kaggle P100 GPU:

Install the package:

%%capture
!pip install unsloth
# Also get the latest nightly Unsloth!
!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
!pip install rouge-score

from unsloth import FastLanguageModel
import torch
max_seq_length = 8192 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",      # Llama-3.1 15 trillion tokens model 2x faster!
    "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    "unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    "unsloth/Meta-Llama-3.1-405B-bnb-4bit",    # We also uploaded 4bit for 405b!
    "unsloth/Mistral-Nemo-Base-2407-bnb-4bit", # New Mistral 12b 2x faster!
    "unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit",
    "unsloth/mistral-7b-v0.3-bnb-4bit",        # Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/Phi-3.5-mini-instruct",           # Phi-3.5 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/gemma-2-9b-bnb-4bit",
    "unsloth/gemma-2-27b-bnb-4bit",            # Gemma 2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

This cell runs inference, and the error is raised on the model.generate call:

FastLanguageModel.for_inference(model)

samp = ["User prompt: Hi, what are you? Your response:"]

test_input = tokenizer(samp, return_tensors = "pt").to("cuda")
outputs = model.generate(**test_input, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)
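
The report gives only the final error message, not the frame that raised it. A minimal sketch of one way to capture the full stack trace for sharing is shown here. The helper run_and_capture and the FakeModel stand-in are hypothetical names introduced for illustration; they are not part of Unsloth or transformers. The stand-in only reproduces the same class of error (an attribute expected to be callable holds a string) and says nothing about where the real failure occurs.

```python
import traceback

def run_and_capture(fn, *args, **kwargs):
    """Call fn; return (result, None) on success or (None, traceback_text) on failure."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        # format_exc() returns the full stack trace of the exception being handled.
        return None, traceback.format_exc()

# Hypothetical stand-in for the failing call: an attribute that should be a
# function has been replaced by a string, so calling it raises the same TypeError.
class FakeModel:
    generate = "not a function"

def failing_call():
    return FakeModel().generate(max_new_tokens=64)

result, trace = run_and_capture(failing_call)
print(trace)
```

In the notebook, the same wrapper could be applied to the real call, for example run_and_capture(model.generate, **test_input, max_new_tokens = 64, use_cache = True), and the printed text pasted into the issue.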

BijanProjects changed the title from "TypeError: 'str' object is not callable error with Llama 3.1 8B instruct model" to "TypeError: 'str' object is not callable error with Llama 3.1 8B instruct model (4 bit quantization)" on Jan 21, 2025

tomcotter7 commented Jan 22, 2025

What version of accelerate are you using? I updated from 1.1.1 to 1.3.0 and that fixed it for me (assuming it's the same error; if you share the stack trace I can help more).
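
To answer the version question, the installed versions can be read from package metadata without importing the libraries. This is a rough sketch using the standard-library importlib.metadata module; the helper installed_versions and the package list are assumptions chosen for illustration, since only accelerate is named in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result

# Assumed list of relevant packages; adjust to the environment being checked.
PACKAGES = ["accelerate", "transformers", "peft", "bitsandbytes", "torch", "unsloth"]

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Posting this output together with the stack trace makes it easier to compare a failing environment with a working one.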

BijanProjects (Author) commented

Thank you @tomcotter7 for your help. Actually, I could not reproduce the error after a factory reset of the notebook. You are probably right: there seems to have been a version inconsistency.
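
The thread does not confirm the root cause. One possible source of a version inconsistency in a notebook, offered only as an assumption, is that a package was reinstalled with pip while an older copy was already imported in the running kernel; the imported copy stays in memory until the kernel restarts. The sketch below compares the version of each already-imported module with the version recorded in installed metadata. The helper stale_imports is a hypothetical name, and the check assumes the module name matches the distribution name and that the module exposes __version__, which is not true for every package.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

def stale_imports(packages):
    """Return {name: (loaded, installed)} where the imported version differs from the installed one."""
    stale = {}
    for name in packages:
        module = sys.modules.get(name)
        if module is None:
            continue  # not imported in this session, so nothing can be stale
        loaded = getattr(module, "__version__", None)
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Only report when both versions are known and they disagree.
        if loaded is not None and installed is not None and loaded != installed:
            stale[name] = (loaded, installed)
    return stale

print(stale_imports(["accelerate", "transformers", "peft", "torch"]))
```

An empty result means no mismatch was detected for the listed packages. A non-empty result suggests restarting the kernel, which is consistent with the error disappearing after the factory reset.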
