TypeError: 'str' object is not callable error with Llama 3.1 8B instruct model (4 bit quantization) #1570

BijanProjects · 2025-01-21T15:50:50Z

I keep getting TypeError: 'str' object is not callable when calling model.generate(). The same code worked a few days ago, but it fails now.

Here is the code I am running on a Kaggle P100:

Install the package

%%capture
!pip install unsloth

# Also get the latest nightly Unsloth!

!pip install --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
!pip install rouge-score

from unsloth import FastLanguageModel
import torch
max_seq_length = 8192 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.

fourbit_models = [
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit", # Llama-3.1 15 trillion tokens model 2x faster!
    "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    "unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    "unsloth/Meta-Llama-3.1-405B-bnb-4bit", # We also uploaded 4bit for 405b!
    "unsloth/Mistral-Nemo-Base-2407-bnb-4bit", # New Mistral 12b 2x faster!
    "unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit",
    "unsloth/mistral-7b-v0.3-bnb-4bit", # Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/Phi-3.5-mini-instruct", # Phi-3.5 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/gemma-2-9b-bnb-4bit",
    "unsloth/gemma-2-27b-bnb-4bit", # Gemma 2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none", # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False, # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

This cell runs inference, and it is where model.generate raises the error:

FastLanguageModel.for_inference(model)

samp = ["User prompt: Hi, what are you? Your response:"]

test_input = tokenizer(samp, return_tensors = "pt").to("cuda")
outputs = model.generate(**test_input, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)

tomcotter7 · 2025-01-22T13:33:21Z

What version of accelerate are you using? I updated from 1.1.1 to 1.3.0 and that fixed it for me (assuming it's the same error; if you share the stack trace I can help more).
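To answer the version question quickly, something like the sketch below can be run in a notebook cell. It only reads the installed package metadata through the standard library, so it does not import the packages themselves. The helper name installed_versions and the list of packages to inspect are assumptions for illustration; only accelerate is named in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Hypothetical list: packages that commonly sit alongside accelerate here.
    to_check = ["accelerate", "transformers", "peft", "unsloth", "torch"]
    for name, ver in installed_versions(to_check).items():
        print(f"{name}: {ver or 'not installed'}")
```

If accelerate turns out to be older than 1.3.0, running !pip install --upgrade accelerate and then restarting the kernel may be worth trying, although that version is only what worked in my case.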

BijanProjects · 2025-01-27T23:10:21Z

Thank you @tomcotter7 for your help. Actually, I could not reproduce the error after a factory reset of the notebook. You are probably right: it looks like a version inconsistency had crept in.

BijanProjects changed the title ~~TypeError: 'str' object is not callable error with Llama 3.1 8B instruct model~~ TypeError: 'str' object is not callable error with Llama 3.1 8B instruct model (4 bit quantization) Jan 21, 2025
