
LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.shfl.sync.bfly.i32 #34

Open
jeisonmp opened this issue Jul 3, 2024 · 5 comments

jeisonmp commented Jul 3, 2024

When I run GPT-2 models, everything works fine. But when I run any of the models ridger/MMfreeLM-370M, MMfreeLM-1.3B, or MMfreeLM-2.7B, this error occurs. Why? Can anyone help me?

Error: LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.shfl.sync.bfly.i32
[1] 93105 IOT instruction python3 generate_text.py

import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
import mmfreelm
from transformers import AutoModelForCausalLM, AutoTokenizer

# Name of the pretrained model
#name = 'ridger/MMfreeLM-370M'
name = 'ridger/MMfreeLM-1.3B'
#name = 'ridger/MMfreeLM-2.7B'
#name = 'openai-community/gpt2'

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).cuda().half()

# input_prompt = "In a shocking finding, scientist discovered a herd of unicorns living in a remote, "
# input_ids = tokenizer(input_prompt, return_tensors="pt").input_ids.cuda()
# outputs = model.generate(input_ids, max_length=32,  do_sample=True, top_p=0.4, temperature=0.6)
# print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])

def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs.input_ids.cuda()
    attention_mask = inputs.attention_mask.cuda()
    outputs = model.generate(
        input_ids,
        attention_mask=attention_mask,
        max_length=32,
        do_sample=True,
        top_p=0.4,
        temperature=0.6,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

while True:
    prompt = input("You: ")
    if prompt.lower() in ['exit', 'quit']:
        break
    response = generate_response(prompt)
    print(f"Modelo: {response}")


jeisonmp commented Jul 5, 2024

Do you know if this runs under WSL 2 on Windows 10?

@nevercast

What NVIDIA GPU are you using?

@jeisonmp

Hi @nevercast! It's an NVIDIA GeForce GTX 1050.

@nevercast

Hi!

My immediate assumption is that the GTX 1050 does not have a compute capability new enough to support this kernel function - I can validate this later for you, haven't had coffee yet.

If that is the case though, you might want to consider trying to run this project on a free Google Colab notebook with a T4 GPU attached.
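
For reference, here is a minimal sketch (assuming PyTorch is installed, which this project already requires) that prints the local GPU's compute capability so you can confirm this yourself:

import torch

# Query the first visible CUDA device, if any, and report its compute capability.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")
else:
    print("No CUDA device visible")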


nevercast commented Jul 18, 2024

The GTX 1050 is compute capability 6.1. Triton (which I believe this project uses) has dropped support for anything less than 7.0.

You may be able to get a build to work, but you'd be on your own and going against the grain, so to speak.
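
As a rough sketch (the 7.0 threshold is the Triton requirement mentioned above; this check is not part of the project itself), a guard like this before loading the model fails fast with a readable message instead of the LLVM selection error:

import torch

# Minimum compute capability supported by recent Triton releases (assumption based on the comment above).
REQUIRED = (7, 0)

major, minor = torch.cuda.get_device_capability(0)
if (major, minor) < REQUIRED:
    raise RuntimeError(
        f"GPU compute capability {major}.{minor} is below {REQUIRED[0]}.{REQUIRED[1]}; "
        "the Triton kernels used here will not compile on this device."
    )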
