[Feature]: Models Trained on Gaudi Do Not Work #511

Open
1 task done
gouki510 opened this issue Nov 16, 2024 · 1 comment

@gouki510

🚀 The feature, motivation and pitch

Feature

Support models fine-tuned on Gaudi machines with GaudiSFTTrainer (from Optimum Habana) in the vllm-fork library. Specifically, models whose hf_config.architectures is set to GaudiLlamaForCausalLM currently fail to load with this library.

Motivation

When a model is fine-tuned on a Gaudi machine, the resulting hf_config lists GaudiLlamaForCausalLM in its architectures field, which this library does not recognize. This prevents Gaudi-optimized models from being served with the vllm-fork, restricting its usability for users on Gaudi hardware.
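
For illustration, the mismatch is visible when inspecting such a checkpoint's config with transformers (the checkpoint path below is hypothetical):

from transformers import AutoConfig

# Config of a checkpoint fine-tuned with GaudiSFTTrainer (hypothetical path).
config = AutoConfig.from_pretrained("/path/to/gaudi-finetuned-llama")
print(config.architectures)
# ['GaudiLlamaForCausalLM']  -- a vanilla checkpoint would report ['LlamaForCausalLM']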

Related Problem

Attempting to use Gaudi fine-tuned models results in the following error:

ValueError: Model architectures ['GaudiLlamaForCausalLM'] are not supported for now.

This error arises because the architecture name carries the Gaudi prefix, which the library's architecture lookup does not recognize.
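
A minimal reproduction, assuming a Gaudi fine-tuned checkpoint at a hypothetical local path:

from vllm import LLM

# Fails while resolving the model architecture from hf_config.
llm = LLM(model="/path/to/gaudi-finetuned-llama")
# ValueError: Model architectures ['GaudiLlamaForCausalLM'] are not supported for now.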

Alternatives

A temporary workaround is to modify the library code: adding the snippet below at line 457 of vllm-fork/vllm/engine/llm_engine.py allows Gaudi-trained models to load:

# Remove "Gaudi" from the architecture name if it exists
architectures = []
for arch in engine_config.model_config.hf_config.architectures:
    if "Gaudi" in arch:
        arch = arch.replace("Gaudi", "")
    architectures.append(arch)
engine_config.model_config.hf_config.architectures = architectures

This workaround strips the "Gaudi" prefix from the architecture name, restoring compatibility.
However, I would like to confirm that this is an appropriate solution before opening a pull request.

Additional context

Adding official support for Gaudi-trained models would greatly improve the usability of the vllm-fork library for users working with Habana Labs hardware.
If this fix is validated, I am happy to submit a pull request to address the issue.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@michalkuligowski

Hi @gouki510, thank you for explaining this issue. We are currently working on a naming fix in Optimum Habana. We also want to provide a script that modifies existing models as a workaround.
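
For reference, a minimal sketch of what such a conversion script might look like; the function name and checkpoint path are hypothetical, and the actual Optimum Habana tooling may differ:

import json
from pathlib import Path

def strip_gaudi_prefix(model_dir: str) -> None:
    """Rewrite config.json so architecture names lose the "Gaudi" prefix,
    e.g. "GaudiLlamaForCausalLM" becomes "LlamaForCausalLM".
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text())
    config["architectures"] = [
        arch.replace("Gaudi", "") for arch in config.get("architectures", [])
    ]
    config_path.write_text(json.dumps(config, indent=2))

strip_gaudi_prefix("/path/to/gaudi-finetuned-llama")  # hypothetical path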
