🚀 The feature, motivation and pitch
Feature
Support for models fine-tuned on Gaudi machines using GaudiSFTTrainer (from Optimum Habana) in the vllm-fork library. Specifically, models with hf_config.architectures set to GaudiLlamaForCausalLM currently fail to work with this library.
Motivation
When fine-tuning models on Gaudi machines, the resulting hf_config includes an architecture field GaudiLlamaForCausalLM, which is not currently supported by this library. This limitation prevents the use of Gaudi-optimized models with the vllm-fork, restricting its usability for users working with Gaudi hardware.
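For illustration, the mismatch can be seen by inspecting such a checkpoint with transformers; the model path below is a placeholder for a GaudiSFTTrainer output directory:
from transformers import AutoConfig

# Placeholder path to a checkpoint produced by GaudiSFTTrainer
config = AutoConfig.from_pretrained("/path/to/gaudi-finetuned-llama")
print(config.architectures)
# A checkpoint saved through Optimum Habana's Gaudi model classes can report
# ['GaudiLlamaForCausalLM'], whereas a stock Llama checkpoint reports
# ['LlamaForCausalLM'], the name vLLM's model registry expects.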
Related Problem
Attempting to use Gaudi fine-tuned models results in the following error:
ValueError: Model architectures ['GaudiLlamaForCausalLM'] are not supported for now.
This issue arises because the architecture name includes the prefix Gaudi, which is not recognized by the library.
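For reference, a minimal reproduction (the checkpoint path is again a placeholder for a Gaudi fine-tuned model directory):
from vllm import LLM

# Loading a checkpoint whose config.json lists "GaudiLlamaForCausalLM"
# raises the ValueError quoted above.
llm = LLM(model="/path/to/gaudi-finetuned-llama")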
Alternatives
A temporary workaround is to modify the library code: adding the code below at line 457 in vllm-fork/vllm/engine/llm_engine.py allows the library to work with Gaudi-trained models:
# Remove "Gaudi" from the architecture name if it exists
architectures = []
for arch in engine_config.model_config.hf_config.architectures:
if "Gaudi" in arch:
arch = arch.replace("Gaudi", "")
architectures.append(arch)
engine_config.model_config.hf_config.architectures = architectures
This workaround removes the "Gaudi" prefix from the architecture name, enabling compatibility.
However, I would like to confirm if this is an appropriate solution before proceeding with a pull request.
Additional context
Adding official support for Gaudi-trained models would greatly improve the usability of the vllm-fork library for users working with Habana Labs hardware.
If this fix is validated, I am happy to submit a pull request to address the issue.
Before submitting a new issue...
Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Hi @gouki510, thank you for explaining this issue. We are currently working on a fix for the naming in Optimum Habana. We also want to provide a script to modify existing models as a workaround.
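For what it's worth, a minimal sketch of what such a fix-up script could look like, assuming the only change needed is stripping the "Gaudi" prefix from the architectures list in the checkpoint's config.json (an illustration only, not the official Optimum Habana script):
import json
import sys
from pathlib import Path

# Hypothetical usage: python strip_gaudi_prefix.py /path/to/gaudi-finetuned-llama
config_path = Path(sys.argv[1]) / "config.json"
config = json.loads(config_path.read_text())

# Rewrite e.g. "GaudiLlamaForCausalLM" -> "LlamaForCausalLM" so that vLLM's
# model registry recognizes the architecture.
config["architectures"] = [arch.replace("Gaudi", "") for arch in config.get("architectures", [])]

config_path.write_text(json.dumps(config, indent=2))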