
Add openvino export configs and support chatglm #454

Closed
wants to merge 3 commits

Conversation

eaidova
Collaborator

@eaidova eaidova commented Oct 16, 2023

What does this PR do?

Introducing export configs for OpenVINO. The ChatGLM2 model is used as an example to demonstrate that this API extension works.
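For illustration, a possible way to trigger this export path from optimum-intel (a sketch, not part of this PR; the checkpoint name is only an example and chatglm checkpoints are assumed to need trust_remote_code):

from optimum.intel import OVModelForCausalLM

# export=True converts the PyTorch checkpoint with the OpenVINO exporter,
# which is where a registered chatglm export config would get picked up.
model = OVModelForCausalLM.from_pretrained(
    "THUDM/chatglm2-6b",
    export=True,
    trust_remote_code=True,
)
model.save_pretrained("chatglm2_openvino")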

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

}


class ChatGLM2DummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):



@register_normalized_config("chatglm")
class ChatGLM2NormalizedConfig(NormalizedTextConfig):

    )
else:
    onnx_config_constructor = TasksManager.get_exporter_config_constructor(
        model=model, exporter="openvino", task=task
Collaborator

here I would prefer we keep the onnx config so that we don't duplicate code, as I'm unsure why we would need a separate one for now. Is there a specific reason for that @eaidova?

Collaborator Author

@eaidova eaidova Oct 18, 2023

The main idea of this PR is to allow overriding the ONNX configs for OpenVINO.
I see 3 reasons for that:

  1. Adding configurations for supporting new models currently requires guaranteeing that the model can be exported to ONNX (and possibly even providing a pipeline for running it with ORT). However, we no longer use ONNX export as the default path for exporting models to OpenVINO. The set of torch operations supported in OpenVINO can differ from the set of ops supported by the torch-to-ONNX export. For example, we successfully convert the Tapas model family to OpenVINO (e.g. https://huggingface.co/google/tapas-tiny-finetuned-wtq), but the ONNX export fails with an unsupported aten::scatter_reduce.
  2. Unblocking OpenVINO-specific optimizations of existing models. For the same reasons that we do not export models to ONNX and have our own inference pipelines, the ONNX and OpenVINO export paths may diverge over time and require different configurations (for example, if we want to merge the 2 decoders of seq2seq models in a way that is convenient for OpenVINO, or, as in some of our other plans, export causal LM models with beam search included inside the model). A simple example where the ONNX configuration is not a perfect fit for us is text-generation-with-past: if I understand correctly, the default ONNX path is a merged model, while we use the model with past. The ONNX config for a model with past fills input_ids with only 1 token, which leads to export troubles that require patching the model and treating each new case separately (in the latest optimum there are patchers per model; previously there was just a function that checked the model type and applied the patch to the PyTorch model). In the majority of cases this could be avoided if input_ids had 2 tokens instead of 1. It also uses a dynamic batch but a static sequence length (=1) for this input, which requires extra model reshaping before loading to make the sequence length dynamic.
  3. Simplifying the flow for enabling new models. We really like optimum and all its features for the smooth user experience (thank you very much for everything you do) and recommend it to our OpenVINO users as the main path for running inference on HF models with OpenVINO, so we are very interested in extending the set of supported models and having the latest trending models runnable with OpenVINO. Support for exporting models via the CLI and API opens the door to converting everything supported in optimum directly to OpenVINO (even if we do not have an OVModelForXXX class for it), which is a great step for us. But today it is not enough to just install optimum-intel from git to get the latest models available on the optimum side; it also requires installing optimum from git, and there is no guarantee that the two are synchronized. For example, the changes with position_ids you already highlighted, or another thing I recently found while trying to run the Mistral model: for some models it is no longer enough to specify only with_past=True to get a model with past in both inputs and outputs; an additional with_past_in_inputs flag has to be passed in the config. We would have to wait for the next official release to align and get new models supported, which may be inconvenient for us.

We do not duplicate code (in the majority of cases the config mapping is filled at runtime, just reusing the same ONNX config; only when we have our own OpenVINO-specific one do we use it instead); we just allow overriding some export configs where applicable, roughly as sketched below.
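A minimal sketch of that fallback, assuming the TasksManager._SUPPORTED_MODEL_TYPE layout used in this PR (resolve_export_config is a hypothetical helper, not code from the PR):

from optimum.exporters.tasks import TasksManager


def resolve_export_config(model_type: str, task: str):
    # Prefer a config registered for the "openvino" backend if one exists,
    # otherwise fall back to the existing ONNX config for the same task.
    backends = TasksManager._SUPPORTED_MODEL_TYPE.get(model_type, {})
    openvino_configs = backends.get("openvino", {})
    if task in openvino_configs:
        return openvino_configs[task]
    return backends["onnx"][task]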

return {
    "input_ids": input_ids,
    "past_key_values": past_key_values,
    "use_cache": self.use_cache,
    "attention_mask": kwargs.get("attention_mask", None),
    "position_ids": position_ids,
Collaborator

thanks for adding the position_ids support added in huggingface/optimum#1381
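For reference, position_ids for decoding are commonly derived from the attention mask and truncated to the newly fed tokens when a past is present; a generic sketch (not necessarily what this model's code does exactly):

import torch


def make_position_ids(attention_mask: torch.Tensor, input_ids: torch.Tensor, has_past: bool) -> torch.Tensor:
    # Cumulative sum over the mask gives 0-based positions of real tokens;
    # padded positions are clamped to 1, as in transformers' generation code.
    position_ids = attention_mask.long().cumsum(-1) - 1
    position_ids.masked_fill_(attention_mask == 0, 1)
    if has_past:
        # With a past, only the positions of the current step's tokens are needed.
        position_ids = position_ids[:, -input_ids.shape[1] :]
    return position_ids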

Comment on lines +23 to +49
def create_register(overwrite_existing: bool = False):
    def wrapper(model_type: str, *supported_tasks: str) -> Callable[[Type], Type]:
        def decorator(config_cls: Type) -> Type:
            mapping = TasksManager._SUPPORTED_MODEL_TYPE.get(model_type, {})
            mapping_backend = mapping.get("openvino", {})
            for task in supported_tasks:
                normalized_task = task
                if "-with-past" in task:
                    normalized_task = task.split("-with-past")[0]
                if normalized_task not in TasksManager.get_all_tasks():
                    known_tasks = ", ".join(TasksManager.get_all_tasks())
                    raise ValueError(
                        f'The TasksManager does not know the task called "{task}", known tasks: {known_tasks}.'
                    )
                if not overwrite_existing and task in mapping_backend:
                    continue
                mapping_backend[task] = make_backend_config_constructor_for_task(config_cls, task)
            mapping["openvino"] = mapping_backend
            TasksManager._SUPPORTED_MODEL_TYPE[model_type] = mapping
            return config_cls

        return decorator

    return wrapper


register_in_tasks_manager = create_register(True)
Collaborator

could it be simplified with:

Suggested change
register_in_tasks_manager = TasksManager.create_register("openvino")


@register_in_tasks_manager("chatglm", *["text-generation", "text-generation-with-past"])
class ChatGLM2OpenVINOConfig(TextDecoderOnnxConfig):
    NORMALIZED_CONFIG_CLASS = ChatGLM2NormalizedConfig
Collaborator

not sure we need ChatGLM2NormalizedConfig

Suggested change
NORMALIZED_CONFIG_CLASS = ChatGLM2NormalizedConfig
NORMALIZED_CONFIG_CLASS = NormalizedConfig.with_args(
    vocab_size="padded_vocab_size",
    num_layers="num_layers",
)
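For context, a small sketch of what such a with_args mapping does at attribute-lookup time (the config values below are made up; NormalizedTextConfig is used here since it already defines the vocab_size and num_layers attributes):

from optimum.utils import NormalizedTextConfig
from transformers import PretrainedConfig

# Made-up config values, only to show which attributes the mapping redirects to.
fake_chatglm_config = PretrainedConfig(padded_vocab_size=65024, num_layers=28)

NormalizedChatGLM = NormalizedTextConfig.with_args(
    vocab_size="padded_vocab_size",
    num_layers="num_layers",
)
normalized = NormalizedChatGLM(fake_chatglm_config)
print(normalized.vocab_size)  # reads config.padded_vocab_size -> 65024
print(normalized.num_layers)  # reads config.num_layers -> 28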

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Collaborator

can be moved to optimum/intel/utils/input_generators.py

@eaidova
Collaborator Author

eaidova commented Feb 21, 2024

closed in favor of #568

@eaidova eaidova closed this Feb 21, 2024