Add openvino export configs and support chatglm #454
Conversation
Force-pushed from ce9d0cb to e57baac
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
class ChatGLM2DummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
I think we can move it to optimum directly as it's not specific to OpenVINO
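For illustration, such a generator could look roughly like the minimal sketch below, building on optimum's DummyPastKeyValuesGenerator. The cache layout ([sequence, batch, kv_heads, head_dim]) and the config attribute names (multi_query_group_num, kv_channels) are assumptions based on ChatGLM2, not the PR's exact code.

from optimum.utils import DummyPastKeyValuesGenerator


class ChatGLM2DummyPastKeyValuesGenerator(DummyPastKeyValuesGenerator):
    def __init__(self, task, normalized_config, **kwargs):
        super().__init__(task, normalized_config, **kwargs)
        # ChatGLM2 uses multi-query attention, so the cache has only a few KV heads
        # (attribute names are assumptions based on the ChatGLM2 config).
        self.multi_query_group_num = normalized_config.multi_query_group_num
        self.head_dim = normalized_config.kv_channels

    def generate(self, input_name, framework="pt", **kwargs):
        # One (key, value) pair per layer, shaped [seq, batch, kv_heads, head_dim].
        shape = (self.sequence_length, self.batch_size, self.multi_query_group_num, self.head_dim)
        return [
            (
                self.random_float_tensor(shape, framework=framework),
                self.random_float_tensor(shape, framework=framework),
            )
            for _ in range(self.num_layers)
        ]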
@register_normalized_config("chatglm")
class ChatGLM2NormalizedConfig(NormalizedTextConfig):
Same, we could move it to optimum.
)
else:
    onnx_config_constructor = TasksManager.get_exporter_config_constructor(
        model=model, exporter="openvino", task=task
Here I would prefer we keep the ONNX config so that we don't duplicate code, as I'm unsure why we would need a separate one for now. Is there a specific reason for that, @eaidova?
The main idea of this PR is to allow overriding the ONNX configs for OpenVINO.
I see three reasons for that:
- Adding configurations for new models currently requires guaranteeing that the model can be exported to ONNX (and possibly even providing a pipeline for running it with ORT). However, we no longer use ONNX export as the default path for exporting models to OpenVINO. The set of torch operations supported by OpenVINO can differ from the set supported by the torch-to-ONNX exporter. For example, we successfully convert the TAPAS model family to OpenVINO (e.g. https://huggingface.co/google/tapas-tiny-finetuned-wtq), but ONNX export fails with an unsupported aten::scatter_reduce.
- Unblocking OpenVINO-specific optimizations of existing models. For the same reasons that we do not export models to ONNX and have our own inference pipelines, at some point the ONNX and OpenVINO export paths may diverge and require different configurations (for example, if I want to merge the two decoders of a seq2seq model in a way that is convenient for OpenVINO, or, in some of our other plans, to try exporting causal LM models with beam search included inside the model). A simple example where the ONNX configuration is not a perfect fit for us is text-generation-with-past: if I understand correctly, the default ONNX path is a merged model, while we use the model-with-past. The ONNX configs for a model with past fill input_ids with only 1 token in this case, which leads to export troubles that require model patching and treating each new case separately (in the latest optimum there are patchers per model; previously there was just a function that checked the model type and applied the patch on the PyTorch model), but in the majority of cases this can be avoided if input_ids has 2 tokens instead of 1. It also uses only a dynamic batch and a static sequence length (=1) in this case, which leads to extra model reshaping before loading to make the sequence length dynamic for this input.
- Simplifying the flow for enabling new models. We really like optimum and all its features for its smooth user experience (thank you very much for everything you do) and recommend it to our OpenVINO users as the main path for running inference on HF models with OpenVINO, so we are very interested in extending the set of supported models and having the latest trending models available with OpenVINO. CLI and API support for exporting models opens the door to converting everything supported in optimum directly to OpenVINO (even if we do not have an OVModelForXXX class for it), and that is a great step for us. But right now it is not enough to just install optimum-intel from git to get the latest available models from the optimum side; it also requires installing optimum from git, and there is no guarantee that the two are synchronized. You already highlighted the changes with position_ids, for example; another thing I found recently while trying to run the Mistral model is that for some models it is no longer enough to specify with_past=True to get a model with past in both inputs and outputs, since an additional with_past_in_inputs flag must be passed in the config. Waiting for the next official release to align and get new models supported may be inconvenient for us.
We do not duplicate code (in the majority of cases the config mapping is filled at runtime, just reusing the same ONNX config; only when we have an OpenVINO-specific one do we use our own); we just allow overriding some export configs where applicable.
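For illustration, the runtime lookup described above could boil down to something like the minimal sketch below (the model_type -> backend -> task layout mirrors the create_register code further down in this PR; the helper itself is hypothetical, not the PR's code):

from optimum.exporters.tasks import TasksManager


def get_export_config_constructor(model_type: str, task: str):
    # Prefer an OpenVINO-specific export config when one has been registered,
    # otherwise fall back to the shared ONNX config.
    backends = TasksManager._SUPPORTED_MODEL_TYPE.get(model_type, {})
    if task in backends.get("openvino", {}):
        return backends["openvino"][task]
    return backends["onnx"][task]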
return {
    "input_ids": input_ids,
    "past_key_values": past_key_values,
    "use_cache": self.use_cache,
    "attention_mask": kwargs.get("attention_mask", None),
    "position_ids": position_ids,  # previously "position_ids": None
thanks for adding the position_ids
support added in huggingface/optimum#1381
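For context, position_ids are typically derived from the attention mask along these lines (a generic sketch of the usual transformers pattern, not necessarily the exact code in huggingface/optimum#1381):

import torch


def build_position_ids(input_ids, attention_mask=None, past_key_values=None):
    # Cumulative sum over the mask gives increasing positions that skip padding;
    # with a cache, only the last token's position is needed.
    if attention_mask is not None:
        position_ids = attention_mask.long().cumsum(-1) - 1
        position_ids.masked_fill_(attention_mask == 0, 1)
        if past_key_values is not None:
            position_ids = position_ids[:, -1:]
    else:
        position_ids = torch.arange(input_ids.shape[-1], dtype=torch.long).unsqueeze(0)
    return position_ids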
Force-pushed from ce6cdaf to e596cc7
def create_register(overwrite_existing: bool = False):
    def wrapper(model_type: str, *supported_tasks: str) -> Callable[[Type], Type]:
        def decorator(config_cls: Type) -> Type:
            # Register config_cls under the "openvino" backend entry of the
            # TasksManager mapping for this model type.
            mapping = TasksManager._SUPPORTED_MODEL_TYPE.get(model_type, {})
            mapping_backend = mapping.get("openvino", {})
            for task in supported_tasks:
                # Validate the task name, ignoring any "-with-past" suffix.
                normalized_task = task
                if "-with-past" in task:
                    normalized_task = task.split("-with-past")[0]
                if normalized_task not in TasksManager.get_all_tasks():
                    known_tasks = ", ".join(TasksManager.get_all_tasks())
                    raise ValueError(
                        f'The TasksManager does not know the task called "{task}", known tasks: {known_tasks}.'
                    )
                # Do not replace an existing registration unless explicitly requested.
                if not overwrite_existing and task in mapping_backend:
                    continue
                mapping_backend[task] = make_backend_config_constructor_for_task(config_cls, task)
            mapping["openvino"] = mapping_backend
            TasksManager._SUPPORTED_MODEL_TYPE[model_type] = mapping
            return config_cls

        return decorator

    return wrapper


register_in_tasks_manager = create_register(True)
Could it be simplified with:
register_in_tasks_manager = TasksManager.create_register("openvino") |
@register_in_tasks_manager("chatglm", *["text-generation", "text-generation-with-past"])
class ChatGLM2OpenVINOConfig(TextDecoderOnnxConfig):
    NORMALIZED_CONFIG_CLASS = ChatGLM2NormalizedConfig
not sure we need ChatGLM2NormalizedConfig
NORMALIZED_CONFIG_CLASS = NormalizedConfig.with_args(
    vocab_size="padded_vocab_size",
    num_layers="num_layers",
)
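Putting the pieces together, the export config might end up looking roughly like the sketch below; the dummy-input generator classes and the opset value are assumptions, a ChatGLM2-specific past-KV generator (like the one sketched earlier in this thread) would normally replace the stock one, and the @register_in_tasks_manager decorator from the diff above would sit on top of the class.

from optimum.exporters.onnx.config import TextDecoderOnnxConfig
from optimum.utils import DummyPastKeyValuesGenerator, DummyTextInputGenerator, NormalizedTextConfig


class ChatGLM2OpenVINOConfig(TextDecoderOnnxConfig):
    NORMALIZED_CONFIG_CLASS = NormalizedTextConfig.with_args(
        vocab_size="padded_vocab_size",
        num_layers="num_layers",
    )
    # Generators used to build dummy inputs for export; a ChatGLM2-specific
    # past-KV generator would replace the stock DummyPastKeyValuesGenerator here.
    DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyPastKeyValuesGenerator)
    DUMMY_PKV_GENERATOR_CLASS = DummyPastKeyValuesGenerator
    DEFAULT_ONNX_OPSET = 13  # assumed value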
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
can be moved to optimum/intel/utils/input_generators.py
Closed in favor of #568.
What does this PR do?
Introduces export configs for OpenVINO. The ChatGLM2 model is used as an example to demonstrate that this API extension works.
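A hypothetical usage sketch of the extended exporter follows; the module path, function name, and arguments are assumptions and may differ from what eventually shipped:

from optimum.exporters.openvino import main_export  # assumed entry point

main_export(
    model_name_or_path="THUDM/chatglm2-6b",
    output="chatglm2_openvino",
    task="text-generation-with-past",
    trust_remote_code=True,  # ChatGLM checkpoints ship custom modeling code
)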
Before submitting