Add fp16 and int8 to OpenVINO models and export CLI #443
Conversation
The documentation is not available anymore as the PR was closed or merged.
LGTM, thanks Ella!
Thanks @echarlaix! I tested it and it worked well. I'm wondering if we should restrict INT8 weight compression to tasks where we are reasonably confident it works well (LLMs), and show a message suggesting manual quantization for other tasks.
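For illustration, a minimal sketch of the gating suggested here. The helper name, the task allow-list, and the warning text are assumptions rather than this PR's actual code; only `nncf.compress_weights()` is the real NNCF entry point:

```python
import logging

import nncf

logger = logging.getLogger(__name__)

# Illustrative allow-list (an assumption): tasks where INT8 weight
# compression is known to preserve accuracy well (LLM-style generation).
_INT8_SAFE_TASKS = {"text-generation", "text-generation-with-past"}


def maybe_compress_weights(ov_model, task: str):
    """Apply INT8 weight compression only for tasks on the allow-list."""
    if task in _INT8_SAFE_TASKS:
        # In-memory 8-bit weight compression via NNCF.
        return nncf.compress_weights(ov_model)
    logger.warning(
        "INT8 weight compression is only applied automatically for LLM tasks. "
        "For task '%s', consider quantizing manually with NNCF.",
        task,
    )
    return ov_model
```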
```python
    model_kwargs=model_kwargs,
)
del models_and_onnx_configs

if int8:
```
One comment from my side. Such an implementation means that we will:
- Convert from PyTorch to OpenVINO with memory reuse (no copy).
- However, after that, serialize to disk, load the model again, quantize the weights, remove the files from disk, and finally store the compressed version.
- This can be time-consuming for really large models.

Ideally, we should call nncf.compress_weights() right after openvino.convert_model(), similar to FP16.
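A minimal sketch of that flow, assuming a toy torch.nn.Linear stands in for the real exported model; `openvino.convert_model`, `nncf.compress_weights`, and `openvino.save_model` are real public APIs, but the surrounding glue is illustrative:

```python
import nncf
import openvino as ov
import torch

# A toy stand-in for the exported PyTorch model (placeholder).
pt_model = torch.nn.Linear(16, 16).eval()
example_input = torch.randn(1, 16)

# Convert to OpenVINO in memory (no intermediate files on disk).
ov_model = ov.convert_model(pt_model, example_input=example_input)

# Compress weights to INT8 right after conversion, so the model is
# serialized to disk exactly once instead of the
# save/load/quantize/delete/re-save round-trip described above.
ov_model = nncf.compress_weights(ov_model)

# FP16 conversion is left off since the weights are already compressed.
ov.save_model(ov_model, "openvino_model.xml", compress_to_fp16=False)
```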
I opened a follow-up PR #444 for the default 8-bit compression. I tried to follow the approach described above.
@eaidova, please take a look as well.
> Ideally, we should call nncf.compress_weights() right after openvino.convert_model(), similar to FP16.

Yes, that's a good point. I was going to modify this PR, but I see that you opened #444, thanks @AlexKoff88!
LGTM, thanks!
As discussed in #437.