Integrate weight-only quantization of INC #417

Merged: 10 commits, Sep 15, 2023

mengniwang95
Contributor

This PR integrates the weight-only quantization of Neural Compressor (INC) into optimum-intel.

Note: testing requires the neural-compressor master branch.
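
For readers unfamiliar with the term, weight-only quantization stores a model's weights as low-bit integers (e.g. INT4 or INT8) plus a floating-point scale, while activations stay in floating point. The following is a minimal, generic sketch of symmetric round-to-nearest quantization in plain Python. It is not the Neural Compressor or optimum-intel implementation, and the function names `quantize_weights_rtn` and `dequantize_weights` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def quantize_weights_rtn(weights: List[float], bits: int = 4) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest quantization of one group of weights.

    Hypothetical helper for illustration only; not an INC API.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for INT4, 127 for INT8
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale per group; fall back to 1.0 when every weight is zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the signed integer range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize_weights(q: List[int], scale: float) -> List[float]:
    """Recover approximate floating-point weights from integers and scale."""
    return [v * scale for v in q]
```

With this scheme the reconstruction error of each weight is at most half a quantization step (`scale / 2`), which is why lower bit widths trade accuracy for memory.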

@hshen14
Collaborator

hshen14 commented Aug 27, 2023

@echarlaix Could you please help review this PR? INC supports production-quality weight-only quantization, including INT8 and INT4, for LLMs in the latest master (it will also be released in INC v2.3 in early September). Thanks.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Aug 27, 2023

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@echarlaix echarlaix left a comment

Thanks a lot for the addition @mengniwang95

Collaborator

@echarlaix echarlaix left a comment

Looks great, thanks for the addition @mengniwang95 !

@mengniwang95
Contributor Author

Hi Ella, the UT currently fails because it doesn't use the latest master code. Do we need to wait for the neural-compressor 2.3 release to run the UT, and merge this PR only after all tests pass? @echarlaix

@echarlaix
Collaborator

> Hi Ella, the UT currently fails because it doesn't use the latest master code. Do we need to wait for the neural-compressor 2.3 release to run the UT, and merge this PR only after all tests pass? @echarlaix

When is the neural-compressor release planned? This PR is compatible with the latest INC release, so I'm OK with merging it now (the test can be added in another PR, with INC installed from source).

@mengniwang95
Contributor Author

> > Hi Ella, the UT currently fails because it doesn't use the latest master code. Do we need to wait for the neural-compressor 2.3 release to run the UT, and merge this PR only after all tests pass? @echarlaix
>
> When is the neural-compressor release planned? This PR is compatible with the latest INC release, so I'm OK with merging it now (the test can be added in another PR, with INC installed from source).

Hi Ella, the neural-compressor release is planned for 9/15. I added an INT4 UT in this branch, but it is not triggered because the installed neural-compressor version is < 2.3.

@hshen14
Collaborator

hshen14 commented Sep 14, 2023

@echarlaix it seems some tests failed, though they may not be related to the changes. Could you please check, or is it okay to merge this PR?

@echarlaix
Collaborator

> @echarlaix it seems some tests failed, though they may not be related to the changes. Could you please check, or is it okay to merge this PR?

Could you update your branch by rebasing on main? This will fix all the unrelated tests. The INC tests are failing. Since the release is planned for tomorrow, I think we should install neural-compressor from source here to verify that all the tests pass, and then we can merge.

@echarlaix echarlaix merged commit a7782ae into huggingface:main Sep 15, 2023
10 checks passed
4 participants