This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Migrate SQ and WOQ to INC 3.x API. #1606

Merged
merged 38 commits into main from wangchang/inc3.x on Jul 11, 2024

Conversation

changwangss (Contributor) commented Jun 12, 2024

Type of Change

Migrate the SQ and WOQ features to the INC 3.x API.
CI changes:
NeuralChat and the engine are no longer updated, so remove them from CI.
WOQ changes:

  1. Remove the fp4_e2m1_bnb weight dtype.
  2. Support loading INC-compressed models with the nf4 and fp4 weight dtypes (see the sketch below).
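For illustration, a minimal sketch of 4-bit nf4 weight-only quantization through the ITREX frontend, which after this PR is backed by the INC 3.x weight-only modules. `RtnConfig` and its parameters follow the ITREX weight-only docs and should be treated as assumptions, not as this PR's diff:

```python
# Hedged sketch: 4-bit nf4 weight-only quantization via the ITREX frontend.
# RtnConfig, weight_dtype, and group_size are assumptions from the ITREX
# weight-only docs, not taken from this PR's changes.
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, RtnConfig

woq_config = RtnConfig(bits=4, weight_dtype="nf4", group_size=32)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",            # any Hugging Face causal-LM checkpoint
    quantization_config=woq_config,
)
```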

SQ changes:
Use the INC 3.x API (a hedged sketch follows).
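For reference, a minimal sketch of the INC 3.x prepare/calibrate/convert flow that SQ migrates onto; the entry points live under `neural_compressor.torch` (the same subpackage the new WOQ import path uses). The toy model and the exact `SmoothQuantConfig` arguments are illustrative assumptions, not this PR's code:

```python
# Hedged sketch of the INC 3.x SmoothQuant flow. Exact config arguments are
# assumptions; INC's SmoothQuant path typically also requires
# intel_extension_for_pytorch as the backend.
import torch
from neural_compressor.torch.quantization import SmoothQuantConfig, prepare, convert

model = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.ReLU())  # stand-in FP32 model
example_inputs = torch.randn(1, 32)

quant_config = SmoothQuantConfig(alpha=0.5)                     # smoothing strength
prepared = prepare(model, quant_config, example_inputs=example_inputs)
prepared(example_inputs)                                        # one calibration pass
q_model = convert(prepared)                                     # final quantized model
```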

Description

detail description
JIRA ticket: xxx

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

changwangss and others added 23 commits May 7, 2024 01:49
github-actions bot commented Jun 12, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow
Check ID Status Error details
format-scan (pylint) success
format-scan (bandit) success
format-scan (cloc) success
format-scan (cpplint) success

These checks are required after the changes to intel_extension_for_transformers/neural_chat/examples/finetuning/multi_modal/train.py, intel_extension_for_transformers/neural_chat/models/model_utils.py, intel_extension_for_transformers/transformers/llm/evaluation/models.py, intel_extension_for_transformers/transformers/llm/quantization/autograd/functions.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/sq_utils.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, intel_extension_for_transformers/transformers/utils/utility.py.

🟢 Optimize Unit Test workflow
Check ID Status Error details
optimize-unit-test-baseline success
optimize-unit-test-PR-test success
Genreate-OptimizeUT-Report success

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/evaluation/models.py, intel_extension_for_transformers/transformers/llm/quantization/autograd/functions.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/sq_utils.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, intel_extension_for_transformers/transformers/utils/utility.py, tests/CI/test_quantization.py, tests/CI/test_weight_only.py, tests/CI/test_weight_only_gpu.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact VincyZhang or XuehaoSun for help.

changwangss (Contributor, Author) commented

@XuehaoSun please update the CI to support INC 3.x API installation.

changwangss and others added 3 commits June 13, 2024 19:37
@XuehaoSun XuehaoSun requested a review from VincyZhang as a code owner June 14, 2024 03:29
chensuyue (Contributor) commented

Any update for the SQ part?

@changwangss changwangss removed the WIP label Jul 11, 2024
@XuehaoSun XuehaoSun merged commit a864bb2 into main Jul 11, 2024
14 checks passed
@XuehaoSun XuehaoSun deleted the wangchang/inc3.x branch July 11, 2024 05:43
airMeng (Contributor) commented Jul 12, 2024

Is the following error related to this PR?

Traceback (most recent call last):
  File "~/frameworks.ai.pytorch.ipex-gpu/examples/gpu/inference/python/llm/run_generation_woq.py", line 19, in <module>
    from intel_extension_for_transformers.transformers.modeling import AutoModelForCausalLM
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/__init__.py", line 44, in <module>
    from .modeling import (
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/modeling/__init__.py", line 21, in <module>
    from .modeling_auto import (AutoModel, AutoModelForCausalLM,
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/modeling/modeling_auto.py", line 63, in <module>
    from ..llm.quantization.utils import (
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/llm/quantization/utils.py", line 26, in <module>
    from neural_compressor.torch.algorithms.weight_only.modules import WeightOnlyLinear
ModuleNotFoundError: No module named 'neural_compressor.torch'
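
If it is related: the new WOQ path imports `neural_compressor.torch` (see `quantization/utils.py` in the traceback), which only ships with INC 3.x, so an older neural-compressor install fails exactly like this. A hypothetical up-front guard, not from the PR:

```python
# Hypothetical guard, not from the PR: fail fast with a clear message when
# the installed neural-compressor predates the 3.x torch subpackage.
try:
    import neural_compressor.torch  # noqa: F401  # present only in INC 3.x
except ModuleNotFoundError as err:
    raise RuntimeError(
        "This code path requires Intel Neural Compressor 3.x "
        "(neural_compressor.torch not found); try "
        "`pip install --upgrade neural-compressor`."
    ) from err
```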
