This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Migrate SQ and WOQ to INC 3.x API. #1606

Merged
merged 38 commits into main from wangchang/inc3.x on Jul 11, 2024

Conversation

changwangss (Contributor) commented Jun 12, 2024

Type of Change

Migrate the SQ and WOQ features to the INC 3.x API.
CI changes:
NeuralChat and the engine are no longer updated, so remove them from CI.
WOQ changes:

  1. Remove the fp4_e2m1_bnb weight dtype.
  2. Support loading INC-compressed models with the nf4 and fp4 weight dtypes (see the sketch below).
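For illustration, a minimal sketch of 4-bit nf4 weight-only quantization through the ITREX frontend, which after this PR is backed by the INC 3.x weight-only modules. `RtnConfig` and its parameters follow the ITREX weight-only docs and should be treated as assumptions, not as this PR's diff:

```python
# Hedged sketch: 4-bit nf4 weight-only quantization via the ITREX frontend.
# RtnConfig, weight_dtype, and group_size are assumptions from the ITREX
# weight-only docs, not taken from this PR's changes.
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, RtnConfig

woq_config = RtnConfig(bits=4, weight_dtype="nf4", group_size=32)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",            # any Hugging Face causal-LM checkpoint
    quantization_config=woq_config,
)
```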

SQ changes:
Use the INC 3.x API (a hedged sketch follows).
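For reference, a minimal sketch of the INC 3.x prepare/calibrate/convert flow that SQ migrates onto; the entry points live under `neural_compressor.torch` (the same subpackage the new WOQ import path uses). The toy model and the exact `SmoothQuantConfig` arguments are illustrative assumptions, not this PR's code:

```python
# Hedged sketch of the INC 3.x SmoothQuant flow. Exact config arguments are
# assumptions; INC's SmoothQuant path typically also requires
# intel_extension_for_pytorch as the backend.
import torch
from neural_compressor.torch.quantization import SmoothQuantConfig, prepare, convert

model = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.ReLU())  # stand-in FP32 model
example_inputs = torch.randn(1, 32)

quant_config = SmoothQuantConfig(alpha=0.5)                     # smoothing strength
prepared = prepare(model, quant_config, example_inputs=example_inputs)
prepared(example_inputs)                                        # one calibration pass
q_model = convert(prepared)                                     # final quantized model
```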

Description

detail description
JIRA ticket: xxx

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

changwangss and others added 23 commits May 7, 2024 01:49
github-actions bot commented Jun 12, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow
Check ID Status Error details
format-scan (pylint) success
format-scan (bandit) success
format-scan (cloc) success
format-scan (cpplint) success

These checks are required after the changes to intel_extension_for_transformers/neural_chat/examples/finetuning/multi_modal/train.py, intel_extension_for_transformers/neural_chat/models/model_utils.py, intel_extension_for_transformers/transformers/llm/evaluation/models.py, intel_extension_for_transformers/transformers/llm/quantization/autograd/functions.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/sq_utils.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, intel_extension_for_transformers/transformers/utils/utility.py.

🟢 Optimize Unit Test workflow
Check ID Status Error details
optimize-unit-test-baseline success
optimize-unit-test-PR-test success
Genreate-OptimizeUT-Report success

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/evaluation/models.py, intel_extension_for_transformers/transformers/llm/quantization/autograd/functions.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/sq_utils.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, intel_extension_for_transformers/transformers/utils/utility.py, tests/CI/test_quantization.py, tests/CI/test_weight_only.py, tests/CI/test_weight_only_gpu.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact VincyZhang or XuehaoSun for help.

changwangss (Contributor, Author) commented

@XuehaoSun please update the CI to support INC 3.x API installation.

changwangss and others added 3 commits June 13, 2024 19:37
@XuehaoSun XuehaoSun requested a review from VincyZhang as a code owner June 14, 2024 03:29
chensuyue (Contributor) commented

Any update for the SQ part?

@changwangss changwangss removed the WIP label Jul 11, 2024
@XuehaoSun XuehaoSun merged commit a864bb2 into main Jul 11, 2024
14 checks passed
@XuehaoSun XuehaoSun deleted the wangchang/inc3.x branch July 11, 2024 05:43
airMeng (Contributor) commented Jul 12, 2024

Is the following error related to this PR?

Traceback (most recent call last):
  File "~/frameworks.ai.pytorch.ipex-gpu/examples/gpu/inference/python/llm/run_generation_woq.py", line 19, in <module>
    from intel_extension_for_transformers.transformers.modeling import AutoModelForCausalLM
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/__init__.py", line 44, in <module>
    from .modeling import (
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/modeling/__init__.py", line 21, in <module>
    from .modeling_auto import (AutoModel, AutoModelForCausalLM,
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/modeling/modeling_auto.py", line 63, in <module>
    from ..llm.quantization.utils import (
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/llm/quantization/utils.py", line 26, in <module>
    from neural_compressor.torch.algorithms.weight_only.modules import WeightOnlyLinear
ModuleNotFoundError: No module named 'neural_compressor.torch'
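
If it is related: the new WOQ path imports `neural_compressor.torch` (see `quantization/utils.py` in the traceback), which only ships with INC 3.x, so an older neural-compressor install fails exactly like this. A hypothetical up-front guard, not from the PR:

```python
# Hypothetical guard, not from the PR: fail fast with a clear message when
# the installed neural-compressor predates the 3.x torch subpackage.
try:
    import neural_compressor.torch  # noqa: F401  # present only in INC 3.x
except ModuleNotFoundError as err:
    raise RuntimeError(
        "This code path requires Intel Neural Compressor 3.x "
        "(neural_compressor.torch not found); try "
        "`pip install --upgrade neural-compressor`."
    ) from err
```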
