Squashed commit of the following:
commit 2bb257e
Author: Kaihui-intel <[email protected]>
Date:   Thu Oct 10 19:27:11 2024 +0800

    Add woq examples (#1982)

    Signed-off-by: Kaihui-intel <[email protected]>
    Signed-off-by: Sun, Xuehao <[email protected]>
    Co-authored-by: Sun, Xuehao <[email protected]>

commit 586eb88
Author: Huang, Tai <[email protected]>
Date:   Wed Oct 9 09:22:39 2024 +0800

    add transformers-like api link in readme (#2022)

    Signed-off-by: Huang, Tai <[email protected]>

commit 4e9c764
Author: Kaihui-intel <[email protected]>
Date:   Tue Oct 8 13:13:45 2024 +0800

    Remove itrex dependency for 3x example (#2016)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: Sun, Xuehao <[email protected]>

commit a0066d4
Author: Kaihui-intel <[email protected]>
Date:   Mon Sep 30 18:17:32 2024 +0800

    Fix transformers rtn layer-wise quant (#2008)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 802a5af
Author: Huang, Tai <[email protected]>
Date:   Mon Sep 30 17:02:52 2024 +0800

    add autoround EMNLP24 to pub list (#2014)

    Signed-off-by: Huang, Tai <[email protected]>

commit 44795a1
Author: Kaihui-intel <[email protected]>
Date:   Mon Sep 30 16:55:22 2024 +0800

    Adapt transformers 4.45.1 (#2019)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: changwangss <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit d4662ad
Author: Kaihui-intel <[email protected]>
Date:   Mon Sep 30 15:52:17 2024 +0800

    Add transformers-like api doc (#2018)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 72398b6
Author: Wang, Chang <[email protected]>
Date:   Fri Sep 27 15:11:04 2024 +0800

    fix xpu device set weight and bias (#2010)

    Signed-off-by: changwangss <[email protected]>
    Co-authored-by: Sun, Xuehao <[email protected]>

commit 9d27743
Author: Sun, Xuehao <[email protected]>
Date:   Fri Sep 27 14:17:24 2024 +0800

    Update model accuracy (#2006)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit 7bbc473
Author: xinhe <[email protected]>
Date:   Fri Sep 27 11:47:00 2024 +0800

    add pad_to_buckets in evaluation for hpu performance (#2011)

    * add pad_to_buckets in evaluation for hpu performance
    ---------

    Signed-off-by: xin3he <[email protected]>
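For context, padding inputs up to a fixed set of bucket sizes keeps the number of distinct tensor shapes small, which matters for static-shape accelerators such as HPU that recompile the graph per shape. A minimal sketch of the idea (the function below is illustrative, not the actual pad_to_buckets implementation):

```python
def pad_to_bucket(length, buckets):
    """Return the smallest bucket size that fits `length`.

    Padding every input to one of a few bucket sizes bounds the number
    of distinct shapes, and thus graph recompilations, on the device.
    Illustrative sketch only.
    """
    for bucket in sorted(buckets):
        if length <= bucket:
            return bucket
    # Inputs longer than the largest bucket would need truncation or a
    # dynamic fallback; here we simply return the largest bucket.
    return max(buckets)
```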

commit b6b7d7c
Author: Kaihui-intel <[email protected]>
Date:   Thu Sep 26 17:21:54 2024 +0800

    Update auto_round requirements for transformers example (#2013)

    Signed-off-by: Kaihui-intel <[email protected]>

commit ee600ba
Author: Wang, Chang <[email protected]>
Date:   Fri Sep 20 13:54:06 2024 +0800

    add repack_awq_to_optimum_format function (#1998)

    Signed-off-by: changwangss <[email protected]>

commit 4ee6861
Author: Sun, Xuehao <[email protected]>
Date:   Thu Sep 19 22:27:05 2024 +0800

    remove accelerate version in unit test (#2007)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit 2445811
Author: WeiweiZhang1 <[email protected]>
Date:   Sat Sep 14 18:13:30 2024 +0800

    enable auto_round format export (#2002)

    Signed-off-by: Zhang, Weiwei1 <[email protected]>

commit 906333a
Author: Kaihui-intel <[email protected]>
Date:   Sat Sep 14 16:17:46 2024 +0800

    Replace FORCE_DEVICE with INC_TARGET_DEVICE [transformers] (#2005)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 443d007
Author: xinhe <[email protected]>
Date:   Fri Sep 13 21:35:32 2024 +0800

    add INC_FORCE_DEVICE introduction (#1988)

    * add INC_FORCE_DEVICE introduction

    Signed-off-by: xin3he <[email protected]>

    * Update PyTorch.md

    * Update PyTorch.md

    * Update docs/source/3x/PyTorch.md

    Co-authored-by: Yi Liu <[email protected]>

    * rename to INC_TARGET_DEVICE

    Signed-off-by: xin3he <[email protected]>

    ---------

    Signed-off-by: xin3he <[email protected]>
    Co-authored-by: Yi Liu <[email protected]>

commit 5de9a4f
Author: Kaihui-intel <[email protected]>
Date:   Fri Sep 13 20:48:22 2024 +0800

    Support transformers-like api for woq quantization (#1987)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Wang, Chang <[email protected]>

commit 9c39b42
Author: chen, suyue <[email protected]>
Date:   Thu Sep 12 14:34:49 2024 +0800

    update docker image prune rules (#2003)

    Signed-off-by: chensuyue <[email protected]>

commit 09d4f2d
Author: Huang, Tai <[email protected]>
Date:   Mon Sep 9 09:24:35 2024 +0800

    Add recent publications (#1995)

    * add recent publications

    Signed-off-by: Huang, Tai <[email protected]>

    * update total count

    Signed-off-by: Huang, Tai <[email protected]>

    ---------

    Signed-off-by: Huang, Tai <[email protected]>

commit 399cd44
Author: Kaihui-intel <[email protected]>
Date:   Tue Sep 3 16:37:09 2024 +0800

    Remove the save of gptq config (#1993)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 05272c4
Author: Yi Liu <[email protected]>
Date:   Tue Sep 3 10:21:51 2024 +0800

    add per_channel_minmax (#1990)

    Signed-off-by: yiliu30 <[email protected]>
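Per-channel min-max observation computes a separate scale and zero-point from each output channel's min/max rather than a single pair for the whole tensor. A hedged sketch of that arithmetic (pure Python, illustrative only; not INC's observer code):

```python
def per_channel_minmax(weights, n_bits=8):
    """Asymmetric min-max quantization parameters, one pair per channel.

    weights: list of channels, each a list of floats.
    Illustrative sketch only, not INC's actual observer.
    """
    qmax = (1 << n_bits) - 1
    params = []
    for channel in weights:
        lo, hi = min(channel), max(channel)
        scale = (hi - lo) / qmax or 1.0  # guard against constant channels
        zero_point = round(-lo / scale)
        params.append((scale, zero_point))
    return params
```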

commit 82d8c06
Author: chen, suyue <[email protected]>
Date:   Fri Aug 30 21:21:00 2024 +0800

    update 3x pt binary build (#1992)

    Signed-off-by: chensuyue <[email protected]>

commit e9f06af
Author: Huang, Tai <[email protected]>
Date:   Fri Aug 30 17:49:48 2024 +0800

    Update installation_guide.md (#1989)

    Correct typo in installation doc

commit 093c966
Author: Wang, Chang <[email protected]>
Date:   Fri Aug 30 17:45:54 2024 +0800

    add quantize, save, load function for transformers-like api (#1986)

    Signed-off-by: changwangss <[email protected]>

commit 4dd49a4
Author: xinhe <[email protected]>
Date:   Thu Aug 29 17:23:18 2024 +0800

    add hasattr check for torch fp8 dtype (#1985)

    Signed-off-by: xin3he <[email protected]>

commit f2c454f
Author: chen, suyue <[email protected]>
Date:   Thu Aug 29 13:45:39 2024 +0800

    update installation and ci test for 3x api (#1991)

    Signed-off-by: chensuyue <[email protected]>

commit 7ba9fdc
Author: Kaihui-intel <[email protected]>
Date:   Mon Aug 19 14:50:50 2024 +0800

    support gptq `true_sequential` and `quant_lm_head` (#1977)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 68b1f8b
Author: Sun, Xuehao <[email protected]>
Date:   Fri Aug 16 09:43:46 2024 +0800

    Fix UT env and upgrade torch to 2.4.0 (#1978)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit f9dfd54
Author: Yi Liu <[email protected]>
Date:   Thu Aug 15 14:13:26 2024 +0800

    Skip some tests for torch 2.4 (#1981)

    Signed-off-by: yiliu30 <[email protected]>

commit 46d9192
Author: xinhe <[email protected]>
Date:   Thu Aug 15 09:57:22 2024 +0800

    update readme for fp8 (#1979)

    Signed-off-by: xinhe3 <[email protected]>

commit 842b715
Author: chen, suyue <[email protected]>
Date:   Tue Aug 13 12:09:25 2024 +0800

    bump main version into v3.1 (#1974)

    Signed-off-by: chensuyue <[email protected]>

commit 3845cdc
Author: Neo Zhang Jianyu <[email protected]>
Date:   Tue Aug 13 12:09:09 2024 +0800

    fix online doc search issue (#1975)

    Co-authored-by: ZhangJianyu <[email protected]>

commit 7056720
Author: chen, suyue <[email protected]>
Date:   Sun Aug 11 20:58:34 2024 +0800

    update main page (#1973)

    Signed-off-by: chensuyue <[email protected]>

commit 95197d1
Author: xinhe <[email protected]>
Date:   Sat Aug 10 23:28:43 2024 +0800

    Cherry pick v1.17.0 (#1964)

    * [SW-184941] INC CI, CD and Promotion

    Change-Id: I60c420f9776e1bdab7bb9e02e5bcbdb6891bfe52

    * [SW-183320]updated setup.py

    Change-Id: I592af89486cb1d9e0b5197521c428920197a9103

    * [SW-177474] add HQT FP8 porting code

    Change-Id: I4676f13a5ed43c444f2ec68675cc41335e7234dd
    Signed-off-by: Zhou Yuwen <[email protected]>

    * [SW-189361] Fix white list extend

    Change-Id: Ic2021c248798fce37710d28014a6d59259c868a3

    * [SW-191317] Raise exception according to hqt config object

    Change-Id: I06ba8fa912c811c88912987c11e5c12ef328348a

    * [SW-184714] Port HQT code into INC

    HQT lib content was copied as is under fp8_quant

    Tests were copied to 3.x torch location

    Change-Id: Iec6e1fa7ac4bf1df1c95b429524c40e32bc13ac9

    * [SW-184714] Add internal folder to fp8 quant

    This is a folder used for experiments,
    not to be used by users

    Change-Id: I9e221ae582794e304e95392c0f37638f7bce69bc

    * [SW-177468] Removed unused code + cleanup

    Change-Id: I4d27c067e87c1a30eb1da9df16a16c46d092c638

    * Fix errors in regression_detection

    Change-Id: Iee5318bd5593ba349812516eb5641958ece3c438

    * [SW-187731] Save orig module as member of patched module

    This allows direct use of the original module's methods,
    which solves a torch.compile issue

    Change-Id: I464d8bd1bacdfc3cd1f128a67114e1e43f092632

    * [SW-190899] Install packages according to configuration

    Change-Id: I570b490658f5d2c5399ba1db93f8f52f56449525

    * [SW-184689] use finalize_calibration internally for one-step flow

    Change-Id: Ie0b8b426c951cf57ed7e6e678c86813fb2d05c89

    * [SW-191945] align requirement_pt.txt in gerrit INC with Github INC

    Change-Id: If5c0dbf21bf989af37a8e29246e4f8760cd215ef
    Signed-off-by: xinhe3 <[email protected]>

    * [SW-192358] Remove HQT reference in INC

    Change-Id: Ic25f9323486596fa2dc6d909cd568a37ab84dd5e

    * [SW-191415] update fp8 maxAbs observer using torch.copy_

    Change-Id: I3923c832f9a8a2b14e392f3f4719d233a457702f

    * [SW-184943] Enhance INC WOQ model loading

    - Support loading huggingface WOQ model
    - Abstract WeightOnlyLinear base class. Add INCWeightOnlyLinear and HPUWeightOnlyLinear subclasses
    - Load woq linear weight module by module
    - Save HPU-format tensor so it can be reused when loaded again

    Change-Id: I679a42759b49e1f45f52bbb0bdae8580a23d0bcf

    * [SW-190303] Implement HPUWeightOnlyLinear class in INC

    Change-Id: Ie05c8787e708e2c3559dce24ef0758d6c498ac41

    * [SW-192809] fix json_file bug when instantiating FP8Config class

    Change-Id: I4a715d0a706efe20ccdb49033755cabbc729ccdc
    Signed-off-by: Zhou Yuwen <[email protected]>

    * [SW-192931] align setup.py with github INC and remove fp8_convert

    Change-Id: Ibbc157646cfcfad64b323ecfd96b9bbda5ba9e2f
    Signed-off-by: xinhe3 <[email protected]>

    * [SW-192917] Update all HQT logic files with pre-commit check

    Change-Id: I119dc8578cb10932fd1a8a674a8bdbf61f978e42
    Signed-off-by: xinhe3 <[email protected]>

    * update docstring

    Signed-off-by: yuwenzho <[email protected]>

    * add fp8 example and document (#1639)

    Signed-off-by: xinhe3 <[email protected]>

    * Update settings to be compatible with gerrit

    * enhance ut

    Signed-off-by: yuwenzho <[email protected]>

    * move fp8 sample to helloworld folder

    Signed-off-by: yuwenzho <[email protected]>

    * update torch version of habana docker

    Signed-off-by: xinhe3 <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * update readme demo

    Signed-off-by: xinhe3 <[email protected]>

    * update WeightOnlyLinear to INCWeightOnlyLinear

    Signed-off-by: xinhe3 <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * add docstring for FP8Config

    Signed-off-by: xinhe3 <[email protected]>

    * fix pylint

    Signed-off-by: xinhe3 <[email protected]>

    * update fp8 test scripts

    Signed-off-by: chensuyue <[email protected]>

    * delete deps

    Signed-off-by: chensuyue <[email protected]>

    * update container into v1.17.0

    Signed-off-by: chensuyue <[email protected]>

    * update docker version

    Signed-off-by: xinhe3 <[email protected]>

    * update pt ut

    Signed-off-by: chensuyue <[email protected]>

    * add lib path

    Signed-off-by: chensuyue <[email protected]>

    * fix dir issue

    Signed-off-by: xinhe3 <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * update fp8 test scope

    Signed-off-by: chensuyue <[email protected]>

    * fix typo

    Signed-off-by: xinhe3 <[email protected]>

    * update fp8 test scope

    Signed-off-by: chensuyue <[email protected]>

    * update pre-commit-ci

    Signed-off-by: chensuyue <[email protected]>

    * work around for hpu

    Signed-off-by: xinhe3 <[email protected]>

    * fix UT

    Signed-off-by: xinhe3 <[email protected]>

    * fix parameter

    Signed-off-by: chensuyue <[email protected]>

    * omit some test

    Signed-off-by: chensuyue <[email protected]>

    * update main page example to llm loading

    Signed-off-by: xinhe3 <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * fix autotune

    Signed-off-by: xinhe3 <[email protected]>

    ---------

    Signed-off-by: Zhou Yuwen <[email protected]>
    Signed-off-by: xinhe3 <[email protected]>
    Signed-off-by: yuwenzho <[email protected]>
    Signed-off-by: chensuyue <[email protected]>
    Co-authored-by: yan tomsinsky <[email protected]>
    Co-authored-by: Ron Ben Moshe <[email protected]>
    Co-authored-by: Uri Livne <[email protected]>
    Co-authored-by: Danny Semiat <[email protected]>
    Co-authored-by: smarkovichgolan <[email protected]>
    Co-authored-by: Dudi Lester <[email protected]>

commit de0fa21
Author: Huang, Tai <[email protected]>
Date:   Fri Aug 9 22:32:37 2024 +0800

    Fix broken link in docs (#1969)

    Signed-off-by: Huang, Tai <[email protected]>

commit 385da7c
Author: Sun, Xuehao <[email protected]>
Date:   Fri Aug 9 21:53:51 2024 +0800

    Add 3.x readme (#1971)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit acd8f4f
Author: Huang, Tai <[email protected]>
Date:   Fri Aug 9 15:24:14 2024 +0800

    Add version mapping between INC and Gaudi SW Stack (#1967)

    Signed-off-by: Huang, Tai <[email protected]>

commit 74a4641
Author: Sun, Xuehao <[email protected]>
Date:   Fri Aug 9 10:23:59 2024 +0800

    remove unnecessary CI (#1966)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit b99abae
Author: Kaihui-intel <[email protected]>
Date:   Tue Aug 6 16:02:03 2024 +0800

    Fix `opt_125m_woq_gptq_int4_dq_ggml` issue (#1965)

    Signed-off-by: Kaihui-intel <[email protected]>

commit b35ff8f
Author: Zixuan Cheng <[email protected]>
Date:   Fri Aug 2 09:06:35 2024 +0800

    example update for 3.x ipex sq (#1902)

    Signed-off-by: violetch24 <[email protected]>

commit 000946f
Author: Zixuan Cheng <[email protected]>
Date:   Thu Aug 1 10:19:32 2024 +0800

    add SDXL model example to INC 3.x (#1887)

    * add SDXL model example to INC 3.x

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * add evaluation script

    Signed-off-by: violetch24 <[email protected]>

    * add test script

    Signed-off-by: violetch24 <[email protected]>

    * minor fix

    Signed-off-by: violetch24 <[email protected]>

    * Update run_quant.sh

    * add iter limit

    Signed-off-by: violetch24 <[email protected]>

    * modify test script

    Signed-off-by: violetch24 <[email protected]>

    * update json

    Signed-off-by: chensuyue <[email protected]>

    * add requirements

    Signed-off-by: violetch24 <[email protected]>

    * Update run_benchmark.sh

    * Update sdxl_smooth_quant.py

    * minor fix

    Signed-off-by: violetch24 <[email protected]>

    ---------

    Signed-off-by: Cheng, Zixuan <[email protected]>
    Signed-off-by: violetch24 <[email protected]>
    Signed-off-by: chensuyue <[email protected]>
    Co-authored-by: violetch24 <[email protected]>
    Co-authored-by: chensuyue <[email protected]>

commit aa42e5e
Author: xinhe <[email protected]>
Date:   Wed Jul 31 15:36:06 2024 +0800

    replenish docstring (#1955)

    * replenish docstring

    Signed-off-by: xin3he <[email protected]>

    * update  Quantizer API docstring

    Signed-off-by: xin3he <[email protected]>

    * Add docstring for auto accelerator (#1956)

    Signed-off-by: yiliu30 <[email protected]>

    * temporary remove torch/quantization and add it back after fp8 code is updated.

    * Update config.py

    ---------

    Signed-off-by: xin3he <[email protected]>
    Signed-off-by: yiliu30 <[email protected]>
    Co-authored-by: Yi Liu <[email protected]>

commit 81a076d
Author: Neo Zhang Jianyu <[email protected]>
Date:   Wed Jul 31 13:51:33 2024 +0800

    fix welcome.html link issue (#1962)

    Co-authored-by: ZhangJianyu <[email protected]>

commit 87f02c1
Author: chen, suyue <[email protected]>
Date:   Wed Jul 31 10:09:47 2024 +0800

    fix docs link (#1959)

    Signed-off-by: chensuyue <[email protected]>

commit 03813e2
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Jul 31 10:09:29 2024 +0800

    Bump tensorflow version (#1961)

    Signed-off-by: dependabot[bot] <[email protected]>

commit 3b5dbf6
Author: Kaihui-intel <[email protected]>
Date:   Tue Jul 30 17:27:21 2024 +0800

    Set low_gpu_mem_usage=False for AutoRound

    Signed-off-by: Kaihui-intel <[email protected]>

commit 41244d3
Author: chen, suyue <[email protected]>
Date:   Mon Jul 29 23:05:36 2024 +0800

    new previous results could not find all raised issues in CI model test (#1958)

    Signed-off-by: chensuyue <[email protected]>

commit 190e6b2
Author: Kaihui-intel <[email protected]>
Date:   Mon Jul 29 19:39:57 2024 +0800

    Fix itrex qbits nf4/int8 training core dumped issue (#1954)

    Signed-off-by: Kaihui-intel <[email protected]>
    Signed-off-by: chensuyue <[email protected]>

commit 0e724a4
Author: Kaihui-intel <[email protected]>
Date:   Mon Jul 29 16:22:13 2024 +0800

    Add save/load for pt2e example (#1927)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 50eb6fb
Author: chen, suyue <[email protected]>
Date:   Mon Jul 29 13:40:36 2024 +0800

    update 3x torch installation (#1957)

    Signed-off-by: chensuyue <[email protected]>

commit 6e1b1da
Author: Zixuan Cheng <[email protected]>
Date:   Fri Jul 26 15:58:00 2024 +0800

    add ipex xpu example to 3x API (#1948)

    Signed-off-by: violetch24 <[email protected]>

commit 19024b3
Author: zehao-intel <[email protected]>
Date:   Fri Jul 26 14:52:01 2024 +0800

    Enable yolov5 Example for TF 3x API  (#1943)

    Signed-off-by: zehao-intel <[email protected]>

commit d84a93f
Author: zehao-intel <[email protected]>
Date:   Thu Jul 25 14:45:19 2024 +0800

    Complement UT of calibration function for TF 3x API (#1945)

    Signed-off-by: zehao-intel <[email protected]>

commit fb85779
Author: zehao-intel <[email protected]>
Date:   Thu Jul 25 14:04:25 2024 +0800

    Update Examples for TF 3x API (#1901)

    Signed-off-by: zehao-intel <[email protected]>

commit 6b30207
Author: zehao-intel <[email protected]>
Date:   Thu Jul 25 13:39:06 2024 +0800

    Add Docstring for TF 3x API and Torch 3x Mixed Precision (#1944)

    Signed-off-by: zehao-intel <[email protected]>

commit d254d50
Author: Yi Liu <[email protected]>
Date:   Wed Jul 24 21:50:44 2024 +0800

    Update doc for client-usage and LWQ (#1947)

    Signed-off-by: yiliu30 <[email protected]>

commit f253d35
Author: Neo Zhang Jianyu <[email protected]>
Date:   Wed Jul 24 17:48:05 2024 +0800

    Update publish.yml (#1950)

commit 6cda338
Author: Neo Zhang Jianyu <[email protected]>
Date:   Wed Jul 24 17:31:19 2024 +0800

    Update publish.yml (#1949)

    * Update publish.yml

    * Update publish.yml

commit c80b68a
Author: Kaihui-intel <[email protected]>
Date:   Tue Jul 23 21:26:53 2024 +0800

    Update AutoRound commit version (#1941)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 9077b38
Author: zehao-intel <[email protected]>
Date:   Tue Jul 23 17:04:37 2024 +0800

    Refine Pytorch 3x Mixed Precision Example (#1946)

    Signed-off-by: zehao-intel <[email protected]>

commit efcb293
Author: Neo Zhang Jianyu <[email protected]>
Date:   Tue Jul 23 10:15:41 2024 +0800

    Update for API 3.0 online doc (#1940)

    Co-authored-by: ZhangJianyu <[email protected]>

commit b787940
Author: Wang, Mengni <[email protected]>
Date:   Tue Jul 23 10:12:34 2024 +0800

    add docstring for mx quant (#1932)

    Signed-off-by: Mengni Wang <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: xinhe <[email protected]>

commit 0c52e12
Author: Kaihui-intel <[email protected]>
Date:   Tue Jul 23 09:59:17 2024 +0800

    Add docstring for WOQ&LayerWise (#1938)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: xinhe <[email protected]>

commit 08914d6
Author: Huang, Tai <[email protected]>
Date:   Mon Jul 22 11:14:44 2024 +0800

    add read permission token (#1942)

    Signed-off-by: Huang, Tai <[email protected]>

commit e106dea
Author: zehao-intel <[email protected]>
Date:   Sun Jul 21 21:48:51 2024 +0800

    Update Example for Pytorch 3x Mixed Precision (#1882)

    Signed-off-by: zehao-intel <[email protected]>

commit 1ebf698
Author: Zixuan Cheng <[email protected]>
Date:   Fri Jul 19 15:56:09 2024 +0800

    add docstring for static quant and smooth quant (#1936)

    * add docstring for static quant and smooth quant

    Signed-off-by: violetch24 <[email protected]>

    * format fix

    Signed-off-by: violetch24 <[email protected]>

    * update scan path

    Signed-off-by: violetch24 <[email protected]>

    * Update utility.py

    ---------

    Signed-off-by: violetch24 <[email protected]>
    Co-authored-by: violetch24 <[email protected]>

commit 296c5d4
Author: Yi Liu <[email protected]>
Date:   Fri Jul 19 15:08:05 2024 +0800

    Add docstring for PT2E and HQQ (#1937)

    Signed-off-by: yiliu30 <[email protected]>

commit 437c8e7
Author: Kaihui-intel <[email protected]>
Date:   Thu Jul 18 10:00:41 2024 +0800

    Fix unused pkgs import (#1931)

    Signed-off-by: Kaihui-intel <[email protected]>

commit ff37401
Author: chen, suyue <[email protected]>
Date:   Wed Jul 17 23:11:15 2024 +0800

    3.X API installation update (#1935)

    Signed-off-by: chensuyue <[email protected]>

commit 6c27c19
Author: zehao-intel <[email protected]>
Date:   Wed Jul 17 20:35:42 2024 +0800

    Support calib_func on TF 3x API (#1934)

    Signed-off-by: zehao-intel <[email protected]>

commit 53e6ee6
Author: Zixuan Cheng <[email protected]>
Date:   Wed Jul 17 20:35:03 2024 +0800

    Support xpu for ipex static quant (#1916)

    Signed-off-by: violetch24 <[email protected]>

commit a1cc618
Author: chen, suyue <[email protected]>
Date:   Wed Jul 17 17:29:49 2024 +0800

    remove peft version limit (#1933)

    Signed-off-by: chensuyue <[email protected]>

commit 3058388
Author: Yi Liu <[email protected]>
Date:   Wed Jul 17 15:31:38 2024 +0800

    Add doc for client usage (#1914)

    Signed-off-by: yiliu30 <[email protected]>

commit 29471df
Author: Kaihui-intel <[email protected]>
Date:   Wed Jul 17 12:12:40 2024 +0800

    Enhance load_empty_model import (#1930)

    Signed-off-by: Kaihui-intel <[email protected]>

commit fd96851
Author: Kaihui-intel <[email protected]>
Date:   Wed Jul 17 12:05:32 2024 +0800

    Integrate AutoRound v0.3 to 2x (#1926)

    Signed-off-by: Kaihui-intel <[email protected]>

commit bfa27e4
Author: Kaihui-intel <[email protected]>
Date:   Wed Jul 17 09:33:13 2024 +0800

    Integrate AutoRound v0.3 (#1925)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 5767aed
Author: xinhe <[email protected]>
Date:   Wed Jul 17 09:16:37 2024 +0800

    add docstring for torch.quantization and torch.utils (#1928)

    Signed-off-by: xin3he <[email protected]>

commit f909bca
Author: chen, suyue <[email protected]>
Date:   Tue Jul 16 21:12:54 2024 +0800

    update itrex ut test (#1929)

    Signed-off-by: chensuyue <[email protected]>

commit 649e6b1
Author: Kaihui-intel <[email protected]>
Date:   Tue Jul 16 21:05:55 2024 +0800

    Support LayerWise for RTN/GPTQ (#1883)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: chensuyue <[email protected]>

commit de43d85
Author: Kaihui-intel <[email protected]>
Date:   Tue Jul 16 17:18:12 2024 +0800

    Support absorb dict for awq (#1920)

    Signed-off-by: Kaihui-intel <[email protected]>

commit e976595
Author: Kaihui-intel <[email protected]>
Date:   Tue Jul 16 17:17:56 2024 +0800

    Support woq Autotune (#1921)

    Signed-off-by: Kaihui-intel <[email protected]>

commit d56075c
Author: Huang, Tai <[email protected]>
Date:   Tue Jul 16 15:21:06 2024 +0800

    fix typo in architecture diagram (#1924)

    Signed-off-by: Huang, Tai <[email protected]>

commit 0a54239
Author: chen, suyue <[email protected]>
Date:   Tue Jul 16 15:12:43 2024 +0800

    update documentation for 3x API (#1923)

    Signed-off-by: chensuyue <[email protected]>
    Signed-off-by: xin3he <[email protected]>
    Signed-off-by: yiliu30 <[email protected]>

commit be42d03
Author: xinhe <[email protected]>
Date:   Tue Jul 16 09:48:48 2024 +0800

    implement TorchBaseConfig (#1911)

    Signed-off-by: xin3he <[email protected]>

commit 7a4715c
Author: Kaihui-intel <[email protected]>
Date:   Mon Jul 15 14:59:03 2024 +0800

    Support PT2E save and load (#1918)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 34f0a9f
Author: Yi Liu <[email protected]>
Date:   Mon Jul 15 09:10:14 2024 +0800

    Add `save`/`load` support for HQQ (#1913)

    Signed-off-by: yiliu30 <[email protected]>
    Co-authored-by: chen, suyue <[email protected]>

commit d320460
Author: Yi Liu <[email protected]>
Date:   Fri Jul 12 14:48:12 2024 +0800

    remove 1x docs (#1900)

    Signed-off-by: yiliu30 <[email protected]>

commit 6c547f7
Author: chen, suyue <[email protected]>
Date:   Fri Jul 12 14:42:04 2024 +0800

    fix CI docker container clean up issue (#1917)

    Signed-off-by: chensuyue <[email protected]>

commit 1703658
Author: chen, suyue <[email protected]>
Date:   Fri Jul 12 11:14:48 2024 +0800

    Remove deprecated modules (#1872)

    Signed-off-by: chensuyue <[email protected]>

commit f698c96
Author: chen, suyue <[email protected]>
Date:   Thu Jul 11 18:00:28 2024 +0800

    update Gaudi CI baseline artifacts name (#1912)

    Signed-off-by: chensuyue <[email protected]>

commit 4a45093
Author: Yi Liu <[email protected]>
Date:   Thu Jul 11 17:47:47 2024 +0800

    Add export support for TEQ (#1910)

    Signed-off-by: yiliu30 <[email protected]>

commit 16a7b11
Author: Yi Liu <[email protected]>
Date:   Thu Jul 11 17:13:24 2024 +0800

    Get default config based on the auto-detect CPU type (#1904)

    Signed-off-by: yiliu30 <[email protected]>

commit 2fc7255
Author: xinhe <[email protected]>
Date:   Thu Jul 11 13:22:52 2024 +0800

    implement `incbench` command for ease-of-use benchmark (#1884)

    * implement incbench command as entrypoint for ease-of-use benchmark
    * automatically check numa/socket info and dump it as a table for ease of understanding
    * support both Linux and Windows platforms
    * add benchmark documents
    * dump benchmark summary
    * add benchmark UTs

    Usage:

    * incbench main.py: run 1 instance on NUMA:0.
    * incbench --num_i 2 main.py: run 2 instances on NUMA:0.
    * incbench --num_c 2 main.py: run multi-instances with 2 cores per instance on NUMA:0.
    * incbench -C 24-47 main.py: run 1 instance on COREs:24-47.
    * incbench -C 24-47 --num_c 4 main.py: run multi-instances with 4 COREs per instance on COREs:24-47.

    ---------

    Signed-off-by: xin3he <[email protected]>
    Co-authored-by: chen, suyue <[email protected]>
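The flags above come down to partitioning a core range into per-instance chunks. A rough sketch of that arithmetic (illustrative only, not incbench's implementation):

```python
def split_cores(cores, cores_per_instance):
    """Partition a list of core ids into equal per-instance chunks.

    E.g. "-C 24-47 --num_c 4" maps to split_cores(list(range(24, 48)), 4),
    giving 6 instances of 4 cores each. Any remainder cores are left
    unused. Illustrative sketch only.
    """
    n_instances = len(cores) // cores_per_instance
    return [
        cores[i * cores_per_instance:(i + 1) * cores_per_instance]
        for i in range(n_instances)
    ]
```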

commit de8577e
Author: chen, suyue <[email protected]>
Date:   Wed Jul 10 17:21:45 2024 +0800

    bump version into 3.0 (#1908)

    Signed-off-by: chensuyue <[email protected]>

commit 01f16c4
Author: chen, suyue <[email protected]>
Date:   Wed Jul 10 17:19:57 2024 +0800

    support habana fp8 UT test in CI (#1909)

    Signed-off-by: chensuyue <[email protected]>

commit 28578b9
Author: Yi Liu <[email protected]>
Date:   Wed Jul 10 13:19:27 2024 +0800

    Add docstring for `common` module (#1905)

    Signed-off-by: yiliu30 <[email protected]>

commit 5fde50f
Author: Wang, Chang <[email protected]>
Date:   Wed Jul 10 10:34:46 2024 +0800

    update fp4_e2m1 mapping list (#1906)

    * update fp4_e2m1 mapping list

    * Update utility.py

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    ---------

    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
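For reference, FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit) can represent only the magnitudes {0, 0.5, 1, 1.5, 2, 3, 4, 6}, so a mapping list pairs each 4-bit code with one of these values. A minimal round-to-nearest sketch over that grid (illustrative; not INC's utility code):

```python
# Representable magnitudes of FP4 E2M1 (2 exponent bits, 1 mantissa bit).
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GRID = [sign * v for v in FP4_E2M1 for sign in (1.0, -1.0)]

def quantize_fp4_e2m1(x):
    """Round x to the nearest representable E2M1 value (illustrative)."""
    return min(GRID, key=lambda v: abs(v - x))
```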

commit 3fe2fd9
Author: xinhe <[email protected]>
Date:   Tue Jul 9 15:01:25 2024 +0800

    fix bf16 symbolic_trace bug (#1892)

    Description: fix bf16 symbolic_trace bug, which

    - caused abnormal recursive calls
    - was missing necessary attributes

    Fixed by moving the BF16 fallback ahead of quantization and removing bf16_symbolic_trace.

    ---------

    Signed-off-by: xin3he <[email protected]>
    Co-authored-by: Sun, Xuehao <[email protected]>

commit e080e06
Author: Sun, Xuehao <[email protected]>
Date:   Tue Jul 9 11:04:30 2024 +0800

    remove neural insight CI (#1903)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit f28fcee
Author: Yi Liu <[email protected]>
Date:   Fri Jul 5 15:47:37 2024 +0800

    Remove 1x API (#1865)

    Signed-off-by: yiliu30 <[email protected]>
    Co-authored-by: chen, suyue <[email protected]>

commit 1386ac5
Author: Yi Liu <[email protected]>
Date:   Thu Jul 4 12:18:03 2024 +0800

    Port auto-detect absorb layers for TEQ (#1895)

    Signed-off-by: yiliu30 <[email protected]>

commit 856118e
Author: Wang, Chang <[email protected]>
Date:   Wed Jul 3 13:50:00 2024 +0800

    remove import pdb (#1897)

    Signed-off-by: changwangss <[email protected]>

commit f75ff40
Author: xinhe <[email protected]>
Date:   Wed Jul 3 13:07:48 2024 +0800

    support auto_host2device on RTN and GPTQ (#1894)

    Signed-off-by: He, Xin3 <[email protected]>

commit b9e73f5
Author: chen, suyue <[email protected]>
Date:   Wed Jul 3 11:10:45 2024 +0800

    tmp fix nas deps issue (#1896)

    Signed-off-by: chensuyue <[email protected]>

commit 63b2912
Author: Yi Liu <[email protected]>
Date:   Tue Jul 2 14:46:02 2024 +0800

    Refine HQQ UTs (#1888)

    Signed-off-by: yiliu30 <[email protected]>

commit 5592acc
Author: zehao-intel <[email protected]>
Date:   Tue Jul 2 14:18:51 2024 +0800

    Remove Gelu Fusion for TF Newapi (#1886)

    Signed-off-by: zehao-intel <[email protected]>

commit 4372a76
Author: Kaihui-intel <[email protected]>
Date:   Fri Jun 28 14:55:10 2024 +0800

    Fix sql injection for Neural Solution gRPC (#1879)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 4ae2e87
Author: xinhe <[email protected]>
Date:   Thu Jun 27 09:56:52 2024 +0800

    support quant_lm_head arg in all WOQ configs (#1881)

    Signed-off-by: xin3he <[email protected]>

commit cc763f5
Author: Dina Suehiro Jones <[email protected]>
Date:   Wed Jun 26 18:29:06 2024 -0700

    Update the Gaudi container example in the README (#1885)

commit 1f58f02
Author: Yi Liu <[email protected]>
Date:   Thu Jun 20 22:03:45 2024 +0800

    Add `set_local` support for static quant with pt2e (#1870)

    Signed-off-by: yiliu30 <[email protected]>

commit 0341295
Author: Yi Liu <[email protected]>
Date:   Wed Jun 19 09:40:11 2024 +0800

    rm cov (#1878)

    Signed-off-by: yiliu30 <[email protected]>

commit 503d9ef
Author: Kaihui-intel <[email protected]>
Date:   Tue Jun 18 17:12:12 2024 +0800

    Add op statistics dump for woq (#1876)

    Signed-off-by: Kaihui-intel <[email protected]>

commit 5a0374e
Author: Yi Liu <[email protected]>
Date:   Tue Jun 18 16:21:05 2024 +0800

    Enhance autotune to return the best `q_model` directly (#1875)

    Signed-off-by: yiliu30 <[email protected]>

commit 90fb431
Author: Kaihui-intel <[email protected]>
Date:   Tue Jun 18 16:06:04 2024 +0800

    fix layer match (#1873)

    Signed-off-by: Kaihui-intel <[email protected]>
    Co-authored-by: Sun, Xuehao <[email protected]>

commit f4eb660
Author: Sun, Xuehao <[email protected]>
Date:   Mon Jun 17 16:12:06 2024 +0800

    Limit numpy versions (#1874)

    Signed-off-by: Sun, Xuehao <[email protected]>

commit 2928d85
Author: chen, suyue <[email protected]>
Date:   Fri Jun 14 21:51:13 2024 +0800

    update v2.6 release readme (#1871)

    Signed-off-by: chensuyue <[email protected]>

commit 48c5e3a
Author: Kaihui-intel <[email protected]>
Date:   Fri Jun 14 21:10:14 2024 +0800

    Modify WOQ examples structure (#1866)

    Signed-off-by: Kaihui-intel <[email protected]>
    Signed-off-by: chensuyue <[email protected]>

commit 498af74
Author: Sun, Xuehao <[email protected]>
Date:   Fri Jun 14 21:09:36 2024 +0800

    Update SQ/WOQ status (#1869)

    Signed-off-by: Sun, Xuehao <[email protected]>
    Co-authored-by: chen, suyue <[email protected]>

commit b401b02
Author: Kaihui-intel <[email protected]>
Date:   Fri Jun 14 17:48:03 2024 +0800

    Add PT2E cv&llm example (#1853)

    Signed-off-by: Kaihui-intel <[email protected]>

commit e470f6c
Author: xinhe <[email protected]>
Date:   Fri Jun 14 17:34:26 2024 +0800

    [3x] add recommendation examples (#1844)

    Signed-off-by: xin3he <[email protected]>

commit a141512
Author: zehao-intel <[email protected]>
Date:   Fri Jun 14 14:56:30 2024 +0800

    Improve UT Branch Coverage for TF 3x (#1867)

    Signed-off-by: zehao-intel <[email protected]>

commit b99a79d
Author: Zixuan Cheng <[email protected]>
Date:   Fri Jun 14 14:10:49 2024 +0800

    modify 3.x ipex example structure (#1858)

    * modify 3.x ipex example structure

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * add json path

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * fix for sq

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * minor fix

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * Update run_clm_no_trainer.py

    * Update run_clm_no_trainer.py

    * Update run_clm_no_trainer.py

    * minor fix

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * remove old files

    Signed-off-by: Cheng, Zixuan <[email protected]>

    * fix act_algo

    Signed-off-by: Cheng, Zixuan <[email protected]>

    ---------

    Signed-off-by: Cheng, Zixuan <[email protected]>
    Co-authored-by: xinhe <[email protected]>

commit 922b247
Author: zehao-intel <[email protected]>
Date:   Fri Jun 14 12:33:39 2024 +0800

    Add TF 3x Examples (#1839)

    Signed-off-by: zehao-intel <[email protected]>

commit 70a1d50
Author: Zixuan Cheng <[email protected]>
Date:   Fri Jun 14 10:17:33 2024 +0800

    fix 3x ipex static quant regression (#1864)

    Description
    fix 3x ipex static quant regression
    cannot fallback with op type name ('linear')
    dump wrong op stats (no 'Linear&relu' op type)
    ---------

    Signed-off-by: Cheng, Zixuan <[email protected]>

commit 4e45f8f
Author: zehao-intel <[email protected]>
Date:   Fri Jun 14 10:04:11 2024 +0800

    Improve UT Coverage for TF 3x  (#1852)

    Signed-off-by: zehao-intel <[email protected]>
    Signed-off-by: chensuyue <[email protected]>

commit 794b276
Author: xinhe <[email protected]>
Date:   Thu Jun 13 18:02:04 2024 +0800

    migrate export to 2x and 3x from deprecated (#1845)

    Signed-off-by: xin3he <[email protected]>

commit 0eced14
Author: yuwenzho <[email protected]>
Date:   Wed Jun 12 18:49:17 2024 -0700

    Enhance INC WOQ model loading & support Huggingface WOQ model loading (#1826)

    Signed-off-by: yuwenzho <[email protected]>

commit 6733dab
Author: Wang, Mengni <[email protected]>
Date:   Wed Jun 12 17:08:31 2024 +0800

    update mx script (#1838)

    Signed-off-by: Mengni Wang <[email protected]>

commit a0dee94
Author: Wang, Chang <[email protected]>
Date:   Wed Jun 12 15:01:25 2024 +0800

    Remove export_compressed_model in AWQConfig (#1831)

commit 2c3556d
Author: Huang, Tai <[email protected]>
Date:   Wed Jun 12 14:46:14 2024 +0800

    Add 3x architecture diagram (#1849)

    Signed-off-by: Huang, Tai <[email protected]>

commit 0e2cade
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Jun 12 14:20:06 2024 +0800

    Bump braces from 3.0.2 to 3.0.3 in /neural_insights/gui (#1862)

    Signed-off-by: dependabot[bot] <[email protected]>

commit 5b5579b
Author: Kaihui-intel <[email protected]>
Date:   Wed Jun 12 14:12:00 2024 +0800

    Fix Neural Solution security issue (#1856)

    Signed-off-by: Kaihui-intel <[email protected]>

commit e9cb48c
Author: xinhe <[email protected]>
Date:   Wed Jun 12 11:19:47 2024 +0800

    improve UT coverage of PT Utils and Quantization (#1842)

    * update UTs

    ---------

    Signed-off-by: xin3he <[email protected]>
    Signed-off-by: xinhe3 <[email protected]>

commit 6b27383
Author: Yi Liu <[email protected]>
Date:   Wed Jun 12 11:11:50 2024 +0800

    Fix config expansion with empty options (#1861)

    Signed-off-by: yiliu30 <[email protected]>

commit 25c71aa
Author: WenjiaoYue <[email protected]>
Date:   Tue Jun 11 17:54:31 2024 +0800

    Delete the static resources of the JupyterLab extension after packaging (#1860)

    Signed-off-by: Yue, Wenjiao <[email protected]>

commit 455f1e1
Author: Wang, Mengni <[email protected]>
Date:   Tue Jun 11 15:28:40 2024 +0800

    Add UT and remove unused code for torch MX quant (#1854)

    * Add UT and remove unused code for torch MX quant
    ---------

    Signed-off-by: Mengni Wang <[email protected]>

Signed-off-by: xinhe3 <[email protected]>
xinhe3 committed Oct 11, 2024
1 parent 23fe77e commit be20c15
Showing 99 changed files with 76,801 additions and 1,750 deletions.
6 changes: 3 additions & 3 deletions .azure-pipelines/scripts/fwk_version.sh
@@ -2,9 +2,9 @@

echo "export FWs version..."
export tensorflow_version='2.15.0-official'
export pytorch_version='2.3.0+cpu'
export torchvision_version='0.18.0+cpu'
export ipex_version='2.3.0+cpu'
export pytorch_version='2.4.0+cpu'
export torchvision_version='0.19.0'
export ipex_version='2.4.0+cpu'
export onnx_version='1.16.0'
export onnxruntime_version='1.18.0'
export mxnet_version='1.9.1'
15 changes: 10 additions & 5 deletions .azure-pipelines/scripts/install_nc.sh
@@ -3,16 +3,21 @@
echo -e "\n Install Neural Compressor ... "
cd /neural-compressor
if [[ $1 = *"3x_pt"* ]]; then
if [[ $1 != *"3x_pt_fp8"* ]]; then
python -m pip install --no-cache-dir -r requirements_pt.txt
if [[ $1 = *"3x_pt_fp8"* ]]; then
pip uninstall neural_compressor_3x_pt -y || true
python setup.py pt bdist_wheel
else
echo -e "\n Install torch CPU ... "
pip install torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cpu
python -m pip install --no-cache-dir -r requirements.txt
python setup.py bdist_wheel
fi
python -m pip install --no-cache-dir -r requirements_pt.txt
python setup.py pt bdist_wheel
pip install --no-deps dist/neural_compressor*.whl --force-reinstall
elif [[ $1 = *"3x_tf"* ]]; then
python -m pip install --no-cache-dir -r requirements.txt
python -m pip install --no-cache-dir -r requirements_tf.txt
python setup.py tf bdist_wheel
python setup.py bdist_wheel
pip install dist/neural_compressor*.whl --force-reinstall
else
python -m pip install --no-cache-dir -r requirements.txt
8 changes: 6 additions & 2 deletions .azure-pipelines/scripts/models/env_setup.sh
@@ -51,6 +51,10 @@ SCRIPTS_PATH="/neural-compressor/.azure-pipelines/scripts/models"
log_dir="/neural-compressor/.azure-pipelines/scripts/models"
if [[ "${inc_new_api}" == "3x"* ]]; then
WORK_SOURCE_DIR="/neural-compressor/examples/3.x_api/${framework}"
git clone https://github.com/intel/intel-extension-for-transformers.git /itrex
cd /itrex
pip install -r requirements.txt
pip install -v .
else
WORK_SOURCE_DIR="/neural-compressor/examples/${framework}"
fi
@@ -95,8 +99,8 @@ if [[ "${fwk_ver}" != "latest" ]]; then
pip install intel-tensorflow==${fwk_ver}
fi
elif [[ "${framework}" == "pytorch" ]]; then
pip install torch==${fwk_ver} -f https://download.pytorch.org/whl/torch_stable.html
pip install torchvision==${torch_vision_ver} -f https://download.pytorch.org/whl/torch_stable.html
pip install torch==${fwk_ver} --index-url https://download.pytorch.org/whl/cpu
pip install torchvision==${torch_vision_ver} --index-url https://download.pytorch.org/whl/cpu
elif [[ "${framework}" == "onnxrt" ]]; then
pip install onnx==1.15.0
pip install onnxruntime==${fwk_ver}
5 changes: 4 additions & 1 deletion .azure-pipelines/scripts/ut/3x/run_3x_pt.sh
@@ -21,7 +21,10 @@ rm -rf torch/quantization/fp8_quant
LOG_DIR=/neural-compressor/log_dir
mkdir -p ${LOG_DIR}
ut_log_name=${LOG_DIR}/ut_3x_pt.log
pytest --cov="${inc_path}" -vs --disable-warnings --html=report.html --self-contained-html . 2>&1 | tee -a ${ut_log_name}

find . -name "test*.py" | sed "s,\.\/,python -m pytest --cov=\"${inc_path}\" --cov-report term --html=report.html --self-contained-html --cov-report xml:coverage.xml --cov-append -vs --disable-warnings ,g" > run.sh
cat run.sh
bash run.sh 2>&1 | tee ${ut_log_name}

cp report.html ${LOG_DIR}/

2 changes: 2 additions & 0 deletions .azure-pipelines/scripts/ut/3x/run_3x_pt_fp8.sh
@@ -7,6 +7,8 @@ echo "${test_case}"
echo "set up UT env..."
export LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH
sed -i '/^intel_extension_for_pytorch/d' /neural-compressor/test/3x/torch/requirements.txt
sed -i '/^auto_round/d' /neural-compressor/test/3x/torch/requirements.txt
cat /neural-compressor/test/3x/torch/requirements.txt
pip install -r /neural-compressor/test/3x/torch/requirements.txt
pip install git+https://github.com/HabanaAI/[email protected]
pip install pytest-cov
2 changes: 1 addition & 1 deletion .azure-pipelines/scripts/ut/env_setup.sh
@@ -92,7 +92,7 @@ elif [[ $(echo "${test_case}" | grep -c "tf pruning") != 0 ]]; then
fi

if [[ $(echo "${test_case}" | grep -c "api") != 0 ]] || [[ $(echo "${test_case}" | grep -c "adaptor") != 0 ]]; then
pip install git+https://github.com/intel/auto-round.git@e24b9074af6cdb099e31c92eb81b7f5e9a4a244e
pip install git+https://github.com/intel/auto-round.git@5dd16fc34a974a8c2f5a4288ce72e61ec3b1410f
fi

# test deps
4 changes: 2 additions & 2 deletions .azure-pipelines/scripts/ut/run_basic_pt_pruning.sh
@@ -4,9 +4,9 @@ test_case="run basic pt pruning"
echo "${test_case}"

echo "specify fwk version..."
export pytorch_version='2.3.0+cpu'
export pytorch_version='2.4.0+cpu'
export torchvision_version='0.18.0+cpu'
export ipex_version='2.3.0+cpu'
export ipex_version='2.4.0+cpu'

echo "set up UT env..."
bash /neural-compressor/.azure-pipelines/scripts/ut/env_setup.sh "${test_case}"
3 changes: 2 additions & 1 deletion .azure-pipelines/scripts/ut/run_itrex.sh
@@ -18,7 +18,8 @@ bash /intel-extension-for-transformers/.github/workflows/script/install_binary.s
sed -i '/neural-compressor.git/d' /intel-extension-for-transformers/tests/requirements.txt
pip install -r /intel-extension-for-transformers/tests/requirements.txt
# workaround
pip install onnx==1.15.0
pip install onnx==1.16.0
pip install onnxruntime==1.18.0
echo "pip list itrex ut deps..."
pip list
LOG_DIR=/neural-compressor/log_dir
4 changes: 2 additions & 2 deletions .azure-pipelines/template/docker-template.yml
@@ -36,19 +36,18 @@ steps:
- ${{ if eq(parameters.dockerConfigName, 'commonDockerConfig') }}:
- script: |
rm -fr ${BUILD_SOURCESDIRECTORY} || sudo rm -fr ${BUILD_SOURCESDIRECTORY} || true
echo y | docker image prune -a
displayName: "Clean workspace"
- checkout: self
clean: true
displayName: "Checkout out Repo"
fetchDepth: 0

- ${{ if eq(parameters.dockerConfigName, 'gitCloneDockerConfig') }}:
- script: |
rm -fr ${BUILD_SOURCESDIRECTORY} || sudo rm -fr ${BUILD_SOURCESDIRECTORY} || true
mkdir ${BUILD_SOURCESDIRECTORY}
chmod 777 ${BUILD_SOURCESDIRECTORY}
echo y | docker image prune -a
displayName: "Clean workspace"
- checkout: none
@@ -62,6 +61,7 @@ steps:
- ${{ if eq(parameters.imageSource, 'build') }}:
- script: |
docker image prune -a -f
if [[ ! $(docker images | grep -i ${{ parameters.repoName }}:${{ parameters.repoTag }}) ]]; then
docker build -f ${BUILD_SOURCESDIRECTORY}/.azure-pipelines/docker/${{parameters.dockerFileName}}.devel -t ${{ parameters.repoName }}:${{ parameters.repoTag }} .
fi
2 changes: 2 additions & 0 deletions .azure-pipelines/ut-basic.yml
@@ -19,6 +19,8 @@ pr:
- neural_compressor/torch
- neural_compressor/tensorflow
- neural_compressor/onnxrt
- neural_compressor/transformers
- neural_compressor/evaluation
- .azure-pipelines/scripts/ut/3x

pool: ICX-16C
3 changes: 2 additions & 1 deletion .pre-commit-config.yaml
@@ -129,7 +129,8 @@ repos:
examples/onnxrt/nlp/huggingface_model/text_generation/llama/quantization/ptq_static/prompt.json|
examples/notebook/dynas/ResNet50_Quantiation_Search_Supernet_NAS.ipynb|
examples/notebook/dynas/Transformer_LT_Supernet_NAS.ipynb|
neural_compressor/torch/algorithms/fp8_quant/internal/diffusion_evaluation/SR_evaluation/imagenet1000_clsidx_to_labels.txt
neural_compressor/torch/algorithms/fp8_quant/internal/diffusion_evaluation/SR_evaluation/imagenet1000_clsidx_to_labels.txt|
neural_compressor/evaluation/hf_eval/datasets/cnn_validation.json
)$
- repo: https://github.com/astral-sh/ruff-pre-commit
18 changes: 16 additions & 2 deletions README.md
@@ -27,6 +27,7 @@ support AMD CPU, ARM CPU, and NVidia GPU through ONNX Runtime with limited testi
* Collaborate with cloud marketplaces such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)

## What's New
* [2024/10] [Transformers-like API](./docs/source/3x/transformers_like_api.md) for INT4 inference on Intel CPU and GPU.
* [2024/07] From 3.0 release, framework extension API is recommended to be used for quantization.
* [2024/07] Performance optimizations and usability improvements on [client-side](./docs/source/3x/client_quant.md).

@@ -71,7 +72,7 @@ pip install "neural-compressor>=2.3" "transformers>=4.34.0" torch torchvision
```
After successfully installing these packages, try your first quantization program.

### [FP8 Quantization](./examples/3.x_api/pytorch/cv/fp8_quant/)
### [FP8 Quantization](./docs/source/3x/PT_FP8Quant.md)
The following example code demonstrates FP8 quantization, which is supported by the Intel Gaudi2 AI Accelerator.

To try on Intel Gaudi2, docker image with Gaudi Software Stack is recommended, please refer to following script for environment setup. More details can be found in [Gaudi Guide](https://docs.habana.ai/en/latest/Installation_Guide/Bare_Metal_Fresh_OS.html#launch-docker-image-that-was-built).
@@ -147,7 +148,7 @@ Intel Neural Compressor will convert the model format from auto-gptq to hpu form
</tr>
<tr>
<td colspan="2" align="center"><a href="./docs/source/3x/PT_WeightOnlyQuant.md">Weight-Only Quantization</a></td>
<td colspan="2" align="center"><a href="./docs/3x/PT_FP8Quant.md">FP8 Quantization</a></td>
<td colspan="2" align="center"><a href="./docs/source/3x/PT_FP8Quant.md">FP8 Quantization</a></td>
<td colspan="2" align="center"><a href="./docs/source/3x/PT_MXQuant.md">MX Quantization</a></td>
<td colspan="2" align="center"><a href="./docs/source/3x/PT_MixedPrecision.md">Mixed Precision</a></td>
</tr>
@@ -164,6 +165,16 @@ Intel Neural Compressor will convert the model format from auto-gptq to hpu form
<td colspan="2" align="center"><a href="./docs/source/3x/TF_SQ.md">Smooth Quantization</a></td>
</tr>
</tbody>
<thead>
<tr>
<th colspan="8">Transformers-like APIs</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="8" align="center"><a href="./docs/source/3x/transformers_like_api.md">Overview</a></td>
</tr>
</tbody>
<thead>
<tr>
<th colspan="8">Other Modules</th>
@@ -181,6 +192,9 @@ Intel Neural Compressor will convert the model format from auto-gptq to hpu form
> From 3.0 release, we recommend to use 3.X API. Compression techniques during training such as QAT, Pruning, Distillation only available in [2.X API](https://github.com/intel/neural-compressor/blob/master/docs/source/2x_user_guide.md) currently.
## Selected Publications/Events

* EMNLP'2024: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2024)
* Blog on Medium: [Quantization on Intel Gaudi Series AI Accelerators](https://medium.com/intel-analytics-software/intel-neural-compressor-v3-0-a-quantization-tool-across-intel-hardware-9856adee6f11) (Aug 2024)
* Blog by Intel: [Neural Compressor: Boosting AI Model Efficiency](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Neural-Compressor-Boosting-AI-Model-Efficiency/post/1604740) (June 2024)
* Blog by Intel: [Optimization of Intel AI Solutions for Alibaba Cloud’s Qwen2 Large Language Models](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-ai-solutions-accelerate-alibaba-qwen2-llms.html) (June 2024)
* Blog by Intel: [Accelerate Meta* Llama 3 with Intel AI Solutions](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-meta-llama3-with-intel-ai-solutions.html) (Apr 2024)
2 changes: 2 additions & 0 deletions docs/build_docs/source/conf.py
@@ -34,10 +34,12 @@
"sphinx.ext.coverage",
"sphinx.ext.autosummary",
"sphinx_md",
"sphinx_rtd_theme",
"autoapi.extension",
"sphinx.ext.napoleon",
"sphinx.ext.githubpages",
"sphinx.ext.linkcode",
"sphinxcontrib.jquery",
]

autoapi_dirs = ["../../neural_compressor"]
16 changes: 10 additions & 6 deletions docs/build_docs/sphinx-requirements.txt
@@ -1,6 +1,10 @@
recommonmark
sphinx==6.1.1
sphinx-autoapi
sphinx-markdown-tables
sphinx-md
sphinx_rtd_theme
recommonmark==0.7.1
setuptools_scm[toml]==8.1.0
sphinx==7.3.7
sphinx-autoapi==3.1.0
sphinx-autobuild==2024.4.16
sphinx-markdown-tables==0.0.17
sphinx-md==0.0.4
sphinx_rtd_theme==2.0.0
sphinxcontrib-jquery==4.1
sphinxemoji==0.3.1
23 changes: 23 additions & 0 deletions docs/build_docs/update_html.py
@@ -56,11 +56,34 @@ def update_source_url(version, folder_name, index_file):
f.write(index_buf)


def update_search(folder):
search_file_name = "{}/search.html".format(folder)

with open(search_file_name, "r") as f:
index_buf = f.read()
key_str = '<script src="_static/searchtools.js"></script>'
version_list = """<!--[if lt IE 9]>
<script src="_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="_static/jquery.js?v=5d32c60e"></script>
<script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
<script src="_static/documentation_options.js?v=fc837d61"></script>
<script src="_static/doctools.js?v=9a2dae69"></script>
<script src="_static/sphinx_highlight.js?v=dc90522c"></script>
<script src="_static/js/theme.js"></script>
<script src="_static/searchtools.js"></script>"""
index_buf = index_buf.replace(key_str, version_list)

with open(search_file_name, "w") as f:
f.write(index_buf)


def main(folder, version):
folder_name = os.path.basename(folder)
for index_file in glob.glob("{}/**/*.html".format(folder), recursive=True):
update_version_link(version, folder_name, index_file)
update_source_url(version, folder_name, index_file)
update_search(folder)


def help(me):
2 changes: 1 addition & 1 deletion docs/3x/PT_FP8Quant.md → docs/source/3x/PT_FP8Quant.md
@@ -108,6 +108,6 @@ model = convert(model)
| Task | Example |
|----------------------|---------|
| Computer Vision (CV) | [Link](../../examples/3.x_api/pytorch/cv/fp8_quant/) |
| Large Language Model (LLM) | [Link](https://github.com/HabanaAI/optimum-habana-fork/tree/habana-main/examples/text-generation#running-with-fp8) |
| Large Language Model (LLM) | [Link](https://github.com/huggingface/optimum-habana/tree/main/examples/text-generation#running-with-fp8) |

> Note: For LLM, Optimum-habana provides higher performance based on modified modeling files, so here the Link of LLM goes to Optimum-habana, which utilize Intel Neural Compressor for FP8 quantization internally.
8 changes: 6 additions & 2 deletions docs/source/3x/PT_WeightOnlyQuant.md
@@ -1,6 +1,7 @@

PyTorch Weight Only Quantization
===============

- [Introduction](#introduction)
- [Supported Matrix](#supported-matrix)
- [Usage](#usage)
@@ -14,6 +15,8 @@ PyTorch Weight Only Quantization
- [HQQ](#hqq)
- [Specify Quantization Rules](#specify-quantization-rules)
- [Saving and Loading](#saving-and-loading)
- [Layer Wise Quantization](#layer-wise-quantization)
- [Efficient Usage on Client-Side](#efficient-usage-on-client-side)
- [Examples](#examples)

## Introduction
@@ -108,9 +111,10 @@ model = convert(model)
| model_path (str) | Model path that is used to load state_dict per layer | |
| use_double_quant (bool) | Enables double quantization | False |
| act_order (bool) | Whether to sort Hessian's diagonal values to rearrange channel-wise quantization order | False |
| percdamp (float) | Percentage of Hessian's diagonal values' average, which will be added to Hessian's diagonal to increase numerical stability | 0.01. |
| percdamp (float) | Percentage of Hessian's diagonal values' average, which will be added to Hessian's diagonal to increase numerical stability | 0.01 |
| block_size (int) | Execute GPTQ quantization per block, block shape = [C_out, block_size] | 128 |
| static_groups (bool) | Whether to calculate group wise quantization parameters in advance. This option mitigate actorder's extra computational requirements. | False. |
| static_groups (bool) | Whether to calculate group-wise quantization parameters in advance. This option mitigates `act_order`'s extra computational requirements. | False |
| true_sequential (bool) | Whether to quantize layers within a transformer block in their original order. This can lead to higher accuracy but a slower overall quantization process. | False |
> **Note:** `model_path` is only used when `use_layer_wise=True`; stay tuned for full `layer-wise` support.
``` python
27 changes: 22 additions & 5 deletions docs/source/3x/PyTorch.md
@@ -176,16 +176,21 @@ def load(output_dir="./saved_results", model=None):
<td class="tg-9wq8"><a href="PT_SmoothQuant.md">link</a></td>
</tr>
<tr>
<td class="tg-9wq8" rowspan="2">Static Quantization</td>
<td class="tg-9wq8" rowspan="2"><a href=https://pytorch.org/docs/master/quantization.html#post-training-static-quantization>Post-traning Static Quantization</a></td>
<td class="tg-9wq8">intel-extension-for-pytorch</td>
<td class="tg-9wq8" rowspan="3">Static Quantization</td>
<td class="tg-9wq8" rowspan="3"><a href=https://pytorch.org/docs/master/quantization.html#post-training-static-quantization>Post-traning Static Quantization</a></td>
<td class="tg-9wq8">intel-extension-for-pytorch (INT8)</td>
<td class="tg-9wq8">&#10004</td>
<td class="tg-9wq8"><a href="PT_StaticQuant.md">link</a></td>
</tr>
<tr>
<td class="tg-9wq8"><a href=https://pytorch.org/docs/stable/torch.compiler_deepdive.html>TorchDynamo</a></td>
<td class="tg-9wq8"><a href=https://pytorch.org/docs/stable/torch.compiler_deepdive.html>TorchDynamo (INT8)</a></td>
<td class="tg-9wq8">&#10004</td>
<td class="tg-9wq8"><a href="PT_StaticQuant.md">link</a></td>
<tr>
<td class="tg-9wq8"><a href=https://docs.habana.ai/en/latest/index.html>Intel Gaudi AI accelerator (FP8)</a></td>
<td class="tg-9wq8">&#10004</td>
<td class="tg-9wq8"><a href="PT_FP8Quant.md">link</a></td>
</tr>
</tr>
<tr>
<td class="tg-9wq8">Dynamic Quantization</td>
@@ -240,7 +245,7 @@ Deep Learning</a></td>
</table>
2. How to set different configuration for specific op_name or op_type?
> INC extends a `set_local` method based on the global configuration object to set custom configuration.
> Neural Compressor extends a `set_local` method based on the global configuration object to set custom configuration.
```python
def set_local(self, operator_name_or_list: Union[List, str, Callable], config: BaseConfig) -> BaseConfig:
@@ -259,3 +264,15 @@
quant_config.set_local(".*mlp.*", RTNConfig(bits=8)) # For layers with "mlp" in their names, set bits=8
quant_config.set_local("Conv1d", RTNConfig(dtype="fp32")) # For Conv1d layers, do not quantize them.
```
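
As a rough illustration of the pattern behind `set_local` (a simplified sketch, not Neural Compressor's actual internals — the class names and matching rules here are assumptions for demonstration only), the idea is a global configuration plus per-operator overrides keyed by operator type or a regex on the operator name:

```python
# Simplified sketch of a global config with per-operator local overrides.
# RTNConfig/GlobalConfig here are stand-ins, not the real library classes.
import re
from dataclasses import dataclass, field


@dataclass
class RTNConfig:
    bits: int = 4
    dtype: str = "int"


@dataclass
class GlobalConfig:
    default: RTNConfig = field(default_factory=RTNConfig)
    local: dict = field(default_factory=dict)  # pattern or op type -> RTNConfig

    def set_local(self, operator_name_or_type: str, config: RTNConfig):
        self.local[operator_name_or_type] = config
        return self

    def resolve(self, op_name: str, op_type: str) -> RTNConfig:
        # An exact op-type match or a regex match on the op name wins;
        # otherwise fall back to the global default.
        for key, cfg in self.local.items():
            if key == op_type or re.fullmatch(key, op_name):
                return cfg
        return self.default


quant_config = GlobalConfig()
quant_config.set_local(".*mlp.*", RTNConfig(bits=8))      # layers with "mlp" in their names
quant_config.set_local("Conv1d", RTNConfig(dtype="fp32"))  # Conv1d layers stay fp32

print(quant_config.resolve("model.mlp.fc1", "Linear").bits)  # 8
print(quant_config.resolve("model.conv", "Conv1d").dtype)    # fp32
print(quant_config.resolve("model.attn.q", "Linear").bits)   # 4 (global default)
```

The real API applies the same precedence idea when a quantization config is dispatched over a model's modules.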

3. How to specify an accelerator?

> Neural Compressor provides automatic accelerator detection, including HPU, XPU, CUDA, and CPU.

> The automatically detected accelerator may not suit every case (for example, poor performance or memory limitations on the detected device). In such situations, users can override the detected accelerator by setting the environment variable `INC_TARGET_DEVICE`.

> Usage:

```bash
export INC_TARGET_DEVICE=cpu
```
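
The selection logic described above can be sketched roughly as follows (a hypothetical illustration, not the library's actual detection code — the probe dictionary is a placeholder for real HPU/XPU/CUDA availability checks):

```python
# Hypothetical sketch: auto-detect an accelerator in priority order,
# but let the INC_TARGET_DEVICE environment variable override it.
import os


def detect_accelerator() -> str:
    # Placeholder probes; a real implementation would query the runtimes.
    probes = {"hpu": False, "xpu": False, "cuda": False}
    for name, available in probes.items():
        if available:
            return name
    return "cpu"


def select_accelerator() -> str:
    override = os.environ.get("INC_TARGET_DEVICE")
    return override.lower() if override else detect_accelerator()


os.environ["INC_TARGET_DEVICE"] = "cpu"
print(select_accelerator())  # cpu
```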