Enable quant model support #1074

jiqing-feng · 2024-12-16T05:35:09Z

This PR enables support for BNB models. Even though we cannot use fused linear in a quant model, there is still a 20% decoding speed-up on llama2-7b.

Will rebase it after #1054 is merged.

Signed-off-by: jiqing-feng <[email protected]>

HuggingFaceDocBuilderDev · 2024-12-16T05:40:17Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


jiqing-feng added 10 commits December 9, 2024 12:31

enable IPEXModelForSeq2SeqLM

3888824

Signed-off-by: jiqing-feng <[email protected]>

set static cache

f9fa807

Signed-off-by: jiqing-feng <[email protected]>

add tests for IPEXModelForSeq2SeqLM

202df43

Signed-off-by: jiqing-feng <[email protected]>

add docs

4488073

Signed-off-by: jiqing-feng <[email protected]>

fix readme

16fecf8

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into text2text

de501f4

refactor compile

4225bf0

Signed-off-by: jiqing-feng <[email protected]>

fix check

2ac7ecf

Signed-off-by: jiqing-feng <[email protected]>

fix ruff check

24b988c

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'huggingface:main' into text2text

5c4f9a1

jiqing-feng marked this pull request as draft December 16, 2024 05:35

jiqing-feng added 7 commits December 16, 2024 12:10

enable quantized model

46b93a4

Signed-off-by: jiqing-feng <[email protected]>

add bnb test

82d39ce

Signed-off-by: jiqing-feng <[email protected]>

add bnb tests in yaml

7dc08da

Signed-off-by: jiqing-feng <[email protected]>

fix tests

30027ff

Signed-off-by: jiqing-feng <[email protected]>

disable bnb tests

314db04

Signed-off-by: jiqing-feng <[email protected]>

fix gpt2

87656ca

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into quant

9a7e931
