
[output issue] found mistakes in llama-3-70b output by bf16_int4 during benchmark #413

Open
intelyoungway opened this issue May 21, 2024 · 1 comment


@intelyoungway

weights: Meta-Llama-3-70B-Instruct
precision: bf16_int4 (vs. bf16)
version: 1.6.0
hardware: 2S-SPR9468 (Quadrant/Flat)
system: Ubuntu 22.04 LTS container (latest XFT image)
kernel: 5.17.3
command:

bf16 precision:

bash run_benchmark.sh -m llama-3-70b -d bf16 -s 2 -bs 1 -in 1024 -out 128 -i 1

bf16_int4:

bash run_benchmark.sh -m llama-3-70b -d bf16_int4 -s 2 -bs 1 -in 1024 -out 128 -i 1

issue:

On bf16 precision, the output is valid:
[screenshot: bf16-output-is-ok]

On bf16_int4 precision, the output is invalid:
[screenshot: bf16-int4-output-is-invalid]

@pujiang2018
Contributor

A new quantization mechanism is under design; the potential fix will need some time.
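For context on where int4 precision loss enters, here is a minimal sketch of symmetric per-group int4 weight quantization. This is an illustrative assumption, not necessarily the actual scheme xFasterTransformer uses for bf16_int4; the group size and rounding rule are placeholders.

```python
import numpy as np

def quantize_int4(w, group_size=128):
    """Symmetric per-group int4 quantization: each group of `group_size`
    weights shares one scale, and values map to integers in [-8, 7]."""
    w = w.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # one scale per group
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    """Reconstruct approximate fp32 weights from int4 codes and scales."""
    return (q.astype(np.float32) * scale).reshape(-1)

# Round-trip a random weight tensor to see the worst-case per-weight error.
rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)
err = np.abs(w - w_hat).max()
print(f"max abs round-trip error: {err:.4f}")
```

For a 70B model, this per-weight error accumulates across layers, which is why an outlier-sensitive grouping or calibration scheme can make the difference between valid and garbage output.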
