diff --git a/quanto-introduction.md b/quanto-introduction.md
index a3eedee1e6..7d3775c6d8 100644
--- a/quanto-introduction.md
+++ b/quanto-introduction.md
@@ -108,14 +108,14 @@ Note: the first bar in each group always corresponds to the non-quantized model.
- mistralai/Mistral-7B-v0.1 Lambada prediction accuracy
+ mistralai/Mistral-7B-v0.1 Lambada prediction accuracy
- mistralai/Mistral-7B-v0.1 Lambada prediction accuracy
+ mistralai/Mistral-7B-v0.1 Lambada prediction accuracy
@@ -126,7 +126,7 @@ The graph below gives the latency per-token measured on an NVIDIA A100 GPU.
- mistralai/Mistral-7B-v0.1 Mean Latency per token
+ mistralai/Mistral-7B-v0.1 Mean Latency per token