Commit: Update documentation (#485)

echarlaix authored Dec 7, 2023
1 parent 3da80f6 commit f32d501
Showing 2 changed files with 3 additions and 2 deletions.
README.md: 3 changes (2 additions & 1 deletion)
@@ -75,12 +75,13 @@ It is possible to export your model to the [OpenVINO](https://docs.openvino.ai/2
```plain
optimum-cli export openvino --model gpt2 ov_model
```

- If you add `--int8`, the weights will be quantized to INT8, the activations will be kept in floating point precision.
+ If you add `--int8`, the model's linear and embedding weights will be quantized to INT8, while the activations will be kept in floating point precision.

```plain
optimum-cli export openvino --model gpt2 --int8 ov_model
```

+ To apply quantization on both weights and activations, see the [documentation](https://huggingface.co/docs/optimum/main/en/intel/optimization_ov).
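
Not part of this commit: as a rough illustration of the static, weights-plus-activations flow the linked documentation covers, a minimal sketch using `OVQuantizer` from `optimum.intel` could look like the following. The model ID, calibration dataset, and preprocessing function are illustrative assumptions, not taken from this diff.

```python
# Illustrative sketch only: static quantization of both weights and
# activations with optimum-intel's OVQuantizer.
from functools import partial

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.intel import OVQuantizer

# Model and dataset choices here are assumptions for the example.
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def preprocess_fn(examples, tokenizer):
    # Tokenize calibration samples the same way inputs are tokenized at inference.
    return tokenizer(examples["sentence"], padding=True, truncation=True)

quantizer = OVQuantizer.from_pretrained(model)
# A small calibration set is used to estimate activation ranges.
calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
    num_samples=100,
    dataset_split="train",
)
quantizer.quantize(calibration_dataset=calibration_dataset, save_directory="ov_model_int8")
```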

#### Inference:

docs/source/inference.mdx: 2 changes (1 addition & 1 deletion)
@@ -102,7 +102,7 @@ You can also apply INT8 quantization on your models weights when exporting your
optimum-cli export openvino --model gpt2 --int8 ov_model
```

- This will results in the exported model linear and embedding layers to be quanrtized to INT8, the activations will be kept in floating point precision.
+ This results in the exported model's linear and embedding layers being quantized to INT8, while the activations are kept in floating point precision.

This can also be done when loading your model by setting the `load_in_8bit` argument when calling the `from_pretrained()` method.
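
For instance, a minimal sketch of that path (the gpt2 checkpoint and save directory are carried over from the examples above; `OVModelForCausalLM` is optimum-intel's OpenVINO counterpart of `AutoModelForCausalLM`):

```python
# Minimal sketch: export gpt2 to OpenVINO and quantize its weights to INT8
# at load time via the `load_in_8bit` argument mentioned above.
from optimum.intel import OVModelForCausalLM

# `export=True` converts the PyTorch checkpoint to OpenVINO on the fly;
# `load_in_8bit=True` quantizes the linear and embedding weights to INT8
# while activations stay in floating point.
model = OVModelForCausalLM.from_pretrained("gpt2", export=True, load_in_8bit=True)
model.save_pretrained("ov_model")
```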

