diff --git a/docs/source/openvino/export.mdx b/docs/source/openvino/export.mdx
index 2b4ad4f05..3e7e458c0 100644
--- a/docs/source/openvino/export.mdx
+++ b/docs/source/openvino/export.mdx
@@ -78,7 +78,7 @@ Optional arguments:
   --ratio RATIO         A parameter used when applying 4-bit quantization to control the ratio between 4-bit and 8-bit
                         quantization. If set to 0.8, 80% of the layers will be quantized to int4 while 20% will be
                         quantized to int8. This helps to achieve better accuracy at the sacrifice of the model size
-                        and inference latency. Default value is 1.0. Note: If dataset is provided, and the ration is
+                        and inference latency. Default value is 1.0. Note: If dataset is provided, and the ratio is
                         less than 1.0, then data-aware mixed precision assignment will be applied.
   --sym                 Whether to apply symmetric quantization
   --group-size GROUP_SIZE