
How do I target NPU for quantization? #38

Open
kleiti opened this issue Nov 12, 2024 · 3 comments

Comments

@kleiti

kleiti commented Nov 12, 2024

I can provide the execution provider like this:
config.StaticQuantConfig(calibration_data_reader=data_reader, quant_format=QuantFormat.QOperator, execution_provider="DmlExecutionProvider"), but there seems to be no way to specify the device as NPU. How do I do that?
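For context, the calibration_data_reader passed to StaticQuantConfig just has to yield input feeds one batch at a time through a get_next() method. A minimal duck-typed sketch (the input name "input" and the toy data are illustrative; real readers typically subclass onnxruntime.quantization.CalibrationDataReader and yield numpy arrays matching the model's inputs):

```python
class ToyDataReader:
    """Minimal calibration data reader sketch: yields one
    {input_name: tensor} feed per get_next() call, then None."""

    def __init__(self, batches):
        self._it = iter(batches)

    def get_next(self):
        # Return the next input feed, or None when calibration data is exhausted.
        return next(self._it, None)


reader = ToyDataReader([{"input": [[0.1, 0.2]]}, {"input": [[0.3, 0.4]]}])
```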

@mengniwang95
Contributor

Hi, if you use ONNX Runtime to run the quantized model on an NPU, providing the execution provider is enough.
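In other words, device targeting happens at inference time, not at quantization time. A minimal sketch (file name and input name are illustrative) of running the already-quantized model through the DirectML EP, with CPU fallback for any operator DML cannot handle:

```python
# Providers are tried in order; ONNX Runtime falls back to
# CPUExecutionProvider for nodes DmlExecutionProvider cannot place.
providers = ["DmlExecutionProvider", "CPUExecutionProvider"]

# Uncomment to actually run (requires onnxruntime-directml on Windows):
# import onnxruntime as ort
# session = ort.InferenceSession("model_quant.onnx", providers=providers)
# outputs = session.run(None, {"input": input_tensor})
```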

@kleiti
Author

kleiti commented Nov 13, 2024

How does it select between GPU and NPU if using DmlExecutionProvider?

@mengniwang95
Contributor

Hi, since we currently provide only limited support and testing for the DML EP, we don't expose many parameters for it.
You can check this onnxruntime PR about device selection for the DML EP: microsoft/onnxruntime#17612.
Actually, you can quantize the model first, and then select the device later at inference time.
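A sketch of that "quantize first, select device later" flow, assuming the DML EP's device_id provider option is used to pick a DirectML adapter at session creation (which adapter index corresponds to the NPU is machine-specific; see microsoft/onnxruntime#17612 for the device-selection details):

```python
def dml_providers(device_id=0):
    """Build a providers list for a later InferenceSession: the chosen
    DirectML adapter first, with CPU fallback. device_id is a DML EP
    provider option selecting which adapter (GPU or NPU) to use."""
    return [("DmlExecutionProvider", {"device_id": device_id}),
            "CPUExecutionProvider"]


# Quantization targets the EP only; the device is chosen later, e.g.:
# import onnxruntime as ort
# session = ort.InferenceSession("model_quant.onnx",
#                                providers=dml_providers(device_id=1))
```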
