
How do I target NPU for quantization? #38

Open
kleiti opened this issue Nov 12, 2024 · 3 comments

Comments

@kleiti

kleiti commented Nov 12, 2024

I can provide the execution provider like this:
config.StaticQuantConfig(calibration_data_reader=data_reader, quant_format=QuantFormat.QOperator, execution_provider="DmlExecutionProvider"), but there seems to be no way to specify the device as NPU. How do I do that?
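For context, the calibration_data_reader passed to StaticQuantConfig just has to yield input feeds one batch at a time through a get_next() method. A minimal duck-typed sketch (the input name "input" and the toy data are illustrative; real readers typically subclass onnxruntime.quantization.CalibrationDataReader and yield numpy arrays matching the model's inputs):

```python
class ToyDataReader:
    """Minimal calibration data reader sketch: yields one
    {input_name: tensor} feed per get_next() call, then None."""

    def __init__(self, batches):
        self._it = iter(batches)

    def get_next(self):
        # Return the next input feed, or None when calibration data is exhausted.
        return next(self._it, None)


reader = ToyDataReader([{"input": [[0.1, 0.2]]}, {"input": [[0.3, 0.4]]}])
```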

@mengniwang95
Contributor

Hi, if you use ONNX Runtime to run the quantized model on an NPU, providing the execution provider is enough.
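In other words, device targeting happens at inference time, not at quantization time. A minimal sketch (file name and input name are illustrative) of running the already-quantized model through the DirectML EP, with CPU fallback for any operator DML cannot handle:

```python
# Providers are tried in order; ONNX Runtime falls back to
# CPUExecutionProvider for nodes DmlExecutionProvider cannot place.
providers = ["DmlExecutionProvider", "CPUExecutionProvider"]

# Uncomment to actually run (requires onnxruntime-directml on Windows):
# import onnxruntime as ort
# session = ort.InferenceSession("model_quant.onnx", providers=providers)
# outputs = session.run(None, {"input": input_tensor})
```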

@kleiti
Author

kleiti commented Nov 13, 2024

How does it select between GPU and NPU if using DmlExecutionProvider?

@mengniwang95
Contributor

Hi, since we currently provide only limited support and testing for the DML EP, we don't expose many parameters for it.
You can check this onnxruntime PR about device selection for the DML EP: microsoft/onnxruntime#17612.
Actually, you can quantize the model first, and then select the device later at inference time.
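A sketch of that "quantize first, select device later" flow, assuming the DML EP's device_id provider option is used to pick a DirectML adapter at session creation (which adapter index corresponds to the NPU is machine-specific; see microsoft/onnxruntime#17612 for the device-selection details):

```python
def dml_providers(device_id=0):
    """Build a providers list for a later InferenceSession: the chosen
    DirectML adapter first, with CPU fallback. device_id is a DML EP
    provider option selecting which adapter (GPU or NPU) to use."""
    return [("DmlExecutionProvider", {"device_id": device_id}),
            "CPUExecutionProvider"]


# Quantization targets the EP only; the device is chosen later, e.g.:
# import onnxruntime as ort
# session = ort.InferenceSession("model_quant.onnx",
#                                providers=dml_providers(device_id=1))
```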
