Using --fp16 currently halves the model size, but memory usage during inference is unchanged. What do I need to change to reduce runtime memory?
The --fp16 flag used during model conversion has no connection to whether fp16 is used at inference time. The switch for fp16 inference is: build MNN with MNN_ARM82 enabled, and set precision to low when creating the session or module; fp16 optimization is then enabled if the device supports it.
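A minimal sketch of that session-level switch, using MNN's standard C++ Interpreter API (the model file name `model.mnn` is a placeholder):

```cpp
#include <MNN/Interpreter.hpp>
#include <memory>

int main() {
    // Model path is a placeholder. MNN itself must have been built with
    // the ARMv8.2 option (e.g. cmake .. -DMNN_ARM82=ON) for fp16 kernels
    // to be compiled in at all.
    std::shared_ptr<MNN::Interpreter> net(
        MNN::Interpreter::createFromFile("model.mnn"));

    MNN::ScheduleConfig scheduleConfig;
    MNN::BackendConfig backendConfig;
    // Precision_Low requests fp16 compute; it only takes effect on
    // devices whose CPU actually supports ARMv8.2 fp16 arithmetic,
    // otherwise MNN falls back to fp32.
    backendConfig.precision = MNN::BackendConfig::Precision_Low;
    scheduleConfig.backendConfig = &backendConfig;

    auto session = net->createSession(scheduleConfig);
    // ... fill inputs, net->runSession(session), read outputs ...
    return 0;
}
```

With fp16 actually enabled this way, feature-map and weight buffers are held in half precision, which is what reduces runtime memory, not the converter flag.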
In addition, you can consider dynamic quantization; a sketch follows below.
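A hedged sketch of one dynamic-quantization path. The `--weightQuantBits` converter flag and the `Memory_Low` runtime mode are assumptions based on MNN's weight-quantization support, not something confirmed in this thread, and the exact flags may differ across MNN versions:

```cpp
// Assumed converter invocation (weight-only int8 quantization):
//   ./MNNConvert -f ONNX --modelFile model.onnx \
//       --MNNModel model_int8.mnn --bizCode biz --weightQuantBits 8

#include <MNN/Interpreter.hpp>
#include <memory>

int main() {
    // "model_int8.mnn" is a placeholder for the weight-quantized model.
    std::shared_ptr<MNN::Interpreter> net(
        MNN::Interpreter::createFromFile("model_int8.mnn"));

    MNN::ScheduleConfig scheduleConfig;
    MNN::BackendConfig backendConfig;
    // Memory_Low asks the backend to keep weights in their quantized
    // form and quantize activations on the fly (dynamic quantization),
    // rather than dequantizing everything back to float up front.
    backendConfig.memory = MNN::BackendConfig::Memory_Low;
    scheduleConfig.backendConfig = &backendConfig;

    auto session = net->createSession(scheduleConfig);
    // ... run inference as usual ...
    return 0;
}
```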
Thank you. One more question: after converting the model to int8 with dynamic quantization, is it likewise the case that only the model size shrinks, while inference dequantizes the weights and runtime memory stays the same?