I have now successfully run RVM's ONNX and MNN models on CPU, but it is still too slow: matting a single image with MNN takes about 138 ms. To speed things up further I'd like to try fp16 and int8, but I saw the author say in another issue that fp16 is not currently supported. My question is: is it lite that doesn't support fp16 yet, or does MNN itself not have fp16? And if fp16 can be used, do I need to write my own inference framework the way you did?
lite hasn't considered fp16 yet; fp16 support may be added later, most likely on top of MNN and ORT. MNN itself does support fp16: you can enable it at the MNNConvert stage by passing the --fp16 flag. For a more detailed discussion, see my other answer.
One more note: MNN can load fp16 and int8 models directly for inference, and the pre/post-processing logic is the same as for fp32 because MNN handles the conversion internally, so the code in mnn_rvm.cpp should need little or no change. You can give it a try.
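For reference, here is a minimal sketch of what loading a converted fp16 model looks like with MNN's C++ API. The model path rvm_fp16.mnn, the single-input handling, and the output name "fgr" are illustrative assumptions, not the exact code in mnn_rvm.cpp:

```cpp
#include <memory>
#include <vector>
#include <MNN/Interpreter.hpp>
#include <MNN/Tensor.hpp>

int main() {
  // Loading an fp16 .mnn file is identical to loading an fp32 one;
  // MNN converts the weights internally, so the calling code stays fp32.
  std::shared_ptr<MNN::Interpreter> net(
      MNN::Interpreter::createFromFile("rvm_fp16.mnn"));  // hypothetical path
  MNN::ScheduleConfig config;
  config.numThread = 4;
  MNN::Session* session = net->createSession(config);

  // Feed plain fp32 data, exactly as for the fp32 model. (The real RVM graph
  // has several inputs, e.g. src and the recurrent states; each is filled the
  // same way via getSessionInput with its name.)
  MNN::Tensor* input = net->getSessionInput(session, nullptr);
  std::vector<float> src(input->elementSize(), 0.0f);  // dummy preprocessed image
  std::unique_ptr<MNN::Tensor> host(MNN::Tensor::create(
      input->shape(), halide_type_of<float>(), src.data(), MNN::Tensor::CAFFE));
  input->copyFromHostTensor(host.get());

  net->runSession(session);

  // Read an output back as fp32; "fgr" is assumed from RVM's usual output names.
  MNN::Tensor* fgr = net->getSessionOutput(session, "fgr");
  MNN::Tensor fgrHost(fgr, fgr->getDimensionType());
  fgr->copyToHostTensor(&fgrHost);
  return 0;
}
```

In other words, the only thing that changes relative to fp32 is which .mnn file you point at; an int8 model loads the same way.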
Converting a model to fp16 with MNN is straightforward. For example: MNNConvert -f ONNX --modelFile cdfs1116_sim.onnx --MNNModel cdfsout_fp16.mnn --fp16 --bizCode biz. Everything else is the same as for fp32.