Mismatch between CPU and NPU in simple Conv2D #210

Open
deneriz-veridas opened this issue Aug 21, 2024 · 3 comments

Comments

@deneriz-veridas

Hi,

We have been using the VX delegate to run TFLite models on the NPU of the i.MX 8M Plus, a VeriSilicon VIPNano-SI+. In doing so, we found mismatches between the model's output on the CPU and on the NPU, even for a model consisting of a single Conv2D layer with a 3x3 kernel and 'same' padding. This plot shows the distribution of the mismatch.

image

Moreover, these mismatch errors propagate through successive layers of the model. This file (conv-sequence.zip) contains the decomposition of a model with 20 Conv2D layers into 20 models, each adding one layer to the previous one, which allows the mismatch to be measured after each layer. The following plot shows this propagation across the model.

image

Is there a way to avoid this mismatch? Is this a known issue with this NPU?

We are using TFLite Runtime 2.9.1.1 and the forked iMX delegate at version lf-5.15.71_2.2.0.
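A minimal sketch of how a mismatch distribution like the one plotted could be computed, assuming the quantized CPU and NPU outputs for the same input have already been collected as flat sequences of integers. The function names and the sample values are hypothetical and are not taken from the attached models.

```python
from collections import Counter


def mismatch_histogram(cpu_out, npu_out):
    """Count how often each integer difference (npu - cpu) occurs."""
    if len(cpu_out) != len(npu_out):
        raise ValueError("outputs must have the same number of elements")
    return Counter(n - c for c, n in zip(cpu_out, npu_out))


def mismatch_summary(cpu_out, npu_out):
    """Return the largest absolute difference and the fraction of mismatched elements."""
    hist = mismatch_histogram(cpu_out, npu_out)
    total = sum(hist.values())
    mismatched = total - hist.get(0, 0)
    return {
        "max_abs_diff": max((abs(d) for d in hist), default=0),
        "mismatch_ratio": mismatched / total if total else 0.0,
    }


# Hypothetical int8 outputs of one layer (illustrative values only).
cpu = [12, -7, 0, 127, -128, 33]
npu = [12, -6, 0, 126, -128, 33]

print(sorted(mismatch_histogram(cpu, npu).items()))
print(mismatch_summary(cpu, npu))
```

Running the same summary on each of the 20 incremental models would give one data point per layer for the propagation plot.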

@jetxeberria

I'm also seeing mismatches between CPU and NPU executions. This is very annoying! Help please!

@sunshinemyson
Contributor

@deneriz-veridas @jetxeberria ,

Thanks for your feedback. Very nice data analysis. Our NPU integer math is not bit-accurate compared to the TFLite CPU implementation: for a single layer, the distance is 1 bit.

In our practice, the difference doesn't affect the top-1 accuracy of mobilenet-v1. We usually check the result from the application's point of view, such as labels/boxes, rather than comparing the absolute error between CPU and NPU.
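An application-level check of this kind could look roughly like the sketch below, which compares top-1 labels rather than raw output values. It assumes the per-sample class scores from the CPU and NPU runs are already available as lists; the helper names and the scores are hypothetical.

```python
def top1(scores):
    """Index of the highest score (first index wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def top1_agreement(cpu_batch, npu_batch):
    """Fraction of samples for which CPU and NPU predict the same label."""
    if len(cpu_batch) != len(npu_batch) or not cpu_batch:
        raise ValueError("batches must be non-empty and of equal length")
    same = sum(top1(c) == top1(n) for c, n in zip(cpu_batch, npu_batch))
    return same / len(cpu_batch)


# Hypothetical quantized class scores for two samples (illustrative only).
cpu_batch = [[10, 200, 30], [90, 91, 5]]
npu_batch = [[11, 199, 30], [91, 90, 5]]

print(top1_agreement(cpu_batch, npu_batch))
```

In this made-up data the first sample keeps its label despite 1-LSB differences, while the second, a near-tie, flips. A metric like this only reflects output differences when scores are close.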

@deneriz-veridas
Author

Hi @sunshinemyson,

Thanks for having a look at this issue. I understand these errors can have minimal impact in classification applications. However, we are working with a model that generates embeddings, which we use to compute the distance between them. In this application, the errors matter much more.

Could you elaborate on the bit-accuracy of the NPU integer math? Have you characterized when this happens? We are looking for a way to avoid or mitigate this. Thanks in advance!
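To illustrate why embeddings are more sensitive, here is a rough sketch that dequantizes two versions of the same embedding (CPU and NPU) and measures the cosine distance between them. The scale, zero point, and embedding values are assumptions made up for the example, not measurements from our model.

```python
import math


def dequantize(q, scale, zero_point):
    """Map quantized integers back to real values: scale * (q - zero_point)."""
    return [scale * (v - zero_point) for v in q]


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical quantization parameters and int8 embeddings (illustrative only).
SCALE, ZERO_POINT = 0.05, 0
cpu_q = [20, -13, 7, 0]
npu_q = [21, -13, 6, 1]  # same input, each element off by at most 1 LSB

cpu_emb = dequantize(cpu_q, SCALE, ZERO_POINT)
npu_emb = dequantize(npu_q, SCALE, ZERO_POINT)

# Distance between two embeddings of the *same* input; ideally this would be 0.
print(cosine_distance(cpu_emb, npu_emb))
```

Any non-zero value here eats directly into the margin between matching and non-matching pairs, whereas a classifier only changes its answer when an argmax flips.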
