
Add tests for TFLite models #5

Open
ScottTodd opened this issue Aug 9, 2024 · 5 comments
Labels: enhancement (New feature or request)

@ScottTodd (Member)

See nod-ai/SHARK-TestSuite#291

Search around for upstream test suites (a corpus of .tflite files).

We could also test TOSA operators, perhaps using https://git.mlplatform.org/tosa/conformance_tests.git/ (see https://www.mlplatform.org/tosa/software.html).
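If we find such a corpus, a minimal pytest sketch for sweeping it could look like this (the corpus directory and test body are hypothetical, not existing code):

```python
from pathlib import Path

import pytest

# Hypothetical location of a vendored/downloaded corpus of .tflite files.
CORPUS_DIR = Path(__file__).parent / "corpus"


@pytest.mark.parametrize(
    "tflite_file",
    sorted(CORPUS_DIR.glob("**/*.tflite")),
    ids=lambda path: path.stem,
)
def test_tflite_model(tflite_file: Path):
    # Import, compile, and run the model with IREE here.
    ...
```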

@ScottTodd (Member, Author)

We can also test across different TensorFlow package versions to show when compatibility breaks (e.g. when TOSA ops change; see https://discord.com/channels/689900678990135345/689900680009482386/1276633806643662868 and https://discord.com/channels/689900678990135345/689900680009482386/1255019974867550208).
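
As a sketch, a version-gated xfail mark could make those breaks visible across a test matrix (the version boundary and reason here are assumptions for illustration):

```python
import pytest
import tensorflow as tf
from packaging.version import Version

TF_VERSION = Version(tf.__version__)

# Hypothetical boundary: mark tests that break once the TFLite-to-TOSA
# lowering changes in a newer TensorFlow release.
xfail_on_new_tosa_ops = pytest.mark.xfail(
    TF_VERSION >= Version("2.17"),
    reason="TOSA op definitions changed in this TensorFlow release",
)
```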

ScottTodd added a commit that referenced this issue Sep 19, 2024
Progress on #6.

A sample test report HTML file is available here:
https://scotttodd.github.io/iree-test-suites/onnx_models/report_2024_09_17.html

These new tests (a rough code sketch follows this list):

* Download models from https://github.com/onnx/models
* Extract metadata from the models to determine which functions to call
with random data
* Run the models through [ONNX Runtime](https://onnxruntime.ai/) as a
reference implementation
* Import the models using `iree-import-onnx` (until we have a better
API: iree-org/iree#18289)
* Compile the models using `iree-compile` (currently just for `llvm-cpu`
but this could be parameterized later)
* Run the models using `iree-run-module`, checking outputs using
`--expected_output` and the reference data
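
Per model, the flow is roughly the following (paths and flags are illustrative, not the exact helper code):

```python
import subprocess

# Hypothetical paths; the real suite derives these from downloaded models.
onnx_path = "model.onnx"
mlir_path = "model.mlir"
vmfb_path = "model_cpu.vmfb"

# Import ONNX to MLIR (until a better API exists: iree-org/iree#18289).
subprocess.run(["iree-import-onnx", onnx_path, "-o", mlir_path], check=True)

# Compile for llvm-cpu (could be parameterized over backends later).
subprocess.run(
    ["iree-compile", mlir_path,
     "--iree-hal-target-backends=llvm-cpu", "-o", vmfb_path],
    check=True,
)

# Run, checking outputs against the ONNX Runtime reference data.
subprocess.run(
    ["iree-run-module", f"--module={vmfb_path}",
     "--input=@input_0.npy", "--expected_output=@output_0.npy"],
    check=True,
)
```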

Tests are written in Python using a set of pytest helper functions. As
the tests run, they can log details about what commands they are
running. When run locally, the `artifacts/` directory will contain all
the relevant files. More can be done in follow-up PRs to improve the
ergonomics there (like generating flagfiles).

Each test case can use XFAIL like
`@pytest.mark.xfail(raises=IreeRunException)`. As we test across
multiple backends or want to configure the test suite from another repo
(e.g. [iree-org/iree](https://github.com/iree-org/iree)), we can explore
more expressive marks.
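
For instance (model and helper names are hypothetical):

```python
import pytest

# Hypothetical helpers mirroring what the suite's utils module provides.
from .utils import IreeRunException, compile_and_run


@pytest.mark.xfail(raises=IreeRunException)
def test_resnet50():
    compile_and_run("resnet50.onnx")  # expected to fail at runtime for now
```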

Note that unlike the ONNX _operator_ tests, these tests use
`onnxruntime` and `iree-import-onnx` at test time. The operator tests
handle that as an infrequently run offline step. We could do something
similar here, but the test inputs and outputs can be rather large for
real models and that gets into Git LFS or cloud storage territory.

If this test authoring model works well enough, we can do something
similar for other ML frameworks like TFLite
(#5).
ScottTodd self-assigned this Dec 9, 2024
@ScottTodd (Member, Author)

I may start on this soon, given some recent regressions in TFLite/TOSA program compilation.

ScottTodd added the enhancement label Dec 11, 2024
@ScottTodd (Member, Author)

https://pypi.org/project/ai-edge-litert/ has no wheels published for Windows, and the same is true for the original https://pypi.org/project/tflite-runtime/. That limits our testing options. We might be able to test compilation without execution, or generate golden test inputs/outputs on Linux and check those files in.

@ScottTodd (Member, Author)

Ah! This code from https://github.com/iree-org/iree/blob/main/integrations/tensorflow/test/python/iree_tfl_tests/test_util.py still works on Windows:

```python
import tensorflow.compat.v2 as tf

self.tflite_interpreter = tf.lite.Interpreter(model_path=self.tflite_file)
```
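
A minimal self-contained sketch of using that interpreter to produce golden outputs (the model path and random inputs are illustrative):

```python
import numpy as np
import tensorflow.compat.v2 as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed random data matching the model's input signature.
input_data = np.random.rand(*input_details[0]["shape"]).astype(
    input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

# This output could be checked in as a golden reference for other platforms.
golden_output = interpreter.get_tensor(output_details[0]["index"])
np.save("expected_output_0.npy", golden_output)
```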

ScottTodd added a commit that referenced this issue Jan 15, 2025
Progress on #5.

This contains two simple test cases for demonstration purposes, one of
which is currently failing due to a regression:
iree-org/iree#19402.

The test suite follows the same structure as the onnx_models test suite
in this repository. Opportunities for cleanup and refactoring will become
more evident as this grows. We could, for example, share the
`compile_mlir_with_iree` helper function between both test suites.
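
A rough sketch of what a shared helper could look like (the signature is an assumption, not the current code):

```python
import subprocess
from pathlib import Path


def compile_mlir_with_iree(mlir_path: Path, target: str = "llvm-cpu") -> Path:
    """Compiles an MLIR file with iree-compile and returns the .vmfb path."""
    vmfb_path = mlir_path.with_suffix(f".{target}.vmfb")
    subprocess.run(
        ["iree-compile", str(mlir_path),
         f"--iree-hal-target-backends={target}", "-o", str(vmfb_path)],
        check=True,
    )
    return vmfb_path
```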
@ScottTodd (Member, Author)

Landed a test suite with two tests so far, running nightly: https://github.com/iree-org/iree-test-suites/actions/workflows/test_litert_models.yml?query=branch%3Amain

The new tests also show how test_mobilenet_v1_0_25_224 is newly failing after iree-org/iree#19683, as expected:
https://github.com/iree-org/iree-test-suites/actions/runs/12845103100/job/35818831797#step:6:21

```
ERROR    litert_models.utils:utils.py:96 Compilation of '/home/runner/.cache/kagglehub/models/tensorflow/mobilenet-v1/tfLite/0-25-224/1/1_cpu.vmfb' failed
ERROR    litert_models.utils:utils.py:97 iree-compile stdout:
ERROR    litert_models.utils:utils.py:98 
ERROR    litert_models.utils:utils.py:99 iree-compile stderr:
ERROR    litert_models.utils:utils.py:100 <unknown>:0: error: loc("MobilenetV1/MobilenetV1/Conv2d_0/Relu6"): 'tosa.conv2d' op requires attribute 'acc_type'
```
