Update e2e/lm-eval test infrastructure #1323

dbarbuzzi · 2025-04-03T22:30:23Z

SUMMARY:
Remove some no longer necessary flags from the run_tests.sh script and update e2e/lm-eval test infra to use pytest’s parametrization.

Unused flags

A couple of the flags in the test script were added to support reporting efforts in our CI/CD, but they are no longer necessary as they are handled outside the test script.

e2e/lm-eval test parametrization

This change is primarily to improve the reporting for these sets of tests, particularly about the test name that is generated. There are some challenges that the current, non-pytest parametrization approach introduce:

The parametrization index is included in the generated name. This is problematic because any change to the order of the parametrization then results in breaking references (e.g., historic views of the test results) between past and future runs.
- Technically, this is not an issue in these two examples because each config is executed individually, but it would be great to transition any other instances to pytest’s built-in parametrization to help with our test tracking.
Parametrizing a class results in a 'proper' class name, which normalizes certain values. Specifically, the parameters here are file paths which include slashes which are in turn normalized, replacing the slashes with different characters.
- Alternatively, pytest puts all parametrization in the test name itself and doesn't do any of this normalization, preserving the original data.

For point 2, an example of the before/after naming:

Before
classname: tests/e2e/vLLM/test_vllm.py::TestvLLM_0_tests_e2e_vLLM_configs_kv_cache_tinyllama_yaml
test name: test_vllm
full name: tests/e2e/vLLM/test_vllm.py::TestvLLM_0_tests_e2e_vLLM_configs_kv_cache_tinyllama_yaml::test_vllm

After
classname: tests/e2e/vLLM/test_vllm.py::TestvLLM
test name: test_vllm[tests/e2e/vLLM/configs/kv_cache_tinyllama.yaml]
full name: tests/e2e/vLLM/test_vllm.py::TestvLLM::test_vllm[tests/e2e/vLLM/configs/kv_cache_tinyllama.yaml]

TEST PLAN:
There are two (internal) test runs, one e2e and one lm-eval, showing the new changes working without issue to be communicated internally.

These flags were previously added to support running the tests in CI/CD, however, they are no longer necessary for that purpose. Signed-off-by: Domenic Barbuzzi <[email protected]>

This uses pytest’s built-in parametrization for the e2e/lm-eval tests which produces far saner test naming in the console output/report files. Signed-off-by: Domenic Barbuzzi <[email protected]>

github-actions · 2025-04-03T22:30:34Z

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

dbarbuzzi added 2 commits April 3, 2025 21:17

Drop unused flags

9f3e474

These flags were previously added to support running the tests in CI/CD, however, they are no longer necessary for that purpose. Signed-off-by: Domenic Barbuzzi <[email protected]>

Use pytest parametrization

8c0a55a

This uses pytest’s built-in parametrization for the e2e/lm-eval tests which produces far saner test naming in the console output/report files. Signed-off-by: Domenic Barbuzzi <[email protected]>

dbarbuzzi marked this pull request as ready for review April 4, 2025 13:33

dbarbuzzi added the ready When a PR is ready for review label Apr 4, 2025

brian-dellabetta approved these changes Apr 4, 2025

View reviewed changes

rahul-tuli approved these changes Apr 4, 2025

View reviewed changes

kylesayrs enabled auto-merge (squash) April 8, 2025 16:02

Merge branch 'main' into drop-unused-flags

8333e0c

kylesayrs merged commit 8f47eef into main Apr 8, 2025
8 checks passed

kylesayrs deleted the drop-unused-flags branch April 8, 2025 16:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update e2e/lm-eval test infrastructure #1323

Update e2e/lm-eval test infrastructure #1323

dbarbuzzi commented Apr 3, 2025 •

edited

Loading

github-actions bot commented Apr 3, 2025

Update e2e/lm-eval test infrastructure #1323

Update e2e/lm-eval test infrastructure #1323

Conversation

dbarbuzzi commented Apr 3, 2025 • edited Loading

github-actions bot commented Apr 3, 2025

dbarbuzzi commented Apr 3, 2025 •

edited

Loading