Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CLI script exporting CLIP Toy model IREE test data #672

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

sogartar
Copy link
Contributor

@sogartar sogartar commented Dec 10, 2024

This is required to have an easy way of exporting test data that will be
used in IREE to guard against regressions.

E.g.

python -m sharktank.models.clip.export_toy_text_model_iree_test_data \
  --output-dir=clip_toy_text_model

Refactor some of the existing tests to reuse the new export logic.

@sogartar
Copy link
Contributor Author

This PR requires that #664 is merged first.

@sogartar sogartar force-pushed the clip-text-model-export-toy-iree-test-data branch from fc67e1a to a2ea0c2 Compare December 10, 2024 23:49
This is required to have an easy way of exporting test data that will be
used in IREE to guard against regressions.
E.g.
```
python -m sharktank.models.clip.export_toy_text_model_iree_test_data \
  --output-path-prefix=clip_toy_text_model
```

Refactor some of the existing tests to reuse the new export logic.
@sogartar sogartar force-pushed the clip-text-model-export-toy-iree-test-data branch from a2ea0c2 to b71e906 Compare December 10, 2024 23:53
@sogartar sogartar changed the title WIP Add CLI script exporting CLIP Toy model IREE test data Add CLI script exporting CLIP Toy model IREE test data Dec 11, 2024
@sogartar sogartar marked this pull request as ready for review December 11, 2024 00:03
@@ -67,7 +72,7 @@ def export_clip_text_model_dataset_from_hugging_face(


def export_clip_text_model_mlir(
model: Union[ClipTextModel, PathLike],
model: Union[ClipTextModel, AnyPath],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave this as PathLike and remove the AnyPath. Users should just make sure to convert the string before invoking. (Same as above)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed AnyPath.

input_ids=input_ids,
target_iree_parameters_output_path=target_iree_parameters_output_path,
target_mlir_output_path=target_mlir_output_path,
input_ids_output_path=input_ids_output_path if i == 0 else None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split this across lines or put in an if conditional. The , between make the behavior extremely ambiguous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

target_iree_parameters_output_path=target_iree_parameters_output_path,
target_mlir_output_path=target_mlir_output_path,
input_ids_output_path=input_ids_output_path if i == 0 else None,
expected_last_hidden_state_output_path=expected_last_hidden_state_output_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above. This looks like it should just be in an if condition rather than a oneliner

def testSmokeExportToyIreeTestData(self):
from sharktank.models.clip.export_toy_text_model_iree_test_data import main

main([f"--output-path-prefix={self.path_prefix}clip_toy_text_model"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use pythons path library to operate on this instead of string manipulations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I changed it to use a directory as a prefix.

if (
not self.caching
or not os.path.exists(mlir_path)
or not os.path.exists(parameters_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional feels weird / wrong. Is it checking for defaults? Also it feels particularly weird for a test. Reconsider why this is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the caching. It is useful to speedup iteration, but it does add clutter.

@sogartar sogartar requested a review from rsuderman December 11, 2024 02:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants