Where are we with full fx + dynamo testing in CI? Should pass 1:1 w/ TS
A Framework for Performance Benchmarking
Updated Version of RFC #1169
TL;DR
An updated view on performance benchmarking, model functionality testing, and overall evaluation of Torch-TRT across models, compilation methods, and inputs.
Goal(s)
The primary goal of this document and discussion is to outline the upgrades and advances made since RFC #1169 and to suggest next steps for improving model coverage, inference performance, and FX/TS functionality.
Updates Since #1169
TorchBench
Integration with the existing PyTorch benchmarking framework has many positives, as well as a few drawbacks. The main positives are ease of maintenance, a wide net of models with streamlined installation procedures, and a polished CLI. A few drawbacks include the need to test custom configurations of our compilation, including dynamic batch and varying input batch size, segment length (for language models), and other customizations.
Still, integration with this tool could come in many forms. For example, one option is to author a PR adding Torch-TRT testing to the TorchBench CLI and models, while also keeping a small-form, single-model performance script in the Torch-TRT repository for testing more granular performance configurations. It is worth noting that there is already some existing functionality for Torch-TRT built into the TorchBench suite.
Use Cases
Assessing performance and functionality across many models, for example, those provided in Torch's benchmarking suite: https://github.com/pytorch/benchmark/tree/main/torchbenchmark/models
Defining evaluation stages of models in Torch-TRT (a scoring sketch follows the list below):
1. Unsuccessful Compilation
2. Successful Compilation, Unsuccessful Inference
3. Successful Compilation, Successful Inference
4. Inference faster than PyTorch
5. Inference faster than PyTorch --> ONNX --> TensorRT
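To make automated scoring concrete, the following is a minimal sketch of how these five stages could be encoded; the `EvaluationStage` enum and `score_model` helper are hypothetical names, not existing Torch-TRT or TorchBench APIs.

```python
from enum import IntEnum
from typing import Optional


class EvaluationStage(IntEnum):
    """Hypothetical scoring scale mirroring the five stages listed above."""
    COMPILATION_FAILED = 1
    INFERENCE_FAILED = 2
    INFERENCE_SUCCEEDED = 3
    FASTER_THAN_PYTORCH = 4
    FASTER_THAN_ONNX_TRT = 5


def score_model(
    compiled: bool,
    inference_ok: bool,
    trt_latency_ms: Optional[float],
    torch_latency_ms: Optional[float],
    onnx_trt_latency_ms: Optional[float],
) -> EvaluationStage:
    """Map the raw outcomes of one benchmark run onto an evaluation stage."""
    if not compiled:
        return EvaluationStage.COMPILATION_FAILED
    if not inference_ok or trt_latency_ms is None:
        return EvaluationStage.INFERENCE_FAILED
    stage = EvaluationStage.INFERENCE_SUCCEEDED
    if torch_latency_ms is not None and trt_latency_ms < torch_latency_ms:
        stage = EvaluationStage.FASTER_THAN_PYTORCH
    if onnx_trt_latency_ms is not None and trt_latency_ms < onnx_trt_latency_ms:
        stage = EvaluationStage.FASTER_THAN_ONNX_TRT
    return stage
```

Encoding the stages as an ordered enum makes it easy to aggregate results later (e.g., taking the maximum stage reached per model).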
Proposed APIs / UX
Bash scripts for evaluating Torch-TRT across all models in the Torch benchmarking suite, or some user-specified subset, with a data-aggregation mechanism to collect and score models automatically during the run. Furthermore, the bash scripts should handle versioning issues and verify that the installed dependencies are mutually compatible, to avoid crashes (a sketch of such a check follows this list).
Functionality added to TorchBench to include benchmarking of Torch-TRT models across both TorchScript and FX.
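As one concrete piece of the version handling mentioned above, here is a minimal sketch of a pre-flight dependency check the scripts could run before launching a benchmark; the expected version prefixes are placeholders rather than actual compatibility requirements.

```python
import sys

import torch
import torch_tensorrt

try:
    import tensorrt as trt
except ImportError:
    sys.exit("TensorRT Python bindings not found; aborting benchmark run.")

# Placeholder compatibility matrix; the real values would be maintained
# alongside the benchmarking scripts and updated per release.
EXPECTED_PREFIXES = {"torch": "2.0", "torch_tensorrt": "1.4", "tensorrt": "8.6"}

installed = {
    "torch": torch.__version__,
    "torch_tensorrt": torch_tensorrt.__version__,
    "tensorrt": trt.__version__,
}

for package, prefix in EXPECTED_PREFIXES.items():
    if not installed[package].startswith(prefix):
        sys.exit(
            f"Incompatible {package} version {installed[package]}; "
            f"expected {prefix}.x"
        )

print("Dependency versions verified:", installed)
```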
Limitations
The benchmarking additions, as scoped, will not include functionality for benchmarking a custom model of the user's choosing, but will instead focus on a set of key popular models to determine overall performance and coverage over a large class of model types. The TorchBench repository includes documentation on how to incorporate new models for benchmarking.
Internal Implementation
Design
New Python scripts are needed which interface with Torch-TRT, Torch, TensorRT, and the Torch benchmark models. For each model, batch size, and input shape, the script will compile the model using each of the desired compilation methods (e.g., TorchScript, FX, and Dynamo).
Functionality for these compilation methods already exists, except for Dynamo, and would only need to be refactored for full coverage; see TensorRT/tools/perf/perf_run.py, lines 309 to 328 at commit 2ef6c3a.
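As a rough illustration (not the existing perf_run.py code), a compile-and-time step for a single model and frontend might look like the sketch below, which uses the public torch_tensorrt.compile API and its ir switch; the warm-up count, iteration count, and CUDA-only timing are illustrative choices.

```python
import time

import torch
import torch_tensorrt


def compile_and_time(model, example_inputs, ir, iters=100):
    """Compile `model` with the requested frontend ("ts" or "fx") and return
    the median latency in milliseconds, or None if compilation fails."""
    try:
        trt_model = torch_tensorrt.compile(
            model,
            ir=ir,
            inputs=list(example_inputs),
            enabled_precisions={torch.float},
        )
    except Exception as err:
        print(f"[{ir}] compilation failed: {err}")
        return None

    timings = []
    with torch.no_grad():
        for _ in range(10):  # warm-up iterations
            trt_model(*example_inputs)
        torch.cuda.synchronize()
        for _ in range(iters):
            start = time.perf_counter()
            trt_model(*example_inputs)
            torch.cuda.synchronize()  # assumes a CUDA device
            timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    return timings[len(timings) // 2]
```

A driver loop would call this once per frontend, e.g. for ir in ("ts", "fx"), and record the outcome alongside the PyTorch baseline.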
Then, the script should aggregate statistics about each model run, including which evaluation stage is achieved by Torch-TRT, and coalesce these into an easy-to-use data structure such as a Pandas DataFrame.
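A brief sketch of that aggregation step; the column names and the example records are placeholders for illustration.

```python
import pandas as pd

# Each record would come from one (model, frontend, batch size) benchmark run;
# the values shown here are placeholders.
records = [
    {"model": "resnet50", "ir": "ts", "batch_size": 8,
     "stage": 4, "median_latency_ms": 3.1},
    {"model": "bert_base", "ir": "fx", "batch_size": 8,
     "stage": 3, "median_latency_ms": 7.9},
]

df = pd.DataFrame.from_records(records)

# Highest evaluation stage reached per model, plus the raw results persisted
# for comparison across commits and hardware.
print(df.groupby("model")["stage"].max())
df.to_csv("torchtrt_benchmark_results.csv", index=False)
```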
Implementation Phases
Prototype - S
MVP (1.5.0) - M
Extension Phase 1 - S