✨[Feature] Weight specific engine caching #3146

narendasan · 2024-09-04T17:51:38Z

Is your feature request related to a problem? Please describe.

Engine caching is currently weight-agnostic, which comes at the cost of producing lower-performance engines.

Describe the solution you'd like

If we know that the weights will be identical, we can cache higher-performance engines. The caching system would need to distinguish between these two kinds of cached engines and select the right one based on user settings.

TensorRT has a flag called kREFIT_IDENTICAL for this workflow.
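
To make the "distinguish these two caches" idea concrete, here is a minimal sketch of one way a cache key could separate weight-agnostic entries from weight-specific ones. Everything in it is an assumption for illustration: `CacheSettings`, `weights_identical`, `engine_cache_key`, and the `graph_signature` string are hypothetical names, not existing Torch-TensorRT or TensorRT APIs, and the actual design may differ.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CacheSettings:
    # Hypothetical user setting: True means the caller promises the weights
    # will not change, so a weight-specific engine may be cached and reused.
    weights_identical: bool = False


def engine_cache_key(
    graph_signature: str, weights: Dict[str, bytes], settings: CacheSettings
) -> str:
    """Build a cache key that keeps the two kinds of cached engines apart."""
    h = hashlib.sha256()
    h.update(graph_signature.encode("utf-8"))
    if settings.weights_identical:
        # Weight-specific entry: fold a digest of every weight into the key,
        # in sorted name order so the key is deterministic.
        h.update(b"|weight-specific|")
        for name in sorted(weights):
            h.update(name.encode("utf-8"))
            h.update(hashlib.sha256(weights[name]).digest())
    else:
        # Weight-agnostic entry: the key depends on the graph only.
        h.update(b"|weight-agnostic|")
    return h.hexdigest()


if __name__ == "__main__":
    w1 = {"fc.weight": b"\x00\x01"}
    w2 = {"fc.weight": b"\x00\x02"}
    agnostic = CacheSettings(weights_identical=False)
    specific = CacheSettings(weights_identical=True)
    # Same graph, different weights: shared entry only in agnostic mode.
    print(engine_cache_key("g", w1, agnostic) == engine_cache_key("g", w2, agnostic))
    print(engine_cache_key("g", w1, specific) == engine_cache_key("g", w2, specific))
```

With a key like this, a weight-agnostic lookup ignores the weights entirely, while a weight-specific lookup only hits when both the graph and the weights match, so the two kinds of engines never collide in the same cache.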

Describe alternatives you've considered

Additional context

narendasan added the feature request New feature or request label Sep 4, 2024

narendasan assigned zewenli98 Sep 4, 2024

zewenli98 linked a pull request Sep 19, 2024 that will close this issue

feat: Support weight-stripped engine and REFIT_IDENTICAL flag #3167

Open

7 tasks
