Is your feature request related to a problem? Please describe.
When unsupported ops cause graph breaks, overhead is observed between the TRT and Torch modules.
Describe the solution you'd like
Entire subgraphs are captured and replayed by CUDA Graphs in a wrapper runtime module.
Describe alternatives you've considered
CUDA Graphs can be applied to the Torch and TRT modules individually, but that is not ideal for reducing CPU overhead.
Additional context
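To illustrate why per-module CUDA Graphs still leave CPU overhead, here is a rough toy model that only counts CPU-side launch calls per forward pass. It does not use the Torch-TensorRT or PyTorch APIs. The `Segment` class, the helper functions, and the kernel counts are all hypothetical and chosen purely for illustration. The model also ignores the cost of copying inputs into static buffers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One submodule left after graph partitioning (hypothetical model)."""
    name: str
    backend: str      # "trt" or "torch"
    num_kernels: int  # assumed number of kernels launched per forward pass


def cpu_launches_eager(segments: List[Segment]) -> int:
    # No CUDA Graphs: the CPU issues every kernel launch itself.
    return sum(s.num_kernels for s in segments)


def cpu_launches_per_module(segments: List[Segment]) -> int:
    # One CUDA graph per segment: one replay call per segment.
    return len(segments)


def cpu_launches_whole_graph(segments: List[Segment]) -> int:
    # One CUDA graph captured across all segments: a single replay call.
    return 1 if segments else 0


# Illustrative partition with one graph break (made-up kernel counts).
segments = [
    Segment("trt_engine_0", "trt", 40),
    Segment("torch_fallback_0", "torch", 3),
    Segment("trt_engine_1", "trt", 25),
]

print("eager:      ", cpu_launches_eager(segments))
print("per-module: ", cpu_launches_per_module(segments))
print("whole-graph:", cpu_launches_whole_graph(segments))
```

In this simplified model, the per-module approach still costs one replay per segment, so the CPU work grows with the number of graph breaks. A single capture in the wrapper module stays at one replay however many breaks there are.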
keehyuna