Replies: 2 comments
- Regarding https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#migrating-plugins: I think you can take the necessary changes from https://github.com/NVIDIA/TensorRT/blob/main/samples/python/python_plugin/circ_pad_plugin_multi_tactic.py.
TL;DR
We want to automatically generate the plugin and converter given the kernel and host code, so that users can include a custom kernel in a TensorRT engine through Torch-TensorRT without writing the plugin themselves. Torch-TensorRT does the rest.
Goal(s)
Allow users to use custom kernels in Torch-TensorRT engines without the effort of writing a TensorRT plugin. Improve the performance of models that currently suffer graph breaks by keeping those operators inside the engine.
Use Cases
Proposed APIs/UX
As demonstrated in the TensorRT plugin tutorial, the usual workflow is for the user to provide the kernel code and then write a plugin around it by hand. What we would do here is:
- Introduce code generation utilities in Torch-TensorRT. The plugin example from the tutorial above, also demonstrated in the TensorRT repo, can serve as a template: once the kernel code is provided, Torch-TensorRT analyzes the input tensor shapes, output tensor shapes, data types, etc., and then instantiates the template to generate a plugin for that kernel. Techniques such as PyTorch fake tensors and running inference in PyTorch can be applied in this process (see the sketch after this list).
- Introduce a tensor shape parsing system in Torch-TensorRT that extracts the information required to generate the plugin.
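As a minimal sketch of the shape-analysis step mentioned above (assuming the custom op already has a fake/meta implementation registered; the helper name and `my_lib::my_op` are placeholders), the generator could run the op under PyTorch's `FakeTensorMode` to recover output shapes and dtypes without launching the real kernel:

```python
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

def infer_output_metadata(op, example_inputs):
    """Run `op` on fake tensors to recover the output shape and dtype
    without executing the underlying kernel."""
    with FakeTensorMode() as mode:
        fake_inputs = [mode.from_tensor(t) for t in example_inputs]
        fake_out = op(*fake_inputs)
    return fake_out.shape, fake_out.dtype

# e.g. infer_output_metadata(torch.ops.my_lib.my_op.default,
#                            [torch.randn(1, 3, 32, 32)])
```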
Example Workflow
We start by assuming that the user has some custom PyTorch operator that calls a custom kernel. This operator has been registered with PyTorch and has a fake tensor implementation. To generate the plugin and the converter, the user would add an additional decorator, and this decorator would construct the plugin class, the plugin constructor class, and the converter.
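A minimal sketch of this workflow, assuming PyTorch's `torch.library.custom_op` API (PyTorch >= 2.4); the op, its kernel stub, and the decorator name `auto_generate_plugin` are illustrative placeholders, not existing Torch-TensorRT APIs:

```python
import torch
import torch.nn.functional as F

# Custom operator backed by a user kernel. The body here stands in for
# a real custom CUDA kernel launch.
@torch.library.custom_op("my_lib::circular_pad", mutates_args=())
def circular_pad(x: torch.Tensor, pad: int) -> torch.Tensor:
    return F.pad(x, (pad, pad, pad, pad), mode="circular")

# Fake-tensor implementation: tells PyTorch (and the plugin generator)
# the output shape/dtype without running the kernel.
@circular_pad.register_fake
def _(x: torch.Tensor, pad: int) -> torch.Tensor:
    n, c, h, w = x.shape
    return x.new_empty(n, c, h + 2 * pad, w + 2 * pad)

# The additional decorator proposed by this RFC (name hypothetical)
# would be stacked on the op definition and would construct the plugin
# class, the plugin constructor class, and the converter:
#
#   @torch_tensorrt.dynamo.auto_generate_plugin
#   @torch.library.custom_op("my_lib::circular_pad", mutates_args=())
#   def circular_pad(x: torch.Tensor, pad: int) -> torch.Tensor: ...
```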
Limitations
Internal Implementation
Design
We need to generate 3 objects:
- Plugin Class
- Plugin Constructor Class: mostly generic code
- Converter Generator: we know the schema of the custom op since it is a PyTorch custom op. This lets us generate the code that takes node inputs and packs them for the plugin, and that formats the node outputs. A sketch follows this list.
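A sketch of what a generated converter could look like for the placeholder op above. `trt.PluginField`, `trt.PluginFieldCollection`, and `add_plugin_v2` are TensorRT Python APIs, and `dynamo_tensorrt_converter` is Torch-TensorRT's converter-registry decorator; `get_autogenerated_creator` is a hypothetical helper standing in for the generated plugin-creator lookup:

```python
import numpy as np
import tensorrt as trt
import torch
from torch_tensorrt.dynamo.conversion import dynamo_tensorrt_converter

@dynamo_tensorrt_converter(torch.ops.my_lib.circular_pad.default)
def convert_circular_pad(ctx, target, args, kwargs, name):
    # The op schema fixes the order and types of the arguments:
    # args[0] is already a TensorRT ITensor, args[1] a Python int.
    x, pad = args

    # Pack the non-tensor argument into plugin fields.
    fields = trt.PluginFieldCollection([
        trt.PluginField("pad", np.array([pad], dtype=np.int32),
                        trt.PluginFieldType.INT32),
    ])

    # Hypothetical: look up the auto-generated plugin creator for this op.
    creator = get_autogenerated_creator("circular_pad")
    plugin = creator.create_plugin("circular_pad", fields)

    layer = ctx.net.add_plugin_v2([x], plugin)
    layer.name = name
    return layer.get_output(0)
```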
Extensions Required to Core API implementations
Data Structures
Details specific for TorchScript Support
Details specific for FX support
Implementation Phases
Prototype -
MVP
(<TARGET RELEASE VERSION>)
Extension Phase 1
(<TARGET RELEASE VERSION>)
Use auto-generated plugins to pull in single-op graph breaks like aten::embedding_bag
Extension Phase 2
(<TARGET RELEASE VERSION>)