[RMP] Provide PyTorch serving support for T4R models in Torchscript #255
@karlhigley, I saw the comment after 22.05 and wondered whether we should have this ticket defined and ready for discussion at the next grooming meeting. Can you help with the definition?
@karlhigley, do you see this being finished by 22.08?
@karlhigley, should this ticket also include the additional requirements we discussed in the backlog grooming meeting?
No, because they're not part of the scope as described at the top of this issue, and they've already been completed.
@karlhigley, the original ask from the product team is "Session-based recommender - Add PyT API to Merlin models / Systems." This ticket only covers the systems side, not models. I think this should be a [Task] that is part of an RMP ticket covering the models side as well. Let me know your thoughts on this.
@viswa-nvidia That was my original thought too, until I got talked out of it. 🤷🏻
@edknv, does the work you've done for Transformers4Rec support this functionality from the model training side?
Each side of this equation works independently, but no one has tested them together AFAIK, so we're not ready to close this yet.
@rnyak @bschifferer, can one of you please test this functionality and ensure that we're able to save models via TorchScript and serve them via Systems? I think this is relevant to a number of your customers.
I'm working on an integration test for that here, but it isn't working yet: NVIDIA-Merlin/systems#176
@marcromeyn @oliverholworthy, please provide more details about the blockers.
@marcromeyn, add tickets to the backlog for the optimizations.
Problem:
Users should be able to serve PyTorch models produced with Transformers4Rec (or any other process) using a Systems ensemble. This works toward supporting session-based models and expands Systems' support to a new modeling framework.
Goal:
Systems should be able to serve all PyTorch models that are currently supported by Triton.
Definition of Done
Have an example that serves a PyT session-based model in conjunction with an NVT workflow, where the session-based model scores the whole catalog.
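The "scores the whole catalog" step above can be sketched in plain PyTorch. This is an illustrative example, not the T4R implementation: the names (`catalog_size`, `d_model`, the random embeddings) are hypothetical, and a real session-based model would produce the session representation from the interaction sequence.

```python
import torch

# Hypothetical sizes for illustration only.
catalog_size, d_model = 1000, 64

item_embeddings = torch.randn(catalog_size, d_model)  # one row per catalog item
session_repr = torch.randn(2, d_model)                # batch of 2 session representations

# Score every catalog item for every session, then take the top-k recommendations.
scores = session_repr @ item_embeddings.T             # shape: (2, catalog_size)
top_scores, top_items = torch.topk(scores, k=10, dim=1)

print(top_items.shape)  # torch.Size([2, 10])
```

Scoring the full catalog this way is a single matrix multiply per batch, which is why it is feasible to do at serving time behind an NVT workflow.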
Open questions
Constraints:
- Not all PyTorch models can be served via Triton's `pytorch` backend, so we will need to be able to use multiple backends in order to serve all Triton-compatible PyTorch models.

Starting Point:
- Transformers4Rec
- Systems `pytorch` backend for "torchscriptable" models for optimized performance
- `PredictPytorch` operator systems#153

Integration Issues:
- `is_ragged` in LocalExecutor `_transform_data`; core#173 should fix that issue.

Nice to have: (P1)
- Documentation
- Examples
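For context on the "torchscriptable" constraint above, here is a minimal sketch of what serving via Triton's `pytorch` backend requires: the model must survive `torch.jit.script` (or tracing) and be saved as a serialized TorchScript module. The `TinyRanker` module below is a hypothetical stand-in; real T4R models may not be scriptable without changes, which is exactly why multiple backends may be needed.

```python
import torch
import torch.nn as nn

# Hypothetical toy model; stands in for a session-based ranker.
class TinyRanker(nn.Module):
    def __init__(self, d_in: int = 16, n_items: int = 100):
        super().__init__()
        self.layer = nn.Linear(d_in, n_items)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer(x)

model = TinyRanker()

# This call fails for models that are not "torchscriptable".
scripted = torch.jit.script(model)

# Triton's pytorch backend loads the serialized module from the model
# repository, conventionally at <model_repository>/<model_name>/1/model.pt.
scripted.save("model.pt")

# Sanity check: the scripted module matches eager-mode outputs.
x = torch.randn(4, 16)
assert torch.allclose(model(x), scripted(x))
```

Models that fail `torch.jit.script` would need a different path (e.g. another Triton backend), which is the multi-backend requirement listed under Constraints.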
Blockers: