
[RMP] Provide PyTorch serving support for T4R models in Torchscript #255

Closed
25 of 27 tasks
karlhigley opened this issue May 2, 2022 · 12 comments

@karlhigley
Contributor

karlhigley commented May 2, 2022

Problem:

Users should be able to serve PyTorch models produced with Transformers4Rec (or any other process) using a Systems ensemble. This works toward supporting session-based models and expands Systems' support to a new modeling framework.

Goal:

Systems should be able to serve all PyTorch models that Triton currently supports.

Definition of Done

Have an example that serves a PyT session-based model in conjunction with an NVT workflow, where the session-based model scores the whole catalog
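
For illustration, a minimal sketch of what that example might look like, assuming Systems' `TransformWorkflow` and `PredictPyTorch` operators and `Ensemble` export behave as in current merlin-systems; all paths and the output column name are placeholders:

```python
import numpy as np
import torch
from merlin.schema import ColumnSchema, Schema
from merlin.systems.dag.ensemble import Ensemble
from merlin.systems.dag.ops.pytorch import PredictPyTorch
from merlin.systems.dag.ops.workflow import TransformWorkflow
from nvtabular.workflow import Workflow

# Load the fitted NVT workflow and a TorchScript export of the T4R model
# (paths are placeholders)
workflow = Workflow.load("/workspace/nvt_workflow")
traced_model = torch.jit.load("/workspace/t4r_model.pt")

# The model consumes the workflow's transformed features; the output schema
# here is a hypothetical single column of catalog-wide item scores
model_input_schema = workflow.output_schema
model_output_schema = Schema([ColumnSchema("item_scores", dtype=np.float32)])

# Chain the feature transforms into the model's forward pass
pipeline = (
    workflow.input_schema.column_names
    >> TransformWorkflow(workflow)
    >> PredictPyTorch(traced_model, model_input_schema, model_output_schema)
)

# Export a Triton ensemble that runs both stages server-side
ensemble = Ensemble(pipeline, workflow.input_schema)
ensemble.export("/workspace/model_repository")
```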

Open questions

Constraints:

Not all PyTorch models can be served via Triton's PyTorch (libtorch) backend, so we will need to be able to use multiple backends in order to serve all Triton-compatible PyTorch models.
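
One way to route models between backends, sketched in plain PyTorch (the function name and fallback policy are illustrative, not an existing Systems API):

```python
import torch


def choose_triton_backend(model, example_inputs):
    """Probe whether a model exports cleanly to TorchScript.

    Models that trace successfully can target Triton's `pytorch`
    (libtorch) backend; anything that fails falls back to the
    `python` backend, which runs the eager model in-process.
    """
    try:
        traced = torch.jit.trace(model, example_inputs)
        return "pytorch", traced
    except Exception:
        return "python", model
```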

Starting Point:

Transformers4Rec

Systems

Integration Issues

Nice to have (P1):

Documentation

Examples

Blockers:

  • [INF] Unresolved architectural decisions
  • Support for ragged tensors in T4R
  • Start with fixed padding (pad every sequence in a batch to the same length), then investigate whether padding out to the max sequence length is worthwhile (see the sketch after this list)
  • Padding support in the dataloader that works with Systems, along with cross-framework support
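
To make the two padding options concrete, a small sketch in plain PyTorch (the session values are made up):

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence

# Three sessions with ragged item-id sequences
sessions = [torch.tensor([1, 2, 3]), torch.tensor([4, 5]), torch.tensor([6])]

# Option 1: pad to the longest sequence in the batch
padded = pad_sequence(sessions, batch_first=True, padding_value=0)
# tensor([[1, 2, 3],
#         [4, 5, 0],
#         [6, 0, 0]])

# Option 2: pad every batch out to a global max sequence length (e.g. 5)
MAX_LEN = 5
padded_max = F.pad(padded, (0, MAX_LEN - padded.size(1)), value=0)
```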
@viswa-nvidia

@karlhigley, I saw the comment after 22.05 and wondered whether we should have this ticket defined and ready for discussion at the next grooming meeting. Can you help with the definition?

@nv-alaiacano nv-alaiacano changed the title [RMP] PyTorch support (models, feature transforms, serving) [RMP] PyTorch serving support in Systems Jul 27, 2022
@nv-alaiacano nv-alaiacano changed the title [RMP] PyTorch serving support in Systems [RMP] Provide serving support for PyTorch models Jul 27, 2022
@nv-alaiacano nv-alaiacano added this to the Merlin 22.08 milestone Jul 27, 2022
@viswa-nvidia

viswa-nvidia commented Aug 4, 2022

@karlhigley, do you see this getting done by 22.08?

@viswa-nvidia

@karlhigley, should this ticket also include the additional requirements we discussed in the backlog grooming meeting?

  • Enable PyT T4R
    • Multi-GPU training
      - DP (working)
    • Serving
      - Python backend (working)
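
For reference, DP here means single-process `torch.nn.DataParallel` replication; a sketch, where `t4r_model` and `batch` stand in for a trained T4R module and a padded input batch:

```python
import torch.nn as nn

# Illustration only: replicate the model across two visible GPUs
dp_model = nn.DataParallel(t4r_model, device_ids=[0, 1]).cuda()
scores = dp_model(batch)  # inputs scattered across GPUs, outputs gathered on GPU 0
```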

@karlhigley
Contributor Author

No, because they're not part of the scope as described at the top of this issue, and they've already been completed.

@viswa-nvidia

@karlhigley, the original ask from the product team is "Session-based recommender: add a PyT API to Merlin Models / Systems." This ticket only covers the Systems side, not Models. I think this should be a [Task] that is part of an RMP ticket covering the Models side as well. Let me know your thoughts on this.

@karlhigley
Contributor Author

@viswa-nvidia That was my original thought too, until I got talked out of it. 🤷🏻

@viswa-nvidia viswa-nvidia changed the title [RMP] Provide serving support for PyTorch models [Task] Provide serving support for PyTorch models Aug 10, 2022
@EvenOldridge
Member

@edknv, does the work you've done for Transformers4Rec support this functionality from the model training side?
@karlhigley, excluding ONNX support, can we currently serve TorchScript-based models in Systems?
Looking to close this.

@karlhigley
Contributor Author

Each side of this equation works independently, but no one has tested them together AFAIK, so we're not ready to close this yet.

@EvenOldridge
Member

@rnyak @bschifferer, can one of you please test this functionality and ensure that we're able to save models via TorchScript and serve them via Systems? I think this is relevant to a number of your customers.
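
Roughly the round trip to verify, sketched with plain PyTorch (`model` and `example_batch` are placeholders for a trained T4R model and a representative batch):

```python
import torch

# 1. Trace the trained model with a representative batch
model.eval()
traced = torch.jit.trace(model, (example_batch,), strict=False)

# 2. Save the TorchScript artifact in Triton's expected layout:
#    <model_repository>/<model_name>/<version>/model.pt
torch.jit.save(traced, "model_repository/t4r_model/1/model.pt")

# 3. Reload and check that outputs match the eager model
reloaded = torch.jit.load("model_repository/t4r_model/1/model.pt")
assert torch.allclose(model(example_batch), reloaded(example_batch), atol=1e-5)
```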

@karlhigley
Contributor Author

I'm working on an integration test for that here, but it isn't working yet: NVIDIA-Merlin/systems#176

@karlhigley karlhigley changed the title [Task] Provide serving support for PyTorch models [RMP] Provide PyTorch serving support for T4R models (in Torchscript and Python back-ends) Aug 26, 2022
@karlhigley karlhigley modified the milestones: Merlin 22.09, Merlin 22.10 Aug 26, 2022
@karlhigley karlhigley changed the title [RMP] Provide PyTorch serving support for T4R models (in Torchscript and Python back-ends) [RMP] Provide PyTorch serving support for T4R models in Torchscript Oct 5, 2022
@viswa-nvidia

@marcromeyn @oliverholworthy, please provide more details for the blockers.

@viswa-nvidia

@marcromeyn, add tickets to the backlog for the optimizations.
