[RMP] Provide PyTorch serving support for T4R models in Torchscript #255
@karlhigley, I saw the comment after 22.05 and wondered whether we should have this ticket defined and ready for discussion at the next grooming meeting. Can you help with the definition?
@karlhigley, do you see this being finished by 22.08?
@karlhigley, should this ticket also include the additional requirements we discussed in the backlog grooming meeting?
No, because they're not part of the scope as described at the top of this issue, and they've already been completed.
@karlhigley, the original ask from the product team is "Session-based recommender - Add PyT API to Merlin models / Systems." This ticket only covers the systems side, not models. I think this should be a [Task] that is part of an RMP ticket covering the models side as well. Let me know your thoughts on this.
@viswa-nvidia That was my original thought too, until I got talked out of it. 🤷🏻
@edknv, does the work you've done for Transformers4Rec support this functionality from the model training side?
Each side of this equation works independently, but no one has tested them together AFAIK, so we're not ready to close this yet.
@rnyak @bschifferer, can one of you please test this functionality and ensure that we're able to save models via TorchScript and serve them via Systems? I think this is relevant to a number of your customers.
I'm working on an integration test for that here, but it isn't working yet: NVIDIA-Merlin/systems#176
@marcromeyn @oliverholworthy, please provide more details about the blockers.
@marcromeyn, add tickets to the backlog for the optimizations.
Problem:
Users should be able to serve PyTorch models produced with Transformers4Rec (or any other process) using a Systems ensemble. This works toward supporting session-based models and expands Systems' support to a new modeling framework.
Goal:
Systems should be able to serve all PyTorch models that are currently supported by Triton.
Definition of Done
Have an example that serves a PyT session-based model in conjunction with an NVT workflow, where the session-based model scores the whole catalog.
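The "scores the whole catalog" step above can be sketched in plain PyTorch. This is an illustrative example, not the T4R implementation: the names (`catalog_size`, `d_model`, the random embeddings) are hypothetical, and a real session-based model would produce the session representation from the interaction sequence.

```python
import torch

# Hypothetical sizes for illustration only.
catalog_size, d_model = 1000, 64

item_embeddings = torch.randn(catalog_size, d_model)  # one row per catalog item
session_repr = torch.randn(2, d_model)                # batch of 2 session representations

# Score every catalog item for every session, then take the top-k recommendations.
scores = session_repr @ item_embeddings.T             # shape: (2, catalog_size)
top_scores, top_items = torch.topk(scores, k=10, dim=1)

print(top_items.shape)  # torch.Size([2, 10])
```

Scoring the full catalog this way is a single matrix multiply per batch, which is why it is feasible to do at serving time behind an NVT workflow.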
Open questions
Constraints:
- Not all PyTorch models can be served via Triton's `pytorch` backend, so we will need to be able to use multiple backends in order to serve all Triton-compatible PyTorch models.

Starting Point:
- Transformers4Rec
- Systems `pytorch` backend for "torchscriptable" models for optimized performance
- `PredictPytorch` operator systems#153

Integration Issues:
- `is_ragged` in LocalExecutor `_transform_data`; core#173 should fix that issue.

Nice to have: (P1)
- Documentation
- Examples
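For context on the "torchscriptable" constraint above, here is a minimal sketch of what serving via Triton's `pytorch` backend requires: the model must survive `torch.jit.script` (or tracing) and be saved as a serialized TorchScript module. The `TinyRanker` module below is a hypothetical stand-in; real T4R models may not be scriptable without changes, which is exactly why multiple backends may be needed.

```python
import torch
import torch.nn as nn

# Hypothetical toy model; stands in for a session-based ranker.
class TinyRanker(nn.Module):
    def __init__(self, d_in: int = 16, n_items: int = 100):
        super().__init__()
        self.layer = nn.Linear(d_in, n_items)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer(x)

model = TinyRanker()

# This call fails for models that are not "torchscriptable".
scripted = torch.jit.script(model)

# Triton's pytorch backend loads the serialized module from the model
# repository, conventionally at <model_repository>/<model_name>/1/model.pt.
scripted.save("model.pt")

# Sanity check: the scripted module matches eager-mode outputs.
x = torch.randn(4, 16)
assert torch.allclose(model(x), scripted(x))
```

Models that fail `torch.jit.script` would need a different path (e.g. another Triton backend), which is the multi-backend requirement listed under Constraints.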
Blockers: