🚀 Feature request
I was wondering if it is possible to support SageMaker multi-model deployment using a Triton ensemble of Merlin models.
SageMaker already supports multiple hosting modes for model deployment with Triton Inference Server, including multi-model endpoints with the ensemble hosting mode. I tried to use that hosting mode with Triton ensembles of Merlin models, but according to the last update of the Merlin SageMaker example implementation #1040, the `--model-control-mode=explicit` flag (required for the dynamic model loading that multi-model hosting depends on) was removed.
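For reference, a minimal boto3 sketch of the multi-model hosting mode described above; the image URI, role ARN, and S3 prefix are placeholders, and the exact Triton container configuration may differ:

```python
import boto3

sm = boto3.client("sagemaker")

# Placeholder values -- substitute your own account, region, role, and bucket.
triton_image = "<account>.dkr.ecr.<region>.amazonaws.com/sagemaker-tritonserver:<tag>"
model_data_prefix = "s3://<bucket>/triton-models/"  # S3 prefix holding per-model .tar.gz archives

# Mode="MultiModel" makes SageMaker load models dynamically from the S3 prefix,
# which is why the underlying Triton server needs --model-control-mode=explicit.
sm.create_model(
    ModelName="merlin-triton-mme",
    ExecutionRoleArn="<execution-role-arn>",
    PrimaryContainer={
        "Image": triton_image,
        "ModelDataUrl": model_data_prefix,
        "Mode": "MultiModel",
    },
)
```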
I hypothesize that this incompatibility arises because the generated Merlin `executor_model` is not a proper Triton ensemble: its `config.pbtxt` file neither declares `platform: "ensemble"` nor contains the required `ensemble_scheduling: {...}` section. Instead, it is just another Triton model that executes the `0_transformworkflowtriton` and `1_predictpytorchtriton` steps internally. Therefore, the `executor_model` is not automatically recognized as an ensemble of the `0_transformworkflowtriton` and `1_predictpytorchtriton` models to be executed.
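For comparison, a hypothetical `config.pbtxt` sketch of what a proper Triton ensemble wrapping the two exported steps might look like; the tensor names, dtypes, and dims below are illustrative assumptions, not the actual Merlin export:

```protobuf
# Hypothetical ensemble definition; tensor names/dtypes/dims are placeholders.
name: "ensemble_model"
platform: "ensemble"
input [
  { name: "raw_features", data_type: TYPE_STRING, dims: [ -1 ] }
]
output [
  { name: "predictions", data_type: TYPE_FP32, dims: [ -1 ] }
]
ensemble_scheduling {
  step [
    {
      # First step: the exported workflow transform model
      model_name: "0_transformworkflowtriton"
      model_version: -1
      input_map { key: "raw_features" value: "raw_features" }
      output_map { key: "transformed_features" value: "transformed_features" }
    },
    {
      # Second step: the exported PyTorch prediction model,
      # fed by the first step's output tensor
      model_name: "1_predictpytorchtriton"
      model_version: -1
      input_map { key: "model_input" value: "transformed_features" }
      output_map { key: "model_output" value: "predictions" }
    }
  ]
}
```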
EDIT: I realized that in merlin-systems PR#255 the Triton ensemble runtime was deprecated and replaced with the current executor model. Would it be possible to support the option of exporting the recommender system artifacts as a Triton ensemble, at least for Transformers4Rec systems deployment?
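For context on the end goal: with a multi-model endpoint, each request selects a model dynamically via `TargetModel`, which is what the explicit model-control mode enables on the Triton side. A minimal sketch, with placeholder names and a placeholder payload (a real call must send a Triton-protocol inference request matching the ensemble's input signature):

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# Placeholder request body; the real body must follow the Triton inference
# protocol expected by the container (e.g. a binary tensor payload).
payload = b"<serialized inference request>"

response = runtime.invoke_endpoint(
    EndpointName="merlin-triton-mme-endpoint",  # hypothetical endpoint name
    TargetModel="t4rec_ensemble.tar.gz",        # selects one archive under the S3 prefix
    ContentType="application/octet-stream",
    Body=payload,
)
print(response["Body"].read())
```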