[FEA] Support Triton ensemble runtime for SageMaker multi-model deployment #1106

Closed
mvidela31 opened this issue Dec 25, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

mvidela31 commented Dec 25, 2024

🚀 Feature request

I was wondering whether it is possible to support SageMaker multi-model deployment using Triton ensembles of Merlin models.

SageMaker already supports multiple hosting modes for model deployment with Triton Inference Server, including multi-model endpoints with the ensemble hosting mode. I tried to use that hosting mode with the Triton ensembles of Merlin models, but the last update of the Merlin SageMaker example implementation (#1040) removed the --model-control-mode=explicit option, which multi-model hosting requires for dynamic model loading.
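For reference, a minimal sketch of how the server arguments could be assembled with explicit model control enabled. The helper name triton_explicit_args, the /opt/ml/model repository path, and preloading executor_model are assumptions for illustration only; this is not taken from the Merlin SageMaker example.

```python
def triton_explicit_args(model_repository, preload=()):
    """Build a tritonserver argv that uses explicit model control.

    Hypothetical helper: in explicit mode Triton loads only the models
    named with --load-model at startup; the rest are loaded on demand.
    """
    args = [
        "tritonserver",
        f"--model-repository={model_repository}",
        "--model-control-mode=explicit",
    ]
    # One --load-model flag per model that should be available at startup.
    args += [f"--load-model={name}" for name in preload]
    return args


if __name__ == "__main__":
    # Repository path and model name are assumed values.
    print(" ".join(triton_explicit_args("/opt/ml/model", ["executor_model"])))
```

With no preload list, the server would start without any model loaded and wait for load requests, which is the behavior dynamic multi-model hosting relies on.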

I hypothesize that the incompatibility arises because the generated Merlin executor_model is not a proper Triton ensemble: its config.pbtxt file has neither the platform: "ensemble" setting nor the required ensemble_scheduling: {...} section. It is just another Triton model that executes the 0_transformworkflowtriton and 1_predictpytorchtriton steps internally. Therefore, the executor_model is not automatically recognized as an ensemble of the 0_transformworkflowtriton and 1_predictpytorchtriton models.
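The check described above can be sketched as a small script that inspects the text of a config.pbtxt for the two ensemble markers. The function name and both sample configs are hypothetical: the executor-style sample in particular is an assumed shape, not a copy of what Merlin actually generates.

```python
import re


def looks_like_triton_ensemble(config_text: str) -> bool:
    """Return True if the config text declares platform "ensemble"
    and contains an ensemble_scheduling section."""
    has_platform = re.search(
        r'^\s*platform\s*:\s*"ensemble"', config_text, re.MULTILINE
    ) is not None
    # Protobuf text format accepts both `field {` and `field: {`.
    has_scheduling = re.search(
        r'^\s*ensemble_scheduling\s*:?\s*\{', config_text, re.MULTILINE
    ) is not None
    return has_platform and has_scheduling


# Hypothetical ensemble config, trimmed to the relevant fields.
ENSEMBLE_CONFIG = '''
name: "ensemble_model"
platform: "ensemble"
ensemble_scheduling {
  step [
    { model_name: "0_transformworkflowtriton" model_version: -1 },
    { model_name: "1_predictpytorchtriton" model_version: -1 }
  ]
}
'''

# Assumed shape of a non-ensemble model config, for contrast only.
EXECUTOR_CONFIG = '''
name: "executor_model"
backend: "python"
'''

if __name__ == "__main__":
    print(looks_like_triton_ensemble(ENSEMBLE_CONFIG))
    print(looks_like_triton_ensemble(EXECUTOR_CONFIG))
```

This is a plain text match rather than a real protobuf parse, so it is only a quick way to tell the two kinds of config apart in an exported model repository.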

EDIT: I realized that merlin-systems PR#255 deprecated the Triton ensemble runtime and replaced it with the current executor model. Is it possible to support exporting the recommender system artifacts as a Triton ensemble, at least for Transformers4Rec system deployments?

mvidela31 added the "enhancement" (New feature or request) label Dec 25, 2024
mvidela31 changed the title from "[FEA] Support SageMaker multi-model deployment" to "[FEA] Support Triton ensemble runtime for SageMaker multi-model deployment" Dec 27, 2024
mvidela31 (Author) commented

Moved this issue to the merlin-systems repo: #393.
