[QST] Support Triton ensemble runtime for SageMaker multi-model deployment #393
Comments
I verified that writing the Triton ensemble with its `config.pbtxt` manually works. For anyone facing the same problem, I share below the auxiliary functions that I used to build the Triton ensemble from the outputs of the merlin-systems export method:

```python
import os

from nvtabular.workflow import Workflow
from transformers4rec import torch as tr


def export_t4rec_triton_ensemble(
    data_workflow: Workflow,
    model: tr.Model,
    executor_model_path: str,
    output_path: str = ".",
) -> None:
    """
    Export a Triton ensemble with the `NVTabular` data processing step and the
    `Transformers4rec` model inference step.

    Parameters
    ----------
    data_workflow: nvtabular.workflow.Workflow
        Data processing workflow.
    model: transformers4rec.torch.Model
        Recommender model.
    executor_model_path: str
        Directory of the exported `Transformers4rec` executor model.
    output_path: str
        Output path to save the generated Triton ensemble files.
    """
    ensemble_cfg = get_t4rec_triton_ensemble_config(
        data_workflow=data_workflow,
        model=model,
        executor_model_config_path=os.path.join(executor_model_path, "config.pbtxt"),
    )
    # Create the ensemble model directory with an (empty) "1" version
    # subdirectory, as required by the Triton model repository layout.
    os.makedirs(os.path.join(output_path, "ensemble_model", "1"), exist_ok=True)
    with open(os.path.join(output_path, "ensemble_model", "config.pbtxt"), "w") as f:
        f.write(ensemble_cfg)


def get_t4rec_triton_ensemble_config(
    data_workflow: Workflow,
    model: tr.Model,
    executor_model_config_path: str,
) -> str:
    """
    Generate the `config.pbtxt` contents for the `Transformers4rec` Triton ensemble.

    Parameters
    ----------
    data_workflow: nvtabular.workflow.Workflow
        Data processing workflow.
    model: transformers4rec.torch.Model
        Recommender model.
    executor_model_config_path: str
        `config.pbtxt` file path of the exported `Transformers4rec` executor model.

    Returns
    -------
    str
        Triton ensemble `config.pbtxt` file contents.
    """
    cfg = 'name: "ensemble_model"\nplatform: "ensemble"\n'
    # Ensemble inputs/outputs: reuse the input/output declarations of the
    # executor model's config, dropping its first two lines (name/platform
    # header) and last four lines (footer).
    with open(executor_model_config_path, "r") as f:
        executor_cfg = f.read()
    cfg += "\n".join(executor_cfg.split("\n")[2:-4]) + "\n"
    # Ensemble scheduling
    cfg += "ensemble_scheduling {\n\tstep [\n"
    # Step 0: the 0_transformworkflowtriton (NVTabular workflow) model
    cfg += "\t\t{\n"
    cfg += '\t\t\tmodel_name: "%s"\n\t\t\tmodel_version: %s\n' % (
        "0_transformworkflowtriton",
        "-1",
    )
    for col in data_workflow.input_schema.column_names:
        cfg += '\t\t\tinput_map {\n\t\t\t\tkey: "%s"\n\t\t\t\tvalue: "%s"\n\t\t\t}\n' % (
            col,
            col,
        )
    for col in model.input_schema.column_names:
        cfg += (
            '\t\t\toutput_map {\n\t\t\t\tkey: "%s__values"\n\t\t\t\tvalue: "%s__values"\n\t\t\t}\n'
            '\t\t\toutput_map {\n\t\t\t\tkey: "%s__offsets"\n\t\t\t\tvalue: "%s__offsets"\n\t\t\t}\n'
            % (col, col, col, col)
        )
    cfg += "\t\t},\n"
    # Step 1: the 1_predictpytorchtriton (Transformers4rec PyTorch) model
    cfg += "\t\t{\n"
    cfg += '\t\t\tmodel_name: "%s"\n\t\t\tmodel_version: %s\n' % ("1_predictpytorchtriton", "-1")
    for col in model.input_schema.column_names:
        cfg += (
            '\t\t\tinput_map {\n\t\t\t\tkey: "%s__values"\n\t\t\t\tvalue: "%s__values"\n\t\t\t}\n'
            '\t\t\tinput_map {\n\t\t\t\tkey: "%s__offsets"\n\t\t\t\tvalue: "%s__offsets"\n\t\t\t}\n'
            % (col, col, col, col)
        )
    for col in model.output_schema.column_names:
        cfg += '\t\t\toutput_map {\n\t\t\t\tkey: "%s"\n\t\t\t\tvalue: "%s"\n\t\t\t}\n' % (
            col,
            col,
        )
    cfg += "\t\t}\n"
    cfg += "\t]\n}"
    # Triton's protobuf text parser is whitespace-insensitive; swap tabs for
    # spaces purely for readability of the generated file.
    cfg = cfg.replace("\t", " ")
    return cfg
```

Nevertheless, it would be awesome to support exporting the Triton ensemble artifacts via the merlin-systems library.
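For completeness, a minimal usage sketch of the helpers above (the artifact paths are hypothetical placeholders, and `model` is assumed to be the trained `transformers4rec.torch.Model` that was used when exporting the executor model):

```python
# Usage sketch: paths below are placeholders for your own exported artifacts.
from nvtabular.workflow import Workflow

workflow = Workflow.load("model_artifacts/workflow_etl")  # exported NVTabular workflow

# `model` is assumed to already be in scope (the trained Transformers4rec model).
export_t4rec_triton_ensemble(
    data_workflow=workflow,
    model=model,
    executor_model_path="model_artifacts/executor_model",
    output_path="model_artifacts",
)
# The resulting model_artifacts/ensemble_model/ directory can then be placed in
# the Triton model repository next to 0_transformworkflowtriton and
# 1_predictpytorchtriton.
```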
❓ Questions & Help
I was wondering if it is possible to support SageMaker multi-model deployment using the Triton ensemble of Merlin models.

SageMaker already supports multiple hosting modes for Model deployment with Triton Inference Server, including the Multi-model endpoints with ensemble hosting mode. I tried to use that hosting mode with the Triton ensembles of Merlin models, but according to the last update of the Merlin SageMaker example implementation (#1040), the `--model-control-mode=explicit` control mode (required by multi-model hosting for dynamic model loading) was removed.
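For reference, dynamic model loading relies on starting Triton with explicit model control, roughly along these lines (the repository path and model name here are illustrative):

```sh
tritonserver --model-repository=/opt/ml/model \
             --model-control-mode=explicit \
             --load-model=ensemble_model
```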
I hypothesize that the cause of this incompatibility is that the generated Merlin `executor_model` is not a proper Triton ensemble (its `config.pbtxt` file has neither the correct platform, `platform: "ensemble"`, nor the required `ensemble_scheduling: {...}` section), but just another Triton model that executes the `0_transformworkflowtriton` and `1_predictpytorchtriton` steps internally. Therefore, the `executor_model` is not automatically recognized as the ensemble of the `0_transformworkflowtriton` and `1_predictpytorchtriton` models to be executed.

EDIT: I realized that in #255 the Triton ensemble runtime was deprecated and replaced with the current executor model. Is it possible to support the option of exporting the recommender system artifacts as a Triton ensemble, at least for Transformers4rec systems deployment?
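To make the hypothesis concrete, this is roughly the shape a proper ensemble `config.pbtxt` would have (abridged sketch; the tensor names and dtypes are illustrative placeholders, and the step definitions mirror what the helper functions in the earlier comment generate):

```
name: "ensemble_model"
platform: "ensemble"
input [
  # Placeholder tensor; real deployments list the workflow's input columns.
  { name: "item_id-list", data_type: TYPE_INT64, dims: [ -1 ] }
]
output [
  { name: "next-item", data_type: TYPE_FP32, dims: [ -1 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "0_transformworkflowtriton"
      model_version: -1
      input_map { key: "item_id-list" value: "item_id-list" }
      output_map { key: "item_id-list__values" value: "item_id-list__values" }
      output_map { key: "item_id-list__offsets" value: "item_id-list__offsets" }
    },
    {
      model_name: "1_predictpytorchtriton"
      model_version: -1
      input_map { key: "item_id-list__values" value: "item_id-list__values" }
      input_map { key: "item_id-list__offsets" value: "item_id-list__offsets" }
      output_map { key: "next-item" value: "next-item" }
    }
  ]
}
```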