nmt_transformer

Transformer (NMT) models for English-French and English-German translation.

The Transformer, introduced in the paper Attention Is All You Need, is a powerful sequence-to-sequence modeling architecture capable of producing state-of-the-art neural machine translation (NMT) systems.

Recently, the fairseq team has explored large-scale semi-supervised training of Transformers using back-translated data, further improving translation quality over the original model. More details can be found in this blog post.

In this example, we show how to serve English-to-French and English-to-German translation models using TorchServe. A single generalized custom handler serves both language pairs with the same code; it uses the pre-trained Transformer_WMT14_En-Fr and Transformer_WMT19_En-De models from fairseq. A simplified sketch of such a handler appears after the note below.

NOTE: This example currently works with Python 3.6 only, due to a fairseq dependency issue with the dataclasses package. It also does not currently work on Windows.
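For reference, the shape of such a generalized handler looks roughly like the following sketch. It is hypothetical and simplified (for instance, the repository's actual handler loads the serialized model packaged inside the .mar file, whereas this sketch pulls the pre-trained models through torch.hub):

    # Hypothetical, simplified sketch of a generalized translation handler.
    # It selects the fairseq model based on the name the model was
    # registered under and implements the standard TorchServe hooks.
    import torch
    from ts.torch_handler.base_handler import BaseHandler

    # Assumed mapping from TorchServe model names to fairseq torch.hub IDs.
    HUB_IDS = {
        "TransformerEn2Fr": ("transformer.wmt14.en-fr", "french_output"),
        "TransformerEn2De": ("transformer.wmt19.en-de", "german_output"),
    }

    class TransformerTranslationHandler(BaseHandler):
        def initialize(self, context):
            hub_id, self.output_key = HUB_IDS[context.model_name]
            # Tokenizer/BPE settings differ per model; moses + subword_nmt
            # is what the fairseq docs show for the WMT14 En-Fr model (the
            # WMT19 En-De hub model may need fastbpe/checkpoint_file args).
            self.model = torch.hub.load(
                "pytorch/fairseq", hub_id, tokenizer="moses", bpe="subword_nmt"
            )
            self.model.eval()
            self.initialized = True

        def preprocess(self, data):
            # One entry per request; `curl -T file` puts the file in the body.
            texts = []
            for row in data:
                body = row.get("data") or row.get("body")
                if isinstance(body, (bytes, bytearray)):
                    body = body.decode("utf-8")
                texts.append(body)
            return texts

        def inference(self, texts):
            # fairseq hub models expose translate() for plain strings.
            return [(t, self.model.translate(t)) for t in texts]

        def postprocess(self, results):
            # One response dict per request, mirroring the outputs shown below.
            return [{"input": t, self.output_key: out} for t, out in results]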

Objectives

  1. Demonstrate how to package pre-trained Transformer (NMT) models for English-French and English-German translation, together with the generalized custom handler, into torch model archive (.mar) files.
  2. Demonstrate how to load a model archive (.mar) file into TorchServe and run inference.

Serve the Transformer (NMT) models for English-French/English-German on TorchServe

  • To generate the model archive (.mar) file for the English-to-French translation model, run the following command:

    ./create_mar.sh en2fr_model

    The above command creates a "model_store" directory in the current working directory and generates the TransformerEn2Fr.mar file. (A sketch of the torch-model-archiver invocation that a script like create_mar.sh wraps appears after this list.)

  • To generate the model archive (.mar) file for the English-to-German translation model, run the following command:

    ./create_mar.sh en2de_model

    The above command creates a "model_store" directory in the current working directory and generates the TransformerEn2De.mar file.

  • Start TorchServe using the model archive (.mar) files created in the steps above:

    torchserve --start --model-store model_store --ts-config config.properties --disable-token-auth  --enable-model-api
  • Use the Management API to register the model with one initial worker. For the English-to-French model:

    curl -X POST "http://localhost:8081/models?initial_workers=1&synchronous=true&url=TransformerEn2Fr.mar"
    {
        "status": "Model \"TransformerEn2Fr\" Version: 1.0 registered with 1 initial workers"
    }

    For the English-to-German model:

     curl -X POST "http://localhost:8081/models?initial_workers=1&synchronous=true&url=TransformerEn2De.mar"
     {
         "status": "Model \"TransformerEn2De\" Version: 1.0 registered with 1 initial workers"
     }
  • To run inference, use the following curl command. For the English-to-French model (a programmatic Python equivalent appears after this list):

    curl http://127.0.0.1:8080/predictions/TransformerEn2Fr -T model_input/sample.txt | json_pp
    {
        "input" : "Hi James, when are you coming back home? I am waiting for you.\nPlease come as soon as possible.",
        "french_output" : "Bonjour James, quand rentrerez-vous chez vous, je vous attends et je vous prie de venir le plus tôt possible."
    }

    For the English-to-German model:

     curl http://127.0.0.1:8080/predictions/TransformerEn2De -T model_input/sample.txt | json_pp
     {
         "input" : "Hi James, when are you coming back home? I am waiting for you.\nPlease come as soon as possible.",
         "german_output" : "Hallo James, wann kommst du nach Hause? Ich warte auf dich. Bitte komm so bald wie möglich."
     }

    Here sample.txt contains simple English sentences that are given as input to the Inference API. The output of the above curl commands is the French (or German) translation of the sentences in the sample.txt file.
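As referenced above, create_mar.sh typically wraps a call to torch-model-archiver. The sketch below drives that call from Python; the serialized-file and handler names here are assumptions, so check the script itself for the exact flags:

    # Sketch: packaging the En-Fr model the way create_mar.sh presumably does.
    # File names are hypothetical; see create_mar.sh for the real invocation.
    import subprocess

    subprocess.run(
        [
            "torch-model-archiver",
            "--model-name", "TransformerEn2Fr",
            "--version", "1.0",
            "--serialized-file", "model.pt",              # assumed artifact name
            "--handler", "model_handler_generalized.py",  # assumed handler file
            "--export-path", "model_store",
        ],
        check=True,
    )

The registration and inference calls can likewise be issued programmatically. A minimal sketch using the requests package, equivalent to the curl commands above:

    # Sketch: register the En-Fr model and run one inference request.
    import requests

    # Management API (port 8081): register with one synchronous initial worker.
    resp = requests.post(
        "http://localhost:8081/models",
        params={"url": "TransformerEn2Fr.mar",
                "initial_workers": 1, "synchronous": "true"},
    )
    print(resp.json())

    # Inference API (port 8080): send the sample text file as the request body.
    with open("model_input/sample.txt", "rb") as f:
        out = requests.post(
            "http://127.0.0.1:8080/predictions/TransformerEn2Fr", data=f
        )
    print(out.json())  # {"input": "...", "french_output": "..."}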

Batch Inference with TorchServe using the Translation (NMT) models

TorchServe Model Configuration

To configure TorchServe to use the batching feature, provide the batch configuration information through the "POST /models" Management API.

The configuration settings we are interested in are the following:

  1. batch_size: the maximum batch size that a model is expected to handle.
  2. max_batch_delay: the maximum time, in milliseconds, that TorchServe waits to receive batch_size requests. If the timer expires before batch_size requests have arrived, TorchServe sends whatever requests it has received to the model handler.
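Concretely, when batching is enabled, TorchServe hands the handler a list of up to batch_size requests collected within max_batch_delay, and the handler must return exactly one response per request. A hedged sketch of the batch path in a handler like the one sketched earlier:

    # Hypothetical, simplified batch-aware inference hook. `texts` holds up
    # to batch_size inputs gathered within the max_batch_delay window.
    def inference(self, texts):
        # fairseq's hub interface also accepts a list of sentences, so the
        # whole batch could alternatively be translated in a single call.
        return [(t, self.model.translate(t)) for t in texts]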

Steps to configure English-to-French translation model with batch-support

  • Start the model server. In this example, we start the model server with the config.properties file:

    torchserve --start --model-store model_store --ts-config config.properties --disable-token-auth  --enable-model-api
  • Now let's register the English-to-French translation model, which we have built to handle batch inference. In this example, we launch one worker that handles a batch size of 4 with a max_batch_delay of 10 seconds (10000 ms).

    curl -X POST "http://localhost:8081/models?url=TransformerEn2Fr.mar&initial_workers=1&synchronous=true&batch_size=4&max_batch_delay=10000"
  • Run the following commands to test batch inference. The trailing & runs each request in the background so that the requests arrive together and can be batched (a Python equivalent appears after the sample output):

    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2Fr -T ./model_input/sample1.txt&
    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2Fr -T ./model_input/sample2.txt&
    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2Fr -T ./model_input/sample3.txt&
    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2Fr -T ./model_input/sample4.txt&
    {
        "input" : "Hello World !!!\n",
        "french_output" : "Bonjour le monde ! ! !"
    }
    {
        "input" : "Hi James, when are you coming back home? I am waiting for you.\nPlease come as soon as possible.\n",
        "french_output" : "Bonjour James, quand rentrerez-vous chez vous, je vous attends et je vous prie de venir le plus tôt possible."
    }
    {
        "input" : "I’m sorry, I don’t remember your name. You are you?\n",
        "french_output" : "Je vous prie de m'excuser, je ne me souviens pas de votre nom."
    }
    {
        "input" : "I’m well. How are you?\nIt’s going well, thank you. How are you doing?\nFine, thanks. And yourself?\n",
        "french_output" : "Je me sens bien. Comment allez-vous ? Ça va bien, merci. Comment allez-vous ?"
    }
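    As referenced above, the same test can be driven from Python. A small sketch that fires the four requests concurrently so TorchServe can batch them (file names as in the example):

    # Sketch: send four requests in parallel so TorchServe can batch them.
    from concurrent.futures import ThreadPoolExecutor
    import requests

    URL = "http://127.0.0.1:8080/predictions/TransformerEn2Fr"

    def predict(path):
        with open(path, "rb") as f:
            return requests.post(URL, data=f).json()

    files = [f"./model_input/sample{i}.txt" for i in range(1, 5)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        for result in pool.map(predict, files):
            print(result)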

Steps to configure English-to-German translation model with batch-support

  • Start the model server. In this example, we start the model server with the config.properties file:

    torchserve --start --model-store model_store --ts-config config.properties --disable-token-auth  --enable-model-api
  • Now let's register the English-to-German translation model, which we have built to handle batch inference. In this example, we launch one worker that handles a batch size of 4 with a max_batch_delay of 10 seconds (10000 ms).

    curl -X POST "http://localhost:8081/models?url=TransformerEn2De.mar&initial_workers=1&synchronous=true&batch_size=4&max_batch_delay=10000"
  • Run the following commands to test batch inference:

    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2De -T ./model_input/sample1.txt&
    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2De -T ./model_input/sample2.txt&
    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2De -T ./model_input/sample3.txt&
    curl -X POST http://127.0.0.1:8080/predictions/TransformerEn2De -T ./model_input/sample4.txt&
    {
        "input" : "Hello World !!!\n",
        "german_output" : "Hallo Welt!!!"
    }
    {
        "input" : "Hi James, when are you coming back home? I am waiting for you.\nPlease come as soon as possible.\n",
        "german_output" : "Hallo James, wann kommst du nach Hause? Ich warte auf dich. Bitte komm so bald wie möglich."
    }
    {
        "input" : "I’m sorry, I don’t remember your name. You are you?\n",
        "german_output" : "Es tut mir leid, ich erinnere mich nicht an Ihren Namen. Sie sind es?"
    }
    {
        "input" : "I’m well. How are you?\nIt’s going well, thank you. How are you doing?\nFine, thanks. And yourself?\n",
        "german_output" : "Mir geht es gut. Wie geht es Ihnen? Es läuft gut, danke. Wie geht es Ihnen? Gut, danke. Und sich selbst?"
    }