As Model Catalogue is shipped as a Docker image, it can easily be run using Amazon Elastic Container Service (ECS).
You need an Amazon S3 bucket to store assets in.
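A minimal sketch of creating the bucket with the AWS CLI, assuming the CLI is already configured; the bucket name and region below are placeholders:

```bash
# Create the assets bucket (name and region are placeholders, use your own)
aws s3 mb s3://mc.assets.example.com --region eu-west-1
```

The bucket name, access key and secret are later passed to the container via the MC_S3_BUCKET, MC_S3_KEY and MC_S3_SECRET environment variables.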
You need a db.t2.small MySQL instance running. Its security group
must allow inbound connections from the registered ECS container instance (see below).
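If you manage security groups with the AWS CLI, the required rule can be added roughly like this; both group IDs below are placeholders for the RDS and ECS container instance security groups:

```bash
# Allow MySQL (port 3306) connections from the ECS container instance's
# security group to the RDS instance's security group (IDs are placeholders)
aws ec2 authorize-security-group-ingress \
  --group-id sg-rds-placeholder \
  --protocol tcp \
  --port 3306 \
  --source-group sg-ecs-placeholder
```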
You need a cluster available with enough registered
container instances. There is a cluster called default always available,
but it is recommended to create a new cluster for each server.
You need to start a new Amazon ECS container instance for it if one is not
set up yet. Follow the guide here: Amazon ECS Container Instances.
The instance should be m3.xlarge
to allow running Model Catalogue with
enough memory. Don't forget to assign the cluster name to the newly created instance as described in step 10 of that guide.
If you've set up the instance properly, it should appear
in the Clusters / ECS Instances table.
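If you prefer the AWS CLI to the console for this step, here is a sketch of creating a dedicated cluster and checking that the instance has registered (the cluster name is just an example):

```bash
# Create a dedicated cluster for this server (name is an example)
aws ecs create-cluster --cluster-name mc-cluster

# After launching the container instance with ECS_CLUSTER=mc-cluster set in
# /etc/ecs/ecs.config (step 10 of the guide), check that it has registered
aws ecs list-container-instances --cluster mc-cluster
```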
Task definitions are quite similar to Dockerfile
or docker-compose.yml
files.
They specify the containers to be run together with additional settings such as
environment variables. You need a task definition with two containers:
- Model Catalogue container based on metadata/registry
- Elasticsearch container based on metadata/registry-elasticsearch
You can easily create a new Task Definition using the following JSON definition if you go to Task Definitions / Create new Task Definition / Configure via JSON.
{
"containerDefinitions": [
{
"volumesFrom": [],
"memory": 10240,
"extraHosts": null,
"dnsServers": null,
"disableNetworking": null,
"dnsSearchDomains": null,
"portMappings": [
{
"hostPort": 80,
"containerPort": 8080,
"protocol": "tcp"
}
],
"hostname": null,
"essential": true,
"entryPoint": null,
"mountPoints": [],
"name": "mc",
"ulimits": null,
"dockerSecurityOptions": null,
"environment": [
{
"name": "MC_ALLOW_SIGNUP",
"value": "true"
},
{
"name": "MC_MAIL_USERNAME",
"value": "[email protected]"
},
{
"name": "MC_MAIL_PASSWORD",
"value": "c00lpwd"
},
{
"name": "MC_MAIL_PORT",
"value": "587"
},
{
"name": "MC_MAIL_HOST",
"value": "mx.example.com"
},
{
"name": "MC_NAME",
"value": "Model Catalogue"
},
{
"name": "CATALINA_OPTS",
"value": "-Djava.awt.headless=true -Dfile.encoding=UTF-8 -server -Xms4g -Xmx10g -XX:NewSize=2048m -XX:MaxNewSize=2048m -XX:PermSize=2048m -XX:MaxPermSize=2048m -XX:+DisableExplicitGC"
},
{
"name": "METADATA_JDBC_URL",
"value": "jdbc:mysql://your-rds-database.rds.amazonaws.com/your-database?autoReconnect=true&useUnicode=yes&characterEncoding=UTF-8"
},
{
"name": "METADATA_USERNAME",
"value": "username"
},
{
"name": "METADATA_PASSWORD",
"value": "strongpwd"
},
{
"name": "MC_S3_SECRET",
"value": "S3CR3T"
},
{
"name": "MC_S3_BUCKET",
"value": "mc.assets.example.com"
},
{
"name": "MC_S3_KEY",
"value": "THEKEY"
},
{
"name": "MC_ES_ELEMENTS_PER_BATCH",
"value": "20"
},
{
"name": "MC_ES_DELAY_AFTER_BATCH",
"value": "200"
},
{
"name": "METADATA_HOST",
"value": "my-catalogue.example.com"
}
],
"links": [
"mc-es:mc-es"
],
"workingDirectory": null,
"readonlyRootFilesystem": null,
"image": "metadata/registry:2",
"command": null,
"user": null,
"dockerLabels": null,
"logConfiguration": {
"logDriver": "syslog",
"options": null
},
"cpu": 2048,
"privileged": null
},
{
"volumesFrom": [],
"memory": 4096,
"extraHosts": null,
"dnsServers": null,
"disableNetworking": null,
"dnsSearchDomains": null,
"portMappings": [],
"hostname": null,
"essential": true,
"entryPoint": null,
"mountPoints": [
{
"containerPath": "/usr/share/elasticsearch/data",
"sourceVolume": "mc-es-data",
"readOnly": null
}
],
"name": "mc-es",
"ulimits": null,
"dockerSecurityOptions": null,
"environment": [],
"links": null,
"workingDirectory": null,
"readonlyRootFilesystem": null,
"image": "metadata/registry-elasticsearch:2",
"command": [],
"user": null,
"dockerLabels": null,
"logConfiguration": {
"logDriver": "json-file",
"options": null
},
"cpu": 1024,
"privileged": null
}
],
"volumes": [
{
"host": {
"sourcePath": "/opt/docker-volumes/mc-es-data"
},
"name": "mc-es-data"
}
],
"family": "your-mc-all-task-definition"
}
You need to update the Task Definition Name, after saving the
JSON configuration, to something more meaningful than your-mc-all-task-definition
and click Create.
You need to update the environment variables to reflect your own settings such as the mail server, database host and credentials, or catalogue URL. See the environment variables description.
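As an alternative to pasting the JSON into the console, the task definition can be registered from a file with the AWS CLI; a sketch, assuming the JSON above is saved (with your own values) as mc-task-definition.json:

```bash
# Register the task definition (or a new revision of it) from the JSON file
aws ecs register-task-definition --cli-input-json file://mc-task-definition.json
```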
The Elasticsearch indices are considered ephemeral, yet they are persisted
in a separate volume /opt/docker-volumes/mc-es-data
between container
restarts. If the indices are lost, they can always be recreated from the data
stored in the database using the
Reindex Catalogue
admin action of the Model Catalogue.
When you have your Task Definition ready, you can create a new service
based on it. Call it, for example, mc
and set the minimum number of tasks
to one.
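The same service can be created with the AWS CLI; a sketch, assuming the cluster and task definition names used in the examples above:

```bash
# Create the service with one running task
aws ecs create-service \
  --cluster mc-cluster \
  --service-name mc \
  --task-definition your-mc-all-task-definition \
  --desired-count 1
```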
There are several ways to keep your Model Catalogue up to date.
One of them is to always point to a specific version, e.g.
metadata/registry:2.0.0-beta-10,
instead of just metadata/registry:2
in your task definition. If you want to update to a newer version,
create a new Task Definition Revision, change the version tag, and restart
the service with the new revision as described below.
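A sketch of that flow with the AWS CLI, reusing the placeholder names from above:

```bash
# 1. Change the image tag in mc-task-definition.json to the desired version,
#    then register the change as a new revision of the same family
aws ecs register-task-definition --cli-input-json file://mc-task-definition.json

# 2. Scale the service down, wait for the old task to stop, then scale it
#    back up pointing at the latest revision (same idea as the restart steps below)
aws ecs update-service --cluster mc-cluster --service mc --desired-count 0
aws ecs update-service --cluster mc-cluster --service mc \
  --task-definition your-mc-all-task-definition --desired-count 1
```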
If you want to keep using the latest metadata/registry:2,
then you have to
SSH into the container instance and pull the latest version manually:
ssh -i ~/.ssh/ec2containers.pem [email protected]
docker pull metadata/registry:2
Restart the service using the following steps:
- Update the Model Catalogue service (mc in the previous example) and set the number of tasks required to 0
- Wait until the task is stopped
- Update the Model Catalogue service (mc in the previous example) and set the number of tasks required to 1
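With the AWS CLI the same restart looks roughly like this, assuming the mc service and the cluster name from the examples above:

```bash
# Stop the running task by scaling the service down...
aws ecs update-service --cluster mc-cluster --service mc --desired-count 0

# ...check until no tasks are left running for the service...
aws ecs describe-services --cluster mc-cluster --services mc

# ...then scale back up so a fresh task starts with the newly pulled image
aws ecs update-service --cluster mc-cluster --service mc --desired-count 1
```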