Merging our systemd unit files. #1

Open
wants to merge 17 commits into
base: systemd
138 changes: 138 additions & 0 deletions integration/systemd/amazon-ecs-agent.service
@@ -0,0 +1,138 @@
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the
# "License"). You may not use this file except in compliance
# with the License. A copy of the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and
# limitations under the License.

# A systemd unit file to run `ecs-agent` in a Docker container. This service
# attempts to duplicate the functionality of the `ecs-init` Golang package. The
# notable differences currently are:
#
# 1. The unit file does an unconditional pull of the latest ECS Agent from

Are we really doing this instead of using a potentially cached agent?

# Docker Hub at startup.
# 2. The unit file does not currently handle the Agent's self-upgrade
# functionality.

[Unit]
Description=Amazon ECS Agent
Documentation=http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_agent.html
Requires=docker.service
After=docker.service

[Install]
WantedBy=multi-user.target

[Service]
# The logic around the ECS Agent restart:
# - Restart on failure
# - Wait 5 seconds between restarts (see RestartSec below)
# - If the ECS Agent restarts more than 10 times in 5 minutes,
# it will stay failed
Type=simple
Restart=on-failure
RestartSec=5s
RestartPreventExitStatus=5
StartLimitInterval=5min

Do we have data to back up this restart policy?

Author

I do not. The default RestartSec time for systemd is 100ms, which I feel is way too fast and will clobber docker trying to restart the agent over and over.
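For a rough sense of scale, the arithmetic behind the two settings can be sketched as below. This assumes the worst case, where the agent exits immediately on every start, and ignores the time systemd itself spends on each attempt; the variable names are illustrative only.

```shell
#!/bin/sh
# Values taken from the unit file under review.
restart_sec=5            # RestartSec=5s
start_limit_burst=10     # StartLimitBurst=10
start_limit_interval=300 # StartLimitInterval=5min, in seconds

# Assumption: the agent fails instantly, so the only delay is RestartSec.
seconds_to_limit=$(( restart_sec * start_limit_burst ))
echo "RestartSec=5s: start limit reached after about ${seconds_to_limit}s"

# Same crash loop with the systemd default of 100ms between restarts.
default_ms_to_limit=$(( 100 * start_limit_burst ))
echo "RestartSec=100ms: start limit reached after about ${default_ms_to_limit}ms"
```

Under that assumption both figures fall well inside the 5 minute window, so either value ends with the unit in the failed state; the longer delay only spaces out the attempts against Docker.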

StartLimitBurst=10

# In production, it is recommended to pin the ECS Agent to a known
# version rather than running the "latest" tag of the Docker image.
# See https://github.com/aws/amazon-ecs-agent/releases for agent versions.

# It is recommended to use a systemd drop-in file vs editing this line or file.
# Your drop-in file could look like this:
#
# [Service]
# Environment=ECS_AGENT_VERSION=v1.17.2
#
# Depending on the OS, it could be placed in
# /etc/systemd/system/amazon-ecs-agent.service.d , and named
# ecs_agent_version_override.conf
Environment=ECS_AGENT_VERSION=v1.17.2

Does this mean we need to update this with each Agent release? If so, why not just pin this to "latest" and allow customers to pin to a version with the override described above?
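A minimal sketch of how a customer might create the override mentioned here. systemd reads drop-ins from a directory named after the full unit name plus `.d`, so the directory used is `amazon-ecs-agent.service.d`. The function name `write_version_override` and the scratch directory are hypothetical; on a real host the first argument would typically be `/etc/systemd/system`.

```shell
#!/bin/sh
# Write a drop-in that overrides ECS_AGENT_VERSION for amazon-ecs-agent.service.
#   $1: systemd unit directory (hypothetical; normally /etc/systemd/system)
#   $2: agent version tag to pin, e.g. v1.17.2
write_version_override() {
    unit_dir=$1
    version=$2
    dropin_dir="$unit_dir/amazon-ecs-agent.service.d"
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nEnvironment=ECS_AGENT_VERSION=%s\n' "$version" \
        > "$dropin_dir/ecs_agent_version_override.conf"
}

# Demonstrate against a scratch directory rather than the live /etc tree.
scratch=$(mktemp -d)
write_version_override "$scratch" v1.17.2
cat "$scratch/amazon-ecs-agent.service.d/ecs_agent_version_override.conf"
```

After writing the file under the real unit directory, `systemctl daemon-reload` followed by a restart of the service is needed for the new value to take effect.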


ExecStartPre=-/usr/bin/echo "Amazon ECS Agent systemd unit file version 1.0.0"

# Load an updated ECS Agent, if it exists:
ExecStartPre=-/bin/sh -c 'test -f /var/cache/ecs/desired-image && \
docker load --quiet --input=/var/cache/ecs/desired-image \
&& rm -f $(cat /var/cache/ecs/desired-image) /var/cache/ecs/desired-image'
Owner

$(cat /var/cache/ecs/desired-image) does not belong here


# If we don't have an ECS Agent, load from disk, if possible, or Docker Hub:
ExecStartPre=/bin/sh -c "`docker inspect amazon/amazon-ecs-agent &>/dev/null` \
Owner

The &> redirection syntax is a bash extension. We don't want to assume that /bin/sh is bash, as this isn't the case on all systems. Please stick with POSIX-friendly redirections, so > /dev/null 2>&1

(Note that this issue is present in multiple places.)
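As an illustration of the POSIX form, here is a small sketch that silences a command while preserving its exit status. `quiet` and `noisy_failure` are hypothetical helpers, with `noisy_failure` standing in for a `docker inspect` call that fails. In a strictly POSIX shell such as dash, `cmd &>/dev/null` is parsed as `cmd &` followed by a separate redirection, so the command runs in the background and its output is not suppressed.

```shell
#!/bin/sh
# Run a command, discarding stdout and stderr using POSIX redirections only.
quiet() {
    "$@" > /dev/null 2>&1
}

# Hypothetical stand-in for a failing `docker inspect`: writes to both
# streams and returns a non-zero status.
noisy_failure() {
    echo "to stdout"
    echo "to stderr" >&2
    return 3
}

quiet noisy_failure
echo "exit status: $?"
```

Because the redirection is attached to the command itself, the status seen by `||` chains in the unit file is still that of the command.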

|| docker load --quiet --input=/var/cache/ecs/ecs-agent.tar \
|| docker pull amazon/amazon-ecs-agent:${ECS_AGENT_VERSION}"

Can we change this to "pull" from our S3 bucket instead of Docker Hub?


# Add the iptables rules needed to enable IAM Roles for Tasks.
ExecStartPre=/sbin/iptables -t nat -A PREROUTING -d 169.254.170.2/32 \
-p tcp -m tcp --dport 80 -j DNAT --to-destination 127.0.0.1:51679
ExecStartPre=/sbin/iptables -t nat -A OUTPUT -d 169.254.170.2/32 \
-p tcp -m tcp --dport 80 -j REDIRECT --to-ports 51679

# Allow the port proxy to route traffic using loopback addresses
ExecStartPre=/sbin/sysctl --quiet --write net.ipv4.conf.all.route_localnet=1

# Stop the ECS Agent if it was running. Docker stop is used
# as it will send a SIGTERM and wait 10 seconds before sending SIGKILL
ExecStartPre=-/bin/sh -c "`docker stop --time 10 ecs-agent &>/dev/null`"

# Remove the container named "ecs-agent"
ExecStartPre=-/bin/sh -c "`docker rm ecs-agent &>/dev/null`"

# Create the directories needed for the ECS Agent.
ExecStartPre=-/bin/mkdir -p /var/lib/ecs/dhclient /var/ecs-data

This should probably fail to start if it returns non-zero. mkdir -p will return zero even if the directories already exist.
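The behaviour described can be checked with a short sketch. It runs in a scratch directory from `mktemp`, and the paths beneath it are illustrative only.

```shell
#!/bin/sh
scratch=$(mktemp -d)

# Creating the directory, then asking for it again, succeeds both times.
mkdir -p "$scratch/var/lib/ecs/dhclient"
echo "first call: $?"
mkdir -p "$scratch/var/lib/ecs/dhclient"
echo "second call: $?"

# A regular file in the way is a genuine failure and returns non-zero.
touch "$scratch/blocker"
if mkdir -p "$scratch/blocker/data" 2>/dev/null; then
    echo "blocked call: unexpectedly succeeded"
else
    echo "blocked call: failed as expected"
fi
```

Since an existing directory is not an error, dropping the leading `-` from the `ExecStartPre` line would only stop the unit on real failures such as a permissions problem or a file occupying the path.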


# The ECS Agent is started via docker run
ExecStart=/usr/bin/docker run --name ecs-agent \
--init \
--detach=false \
--net=host \
--pid=host \
--cap-add=SYS_ADMIN \
--cap-add=NET_ADMIN \
--volume=/var/run:/var/run \
--volume=/var/log/ecs/:/log \
--volume=/var/lib/ecs/data:/data \
--volume=/etc/ecs:/etc/ecs \
--volume=/sbin:/sbin \
--volume=/lib:/lib \
--volume=/lib64:/lib64 \
--volume=/usr/lib:/usr/lib \
--volume=/proc:/host/proc \
--volume=/sys/fs/cgroup:/sys/fs/cgroup \
--volume=/var/lib/ecs/dhclient:/var/lib/dhclient \
--volume=/run/docker/execdriver/native:/var/lib/docker/execdriver/native:ro \
--publish=127.0.0.1:51678:51678 \
--env ECS_UPDATES_ENABLED=false \
--env ECS_DATADIR=/data \
--env ECS_ENABLE_TASK_IAM_ROLE=true \

It looks like (at least with Docker 1.12.6) values set with --env override anything set in the --env-file, regardless of the order of the two flags. We should be cautious about setting anything that we expect customers to potentially want to configure.

➜  ~ echo FOO=foo > foo.env
➜  ~ docker run --rm -it --env-file foo.env amazonlinux bash
bash-4.2# echo $FOO
foo
bash-4.2# exit
➜  ~ docker run --rm -it --env-file foo.env --env FOO=bar amazonlinux bash
bash-4.2# echo $FOO
bar
bash-4.2# exit
➜  ~ docker run --rm -it --env FOO=bar --env-file foo.env amazonlinux bash
bash-4.2# echo $FOO
bar
bash-4.2# exit

--env ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST=true \
--env ECS_ENABLE_TASK_ENI=true \
--env ECS_LOGFILE=/log/ecs-agent.log \
--env ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","syslog","awslogs","none"] \
--env ECS_CGROUP_PREFIX=ecs \
--log-driver=journald \
--env-file=/etc/ecs/ecs.config \
amazon/amazon-ecs-agent:${ECS_AGENT_VERSION}

# Docker stop is used as it will send a SIGTERM and wait 10 seconds
# before sending SIGKILL
ExecStartPre=-/bin/sh -c "`docker stop --time 10 ecs-agent &>/dev/null`"
Owner

Why do we do this in ExecStartPre? Do we expect the service to already be running when we try to start it? Further, this occurs later in the file than the other ExecStartPre call to 'docker rm'.


# Remove the iptables rules needed to enable IAM Roles for Tasks.
ExecStopPost=-/sbin/iptables -t nat -D PREROUTING -d 169.254.170.2/32 \
-p tcp -m tcp --dport 80 -j DNAT --to-destination 127.0.0.1:51679
ExecStopPost=-/sbin/iptables -t nat -D OUTPUT -d 169.254.170.2/32 \
-p tcp -m tcp --dport 80 -j REDIRECT --to-ports 51679

# Reset route_localnet, which was enabled so the port proxy could route
# traffic using loopback addresses, to the system default value
ExecStopPost=/bin/sh -c \
"/sbin/sysctl --quiet --write net.ipv4.conf.all.route_localnet=$(/sbin/sysctl \
-q -n net.ipv4.conf.default.route_localnet)"
43 changes: 43 additions & 0 deletions integration/systemd/load-and-verify-agent.sh
@@ -0,0 +1,43 @@
#!/bin/bash
# Copyright 2015 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You may
# not use this file except in compliance with the License. A copy of the
# License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

# This wrapper script ensures that we've loaded the desired ECS agent
# and have attached the "latest" tag to it.
# USAGE: load-and-verify-agent.sh DEFAULT_TAG

docker info > /dev/null 2>&1 || {
echo "Cannot talk to Docker daemon. Aborting." >&2
exit 1
}

DEFAULT_TAG=$1 ; shift
AGENT_IMAGE_NAME=amazon/amazon-ecs-agent
DESIRED_IMAGE_FILE=/var/cache/ecs/desired-image

if [ -f "$DESIRED_IMAGE_FILE" ]; then
DESIRED_TAG=$(head -n1 "$DESIRED_IMAGE_FILE")
else
DESIRED_TAG="$DEFAULT_TAG"
fi

IMAGE_NAME="${AGENT_IMAGE_NAME}:${DESIRED_TAG}"


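# Resolve each reference to an image ID; the result is empty if the image is absent.
image_id=$(docker inspect --format '{{.Id}}' "$IMAGE_NAME" 2>/dev/null)
latest_image_id=$(docker inspect --format '{{.Id}}' "${AGENT_IMAGE_NAME}:latest" 2>/dev/null)

if [ -n "$image_id" ]; then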
# We already have the desired image. Ensure that "latest" points to it.
docker tag ${AGENT_IMAGE_NAME}:${image_id##sha256:} ${AGENT_IMAGE_NAME}:latest