Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merging our systemd unit files. #1

Open
wants to merge 17 commits into
base: systemd
Choose a base branch
from

Conversation

JasonSwindle
Copy link

No description provided.


# Load an updated ECS Agent, if it exists:
ExecStartPre=-/bin/sh -c 'test -f /var/cache/ecs/desired-image && \
docker load $(cat /var/cache/ecs/desired-image) \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just docker load --input=/var/cache/ecs/desired-image No need to use cat.

We should also add --quiet to the docker load invocation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


# If we don't have an ECS Agent, load from disk, if possible, or Docker Hub:
ExecStartPre=/bin/sh -c "docker inspect amazon/amazon-ecs-agent \
|| docker load --quiet < /var/cache/ecs/ecs-agent.tar \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also update this to use --input

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -0,0 +1,2 @@
ECS_LOGLEVEL=info
ECS_CLUSTER=default
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that it makes sense for this file to be here. There's nothing systemd-specific about it, so at the very least it probably shouldn't be in the systemd subdirectory.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call-out, fixing now.

Jason Swindle added 3 commits March 23, 2018 12:18
... driver is now journald.  The ECS Agent still logs into the amazon-ecs-agent unit and writes to the file system.
Copy link
Owner

@nmeyerhans nmeyerhans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple details I noticed. Looks pretty good in general.

# Load an updated ECS Agent, if it exists:
ExecStartPre=-/bin/sh -c 'test -f /var/cache/ecs/desired-image && \
docker load --quiet --input=/var/cache/ecs/desired-image \
&& rm -f $(cat /var/cache/ecs/desired-image) /var/cache/ecs/desired-image'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$(cat /var/cache/ecs/desired-image) does not belong here

&& rm -f $(cat /var/cache/ecs/desired-image) /var/cache/ecs/desired-image'

# If we don't have an ECS Agent, load from disk, if possible, or Docker Hub:
ExecStartPre=/bin/sh -c "`docker inspect amazon/amazon-ecs-agent &>/dev/null` \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The &> redirection syntax is a bash extension. We don't want to assume that /bin/sh is bash, as this isn't the case on all systems. Please stick with POSIX-friendly redirections, so > /dev/null 2>&1

(Note that this issue is present in multiple places.)


# Docker stop is used as it will send a SIGTEM and wait 10 seconds
# before sending SIGKILL
ExecStartPre=-/bin/sh -c "`docker stop --time 10 ecs-agent &>/dev/null`"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we do this in ExecStartPre? Do we expect the service to already be running when we try to start it? Further, this occurs later in the file than the other ExecStartPre call to 'docker rm'.

Restart=on-failure
RestartSec=5s
RestartPreventExitStatus=5
StartLimitInterval=5min
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have data to back up this restart policy?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not. The default RestartSec time for systemd is 100ms, which I feel is way too fast and will clobber docker trying to restart the agent over and over.

# attempts to duplicate the functionality of the `ecs-init` Golang package. The
# notable differences currently are:
#
# 1. The unit file does an unconditional pull of the latest ECS Agent from
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we really doing this instead of using a potentially cached agent?

# Depending on the OS, it could be placed in
# /etc/systemd/system/amazon-ecs-agent.d , and named
# ecs_agent_version_override.conf
Environment=ECS_AGENT_VERSION=v1.17.2
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we need to update this with each Agent release? If so, why not just pin this to "latest" and allow customers to pin to a version with the override described above?

# If we don't have an ECS Agent, load from disk, if possible, or Docker Hub:
ExecStartPre=/bin/sh -c "`docker inspect amazon/amazon-ecs-agent &>/dev/null` \
|| docker load --quiet -input=/var/cache/ecs/ecs-agent.tar \
|| docker pull amazon/amazon-ecs-agent:${ECS_AGENT_VERSION}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to "pull" from our S3 bucket instead of Docker Hub?

--publish=127.0.0.1:51678:51678 \
--env ECS_UPDATES_ENABLED=false \
--env ECS_DATADIR=/data \
--env ECS_ENABLE_TASK_IAM_ROLE=true \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like (at least with Docker 1.12.6) if you set --env they override anything set in the --env-file. We should be cautious about setting anything that we expect customers to potentially want to configure.

➜  ~ echo FOO=foo > foo.env
➜  ~ docker run --rm -it --env-file foo.env amazonlinux bash
bash-4.2# echo $FOO
foo
bash-4.2# exit
➜  ~ docker run --rm -it --env-file foo.env --env FOO=bar amazonlinux bash
bash-4.2# echo $FOO
bar
bash-4.2# exit                                                                                                                                                                                                                                                      ➜  ~ docker run --rm -it --env FOO=bar --env-file foo.env amazonlinux bash
bash-4.2# echo $FOO
bar
bash-4.2# exit

ExecStartPre=-/bin/sh -c "`docker rm ecs-agent &>/dev/null`"

# Create the directories needed for the ECS Agent.
ExecStartPre=-/bin/mkdir -p /var/lib/ecs/dhclient /var/ecs-data
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably fail to start if it returns non-zero. mkdir -p will return zero even if the directories already exist.

Jason Swindle added 3 commits March 26, 2018 15:13
- ECS Agent only downloads from S3.
-- Will download from region as needed.
- MD5 checks the downloaded file.
- Moved ENV to ecs.config so they can over-written.
- Changed docs to HTTPS vs HTTP.
# Download ECS Agent from S3.
ExecStartPre=/usr/bin/echo "Downloading the ECS Agent from S3, if missing"
ExecStartPre=/bin/sh -c '\
case $$REGION in \
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually work? REGION is set in a prior invocation of sh. That environment variable doesn't persist outside that shell.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. This lead to a few errors as well that have been fixed.

- A echo to work on Ubuntu as well.
- Fixed the case logic (Tested in us-gov-west-1 on Ubuntu 16.04 instance)
- Fixed test logic to be more POSIX.
- Typo
@ProgrammingAce
Copy link

ProgrammingAce commented Mar 27, 2018

For loading the environment, we could keep the /etc/ecs/ecs.conf file optional by loading the environment through the unit file. Adding this before the ExecStart (although the env file portion might have to be part of the ExecStart callout). This would preserve the exiting functionality in ECS:

# Agent environment variables
Environment=ECS_UPDATES_ENABLEg=false
Environment=ECS_DATADIR=/data
Environment=ECS_ENABLE_TASK_IAM_ROLE=true
Environment=ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST=true
Environment=ECS_ENABLE_TASK_ENI=true
Environment=ECS_LOGFILE=/log/ecs-agent.log
Environment=ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","syslog","awslogs","none"]
Environment=ECS_CGROUP_PREFIX=ecs
Environment=ECS_CLUSTER=default

# Load the ECS configuration file into the agent environment
EnvironmentFile=-/etc/ecs/ecs.conf

@nmeyerhans
Copy link
Owner

@ProgrammingAce Setting the environment variables in the unit file doesn't pass them through to the container. Docker only passes variables explicitly set via the --env or --env-file command-line flags.

I don't really like the idea of moving the environment to an external file, but it seems like it might be the best approach...

@JasonSwindle If we need an external environment file, I'm also inclined to move all the shell code out of the unit file and into something standalone, too. As you've no doubt discovered, embedding complex logic in inline code in unit files is painful. The result is brittle and difficult to maintain.

@JasonSwindle
Copy link
Author

My only concern with many small scripts is the complexity for customers to use it when not using the ECS AMI. I'm more than happy to break the shell code out of the unit file, as it was a huge pain to make everything work correctly. My other concern is how much is now mandatory in the ecs.config file. This changes the simple nature of ECS at the container instance, and will make customer usage way more error prone.

@nmeyerhans
Copy link
Owner

@JasonSwindle The alternative is that we ditch the env-file altogether and instead encourage customers to override the environment via a drop-in. It'll be more straightforward in the default case with no overrides, but I'm not sure it's ideal in terms of reducing complexity for people who want to make changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants