Feature: OOM killer protection similar to sshd #580

erpel · 2024-07-31T11:43:18Z

I initially opened this against amazonlinux, but it makes more sense in this project:

When the system is under memory pressure, I've often seen ssm-agent get killed by the OOM killer. Once it is killed we can no longer log in, which makes it hard to observe and debug the situation.

I'd like ssm-agent to run with the same OOM killer protection that sshd applies to its own process (an OOM score adjustment of -1000).

The alternative would be to stop using SSM for login and switch to SSH, but that adds overhead for us: administering user accounts and SSH keys. SSM Session Manager is a useful feature that would really benefit from added effort to increase stability.

This old bug https://bugzilla.redhat.com/show_bug.cgi?id=1010429#c0 contains some details about how it used to work with sshd - especially making sure that user processes spawned by the "protected" server don't inherit the strict protection of oom_score_adj -1000.
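To illustrate the behavior described in that bug, here is a minimal Python sketch of the pattern: the server lowers its own `oom_score_adj` to -1000 and each child restores the previous value before it execs. This is not how sshd or ssm-agent is actually implemented. The function names and the `path` parameter are hypothetical and exist only so the logic can be tried against an ordinary file. It assumes Linux, and lowering the score on the real `/proc/self/oom_score_adj` normally requires root or `CAP_SYS_RESOURCE`.

```python
import subprocess

OOM_ADJ_PATH = "/proc/self/oom_score_adj"  # Linux procfs file for the calling process
PROTECTED_SCORE = -1000  # the value sshd uses for itself, per this issue


def read_oom_score_adj(path=OOM_ADJ_PATH):
    """Return the current OOM score adjustment as an int."""
    with open(path) as f:
        return int(f.read().strip())


def write_oom_score_adj(value, path=OOM_ADJ_PATH):
    """Write a new adjustment; the kernel accepts -1000..1000."""
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    with open(path, "w") as f:
        f.write(str(value))


def protect_self(path=OOM_ADJ_PATH):
    """Protect this process and return the previous value for children."""
    original = read_oom_score_adj(path)
    write_oom_score_adj(PROTECTED_SCORE, path)
    return original


def spawn_unprotected(argv, original_score, path=OOM_ADJ_PATH):
    """Start a child that resets its score after fork and before exec."""
    def reset():
        # Runs in the child, so /proc/self refers to the child here.
        write_oom_score_adj(original_score, path)

    return subprocess.Popen(argv, preexec_fn=reset)
```

A server would call `protect_self()` once at startup and then use `spawn_unprotected(argv, original)` for every user session, so that only the server itself is exempt from the OOM killer.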

CuriousDolphin · 2024-09-18T12:19:00Z

I have the same problem: when the machine runs out of RAM, the SSM agent is killed and the only fix is to restart it. Is there any news?

gianniLesl · 2024-10-02T14:57:02Z

What OS is this on? The amazon-ssm-agent process should always restart if it crashes or is killed.

erpel · 2024-10-02T15:12:42Z

We've seen this on up-to-date versions of Amazon Linux 2023. Under high load, restarting seems to be unreliable or to take a very long time, which makes it hard to reach instances.

Manually adjusting the OOM killer score in the systemd unit file (using an override for example) does help, so I feel that ssm-agent setting this automatically on the important process(es) is a good solution.
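For anyone who wants to try the override workaround, here is a rough Python sketch that writes a systemd drop-in setting `OOMScoreAdjust=`. The unit name `amazon-ssm-agent.service`, the file name `override.conf` and the helper functions are assumptions for illustration; check the unit name on your system with `systemctl list-units` first.

```python
from pathlib import Path

# Assumed unit name; verify it on your own system.
UNIT_NAME = "amazon-ssm-agent.service"


def render_override(score=-1000):
    """Return drop-in text that sets the OOM score adjustment for the service."""
    if not -1000 <= score <= 1000:
        raise ValueError("OOMScoreAdjust must be between -1000 and 1000")
    return f"[Service]\nOOMScoreAdjust={score}\n"


def write_override(root="/etc/systemd/system", unit=UNIT_NAME, score=-1000):
    """Write <root>/<unit>.d/override.conf and return its path."""
    dropin_dir = Path(root) / f"{unit}.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    target = dropin_dir / "override.conf"
    target.write_text(render_override(score))
    return target
```

After writing the file as root, run `systemctl daemon-reload` and restart the service for it to take effect. Note that the adjustment is inherited by child processes, so with this override alone any session processes started by the agent would also be protected unless they reset their own score.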

h0tw1r3 · 2024-12-06T15:49:02Z

This is a significant issue for us after switching from SSH to SSM.
