Containers that fail with no reason cause a panic #26
Here is some more info on this particular panic. Output from describing the stopped task:

```json
{
  "tasks": [{
    "taskArn": "arn:aws:ecs:us-east-1:REDACTED:task/REDACTED",
    "clusterArn": "arn:aws:ecs:us-east-1:REDACTED:cluster/REDACTED",
    "taskDefinitionArn": "arn:aws:ecs:us-east-1:REDACTED:task-definition/deREDACTED:REDACTED",
    "overrides": {
      "containerOverrides": [{
        "name": "app",
        "command": [
          "REDACTED"
        ],
        "environment": []
      }]
    },
    "lastStatus": "STOPPED",
    "desiredStatus": "STOPPED",
    "cpu": "256",
    "memory": "512",
    "containers": [{
      "containerArn": "arn:aws:ecs:us-east-1:REDACTED:container/REDACTED",
      "taskArn": "arn:aws:ecs:us-east-1:REDACTED:task/REDACTED",
      "name": "app",
      "lastStatus": "STOPPED",
      "networkInterfaces": [{
        "attachmentId": "REDACTED",
        "privateIpv4Address": "REDACTED"
      }],
      "healthStatus": "UNKNOWN",
      "cpu": "0"
    }],
    "version": 4,
    "stoppedReason": "Timeout waiting for network interface provisioning to complete.",
    "connectivity": "CONNECTED",
    "connectivityAt": 1573233270.396,
    "createdAt": 1573233066.948,
    "stoppingAt": 1573233252.398,
    "stoppedAt": 1573233282.065,
    "group": "family:REDACTED",
    "launchType": "FARGATE",
    "platformVersion": "1.3.0",
    "attachments": [{
      "id": "REDACTED",
      "type": "ElasticNetworkInterface",
      "status": "DELETED",
      "details": [
        {
          "name": "subnetId",
          "value": "REDACTED"
        },
        {
          "name": "networkInterfaceId",
          "value": "REDACTED"
        },
        {
          "name": "macAddress",
          "value": "REDACTED"
        },
        {
          "name": "privateIPv4Address",
          "value": "REDACTED"
        }
      ]
    }],
    "healthStatus": "UNKNOWN",
    "tags": []
  }],
  "failures": []
}
```
Not sure, but it's possible I have a fix; testing it now. When waiting for the task to finish there is a built-in max-attempts mechanism (100 attempts by default) and a built-in delay between attempts (6 seconds by default). You can try changing `svc.WaitUntilTasksStopped` to use a much lower max-attempts value, and it will instantly throw the error above. So it seems the fix would be to expose "delay" and "max-attempts" as external parameters.
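For illustration, here is a minimal sketch of that idea, assuming the aws-sdk-go v1 waiter options (`request.WithWaiterDelay`, `request.WithWaiterMaxAttempts`); the cluster name, task ARN, and chosen values are placeholders rather than the plugin's actual code:

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/request"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ecs"
)

func main() {
	svc := ecs.New(session.Must(session.NewSession()))

	input := &ecs.DescribeTasksInput{
		Cluster: aws.String("my-cluster"),             // placeholder cluster name
		Tasks:   []*string{aws.String("my-task-arn")}, // placeholder task ARN
	}

	// svc.WaitUntilTasksStopped(input) polls with a fixed waiter configuration
	// (100 attempts with a 6-second delay, i.e. roughly 10 minutes). The
	// *WithContext variant accepts waiter options, so the delay and max
	// attempts could instead come from plugin parameters.
	err := svc.WaitUntilTasksStoppedWithContext(
		context.Background(),
		input,
		request.WithWaiterDelay(request.ConstantWaiterDelay(10*time.Second)),
		request.WithWaiterMaxAttempts(360), // e.g. allow up to an hour
	)
	if err != nil {
		log.Fatalf("waiting for task to stop: %v", err)
	}
}
```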
@Eli-Goldberg, have you tried out the PR I submitted (#27)? Does it fix your issue?
/cc @pda
I can confirm that; it seems there's no way to run an ECS task that takes more than 10 minutes :(
I have a tested fix and will open a PR today.
Hey @Eli-Goldberg, did you ever land the fix for this issue? I seem to be hitting it reasonably frequently.
Yeah, sorry, I forgot to open a PR. I'll do that in a bit :)
Awesome! Thanks :)
I've opened a PR: #35.
See #25 (comment) for more background.
From the AWS console, this looks like a case where ECS doesn't even get to the point of launching a container, so we might be able to fall back to `ecs.Task.StoppedReason`.
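For illustration, a rough sketch of that fallback, assuming the aws-sdk-go v1 `ecs.Task` and `ecs.Container` types; the `stopReason` helper and its container-name parameter are hypothetical, not the project's actual code:

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ecs"
)

// stopReason prefers the per-container Reason, but falls back to the
// task-level StoppedReason when the container never reported one (as in the
// ENI provisioning timeout above), instead of assuming a container reason
// is always present.
func stopReason(task *ecs.Task, containerName string) string {
	for _, c := range task.Containers {
		if aws.StringValue(c.Name) == containerName {
			if reason := aws.StringValue(c.Reason); reason != "" {
				return reason
			}
		}
	}
	return aws.StringValue(task.StoppedReason)
}

func main() {
	// Mirrors the describe-tasks output above: the container has no Reason,
	// but the task carries a StoppedReason.
	task := &ecs.Task{
		StoppedReason: aws.String("Timeout waiting for network interface provisioning to complete."),
		Containers:    []*ecs.Container{{Name: aws.String("app")}},
	}
	fmt.Println(stopReason(task, "app"))
}
```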