Cull pods that run for longer than 7 days #3042
Merging this PR will trigger the following deployment actions.
Support and Staging deployments
Production deployments
I agree with this change, but I don't think we should introduce it silently. I think the QCL community at least should be informed about this, since they have in the past run long calculations on a single node. I think it could make sense to notify all community champions about this, and I figure this config is us influencing the user environment, so it should be documented somewhere for them. I opened #3017 about communication in situations like this, among other things, as I've felt that I lack agency on how to communicate a change influencing users once we've decided it's the right thing to do. For this change, I suggest:
Thanks @consideRatio, I think that makes sense. I don't think I have the capacity to work on communicating this at this point, and I'm not sure who does. So I'm going to unassign myself from this, as I don't think I can take it to completion.
We just received feedback from QCL (https://2i2c.freshdesk.com/a/tickets/972) on this:
I think the answer to the question is "yes", but I welcome corrections if there is subtlety on this. In any case, I think this response unblocks this PR!
I think a lot of the long-running jobs observed were caused by the KubeSpawner bug in 6.0.0, which has been fixed and cleaned up. With that in mind, I no longer expect this to be as breaking as I thought when I saw several long-lived pods.
I opened 2i2c-org/docs#193 with relevant complementary docs.
@jmunroe I pinged you for review by mistake. I meant to assign you for attribution for the work done!
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/6574132482
Ref #3015
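
For readers who want the rule in the title spelled out, here is a minimal sketch of a "longer than 7 days" age check. It is only an illustration under assumptions: the function name `exceeds_max_age` and the constant `MAX_AGE_SECONDS` are hypothetical, the limit is assumed to be expressed in seconds, and this is not the culler's actual implementation or the configuration changed in this PR.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the age limit is expressed in seconds.
MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # 7 days


def exceeds_max_age(started_at, now, max_age_seconds=MAX_AGE_SECONDS):
    """Return True if something started at `started_at` has run longer than the limit.

    Hypothetical helper for illustration only; both arguments are
    timezone-aware datetimes.
    """
    return (now - started_at) > timedelta(seconds=max_age_seconds)


if __name__ == "__main__":
    now = datetime(2023, 10, 19, tzinfo=timezone.utc)
    # A pod started 8 days ago is past the limit; one started 6 days ago is not.
    print(exceeds_max_age(now - timedelta(days=8), now))
    print(exceeds_max_age(now - timedelta(days=6), now))
```

Note that a check like this looks only at total running time, not at activity, which is why it can interrupt long calculations such as the ones mentioned for QCL above.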