Julius at LEAP recently ended up with 0% available storage in filestore, and reported:
In the longer run, I am really struggling with this. I have a few users that are just not keeping to their quota. It is a pretty bad situation that just one single user who doesn't know what they are doing (or just doesn't listen 👀) can bring the whole hub to a grinding halt.
I observe that we received the alert when available storage dropped below 10% on March 29th, but we didn't act until it was too late, on April 9th, when it dropped from 3.5% to 0%.
Available features of relevance
Terraform-managed alerts when available storage drops below 10%:
pagerduty notifications
slack notifications (via pagerduty notifications)
Prometheus metrics collection and Grafana dashboards providing information about users' home directories
This is available, and it can help community representatives know who consumes too much disk space
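As a rough illustration of how per-user home directory numbers could be turned into a list of who is over quota, here is a minimal Python sketch. The user names, the 50 GiB soft quota, and the `usage` dictionary are all hypothetical; in practice the values would come from the Prometheus metrics or Grafana dashboards mentioned above, and this sketch does not query them.

```python
GIB = 1024 ** 3


def users_over_quota(usage_bytes, quota_bytes):
    """Return (user, bytes_used) pairs above the quota, largest first."""
    over = [(user, used) for user, used in usage_bytes.items() if used > quota_bytes]
    return sorted(over, key=lambda item: item[1], reverse=True)


# Hypothetical per-user home directory sizes in bytes (assumed, not real data).
usage = {"user-a": 12 * GIB, "user-b": 250 * GIB, "user-c": 80 * GIB}

# Hypothetical soft quota of 50 GiB per user.
for user, used in users_over_quota(usage, quota_bytes=50 * GIB):
    print(f"{user}: {used / GIB:.0f} GiB")
```

This only reports usage; it does not enforce anything, which is why per-user NFS quotas remain a separate question.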
Not available features of relevance
This isn't an exhaustive list, just a quick writeup.
NFS per user quotas
To my knowledge, this isn't possible for us via the managed NFS services we use with AWS/GCP/Azure and we don't have a clear idea on how to go about resolving this.
Currently represented by this issue.
@damianavila @haroldcampbell I've put this in the engineering backlog even though it's not ready to be worked on, as it lacks concrete technical steps. I'm not sure how else to make sure it doesn't fall between the cracks.
Incident prompting additional improvement
Alerts to community representatives about NFS storage: we currently lack these, and Julius at LEAP, for example, didn't receive one. Such alerts may have been set up manually in the past, but they may since have been overridden by Terraform.
Tracked by "Terraform configured alerting to community representatives about NFS storage" (#3923).
Related