We are deploying machines on Hetzner, and sometimes it's not possible to create a machine because the account's resource limit has been reached:
machine_controller.go:383] Failed to reconcile machine "xxx-m-1-68c6cd6957-6hk94": failed to create machine at cloudprovider, due to failed to create server, due to core limit exceeded (resource_limit_exceeded)
It would be very useful to have a metric to monitor for this, so that we can alert when machines have been scheduled but not successfully created.
@rajaSahil it's probably a bad idea to include reasons in metrics (as that can explode cardinality). The easiest way to solve this would probably be by exposing the machine counts in a MachineDeployment status (there should be different fields there with "ready" machines, created machines, etc) as metrics.
> @rajaSahil it's probably a bad idea to include reasons in metrics (as that can explode cardinality).
@embik I agree, we should not add reasons in metrics.
> The easiest way to solve this would probably be by exposing the machine counts in a MachineDeployment status (there should be different fields there with "ready" machines, created machines, etc) as metrics.
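To make the suggestion above a bit more concrete, here is a rough, stdlib-only Python sketch of what "machine counts as metrics" could look like in the Prometheus text exposition format. It is not how machine-controller does or would implement this: the `DeploymentStatus` class, its field names (`replicas`, `ready_replicas`), and the metric names are all hypothetical and only stand in for whatever counts the MachineDeployment status actually carries. The only label is the deployment name, so no failure reason ends up in the metric and cardinality stays bounded.

```python
from dataclasses import dataclass


@dataclass
class DeploymentStatus:
    """Hypothetical subset of a MachineDeployment status (field names assumed)."""
    name: str
    replicas: int        # machines the deployment currently accounts for
    ready_replicas: int  # machines that were created and became ready


# Hypothetical metric names; each gauge reads one field of the status.
GAUGES = [
    ("machine_deployment_replicas",
     "Machines accounted for by the deployment.",
     lambda s: s.replicas),
    ("machine_deployment_ready_replicas",
     "Machines that are created and ready.",
     lambda s: s.ready_replicas),
]


def render_metrics(statuses):
    """Render the gauges as Prometheus text exposition lines."""
    lines = []
    for metric, help_text, getter in GAUGES:
        lines.append(f"# HELP {metric} {help_text}")
        lines.append(f"# TYPE {metric} gauge")
        for s in statuses:
            # Only the deployment name is used as a label (no reason label).
            # Label values are not escaped here; Kubernetes object names
            # contain no quotes or backslashes.
            lines.append(f'{metric}{{name="{s.name}"}} {getter(s)}')
    return "\n".join(lines) + "\n"


def missing_machines(status):
    """Number of machines that exist on paper but are not ready yet."""
    return max(status.replicas - status.ready_replicas, 0)


if __name__ == "__main__":
    print(render_metrics([DeploymentStatus("xxx-m-1", replicas=3, ready_replicas=2)]), end="")
```

With gauges like these, an alert for the Hetzner case could fire when the difference between the two values stays above zero for longer than a normal provisioning time, without the metric itself needing to know why creation failed.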