What problem does your feature solve?
A more accurate alert on whether soroban-rpc is up or down, one that does not give a false positive if the prometheus or elastalert services are down.
What would you like to see?
Since the RPC endpoint is not exposed publicly, we can't use Runscope, but we do have a deadmanswitch account, which would be perfect for this. There are a couple of ways we could do it:
The most accurate option is to have the soroban-rpc pod ping deadmanswitch regularly (for example, once a minute). Either add a new pod/container whose only purpose is to check the rpc endpoint and ping deadmanswitch, or modify the rpc container itself to make those pings.
A less accurate option is to put a cron job on a host somewhere; the cron job checks the rpc endpoint and makes the pings. The problem with this approach is that you'll get a false alarm if the cron host dies.
In both cases, a simple script like the following will be sufficient:
if curl -fsS rpc/health | grep -q healthy; then curl -fsS https://nosnch.in/xxxxx; fi
The way it works is: as long as the script checks in with deadmanswitch regularly, no alert is fired. If the script stops checking in, an alert is fired.
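As a rough sketch of the check-in logic described above, the one-liner could be expanded along these lines. The `HEALTH_URL` and `SNITCH_URL` variables, the `check_and_ping` function name, the `--run` argument, and the 10-second timeout are all illustrative assumptions; the two URLs are the same placeholders used above and would need to be replaced with the real ones. The sketch also assumes the health response contains the word `healthy` when the service is up, and uses `grep -w` so that a body such as `unhealthy` would not count as a match.

```shell
#!/bin/sh
# Placeholders taken from the one-liner above; override via the environment.
HEALTH_URL="${HEALTH_URL:-http://rpc/health}"
SNITCH_URL="${SNITCH_URL:-https://nosnch.in/xxxxx}"

check_and_ping() {
    # -f: treat HTTP errors as failures, -sS: quiet but still print errors,
    # --max-time: give up instead of hanging if the endpoint is unresponsive.
    if curl -fsS --max-time 10 "$HEALTH_URL" | grep -qw healthy; then
        # Healthy: check in so that no alert is fired.
        curl -fsS --max-time 10 "$SNITCH_URL" >/dev/null
    else
        # Unhealthy or unreachable: skip the check-in so the alert fires.
        echo "rpc health check failed; skipping check-in" >&2
        return 1
    fi
}

# Only perform the check when invoked with --run (hypothetical flag),
# which keeps the function easy to source and test on its own.
if [ "${1:-}" = "--run" ]; then
    check_and_ping
fi
```

For the cron variant, a crontab entry such as `* * * * * /path/to/script.sh --run` (path hypothetical) would run the check once a minute. The check-in interval configured on the deadmanswitch side would need to be somewhat longer than the cron interval, so that a single slow run does not trigger an alert.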
What alternatives are there?
None
@mwtzzz we discussed and we'd like to go with the second option
less accurate is to put a cron job on a host somewhere, the cron job checks the rpc endpoint and makes the pings. The problem with this approach is you'll get a false alarm if the cron host dies.