-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance PagerDuty notifier with retry configuration #241
Comments
Sounds like a good improvement. I would appreciate the PR and seems like it should be straight forward to review and approve. |
@fusiondog comprehensive PR attached -- in addition to unit tests, I've also used a local instance to try out variations on set/unset values. I'm up for any adjustments to better match local source or config conventions. |
Is it possible to make it retry infinitely? |
@nh2 the way the underlying gopherduty client is written, it cannot be retried indefinitely. However, you can set a high # for For instance, My own experiments show that so long as a retry happens after 60 seconds, it's likely to succeed. It's hard to determine what an appropriate amount of retries would be, though the current rate limits are about 150 requests-per-minute - so as long as you will retry past |
@fusiondog @darkcrux there is also a PR against the gopherduty library that will allow 403 (Rate limit response) to be treated as a retry-able error: darkcrux/gopherduty#3 |
@StephenWeber Thanks for the detail, that is good to know; infinite retry would still be best for simplicity so that we don't have to rely on specific timings or logic. |
Current behavior: when there is an error making requests of the PagerDuty API, the client by default does not retry the request.
consul-alerts
logs the error.Preferred behavior: The underlying library providing PagerDuty support has logic built-in to retry requests when there is an error. By default, it does not perform retries, but through configuration values it can be set to do so. We would like to provide a way to optionally configure the client with these settings.
Primary benefit for
consul-alerts
users: PagerDuty returns an error when they rate-limit your service key. Their documentation says that the correct thing to do is to retry the request after a period of time.The client fields in question are:
We suggest two optional fields in the PagerDuty notifier configs that would look like this in Consul:
Since existing customer configurations don't have those keys, the code's default should be to retain the existing behavior. That means that if those keys are not present, that the PagerDuty client should not retry on errors.
The basic components of our suggested change are:
PagerDutyNotifier
struct and itsUnmarshal
-erI have some example code that I could also paste here to illustrate these steps, or could supply as a PR.
The text was updated successfully, but these errors were encountered: