Check change-threshold / interval doesn't seem to be accurate #240

Open
pondix opened this issue Sep 18, 2018 · 0 comments

pondix commented Sep 18, 2018

Hi,

First off, great work: this is an awesome project and very useful for Consul alerting. I'm using consul-alerts to alert on various system and application health checks in Consul, with notifications going to Opsgenie.

I'm trying to configure the alerts to notify as soon as an issue is detected. However, notifications only seem to be sent a few minutes (up to 3 minutes) after initial detection.

After reading the available documentation, it seems there is nothing to tune apart from the following two keys:

consul-alerts/config/notif-profiles/default

{
  "Interval": 1,
  "NotifList": {
    "opsgenie":true
  }
}

consul-alerts/config/checks/change-threshold

1

If I've understood correctly, with this configuration notifications should be processed every minute, and a check only needs to fail for 1 second to be considered "failed".

  • Is there some configuration missing from the documentation or is the code not performing according to the configured parameters?
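For anyone reproducing this, here is a minimal sketch of how the two keys above could be written to the Consul KV store through its HTTP API (`PUT /v1/kv/<key>`). It assumes a local agent listening on `http://127.0.0.1:8500`; the helper names `build_requests` and `apply_settings` are hypothetical and not part of consul-alerts.

```python
import json
import urllib.request

# Assumption: a local Consul agent on the default HTTP port.
CONSUL_ADDR = "http://127.0.0.1:8500"

# Key/value pairs copied from the configuration shown above.
SETTINGS = {
    "consul-alerts/config/notif-profiles/default": json.dumps(
        {"Interval": 1, "NotifList": {"opsgenie": True}}
    ),
    "consul-alerts/config/checks/change-threshold": "1",
}


def build_requests(settings, addr=CONSUL_ADDR):
    """Build one PUT request per key for Consul's /v1/kv/<key> endpoint."""
    return [
        urllib.request.Request(
            f"{addr}/v1/kv/{key}", data=value.encode("utf-8"), method="PUT"
        )
        for key, value in settings.items()
    ]


def apply_settings(settings, addr=CONSUL_ADDR):
    """Send the requests and print the response body for each key."""
    for req in build_requests(settings, addr):
        with urllib.request.urlopen(req) as resp:
            print(req.full_url, resp.read().decode().strip())
```

Calling `apply_settings(SETTINGS)` against a running agent writes both keys; `build_requests` can be inspected on its own without touching the network.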

Logs:

Sep 18 16:27:45 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:27:45Z" level=info msg="Now acting as leader."
Sep 18 16:28:02 proxysql-testing consul[25768]:     2018/09/18 16:28:02 [WARN] agent: Check "disk1" is now warning
Sep 18 16:28:23 proxysql-testing systemd[1]: dev-disk-by\x2dpartuuid-33eb67f0\x2da6ee\x2d4d01\x2d9dd4\x2d8d302d24c221.device: Job dev-disk-by\x2dpartuuid-33eb67f0\x2da6ee\x2d4d01\x2d9dd4\x2d8d302d24c221.device/start timed out.
Sep 18 16:28:23 proxysql-testing systemd[1]: Timed out waiting for device dev-disk-by\x2dpartuuid-33eb67f0\x2da6ee\x2d4d01\x2d9dd4\x2d8d302d24c221.device.
Sep 18 16:28:23 proxysql-testing systemd[1]: Dependency failed for /boot/efi.
Sep 18 16:28:23 proxysql-testing systemd[1]: boot-efi.mount: Job boot-efi.mount/start failed with result 'dependency'.
Sep 18 16:28:23 proxysql-testing systemd[1]: dev-disk-by\x2dpartuuid-33eb67f0\x2da6ee\x2d4d01\x2d9dd4\x2d8d302d24c221.device: Job dev-disk-by\x2dpartuuid-33eb67f0\x2da6ee\x2d4d01\x2d9dd4\x2d8d302d24c221.device/start failed with result 'timeout'.
Sep 18 16:28:32 proxysql-testing consul[25768]:     2018/09/18 16:28:32 [WARN] agent: Check "disk1" is now warning
Sep 18 16:29:02 proxysql-testing consul[25768]:     2018/09/18 16:29:02 [WARN] agent: Check "disk1" is now warning
Sep 18 16:29:32 proxysql-testing consul[25768]:     2018/09/18 16:29:32 [WARN] agent: Check "disk1" is now warning
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=info msg="Running reminder check."
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=info msg="Getting reminders"
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=info msg="Reminder message duration minutes:  10"
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=info msg="Setting reminder for node:  monitor-jenkins-1"
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=info msg="messages sent for notification"
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=info msg="sendBuiltin running"
Sep 18 16:29:54 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:54Z" level=debug msg="OpsGenieAlertClient.CreateAlert alias: monitor-jenkins-1"
Sep 18 16:29:55 proxysql-testing consul-alerts[25569]: time="2018-09-18T16:29:55Z" level=info msg="Opsgenie notification sent."
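
To put a number on the delay, here is a small sketch that computes the gap between the first `Check "disk1" is now warning` line and the `Opsgenie notification sent.` line in the log above. It assumes those two lines belong to the same alert, which the log itself does not confirm; `delay_seconds` is a hypothetical helper.

```python
from datetime import datetime

# Timestamp layout used by the consul-alerts log lines (time="...").
FMT = "%Y-%m-%dT%H:%M:%SZ"


def delay_seconds(detected, notified):
    """Return the whole seconds elapsed between two log timestamps."""
    delta = datetime.strptime(notified, FMT) - datetime.strptime(detected, FMT)
    return int(delta.total_seconds())


# Assumption: both events refer to the same alert.
first_warning = "2018-09-18T16:28:02Z"  # first 'Check "disk1" is now warning'
notification = "2018-09-18T16:29:55Z"   # "Opsgenie notification sent."

print(delay_seconds(first_warning, notification))  # 113
```

That is just under two minutes, well above the 1-second change-threshold configured above.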

Further logs:

Sep 18 16:43:35 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:43:35Z" level=info msg="Processing health checks for notification."
Sep 18 16:47:15 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:47:15Z" level=info msg="Running reminder check."
Sep 18 16:47:15 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:47:15Z" level=info msg="Getting reminders"
Sep 18 16:48:49 proxysql-testing consul[26084]:     2018/09/18 16:48:49 [INFO] agent: Synced check "load1"
Sep 18 16:48:49 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:48:49Z" level=info msg="Running health check."
Sep 18 16:48:59 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:48:59Z" level=info msg="Processing health checks for notification."
Sep 18 16:49:01 proxysql-testing consul[26084]:     2018/09/18 16:49:01 [INFO] agent: Synced check "disk1"
Sep 18 16:49:01 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:49:01Z" level=info msg="Running health check."
Sep 18 16:49:11 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:49:11Z" level=info msg="Processing health checks for notification."
Sep 18 16:52:08 proxysql-testing consul[26084]:     2018/09/18 16:52:08 [INFO] agent: Synced check "load1"
Sep 18 16:52:08 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:52:08Z" level=info msg="Running health check."
Sep 18 16:52:15 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:52:15Z" level=info msg="Running reminder check."
Sep 18 16:52:15 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:52:15Z" level=info msg="Getting reminders"
Sep 18 16:52:18 proxysql-testing consul-alerts[26349]: time="2018-09-18T16:52:18Z" level=info msg="Processing health checks for notification."

Thanks
