Space Health perf data not working when critical #74
Comments
Please see my fix here:
@mmarodin Please provide a PR with your fix :)
Elias481 added a commit to Elias481/check_netapp_ontap that referenced this issue on Jul 20, 2020
…here skipped)
* This is just a least-intrusive fix for this important issue (fixes district09#74). From my point of view, the code should be restructured further:
* call space_threshold_helper just once with both thresholds
* use a threshold-check function for the recurring task of determining intAlertLevel for a condition result
* include thresholds in perf data (also requested in district09#82), which would be quite hacky currently and which I have not done yet, although I also want it
* output perf data also for metrics where no threshold is defined (a threshold is for alerting, but a historical graph would be useful even if no alert is defined)
Elias481 added a commit to Elias481/check_netapp_ontap that referenced this issue on Sep 24, 2020
…here skipped)
* This is just a least-intrusive fix for this important issue (fixes district09#74). From my point of view, the code should be restructured further:
* call space_threshold_helper just once with both thresholds
* use a threshold-check function for the recurring task of determining intAlertLevel for a condition result
* include thresholds in perf data (also requested in district09#82), which would be quite hacky currently and which I have not done yet, although I also want it
* output perf data also for metrics where no threshold is defined (a threshold is for alerting, but a historical graph would be useful even if no alert is defined)
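The restructuring ideas in the commit message above (one threshold-check function, thresholds included in perf data, perf data even without thresholds) could look roughly like the following. This is only an illustrative sketch in Python, not the plugin's code: `alert_level` and `perf_entry` are hypothetical names, and the only things assumed from outside this thread are the conventional Nagios exit codes (0, 1, 2) and the standard perf data layout `'label'=value[UOM];warn;crit`.

```python
# Hypothetical helpers; names and signatures are illustrative only.
OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin states


def alert_level(value, warn=None, crit=None):
    """Return the alert level for one value; undefined thresholds never alert."""
    if crit is not None and value >= crit:
        return CRITICAL
    if warn is not None and value >= warn:
        return WARNING
    return OK


def perf_entry(label, value, warn=None, crit=None, uom="%"):
    """Format one perf data entry, leaving undefined thresholds empty."""
    warn_text = "" if warn is None else str(warn)
    crit_text = "" if crit is None else str(crit)
    return f"'{label}'={value}{uom};{warn_text};{crit_text}"


if __name__ == "__main__":
    # With thresholds: the level is derived once and perf data carries them.
    print(alert_level(85, warn=80, crit=95), perf_entry("aggr1", 85, 80, 95))
    # Without thresholds: no alert, but the metric is still graphable.
    print(alert_level(85), perf_entry("aggr1", 85))
```

Keeping the level decision and the perf data formatting in separate functions means a metric is always emitted, and the thresholds only influence the returned state.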
Issue Type
Bug report
Issue Detail
Expected Behavior
When checking for aggregate_health, we expect to get performance data for all aggregates.
Actual Behavior
This works fine when the state is OK (0) or WARNING (1). However, when one of the aggregates is in CRITICAL state, it disappears from the performance data output, which causes problems with the Nagios XI PNP performance data engine. In our case, we are checking 3 aggregates, but as soon as 1 goes into CRITICAL state, we only get performance data for 2 aggregates. The RRD file still expects 3 datasources, so we no longer see any performance graphs.
I expect this behaviour to occur with other checks that use the calc_space_health sub as well: when an object is critical, it is removed before the WARNING check, and perf data is only added during the WARNING check.
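To make the suspected cause concrete, here is a minimal model of the flow described above. It is written in Python rather than the plugin's own language, and every name in it (`perfdata_buggy`, `perfdata_fixed`, the `usage` dictionary, the thresholds) is hypothetical; it only mimics "critical objects are removed before the warning pass, and perf data is emitted only in the warning pass", not the actual calc_space_health code.

```python
def perfdata_buggy(usage, warn, crit):
    """Model of the reported flow: critical objects are removed first,
    and perf data is only collected during the warning pass."""
    remaining = dict(usage)
    messages, perf = [], []
    # Critical pass: report and remove the object, but emit no perf data.
    for name, pct in usage.items():
        if pct >= crit:
            messages.append(f"CRITICAL {name} {pct}%")
            del remaining[name]
    # Warning pass: perf data is added here, so removed objects are lost.
    for name, pct in remaining.items():
        if pct >= warn:
            messages.append(f"WARNING {name} {pct}%")
        perf.append(f"{name}={pct}%")
    return messages, perf


def perfdata_fixed(usage, warn, crit):
    """Single pass: perf data is emitted for every object, whatever its state."""
    messages, perf = [], []
    for name, pct in usage.items():
        if pct >= crit:
            messages.append(f"CRITICAL {name} {pct}%")
        elif pct >= warn:
            messages.append(f"WARNING {name} {pct}%")
        perf.append(f"{name}={pct}%")
    return messages, perf


if __name__ == "__main__":
    usage = {"aggr0": 50, "aggr1": 85, "aggr2": 97}  # made-up usage percentages
    print(perfdata_buggy(usage, warn=80, crit=95)[1])  # aggr2 is missing
    print(perfdata_fixed(usage, warn=80, crit=95)[1])  # all three present
```

In the first variant the number of perf data entries changes with the check state, which is what breaks an RRD file that expects a fixed set of datasources; the second variant keeps the count stable.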
How to reproduce Behavior
Run the script for aggregate health with thresholds such that no aggregates are critical. You should see perf data for all aggregates checked. Then rerun the check with a critical threshold such that one or more aggregates are in CRITICAL state; those aggregates will no longer be included in the performance data.
Would be great if this could be fixed soon.
Thanks
Edward