Prometheus alert rules are created only for one of the two (or more) applications related over the cos_agent interface #17
Comments
Confirmed in another model that the alert rule is created only for the app that has the grafana-agent's juju leader as its subordinate:
When I reproduce this, it looks like the kafka alerts made it into Prometheus, but they are labeled with
@dstathis Confirmed:
After splitting grafana-agent into two separate apps, and relating them to kafka and zookeeper respectively, I also see However,
Created #29 for the alerts sticking around.
@simskij This issue becomes critical once we sort out canonical/prometheus-k8s-operator#551. Until it is fixed, the host-metrics alert_rules will be missing for every application but one.
After investigating this issue further, we have determined that the implementation of alert and metric labels will have to change significantly. All alerts and metrics will now be labeled with the topology labels of the charm they came from rather than the topology of the principal. This change will likely be completed after the winter break. It will require updates to the cos_agent library in client charms and may require changes to any git-based alert rules.
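To illustrate the labeling change described in the last comment, here is a minimal sketch. It is not the cos_agent implementation. It assumes the usual Juju topology label names (`juju_model`, `juju_application`, `juju_unit`); the `Topology` class, the `label_rule` helper, the model and unit names, and the `severity` label are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical stand-in for a charm's Juju topology."""

    model: str
    application: str
    unit: str

    def as_labels(self) -> dict:
        # Label names assumed to follow the common juju_* convention.
        return {
            "juju_model": self.model,
            "juju_application": self.application,
            "juju_unit": self.unit,
        }


def label_rule(rule: dict, topology: Topology) -> dict:
    """Return a copy of `rule` with the topology labels merged in."""
    labelled = dict(rule)
    labelled["labels"] = {**rule.get("labels", {}), **topology.as_labels()}
    return labelled


# Illustrative rule; only the alert name comes from this issue.
rule = {"alert": "HostCpuHighIowait", "labels": {"severity": "warning"}}

# Illustrative names only: the subordinate and one of its principals.
agent = Topology("monitored", "grafana-agent", "grafana-agent/0")
principal = Topology("monitored", "kafka", "kafka/0")

# Labels taken from the principal (the approach being replaced).
by_principal = label_rule(rule, principal)
# Labels taken from the charm the rule came from (the planned approach).
by_origin = label_rule(rule, agent)

print(by_principal["labels"]["juju_application"])
print(by_origin["labels"]["juju_application"])
```

With origin-based labels, a host rule shipped by grafana-agent carries grafana-agent's own topology no matter which principal it is attached to. That is presumably why git-based alert rules that match on the principal's labels may need updating.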
Bug Description
After relating grafana-agent to two principal applications, in our case, zookeeper and kafka, the generic grafana-agent host alert rules (e.g. HostCpuHighIowait) are generated only for one of the apps (zookeeper).
Notably, the grafana-agent leader unit is the one related to the zookeeper app (as @dstathis noticed on our live debugging call).
To Reproduce
The same alert rule for kafka is missing.
Notably, the grafana-agent leader unit is the one related to the zookeeper app.
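One way to check for this gap is to query Prometheus's `/api/v1/rules` endpoint and compare the `juju_application` labels on a given alert against the applications you expect. The sketch below is a minimal example under stated assumptions: `sample` is a hand-written payload in the shape of that endpoint's response, not output from this deployment, and `apps_with_alert` is a hypothetical helper.

```python
def apps_with_alert(rules_response: dict, alert_name: str) -> set:
    """Collect juju_application labels from alerting rules named `alert_name`."""
    apps = set()
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("type") != "alerting" or rule.get("name") != alert_name:
                continue
            app = rule.get("labels", {}).get("juju_application")
            if app:
                apps.add(app)
    return apps


# Hand-written sample mimicking the reported symptom (illustrative only).
sample = {
    "status": "success",
    "data": {
        "groups": [
            {
                "name": "host",
                "rules": [
                    {
                        "type": "alerting",
                        "name": "HostCpuHighIowait",
                        "labels": {"juju_application": "zookeeper"},
                    }
                ],
            }
        ]
    },
}

expected = {"zookeeper", "kafka"}
missing = expected - apps_with_alert(sample, "HostCpuHighIowait")
print(sorted(missing))  # prints ['kafka']
```

Against a live deployment, the parsed JSON response from the rules endpoint would replace `sample`.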
Environment
Monitored model:
Juju status:
Prometheus version in COS model:
Relevant log output
The CLI command output is above; let me know which logs would help.
Additional context
No response