Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: configure prometheus_remote_write config for non-leader #168

Merged
merged 5 commits into from
Aug 19, 2024

Conversation

ca-scribner
Copy link
Contributor

Issue

This PR fixes this issue where grafana-agent is not correctly setting its prometheus_remote_write settings for non-leader units, leading to non-leader units not properly writing their data to prometheus

Solution

The issue is caused by leadership guards in the grafana-cloud-integrator library's GrafanaCloudConfigRequirer relation handling. Specifically, it does nothing on relation changed/relation broken events.

The solution here is:

  • add tests that demonstrate the prometheus_remote_write settings are written to config for both leader and non-leader units
  • bump the grafana-cloud-integrator lib to fix the issue

Context

Testing Instructions

juju add-model ga1
juju deploy ubuntu -n 2
juju deploy grafana-agent --channel=latest/edge
juju deploy grafana-cloud-integrator --channel latest/edge --config prometheus-url=http://some.domain.name:9090/api/v1/write
juju relate grafana-agent ubuntu
juju relate grafana-agent:grafana-cloud-config grafana-cloud-integrator:grafana-cloud-config

# Then compare the following:
juju ssh grafana-agent/0 -- cat /etc/grafana-agent.yaml | yq .integrations.prometheus_remote_write
juju ssh grafana-agent/1 -- cat /etc/grafana-agent.yaml | yq .integrations.prometheus_remote_write

where we should see both units having the same data:

- basic_auth:
    password: ''
    username: ''
  tls_config:
    insecure_skip_verify: false
  url: http://some.domain.name:9090/api/v1/write

Upgrade Notes

tests/unit/test_cloud_integration.py Outdated Show resolved Hide resolved
tests/unit/test_cloud_integration.py Outdated Show resolved Hide resolved
from ops.testing import Harness


class TestUpdateStatus(unittest.TestCase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how much of this test should reside in canonical/grafana-cloud-integrator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only part that's testable in grafana-cloud-integrator is added already here. The rest of this test is grafana-agent business logic (confirm the settings appear in the place they need to be). Although tbh this unit test is a pretty large unit test, more an integration test that just doesn't need juju...

…orrect

As shown in [this issue](canonical/grafana-cloud-integrator#12), we previously did not have test coverage to assert that prometheus_remote_write config was written successfully on non-leader units.  The current commit adds a test to assert these configs are available for both leader and non-leader units.
@ca-scribner ca-scribner merged commit 66c41b1 into main Aug 19, 2024
13 checks passed
@ca-scribner ca-scribner deleted the openg-2648 branch August 19, 2024 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prometheus remote-write URL is only getting pushed to the leader unit
3 participants