CP-23740 (Feature/1.0.3 beta release): Validate KSM Metrics at Install #115

Closed

bdrennz
Contributor

@bdrennz bdrennz commented Dec 3, 2024

This PR modifies the kube_state_metrics_reachable command so that it validates the full list of KSM metrics passed by the chart, rather than the hardcoded subset used during testing.

The base branch is feature/1.0.2-beta-release because these changes target the 1.0.3 beta.
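
For readers unfamiliar with this kind of check, here is a minimal sketch of how a list of required metric names could be compared against a KSM /metrics response in the Prometheus text format. It is an illustration only, not the code in this PR: the function name `missing_metrics` is hypothetical, and the six metric names are simply the ones that appear in the log output under Testing, so the list the chart actually passes may differ.

```python
# Hypothetical sketch: not the validator's actual implementation.
# These names are taken from the log output in this PR; the chart's
# real list may be longer or different.
REQUIRED_METRICS = [
    "kube_node_info",
    "kube_node_status_capacity",
    "kube_pod_container_resource_limits",
    "kube_pod_container_resource_requests",
    "kube_pod_labels",
    "kube_pod_info",
]


def missing_metrics(exposition_text, required):
    """Return the required metric names that have no sample line."""
    seen = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        parts = line.split("{", 1)[0].split()
        if parts:
            seen.add(parts[0])
    return [name for name in required if name not in seen]


if __name__ == "__main__":
    sample = (
        "# HELP kube_node_info Information about a cluster node.\n"
        "# TYPE kube_node_info gauge\n"
        'kube_node_info{node="node-a"} 1\n'
        'kube_pod_info{pod="pod-a",namespace="default"} 1\n'
    )
    print(missing_metrics(sample, REQUIRED_METRICS))
```

Matching on exact sample names, rather than a substring search over the whole body, avoids counting a metric as present when it is only mentioned in a HELP comment or is a prefix of a longer metric name.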

Testing
Below is the output of the CloudZero Agent Server post-start job. The logs list each metric as it is validated, and the run ends with a passing status for kube_state_metrics_reachable:

{
  "level": "info",
  "log_sequence": 1,
  "msg": "Using endpoint URL: http://cloudzero-state-metrics.prom-agent.svc.cluster.local:8080/metrics",
  "op": "ksm",
  "time": "2024-12-03T21:52:29Z"
}
{
  "level": "error",
  "log_sequence": 2,
  "msg": "Failed to fetch metrics on attempt 1: Get \"http://cloudzero-state-metrics.prom-agent.svc.cluster.local:8080/metrics\": dial tcp 10.100.123.115:8080: connect: connection refused",
  "op": "ksm",
  "time": "2024-12-03T21:52:29Z"
}
{
  "level": "info",
  "log_sequence": 3,
  "msg": "Found required metric kube_node_info on attempt 2",
  "op": "ksm",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 4,
  "msg": "Found required metric kube_node_status_capacity on attempt 2",
  "op": "ksm",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 5,
  "msg": "Found required metric kube_pod_container_resource_limits on attempt 2",
  "op": "ksm",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 6,
  "msg": "Found required metric kube_pod_container_resource_requests on attempt 2",
  "op": "ksm",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 7,
  "msg": "Found required metric kube_pod_labels on attempt 2",
  "op": "ksm",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 8,
  "msg": "Found required metric kube_pod_info on attempt 2",
  "op": "ksm",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 9,
  "msg": "reporting status",
  "report": "{
    ...
  {\"name\":\"prometheus_version\",\"passing\":true},{\"name\":\"kube_state_metrics_reachable\",\"passing\":true}]}",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 10,
  "msg": "marshalled cluster status: 5683 bytes",
  "time": "2024-12-03T21:52:39Z"
}
{
  "level": "info",
  "log_sequence": 11,
  "msg": "compressed size is: 5683 bytes",
  "time": "2024-12-03T21:52:39Z"
}
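
The log sequence above (a refused connection on attempt 1, then every metric found on attempt 2) follows a retry pattern that can be sketched roughly as below. This is a hedged illustration rather than the validator's real code: `check_with_retries`, `has_metric`, `max_attempts`, and `delay_seconds` are hypothetical names, and the 10-second default is only inferred from the gap between the log timestamps.

```python
import time


def has_metric(body, name):
    """True if the exposition body has a sample line for exactly `name`."""
    for line in body.splitlines():
        # The character after the name must start the labels or the value.
        if line.startswith(name) and line[len(name):len(name) + 1] in ("{", " "):
            return True
    return False


def check_with_retries(fetch, required, max_attempts=3, delay_seconds=10.0,
                       sleep=time.sleep):
    """Call fetch() until every required metric is found or attempts run out.

    `fetch` is any zero-argument callable returning the /metrics body as text.
    Returns (passing, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            body = fetch()
        except OSError as err:  # e.g. connection refused while KSM starts up
            print(f"Failed to fetch metrics on attempt {attempt}: {err}")
        else:
            missing = [m for m in required if not has_metric(body, m)]
            if not missing:
                return True, attempt
            print(f"Missing metrics on attempt {attempt}: {missing}")
        if attempt < max_attempts:
            sleep(delay_seconds)  # assumed fixed delay between attempts
    return False, max_attempts
```

Passing `fetch` and `sleep` in as parameters keeps the sketch testable without a live cluster; a real caller would supply a function that performs the HTTP GET against the KSM endpoint.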

Checklist

  • I have added documentation for new/changed functionality in this PR
  • All active GitHub checks for tests, formatting, and security are passing
  • The correct base branch is being used, if not main

@bdrennz bdrennz changed the title from "CP-23740: Feature/1.0.3 beta release" to "CP-23740 (Feature/1.0.3 beta release): Validate KSM Metrics at Install" Dec 3, 2024
@bdrennz bdrennz marked this pull request as ready for review December 4, 2024 18:01
@bdrennz bdrennz requested a review from a team as a code owner December 4, 2024 18:01
@bdrennz bdrennz closed this Dec 4, 2024
@bdrennz bdrennz force-pushed the feature/1.0.3-beta-release branch from 7695a65 to 10045f2 Compare December 4, 2024 18:32