Commit
Add optimized grafana dashboard (#1454)
* Add optimized cluster overview dashboard

  This optimized dashboard mainly lowers the cardinality of the CPU metrics. Specifically, instead of using `avg(rate(node_cpu_seconds_total`, which has a cardinality equal to the total number of CPUs across all managed clusters, we use `cluster:node_cpu:ratio`, which has a cardinality of 1 per cluster.

  That is, with 100 clusters of 16 CPUs each, the cardinality before was 100*16 = 1600, whereas with this change we only fetch 100 series. This should scale quite a bit better on larger installations with many clusters/nodes.

  Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Grafana: Use wildcard for all on cluster overview

  Instead of listing all clusters manually in the query, e.g.:

  ```
  cluster=~"(local-cluster|simulated-managed-cluster-1|simulated-managed-cluster-1-1|simulated-managed-cluster-1-10|simulated-managed-cluster-1-2|simulated-managed-cluster-1-3..."
  ```

  we set it to `".+"`, simplifying the query significantly.

  Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Tests: Add basic test for dashboard existence

  A quick test that checks whether the dashboards exist.

  Signed-off-by: Jacob Baungard Hansen <[email protected]>

* Kind test: Actually use CI built MCO image in test

  While the auxiliary images (endpoint-monitoring-operator, etc.) correctly used the CI built images in kind, this was not the case for MCO itself. In this commit we make sure to load the `IMAGE REF` from the kind env file, so that the CI image for MCO is used as well.

  Signed-off-by: Jacob Baungard Hansen <[email protected]>

---------

Signed-off-by: Jacob Baungard Hansen <[email protected]>
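
As a rough illustration of the cardinality arithmetic in the first change, here is a minimal Python sketch. The function names are hypothetical and not part of this repository. It follows the commit message's simplification of one series per CPU, so it ignores any further labels on `node_cpu_seconds_total` (such as `mode`) that would raise the real count.

```python
def series_before(clusters: int, cpus_per_cluster: int) -> int:
    """Series fetched when querying a per-CPU metric.

    Assumption (taken from the commit message): one series per CPU
    per cluster, with every cluster having the same CPU count.
    """
    return clusters * cpus_per_cluster


def series_after(clusters: int) -> int:
    """Series fetched when querying a per-cluster aggregate.

    Assumption: exactly one series per cluster, as stated for
    `cluster:node_cpu:ratio` in the commit message.
    """
    return clusters


if __name__ == "__main__":
    clusters, cpus = 100, 16
    before = series_before(clusters, cpus)
    after = series_after(clusters)
    # Prints: before=1600 after=100 reduction=16x
    print(f"before={before} after={after} reduction={before // after}x")
```

Under these assumptions the reduction factor equals the number of CPUs per cluster, so the saving grows with node size as well as with cluster count.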