Prometheus Endpoint /cluster/stats problems #3910

Open
briend opened this issue Jan 17, 2025 · 0 comments

Currently the Teraslice Helm chart creates a Prometheus Service Monitor that scrapes two endpoints for metrics. The metrics from /cluster/stats are now redundant (teraslice_slices_processed and others are provided by the new metrics provider), so this extra endpoint should be removed:

    - path: /cluster/stats
      port: api
      interval: {{ .Values.serviceMonitor.interval }}
      metricRelabelings:
        {{- toYaml .Values.serviceMonitor.metricRelabelings | nindent 8 }}
      relabelings:
        {{- toYaml .Values.serviceMonitor.relabelings | nindent 8 }}

Also, the logic for that Service Monitor template could be cleaned up. For instance, it doesn't make sense to allow it to be created with an empty endpoints array, so maybe the whole file should be wrapped with:

{{- if and .Values.serviceMonitor.enabled .Values.terafoundation.prom_metrics_enabled }}
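A minimal sketch of what the wrapped template could look like, assuming the `/cluster/stats` endpoint has been dropped and only the new metrics provider is scraped. The resource name, selector labels, and the `metrics` port and `/metrics` path below are placeholders for illustration, not values taken from the actual chart:

```yaml
{{- if and .Values.serviceMonitor.enabled .Values.terafoundation.prom_metrics_enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # Hypothetical name; the real chart likely uses its own naming helper.
  name: {{ .Release.Name }}-teraslice
spec:
  selector:
    matchLabels:
      # Hypothetical label; must match the labels on the Teraslice Service.
      app.kubernetes.io/instance: {{ .Release.Name }}
  endpoints:
    # Assumed port name and path for the new metrics provider.
    - path: /metrics
      port: metrics
      interval: {{ .Values.serviceMonitor.interval }}
      metricRelabelings:
        {{- toYaml .Values.serviceMonitor.metricRelabelings | nindent 8 }}
      relabelings:
        {{- toYaml .Values.serviceMonitor.relabelings | nindent 8 }}
{{- end }}
```

With the condition around the whole file, the Service Monitor is rendered only when both values are true, so it can never be created with an empty endpoints array.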

The other problem with the /cluster/stats endpoint is its response headers. Normally it returns JSON, but it returns Prometheus-formatted metrics when sent the correct request headers. However, it serves them with Content-Type: text/html, which Prometheus 3.0 does not accept as a valid scrape content type: https://prometheus.io/docs/prometheus/3.0/migration/#scrape-protocols

Adding spec.fallbackScrapeProtocol: PrometheusText1.0.0 to this Service Monitor object would avoid scrape errors if, for some reason, we still want to scrape it. Since we don't need the Prometheus-style metrics from /cluster/stats anymore, maybe that functionality can simply be removed.
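If the endpoint is kept, the workaround might look roughly like the manifest below. This is a sketch only: it assumes a Prometheus Operator release recent enough to support the `fallbackScrapeProtocol` field, and the resource name and selector label are placeholders rather than values from the chart:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: teraslice            # hypothetical name
spec:
  # Tells Prometheus which format to assume when the target's
  # Content-Type is missing or not a recognized scrape format
  # (here, text/html).
  fallbackScrapeProtocol: PrometheusText1.0.0
  selector:
    matchLabels:
      app: teraslice         # hypothetical label
  endpoints:
    - path: /cluster/stats
      port: api
```

This only hides the mismatch on the scraper side; the endpoint would still be sending text/html for a plain-text metrics payload.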

Here's what those /cluster/stats metrics look like:

curl -Ss --header "Accept: application/openmetrics-text;" teraslice/cluster/stats -v

> GET /cluster/stats HTTP/1.1
> Accept: application/openmetrics-text;
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=utf-8

# TYPE teraslice_slices_processed counter
teraslice_slices_processed{cluster="teraslice"} 1004456
# TYPE teraslice_slices_failed counter
teraslice_slices_failed{cluster="teraslice"} 24
# TYPE teraslice_slices_queued counter
teraslice_slices_queued{cluster="teraslice"} 16
# TYPE teraslice_workers_joined counter
teraslice_workers_joined{cluster="teraslice"} 12
# TYPE teraslice_workers_disconnected counter
teraslice_workers_disconnected{cluster="teraslice"} 0
# TYPE teraslice_workers_reconnected counter
teraslice_workers_reconnected{cluster="teraslice"} 0
