Feature Request: Introduce a Persistent HTTP Probe Status Counter Metric #1367

Closed
Mistral-valaise opened this issue Feb 16, 2025 · 3 comments

Comments

@Mistral-valaise

Description

I would like to propose adding a new persistent counter metric that accumulates the HTTP status codes returned by the HTTP probe. Currently, the Blackbox Exporter exposes a gauge metric (probe_http_status_code) representing the last HTTP status code received during a probe. However, for many use cases (e.g., calculating cumulative success rates or error counts over time), it is desirable to have a monotonically increasing counter for HTTP status codes.
This would address the first part of "SLI/SLO friendly metrics" (#925), which notes that SLIs for success rates are built with counters.

Motivation & Goals
Improved Observability: With a cumulative counter (e.g., probe_http_status_counter_total with labels for target and status code), it becomes easier to calculate rates (using functions like rate() or increase()) over custom time windows in Prometheus.
Consistency with Prometheus Counters: Counters are generally used for event counts, so a counter metric for HTTP statuses would be consistent with best practices.

Proposed Implementation

Labels: The global counter would carry two labels:
•	target – representing the endpoint or target being probed.
•	status – representing the HTTP status code (as a string).

Modification in ProbeHTTP: In the ProbeHTTP function (in prober/http.go), remove the local instantiation of a counter and, instead, increment the global counter each time a probe is executed:
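
A minimal sketch of that idea, using only the Go standard library as a stand-in for a real labeled counter. The names `statusCounter`, `probeHTTPStatusTotal`, and `recordProbe` are hypothetical and do not exist in the exporter; an actual change would presumably use a `CounterVec` from the Prometheus Go client instead of the map below.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// statusKey identifies one series by the two proposed labels.
type statusKey struct {
	target string
	status string
}

// statusCounter is a hypothetical, stdlib-only stand-in for a labeled
// counter. It is not the Prometheus client API.
type statusCounter struct {
	mu     sync.Mutex
	counts map[statusKey]uint64
}

func newStatusCounter() *statusCounter {
	return &statusCounter{counts: make(map[statusKey]uint64)}
}

// inc adds one to the series for (target, status code).
func (c *statusCounter) inc(target string, code int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[statusKey{target, strconv.Itoa(code)}]++
}

// value returns the current count for (target, status code).
func (c *statusCounter) value(target string, code int) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[statusKey{target, strconv.Itoa(code)}]
}

// probeHTTPStatusTotal lives at package level, so it keeps its value across
// probes for as long as the process runs. It starts from zero after a restart.
var probeHTTPStatusTotal = newStatusCounter()

// recordProbe is a hypothetical hook called once per finished HTTP probe.
func recordProbe(target string, code int) {
	probeHTTPStatusTotal.inc(target, code)
}

func main() {
	recordProbe("https://example.com", 200)
	recordProbe("https://example.com", 200)
	recordProbe("https://example.com", 503)
	fmt.Println(probeHTTPStatusTotal.value("https://example.com", 200)) // prints 2
	fmt.Println(probeHTTPStatusTotal.value("https://example.com", 503)) // prints 1
}
```

Note that a counter like this keeps state inside the exporter process and creates one series per distinct target and status pair.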

Exposure: Ensure that the global counter is registered during package initialization (for example, in a new file such as global_metrics.go), so that it persists across multiple probe executions and is exposed on the /metrics endpoint.

Additional Considerations:
• The new metric should be documented and added to the exporter’s README and any relevant configuration examples.

Conclusion:
This change would enhance the observability of HTTP probe results by providing a cumulative counter for HTTP status codes. I believe this would be a valuable addition for users who rely on long-term metrics and SLO calculations.

I am happy to work on a pull request to implement this feature if there is interest.

@SuperQ
Member

SuperQ commented Feb 16, 2025

This is not necessary, as each probe_http_status_code sample is recorded in Prometheus. So you can use functions like count_over_time() to get the results you need.

@Mistral-valaise
Author

Mistral-valaise commented Feb 16, 2025

Thank you for your suggestion, @SuperQ.
I appreciate your input on using functions like sum_over_time or count_over_time. However, here is a summary of why that approach doesn't meet our needs:

Gauges vs. Counters: The current HTTP status metric is a gauge that reflects the most recent status code rather than accumulating each probe event. Using range vector functions on a gauge only aggregates data over a fixed time window instead of maintaining a persistent, cumulative count.

Limited Time Window: Functions like sum_over_time only work within the specified range (e.g., the last minute), meaning past data is not retained once the window slides. This approach does not provide the long-term accumulation required for calculating accurate rates.

State Persistence: A true persistent counter needs to increment with every probe event and “remember” previous increments across scrapes. This cannot be achieved by merely applying aggregation functions to a gauge that resets or updates each scrape.

In summary, to obtain a persistent HTTP status counter, the exporter must emit a genuine counter metric that increments with every probe rather than relying on range vector functions applied to a gauge. Thank you again for your suggestion, and I hope this clarifies the limitations of the proposed approach.

@SuperQ closed this as not planned Feb 16, 2025
@SuperQ
Member

SuperQ commented Feb 16, 2025

Prometheus is the system that "remembers". The exporter is explicitly stateless.

I suggest you read about recording rules.
