Expose function controller readiness metric for prometheus-based monitoring #869

kwiatekus · 2024-04-09T09:23:45Z

Description

Introduce a new metric specifically designed to reflect the readiness status of the serverless function controller.

AC:

It should indicate whether the function controller's main reconciliation loop is ready to serve requests (that is, whether the queue is being served).
The metric update frequency should be independent of the Kubernetes probe frequency configuration (i.e., a separate goroutine with its own ticker).
Frequent probing should not degrade function-controller performance; a probe should add an event for the function controller, which serves it with a fast exit. (We have this already: health probing enters the reconciliation loop.)
No user misconfiguration (i.e., an invalid function CR or function code) should affect the metric (and disrupt the SLO budget).
The metric should be observable over time (via PromQL) so that an observer can model alerting rules based on aggregated time series.

The above criteria cover the basic availability indication.
Consider an additional availability indicator for serverless that could be used to inspect whether every requested function CR was "attempted to be built" and whether those that were successfully built were "attempted to be deployed".

Reasons
Ensure the SLO is observable for serverless.
Enable administrators to set up alerting and monitoring based on function controller readiness.

Attachments

kwiatekus · 2024-04-10T10:37:31Z

@ebensom Could the avs rules be configured to ping the /readyz endpoint of the function controller (just as it was calling the webhook before)?

kwiatekus added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 9, 2024

kwiatekus changed the title ~~Expose Function Controller Readiness Metric for Prometheus-based Monitoring~~ Expose function controller readiness metric for prometheus-based monitoring Apr 9, 2024
