NewRelic Prometheus Configurator Not Assigning Correct Metadata Type By Reading The Metrics File #235

Open
quayly opened this issue Jun 14, 2023 · 3 comments
Labels
bug Categorizes issue or PR as related to a bug. triage/in-progress Issue or PR is in the process of being triaged.

Comments

quayly commented Jun 14, 2023

Description

According to https://docs.newrelic.com/docs/infrastructure/prometheus-integrations/install-configure-prometheus-agent/migration-guide/#metric-types , any metric whose name does not end in total, count, sum, or bucket defaults to gauge. We have more than 300 metrics from various applications that should be tagged as counter instead of gauge.

Expected Behavior

The NewRelic Prometheus Configurator scrapes all pods annotated with prometheus.io/scrape: "true" and collects their metrics. It should read the metrics file and use the "# TYPE <metric_name> <metric_type>" line to determine the correct metadata type, so that it assigns the correct type to each metric. This is what nri-prometheus does! Instead, the NewRelic Prometheus Configurator by default converts any metric whose suffix is not total, count, sum, or bucket to gauge.

Please see https://docs.newrelic.com/docs/infrastructure/prometheus-integrations/install-configure-prometheus-agent/migration-guide/#metric-types for more info.

We have many metrics whose names do not end in total, count, sum, or bucket, and they are being marked as gauge instead of counter. This causes the data to be incorrect when it reaches NewRelic.
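To find the affected metrics without checking each one by hand, something like the following could compare the declared "# TYPE" against the suffix rule. This is only a rough sketch: the suffix list is taken from the description above, and the function names and the sample metrics (cache_hits, queue_depth) are hypothetical. It is not how the configurator or nri-prometheus works internally.

```python
import re
from typing import Dict, List

# Assumption: suffixes treated as counters, as described in this issue.
COUNTER_SUFFIXES = ("_total", "_count", "_sum", "_bucket")

# Matches Prometheus exposition lines such as "# TYPE cache_hits counter".
TYPE_LINE = re.compile(r"^#\s*TYPE\s+(\S+)\s+(\S+)\s*$")


def declared_types(exposition: str) -> Dict[str, str]:
    """Return {metric_name: declared_type} from the '# TYPE' lines."""
    types: Dict[str, str] = {}
    for line in exposition.splitlines():
        match = TYPE_LINE.match(line.strip())
        if match:
            types[match.group(1)] = match.group(2).lower()
    return types


def suffix_type(name: str) -> str:
    """Type a suffix-only rule would assign (hypothetical helper)."""
    return "counter" if name.endswith(COUNTER_SUFFIXES) else "gauge"


def misclassified_counters(exposition: str) -> List[str]:
    """Metrics declared as counter that the suffix rule would call gauge."""
    return sorted(
        name
        for name, declared in declared_types(exposition).items()
        if declared == "counter" and suffix_type(name) == "gauge"
    )


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# HELP cache_hits Number of cache hits.
# TYPE cache_hits counter
cache_hits 1027
# TYPE http_requests_total counter
http_requests_total 42
# TYPE queue_depth gauge
queue_depth 3
"""

if __name__ == "__main__":
    print(misclassified_counters(SAMPLE))
```

Running this against the output of each scraped endpoint would list the metric names that need an override.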

We have to manually look through all the metrics and verify whether each type is correct. If it is not, we need to override the metric type mapping. Instructions on how to update the mapping:
https://docs.newrelic.com/docs/infrastructure/prometheus-integrations/install-configure-remote-write/set-your-prometheus-remote-write-integration/#mapping

Issues with the provided solution are:

  • The override requires a URL. In a world of auto-discovery, we should not need to know the URL. In a company with many Prometheus metrics from many services, it would be extremely painful to update every discovered URL for each environment, each with its own configuration.
  • We need to go through every metric, check whether it should be gauge or counter, and create a unique override for each one.
  • Any newly discovered service would require repeating the two steps above.
@quayly quayly added bug Categorizes issue or PR as related to a bug. triage/pending Issue or PR is pending for triage and prioritization. labels Jun 14, 2023
@workato-integration

Jira CommentId: 218647
Commented by ekanner:

Just adding an update from the customer around our current proposed workaround, metric override rules:

"Here's my regex for the counter override so far:
regex: (.*)_(0rtt$|200$|abandoned$|added$|attempt$|broken$|cancelled$|change$|completed$|created$|destroy$|differs$|directly$|eject$|empty$|error$|errors$|exceeded$|exponential$|evictions$|fail$|failure$|failures$|fallback$|fields$|flood$|flood$|flushes$|frames$|healthy$|headers$|hits$|insertions$|invalid$|latency$|left$|local$|max_size$|merge$|miss$|misses$|mode$|modified$|notify$|ok$|out_of_merge_window$|overflow$|panic$|protos$|pushes$|ratelimited$|reached$|received$|rebuild$|removed$|rejected$|remote$|requests$|reset$|responses$|retry$|rq$|rtt$|sampled$|selected$|small$|stale$|stopped$|stream$|structures$|success$|successes$|terminations$|timeout$|trailers$|triggers$|underscores$|updated$|underscores$|wondow$|zone$)
so many different ones and still finding more... so painful"
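A hand-maintained alternation like the one above is easy to get wrong (duplicate entries, typos). One option, sketched here under assumptions, is to generate the pattern from a plain list of suffixes. The helper name build_suffix_regex is hypothetical and the short suffix list is only an excerpt for illustration. The result is checked with Python's re module, so it should be verified against the regex engine used by the override rules before relying on it.

```python
import re
from typing import Iterable


def build_suffix_regex(suffixes: Iterable[str]) -> str:
    """Build '(.*)_(a|b|...)$' from suffixes, dropping blanks and duplicates."""
    unique = sorted({s.strip() for s in suffixes if s.strip()})
    # re.escape guards against suffixes containing regex metacharacters.
    alternation = "|".join(re.escape(s) for s in unique)
    return rf"(.*)_({alternation})$"


if __name__ == "__main__":
    # Excerpt of suffixes from the regex above; "flood" is listed twice on purpose.
    pattern = build_suffix_regex(["hits", "flood", "flood", "max_size", "rq"])
    print(pattern)
    print(bool(re.fullmatch(pattern, "cache_hits")))
    print(bool(re.fullmatch(pattern, "cache_size")))
```

Keeping the suffixes in a list (or a file, one per line) makes new additions a one-line change and removes repeated entries automatically.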
 

@davidgit
Contributor

This bug depends on an open GitHub issue that would make the metadata field carry metric type information in the Prometheus remote write protocol.

@davidgit davidgit added triage/in-progress Issue or PR is in the process of being triaged. and removed triage/pending Issue or PR is pending for triage and prioritization. labels Sep 19, 2023