in_kafka plugin stopped counting fluentbit_input_records_total #8212

Closed
ab0oo opened this issue Nov 24, 2023 · 6 comments

Comments

@ab0oo

ab0oo commented Nov 24, 2023

Bug Report

Describe the bug
In the most recent version of Fluent Bit (2.2.0), we are seeing all zeros for input_records_total in both the Prometheus and the API outputs.

To Reproduce
Run Fluent Bit with one or more Kafka inputs, then poll the Prometheus metrics server for input statistics:

$ curl -s http://fluentbit-1.example.com:9101/metrics | grep -E "_input_.*_total" | grep -v "^#"
fluentbit_input_bytes_total{name="kafka.jobhistevent"} 6186154979
fluentbit_input_records_total{name="kafka.jobhistevent"} 0
fluentbit_input_bytes_total{name="kafka.adminevent"} 13324466
fluentbit_input_records_total{name="kafka.adminevent"} 0
fluentbit_input_bytes_total{name="kafka.statsevent"} 2322010006
fluentbit_input_records_total{name="kafka.statsevent"} 0
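The symptom above (bytes counted, records stuck at zero) can be checked automatically. The following is a minimal Python sketch, not part of Fluent Bit: the function name `find_uncounted_inputs` is hypothetical, and it assumes each metric line carries only a `name` label, as in the output shown here.

```python
import re

# Matches lines such as: fluentbit_input_records_total{name="kafka.adminevent"} 0
# Assumption: the only label is `name`; an optional trailing timestamp is tolerated.
METRIC_LINE = re.compile(
    r'^fluentbit_input_(bytes|records)_total\{name="([^"]+)"\}\s+(\S+)(?:\s+\d+)?\s*$'
)


def find_uncounted_inputs(metrics_text):
    """Return the sorted names of inputs that report bytes but zero records."""
    totals = {}  # input name -> {"bytes": value, "records": value}
    for line in metrics_text.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines, and unrelated metrics
        kind, name, value = match.groups()
        totals.setdefault(name, {})[kind] = float(value)
    return sorted(
        name
        for name, counts in totals.items()
        if counts.get("bytes", 0) > 0 and counts.get("records", 0) == 0
    )


# Sample text copied from the curl output in this report.
SAMPLE = """\
fluentbit_input_bytes_total{name="kafka.jobhistevent"} 6186154979
fluentbit_input_records_total{name="kafka.jobhistevent"} 0
fluentbit_input_bytes_total{name="kafka.adminevent"} 13324466
fluentbit_input_records_total{name="kafka.adminevent"} 0
fluentbit_input_bytes_total{name="kafka.statsevent"} 2322010006
fluentbit_input_records_total{name="kafka.statsevent"} 0
"""

if __name__ == "__main__":
    print(find_uncounted_inputs(SAMPLE))
    # prints ['kafka.adminevent', 'kafka.jobhistevent', 'kafka.statsevent']
```

Feeding it the body returned by the metrics endpoint instead of `SAMPLE` would list every input affected by the problem described here.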

Expected behavior
Prior to version 2.2.0, the _records_total counters indicated the correct number of records processed by each input.

Your Environment

  • Version used: 2.2.0
  • Configuration:
  • Environment name and version (e.g. Kubernetes? What version?):
    Custom compiled code on RHEL8
  • Server type and version:
    n/a
  • Operating System and version:
    Linux RHEL8.6, kernel 4.18.0-custom
  • Filters and plugins:
    kafka inputs
    HTTP outputs (we are flushing all data to BQL via HTTPS)

Additional context
We currently use the record counts to verify that input and output numbers match, to ensure no messages are dropped along the complex, convoluted chain from application to BQL.

@lecaros
Contributor

lecaros commented Nov 24, 2023

@MrPibody7

@nokute78
Collaborator

I think #8182 and #8184 are similar issues.

@MrPibody7
Collaborator

MrPibody7 commented Nov 28, 2023

Hi @ab0oo,
I tried to reproduce the behavior on Debian Bullseye, using one Kafka input and HTTPS as the output, feeding one topic with 100,000 messages. The input_records_total metric seems to be counting OK.
I'm querying metrics using: /api/v1/metrics/prometheus

Would you share your fluent-bit.conf?

@ab0oo
Author

ab0oo commented Nov 29, 2023

> Would you share your fluent-bit.conf?

Give me a day or two to get it sanitized and approved for release.

Here's a snippet from the /api/v1/metrics JSON (since it's easier to grok when run through jq):

{
  "input": {
    "fluentbit_metrics": {
      "records": 0,
      "bytes": 0
    },
    "kafka.1": {
      "records": 0,
      "bytes": 15599641394
    },
    "kafka.2": {
      "records": 0,
      "bytes": 6813549
    },
    "kafka.3": {
      "records": 0,
      "bytes": 51819295
    },
    "kafka.4": {
      "records": 0,
      "bytes": 781991334
    },
    "kafka.5": {
      "records": 0,
      "bytes": 34224842099
    },
    "kafka.plexserverstats": {
      "records": 0,
      "bytes": 0
    }
  },

Interestingly, on the output side, there are correct record counts:

    "bql.1": {
      "proc_records": 9328826,
      "proc_bytes": 17277138663,
      "errors": 0,
      "retries": 4,
      "retries_failed": 0,
      "dropped_records": 0,
      "retried_records": 564
    },
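The mismatch between input and output counts can be turned into a simple reconciliation check. Below is a minimal Python sketch, assuming the /api/v1/metrics JSON has top-level "input" and "output" objects (only "input" is visible in the snippet above, so "output" is an assumption) and that every record is routed to exactly one output with no filter adding or dropping records. The function name `reconcile_records` is hypothetical.

```python
import json


def reconcile_records(metrics):
    """Compare total input records with total records processed by outputs."""
    inputs = metrics.get("input", {})
    outputs = metrics.get("output", {})  # assumed top-level key name
    records_in = sum(plugin.get("records", 0) for plugin in inputs.values())
    records_out = sum(plugin.get("proc_records", 0) for plugin in outputs.values())
    return {
        "records_in": records_in,
        "records_out": records_out,
        "difference": records_out - records_in,
    }


# Illustrative subset of the values quoted in this comment (one input, one output).
SAMPLE_JSON = """
{
  "input": {
    "kafka.1": {"records": 0, "bytes": 15599641394}
  },
  "output": {
    "bql.1": {"proc_records": 9328826, "proc_bytes": 17277138663}
  }
}
"""

if __name__ == "__main__":
    print(reconcile_records(json.loads(SAMPLE_JSON)))
    # prints {'records_in': 0, 'records_out': 9328826, 'difference': 9328826}
```

A nonzero difference is only a hint: retries, multiple outputs matching the same tag, or filters that change the record count would also make the totals diverge.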

In my case, I'm pulling prometheus counters via the prometheus_exporter, using this pipeline:

pipeline:
    inputs:
        - name: fluentbit_metrics
          alias: fluentbit_metrics
          scrape_interval: 15
          tag: fluentbit_metrics

    outputs:
        - name: prometheus_exporter
          alias: out.fluentbit_metrics
          match: fluentbit_metrics
          host: 0.0.0.0
          port: 9101

and then querying http://fluentbit-1.example.com:9101/metrics

@nokute78
Collaborator

nokute78 commented Dec 3, 2023

The similar issue #8182 seems to be fixed by #8223.
Could you test using current master?

@ab0oo
Author

ab0oo commented Dec 4, 2023

I pulled master at 323f343, which is current as of this writing, and can confirm that both input record and byte counts work for in_kafka:
  "input": {
    "fluentbit_metrics": {
      "records": 0,
      "bytes": 0
    },
    "kafka.1": {
      "records": 7565,
      "bytes": 12994249
    },
    "kafka.2": {
      "records": 112,
      "bytes": 51379
    },
    "kafka.3": {
      "records": 34,
      "bytes": 272719
    },
    "kafka.4": {
      "records": 562,
      "bytes": 1089744
    },
    "kafka.5": {
      "records": 15306,
      "bytes": 62465348
    },
    "kafka.6": {
      "records": 0,
      "bytes": 0
    }
  },

Assuming this change makes it to the next release, I'm happy as a clam. Thank you!

@ab0oo ab0oo closed this as completed Dec 4, 2023