in_node_exporter_metrics: broken timestamp for node_systemd_system_running #7621

Closed
positron96 opened this issue Jun 29, 2023 · 8 comments

@positron96

I am using fluent-bit 2.1.6 on Linux (Raspberry Pi, Raspbian 11) and have configured the node_exporter_metrics input with mostly default settings:

pipeline:
  inputs:
    - name: node_exporter_metrics
      tag: metrics_node
      scrape_interval: 60
  outputs:
    - name: stdout
      match: "metrics*"
      format: json_lines
The metrics look good, except that node_systemd_system_running has a zero timestamp. Here is the output from stdout (note the last line):

2023-06-29T11:57:12.242099089Z node_boot_time_seconds = 1688023927
2023-06-29T11:57:12.242099089Z node_procs_running = 3
2023-06-29T11:57:12.242099089Z node_procs_blocked = 0
2023-06-29T11:57:12.242278326Z node_time_seconds = 1688039832.2422783
2023-06-29T11:57:12.242409602Z node_load1 = 0.17000000000000001
2023-06-29T11:57:12.242409602Z node_load5 = 0.17000000000000001
2023-06-29T11:57:12.242409602Z node_load15 = 0.16
2023-06-29T11:57:12.244408198Z node_filefd_allocated = 3968
2023-06-29T11:57:12.244408198Z node_filefd_maximum = 2147483647
1970-01-01T00:00:00.000000000Z node_systemd_system_running = 0

As a consequence, the prometheus_remote_write output (not shown in the config above) is rejected by the server because the timestamp is too old:

 [output:prometheus_remote_write:prometheus_remote_write.3] prometheus-prod-22-prod-eu-west-3.grafana.net:443, HTTP status=400
failed pushing to ingester: user=XXX: the sample has been rejected because its timestamp is too old (err-mimir-sample-timestamp-too-old). The affected sample has timestamp 1970-01-01T00:00:00Z and is from series {__name__="node_systemd_system_running"}
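
For context, the prometheus_remote_write output pointing at such a Grafana Cloud endpoint would typically look like the sketch below; this is added for illustration only, the host is taken from the error message above, and the uri and credentials are placeholders rather than real values:

  outputs:
    - name: prometheus_remote_write
      match: "metrics*"
      host: prometheus-prod-22-prod-eu-west-3.grafana.net
      port: 443
      # uri and credentials below are placeholders, not actual values
      uri: /api/prom/push
      http_user: <grafana-cloud-instance-id>
      http_passwd: <grafana-cloud-api-key>
      tls: on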
@helmut72

helmut72 commented Jul 26, 2023

I have the same problem, but with a self-hosted Grafana Mimir installation:

[2023/07/26 20:09:41] [error] [output:prometheus_remote_write:prometheus_remote_write.0] 192.168.0.1:9009, HTTP status=400
failed pushing to ingester: user=anonymous: the sample has been rejected because another sample with a more recent timestamp has already been ingested and out-of-order samples are not allowed (err-mimir-sample-out-of-order). The affected sample has timestamp 2023-07-26T18:09:40.264Z and is from series {__name__="go_memstats_alloc_bytes_total"}

It is two hours later here (20:09 = 8:09 pm), not 18:09 (= 6:09 pm).

@positron96
Author

@helmut72 to me this seems like a different issue, but I'm not sure.

@helmut72

Maybe you are right, but I get the message even after I delete my metrics data in Mimir and start from scratch. Samples can't be out of order if no data exists yet. But I think I should open a separate issue.

@helmut72

You were right:
#7763 (comment)

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label on Dec 11, 2023
@github-actions

This issue was closed because it has been stalled for 5 days with no activity.

@cianbe

cianbe commented Jan 19, 2024

I know this issue is closed; could it be re-opened, or should we raise a new one?

We're experiencing a similar issue. We included systemd in the node_exporter_metrics [INPUT] section of fluent-bit.conf (a config sketch follows this comment) to capture systemd unit states in Grafana. When we checked stdout and the logs, we also saw a zero timestamp for node_systemd_system_running:
1970-01-01T00:00:00.000000000Z node_systemd_system_running = 0
In addition to this, metrics stopped flowing into Grafana.
We upgraded to fluent-bit 2.2.2 expecting a fix, since this looked similar to #8368, but the issue still persists.
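
For anyone comparing setups, a classic-format [INPUT] section that enables the systemd collector might look roughly like the sketch below; the metrics option name and the collector list are assumptions based on the node_exporter_metrics documentation, not a copy of the actual fluent-bit.conf:

[INPUT]
    name            node_exporter_metrics
    tag             metrics_node
    scrape_interval 60
    # assumed collector list for illustration: systemd added on top of a common default set
    metrics         cpu,meminfo,diskstats,filesystem,uptime,loadavg,systemd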

@cosmo0920
Contributor

cosmo0920 commented Jan 19, 2024

Hi!! We have a patch to mitigate this issue in cmetrics: fluent/cmetrics@57b5c9c
We need to tag and release the new version of cmetrics and then bundle it in fluent-bit.
