in_node_exporter_metrics: implement processes metrics #7880

Merged

Contributor

@cosmo0920 cosmo0920 commented Aug 30, 2023

To monitor the status of processes and threads, we need to implement processes metrics in the in_node_exporter_metrics plugin.

Fixes #7866


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
$ bin/fluent-bit -i node_exporter_metrics -p metrics=processes -o stdout
  • Debug log output from testing the change
Fluent Bit v2.1.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/08/31 11:02:00] [ info] Configuration:
[2023/08/31 11:02:00] [ info]  flush time     | 1.000000 seconds
[2023/08/31 11:02:00] [ info]  grace          | 5 seconds
[2023/08/31 11:02:00] [ info]  daemon         | 0
[2023/08/31 11:02:00] [ info] ___________
[2023/08/31 11:02:00] [ info]  inputs:
[2023/08/31 11:02:00] [ info]      node_exporter_metrics
[2023/08/31 11:02:00] [ info] ___________
[2023/08/31 11:02:00] [ info]  filters:
[2023/08/31 11:02:00] [ info] ___________
[2023/08/31 11:02:00] [ info]  outputs:
[2023/08/31 11:02:00] [ info]      stdout.0
[2023/08/31 11:02:00] [ info] ___________
[2023/08/31 11:02:00] [ info]  collectors:
[2023/08/31 11:02:00] [ info] [fluent bit] version=2.1.9, commit=eb36c698a2, pid=606478
[2023/08/31 11:02:00] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2023/08/31 11:02:00] [ info] [storage] ver=1.1.6, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/08/31 11:02:00] [ info] [cmetrics] version=0.6.3
[2023/08/31 11:02:01] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.procfs = /proc
[2023/08/31 11:02:00] [ info] [ctraces ] version=0.3.1
[2023/08/31 11:02:01] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.sysfs  = /sys
[2023/08/31 11:02:01] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] initializing
[2023/08/31 11:02:01] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] enabled metrics processes
[2023/08/31 11:02:01] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] storage_strategy='memory' (memory only)
[2023/08/31 11:02:01] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] [thread init] initialization OK
[2023/08/31 11:02:01] [ info] [output:stdout:stdout.0] worker #0 started
[2023/08/31 11:02:01] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] thread instance initialized
[2023/08/31 11:02:01] [debug] [node_exporter_metrics:node_exporter_metrics.0] created event channels: read=30 write=31
[2023/08/31 11:02:01] [debug] [stdout:stdout.0] created event channels: read=34 write=35
[2023/08/31 11:02:01] [ info] [sp] stream processor started
[2023/08/31 11:02:06] [debug] [input chunk] update output instances with new chunk size diff=1538, records=0, input=node_exporter_metrics.0
[2023/08/31 11:02:06] [debug] [task] created task=0x8447660 id=0 OK
2023-08-31T02:02:05.589311697Z node_processes_threads = 1684
2023-08-31T02:02:05.589311697Z node_processes_max_threads = 513122
2023-08-31T02:02:05.589311697Z node_processes_threads_state{thread_state="R"} = 1
2023-08-31T02:02:05.589311697Z node_processes_threads_state{thread_state="S"} = 1573
2023-08-31T02:02:05.589311697Z node_processes_threads_state{thread_state="D"} = 0
2023-08-31T02:02:05.589311697Z node_processes_threads_state{thread_state="Z"} = 3
2023-08-31T02:02:05.589311697Z node_processes_threads_state{thread_state="T"} = 0
2023-08-31T02:02:05.589311697Z node_processes_threads_state{thread_state="I"} = 107
2023-08-31T02:02:05.589311697Z node_processes_state{state="R"} = 0
2023-08-31T02:02:05.589311697Z node_processes_state{state="S"} = 369
2023-08-31T02:02:05.589311697Z node_processes_state{state="D"} = 0
2023-08-31T02:02:05.589311697Z node_processes_state{state="Z"} = 3
2023-08-31T02:02:05.589311697Z node_processes_state{state="T"} = 0
2023-08-31T02:02:05.589311697Z node_processes_state{state="I"} = 107
2023-08-31T02:02:05.589311697Z node_processes_pids = 479
2023-08-31T02:02:05.589311697Z node_processes_max_processeses = 4194304
[2023/08/31 11:02:06] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2023/08/31 11:02:06] [debug] [out flush] cb_destroy coro_id=0
[2023/08/31 11:02:06] [debug] [task] destroy task=0x8447660 (task_id=0)
^C[2023/08/31 11:02:07] [engine] caught signal (SIGINT)
[2023/08/31 11:02:07] [ warn] [engine] service will shutdown in max 5 seconds
[2023/08/31 11:02:07] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] thread pause instance
[2023/08/31 11:02:07] [ info] [engine] service has stopped (0 pending tasks)
[2023/08/31 11:02:07] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] thread pause instance
[2023/08/31 11:02:07] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/08/31 11:02:07] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] thread exit instance
[2023/08/31 11:02:07] [ info] [output:stdout:stdout.0] thread worker #0 stopped
  • Attached Valgrind output that shows no leaks or memory corruption was found
==606478== 
==606478== HEAP SUMMARY:
==606478==     in use at exit: 0 bytes in 0 blocks
==606478==   total heap usage: 188,120 allocs, 188,120 frees, 23,613,784 bytes allocated
==606478== 
==606478== All heap blocks were freed -- no leaks are possible
==606478== 
==606478== For lists of detected and suppressed errors, rerun with: -s
==606478== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
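
The state and thread_state labels in the log above (R, S, D, Z, T, I) are the single-letter state codes that Linux reports in /proc/<pid>/stat. As a rough illustration of how such a code can be pulled out of one stat line and tallied, here is a minimal sketch. The helper names stat_line_state and count_state are hypothetical and are not taken from this PR's source; the plugin's real parsing may differ.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the plugin source): return the
 * single-letter state found in one line of /proc/<pid>/stat, or '\0'
 * when the line is malformed.
 *
 * A line looks like "606478 (fluent-bit) S 1234 ...". The command name
 * may contain spaces or parentheses, so the state is located relative
 * to the LAST ')' instead of by splitting on spaces.
 */
static char stat_line_state(const char *line)
{
    const char *close;

    if (line == NULL) {
        return '\0';
    }

    close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0') {
        return '\0';
    }

    return close[2];
}

/* The six state codes that appear in the log output above. */
static const char state_codes[] = "RSDZTI";

/*
 * Hypothetical helper: increment the counter slot that matches `state`.
 * Returns 0 on success and -1 when the state is not one of the six
 * codes listed in state_codes.
 */
static int count_state(char state, unsigned long counts[6])
{
    const char *pos;

    if (state == '\0') {
        return -1;
    }

    pos = strchr(state_codes, state);
    if (pos == NULL) {
        return -1;
    }

    counts[pos - state_codes]++;
    return 0;
}
```

A caller would read each /proc/<pid>/stat file (and, for threads, each /proc/<pid>/task/<tid>/stat file), pass the line to stat_line_state, and feed the result to count_state. Searching for the last ')' is the important design choice here, because a process can put arbitrary characters in its own name.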

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

fluent/fluent-bit-docs#1184

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@cosmo0920 cosmo0920 temporarily deployed to pr August 30, 2023 09:16 — with GitHub Actions Inactive
@cosmo0920
Contributor Author

Note: This PR might cause a merge conflict with #7876, so I'm keeping it marked as a draft for now.

@cosmo0920 cosmo0920 temporarily deployed to pr August 30, 2023 09:44 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-process-metrics-on-node_exporter_metrics branch from eb36c69 to b289694 Compare August 31, 2023 02:01
@cosmo0920 cosmo0920 changed the title from "in_node_exporter_metrics: implement process metrics" to "in_node_exporter_metrics: implement processes metrics" Aug 31, 2023
@cosmo0920 cosmo0920 temporarily deployed to pr August 31, 2023 02:02 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr August 31, 2023 02:27 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-process-metrics-on-node_exporter_metrics branch from b289694 to 2f1f238 Compare August 31, 2023 10:12
@cosmo0920 cosmo0920 temporarily deployed to pr August 31, 2023 10:13 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr August 31, 2023 10:44 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-process-metrics-on-node_exporter_metrics branch from 2f1f238 to dd85693 Compare September 1, 2023 02:06
@cosmo0920 cosmo0920 marked this pull request as ready for review September 1, 2023 02:07
@cosmo0920 cosmo0920 temporarily deployed to pr September 1, 2023 02:07 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr September 1, 2023 02:34 — with GitHub Actions Inactive
@edsiper
Member

edsiper commented Sep 1, 2023

@cosmo0920 has this been tested on CentOS 7?

@cosmo0920
Contributor Author

This has not been tested yet, but it is not enabled by default.

@cosmo0920
Contributor Author

cosmo0920 commented Sep 4, 2023

It works in Docker on M1 macOS!

[root@9a4610912d2f /]# uname -a
Linux 9a4610912d2f 5.15.49-linuxkit #1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

[root@9a4610912d2f /]# /opt/fluent-bit/bin/fluent-bit -i node_exporter_metrics -p metrics=processes -o stdout -v
Fluent Bit v2.1.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/09/04 03:50:38] [ info] Configuration:
[2023/09/04 03:50:38] [ info]  flush time     | 1.000000 seconds
[2023/09/04 03:50:38] [ info]  grace          | 5 seconds
[2023/09/04 03:50:38] [ info]  daemon         | 0
[2023/09/04 03:50:38] [ info] ___________
[2023/09/04 03:50:38] [ info]  inputs:
[2023/09/04 03:50:38] [ info]      node_exporter_metrics
[2023/09/04 03:50:38] [ info] ___________
[2023/09/04 03:50:38] [ info]  filters:
[2023/09/04 03:50:38] [ info] ___________
[2023/09/04 03:50:38] [ info]  outputs:
[2023/09/04 03:50:38] [ info]      stdout.0
[2023/09/04 03:50:38] [ info] ___________
[2023/09/04 03:50:38] [ info]  collectors:
[2023/09/04 03:50:38] [ info] [fluent bit] version=2.1.9, commit=, pid=62
[2023/09/04 03:50:38] [debug] [engine] coroutine stack size: 196608 bytes (192.0K)
[2023/09/04 03:50:38] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/09/04 03:50:38] [ info] [cmetrics] version=0.6.3
[2023/09/04 03:50:38] [ info] [ctraces ] version=0.3.1
[2023/09/04 03:50:38] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] initializing
[2023/09/04 03:50:38] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] storage_strategy='memory' (memory only)
[2023/09/04 03:50:38] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.procfs = /proc
[2023/09/04 03:50:38] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.sysfs  = /sys
[2023/09/04 03:50:38] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] enabled metrics processes
[2023/09/04 03:50:38] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] [thread init] initialization OK
[2023/09/04 03:50:38] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] thread instance initialized
[2023/09/04 03:50:38] [debug] [node_exporter_metrics:node_exporter_metrics.0] created event channels: read=30 write=31
[2023/09/04 03:50:38] [debug] [stdout:stdout.0] created event channels: read=34 write=35
[2023/09/04 03:50:38] [ info] [sp] stream processor started
[2023/09/04 03:50:38] [ info] [output:stdout:stdout.0] worker #0 started
[2023/09/04 03:50:42] [debug] [input chunk] update output instances with new chunk size diff=1538, records=0, input=node_exporter_metrics.0
[2023/09/04 03:50:43] [debug] [task] created task=0xffffad23c500 id=0 OK
[2023/09/04 03:50:43] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
2023-09-04T03:50:42.336229550Z node_processes_threads = 6
2023-09-04T03:50:42.336229550Z node_processes_max_threads = 30516
2023-09-04T03:50:42.336229550Z node_processes_threads_state{thread_state="R"} = 1
2023-09-04T03:50:42.336229550Z node_processes_threads_state{thread_state="S"} = 5
2023-09-04T03:50:42.336229550Z node_processes_threads_state{thread_state="D"} = 0
2023-09-04T03:50:42.336229550Z node_processes_threads_state{thread_state="Z"} = 0
2023-09-04T03:50:42.336229550Z node_processes_threads_state{thread_state="T"} = 0
2023-09-04T03:50:42.336229550Z node_processes_threads_state{thread_state="I"} = 0
2023-09-04T03:50:42.336229550Z node_processes_state{state="R"} = 0
2023-09-04T03:50:42.336229550Z node_processes_state{state="S"} = 2
2023-09-04T03:50:42.336229550Z node_processes_state{state="D"} = 0
2023-09-04T03:50:42.336229550Z node_processes_state{state="Z"} = 0
2023-09-04T03:50:42.336229550Z node_processes_state{state="T"} = 0
2023-09-04T03:50:42.336229550Z node_processes_state{state="I"} = 0
2023-09-04T03:50:42.336229550Z node_processes_pids = 2
2023-09-04T03:50:42.336229550Z node_processes_max_processeses = 99999
[2023/09/04 03:50:43] [debug] [out flush] cb_destroy coro_id=0
[2023/09/04 03:50:43] [debug] [task] destroy task=0xffffad23c500 (task_id=0)
^C[2023/09/04 03:50:44] [engine] caught signal (SIGINT)
[2023/09/04 03:50:44] [ warn] [engine] service will shutdown in max 5 seconds
[2023/09/04 03:50:44] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] thread pause instance
[2023/09/04 03:50:44] [ info] [engine] service has stopped (0 pending tasks)
[2023/09/04 03:50:44] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/09/04 03:50:44] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] thread pause instance
[2023/09/04 03:50:44] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2023/09/04 03:50:44] [debug] [input:node_exporter_metrics:node_exporter_metrics.0] thread exit instance

@@ -701,6 +720,26 @@ static int in_ne_init(struct flb_input_instance *in,
                }
                ne_systemd_init(ctx);
            }
            else if (strncmp(entry->str, "processes", 9) == 0) {
                if (ctx->processes_scrape_interval == 0) {
                    flb_plg_debug(ctx->ins, "enabled metrics %s", entry->str);
Member

It seems the processes collector is disabled, but the message says enabled.

Contributor Author

@cosmo0920 cosmo0920 Sep 5, 2023

This message is logged when processes metrics are enabled. The corresponding messages for the other metrics are likewise displayed when those metrics are enabled. That is, this message is displayed whenever the "processes" string is included in the metrics parameter.
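
To illustrate the selection logic being discussed: the hunk above picks the collector by comparing each entry of the metrics parameter with strncmp(entry->str, "processes", 9), which is a prefix comparison. The following is a minimal sketch of an equivalent lookup over a raw comma-separated string. The helper name metrics_list_has is hypothetical; the plugin itself walks pre-split entries (entry->str), and this sketch does not trim whitespace around entries.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the plugin source): return 1 if `name`
 * appears as a whole entry in the comma-separated `list`, for example
 * "cpu,processes", and 0 otherwise.
 *
 * Unlike strncmp(entry, "processes", 9), which also accepts longer
 * entries that merely start with "processes", this compares the entry
 * length as well, so only exact matches count.
 */
static int metrics_list_has(const char *list, const char *name)
{
    size_t name_len;
    size_t len;
    const char *start;
    const char *end;

    if (list == NULL || name == NULL) {
        return 0;
    }

    name_len = strlen(name);
    if (name_len == 0) {
        return 0;
    }

    start = list;
    for (;;) {
        /* Each entry runs up to the next comma or the end of the string. */
        end = strchr(start, ',');
        len = (end != NULL) ? (size_t) (end - start) : strlen(start);

        if (len == name_len && strncmp(start, name, len) == 0) {
            return 1;
        }
        if (end == NULL) {
            break;
        }
        start = end + 1;
    }

    return 0;
}
```

With this helper, metrics_list_has("cpu,processes", "processes") is true, while an entry such as "processes_extra" is rejected even though a 9-byte strncmp against "processes" would accept it.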

@cosmo0920
Contributor Author

cosmo0920 commented Sep 6, 2023

This PR works well on CentOS 6 (x86_64 container):

# cat /etc/redhat-release 
CentOS release 6.10 (Final)
[root@d24f26fac50a build_centos6]# bin/fluent-bit -i node_exporter_metrics -pmetrics=processes -o stdout
Fluent Bit v2.1.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/09/06 10:52:36] [ info] [fluent bit] version=2.1.9, commit=, pid=1727
[2023/09/06 10:52:36] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/09/06 10:52:36] [ info] [cmetrics] version=0.6.3
[2023/09/06 10:52:36] [ info] [ctraces ] version=0.3.1
[2023/09/06 10:52:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] initializing
[2023/09/06 10:52:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] storage_strategy='memory' (memory only)
[2023/09/06 10:52:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.procfs = /proc
[2023/09/06 10:52:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.sysfs  = /sys
[2023/09/06 10:52:36] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] thread instance initialized
[2023/09/06 10:52:36] [ info] [sp] stream processor started
[2023/09/06 10:52:36] [ info] [output:stdout:stdout.0] worker #0 started
2023-09-06T10:52:40.332391348Z node_processes_threads = 6
2023-09-06T10:52:40.332391348Z node_processes_max_threads = 513122
2023-09-06T10:52:40.332391348Z node_processes_threads_state{thread_state="R"} = 1
2023-09-06T10:52:40.332391348Z node_processes_threads_state{thread_state="S"} = 5
2023-09-06T10:52:40.332391348Z node_processes_threads_state{thread_state="D"} = 0
2023-09-06T10:52:40.332391348Z node_processes_threads_state{thread_state="Z"} = 0
2023-09-06T10:52:40.332391348Z node_processes_threads_state{thread_state="T"} = 0
2023-09-06T10:52:40.332391348Z node_processes_threads_state{thread_state="I"} = 0
2023-09-06T10:52:40.332391348Z node_processes_state{state="R"} = 0
2023-09-06T10:52:40.332391348Z node_processes_state{state="S"} = 2
2023-09-06T10:52:40.332391348Z node_processes_state{state="D"} = 0
2023-09-06T10:52:40.332391348Z node_processes_state{state="Z"} = 0
2023-09-06T10:52:40.332391348Z node_processes_state{state="T"} = 0
2023-09-06T10:52:40.332391348Z node_processes_state{state="I"} = 0
2023-09-06T10:52:40.332391348Z node_processes_pids = 2
2023-09-06T10:52:40.332391348Z node_processes_max_processeses = 4194304
^C[2023/09/06 10:52:42] [engine] caught signal (SIGINT)
[2023/09/06 10:52:42] [ warn] [engine] service will shutdown in max 5 seconds
[2023/09/06 10:52:43] [ info] [engine] service has stopped (0 pending tasks)
[2023/09/06 10:52:43] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/09/06 10:52:43] [ info] [output:stdout:stdout.0] thread worker #0 stopped

And on CentOS 6 in a Vagrant box:

[vagrant@localhost centos6_rpms]$ cat /etc/redhat-release 
CentOS release 6.10 (Final)
[vagrant@localhost centos6_rpms]$ /path/to/bin/fluent-bit -i node_exporter_metrics -pmetrics=processes -o stdout
<snip>
[2023/09/07 04:23:53] [ info] [output:stdout:stdout.0] worker #0 started
2023-09-07T04:23:58.560399017Z node_processes_threads = 125
2023-09-07T04:23:58.560399017Z node_processes_max_threads = 14821
2023-09-07T04:23:58.560399017Z node_processes_threads_state{thread_state="R"} = 1
2023-09-07T04:23:58.560399017Z node_processes_threads_state{thread_state="S"} = 124
2023-09-07T04:23:58.560399017Z node_processes_threads_state{thread_state="D"} = 0
2023-09-07T04:23:58.560399017Z node_processes_threads_state{thread_state="Z"} = 0
2023-09-07T04:23:58.560399017Z node_processes_threads_state{thread_state="T"} = 0
2023-09-07T04:23:58.560399017Z node_processes_threads_state{thread_state="I"} = 0
2023-09-07T04:23:58.560399017Z node_processes_state{state="R"} = 0
2023-09-07T04:23:58.560399017Z node_processes_state{state="S"} = 115
2023-09-07T04:23:58.560399017Z node_processes_state{state="D"} = 0
2023-09-07T04:23:58.560399017Z node_processes_state{state="Z"} = 0
2023-09-07T04:23:58.560399017Z node_processes_state{state="T"} = 0
2023-09-07T04:23:58.560399017Z node_processes_state{state="I"} = 0
2023-09-07T04:23:58.560399017Z node_processes_pids = 115
2023-09-07T04:23:58.560399017Z node_processes_max_processeses = 32768
^C[2023/09/07 04:24:01] [engine] caught signal (SIGINT)
<snip>

@edsiper edsiper merged commit cfa197c into master Oct 16, 2023
@edsiper edsiper deleted the cosmo0920-implement-process-metrics-on-node_exporter_metrics branch October 16, 2023 19:10
Development

Successfully merging this pull request may close these issues.

Implement processes metrics on node_exporter_metrics
2 participants