api: v2: reload: Prevent duplicated request via api v2 reload #8461

Merged

cosmo0920
Contributor

The reload signal (SIGHUP) can arrive while a previous reload is still in progress.
The HTTP API therefore needs to distinguish the two states and return a different response depending on whether a reload is already running or the instance is ready to reload.
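
To illustrate the idea of rejecting a reload request while another one is running, here is a minimal sketch in Python. It is not the Fluent Bit implementation (which is written in C); the `ReloadGuard` class, the `try_reload` method, and the `do_reload` callback are hypothetical names used only for illustration. The status values 0 and -2 and the response bodies mirror the curl transcripts in this description.

```python
import threading

RELOAD_OK = 0             # mirrors "status":0 in the succeeded case
RELOAD_IN_PROGRESS = -2   # mirrors "status":-2 in the failed case


class ReloadGuard:
    """Hypothetical guard that lets only one reload run at a time."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_reload(self, do_reload):
        # Non-blocking acquire: if a reload already holds the lock,
        # reject this request instead of queueing a duplicate.
        if not self._lock.acquire(blocking=False):
            return 400, {"reload": "in progress", "status": RELOAD_IN_PROGRESS}
        try:
            do_reload()
            return 200, {"reload": "done", "status": RELOAD_OK}
        finally:
            # Always release so later reload requests are accepted again.
            self._lock.release()
```

A request that arrives while `do_reload` is still running gets the 400 response immediately; once the reload finishes, the next request is accepted again.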

Succeeded case

*   Trying 127.0.0.1:2021...
* Connected to localhost (127.0.0.1) port 2021 (#0)
> POST /api/v2/reload HTTP/1.1
> Host: localhost:2021
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Length: 2
> Content-Type: application/x-www-form-urlencoded
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: Monkey/1.7.2
< Date: Tue, 06 Feb 2024 05:59:19 GMT
< Transfer-Encoding: chunked
< Content-Type: application/json
< 
* Connection #0 to host localhost left intact
{"reload":"done","status":0}

Failed case

*   Trying 127.0.0.1:2021...
* Connected to localhost (127.0.0.1) port 2021 (#0)
> POST /api/v2/reload HTTP/1.1
> Host: localhost:2021
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Length: 2
> Content-Type: application/x-www-form-urlencoded
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 400 Bad Request
< Server: Monkey/1.7.2
< Date: Tue, 06 Feb 2024 05:59:19 GMT
< Transfer-Encoding: chunked
< Content-Type: application/json
< 
* Connection #0 to host localhost left intact
{"reload":"in progress","status":-2}

  • Succeeded case: 200 OK
  • Failed case: 400 Bad Request
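
A client calling the endpoint could tell the two cases apart roughly as follows. This is a sketch under assumptions: the function name `interpret_reload_response` and the returned labels are hypothetical, and only the status codes and JSON fields shown in the transcripts above are taken from this pull request.

```python
import json


def interpret_reload_response(http_status, body):
    """Classify a /api/v2/reload response as 'done', 'in progress', or 'unknown'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Body is not valid JSON (for example, a truncated response).
        return "unknown"
    if not isinstance(payload, dict):
        return "unknown"
    if http_status == 200 and payload.get("status") == 0:
        return "done"
    if http_status == 400 and payload.get("status") == -2:
        return "in progress"
    return "unknown"
```

On "in progress", a caller would typically wait and retry rather than treat the request as a hard failure.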

Resolves #8457


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
$ bin/fluent-bit -i dummy -o stdout -H -P 2021 -Y
  • Debug log output from testing the change
Fluent Bit v3.0.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

____________________
< Fluent Bit v2.2.2 >
 -------------------
          \
           \
            \          __---__
                    _-       /--______
               __--( /     \ )XXXXXXXXXXX\v.
             .-XXX(   O   O  )XXXXXXXXXXXXXXX-
            /XXX(       U     )        XXXXXXX\
          /XXXXX(              )--_  XXXXXXXXXXX\
         /XXXXX/ (      O     )   XXXXXX   \XXXXX\
         XXXXX/   /            XXXXXX   \__ \XXXXX
         XXXXXX__/          XXXXXX         \__---->
 ---___  XXX__/          XXXXXX      \__         /
   \-  --__/   ___/\  XXXXXX            /  ___--/=
    \-\    ___/    XXXXXX              '--- XXXXXX
       \-\/XXX\ XXXXXX                      /XXXXX
         \XXXXXXXXX   \                    /XXXXX/
          \XXXXXX      >                 _/XXXXX/
            \XXXXX--__/              __-- XXXX/
             -XXXXXXXX---------------  XXXXXX-
                \XXXXXXXXXXXXXXXXXXXXXXXXXX/
                  ""VXXXXXXXXXXXXXXXXXXV""

[2024/02/06 15:01:58] [ info] Configuration:
[2024/02/06 15:01:58] [ info]  flush time     | 1.000000 seconds
[2024/02/06 15:01:58] [ info]  grace          | 5 seconds
[2024/02/06 15:01:58] [ info]  daemon         | 0
[2024/02/06 15:01:58] [ info] ___________
[2024/02/06 15:01:58] [ info]  inputs:
[2024/02/06 15:01:58] [ info]      dummy
[2024/02/06 15:01:58] [ info] ___________
[2024/02/06 15:01:58] [ info]  filters:
[2024/02/06 15:01:58] [ info] ___________
[2024/02/06 15:01:58] [ info]  outputs:
[2024/02/06 15:01:58] [ info]      stdout.0
[2024/02/06 15:01:58] [ info] ___________
[2024/02/06 15:01:58] [ info]  collectors:
[2024/02/06 15:01:58] [ info] [fluent bit] version=3.0.0, commit=dde6ba5e10, pid=149753
[2024/02/06 15:01:58] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/02/06 15:01:58] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/02/06 15:01:58] [ info] [cmetrics] version=0.6.6
[2024/02/06 15:01:58] [ info] [ctraces ] version=0.4.0
[2024/02/06 15:01:58] [ info] [input:dummy:dummy.0] initializing
[2024/02/06 15:01:58] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/02/06 15:01:58] [debug] [dummy:dummy.0] created event channels: read=21 write=22
[2024/02/06 15:01:58] [debug] [stdout:stdout.0] created event channels: read=23 write=24
[2024/02/06 15:01:58] [ info] [output:stdout:stdout.0] worker #0 started
[2024/02/06 15:01:58] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2021
[2024/02/06 15:01:58] [ info] [sp] stream processor started
[2024/02/06 15:01:58] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2024/02/06 15:01:59] [trace] [task 0x7fceb801d9b0] created (id=0)
[2024/02/06 15:01:59] [debug] [task] created task=0x7fceb801d9b0 id=0 OK
[2024/02/06 15:01:59] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1707199318.951161261, {}], {"message"=>"dummy"}]
[2024/02/06 15:01:59] [debug] [out flush] cb_destroy coro_id=0
[2024/02/06 15:01:59] [trace] [coro] destroy coroutine=0x7fceb0001050 data=0x7fceb0001068
[2024/02/06 15:01:59] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2024/02/06 15:01:59] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/02/06 15:01:59] [debug] [task] destroy task=0x7fceb801d9b0 (task_id=0)
[0] dummy.0: [[1707199319.951342657, {}], {"message"=>"dummy"}]
[2024/02/06 15:02:00] [trace] [task 0x7fceb803a460] created (id=0)
[2024/02/06 15:02:00] [debug] [task] created task=0x7fceb803a460 id=0 OK
[2024/02/06 15:02:00] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2024/02/06 15:02:00] [debug] [out flush] cb_destroy coro_id=1
[2024/02/06 15:02:00] [trace] [coro] destroy coroutine=0x7fceb0007760 data=0x7fceb0007778
[2024/02/06 15:02:00] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2024/02/06 15:02:00] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/02/06 15:02:00] [debug] [task] destroy task=0x7fceb803a460 (task_id=0)
[2024/02/06 15:02:01] [trace] [task 0x7fceb80372d0] created (id=0)
[2024/02/06 15:02:01] [debug] [task] created task=0x7fceb80372d0 id=0 OK
[2024/02/06 15:02:01] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] dummy.0: [[1707199320.951075047, {}], {"message"=>"dummy"}]
[2024/02/06 15:02:01] [debug] [out flush] cb_destroy coro_id=2
[2024/02/06 15:02:01] [trace] [coro] destroy coroutine=0x7fceb00077d0 data=0x7fceb00077e8
[2024/02/06 15:02:01] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2024/02/06 15:02:01] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/02/06 15:02:01] [debug] [task] destroy task=0x7fceb80372d0 (task_id=0)
[2024/02/06 15:02:02] [trace] [task 0x7fceb8039a40] created (id=0)
[2024/02/06 15:02:02] [debug] [task] created task=0x7fceb8039a40 id=0 OK
[0] dummy.0: [[1707199321.951142966, {}], {"message"=>"dummy"}]
[2024/02/06 15:02:02] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2024/02/06 15:02:02] [debug] [out flush] cb_destroy coro_id=3
[2024/02/06 15:02:02] [trace] [coro] destroy coroutine=0x7fceb0007840 data=0x7fceb0007858
[2024/02/06 15:02:02] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2024/02/06 15:02:02] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/02/06 15:02:02] [debug] [task] destroy task=0x7fceb8039a40 (task_id=0)
[2024/02/06 15:02:03] [engine] caught signal (SIGHUP)
[2024/02/06 15:02:03] [ info] reloading instance pid=149753 tid=0x7fcebca0ed40
[2024/02/06 15:02:03] [ info] [reload] stop everything of the old context
[2024/02/06 15:02:03] [trace] [engine] flush enqueued data
[2024/02/06 15:02:03] [trace] [task 0x7fceb80381e0] created (id=0)
[2024/02/06 15:02:03] [debug] [task] created task=0x7fceb80381e0 id=0 OK
[2024/02/06 15:02:03] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2024/02/06 15:02:03] [ warn] [engine] service will shutdown when all remaining tasks are flushed
[2024/02/06 15:02:03] [ info] [input] pausing dummy.0
[0] dummy.0: [[1707199322.951202872, {}], {"message"=>"dummy"}]
[2024/02/06 15:02:03] [debug] [out flush] cb_destroy coro_id=4
[2024/02/06 15:02:03] [trace] [coro] destroy coroutine=0x7fceb00078b0 data=0x7fceb00078c8
[2024/02/06 15:02:03] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/02/06 15:02:03] [debug] [task] destroy task=0x7fceb80381e0 (task_id=0)
[2024/02/06 15:02:03] [ info] [engine] service has stopped (0 pending tasks)
[2024/02/06 15:02:03] [ info] [input] pausing dummy.0
[2024/02/06 15:02:03] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/02/06 15:02:03] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2024/02/06 15:02:03] [ info] [reload] start everything
[2024/02/06 15:02:03] [ info] [fluent bit] version=3.0.0, commit=dde6ba5e10, pid=149753
[2024/02/06 15:02:03] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/02/06 15:02:03] [ info] [storage] ver=1.5.1, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/02/06 15:02:03] [ info] [cmetrics] version=0.6.6
[2024/02/06 15:02:03] [ info] [ctraces ] version=0.4.0
[2024/02/06 15:02:03] [ info] [input:dummy:dummy.0] initializing
[2024/02/06 15:02:03] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/02/06 15:02:03] [debug] [dummy:dummy.0] created event channels: read=16 write=17
[2024/02/06 15:02:03] [debug] [stdout:stdout.0] created event channels: read=18 write=19
[2024/02/06 15:02:03] [ info] [output:stdout:stdout.0] worker #0 started
[2024/02/06 15:02:03] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2021
[2024/02/06 15:02:03] [ info] [sp] stream processor started
[2024/02/06 15:02:04] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
^C[2024/02/06 15:02:05] [engine] caught signal (SIGINT)
[2024/02/06 15:02:05] [trace] [engine] flush enqueued data
[2024/02/06 15:02:05] [trace] [task 0x7fceb801e4f0] created (id=0)
[2024/02/06 15:02:05] [debug] [task] created task=0x7fceb801e4f0 id=0 OK
[2024/02/06 15:02:05] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2024/02/06 15:02:05] [ warn] [engine] service will shutdown in max 5 seconds
[2024/02/06 15:02:05] [ info] [input] pausing dummy.0
[0] dummy.0: [[1707199324.950852761, {}], {"message"=>"dummy"}]
[2024/02/06 15:02:05] [debug] [out flush] cb_destroy coro_id=0
[2024/02/06 15:02:05] [trace] [coro] destroy coroutine=0x7fcea8000b70 data=0x7fcea8000b88
[2024/02/06 15:02:05] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/02/06 15:02:05] [debug] [task] destroy task=0x7fceb801e4f0 (task_id=0)
[2024/02/06 15:02:05] [ info] [engine] service has stopped (0 pending tasks)
[2024/02/06 15:02:05] [ info] [input] pausing dummy.0
[2024/02/06 15:02:05] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/02/06 15:02:05] [ info] [output:stdout:stdout.0] thread worker #0 stopped
  • Attached Valgrind output that shows no leaks or memory corruption was found
==150027== 
==150027== HEAP SUMMARY:
==150027==     in use at exit: 0 bytes in 0 blocks
==150027==   total heap usage: 8,910 allocs, 8,910 frees, 2,493,824 bytes allocated
==150027== 
==150027== All heap blocks were freed -- no leaks are possible
==150027== 
==150027== For lists of detected and suppressed errors, rerun with: -s
==150027== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Development

Successfully merging this pull request may close these issues.

Fast reload signals are swallowed silently causing inconsistent configuration