Fix a race condition in mk_list_add/del when connections are created/deleted #8013
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix a race condition in mk_list_add/del when connections are created/deleted.
This change wraps the connection creation and deletion in a mutex. Tested this on Windows Server 2022.
Fixes #7963
Testing
Before we can approve your change; please submit the following in a comment:
[2023/10/06 14:38:57] [trace] [io coro=000001BE423CB870] [net_write] ret=298 total=298/298
[2023/10/06 14:38:57] [trace] [io coro=000001BE423CB870] [net_write] trying 1818 bytes
[2023/10/06 14:38:57] [trace] [io coro=000001BE423CB870] [net_write] ret=1818 total=1818/1818
[2023/10/06 14:38:57] [trace] [io coro=000001BE423CB870] [net_read] try up to 4095 bytes
[2023/10/06 14:38:57] [trace] [engine] resuming coroutine=000001BE423CB870
[2023/10/06 14:38:57] [trace] [io coro=000001BE423CB870] [net_read] ret=390
[2023/10/06 14:38:57] [ info] [output:http:http.0] 179969258044.collect.observe-staging.com:443, HTTP status=200
{"ok":true}
[2023/10/06 14:38:57] [debug] [upstream] KA connection #1960 to 179969258044.collect.observe-staging.com:443 is now available
[2023/10/06 14:38:57] [debug] [out flush] cb_destroy coro_id=3700
[2023/10/06 14:38:57] [trace] [coro] destroy coroutine=000001BE423CB870 data=000001BE423CB888
[2023/10/06 14:39:02] [trace] [task 000001BE42399060] created (id=0)
[2023/10/06 14:39:02] [debug] [task] created task=000001BE42399060 id=0 OK
[2023/10/06 14:39:02] [trace] [upstream] get new connection for 179969258044.collect.observe-staging.com:443, net setup:
net.connect_timeout = 10 seconds
net.source_address = any
net.keepalive = enabled
net.keepalive_idle_timeout = 15 seconds
net.max_worker_connections = 0
[2023/10/06 14:39:02] [debug] [output:http:http.0] task_id=0 assigned to thread #0
[2023/10/06 14:39:02] [trace] [net] connection #1888 in process to 179969258044.collect.observe-staging.com:443
[2023/10/06 14:39:02] [trace] [engine] resuming coroutine=000001BE423CB930
[2023/10/06 14:39:02] [trace] [engine] resuming coroutine=000001BE423CB930
[2023/10/06 14:39:02] [trace] [tls] connection and handshake OK
[2023/10/06 14:39:02] [trace] [io] connection OK
[2023/10/06 14:39:02] [debug] [upstream] KA connection #1888 to 179969258044.collect.observe-staging.com:443 is connected
[2023/10/06 14:39:02] [debug] [http_client] not using http_proxy for header
[2023/10/06 14:39:02] [trace] [io coro=000001BE423CB930] [net_write] trying 298 bytes
[2023/10/06 14:39:02] [trace] [io coro=000001BE423CB930] [net_write] ret=298 total=298/298
[2023/10/06 14:39:02] [trace] [io coro=000001BE423CB930] [net_write] trying 2357 bytes
[2023/10/06 14:39:02] [trace] [io coro=000001BE423CB930] [net_write] ret=2357 total=2357/2357
[2023/10/06 14:39:02] [trace] [io coro=000001BE423CB930] [net_read] try up to 4095 bytes
[2023/10/06 14:39:02] [trace] [engine] resuming coroutine=000001BE423CB930
[2023/10/06 14:39:02] [trace] [engine] resuming coroutine=000001BE423CB930
[2023/10/06 14:39:02] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2023/10/06 14:39:02] [trace] [io coro=000001BE423CB930] [net_read] ret=390
[2023/10/06 14:39:02] [debug] [task] destroy task=000001BE42399060 (task_id=0)
[2023/10/06 14:39:02] [ info] [output:http:http.0] 179969258044.collect.observe-staging.com:443, HTTP status=200
{"ok":true}
[2023/10/06 14:39:02] [debug] [upstream] KA connection #1888 to 179969258044.collect.observe-staging.com:443 is now available
[2023/10/06 14:39:02] [debug] [out flush] cb_destroy coro_id=3701
[2023/10/06 14:39:02] [trace] [coro] destroy coroutine=000001BE423CB930 data=000001BE423CB948
[2023/10/06 14:39:05] [engine] caught signal (SIGINT)
[2023/10/06 14:39:05] [trace] [engine] flush enqueued data
[2023/10/06 14:39:05] [ warn] [engine] service will shutdown in max 5 seconds
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test
label to test for all targets (requires maintainer to do).Documentation
N/A
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.