Unable to connect to Quorum Queue - Server returns 541 #3943
Replies: 8 comments 2 replies
-
Can you enable debug logging and share the logs with us? It is indeed unusual that a single member quorum queue would remain unresponsive so there must be something else going on in the Raft infrastructure. |
Beta Was this translation helpful? Give feedback.
-
Hi Karl @kjnilsson, thank you very much for reaching out. @jaroslaw-bochniak please in the future save logs in files with a There is not much more on |
Beta Was this translation helpful? Give feedback.
-
I'd like to see if there are any logs related to the queue when the node /
queue is starting
…On Wed, 5 Jan 2022 at 14:56, Jarosław Bochniak ***@***.***> wrote:
Hi Karl @kjnilsson <https://github.com/kjnilsson>, thank you very much
for reaching out.
On the live server we have a lot of traffic so there might be too much
noise there. I have logs from the machine that was restored from the
snapshot of the live server that we took after the incident. It has been
upgraded to RabbitMQ 3.9.11 but it shows the queue in the same state like
the original one (same amount of messages, same number of consumers).
Those are the logs from the moment when the client tries to connect to the
queue for the consumption till it gets disconnected.
2022-01-05 14:35:06.636009+00:00 [info] <0.5000.121> Supervisor {<0.5000.121>,rabbit_connection_sup}: child reader started (<0.5001.121>): {rabbit_reader,start_link,[<0.5005.121>,{acceptor,{0,0,0,0},5672}]}
2022-01-05 14:35:06.636353+00:00 [info] <0.5001.121> accepting AMQP connection <0.5001.121> (10.19.160.34:56918 -> 10.73.29.73:5672)
2022-01-05 14:35:06.666926+00:00 [debug] <0.5001.121> Raw client connection hostname during authN phase: {10,73,29,73}
2022-01-05 14:35:06.667013+00:00 [debug] <0.5001.121> Resolved client hostname during authN phase: 10.73.29.73
2022-01-05 14:35:06.667096+00:00 [debug] <0.5001.121> User 'SOME_USER' authenticated successfully by backend rabbit_auth_backend_internal
2022-01-05 14:35:06.697088+00:00 [info] <0.5005.121> Supervisor {<0.5005.121>,rabbit_connection_helper_sup}: child collector started (<0.5053.121>): {rabbit_queue_collector,start_link,[<<"10.19.160.34:56918 -> 10.73.29.73:5672">>]}
2022-01-05 14:35:06.697218+00:00 [info] <0.5005.121> Supervisor {<0.5005.121>,rabbit_connection_helper_sup}: child heartbeat_sender started (<0.5059.121>): {rabbit_heartbeat,start_heartbeat_sender,[#Port<0.1377>,10,#Fun<rabbit_reader.4.10718704>,{heartbeat_sender,<<"10.19.160.34:56918 -> 10.73.29.73:5672">>}]}
2022-01-05 14:35:06.697341+00:00 [info] <0.5005.121> Supervisor {<0.5005.121>,rabbit_connection_helper_sup}: child heartbeat_receiver started (<0.5060.121>): {rabbit_heartbeat,start_heartbeat_receiver,[#Port<0.1377>,10,#Fun<rabbit_reader.5.10718704>,{heartbeat_receiver,<<"10.19.160.34:56918 -> 10.73.29.73:5672">>}]}
2022-01-05 14:35:06.699511+00:00 [info] <0.5005.121> Supervisor {<0.5005.121>,rabbit_connection_helper_sup}: child channel_sup_sup started (<0.5047.121>): {rabbit_channel_sup_sup,start_link,[]}
2022-01-05 14:35:06.699637+00:00 [info] <0.5001.121> connection <0.5001.121> (10.19.160.34:56918 -> 10.73.29.73:5672): user 'SOME_USER' authenticated and granted access to vhost 'SOME_VHOST'
2022-01-05 14:35:06.728863+00:00 [info] <0.5078.121> Supervisor {<0.5078.121>,rabbit_channel_sup}: child writer started (<0.5074.121>): {rabbit_writer,start_link,[#Port<0.1377>,1,131072,rabbit_framing_amqp_0_9_1,<0.5001.121>,{<<"10.19.160.34:56918 -> 10.73.29.73:5672">>,1},true]}
2022-01-05 14:35:06.729049+00:00 [info] <0.5078.121> Supervisor {<0.5078.121>,rabbit_channel_sup}: child limiter started (<0.5080.121>): {rabbit_limiter,start_link,[{<<"10.19.160.34:56918 -> 10.73.29.73:5672">>,1}]}
2022-01-05 14:35:06.729393+00:00 [info] <0.5078.121> Supervisor {<0.5078.121>,rabbit_channel_sup}: child channel started (<0.5081.121>): {rabbit_channel,start_link,[1,<0.5001.121>,<0.5074.121>,<0.5001.121>,<<"10.19.160.34:56918 -> 10.73.29.73:5672">>,rabbit_framing_amqp_0_9_1,{user,<<"SOME_USER">>,[administrator],[{rabbit_auth_backend_internal,none}]},<<"SOME_VHOST">>,[{<<"connection.blocked">>,bool,true},{<<"consumer_cancel_notify">>,bool,true}],<0.5053.121>,<0.5080.121>]}
2022-01-05 14:35:22.239854+00:00 [error] <0.158.0>
2022-01-05 14:35:22.239854+00:00 [error] <0.158.0> ** Cannot get connection id for node ***@***.***_NODE'
2022-01-05 14:35:22.239854+00:00 [error] <0.158.0>
2022-01-05 14:35:29.251043+00:00 [error] <0.158.0>
2022-01-05 14:35:29.251043+00:00 [error] <0.158.0> ** Cannot get connection id for node ***@***.***_NODE'
2022-01-05 14:35:29.251043+00:00 [error] <0.158.0>
2022-01-05 14:35:37.041237+00:00 [error] <0.158.0>
2022-01-05 14:35:37.041237+00:00 [error] <0.158.0> ** Cannot get connection id for node ***@***.***_NODE'
2022-01-05 14:35:37.041237+00:00 [error] <0.158.0>
2022-01-05 14:35:41.599305+00:00 [debug] <0.20335.121> User 'SOME_USER' authenticated successfully by backend rabbit_auth_backend_internal
2022-01-05 14:35:41.628863+00:00 [debug] <0.20357.121> User 'SOME_USER' authenticated successfully by backend rabbit_auth_backend_internal
2022-01-05 14:35:41.637548+00:00 [debug] <0.20375.121> User 'SOME_USER' authenticated successfully by backend rabbit_auth_backend_internal
2022-01-05 14:35:41.741333+00:00 [debug] <0.20446.121> User 'SOME_USER' authenticated successfully by backend rabbit_auth_backend_internal
2022-01-05 14:35:41.778669+00:00 [debug] <0.20481.121> User 'SOME_USER' authenticated successfully by backend rabbit_auth_backend_internal
2022-01-05 14:35:44.043375+00:00 [error] <0.158.0>
2022-01-05 14:35:44.043375+00:00 [error] <0.158.0> ** Cannot get connection id for node ***@***.***_NODE'
2022-01-05 14:35:44.043375+00:00 [error] <0.158.0>
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> ** Generic server <0.5081.121> terminating
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> ** Last message in was {'$gen_cast',
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {method,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {'basic.consume',0,<<>>,<<"ctag-rabbitio-1">>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> false,false,false,false,[]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> none,noflow}}
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> ** When Server state == {ch,{conf,running,rabbit_framing_amqp_0_9_1,1,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> <0.5001.121>,<0.5074.121>,<0.5001.121>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> <<"10.19.160.34:56918 -> 10.73.29.73:5672">>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> undefined,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {user,<<"SOME_USER">>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [administrator],
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{rabbit_auth_backend_internal,none}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> <<"SOME_VHOST">>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> <<"SOME_QUEUE">>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> <0.5053.121>,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{<<"connection.blocked">>,bool,true},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {<<"consumer_cancel_notify">>,bool,true}],
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> none,0,134217728,1800000,#{},1000000000},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {lstate,<0.5080.121>,false},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> none,1,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {0,{[],[]}},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {state,#{},erlang},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> #{},#{},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {state,fine,5000,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> #Ref<0.3036856249.303038465.260395>},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> false,1,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_confirms,undefined,#{}},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [],[],none,flow,[],
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_queue_type,#{},#{}},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> #Ref<0.3036856249.298057730.139594>,false}
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> ** Reason for termination ==
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> ** {{badmatch,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {timeout,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {'SOME_VHOST_SOME_QUEUE',
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> ***@***.***_NODE'}}},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{rabbit_quorum_queue,consume,3,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{file,"rabbit_quorum_queue.erl"},{line,752}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_queue_type,consume,3,[{file,"rabbit_queue_type.erl"},{line,342}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_channel,'-basic_consume/8-fun-0-',10,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,1789}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_misc,with_exit_handler,2,[{file,"rabbit_misc.erl"},{line,524}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_channel,basic_consume,8,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,1786}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_channel,handle_method,3,
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,1465}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {rabbit_channel,handle_cast,2,[{file,"rabbit_channel.erl"},{line,644}]},
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121> {gen_server2,handle_msg,2,[{file,"gen_server2.erl"},{line,1067}]}]}
2022-01-05 14:36:11.822604+00:00 [error] <0.5081.121>
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> crasher:
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> initial call: rabbit_channel:init/1
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> pid: <0.5081.121>
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> registered_name: []
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> exception exit: {{badmatch,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {timeout,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {'SOME_VHOST_SOME_QUEUE',
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> ***@***.***_NODE'}}},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{rabbit_quorum_queue,consume,3,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_quorum_queue.erl"},{line,752}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_queue_type,consume,3,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_queue_type.erl"},{line,342}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_channel,'-basic_consume/8-fun-0-',10,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,1789}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_misc,with_exit_handler,2,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_misc.erl"},{line,524}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_channel,basic_consume,8,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,1786}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_channel,handle_method,3,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,1465}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_channel,handle_cast,2,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"rabbit_channel.erl"},{line,644}]},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {gen_server2,handle_msg,2,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{file,"gen_server2.erl"},{line,1067}]}]}
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> in function gen_server2:terminate/3 (gen_server2.erl, line 1183)
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> ancestors: [<0.5078.121>,<0.5047.121>,<0.5005.121>,<0.5000.121>,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> <0.3618.0>,<0.3617.0>,<0.3616.0>,<0.3614.0>,<0.3613.0>,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> rabbit_sup,<0.224.0>]
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> message_queue_len: 4
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> messages: [{'$gen_cast',
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {command,{'basic.consume_ok',<<"ctag-rabbitio-1">>}}},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> emit_stats,emit_gen_server2_stats,tick]
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> links: [<0.5078.121>]
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> dictionary: [{rand_seed,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {#{jump => #Fun<rand.3.92093067>,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> max => 288230376151711743,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> next => #Fun<rand.5.92093067>,type => exsplus},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [74902927976975777|69198034009057013]}},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {channel_operation_timeout,15000},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {permission_cache_can_expire,false},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {process_name,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {rabbit_channel,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {<<"10.19.160.34:56918 -> 10.73.29.73:5672">>,1}}},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {permission_cache,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> [{{resource,<<"SOME_VHOST">>,queue,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> <<"SOME_QUEUE">>},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> #{},read},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {{resource,<<"SOME_VHOST">>,exchange,<<"RabbitIoExchange">>},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> #{},read},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> {{resource,<<"SOME_VHOST">>,queue,
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> <<"SOME_QUEUE">>},
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> #{},write}]}]
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> trap_exit: true
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> status: running
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> heap_size: 17731
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> stack_size: 29
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> reductions: 46530
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121> neighbours:
2022-01-05 14:36:11.823463+00:00 [error] <0.5081.121>
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> Error on AMQP connection <0.5001.121> (10.19.160.34:56918 -> 10.73.29.73:5672, vhost: 'SOME_VHOST', user: 'SOME_USER', state: running), channel 1:
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {{badmatch,
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {timeout,
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {'SOME_VHOST_SOME_QUEUE',
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> ***@***.***_NODE'}}},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> [{rabbit_quorum_queue,consume,3,
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> [{file,"rabbit_quorum_queue.erl"},{line,752}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {rabbit_queue_type,consume,3,[{file,"rabbit_queue_type.erl"},{line,342}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {rabbit_channel,'-basic_consume/8-fun-0-',10,
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> [{file,"rabbit_channel.erl"},{line,1789}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {rabbit_misc,with_exit_handler,2,[{file,"rabbit_misc.erl"},{line,524}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {rabbit_channel,basic_consume,8,[{file,"rabbit_channel.erl"},{line,1786}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {rabbit_channel,handle_method,3,[{file,"rabbit_channel.erl"},{line,1465}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {rabbit_channel,handle_cast,2,[{file,"rabbit_channel.erl"},{line,644}]},
2022-01-05 14:36:11.824561+00:00 [error] <0.5001.121> {gen_server2,handle_msg,2,[{file,"gen_server2.erl"},{line,1067}]}]}
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> supervisor: {<0.5078.121>,rabbit_channel_sup}
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> errorContext: child_terminated
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> reason: {{badmatch,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {timeout,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {'SOME_VHOST_SOME_QUEUE',
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> ***@***.***_NODE'}}},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{rabbit_quorum_queue,consume,3,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_quorum_queue.erl"},{line,752}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_queue_type,consume,3,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_queue_type.erl"},{line,342}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_channel,'-basic_consume/8-fun-0-',10,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_channel.erl"},{line,1789}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_misc,with_exit_handler,2,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_misc.erl"},{line,524}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_channel,basic_consume,8,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_channel.erl"},{line,1786}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_channel,handle_method,3,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_channel.erl"},{line,1465}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_channel,handle_cast,2,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"rabbit_channel.erl"},{line,644}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {gen_server2,handle_msg,2,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{file,"gen_server2.erl"},{line,1067}]}]}
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> offender: [{pid,<0.5081.121>},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {id,channel},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {mfargs,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {rabbit_channel,start_link,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [1,<0.5001.121>,<0.5074.121>,<0.5001.121>,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> <<"10.19.160.34:56918 -> 10.73.29.73:5672">>,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> rabbit_framing_amqp_0_9_1,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {user,<<"SOME_USER">>,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [administrator],
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{rabbit_auth_backend_internal,none}]},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> <<"SOME_VHOST">>,
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> [{<<"connection.blocked">>,bool,true},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {<<"consumer_cancel_notify">>,bool,true}],
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> <0.5053.121>,<0.5080.121>]}},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {restart_type,intrinsic},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {shutdown,70000},
2022-01-05 14:36:11.824425+00:00 [error] <0.5078.121> {child_type,worker}]
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> supervisor: {<0.5078.121>,rabbit_channel_sup}
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> errorContext: shutdown
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> reason: reached_max_restart_intensity
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> offender: [{pid,<0.5081.121>},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {id,channel},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {mfargs,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {rabbit_channel,start_link,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> [1,<0.5001.121>,<0.5074.121>,<0.5001.121>,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> <<"10.19.160.34:56918 -> 10.73.29.73:5672">>,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> rabbit_framing_amqp_0_9_1,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {user,<<"SOME_USER">>,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> [administrator],
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> [{rabbit_auth_backend_internal,none}]},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> <<"SOME_VHOST">>,
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> [{<<"connection.blocked">>,bool,true},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {<<"consumer_cancel_notify">>,bool,true}],
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> <0.5053.121>,<0.5080.121>]}},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {restart_type,intrinsic},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {shutdown,70000},
2022-01-05 14:36:11.824776+00:00 [error] <0.5078.121> {child_type,worker}]
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> Non-AMQP exit reason '{{badmatch,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {timeout,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {'SOME_VHOST_SOME_QUEUE',
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> ***@***.***_NODE'}}},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{rabbit_quorum_queue,consume,3,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_quorum_queue.erl"},{line,752}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {rabbit_queue_type,consume,3,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_queue_type.erl"},{line,342}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {rabbit_channel,'-basic_consume/8-fun-0-',10,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_channel.erl"},{line,1789}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {rabbit_misc,with_exit_handler,2,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_misc.erl"},{line,524}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {rabbit_channel,basic_consume,8,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_channel.erl"},{line,1786}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {rabbit_channel,handle_method,3,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_channel.erl"},{line,1465}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {rabbit_channel,handle_cast,2,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"rabbit_channel.erl"},{line,644}]},
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> {gen_server2,handle_msg,2,
2022-01-05 14:36:11.824758+00:00 [warning] <0.5001.121> [{file,"gen_server2.erl"},{line,1067}]}]}'
2022-01-05 14:36:11.854470+00:00 [info] <0.5001.121> closing AMQP connection <0.5001.121> (10.19.160.34:56918 -> 10.73.29.73:5672, vhost: 'SOME_VHOST', user: 'SOME_USER')
2022-01-05 14:36:11.854776+00:00 [debug] <0.926.122> Closing all channels from connection '10.19.160.34:56918 -> 10.73.29.73:5672' because it has been closed
There is not much more on debug level here, except the user connections.
I will try to eliminate not relevant queues and get a hand on full logs
from server startup if you feel that it might be helpful.
The error related to Cannot get connection id for node ***@***.***_NODE'
may be related only to this restored machine, as I have not noticed such
entries on the original server.
—
Reply to this email directly, view it on GitHub
<#3943 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAJAHFHBLKR7JA6HNP4GZ6TUURLZBANCNFSM5LFJK7QA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***
com>
--
*Karl Nilsson*
|
Beta Was this translation helpful? Give feedback.
-
I have removed other queues, federation config and policies to make the log shorter. The problematic queue is the one with id Edited by @lukebakken to ATTACH the log as a file. |
Beta Was this translation helpful? Give feedback.
-
This log entry indicates that the Ra machine has recovered its state and thus should be available to process commands. When you attempt consumption after this point what QoS settings are in place? I presume you are using manual acks? Are you able to publish to the queue at all? Or get an individual message using the mgmt ui? |
Beta Was this translation helpful? Give feedback.
-
In this state I cannot publish to the queue (from UI), nor consume from it (UI GET/Java consumer/Golang consumer - all fail)
|
Beta Was this translation helpful? Give feedback.
-
@kjnilsson I presume that this is not a trivial case we have experienced. WDYT about further steps? |
Beta Was this translation helpful? Give feedback.
-
@jaroslaw-bochniak are u by any chance able to fix this issue ? |
Beta Was this translation helpful? Give feedback.
-
Context
We have experienced an incident on our platform that led to the situation that a flood of messages was generated. As a result of that our consumers were not able to keep up fast enough to limit the number of messages in the queues. Finally we have ended up with 500k messages in a single quorum queue that exhausted available disc space. This led to the server crash. We have extended disc space and the amount of available RAM and started the server. It became available with most of the queues operational.
At the beginning of the recovery process for the single quorum queue (with almost 500k of messages) the
rabbitmq-queues quorum_status
command shoved (raft state = timeout
):After some time it settled with (raft state = leader)
We have verified the output of the
rabbitmqctl list_unresponsive_queues
command and the queue in question was not listed there (meaning it should be operational and responsive)Meanwhile in the UI we could observe:
At this moment the UI shoved that
21 consumers
are available but in fact no consumers were connected (this value most likely comes from cache)With this state we could not connect to the queue with consumers. Upon an attempt of consumption we have received following error:
What we have tried
max-length
(combined withx-dead-letter-exchange
) andmax-in-memory-length
policies (both regular and operator policies) followed by a server restart was also unsuccessful (policy is visible on the UI, but there is no impact on the queue behaviour)channel_operation_timeout
to60000
RabbitMQ 3.9.11
resulted in the same issueEven though it was really slow we were able to successfully use the
rabbitmq-queues peek
command to peek at all 500k events present in the queue.We are using single node cluster with
RabbitMQ 3.8.14
andErlang 23.3.1
Recovery
rabbitmq-queues peek
command to return the entire payload of the message as a means of last resort recovery for such incidents?Note:
Original message posted in rabbitmq-users mailing group but I feel GitHub discussion will be much more convenient.
Beta Was this translation helpful? Give feedback.
All reactions