ETCD peer discovery error when trying to automatically clean up the node #12807
-
Describe the bug
With ETCD peer discovery enabled, trying to clean up a node that left the cluster results in an error:

Reproduction steps
1. start a local etcd "cluster":
2. run rabbitmq in docker swarm, with 3 replicas and the following cluster_formation config: cluster_formation.peer_discovery_backend = etcd
3. after the cluster is started, kill a node

Expected behavior
killed docker task is started again and joins the cluster

Additional context
No response
Replies: 4 comments 5 replies
-
I don't understand what's being asked here. The exception comes from this line. I don't see how this can prevent the node from re-joining the cluster. Most likely this simply creates log noise on other nodes.
-
At least in Either this is on an older version, or I cannot think of a case where a single node (an atom) would be used. @rtuk you have left out the most important piece of information: what RabbitMQ version is this on?
-
A simpler way to reproduce:
and then stop any of the cluster nodes. See https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_peer_discovery_common/src/rabbit_peer_discovery_cleanup.erl#L244 and https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_peer_discovery_common/src/rabbit_peer_discovery_cleanup.erl#L303-L319
This function must return a single node value, which it only does if the input is a single atom, and it should not be in
-
Per discussion with another core team member: indeed, the "single value" result of peer discovery is a special case introduced intentionally to allow for a modified seeding (see RabbitMQ 4.0 release notes for details) by certain backends. The cleanup module simply wasn't updated to account for that, hence #12809.
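To illustrate the special case described above, here is a minimal sketch (in Python, not the Erlang used by `rabbit_peer_discovery_cleanup`) of normalizing a discovery result that may be either a list of nodes or a single node. The function name `normalize_discovered` and the use of strings in place of Erlang atoms are assumptions made for illustration; this is not the code from #12809.

```python
from typing import List, Union

# Stand-in for an Erlang node name (an atom), e.g. "rabbit@host1".
NodeName = str


def normalize_discovered(nodes: Union[NodeName, List[NodeName]]) -> List[NodeName]:
    """Return discovery results as a list.

    Hypothetical helper: a backend may report several nodes (a list)
    or, in the "single value" special case, just one node.
    """
    if isinstance(nodes, str):
        # Single node value: wrap it so callers can always iterate.
        return [nodes]
    # Already a collection of nodes: return a copy as a list.
    return list(nodes)


if __name__ == "__main__":
    print(normalize_discovered("rabbit@host1"))
    print(normalize_discovered(["rabbit@host1", "rabbit@host2"]))
```

With a wrapper along these lines, code that consumes discovery results can treat both shapes the same way instead of failing when it receives a single node.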