# Consul Cluster got into an unstable state
Our Consul deployment runs in a k8s cluster (deployed through Helm) with 3 servers. We saw this error from one of the servers, consul-server-0:
2024-09-16 21:22 agent: Coordinate update error: error="Raft leader not found in server lookup mapping"
and then periodic errors (two or three times a day) afterwards, like these...
### From the consul agents
agent.http: Request error: method=GET url=/v1/catalog/services from=192.168.187.228:59522 error="rpc error making call: Raft leader not found in server lookup mapping"
agent.client: RPC failed to server: method=Catalog.ListServices server=192.168.70.185:8300 error="rpc error making call: Raft leader not found in server lookup mapping"
### From the consul servers consul-server-0 and consul-server-1
agent.server: Raft has a leader but other tracking of the node would indicate that the node is unhealthy or does not exist. The network may be misconfigured.: leader=192.168.x.y:8300 (consul-server-2)
The leader did not change at the time of the first error.
There was user impact: certain requests failed.
There was no other network maintenance, and no issues with CPU/memory, etc. Any thoughts on why this failed, how to recover now (should we restart the leader, consul-server-2?), and whether any parameters need to be tuned to avoid a recurrence?
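For anyone triaging the same symptom, this is the kind of check we have been running to see whether the HTTP status endpoint and the raft configuration agree on a leader. It is only a minimal sketch using the official Go API client (github.com/hashicorp/consul/api); it assumes the agent's HTTP API is reachable at the default address (override with CONSUL_HTTP_ADDR), and nothing in it is specific to our cluster.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	// DefaultConfig reads CONSUL_HTTP_ADDR / CONSUL_HTTP_TOKEN from the
	// environment, otherwise talks to the local agent on 127.0.0.1:8500.
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Leader as reported by the status endpoint (host:port of the raft leader).
	leader, err := client.Status().Leader()
	if err != nil {
		log.Fatalf("status/leader: %v", err)
	}
	fmt.Println("status leader:", leader)

	// Raft peer set as reported by the operator endpoint. A server that is
	// missing from this list, or present only as a non-voter, is worth
	// comparing against the address in the error messages above.
	raftCfg, err := client.Operator().RaftGetConfiguration(nil)
	if err != nil {
		log.Fatalf("operator/raft/configuration: %v", err)
	}
	for _, s := range raftCfg.Servers {
		fmt.Printf("raft peer: node=%s addr=%s leader=%t voter=%t\n",
			s.Node, s.Address, s.Leader, s.Voter)
	}
}
```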
I hit the same problem with Consul v1.19.2. On a cluster with 5 nodes, elections were started after the connection to 2 nodes in another datacenter was lost. Leadership was passed from one Consul node to another (in the same datacenter where 3 of the nodes are installed). (BTW, I don't understand why the leadership changed.) Since the election, the two disconnected nodes are back and the cluster is almost OK, but the old leader does not respond to queries, failing with the error Raft leader not found in server lookup mapping. The other nodes work correctly and report the new leader correctly. The error Raft has a leader but other tracking of the node would indicate that the node is unhealthy or does not exist. The network may be misconfigured is shown on the old leader as well.
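In case it helps with comparison: querying the autopilot health view from the old leader and then from one of the healthy servers shows whether they disagree about which servers are healthy voters. This is a minimal sketch with the Go API client (github.com/hashicorp/consul/api); pointing CONSUL_HTTP_ADDR at each server in turn is my assumption about how you would run it, not something taken from this issue.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	// Set CONSUL_HTTP_ADDR to the server you want to interrogate
	// (e.g. the old leader), then to a healthy one, and compare the output.
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Autopilot's per-server health view, as seen by the queried server.
	health, err := client.Operator().AutopilotServerHealth(nil)
	if err != nil {
		log.Fatalf("operator/autopilot/health: %v", err)
	}

	fmt.Printf("cluster healthy=%t failure_tolerance=%d\n",
		health.Healthy, health.FailureTolerance)
	for _, s := range health.Servers {
		fmt.Printf("server=%s addr=%s leader=%t voter=%t serf=%s healthy=%t\n",
			s.Name, s.Address, s.Leader, s.Voter, s.SerfStatus, s.Healthy)
	}
}
```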
### Operating system and Environment details
This environment runs on a k8s cluster, v1.24. Consul is version 1.11.