-
Bug report criteria
What happened?
There is no NAME or CLIENT ADDRS information in the row for the newly added node.

What did you expect to happen?
I have no idea. I need help solving this.

How can we reproduce it (as minimally and precisely as possible)?
Add etcd4 to the cluster, then try adding it again after deleting the member.

Anything else we need to know?
The actual IPs shown in the environment variables and console output have been replaced with hostnames to hide them.

Etcd version (please run commands below)
$ etcd --version
etcd Version: 3.5.4
Git SHA: 08407ff76
Go Version: go1.16.15
Go OS/Arch: linux/amd64
$ etcdctl version
etcdctl version: 3.5.4
API version: 3.5

Etcd configuration (command line flags or environment variables)
ETCDCTL_ENDPOINTS="http://etcd1:2379,http://etcd2:2379,http://etcd3:2379,http://etcd5:2379"

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
+------------------+-----------+-------+-------------------+-------------------+------------+
| ID               | STATUS    | NAME  | PEER ADDRS        | CLIENT ADDRS      | IS LEARNER |
+------------------+-----------+-------+-------------------+-------------------+------------+
| 14f2acc96f228543 | started   | etcd5 | http://etcd5:2380 | http://etcd5:2379 | false      |
| 16b572d0cae175a2 | started   | etcd1 | http://etcd1:2380 | http://etcd1:2379 | false      |
| 4a4ee0d68fe66679 | started   | etcd2 | http://etcd2:2380 | http://etcd2:2379 | false      |
| 52fd32c8e1a0bb8b | started   | etcd3 | http://etcd3:2380 | http://etcd3:2379 | false      |
| ca14397393a0f603 | unstarted |       | http://etcd4:2380 |                   | true       |
+------------------+-----------+-------+-------------------+-------------------+------------+
$ etcdctl --endpoints=<member list> endpoint status -w table
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT          | ID               | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://etcd1:2379 | 16b572d0cae175a2 | 3.5.4   | 540 MB  | false     | false      | 23        | 4507028    | 4507028            |        |
| http://etcd2:2379 | 4a4ee0d68fe66679 | 3.5.4   | 540 MB  | true      | false      | 23        | 4507028    | 4507028            |        |
| http://etcd3:2379 | 52fd32c8e1a0bb8b | 3.5.4   | 540 MB  | false     | false      | 23        | 4507028    | 4507028            |        |
| http://etcd5:2379 | 14f2acc96f228543 | 3.5.4   | 540 MB  | false     | false      | 23        | 4507034    | 4507034            |        |
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output
member add etcd4 --learner --peer-urls=http://etcd4:2380 --debug=true
ETCDCTL_CACERT=
ETCDCTL_CERT=
ETCDCTL_COMMAND_TIMEOUT=5s
ETCDCTL_DEBUG=true
ETCDCTL_DIAL_TIMEOUT=2s
ETCDCTL_DISCOVERY_SRV=
ETCDCTL_DISCOVERY_SRV_NAME=
ETCDCTL_ENDPOINTS=["http://etcd1:2379,http://etcd2:2379,http://etcd3:2379,http://etcd5:2379"]
ETCDCTL_HEX=false
ETCDCTL_INSECURE_DISCOVERY=true
ETCDCTL_INSECURE_SKIP_TLS_VERIFY=false
ETCDCTL_INSECURE_TRANSPORT=true
ETCDCTL_KEEPALIVE_TIME=2s
ETCDCTL_KEEPALIVE_TIMEOUT=6s
ETCDCTL_KEY=
ETCDCTL_PASSWORD=
ETCDCTL_USER=
ETCDCTL_WRITE_OUT=simple
...
WARNING: 2023/07/31 10:31:19 [core] grpc: addrConn.createTransport failed to connect to {etcd4:2379 etcd4 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp etcd4:2379: connect: connection refused". Reconnecting...
WARNING: 2023/07/31 10:31:19 [core] grpc: addrConn.createTransport failed to connect to {etcd4:2379 etcd4 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp etcd4:2379: connect: connection refused". Reconnecting...
...
-
Hey @inkkim - can you please provide the logs and configuration for the etcd4 member, and confirm that it can be reached over the network? The error message indicates that the connection to the specified address and port was refused. In other words, nothing accepted the connection on host "etcd4" at port 2379, perhaps because of a network issue or because the etcd service is not running as expected.
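The member list in the report is consistent with this: etcd4 is shown as unstarted with empty NAME and CLIENT ADDRS, which is what etcd reports for a member that has been added but has not yet started and joined, since a member publishes its own name and client URLs only once it is running. As a rough sketch, the filter below pulls the peer URLs of such members out of the table output. The function name unstarted_peers is made up for illustration, and the column positions are assumed to match the table shown earlier in the thread.

```shell
# Hypothetical helper (not part of etcdctl): read `etcdctl member list -w table`
# output on stdin and print the PEER ADDRS of every member whose STATUS is
# "unstarted". Assumes the column order shown in this thread:
# ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER
unstarted_peers() {
  awk -F'|' '{
    status = $3; peer = $5
    # Trim the padding that the table writer adds around each cell.
    gsub(/^[ \t]+|[ \t]+$/, "", status)
    gsub(/^[ \t]+|[ \t]+$/, "", peer)
    if (status == "unstarted") print peer
  }'
}
```

It would be used as: etcdctl member list -w table | unstarted_peers. Any peer URL it prints belongs to a member whose etcd process still needs to be started, or whose startup is failing.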
-
# member add
export TOKEN=etcd-cluster-1
export ETCDCTL_API=3
root@etcd4:~# etcdctl member add etcd4 --learner --peer-urls=http://etcd4:2380 --debug=true
ETCDCTL_CACERT=
ETCDCTL_CERT=
ETCDCTL_COMMAND_TIMEOUT=5s
ETCDCTL_DEBUG=true
ETCDCTL_DIAL_TIMEOUT=2s
ETCDCTL_DISCOVERY_SRV=
ETCDCTL_DISCOVERY_SRV_NAME=
ETCDCTL_ENDPOINTS=[http://etcd1:2379,http://etcd2:2379,http://etcd3:2379,http://etcd4:2379,http://etcd5:2379]
ETCDCTL_HEX=false
ETCDCTL_INSECURE_DISCOVERY=true
ETCDCTL_INSECURE_SKIP_TLS_VERIFY=false
ETCDCTL_INSECURE_TRANSPORT=true
ETCDCTL_KEEPALIVE_TIME=2s
ETCDCTL_KEEPALIVE_TIMEOUT=6s
ETCDCTL_KEY=
ETCDCTL_PASSWORD=
ETCDCTL_USER=
ETCDCTL_WRITE_OUT=simple
WARNING: 2023/08/01 09:56:15 [core] Adjusting keepalive ping interval to minimum period of 10s
WARNING: 2023/08/01 09:56:15 [core] Adjusting keepalive ping interval to minimum period of 10s
INFO: 2023/08/01 09:56:15 [core] parsed scheme: "etcd-endpoints"
INFO: 2023/08/01 09:56:15 [core] ccResolverWrapper: sending update to cc: {[{etcd1:2379 etcd1 <nil> 0 <nil>} {etcd2:2379 etcd2 <nil> 0 <nil>} {etcd3:2379 etcd3 <nil> 0 <nil>} {etcd4:2379 etcd4 <nil> 0 <nil>} {etcd5:2379 etcd5 <nil> 0 <nil>}] 0xc00019b160 <nil>}
INFO: 2023/08/01 09:56:15 [core] ClientConn switching balancer to "round_robin"
INFO: 2023/08/01 09:56:15 [core] Channel switches to new LB policy "round_robin"
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: got new ClientConn state: {{[{etcd1:2379 etcd1 <nil> 0 <nil>} {etcd2:2379 etcd2 <nil> 0 <nil>} {etcd3:2379 etcd3 <nil> 0 <nil>} {etcd4:2379 etcd4 <nil> 0 <nil>} {etcd5:2379 etcd5 <nil> 0 <nil>}] 0xc00019b160 <nil>} <nil>}
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to CONNECTING
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to CONNECTING
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to CONNECTING
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to CONNECTING
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to CONNECTING
INFO: 2023/08/01 09:56:15 [core] Subchannel picks a new address "etcd3:2379" to connect
INFO: 2023/08/01 09:56:15 [core] Subchannel picks a new address "etcd1:2379" to connect
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b600, CONNECTING
INFO: 2023/08/01 09:56:15 [core] Channel Connectivity change to CONNECTING
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b660, CONNECTING
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b6c0, CONNECTING
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b730, CONNECTING
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b790, CONNECTING
INFO: 2023/08/01 09:56:15 [core] Subchannel picks a new address "etcd4:2379" to connect
INFO: 2023/08/01 09:56:15 [core] Subchannel picks a new address "etcd2:2379" to connect
WARNING: 2023/08/01 09:56:15 [core] grpc: addrConn.createTransport failed to connect to {etcd4:2379 etcd4 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp etcd4:2379: connect: connection refused". Reconnecting...
WARNING: 2023/08/01 09:56:15 [core] grpc: addrConn.createTransport failed to connect to {etcd4:2379 etcd4 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp etcd4:2379: connect: connection refused". Reconnecting...
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to TRANSIENT_FAILURE
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b730, TRANSIENT_FAILURE
INFO: 2023/08/01 09:56:15 [core] Subchannel picks a new address "etcd5:2379" to connect
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to READY
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b600, READY
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to READY
INFO: 2023/08/01 09:56:15 [roundrobin] roundrobinPicker: newPicker called with info: {map[0xc00018b600:{{etcd1:2379 etcd1 <nil> 0 <nil>}}]}
INFO: 2023/08/01 09:56:15 [core] Channel Connectivity change to READY
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b6c0, READY
INFO: 2023/08/01 09:56:15 [roundrobin] roundrobinPicker: newPicker called with info: {map[0xc00018b600:{{etcd1:2379 etcd1 <nil> 0 <nil>}} 0xc00018b6c0:{{etcd3:2379 etcd3 <nil> 0 <nil>}}]}
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to READY
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b660, READY
INFO: 2023/08/01 09:56:15 [core] Subchannel Connectivity change to READY
INFO: 2023/08/01 09:56:15 [roundrobin] roundrobinPicker: newPicker called with info: {map[0xc00018b600:{{etcd1:2379 etcd1 <nil> 0 <nil>}} 0xc00018b660:{{etcd2:2379 etcd2 <nil> 0 <nil>}} 0xc00018b6c0:{{etcd3:2379 etcd3 <nil> 0 <nil>}}]}
INFO: 2023/08/01 09:56:15 [balancer] base.baseBalancer: handle SubConn state change: 0xc00018b790, READY
INFO: 2023/08/01 09:56:15 [roundrobin] roundrobinPicker: newPicker called with info: {map[0xc00018b600:{{etcd1:2379 etcd1 <nil> 0 <nil>}} 0xc00018b660:{{etcd2:2379 etcd2 <nil> 0 <nil>}} 0xc00018b6c0:{{etcd3:2379 etcd3 <nil> 0 <nil>}} 0xc00018b790:{{etcd5:2379 etcd5 <nil> 0 <nil>}}]}
Member 111db16c74e99d8d added to cluster 8a1d26397040e496
ETCD_NAME="etcd4"
ETCD_INITIAL_CLUSTER="etcd4=http://etcd4:2380,etcd5=http://etcd5:2380,etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://etcd4:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
# start the new member
export TOKEN=etcd-cluster-1
export CLUSTER_STATE=existing
mkdir -p /opt/etcd/data
chmod 700 /opt/etcd/data
root@etcd4:~# /opt/etcd/etcd --data-dir=/opt/etcd/data --name etcd4 --initial-advertise-peer-urls http://etcd4:2380 --listen-peer-urls http://etcd4:2380 --advertise-client-urls http://etcd4:2379 --listen-client-urls http://etcd4:2379 --initial-cluster etcd1=http://etcd1:2379,etcd2=http://etcd2:2379,etcd3=http://etcd3:2379,etcd4=http://etcd4:2379,etcd5=http://etcd5:2379 --initial-cluster-state existing --initial-cluster-token etcd-cluster-1
{"level":"info","ts":"2023-08-01T10:12:52.083+0900","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["/opt/etcd/etcd","--data-dir=/opt/etcd/data","--name","etcd4","--initial-advertise-peer-urls","http://etcd4:2380","--listen-peer-urls","http://etcd4:2380","--advertise-client-urls","http://etcd4:2379","--listen-client-urls","http://etcd4:2379","--initial-cluster","etcd1=http://etcd1:2379,etcd2=http://etcd2:2379,etcd3=http://etcd3:2379,etcd4=http://etcd4:2379,etcd5=http://etcd5:2379","--initial-cluster-state","existing","--initial-cluster-token","etcd-cluster-1"]}
{"level":"info","ts":"2023-08-01T10:12:52.083+0900","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/opt/etcd/data","dir-type":"member"}
{"level":"info","ts":"2023-08-01T10:12:52.083+0900","caller":"embed/etcd.go:131","msg":"configuring peer listeners","listen-peer-urls":["http://etcd4:2380"]}
{"level":"info","ts":"2023-08-01T10:12:52.083+0900","caller":"embed/etcd.go:139","msg":"configuring client listeners","listen-client-urls":["http://etcd4:2379"]}
{"level":"info","ts":"2023-08-01T10:12:52.084+0900","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.4","git-sha":"08407ff76","go-version":"go1.16.15","go-os":"linux","go-arch":"amd64","max-cpu-set":8,"max-cpu-available":8,"member-initialized":false,"name":"etcd4","data-dir":"/opt/etcd/data","wal-dir":"","wal-dir-dedicated":"","member-dir":"/opt/etcd/data/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://etcd4:2380"],"listen-peer-urls":["http://etcd4:2380"],"advertise-client-urls":["http://etcd4:2379"],"listen-client-urls":["http://etcd4:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"etcd1=http://etcd1:2379,etcd2=http://etcd2:2379,etcd3=http://etcd3:2379,etcd4=http://etcd4:2379,etcd5=http://etcd5:2379","initial-cluster-state":"existing","initial-cluster-token":"etcd-cluster-1","quota-size-bytes":2147483648,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2023-08-01T10:12:52.084+0900","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/opt/etcd/data/member/snap/db","took":"227.067µs"}
{"level":"warn","ts":"2023-08-01T10:12:52.086+0900","caller":"etcdserver/cluster_util.go:94","msg":"failed to unmarshal cluster response","address":"http://etcd5:2379/members","error":"invalid character 'p' after top-level value"}
{"level":"warn","ts":"2023-08-01T10:12:52.088+0900","caller":"etcdserver/cluster_util.go:94","msg":"failed to unmarshal cluster response","address":"http://etcd1:2379/members","error":"invalid character 'p' after top-level value"}
{"level":"warn","ts":"2023-08-01T10:12:52.090+0900","caller":"etcdserver/cluster_util.go:94","msg":"failed to unmarshal cluster response","address":"http://etcd2:2379/members","error":"invalid character 'p' after top-level value"}
{"level":"warn","ts":"2023-08-01T10:12:52.091+0900","caller":"etcdserver/cluster_util.go:94","msg":"failed to unmarshal cluster response","address":"http://etcd3:2379/members","error":"invalid character 'p' after top-level value"}
{"level":"info","ts":"2023-08-01T10:12:52.094+0900","caller":"embed/etcd.go:368","msg":"closing etcd server","name":"etcd4","data-dir":"/opt/etcd/data","advertise-peer-urls":["http://etcd4:2380"],"advertise-client-urls":["http://etcd4:2379"]}
{"level":"info","ts":"2023-08-01T10:12:52.094+0900","caller":"embed/etcd.go:370","msg":"closed etcd server","name":"etcd4","data-dir":"/opt/etcd/data","advertise-peer-urls":["http://etcd4:2380"],"advertise-client-urls":["http://etcd4:2379"]}
{"level":"fatal","ts":"2023-08-01T10:12:52.094+0900","caller":"etcdmain/etcd.go:204","msg":"discovery failed","error":"cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\t/go/src/go.etcd.io/etcd/release/etcd/server/etcdmain/main.go:40\nmain.main\n\t/go/src/go.etcd.io/etcd/release/etcd/server/main.go:32\nruntime.main\n\t/go/gos/go1.16.15/src/runtime/proc.go:225"}
So, based on the logs, the issue is that member 4 is not coming up successfully. I note that the environment variables printed by member add give the initial cluster as:
ETCD_INITIAL_CLUSTER="etcd4=http://etcd4:2380,etcd5=http://etcd5:2380,etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380"
However, in the command-line parameters you have the initial cluster specified with the client port 2379 rather than the peer port 2380:
--initial-cluster etcd1=http://etcd1:2379,etcd2=http://etcd2:2379,etcd3=http://etcd3:2379,etcd4=http://etcd4:2379,etcd5=http://etcd5:2379
Please try --initial-cluster using port 2380, as per https://etcd.io/docs/v3.5/op-guide/clustering.
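The "invalid character 'p' after top-level value" warnings fit this diagnosis. The new member asks each --initial-cluster URL for /members, and a client port most likely answers with a plain-text "404 page not found" instead of JSON. A minimal sketch of the suggested change follows, reusing the flags from the start command earlier in the thread and the peer URLs printed by member add. The helper wrong_port_entries is hypothetical, not part of etcd. The script prints the start command instead of running it, so it can be reviewed first.

```shell
# Peer URLs exactly as printed by `etcdctl member add` earlier in this thread.
INITIAL_CLUSTER="etcd4=http://etcd4:2380,etcd5=http://etcd5:2380,etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380"

# Hypothetical helper (not part of etcd): print every entry of an
# --initial-cluster value that does not end in the expected peer port.
# Succeeds (exit 0) only when at least one such entry was found.
wrong_port_entries() {
  # $1: comma-separated name=url list, $2: expected peer port
  printf '%s\n' "$1" | tr ',' '\n' | grep -v ":$2\$"
}

if wrong_port_entries "$INITIAL_CLUSTER" 2380; then
  # Offending entries were printed above by the helper.
  echo "fix the entries listed above before starting etcd" >&2
else
  # All entries use the peer port: print the start command for review.
  echo /opt/etcd/etcd --data-dir=/opt/etcd/data --name etcd4 \
    --initial-advertise-peer-urls http://etcd4:2380 \
    --listen-peer-urls http://etcd4:2380 \
    --advertise-client-urls http://etcd4:2379 \
    --listen-client-urls http://etcd4:2379 \
    --initial-cluster "$INITIAL_CLUSTER" \
    --initial-cluster-state existing \
    --initial-cluster-token etcd-cluster-1
fi
```

Only --initial-cluster changes relative to the original command. The client URLs stay on 2379, because that port is where etcdctl and other clients connect. The check is a plain suffix match, so an entry with a trailing slash or path would also be flagged.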