Peer disappears from peerswap-listpeers randomly #185
Comments
Adding this as a release blocker. We've continued to see Blockstream Store running v23.02 disappear from peerswap-listpeers on multiple CLN v23.05 nodes. We're doing some testing before upgrading that node to v23.05. If that turns out to be the fix, then we need to declare a higher minimum CLN version for PeerSwap.
Does the node also disappear from CLN's
Ok, maybe I found the problem, let's try to verify: I am seeing the following log messages:
The node behaves according to the specification
It seems that the node disconnects due to a too low
Just talked with @wtogami; this does not seem to be the cause of the issue, so this needs further investigation.
Possible: I think I began seeing this after #189 was merged. Definite: Two CLN v23.05.x nodes see Blockstream Store v23.02.x disappear from
The below might have been hitting the bug fixed by #206 which is different from the original bug here. We need to wait 4+ hours to see if this happens again.
I had an interesting conversation about this issue, and it could be the case that we overload CLN with the way we trigger the poll messages. Right now, we send out the poll messages in parallel every other hour. As the BS node has quite a few peers, this might overflow or be rejected. In the past we might have misunderstood errors on
A possible solution to fix this problem would be to spread out the load of the polling system so that we do not send out all messages at once but sequentially, with a timeout between the calls. Possible data structures to accomplish this could be a priority queue or a min heap, both ordered by timestamp.
Additionally we need to look out for log messages beginning with
Could the Store node be hitting #186? Although checking a node I have access to shows a few zombie processes but commands still work so perhaps that is not the problem.
I doubt that this is related, but this issue and #186 are my highest priorities.
No, Blockstream Store restarts entire Docker containers in order to do upgrades, so old processes don't have the opportunity to survive.
Confirmed it still happens where other nodes can't see Blockstream Store after a few hours. Blockstream Store is running CLN v23.05.2 with PeerSwap 725ca2c. Meanwhile Blockstream Store
I am wondering, is there any benefit in persisting peerswap-peers in the database? Wouldn't it be sufficient to just store them in memory?
Memory only is fine, but that wouldn't fix our current problem, right?
On a CLN v23.05 node, a v23.02 peer (Blockstream Store node) running PeerSwap seems to randomly disappear from peerswap-listpeers. Force disconnecting the node and letting CLN reconnect temporarily fixes it, but over time the node will disappear again. The v23.05 node has a channel to another v23.02 peer with PeerSwap where this does not happen, which is strange. I'm going to try to dig through logs to see if I can spot anything obvious. Will also try to replicate on signet.