Periodic restart of parachain RPC node #6859

Closed
2 tasks done
Kailai-Wang opened this issue Dec 12, 2024 · 6 comments
Labels
I2-bug The node fails to follow expected behavior. I10-unconfirmed Issue might be valid, but it's not yet known.

Comments

@Kailai-Wang

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

We run a Litentry parachain RPC node at tag v0.9.21-01, which is based on polkadot-sdk v1.11.0.

Roughly every 10 minutes it crashes with the relay chain error below:

2024-12-12 10:34:08 [Parachain] :zzz: Idle (12 peers), best: #6362236 (0x2fcc…5bd5), finalized #6362234 (0xaefc…c4f0), ⬇ 23.9kiB/s ⬆ 1.6kiB/s
2024-12-12 10:34:12 [Relaychain] Received finalized block via RPC: #23808419 (0x8260…ef99 -> 0xe1e7…69b7)
2024-12-12 10:34:12 [Relaychain] Received imported block via RPC: #23808422 (0xfb45…be4c -> 0x16db…8fc0)
2024-12-12 10:34:13 [Relaychain] Received imported block via RPC: #23808422 (0xfb45…be4c -> 0x00ee…4899)
2024-12-12 10:34:13 [Parachain] :zzz: Idle (12 peers), best: #6362236 (0x2fcc…5bd5), finalized #6362235 (0xe47b…c1bb), ⬇ 1.7kiB/s ⬆ 1.9kiB/s
2024-12-12 10:34:16 [Relaychain] Received finalized block via RPC: #23808420 (0xe1e7…69b7 -> 0xfacf…5ec9)
2024-12-12 10:34:18 [Parachain] :zzz: Idle (13 peers), best: #6362236 (0x2fcc…5bd5), finalized #6362235 (0xe47b…c1bb), ⬇ 11.0kiB/s ⬆ 5.7kiB/s
2024-12-12 10:34:19 [Relaychain] Received imported block via RPC: #23808423 (0x16db…8fc0 -> 0xaa77…9208)
2024-12-12 10:34:19 [Parachain] :new: Imported #6362238 (0x9d08…1153 → 0x54f5…24d9)
2024-12-12 10:34:23 [Relaychain] Overseer exited with error err=Generated(SubsystemStalled("collator-protocol-subsystem", "signal", "polkadot_node_subsystem_types::OverseerSignal"))
2024-12-12 10:34:23 [Relaychain] subsystem exited with error subsystem="network-bridge-rx" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
2024-12-12 10:34:23 [Relaychain] subsystem exited with error subsystem="network-bridge-tx" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
2024-12-12 10:34:23 [Relaychain] subsystem exited with error subsystem="chain-api" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
2024-12-12 10:34:23 [Relaychain] Essential task `overseer` failed. Shutting down service.
2024-12-12 10:34:23 [Relaychain] Protocol command streams have been shut down
2024-12-12 10:34:23 [Relaychain] error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
2024-12-12 10:34:23 [Relaychain] subsystem exited with error subsystem="availability-recovery" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
2024-12-12 10:34:23 [Relaychain] cannot query the runtime API version: Unable to communicate with RPC worker: RPC worker channel closed. This can hint and connectivity issues with the supplied RPC endpoints. Message: oneshot canceled api="para_backing_state"
2024-12-12 10:34:23 [Relaychain] subsystem exited with error subsystem="runtime-api" err=Generated(Context("Signal channel is terminated and empty."))
Error: Service(Other("Essential task failed."))
CLI parameter `--execution` has no effect anymore and will be removed in the future!

Node start command:

--chain=litentry  --state-pruning=archive --port 40333 --rpc-port 9944 --rpc-cors all --enable-evm-rpc --rpc-external --trie-cache-size 0 --delayed-best-block --prometheus-port 9615 --prometheus-external --name litentry-rpc-alice --rpc-max-connections 5000 --relay-chain-rpc-urls wss://rpc.ibp.network/polkadot

I tried other relay-chain-rpc-urls endpoints, but it didn't help.

I found an issue with a similar symptom, #1730, but it is a year old, and the polkadot-sdk improvements made since then should allegedly already be included in 1.11.0.

I appreciate any help!

Steps to reproduce

No response

@Kailai-Wang Kailai-Wang added I10-unconfirmed Issue might be valid, but it's not yet known. I2-bug The node fails to follow expected behavior. labels Dec 12, 2024
@bkchr
Member

bkchr commented Dec 12, 2024

Can you please run with -lparachain=debug,parachain::collator-protocol=trace and then post the logs from about 2 minutes before it restarts? Ty.
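For anyone collecting the requested logs, the following is a minimal sketch of how the window before a restart could be cut out of a saved log file. It assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the output posted above, and uses the `Essential task` line from that output as the crash marker. The function names are hypothetical helpers for illustration and are not part of polkadot-sdk.

```python
from datetime import datetime, timedelta

# Crash marker taken from the log output posted in this issue.
CRASH_MARKER = "Essential task `overseer` failed"
# Assumption: every regular log line begins with this timestamp layout.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:19], TS_FORMAT)
    except ValueError:
        return None


def window_before_crash(lines, minutes=2):
    """Return timestamped lines from `minutes` before the first crash marker
    up to and including the second in which the marker was logged."""
    crash_ts = None
    for line in lines:
        if CRASH_MARKER in line:
            crash_ts = parse_ts(line)
            break
    if crash_ts is None:
        return []  # no crash marker (or marker line without a timestamp)

    start = crash_ts - timedelta(minutes=minutes)
    selected = []
    for line in lines:
        ts = parse_ts(line)
        # Lines without a timestamp (e.g. the final "Error: ..." line) are skipped.
        if ts is not None and start <= ts <= crash_ts:
            selected.append(line)
    return selected
```

A typical use would be to read the node's log file into a list of lines, call `window_before_crash(lines)`, and paste the result into the issue.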

@Kailai-Wang Kailai-Wang changed the title from "Periodic Arestart of parachain RPC node" to "Periodic restart of parachain RPC node" Dec 12, 2024
@alexggh
Contributor

alexggh commented Dec 12, 2024

I think it is this issue: #4167, which should be fixed by #4471, which is first included in polkadot v1.13.
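As a rough illustration of the version check implied here, the sketch below compares a polkadot-sdk version tag against v1.13, the first release said above to include the fix. It assumes tags end in a `MAJOR.MINOR` or `MAJOR.MINOR.PATCH` number with an optional `v` prefix; tags in other schemes are rejected rather than guessed at. The helper names are hypothetical.

```python
import re

# Per the comment above, the fix is first included in polkadot v1.13.
FIRST_FIXED = (1, 13, 0)


def parse_version(tag):
    """Parse a tag such as 'v1.11.0' or 'polkadot-v1.13.0' into a tuple of ints."""
    match = re.search(r"v?(\d+)\.(\d+)(?:\.(\d+))?$", tag)
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def includes_fix(tag):
    """Return True if the tagged version is at or after the first fixed release."""
    return parse_version(tag) >= FIRST_FIXED
```

With the version from this report, `includes_fix("v1.11.0")` evaluates to `False`, which matches the suggestion to upgrade.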

@bkchr
Member

bkchr commented Dec 12, 2024

Yeah looks like this issue!

@Kailai-Wang
Author

Kailai-Wang commented Dec 12, 2024

Cool, thanks for the information 👍

Let me try to use a newer polkadot-sdk and see if it works.

I'll report back

@skunert
Contributor

skunert commented Dec 18, 2024

Closing this, @Kailai-Wang feel free to reopen if the issue persists!

@skunert skunert closed this as completed Dec 18, 2024
@Kailai-Wang
Author

Yes, we just updated our RPC nodes to a newer polkadot-sdk, and now it seems to be working fine!

Thanks for the help!
