
BugReport: ClusterTopologyRefreshTask is not shutdown when RedisClusterClient is shutdown #2904

Open
exceptionplayer opened this issue Jun 24, 2024 · 4 comments
Labels: type: bug A general bug
Milestone: Backlog

Comments

@exceptionplayer

Bug Report

When RedisClusterClient is shutdown, the ClusterTopologyRefreshTask may not be cancelled.

Current Behavior

When I close the RedisClusterClient by invoking RedisClusterClient.shutdown method, there is a chance that the ClusterTopologyRefreshTask is not stopped.

It seems this issue was meant to be resolved here, but the latest code does not appear to fix it.
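
A minimal sketch of the kind of setup that can hit this, assuming periodic topology refresh is enabled (the endpoint and refresh period are placeholders, not taken from the actual configuration):

```java
import java.time.Duration;

import io.lettuce.core.RedisURI;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import io.lettuce.core.cluster.RedisClusterClient;

public class ShutdownRace {

    public static void main(String[] args) {
        // Placeholder endpoint; any cluster node works.
        RedisClusterClient client = RedisClusterClient.create(RedisURI.create("redis://localhost:7000"));

        // Periodic refresh schedules ClusterTopologyRefreshTask on the event executor.
        ClusterTopologyRefreshOptions refresh = ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh(Duration.ofSeconds(1))
                .build();
        client.setOptions(ClusterClientOptions.builder()
                .topologyRefreshOptions(refresh)
                .build());

        client.connect().close();

        // If a refresh task is scheduled or already running at this point, it may keep
        // executing against the terminated event executor and fail as shown below.
        client.shutdown();
    }
}
```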

Stack trace
2024-06-24 23:27:04.223 ERROR 781313 --- [lettuce-eventExecutorLoop-3-59] i.n.u.c.D.rejectedExecution              : Failed to submit a listener notification task. Event loop shut down?

java.util.concurrent.RejectedExecutionException: event executor terminated
	at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:934) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:351) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:344) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:836) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.execute0(SingleThreadEventExecutor.java:827) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:817) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:862) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:500) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:185) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:95) ~[netty-transport-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:30) ~[netty-transport-4.1.109.Final.jar!/:4.1.109.Final]
	at io.lettuce.core.AbstractRedisClient.initializeChannelAsync0(AbstractRedisClient.java:439) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.AbstractRedisClient.lambda$initializeChannelAsync$2(AbstractRedisClient.java:412) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at reactor.core.publisher.LambdaMonoSubscriber.onNext(LambdaMonoSubscriber.java:171) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onNext(FluxPeekFuseable.java:210) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.request(MonoPeekTerminal.java:139) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.request(FluxPeekFuseable.java:144) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.LambdaMonoSubscriber.onSubscribe(LambdaMonoSubscriber.java:121) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onSubscribe(FluxPeekFuseable.java:178) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onSubscribe(MonoPeekTerminal.java:152) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.Mono.subscribe(Mono.java:4568) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.Mono.subscribeWith(Mono.java:4634) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.Mono.subscribe(Mono.java:4534) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.Mono.subscribe(Mono.java:4470) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at reactor.core.publisher.Mono.subscribe(Mono.java:4442) ~[reactor-core-3.6.5.jar!/:3.6.5]
	at io.lettuce.core.AbstractRedisClient.initializeChannelAsync(AbstractRedisClient.java:407) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.RedisClusterClient.connectStatefulAsync(RedisClusterClient.java:818) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.RedisClusterClient.connectToNodeAsync(RedisClusterClient.java:554) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.RedisClusterClient$NodeConnectionFactoryImpl.connectToNodeAsync(RedisClusterClient.java:1287) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.topology.DefaultClusterTopologyRefresh.openConnections(DefaultClusterTopologyRefresh.java:315) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.topology.DefaultClusterTopologyRefresh.loadViews(DefaultClusterTopologyRefresh.java:99) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.RedisClusterClient.fetchPartitions(RedisClusterClient.java:1033) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.RedisClusterClient.loadPartitionsAsync(RedisClusterClient.java:985) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.RedisClusterClient.refreshPartitionsAsync(RedisClusterClient.java:891) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.ClusterTopologyRefreshScheduler$ClusterTopologyRefreshTask.doRun(ClusterTopologyRefreshScheduler.java:338) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.lettuce.core.cluster.ClusterTopologyRefreshScheduler$ClusterTopologyRefreshTask.run(ClusterTopologyRefreshScheduler.java:323) ~[lettuce-core-6.3.2.RELEASE.jar!/:6.3.2.RELEASE/8941aea]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.109.Final.jar!/:4.1.109.Final]
	at java.base/java.lang.Thread.run(Thread.java:898) [?:?]

Environment

  • Lettuce version(s): 6.3.2.RELEASE
@exceptionplayer exceptionplayer changed the title ClusterTopologyRefreshTask is not shutdown when RedisClusterClient is shutdown BugReport: ClusterTopologyRefreshTask is not shutdown when RedisClusterClient is shutdown Jun 24, 2024
@tishun tishun added the type: bug A general bug label Jul 3, 2024
@tishun tishun added this to the 7.x milestone Jul 3, 2024
@tishun tishun added the status: good-first-issue An issue that can only be worked on by brand new contributors label Jul 17, 2024
@tishun tishun modified the milestones: 7.x, Backlog Jul 17, 2024
@thachlp
Contributor

thachlp commented Sep 12, 2024

@tishun @exceptionplayer please help review my pr 🙇

@tishun
Collaborator

tishun commented Oct 22, 2024

> @tishun @exceptionplayer please help review my pr 🙇

@thachlp did you check out the comments in the PR?

@thachlp
Contributor

thachlp commented Oct 23, 2024

> @tishun @exceptionplayer please help review my pr 🙇
>
> @thachlp did you check out the comments in the PR?

Let me check

@tishun tishun removed the status: good-first-issue An issue that can only be worked on by brand new contributors label Dec 21, 2024
@tishun
Collaborator

tishun commented Dec 23, 2024

This is a bit of a cat-and-mouse problem, because there is a chain of calls that leads to the actual execution of the task.
The event loop could be alive when the task is scheduled and then be in a closing state when the task starts executing. As @mp911de mentioned in #2985 (comment) we would have to - more or less - block the event loop from closing until the scheduled task completes. Basically:

  • if we detect the event loop is active
  • then we would block it from completing with a reentrant lock
  • then we would schedule the partition refresh task
  • then, when the task completes, we would unblock it

If the event loop were asked to close while a refresh is in progress, the thread that initiated the shutdown would wait until the refresh completes.
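
A rough sketch of that locking idea (purely illustrative, not Lettuce's actual internals; the class and method names are hypothetical):

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: guards the refresh task against a concurrent client shutdown.
class TopologyRefreshGuard {

    private final ReentrantLock refreshLock = new ReentrantLock(); // hypothetical field

    // Wraps the ClusterTopologyRefreshTask body: hold the lock while refreshing.
    void runRefresh(Runnable refreshPartitions) {
        refreshLock.lock(); // a concurrent shutdown now waits until the refresh finishes
        try {
            refreshPartitions.run();
        } finally {
            refreshLock.unlock();
        }
    }

    // Called from the shutdown path before the event executor group is closed.
    void awaitPendingRefresh() {
        refreshLock.lock(); // blocks if a refresh is currently in flight
        try {
            // safe to proceed with closing the event loop / executor group
        } finally {
            refreshLock.unlock();
        }
    }
}
```

The ordering race the comment describes still needs handling: the task might acquire the lock only after shutdown has already completed, so it would also have to re-check the client state before opening new connections.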
