
Parallel consumer does not close kafka consumer if commit fails during close #597

Open
BartoszSta opened this issue Jul 6, 2023 · 6 comments

@BartoszSta

When AbstractParallelEoSStreamProcessor.close(Duration timeout) is executed it performs commitOffsetsThatAreReady() - if this operation fails, maybeCloseConsumer() is never executed, so the consumer is not closed.
An unclosed consumer that is no longer polling stays in the consumer group for up to max.poll.interval.ms - which can also prevent other consumers from joining the group.
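To make the failure mode concrete, here is a minimal plain-Java sketch of the close sequence described above. The method names commitOffsetsThatAreReady() and maybeCloseConsumer() come from the issue text; everything else (the class, fields, and exception) is illustrative, not the actual parallel-consumer source.

```java
// Hypothetical sketch of the reported bug: if the commit throws during
// close, the consumer-closing step is skipped entirely. A try/finally
// would guarantee the consumer is released either way.
public class CloseSketch {
    static boolean consumerClosed = false;

    static void commitOffsetsThatAreReady() {
        throw new RuntimeException("commit timed out"); // simulated failure
    }

    static void maybeCloseConsumer() {
        consumerClosed = true;
    }

    // Buggy flow: the commit failure propagates before close runs.
    static void closeBuggy() {
        commitOffsetsThatAreReady();
        maybeCloseConsumer(); // never reached
    }

    // Safer flow: close in finally, so the consumer is released regardless.
    static void closeSafe() {
        try {
            commitOffsetsThatAreReady();
        } finally {
            maybeCloseConsumer();
        }
    }

    public static void main(String[] args) {
        try { closeBuggy(); } catch (RuntimeException e) { }
        System.out.println("after buggy close, consumerClosed=" + consumerClosed);

        try { closeSafe(); } catch (RuntimeException e) { }
        System.out.println("after safe close, consumerClosed=" + consumerClosed);
    }
}
```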

In my case the issue played out like this:

  1. Kafka removed and re-added the ACL for the consumer user (not sure if it was Terraform).
  2. The poll system failed.
  3. My application detected that the parallel processor was in a failed state (isClosedOrFailed()) and tried to close it and create it again.
  4. Close failed on the commit (due to a timeout - not sure why); the consumer was not closed but was no longer polling.
  5. A new parallel consumer could not join the group for 5 minutes (until the old consumer was removed from the group after max.poll.interval.ms).

Of course, since I am providing the kafka consumer to the parallel listener I can close it myself - which I will do if closing the parallel consumer fails - but I think this should be done by the parallel consumer.

@johnbyrnejb
Contributor

We will investigate this in due time. Thanks

@BartoszSta
Author

Any update on this?
Actually it is not really possible to close the provided consumer while polling is still in progress (the poll holds a lock on the kafka consumer, so another thread cannot close it), and polling only stops after the commit is performed.
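The lock contention described in this comment can be sketched in plain Java, with the Kafka classes replaced by a ReentrantLock and a flag (all names here are illustrative): the poll loop holds the consumer lock, so a second thread cannot take it to close the consumer until the loop is first asked to stop - the role that KafkaConsumer.wakeup() plays in the real client.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a closer thread cannot acquire the consumer lock while the
// poll loop holds it; it must signal the loop to stop first, then close.
public class PollLockSketch {
    static final ReentrantLock consumerLock = new ReentrantLock();
    static final AtomicBoolean stopRequested = new AtomicBoolean(false);

    public static void main(String[] args) throws InterruptedException {
        Thread poller = new Thread(() -> {
            while (!stopRequested.get()) {   // the poll loop
                consumerLock.lock();         // consumer methods lock internally
                try {
                    Thread.sleep(10);        // stand-in for a blocking poll(...)
                } catch (InterruptedException ignored) {
                } finally {
                    consumerLock.unlock();
                }
            }
        });
        poller.start();

        stopRequested.set(true);             // analogous to consumer.wakeup()
        poller.join();                       // wait for the poll loop to exit

        consumerLock.lock();                 // only now can close() take the lock
        try {
            System.out.println("consumer closed");
        } finally {
            consumerLock.unlock();
        }
    }
}
```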

@rkolesnev
Member

@BartoszSta - Hmm - I am wondering if that commit-on-close failure was due to the same bug as #648?
I will still address force-closing the consumer on PC shutdown though, to prevent such lock-ups regardless of the underlying reason - aiming to do it in the next few days or so.

@BartoszSta
Author

> Hmm - I am wondering if that commit-on-close failure was due to the same bug as #648?

@rkolesnev This happened on 0.5.2.5, so probably not. In any case, something might still happen during the close operation that prevents the commit from working (a network issue, etc.).

@rkolesnev
Member

@BartoszSta - yeah - I looked into it for a bit. It looks like under certain conditions the actual KafkaConsumer gets stuck in a metadata update loop and cannot be closed - I am still trying to figure out whether the issue is in the KafkaConsumer itself or in how I am closing it.

@bartoszstasikowski-gft

@rkolesnev It looks like this issue is no longer present in 0.5.3.1 (possibly earlier): exceptions from commitOffsetsThatAreReady() are now caught, and the consumer is closed afterwards.
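The fixed behaviour described in this comment can be sketched as follows. This is a hedged illustration of the pattern (catch the commit failure, close the consumer anyway, then surface the error), not the actual 0.5.3.1 source; all names besides commitOffsetsThatAreReady() are hypothetical.

```java
// Sketch of the fixed close flow: a commit failure no longer aborts the
// shutdown - the consumer is closed first, then the error is rethrown.
public class FixedCloseSketch {
    static void commitOffsetsThatAreReady() {
        throw new RuntimeException("commit failed"); // simulated failure
    }

    static void closeConsumer() {
        System.out.println("consumer closed");
    }

    static void close() {
        RuntimeException commitFailure = null;
        try {
            commitOffsetsThatAreReady();
        } catch (RuntimeException e) {
            commitFailure = e;   // remember the failure, keep shutting down
        }
        closeConsumer();         // runs even when the commit failed
        if (commitFailure != null) {
            throw commitFailure; // surface the original error afterwards
        }
    }

    public static void main(String[] args) {
        try {
            close();
        } catch (RuntimeException e) {
            System.out.println("close reported: " + e.getMessage());
        }
    }
}
```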
