Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to specify timeout for how long to wait for blocking Producer#send #121

Merged

Conversation

JorgenRingen
Copy link
Contributor

No description provided.

@JorgenRingen
Copy link
Contributor Author

JorgenRingen commented Jun 23, 2021

A little bit unsure regarding providing commit-offset-timeout as an option: https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/ConsumerOffsetCommitter.java#L148

What's the reason for why timeouts isn't retried? 🤔

if (take == null)
                    throw new InternalRuntimeError(msg("Timeout waiting for commit response {} to request {}", timeout, commitRequest));

This exception would stop the control-loop.

@astubbs
Copy link
Contributor

astubbs commented Jul 8, 2021

A little bit unsure regarding providing commit-offset-timeout as an option: https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/ConsumerOffsetCommitter.java#L148

What's the reason for why timeouts isn't retried? 🤔

if (take == null)
                    throw new InternalRuntimeError(msg("Timeout waiting for commit response {} to request {}", timeout, commitRequest));

This exception would stop the control-loop.

It does get retried, but it is retried through the same system as if the user's function threw an error. It was easier to do / simpler. Down side is that the message is of course also reprocessed, so not ideal.

Regarding retrying, do you have any preference for how this should occur? Probably would have it's own PR as this PR as a simple config change is trivial to merge (sorry for delay).

@astubbs astubbs marked this pull request as draft July 12, 2021 09:48
@astubbs
Copy link
Contributor

astubbs commented Jul 12, 2021

needs rebasing

@JorgenRingen
Copy link
Contributor Author

JorgenRingen commented Aug 2, 2021

It does get retried, but it is retried through the same system as if the user's function threw an error. It was easier to do / simpler. Down side is that the message is of course also reprocessed, so not ideal.

Regarding retrying, do you have any preference for how this should occur? Probably would have it's own PR as this PR as a simple config change is trivial to merge (sorry for delay).

Perfectly understandable 👍

Agree that reprocessing in user-function should be limited as much as possible, so consumer/producer-errors should preferably be retried in the client lib itself if possible. In this specific case it might just be a matter of aligning the timeout so that the consumer is able to retry exhaustively. Perhaps https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html#consumerconfigs_request.timeout.ms is the one 🤔

Duration timeout = ParallelEoSStreamProcessor.DEFAULT_TIMEOUT;
                CommitResponse take = commitResponseQueue.poll(timeout.toMillis(), TimeUnit.MILLISECONDS); // blocks, drain until we find our response

NB: This exception actually caused the control-loop to stop as far as I remember, but msg was reprocessed after application restart :)

@JorgenRingen JorgenRingen force-pushed the fix/send_and_commit_timeouts branch from 490082b to be9518c Compare August 3, 2021 05:39
@astubbs
Copy link
Contributor

astubbs commented Sep 2, 2021

NB: This exception actually caused the control-loop to stop as far as I remember, but msg was reprocessed after application restart :)

Sorry, what exception caused the system to shutdown?

@astubbs
Copy link
Contributor

astubbs commented Sep 2, 2021

I think you need to mark this as ready for review?

@astubbs astubbs force-pushed the master branch 4 times, most recently from 8e54843 to 6312a34 Compare September 3, 2021 20:03
@JorgenRingen JorgenRingen force-pushed the fix/send_and_commit_timeouts branch from 894ae62 to 96e49d3 Compare September 7, 2021 08:02
@JorgenRingen
Copy link
Contributor Author

JorgenRingen commented Sep 7, 2021

NB: This exception actually caused the control-loop to stop as far as I remember, but msg was reprocessed after application restart :)

Sorry, what exception caused the system to shutdown?

If offset-commit timed out (10sec default): https://github.com/confluentinc/parallel-consumer/blob/master/parallel-consumer-core/src/main/java/io/confluent/parallelconsumer/internal/ConsumerOffsetCommitter.java#L156

I think you need to mark this as ready for review?

Sorry, rebased and marked as ready! Build failed, but not possible to see details: https://jenkins.confluent.io/job/confluentinc-pr/job/parallel-consumer/job/PR-121/3/display/redirect

@JorgenRingen JorgenRingen marked this pull request as ready for review September 7, 2021 08:33
@astubbs astubbs marked this pull request as draft September 16, 2021 13:49
@astubbs astubbs marked this pull request as ready for review September 16, 2021 13:49
@astubbs
Copy link
Contributor

astubbs commented Sep 16, 2021

Ah sorry, our build system is under maintenance atm.

ERROR: We have disabled builds for pull requests created by the open source communitydue to build system maintenance.We are working to finish it soon.

I'll check what the eta is.

@astubbs
Copy link
Contributor

astubbs commented Sep 17, 2021

ok @JorgenRingen , issue has been raised internally - and we should have a public PR builder setup soon, along with matrix builds for multiple AK versions like we used to have with GH actions.

@astubbs
Copy link
Contributor

astubbs commented Sep 17, 2021

FYI Matrix test against multiple AK and JDK versions in GH #103

@astubbs
Copy link
Contributor

astubbs commented Sep 28, 2021

Ok @JorgenRingen , if you rebase now it should build

@JorgenRingen JorgenRingen force-pushed the fix/send_and_commit_timeouts branch from 96e49d3 to 569ce98 Compare September 28, 2021 13:09
…periodic-consumer-sync commit-mode

Timeouts on commits will cause the control-loop to exit.
@JorgenRingen JorgenRingen force-pushed the fix/send_and_commit_timeouts branch from 569ce98 to b4e075b Compare September 28, 2021 13:17
@astubbs astubbs merged commit 4b10eb4 into confluentinc:master Sep 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants