Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of invalid base offsets #846

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rodaine
Copy link
Contributor

@rodaine rodaine commented Oct 22, 2024

When a successfully produced batch is returned with an invalid base offset (-1) but no error, the records' internal offsets are being miscomputed from this value. An invalid offset is returned with a non-errored batch primarily in situations where a duplicate idempotent produce is detected, but the actual base offset cannot be provided. Apache Kafka's producer client handles this by setting all record offsets to -1 if the batch's base offset is -1.

This patch makes this correction in the kgo producer callback logic, as well as in the recordToRecord function (though an invalid offset is probably impossible on the fetch side).

@twmb
Copy link
Owner

twmb commented Oct 26, 2024

The more you know. I'll push a patch release with this soon.

@richardartoul
Copy link

What scenario would this happen in? I can't find anywhere in the Kafka codebase where the DuplicateSequence exception is actually thrown, and in the correct case where the client only has 5 outstanding idempotent batches, the Kafka broker will return the correct offsets/timestamp for duplicate batches

@twmb
Copy link
Owner

twmb commented Oct 30, 2024

(I'm also interested if there's an answer for the above question)

@rodaine
Copy link
Contributor Author

rodaine commented Oct 30, 2024

Honestly I can't say definitively, as this was triggered through some long-running fuzz tests against a very unhealthy cluster. Regardless, this patch improves kgo's behavior to never return erroneous offsets on records under any circumstances, which I feel is generally good.

@rodaine
Copy link
Contributor Author

rodaine commented Oct 30, 2024

(To be clear, I think neither Apache Kafka nor RedPanda have ever returned the duplicate sequence number error.)

@akshayjshah
Copy link

@twmb What's your take on this patch? It seems valuable for this client to include all the safeguards of the reference client, even if this code path isn't exercised when running against the Apache Kakfa or Redpanda implementations.

@twmb
Copy link
Owner

twmb commented Nov 11, 2024

I think this is fine and I'll merge this, but since it isn't an active problem, I'll roll this into the next release train rather than cutting a patch with just this.

@Alfus
Copy link

Alfus commented Nov 18, 2024

FYI https://github.com/jepsen-io/redpanda is explicitly designed to handle records with -1 offsets. Though again, I am not sure when/why they are encountered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants