Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accept references to payloads in put requests #1218

Open
2 tasks
fommil opened this issue Dec 9, 2024 · 2 comments
Open
2 tasks

accept references to payloads in put requests #1218

fommil opened this issue Dec 9, 2024 · 2 comments
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue

Comments

@fommil
Copy link

fommil commented Dec 9, 2024

Describe the feature

many builders require their input to be owned, which means that in order to handle a possible retry under failure, the caller must make a copy that is most likely going to be thrown away. This is very wasteful, especially in high throughput applications and when writing large payloads.

Use Case

here's an example of writing records to kinesis. The batch here has to be fully owned, containing the payloads. Which means we need to clone it just incase we need to retry it.

                PutRecordsRequestEntry::builder()
                    .data(Blob::new(data))
                    .partition_key(common_partition_key)
                    .build()
                    .unwrap(), // test code

...

        let putter = kinesis
            .put_records()
            .stream_name(&name)
            .set_records(Option::Some(batch));

        let mut success = false;
        let mut attempts = 0;
        while !success {
            // annoying that we have to clone the data to be able to retry
            let results = putter.clone().send().await.unwrap(); // test code
            let failures = results.failed_record_count;
            if failures.unwrap_or(0) == 0 {
                success = true;
            } else {
                tracing::debug!("RETRYING A WRITE TO THE KINESIS STREAM {name}");
                // we may have dupe records now, how exciting! it is possible to
                // only resend the ones that failed, but it's not worth it here.
                attempts += 1;
                sleep(Duration::from_secs(1)).await;
                if attempts >= 10 {
                    panic!("Too many retry attempts writing to kinesis stream {name}");
                }
            }
        }

Proposed Solution

I don't think builders are a very idiomatic way of doing a lot of this in the rust ecosystem, although it seems to be the AWS way. If there were no builders, or struct objects, we would be able to say that the methods could simply take referencse rather than owned objects. However, I doubt we'll see builders done away with... so it would be good enough if select responses returned the input parameter instead of completely consuming. That behaves effectively the same as taking a reference.

Or take Rc wrappers of everything in the builders... or push that to the smithy layer and use Rc wrapped bytes in the Blob.

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

A note for the community

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue, please leave a comment
@fommil fommil added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Dec 9, 2024
@aajtodd
Copy link
Contributor

aajtodd commented Dec 9, 2024

Thanks for the issue.

I don't think builders are a very idiomatic way of doing a lot of this in the rust ecosystem, although it seems to be the AWS way.

I'd argue that builders are fairly common in Rust but that's neither here nor there. The use of builders in the Rust SDK stems in part from dealing with model evolutions and being able to provide strong backwards compatibility guarantees.


I don't think there is much we can do here but I would ask why you are implementing retries on top of the SDK which already retries requests?

@aajtodd aajtodd added p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to 'closing-soon' in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Dec 9, 2024
@fommil
Copy link
Author

fommil commented Dec 13, 2024

Unfortunately, the PutRecords batch call cannot be retried automatically, because it supports partial success... so the retry you're @aajtodd talking about probably only works up until a partial success. I would imagine all endpoints with partial success are therefore susceptible to this same problem.

In this particular case, I think what is happening is that within the batch, the puts to shards that are being rescaled are failing, but the puts to shards that are fine are succeeding. A more efficient retry user-level logic would retry only the puts that failed the first time, but it's sufficient to retry everything. Regardless, to be able to avoid data loss, one must copy all the data and throw it away in most cases, because the builder consumes the input. The total payload can be up to 5MB (see PutRecords), broken into as many as 500 individual blobs, so it's not an insignificant cost for a high throughput application. (And remember this is all happening in parallel, consider how much data that is for a stream with 1,000 shards, albeit across many producers...)

There probably is something that can be done here, though... adding the request payloads to the failed responses would for sure solve this particular issue. That way there's no need to clone every 5MB upload before sending it, we know we will get back the requests that failed. That would be a PutRecordsResultEntry and adding an optional field failed_request: Option<PutRecordsRequestEntry> would be great, only populated on failure.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to 'closing-soon' in 7 days. label Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue
Projects
None yet
Development

No branches or pull requests

2 participants