[DO NOT MERGE] Use aws file transfer manager in place of s3blobstore.actor.cpp. #12075

Draft
wants to merge 1 commit into
base: main
Conversation

@saintstack (Contributor) commented Apr 4, 2025

Draft. An illustration of how we might move to using the AWS file transfer manager API.

But after this exploration, I don't think we should go this route.

Here are the "Advantages" of the AWS file transfer manager API, per https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/transfer-manager.html:

  • Automatic part size optimization
  • Concurrent transfer management
  • Built-in retry logic with exponential backoff
  • Progress tracking via callbacks
  • Memory management
  • Checksum verification
  • Automatic handling of S3-specific requirements
  • Built-in bandwidth throttling

It would be sweet if the S3 interface could be a library maintained by others, and the 'new' AWS S3 client is purportedly faster. Otherwise, though, we already have most of the above in our current implementation: (configurable) multipart sizing, retries with backoff, checksum verification, and so on. Some of the remaining features we don't really want (concurrent transfers, directory transfers). There are downsides, too. The file transfer calls are blocking; under the hood, the SDK runs them in worker threads. Making them sit behind the actor API requires gymnastics: monitoring the returned file transfer handles in a little thread pool (lots of new code). And the transfer manager's bandwidth throttling can't be meshed with the native fdb throttling -- not easily, at least.

I also looked at using the AWS S3 client behind s3blobstore, but again the calls are blocking, so to fit Flow we'd need async bridging code. Here is a little table that a 'friend' of mine made for me:

| Feature | Manual HTTP (S3BlobStore original) | TransferManagerWrapper | Direct S3 Client (async wrapped) |
| --- | --- | --- | --- |
| Auth/signing | Manual | SDK | SDK |
| Multipart mgmt | Manual | SDK (TransferManager) | Manual |
| Flow throttling | Yes (directly integrated) | No | No (difficult/inaccurate) |
| API simplicity | Low | High (for transfers) | Medium (SDK calls) |
| Async wrapper need | No (uses Flow primitives) | Yes (few wrappers) | Yes (many wrappers) |
| Overall complexity | Medium (HTTP/auth) | Medium (async wrapper) | High (many async wrappers) |
| Best for | Simple transfers, fine-grained control | Ease of use for large transfers | Advanced users needing fine-grained async control |

Here are some notes on this (unfinished) PR:

Fit the aws file transfer api in behind the s3client.actor.cpp currently used by bulkload/bulkdump.

If WITH_AWS_BACKUP is set, we build s3 and transfer libs (and dependencies -- lots!).

  • fdbclient/S3Client.actor.cpp
    Go via the new S3TransferManagerWrapper.actor.h 'bridge' to use the AWS transfer manager. The transfer manager doesn't do deletes, so for those we use the AWS S3 client directly.

  • fdbclient/S3TransferManagerWrapper.actor.cpp
    Bridge between s3client and AWS file transfer. Enqueues transfers and then polls the returned file transfer handles to figure out when they are done. Monitors run in a little thread pool, which needs explicit startup/shutdown.

  • fdbclient/include/fdbclient/S3TransferManagerWrapper.actor.h
    Add uploadFileWithTransferManager(std::string localFile, std::string bucketName, std::string objectKey);
    Add downloadFileWithTransferManager(std::string bucketName,

@saintstack marked this pull request as draft April 4, 2025 04:14
@foundationdb-ci

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 22afdc5
  • Duration 0:08:07
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 22afdc5
  • Duration 0:08:14
  • Result: ❌ FAILED
  • Error: Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /opt/homebrew/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 22afdc5
  • Duration 0:08:28
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 22afdc5
  • Duration 0:09:02
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 22afdc5
  • Duration 0:09:16
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 22afdc5
  • Duration 0:10:08
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 22afdc5
  • Duration 0:13:23
  • Result: ❌ FAILED
  • Error: Error while executing command: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${HOME}/.ssh_key ec2-user@${MAC_EC2_HOST} /usr/local/bin/bash --login -c ./build_pr_macos.sh. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@saintstack saintstack changed the title Use aws file transfer manager in place of s3blobstore.actor.cpp. [DO NOT MERGE] Use aws file transfer manager in place of s3blobstore.actor.cpp. Apr 11, 2025