S3 snapshots do not work on non-AWS S3? #3067

Open · nyxi opened this issue May 22, 2024 · 9 comments
Labels: bug (Something isn't working)

nyxi commented May 22, 2024

Describe the bug
Dragonfly logs "InternalError" for every S3 operation.

Looking at the receiving end (my S3 service), it appears that Dragonfly is making HTTP requests to https://fqdn/https://fqdn/bucket/file, which of course does not work.

Log excerpts from startup and after a BGSAVE:

I20240521 12:20:35.599390     1 dfly_main.cc:646] Starting dragonfly df-v1.18.1-6851a4c845625b0b14bb145177322dafbbc9858e
(...)
I20240521 12:20:35.727556     9 snapshot_storage.cc:185] Creating AWS S3 client; region=us-east-1; https=true; endpoint=swift.elastx.cloud
I20240521 12:20:35.727806     9 credentials_provider_chain.cc:28] aws: disabled EC2 metadata
I20240521 12:20:35.730230     9 credentials_provider_chain.cc:36] aws: loaded credentials; provider=environment
I20240521 12:20:35.738584    10 snapshot_storage.cc:242] Load snapshot: Searching for snapshot in S3 path: s3://dragonfly-juicefs/
E20240521 12:21:01.488096     1 server_family.cc:816] Failed to load snapshot: Failed list objects in S3 bucket: InternalError
(...)
E20240521 14:06:23.187770     9 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InternalError
E20240521 14:06:48.932502    10 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InternalError
E20240521 14:06:48.940094     9 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InternalError
E20240521 14:06:48.940392     8 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InternalError
E20240521 14:06:48.944156    11 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InternalError
I20240521 14:06:48.944537    10 server_family.cc:1720] Error in BgSaveFb: Input/output error: Failed to open write file

To Reproduce
S3 credentials in environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

S3 endpoint in environment variable DFLY_s3_endpoint (in my case swift.elastx.cloud).

Snapshot dir set in environment variable DFLY_dir (in my case s3://dragonfly-juicefs).
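Roughly the equivalent standalone docker run (a sketch, assuming the stock image reads the same DFLY_-prefixed environment flags as my Kubernetes deployment; the credentials are placeholders):

# sketch: placeholder credentials; DFLY_ env vars mirror the operator config above
docker run --rm \
  -e AWS_ACCESS_KEY_ID=<access_key> \
  -e AWS_SECRET_ACCESS_KEY=<secret_key> \
  -e DFLY_s3_endpoint=swift.elastx.cloud \
  -e DFLY_dir=s3://dragonfly-juicefs \
  docker.dragonflydb.io/dragonflydb/dragonfly:1.18.1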

Expected behavior
S3 snapshots to work.

Environment (please complete the following information):

  • Kubernetes, Dragonfly Operator v1.1.2 with Dragonfly v1.18.1

Additional context
S3 API for OpenStack Swift, not AWS

nyxi added the bug label on May 22, 2024

8times4 commented May 22, 2024

I'm experiencing the same or a similar issue with Cloudflare's R2:

I20240522 17:17:10.716820    12 snapshot_storage.cc:185] Creating AWS S3 client; region=us-east-1; https=true; endpoint=https://<account_id>.r2.cloudflarestorage.com
I20240522 17:17:10.716926    12 credentials_provider_chain.cc:28] aws: disabled EC2 metadata
I20240522 17:17:10.718613    12 credentials_provider_chain.cc:36] aws: loaded credentials; provider=environment
I20240522 17:17:10.723218    13 snapshot_storage.cc:242] Load snapshot: Searching for snapshot in S3 path: s3://dflydb-prod/
W20240522 17:17:10.743796    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:10.745679    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:10.800523    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:10.906309    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:11.112648    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:11.518488    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:12.327304    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:13.933485    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:17.150838    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:23.564281    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
W20240522 17:17:36.376405    13 http_client.cc:261] aws: http client: failed to resolve host; host=https; error=generic:99
E20240522 17:17:36.379698     1 server_family.cc:816] Failed to load snapshot: Failed list objects in S3 bucket: 

coupled with the following when trying to SAVE:

E20240522 17:01:49.042889     9 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: 

Same environment as OP.

Thanks,
8x4

romange (Collaborator) commented May 22, 2024

@8times4 what are the command-line flags you used to run dragonfly?


8times4 commented May 22, 2024

> @8times4 what are the command-line flags you used to run dragonfly?

Just the default ones set by the operator, plus --dir s3://dflydb-prod and --s3_endpoint=https://<account_id>.r2.cloudflarestorage.com

Here's also a docker command to reproduce without k8s (needs a Cloudflare account):

docker run --rm -p 6379:6379 \
  -e AWS_ACCESS_KEY_ID=<access_key> \
  -e AWS_SECRET_ACCESS_KEY=<secret_key> \
  -e AWS_REGION=us-east-1 \
  --ulimit memlock=-1 \
  docker.dragonflydb.io/dragonflydb/dragonfly:1.18.1 \
  --dir s3://dflydb-prod --logtostderr --requirepass=password \
  --s3_endpoint=https://<account_id>.r2.cloudflarestorage.com

romange (Collaborator) commented May 23, 2024

@andydunstall should the endpoint flag be with the https prefix?

andydunstall (Contributor) commented May 24, 2024

> @andydunstall should the endpoint flag be with the https prefix?

Yep, you can configure http/https using the --s3_use_https flag

nyxi (Author) commented May 28, 2024

To be clear, --s3_endpoint should be given without the scheme prefix; the scheme is selected by the --s3_use_https flag.

This is the problem @8times4 is hitting, but not the problem I'm having as detailed in the first post.
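For R2 that means flags roughly like the following (a sketch; the endpoint host is a placeholder):

# sketch: the scheme comes from --s3_use_https, not from the endpoint value
dragonfly --dir s3://dflydb-prod \
  --s3_endpoint=<account_id>.r2.cloudflarestorage.com \
  --s3_use_https=true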

romange (Collaborator) commented May 28, 2024

@nyxi is DSKY_s3_endpoint a typo in the issue, or in your real configuration? It should be DFLY_.

nyxi (Author) commented May 28, 2024

> @nyxi is DSKY_s3_endpoint a typo in the issue, or in your real configuration? It should be DFLY_.

Typo, sorry for the confusion. Updated the first post.


pepzwee commented Jul 3, 2024

I'm having a similar issue while using Backblaze's S3-compatible API.

The difference is that I am instead receiving InvalidArgument when trying to save.

I20240703 12:49:26.809077     1 init.cc:78] dragonfly running in opt mode.
I20240703 12:49:26.809190     1 dfly_main.cc:646] Starting dragonfly df-v1.19.2-2ff628203925b206c4a1031aa24916523dc5382e
I20240703 12:49:26.809424     1 dfly_main.cc:690] maxmemory has not been specified. Deciding myself....
I20240703 12:49:26.809434     1 dfly_main.cc:699] Found 3.13GiB available memory. Setting maxmemory to 2.50GiB
W20240703 12:49:26.809468     1 dfly_main.cc:373] Weird error 1 switching to epoll
I20240703 12:49:26.887785     1 proactor_pool.cc:147] Running 3 io threads
I20240703 12:49:26.890972     1 server_family.cc:721] Host OS: Linux 6.8.0-31-generic x86_64 with 3 threads
I20240703 12:49:26.902956     9 snapshot_storage.cc:185] Creating AWS S3 client; region=eu-central-003; https=true; endpoint=s3.eu-central-003.backblazeb2.com
I20240703 12:49:26.903054     9 credentials_provider_chain.cc:28] aws: disabled EC2 metadata
I20240703 12:49:26.907462     9 credentials_provider_chain.cc:36] aws: loaded credentials; provider=environment
I20240703 12:49:26.921793    10 snapshot_storage.cc:242] Load snapshot: Searching for snapshot in S3 path: s3://dg-df-backups/
W20240703 12:49:26.965176     1 server_family.cc:814] Load snapshot: No snapshot found
I20240703 12:49:26.979427     9 listener_interface.cc:101] sock[9] AcceptServer - listening on port 6379
E20240703 12:49:58.401154    10 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InvalidArgument
E20240703 12:49:58.438844     8 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InvalidArgument
E20240703 12:49:58.439571     9 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InvalidArgument
E20240703 12:49:58.439711    10 s3_write_file.cc:137] aws: s3 write file: failed to create multipart upload: InvalidArgument

Update:

I've been doing more digging, and according to Backblaze's docs there are certain limitations in their S3 implementation:

> Requests that include the following checksum HTTP headers are rejected with a 400 Bad Request response:
> x-amz-checksum-crc32, x-amz-checksum-crc32c, x-amz-checksum-sha1, x-amz-checksum-sha256, x-amz-checksum-algorithm, x-amz-checksum-mode

Fair enough, I think to myself, I'll just set s3_sign_payload to false and it should be fixed... nope. Dragonfly still adds the following headers when the flag is set to false, and I'm assuming that's why Backblaze rejects the requests.

[screenshot: captured request headers, still including the x-amz-checksum-* headers]
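For reference, this is roughly what I was running (a sketch; the bucket and endpoint are the values from my logs above, and the flag names are the ones discussed in this thread):

# sketch: attempted workaround; the checksum headers were still added despite the flag
dragonfly --dir s3://dg-df-backups \
  --s3_endpoint=s3.eu-central-003.backblazeb2.com \
  --s3_sign_payload=false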

I'll send them a question asking whether that is indeed why it's not working; I'll update here when I hear back.

Update 2:

> Currently, checksum headers in the HTTP request are unsupported, which is why it is being rejected. We can absolutely make a feature recommendation on your behalf in order to improve our compatibility with the S3 API.
