Describe the bug
⚠️ I understand that this is a very niche issue, but I thought I'd share this trap with others.
If you accidentally point an S3 object_store client at a gRPC endpoint, it will happily read empty objects for most paths (i.e. all paths that are not covered by the gRPC endpoint). This can become quite a debugging nightmare.
To Reproduce
Set up a gRPC server, e.g. with tonic. The code example for the client sets this up on port 1234.
Then configure an S3 Client to point at it. It's important that you
let store = object_store::aws::AmazonS3Builder::new()
    .with_bucket_name("dummy")
    .with_client_options(
        object_store::ClientOptions::new()
            .with_allow_http(true)
            .with_http2_only(),
    )
    .with_endpoint("http://localhost:1234")
    .with_skip_signature(true)
    .build()
    .unwrap();
Expected behavior
I was naively expecting the client to error.
Additional context
gRPC, for some bizarre reason, decides not to use the HTTP status code at all but instead a custom response header, grpc-status. In our case, this is set to 12 for UNIMPLEMENTED, see https://grpc.github.io/grpc/core/md_doc_statuscodes.html.
The response body for UNIMPLEMENTED is empty. The content-length response header is set to 0 (that's required by the object_store client).
➡️ So I think what we could do as some kind of safeguard would be to check the grpc-status response header and bail out if it is set.
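A minimal sketch of what such a safeguard could look like, written against plain header name/value pairs rather than the real object_store response types. The function `reject_grpc_response` and the error type `UnexpectedGrpcResponse` are hypothetical names made up for this illustration and are not part of object_store; the sketch also assumes that grpc-status is visible among the response headers, as described above.

```rust
/// Hypothetical error type for this sketch; not part of `object_store`.
#[derive(Debug, PartialEq)]
pub struct UnexpectedGrpcResponse {
    /// Raw value of the `grpc-status` header, e.g. "12" for UNIMPLEMENTED.
    pub status: String,
}

/// Returns an error if the response headers contain `grpc-status`,
/// i.e. if the endpoint answered like a gRPC server instead of S3.
/// Header names are compared case-insensitively.
pub fn reject_grpc_response(
    headers: &[(&str, &str)],
) -> Result<(), UnexpectedGrpcResponse> {
    let found = headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("grpc-status"));

    match found {
        // Any value counts as a mismatch: an S3 endpoint should not send this header.
        Some((_, value)) => Err(UnexpectedGrpcResponse {
            status: value.trim().to_string(),
        }),
        None => Ok(()),
    }
}
```

With the response from the reproduction (content-length of 0 and grpc-status of 12), this check would return an error instead of letting the empty body pass as an empty object. A response without the header passes through unchanged.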
S3 responses for GET operations are just the pure data, there's NO wrapping of the bytes into any object like JSON or XML. All metadata is transmitted via headers, which our client doesn't need (also to support alternative S3 implementations more easily).
IMO this isn't something we should look to handle. If you point it at an HTTP server that returns 200 OK, how is it to know that this is wrong? Sure, we could handle grpc-status, but it seems rather odd to me to be inspecting a protocol-specific header.