[ Docs ]: Missing arguments in Nessie GC documentation for "gc/expire" phase, causes "UnknownHostException" when using Minio as Data Lake. #9991

Open
kishlay-kr opened this issue Nov 25, 2024 · 1 comment

@kishlay-kr

Description

The documentation for the Nessie GC "gc" arguments is missing one important field. Because of this omission, the expire phase of nessie-gc fails with the error "Received an UnknownHostException when attempting to interact with a service".

Setup:

  1. Nessie version: 0.99.0
  2. Trino version: 459
  3. Minio version: RELEASE.2024-08-29T01-40-52Z

I am using Minio as the data lake for my local setup.

Steps to reproduce:

  1. Run the expire phase of the nessie-gc tool with MinIO as the data lake:
     java -jar nessie-gc.jar expire -<other args>
  2. Use the S3 arguments provided in the documentation.

Proposed fix:

The documentation lists the following data-lake-specific arguments for the expire phase of nessie-gc:

For S3:
    - s3.access-key-id
    - s3.secret-access-key
    - s3.endpoint, if you use an S3 compatible object store like MinIO

One more argument is needed here:
    - s3.path-style-access=true

Without this argument, nessie-gc builds AWS-style S3 URLs, where the bucket name becomes part of the hostname. In our setup that hostname does not resolve, which gives "UnknownHostException".
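
To illustrate why the missing argument matters, here is a minimal sketch of how the two S3 addressing styles (path-style and the default AWS virtual-hosted style) differ. It is an illustration only, not the code nessie-gc or the AWS SDK actually runs. The function name `s3_object_url`, the endpoint `http://minio:9000`, the bucket `warehouse`, and the object key are all hypothetical.

```python
from urllib.parse import urlsplit


def s3_object_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build an object URL for an S3-compatible endpoint (illustrative helper)."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path-style: the bucket is the first path segment, the host is unchanged.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{key}"
    # Virtual-hosted style (AWS default): the bucket is prepended to the host.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{key}"


# Hypothetical local MinIO endpoint, bucket, and key, for illustration only.
endpoint = "http://minio:9000"
bucket = "warehouse"
key = "data/file.parquet"

print(s3_object_url(endpoint, bucket, key, path_style=False))
# http://warehouse.minio:9000/data/file.parquet
print(s3_object_url(endpoint, bucket, key, path_style=True))
# http://minio:9000/warehouse/data/file.parquet
```

In the virtual-hosted form the client has to resolve a host such as `warehouse.minio`, which a local MinIO setup usually has no DNS entry for. That would be consistent with the "UnknownHostException" reported above.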

snazy (Member) commented Nov 25, 2024

Hi @kishlay-kr,
do you want to provide a PR to add some more context information to the help message?
