[BUG] Inability to Create Catalog via REST API Without Using Polaris CLI First #358

Closed
1 task done
mouadk opened this issue Oct 8, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@mouadk

mouadk commented Oct 8, 2024

Is this a possible security vulnerability?

  • This is NOT a possible security vulnerability

Describe the bug

When creating a catalog using the CLI, a CreateCatalogRequest is created, and a CatalogEntity gets persisted in the underlying storage. During this process, IcebergCatalogAdapter#getConfig (the handler for the api/v1/config endpoint) is invoked; it looks up the catalog whose name is passed in the warehouse field and returns catalog properties (https://iceberg.apache.org/docs/latest/configuration/#catalog-properties) to apply as overrides, if any.

However, when creating a catalog for the first time (assuming it was not done using the Polaris CLI), there is no existing catalog unless you bypass the api/v1/config call and directly invoke catalog creation, similar to the CLI (as seen in regtests). For example, when using the Java core API, the process first fetches the config during catalog build (org.apache.iceberg.rest.RESTSessionCatalog.initialize) to determine if any overrides are required. This results in the error:

Exception in thread "main" org.apache.iceberg.exceptions.RESTException: Unable to process: Unable to find warehouse XXXX

I am forced to first create the catalog using the Polaris CLI and cannot achieve the same via the REST endpoint by specifying the warehouse location in the configuration.
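
The lookup that fails here can be sketched as follows. This is a minimal illustration, assuming the `GET /v1/config` route with a `warehouse` query parameter from the Iceberg REST spec; the base URI `http://localhost:8181/api/catalog` and the helper name `config_url` are assumptions for illustration and do not come from this issue.

```python
from urllib.parse import urlencode


def config_url(base_uri: str, warehouse: str) -> str:
    """Build the config URL an Iceberg REST client requests during initialization.

    Polaris interprets the `warehouse` value as a catalog name, so the server
    has nothing to look up if that catalog has not been created yet.
    """
    query = urlencode({"warehouse": warehouse})
    return f"{base_uri.rstrip('/')}/v1/config?{query}"


# Hypothetical base URI; adjust it to your deployment.
print(config_url("http://localhost:8181/api/catalog", "polaris"))
```

Because this request happens before any other call, a client that only speaks the Iceberg REST protocol has no opportunity to create the catalog first.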

Would returning something like {"defaults":{},"overrides":{}} be a solution in this case? Also, why is it that the warehouse field requires the catalog name instead of allowing a direct reference to locations like S3 or Blob storage?
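
For context on what an empty response would mean on the client side: under the Iceberg REST spec, `defaults` are applied before the client's own properties and `overrides` after them. The sketch below only illustrates that merge order; `merge_config` is a hypothetical helper, not the Iceberg client's actual code, and the property values are assumed for the example.

```python
def merge_config(defaults: dict, client_props: dict, overrides: dict) -> dict:
    """Merge catalog properties in the order described by the Iceberg REST spec."""
    merged = dict(defaults)       # server-provided defaults come first
    merged.update(client_props)   # client configuration wins over defaults
    merged.update(overrides)      # server-provided overrides win over everything
    return merged


# Assumed client properties, for illustration only.
client_props = {"uri": "http://localhost:8181/api/catalog", "warehouse": "polaris"}
empty_response = {"defaults": {}, "overrides": {}}

effective = merge_config(
    empty_response["defaults"], client_props, empty_response["overrides"]
)
print(effective == client_props)
```

With an empty response the client configuration passes through unchanged. This says nothing about whether a server should return such a response for an unknown warehouse.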

Is this expected behavior, where we must first create the catalog using the Polaris CLI?
Or did I overlook something?

Thanks in advance.

To Reproduce

Use Java Core API to create a Catalog.

Actual Behavior

If the catalog is unknown to Polaris, a NOT FOUND error is raised when creating a new catalog.

Expected Behavior

Empty result?

Additional context

No response

System information

MacBook Pro: 2.9 GHz Quad-Core Intel Core i7

@mouadk added the bug label on Oct 8, 2024
@mouadk
Author

mouadk commented Oct 9, 2024

Also, it looks like the implementation does not match the API spec: https://github.com/apache/iceberg/blob/208ab20dc9ab8bcab3ee525d0ddaba80eeae7609/open-api/rest-catalog-open-api.yaml#L75

@MonkeyCanCode
Contributor

Something seems to be off. I recall being able to do this via REST directly. I will set up a test environment and then provide an update.

@MonkeyCanCode
Contributor

MonkeyCanCode commented Nov 24, 2024

@mouadk

Here are the steps I performed to create a catalog via REST directly, without going through the CLI:

show current catalogs if any:

yong@DESKTOP:~$ curl -X GET -H "Authorization: Bearer principal:root;realm:default-realm" http://localhost:8181/api/management/v1/catalogs
{"catalogs":[]}

create a new catalog with file as the storage backend (file is only for local testing purposes):

yong@DESKTOP:~$ curl -X POST -H "Authorization: Bearer principal:root;realm:default-realm" "http://localhost:8181/api/management/v1/catalogs" \
>   -H "Content-Type: application/json" \
>   -d '{
>         "catalog": {
>           "name": "polaris",
>           "type": "INTERNAL",
>           "readOnly": false,
>           "properties": {
>             "default-base-location": "file:///tmp/polaris/"
>           },
>           "storageConfigInfo": {
>             "storageType": "FILE",
>             "allowedLocations": [
>               "file:///tmp"
>             ]
>           }
>         }
>       }'

list the catalogs again:

yong@DESKTOP:~$ curl -X GET -H "Authorization: Bearer principal:root;realm:default-realm" http://localhost:8181/api/management/v1/catalogs
{"catalogs":[{"type":"INTERNAL","type":"INTERNAL","name":"polaris","properties":{"default-base-location":"file:///tmp/polaris/"},"createTimestamp":1732459417891,"lastUpdateTimestamp":1732459417891,"entityVersion":1,"storageConfigInfo":{"storageType":"FILE","storageType":"FILE","allowedLocations":["file:///tmp","file:///tmp/polaris/"]}}]}
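
The same create call can be issued from code instead of curl. The following is a minimal sketch using only the Python standard library; it reuses the URL, token, and payload from the commands above, while the helper name `build_create_catalog_request` is hypothetical. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request

# Endpoint and test token copied from the curl commands above.
MANAGEMENT_URL = "http://localhost:8181/api/management/v1/catalogs"
TOKEN = "principal:root;realm:default-realm"


def build_create_catalog_request(
    name: str, base_location: str, allowed_locations: list
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a catalog."""
    payload = {
        "catalog": {
            "name": name,
            "type": "INTERNAL",
            "readOnly": False,
            "properties": {"default-base-location": base_location},
            "storageConfigInfo": {
                "storageType": "FILE",  # FILE is only for local testing
                "allowedLocations": allowed_locations,
            },
        }
    }
    return urllib.request.Request(
        MANAGEMENT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_catalog_request(
    "polaris", "file:///tmp/polaris/", ["file:///tmp"]
)

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```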

Let me know if this is still causing issues for your setup.

@MonkeyCanCode
Contributor

@flyrain should we close this one as the above snippets show it can be done without CLI (via curl)?

@flyrain
Contributor

flyrain commented Jan 3, 2025

@mouadk This behavior is expected. The Iceberg REST specification is intentionally scoped to operate at the table and namespace level, treating any concepts beyond that as implementation-specific details.

Catalog server implementations, such as Polaris, may introduce additional layers (e.g., catalog) beyond namespaces. While these layers can be inferred from the warehouse field in the config endpoint, such implementations are neither standardized nor mandated by the Iceberg REST specification.

With that, users cannot rely on the Iceberg REST client to create a Polaris catalog, as it does not have knowledge of Polaris-specific concepts or higher-level abstractions. To create Polaris catalogs, the Polaris REST API or CLI should be used. This separation ensures that implementation-specific details remain decoupled from the standardized REST API.

Thanks @MonkeyCanCode for pinging me here. I think we can close this one.

@mouadk
Author

mouadk commented Jan 3, 2025

Hello @MonkeyCanCode, I was using the Iceberg core client to create the catalog. What @flyrain said makes sense. Thanks for the explanation, and sorry for my late response.

@MonkeyCanCode
Contributor

> Hello @MonkeyCanCode, I was using the Iceberg core client to create the catalog. What @flyrain said makes sense. Thanks for the explanation, and sorry for my late response.

All good, thanks for the clarification.
