feat(iceberg): support iceberg engine connection #20298
base: main
);

statement ok
set iceberg_engine_connection = 'public.my_conn';
Why don't we use the `connection = conn` syntax, but use a session variable instead? :thinking:
I was looking for a way to let users set a default behavior for a whole database. Adding a syntax is also OK; however, our iceberg table can be used with a connector, which can also have a connection. This might be somewhat confusing for the user.
Oh, that's indeed a problem... So the `with` can be both for the connector and for the table.
crunchybridge https://arc.net/l/quote/vdcxffua
snowflake https://arc.net/l/quote/fomfghdw
Both of them allow the setting of a variable in a database.
The term `connection` in `iceberg_engine_connection` seems a little vague here, since it's not clear whether it's for the catalog or the volume. What about `iceberg_engine_volume`...?
Oh, we will allow users to specify both volume and catalog (or either one of them) in the same connection, right?
Yes. An iceberg connection can specify the catalog and the bucket together, although for the iceberg engine we only allow specifying the bucket.
pub fn get_secret_by_id(
    &self,
    db_name: &str,
    secret_id: u32,
) -> CatalogResult<&Arc<SecretCatalog>> {
    let secret_id = SecretId::new(secret_id);
    for schema in self.get_database_by_name(db_name)?.iter_schemas() {
        if let Some(secret) = schema.get_secret_by_id(&secret_id) {
            return Ok(secret);
        }
    }
    Err(CatalogError::NotFound("secret", secret_id.to_string()))
}
This seems unused
-    with.insert("warehouse.path".to_owned(), warehouse_path.clone());
+    if let Some(warehouse_path) = warehouse_path.clone() {
+        with.insert("warehouse.path".to_owned(), warehouse_path.clone());
+    }
Sink and source options look basically the same. We'd better have a `with_common` shared by `with_source` and `with_sink`.
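A minimal sketch of that suggestion, not the PR's actual code. The `IcebergEngineOptions` struct, the `connector` entry, and the `sink.only.option` key are assumptions made up for illustration; only the `warehouse.path`, `s3.region`, and `s3.endpoint` keys come from the snippets in this thread.

```rust
use std::collections::BTreeMap;

/// Hypothetical inputs; the field names are assumptions for this sketch.
struct IcebergEngineOptions {
    warehouse_path: Option<String>,
    s3_region: String,
    s3_endpoint: Option<String>,
}

/// Options that both the source and the sink would need.
fn with_common(opts: &IcebergEngineOptions) -> BTreeMap<String, String> {
    let mut with = BTreeMap::new();
    // Assumed shared entry, not taken from the PR.
    with.insert("connector".to_owned(), "iceberg".to_owned());
    with.insert("s3.region".to_owned(), opts.s3_region.clone());
    // Optional values are inserted only when they are present.
    if let Some(path) = &opts.warehouse_path {
        with.insert("warehouse.path".to_owned(), path.clone());
    }
    if let Some(endpoint) = &opts.s3_endpoint {
        with.insert("s3.endpoint".to_owned(), endpoint.clone());
    }
    with
}

/// Source options: the common set; source-only keys would be added here.
fn with_source(opts: &IcebergEngineOptions) -> BTreeMap<String, String> {
    with_common(opts)
}

/// Sink options: the common set plus sink-only keys.
fn with_sink(opts: &IcebergEngineOptions) -> BTreeMap<String, String> {
    let mut with = with_common(opts);
    // Placeholder sink-only key, purely illustrative.
    with.insert("sink.only.option".to_owned(), "value".to_owned());
    with
}
```

With this shape, the `Option` handling for `warehouse.path` lives in one place instead of being repeated in both builders.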
let _s3_region = params
    .properties
    .get("s3.region")
    .ok_or_else(|| anyhow!("`s3.region` must be set in iceberg engine connection"))?
    .to_owned();
let _s3_endpoint = params.properties.get("s3.endpoint").map(|s| s.to_owned());
let _warehouse_path = params
    .properties
    .get("warehouse.path")
    .map(|s| s.to_owned())
    .ok_or_else(|| {
        anyhow!("`warehouse.path` must be set in iceberg engine connection")
    })?;
Validating properties here isn't very elegant or user-friendly. Could we validate them at `create connection` time?
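One way this could look, as a rough sketch only. The function name `validate_iceberg_engine_connection` and the plain `String` error are hypothetical and not part of the PR; the required keys are the two checked in the snippet above. It also assumes the create-connection handler can tell that a connection is meant for the iceberg engine.

```rust
use std::collections::BTreeMap;

/// Keys the iceberg engine requires, taken from the snippet under review.
const REQUIRED_KEYS: [&str; 2] = ["s3.region", "warehouse.path"];

/// Hypothetical check to run while handling `create connection`.
/// Reports every missing key at once so the user can fix them in one pass.
fn validate_iceberg_engine_connection(
    properties: &BTreeMap<String, String>,
) -> Result<(), String> {
    let missing: Vec<&str> = REQUIRED_KEYS
        .iter()
        .copied()
        .filter(|key| !properties.contains_key(*key))
        .collect();
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "iceberg engine connection is missing required properties: {}",
            missing.join(", ")
        ))
    }
}
```

Failing at creation time surfaces the error next to the statement that caused it, rather than later when a table is created with the connection.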
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Add `iceberg_engine_connection` to allow users to provide their own bucket for the iceberg engine via an iceberg connection. Currently, only warehouse information can be configured in the iceberg engine connection; the iceberg catalog is still handled by us in the meta SQL backend. This config makes it much easier for us to share iceberg tables with users, since the underlying warehouse is managed by users and they have better control of the warehouse credentials.
Checklist
Documentation
Release note