-
Notifications
You must be signed in to change notification settings - Fork 224
Refactor BasePolarisTableOperations & BasePolarisViewOperations #1426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, Polaris does not need to reuse Iceberg Catalog
classes at all when dealing with files in storage (we could use FileIO directly), but current approach is fine too. Should be good to merge with a CODE_COPIED_TO_POLARIS
mark.
} | ||
|
||
@Override | ||
public TableMetadata refresh() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose a lot of code came from Iceberg, but this class probably deserves a CODE_COPIED_TO_POLARIS
mark (in comments, example in PolarisRestCatalogIntegrationTest
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should communicate that most of this was copied, good callout.
However, currently AFAIK CODE_COPIED_TO_POLARIS
is reserved for blocks of code (files?) that are 100% copied with no modifications. There are some modifications here, and we are planning to make more over time. What do you think is the best way to represent the situation -- should we just put CODE_COPIED_TO_POLARIS
and explain this in a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edit: to be clear there should be no functional changes, but there are minor changes (e.g. LOG -> LOGGER) needed to make things work within IcebergCatalog and I have done some light refactoring to improve readability and reduce repeated code. In the future though we may make functional changes. I've added a comment at least calling out that the code was originally more or less copied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea for CODE_COPIED_TO_POLARIS
is to automatically check acknowledgement in LICENSE / NOTICE files. So technically any reuse of substantial code blocks should be mentioned even if the code was modified.
@jbonofre : WDYT?
Since the check is by file name, one comment per file is sufficient. For that matter, maybe it is best to move the reused code to a top-level class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on getting @jbonofre's POV here
As for making this a top-level class (file) I would love to do so. However, these classes are tightly coupled with IcebergCatalog
right now and rely on its private members and methods. The refactor gets significantly more complicated when done that way. So, I decided to try this smaller, more incremental, change which is what's actually needed to unblock the behavior proposed in #1378.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using inner classes if fine too from my POV
/** | ||
* An implementation of {@link TableOperations} that integrates with {@link IcebergCatalog}. Much | ||
* of this code was originally copied from {@link | ||
* org.apache.iceberg.BaseMetastoreTableOperations}. COPIED_COPIED_TO_POLARIS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected CI to fail now that we have the COPIED_COPIED_TO_POLARIS
mark, but this file in not mentioned in LICENSE 🤔
BasePolarisTableOperations and BasePolarisViewOperations currently extend BaseMetastoreTableOperations and BaseViewOperations respectively. In PRs like #433 and #1378 it has been suggested that instead they could extend more fundamental types like
TableOperations
instead. This PR implements that suggestion.There are no semantic changes made to the ...Operations types in this PR, only a refactor that would allow us to customize their logic in the future.