Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-28736:Remove DFS_URI authorization for CREATE_TABLE event with n… #5689

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

rtrivedi12
Copy link
Contributor

…o explicit LOCATION

What changes were proposed in this pull request?

This update ensures that DFS_URI authorization is triggered only when necessary during the CREATE TABLE command. Specifically:

For managed table creation, DFS_URI authorization is skipped if the location is under the default managed DB location.
For external table creation, DFS_URI authorization is skipped if the table location is under the default external DB location.

Why are the changes needed?

This prevents unnecessary DFS_URI authorization for tables created within default DB locations, and adding extra policies in Ranger for query execution from external clients like Spark.

Does this PR introduce any user-facing change?

Yes, After this change, users will no longer need the DFS_URI policy for tables that are created in the default managed DB location (for managed tables) or the default external DB location (for external tables).

Is the change a dependency upgrade?

No

How was this patch tested?

Changes were verified by running the CREATE table command from Spark sql.

Copy link

@nrg4878
Copy link
Contributor

nrg4878 commented Mar 26, 2025

@saihemanth-cloudera @ayushtkn Could you please help review this PR? Thank you

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanx @rtrivedi12, had a quick pass, dropped some questions/comments.

One generic question, I couldn't find time to debug, does the HMS Translation layer kicks in before this or after. Just trying to see if things don't change post we made a decision here

super(preEventContext);

super(preEventContext);
this.wh = preEventContext.getHandler().getWh();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to save it? It should be always available in preEventContext?

We can directly do below

 preEventContext.getHandler().getWh().getDefaultTablePath(database, table.getTableName(), isExternalTable);

Comment on lines -65 to +71
String uri = getSdLocation(table.getSd());
Database database = event.getDatabase();
String uri = getSdLocation(table.getSd());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was some unconventional spacing, but can we keep the same thing to maintain redability

    List<HivePrivilegeObject> ret   = new ArrayList<>();
    PreCreateTableEvent       event = (PreCreateTableEvent) preEventContext;
    Table                     table = event.getTable();
    Database                  database = event.getDatabase();
    String                    uri = getSdLocation(table.getSd());

return ret;
}

boolean isExternalTable = table.getTableType().equalsIgnoreCase(TableType.EXTERNAL_TABLE.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work?

MetaStoreUtils.isExternalTable(table);

maybe we can use this directly if it does rather than storing in the variable

boolean isExternalTable = table.getTableType().equalsIgnoreCase(TableType.EXTERNAL_TABLE.toString());
String expectedTablePath = null;
try {
expectedTablePath = wh.getDefaultTablePath(database, table.getTableName(), isExternalTable).toString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to getTableName & things like that

      expectedTablePath = wh.getDefaultTablePath(database, table).toString();

try {
expectedTablePath = wh.getDefaultTablePath(database, table.getTableName(), isExternalTable).toString();
} catch (MetaException e) {
LOG.warn("Got exception fetching Default table location for table " + table.getTableName(), e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use {} + add the database name as well

Comment on lines -79 to +100
Database database = event.getDatabase();
String uri = getSdLocation(table.getSd());
Database database = event.getDatabase();
String uri = getSdLocation(table.getSd());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you match the indentation with above lines

Comment on lines +105 to +116
if (StringUtils.isNotEmpty(uri)) {
boolean isExternalTable = table.getTableType().equalsIgnoreCase(TableType.EXTERNAL_TABLE.toString());
String expectedTablePath = null;
try {
expectedTablePath = wh.getDefaultTablePath(database, table.getTableName(), isExternalTable).toString();
} catch (MetaException e) {
LOG.warn("Got exception fetching Default table location for table " + table.getTableName(), e);
}

// Skip DFS_URI for external tables and if managed table location is under default db path
if (!isExternalTable) {
if (StringUtils.isEmpty(expectedTablePath) || !uri.equals(expectedTablePath)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part looks being duplicated, can you refactor into a method & return whether you need to add DFS_URI or not

}

// Skip DFS_URI for external tables and if managed table location is under default db path
if (!isExternalTable) {
Copy link
Member

@ayushtkn ayushtkn Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious, trying to understand: In getInputObj you didn't had this !isExternalTable, but here you have it, why?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants