-
Notifications
You must be signed in to change notification settings - Fork 919
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[REVIEW] JSON host tree algorithms (#16545)
Depends on #16836 This change adds a new host tree building algorithms for JSON reader and utf8 field name support. This constructs the device_column_tree using an adjacency list created from parent information. This adjacency list is pruned based on input schema, and also types are enforced as per schema. `mark_is_pruned` Tree is constructed from pruned adjacency list, (with mixed types handling). `construct_tree` utf8 field name support added: (spark requested) utf8 decoding of field names during hashing of field nodes so that utf8 encoded field names also match to same column. All unit tests passes, 1 unit test added where old algorithm fails. This code is kept under experimental flag. Authors: - Shruti Shivakumar (https://github.com/shrshi) - Karthikeyan (https://github.com/karthikeyann) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) URL: #16545
- Loading branch information
Showing
11 changed files
with
1,011 additions
and
120 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.