Refactor parquet thrift reader (#14097)
Refactors the current `CompactProtocolReader` used to parse parquet file metadata. The main goal of the refactor is to allow easier use of `std::optional` fields in the thrift structs, preventing situations such as #14024, where an optional field holds an empty string. The writer cannot distinguish between present-but-empty and not-present, so it chooses the latter when writing the field. This PR adds a `ParquetFieldOptional` functor that can wrap the other field functors, obviating the need to write a new optional functor for each type.
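
As a rough illustration of the wrapping idea, here is a minimal, self-contained sketch. It is not the code from this PR: `ToyReader`, `StringField`, `OptionalField`, and the single-byte length prefix are all hypothetical stand-ins, assumed only for the example. The point is that one generic wrapper can reuse any existing field functor and store its result in a `std::optional`, so a present-but-empty value stays distinguishable from an absent one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a protocol reader: a cursor over a byte buffer.
struct ToyReader {
  std::vector<std::uint8_t> buf;
  std::size_t pos = 0;

  std::uint8_t get_byte() { return pos < buf.size() ? buf[pos++] : 0; }

  // Reads a string with a single-byte length prefix (a simplification).
  std::string get_string() {
    std::size_t const n = get_byte();
    std::string s;
    for (std::size_t i = 0; i < n; ++i) { s.push_back(static_cast<char>(get_byte())); }
    return s;
  }
};

// Hypothetical field functor for a required string field.
struct StringField {
  int id;
  std::string& dest;
  StringField(int f, std::string& d) : id(f), dest(d) {}
  void operator()(ToyReader& r) const { dest = r.get_string(); }
};

// Generic wrapper: reads into a temporary using the wrapped functor, then
// stores the result in a std::optional. It is only invoked when the field
// actually appears in the input, so an untouched optional means "absent".
template <typename T, typename FieldFunctor>
struct OptionalField {
  int id;
  std::optional<T>& dest;
  OptionalField(int f, std::optional<T>& d) : id(f), dest(d) {}
  void operator()(ToyReader& r) const {
    T tmp{};
    FieldFunctor{id, tmp}(r);
    dest = std::move(tmp);
  }
};

// Usage: a zero-length string is recorded as present-but-empty.
void demo() {
  ToyReader r{{0}, 0};
  std::optional<std::string> value;
  OptionalField<std::string, StringField>{1, value}(r);
  assert(value.has_value() && value->empty());
}
```

With this shape, supporting an optional field of another type only requires instantiating the wrapper with that type's existing functor, rather than writing a dedicated optional functor.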

Authors:
  - Ed Seidl (https://github.com/etseidl)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #14097
etseidl authored Sep 20, 2023
1 parent eb6d134 commit 40d4cc5
Showing 7 changed files with 662 additions and 706 deletions.