Is your feature request related to a problem or challenge?
As far as I can tell, there is no good way to load a subset of files from a partitioned table. Using ListingTable or another TableProvider like DeltaTableProvider from deltalake, I'm able to read_table, but this loads the entire table. I can also load a list of parquet files with read_parquet, but this doesn't work with partitioned tables if the partitions are not "materialized" columns in the raw parquet. The only way I've found to load partitioned files is by iterating over a list of file paths, and doing the entire TableProvider/read_table process on each one individually, and unioning the results together.
Describe the solution you'd like
It seems like it would be nice to be able to create a TableProvider with a table path, then pass some sort of file "whitelist" in. Maybe a read_table_files(TableProvider, impl IntoIterator<Item = String>).
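To make the idea a bit more concrete, here is a rough std-only sketch of the selection step such a function might perform. This is not DataFusion code: `partition_values` and `select_files` are hypothetical names, and it assumes Hive-style `key=value` directory names under the table path, which is how the non-materialized partition columns would be recovered for each allowed file.

```rust
use std::collections::HashSet;

/// Hypothetical helper: derive partition values for one file from the
/// `key=value` directory names between the table root and the file name.
/// Assumes a Hive-style layout; returns None if the file is not under the root.
fn partition_values(table_root: &str, file_path: &str) -> Option<Vec<(String, String)>> {
    let root = table_root.trim_end_matches('/');
    // The path must continue with a '/' right after the root.
    let relative = file_path.strip_prefix(root)?.strip_prefix('/')?;
    let mut segments: Vec<&str> = relative.split('/').collect();
    segments.pop(); // drop the file name itself
    Some(
        segments
            .into_iter()
            .filter_map(|segment| segment.split_once('='))
            .map(|(key, value)| (key.to_string(), value.to_string()))
            .collect(),
    )
}

/// Hypothetical helper: keep only the listed files named in `allow`,
/// pairing each kept file with its partition values.
fn select_files<I>(
    table_root: &str,
    listed: &[&str],
    allow: I,
) -> Vec<(String, Vec<(String, String)>)>
where
    I: IntoIterator<Item = String>,
{
    let allow: HashSet<String> = allow.into_iter().collect();
    listed
        .iter()
        .filter(|path| allow.contains(**path))
        .filter_map(|path| {
            partition_values(table_root, path).map(|parts| ((*path).to_string(), parts))
        })
        .collect()
}
```

The point is that the table path stays the root, so partition values can still be resolved relative to it, while the allowlist only narrows which files get scanned in a single pass.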
Describe alternatives you've considered
As stated above, I've tried reading the files one-by-one and unioning results, but it's shockingly inefficient compared to reading all files at once.
Additional context
No response