Things will be faster and cleaner if we simply use pyarrow straight through to pass around intermediate representations.
Passing Arrow data through pandas gives typecasting errors another chance to creep in, and it can make it harder to correct errors involving timezones in date fields, etc.
Nothing especially urgent here, particularly because there are some necessary join operations that can't be handled natively by pyarrow.compute. In nonconsumptive, those are usually handled by polars right now; in this repo, it might make more sense to do that relational logic directly on Arrow tables with duckdb, since one nice feature of duckdb is that it lets you write SQL against local Arrow tables.
For once #145 is complete.