By default match fields by position #80
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #80 +/- ##
==========================================
+ Coverage 92.01% 92.13% +0.11%
==========================================
Files 70 71 +1
Lines 8879 9009 +130
==========================================
+ Hits 8170 8300 +130
Misses 709 709
ed4907a to e6b10bb
We add an option for `COPY FROM` called `match_by_name`, which matches Parquet file fields to PostgreSQL table columns by name rather than by their order in the schema. The option defaults to `false`. It is useful when field order differs between the Parquet file and the table but the names match. **!!IMPORTANT!!**: This is a breaking change. Before this PR, we always matched by name. That is fairly strict and not a common way to match schemas (e.g. `COPY FROM` csv in Postgres and `COPY FROM` in DuckDB both match by field position by default). This is why we now match by position by default and provide a `COPY FROM` option, `match_by_name`, that can be set to `true` for the old behaviour. Closes #39.
e6b10bb to 4241cae
Looks good
- could consider a GUC with the default match_by method
- maybe add a table to the README with important changes
Let's move it to the breaking changes section when we release.
We add an option for `COPY FROM` called `match_by [position(default) | name]`, which determines the match method for the Postgres table columns and Parquet file fields. By default, the match method is `position`. Match by `name` is useful when field order differs between the Parquet file and the table, but their names match. **!!IMPORTANT!!**: This is a breaking change. Before this PR, we always matched by name. That is fairly strict and not a common way to match schemas (e.g. `COPY FROM` csv in Postgres and `COPY FROM` in DuckDB both match by field position by default). This is why we now match by position by default and provide a `COPY FROM` option, `match_by`, that can be set to `name` for the old behavior. Closes #39.
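
To make the difference between the two match methods concrete, here is a minimal Python sketch of the idea. It is not the extension's code: `match_fields` and the column names are hypothetical, and the error handling (rejecting a column-count mismatch or a missing name) is an assumption of the sketch rather than documented behavior.

```python
from typing import List, Sequence


def match_fields(table_columns: Sequence[str],
                 file_fields: Sequence[str],
                 match_by: str = "position") -> List[int]:
    """For each table column, return the index of the file field that feeds it.

    Hypothetical helper for illustration only.
    """
    if match_by == "position":
        # Assumption: a column-count mismatch is rejected outright.
        if len(table_columns) != len(file_fields):
            raise ValueError("column count mismatch")
        # The i-th table column reads the i-th file field; names are ignored.
        return list(range(len(table_columns)))
    if match_by == "name":
        index_of = {name: i for i, name in enumerate(file_fields)}
        # Assumption: every table column must have a same-named file field.
        missing = [c for c in table_columns if c not in index_of]
        if missing:
            raise ValueError(f"no file field named: {missing}")
        return [index_of[c] for c in table_columns]
    raise ValueError(f"unknown match_by: {match_by!r}")


# Same names, different order between the table and the Parquet file.
table = ["id", "name", "price"]
parquet = ["price", "id", "name"]

by_position = match_fields(table, parquet)       # [0, 1, 2]
by_name = match_fields(table, parquet, "name")   # [1, 2, 0]
```

With the new default (`position`), `id` would be fed from the file's first field, `price`, so a reordered file either fails a type check or silently loads the wrong data. With `name`, each column finds its own field regardless of order, which is the pre-PR behavior.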