Column name checks are too strict #39

Open
onderkalaci opened this issue Oct 8, 2024 · 0 comments · May be fixed by #80
Labels
api-change (includes breaking changes), enhancement (New feature or request)

Comments

@onderkalaci

I think we should not care much about the column names; we already check the column types, which is fine.

create table t1(a int);

insert into t1 VALUES (1);
copy t1 to '/tmp/data.parquet';

create table t2(b int);
copy t2 from '/tmp/data.parquet';
ERROR:  column "b" is not found in parquet file
@onderkalaci onderkalaci added the enhancement New feature or request label Oct 8, 2024
aykut-bozkurt added a commit that referenced this issue Nov 26, 2024
We add an option for `COPY FROM` called `match_by_position`, which matches Parquet file fields to PostgreSQL table columns
by their position in the schema rather than by their names. By default, the option is `false`. The option is useful
when field names differ between the Parquet file and the table, but their order aligns.

Closes #39.
@aykut-bozkurt aykut-bozkurt linked a pull request Nov 26, 2024 that will close this issue
aykut-bozkurt added a commit that referenced this issue Nov 27, 2024
We add an option for `COPY FROM` called `match_by_name`, which matches Parquet file fields to PostgreSQL table columns
by their names rather than by their order in the schema. By default, the option is `false`. The option is useful
when field order differs between the Parquet file and the table, but their names match.

**!!IMPORTANT!!**: This is a breaking change. Before this PR, we always matched by name, which is strict and not a common
way to match schemas (e.g. `COPY FROM` csv in PostgreSQL and `COPY FROM` in DuckDB match by field position by default).
This is why we now match by position by default and provide a `COPY FROM` option, `match_by_name`, that can be set to `true`
to restore the old behaviour.

Closes #39.
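
For illustration, the repro from the issue description might behave as follows once the `match_by_name` change is in place. This is a sketch based only on the commit message above: the `with (match_by_name true)` spelling uses PostgreSQL's generic `COPY` option syntax and is an assumption, not something confirmed in this thread.

```sql
-- Sketch only: assumes the match_by_name option described in the commit message.
create table t1(a int);
insert into t1 values (1);
copy t1 to '/tmp/data.parquet';

create table t2(b int);

-- New default (match by position): the file's first field "a" should load
-- into the table's first column "b" because the types agree.
copy t2 from '/tmp/data.parquet';

-- Opting back into the old behaviour (match by name): this is expected to
-- fail as in the original report, since the file has no field named "b".
copy t2 from '/tmp/data.parquet' with (match_by_name true);
```

Matching by position keeps the type check from the issue description while dropping the name check, which is what the issue asks for.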
@aykut-bozkurt aykut-bozkurt added the api-change includes breaking changes label Nov 27, 2024