Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax error when creating foreign table from Iceberg #177

Open
2 tasks done
ifm-pgarner opened this issue Nov 19, 2024 · 4 comments
Open
2 tasks done

Syntax error when creating foreign table from Iceberg #177

ifm-pgarner opened this issue Nov 19, 2024 · 4 comments
Labels
bug Something isn't working priority-high High priority issue user-request This issue was directly requested by a user

Comments

@ifm-pgarner
Copy link

ifm-pgarner commented Nov 19, 2024

What happens?

I'm following through the docs for the first time, to set up a foreign table that points to Iceberg table that I mounted to the container

I get this error:

=> CREATE FOREIGN TABLE operation ()
SERVER iceberg_server
OPTIONS (
    files 'iceberg/analyticsng/operation/metadata/00693-b272df17-984f-4028-ade1-cb5bef59ef64.metadata.json'
);
ERROR:  syntax error at or near "authorization"
LINE 1: ...RCHAR, ADD COLUMN dealer_name VARCHAR, ADD COLUMN authorizat...
                                                             ^
QUERY:  ALTER TABLE public.operation ADD COLUMN id VARCHAR, ADD COLUMN operation_id VARCHAR, ADD COLUMN order_id VARCHAR, ADD COLUMN dealer_id VARCHAR, ADD COLUMN dealer_name VARCHAR, ADD COLUMN authorization VARCHAR, ADD COLUMN on_day BOOLEAN, ADD COLUMN hashed_description

it looks to me like this may just be due lack of quoting for the column names in the generated DDL query?

i.e. the problem is that I have a column called authorization and that is a reserved keyword for Postgres:
https://www.postgresql.org/docs/current/sql-keywords-appendix.html#:~:text=non%2Dreserved-,AUTHORIZATION,-reserved%20(can%20be

To Reproduce

I guess create an Iceberg table with a field called authorization and try import it into pg_analytics

OS:

macOS

ParadeDB Version:

latest-pg17

Are you using ParadeDB Docker, Helm, or the extension(s) standalone?

ParadeDB Docker Image

Full Name:

Paul Garner

Affiliation:

Infomedia

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include the code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configurations (e.g., CPU architecture, PostgreSQL version, Linux distribution) to reproduce the issue?

  • Yes, I have
@ifm-pgarner ifm-pgarner added the bug Something isn't working label Nov 19, 2024
@philippemnoel
Copy link
Collaborator

Does it work with other column names? Does your column need to be called authorization?

@ifm-pgarner
Copy link
Author

ifm-pgarner commented Nov 20, 2024

Does it work with other column names?

It appears to, that column wasn't the first ADD COLUMN statement in the query, e.g. order_id, dealer_id, dealer_name all seem fine.

Does your column need to be called authorization?

are you... joking? why not make the query safe by quoting the identifier names?

See https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

Looks like the problem is here maybe:

format!("ADD COLUMN {} {}", column_name, pg_type)

@ifm-pgarner
Copy link
Author

Does your column need to be called authorization?

are you... joking?

Apologies, and taking your question on good faith...

No I can't. I am trying to import data from an existing database, which was originally created by a system (Django web framework) which always quotes identifier names, allowing use of values such as authorization as db column names when those names make sense to the application.

(schema and table names should also be quoted for same reason)

@philippemnoel
Copy link
Collaborator

Does your column need to be called authorization?

are you... joking?

Apologies, and taking your question on good faith...

No I can't. I am trying to import data from an existing database, which was originally created by a system (Django web framework) which always quotes identifier names, allowing use of values such as authorization as db column names when those names make sense to the application.

(schema and table names should also be quoted for same reason)

I figured the answer was going to yes, but I just wanted to make sure!

It sounds like you've found the root cause. Our team is pretty swamped with work on https://github.com/paradedb/paradedb, so it's hard to get their attention for pg_analytics right now... Would you be willing to contribute the fix yourself? We'd love to credit you for it, and I'll do a release right away to get it in your hands.

@philippemnoel philippemnoel added priority-high High priority issue user-request This issue was directly requested by a user labels Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority-high High priority issue user-request This issue was directly requested by a user
Projects
None yet
Development

No branches or pull requests

2 participants