Cannot convert the same OracleDataFrame to a PyArrow table twice #470

Open
cpcloud opened this issue Mar 13, 2025 · 3 comments
Labels
bug Something isn't working

Comments

cpcloud commented Mar 13, 2025

  1. What versions are you using?
platform.platform: Linux-6.13.6-x86_64-with-glibc2.40
sys.maxsize > 2**32: True
platform.python_version: 3.12.9
oracledb.__version__: 3.0.0

  2. Is it an error or a hang or a crash? An error.

  3. What error(s) or behavior you are seeing?

ArrowInvalid: Cannot import released ArrowSchema

  4. Does your application call init_oracle_client()?

No

  5. Include a runnable Python script that shows the problem.
In [26]: import oracledb, pyarrow as pa

In [27]: con = oracledb.connect(oracledb.makedsn("localhost", 1521, service_name="IBIS_TESTING", sid=None), user="IBIS", password="ibis", stmtcachesize=0)

In [28]: odf = con.fetch_df_all("SELECT 1 AS \"a\"")

In [29]: t = pa.Table.from_arrays(arrays=odf.column_arrays(), names=odf.column_names())

In [30]: t = pa.Table.from_arrays(arrays=odf.column_arrays(), names=odf.column_names())
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1                                                                                    │
│                                                                                                  │
│ in pyarrow.lib.Table.from_arrays:4851                                                            │
│                                                                                                  │
│ in pyarrow.lib._sanitize_arrays:1593                                                             │
│                                                                                                  │
│ in pyarrow.lib._schema_from_arrays:1574                                                          │
│                                                                                                  │
│ in pyarrow.lib.array:271                                                                         │
│                                                                                                  │
│ in pyarrow.lib.Array._import_from_c_capsule:1875                                                 │
│                                                                                                  │
│ in pyarrow.lib.pyarrow_internal_check_status:155                                                 │
│                                                                                                  │
│ in pyarrow.lib.check_status:92                                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ArrowInvalid: Cannot import released ArrowSchema
cpcloud added the bug label Mar 13, 2025
@cjbj
Member

cjbj commented Mar 17, 2025

I believe this is a feature/limitation of nanoarrow. I'll let @aosingh comment.

@aosingh
Member

aosingh commented Mar 17, 2025

@cpcloud

The lifetime semantics are defined by the Arrow C Data Interface and the Arrow PyCapsule Interface. The following note is from the documentation:

If the capsule has been passed to a consumer, the consumer should have moved the data and marked the release callback as null, so there isn’t a risk of releasing data the consumer is using. Read more in the C Data Interface specification.

In the example shown, pyarrow marks the release callback as NULL after invoking it during the first conversion, so the second conversion finds an already released schema and raises an error. If converting twice is a valid use case, we will have to check how to support it.
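
To make the behaviour described above concrete, here is a minimal pure-Python toy model of the release semantics. It is only a sketch: `ToySchema`, `ToyFrame`, and `import_schema` are hypothetical names invented for illustration and are not part of pyarrow, nanoarrow, or python-oracledb. It also assumes the producer hands out the same underlying struct on every call, which is what the error in this issue suggests but is not confirmed here.

```python
# Toy model of the Arrow C Data Interface release semantics.
# All names below are hypothetical; this is NOT the pyarrow or oracledb API.


class ToySchema:
    """Stands in for an ArrowSchema struct that carries a release callback."""

    def __init__(self, fmt):
        self.fmt = fmt
        self.release = self._release  # a non-None callback means "still valid"

    def _release(self):
        # A real producer would free its resources here.
        self.fmt = None
        self.release = None  # a None callback marks the struct as released


def import_schema(schema):
    """Consumer side: refuse released structs, otherwise take the data over."""
    if schema.release is None:
        raise ValueError("Cannot import released ArrowSchema")
    fmt = schema.fmt   # the consumer keeps what it needs
    schema.release()   # the source struct is released afterwards
    return fmt


class ToyFrame:
    """Producer that (by assumption) exposes the same struct on every call."""

    def __init__(self):
        self._schema = ToySchema("l")  # "l" is Arrow's format string for int64

    def export(self):
        return self._schema


frame = ToyFrame()
first = import_schema(frame.export())      # first conversion succeeds
try:
    import_schema(frame.export())          # second conversion sees a released struct
    second_error = None
except ValueError as exc:
    second_error = str(exc)

print(first, "|", second_error)
```

Under this model, the straightforward workaround is to convert once and keep the resulting table (`t` in the script above) for any later use, rather than converting the same OracleDataFrame again.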

@cjbj
Member

cjbj commented Mar 17, 2025

@cpcloud do you have a practical use case for converting twice?
