-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add C Data integration test shared library #337
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #337 +/- ##
==========================================
+ Coverage 88.04% 88.12% +0.07%
==========================================
Files 72 73 +1
Lines 11370 11509 +139
==========================================
+ Hits 10011 10142 +131
- Misses 1359 1367 +8 ☔ View full report in Codecov by Sentry. |
I think you can take a look at what we do for Rust, since it's a similar case of a non-monorepo implementation: (in short: locations can be influenced with an environment variable) |
|
||
const char* nanoarrow_CDataIntegration_ExportSchemaFromJson(const char* json_path, | ||
ArrowSchema* out) { | ||
ArrowErrorInit(&global_error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this deallocate a previously-initialized error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't (it is basically global_error.message[0] = '\0';
). The C++ implementation seems to use static std::string
which I think is similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But does making an error message allocate its storage dynamically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't (the global error here is a static ArrowError
, aka static char[1024]
).
static ArrowErrorCode ExportBatchFromJson(const char* json_path, int num_batch, | ||
ArrowArray* out, ArrowError* error) { | ||
MaterializedArrayStream data; | ||
NANOARROW_RETURN_NOT_OK(MaterializeJsonFilePath(json_path, &data, error)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very important, but ideally we'd only parse JSON batch num_batch
, instead of parsing them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! This makes sense (particularly after adding the tracking for bytes allocated).
This PR adds the shared library target required by the archery integration tetster, based on https://github.com/apache/arrow/blob/main/cpp/src/arrow/integration/c_data_integration_internal.cc .
I haven't tested this via archery because I have no idea how to do so (the implementation names and file locations seem hard-coded?); however, it does add a googletest file with some minimal examples to at least make sure everything is wired up.