Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arrow: fix recipe when arrow/*:parquet=True #24044

Merged
merged 21 commits into from
Oct 30, 2024

Conversation

robomics
Copy link
Contributor

@robomics robomics commented May 20, 2024

Specify library name and version: arrow/all

When the arrow package is built with arrow/*:parquet=True, arrow/*:with_thrift=True is also required.
Thrift is found by Arrow by calling FindThriftAlt.cmake, which uses some custom logic to define thrift::thrift when the project is compiled on Windows.

This causes the transitive_header=True specified in Thrift's recipe to be ignored.

def requirements(self):
self.requires("boost/1.84.0", transitive_headers=True)

This PR patches FindThriftAlt.cmake so that find_package(Thrift) is always called.

Fixes #20082 (note that without the patch I am unable to build the package using MSVC regardless of the build_type).

While debugging the above issue I also realized that ARROW_WITH_RE2 was not explicitly set by the recipe, which on my machine causes compilation to fail for v11.0.0. This PR also addresses this issue.

Finally, I am bumping thrift, xsimd, and zstdto avoid version conflicts with other packages on cci.


@conan-center-bot

This comment has been minimized.

@valgur
Copy link
Contributor

valgur commented May 21, 2024

Thanks! Please also set parquet=True in default_options as well. It was only left disabled due to build issues in the last PR and this one should probably fix these.

@robomics
Copy link
Contributor Author

Another thing I noticed, is that arrow always uses a vendored version of mimalloc: https://github.com/apache/arrow/blob/b2e8c33c86c819b167a1cbca834da3c9047a9350/cpp/cmake_modules/ThirdpartyToolchain.cmake#L2197-L2204

Should I drop mimalloc from the requirement list (but leave the option to toggle mimalloc on or off)?

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@robomics robomics closed this May 22, 2024
@robomics robomics reopened this May 22, 2024
@conan-center-bot

This comment has been minimized.

@robomics
Copy link
Contributor Author

robomics commented May 23, 2024

@RubenRBS I think this is ready for review :).

The conan v2 pipeline failure seems to be due to the same issue described in #24011.

@ghost ghost mentioned this pull request May 26, 2024
3 tasks
@robomics robomics closed this Jun 12, 2024
@robomics robomics reopened this Jun 12, 2024
@conan-center-bot

This comment has been minimized.

@robomics
Copy link
Contributor Author

robomics commented Oct 4, 2024

@perseoGI CI is green! :)

Copy link
Member

@AbrilRBS AbrilRBS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments from a review alongside @perseoGI

@@ -261,6 +264,11 @@ def validate(self):
if self.dependencies["jemalloc"].options.enable_cxx:
raise ConanInvalidConfiguration("jemmalloc.enable_cxx of a static jemalloc must be disabled")

if self.options.with_thrift and not self.options.with_zlib:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are more checks in ThirdpartyToolchain.cmake that might be worth to reflect in the recipe, instead of relying on autodiscovery

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That file is over 5000 lines long :D. I do not have the bandwidth to read through it to see what's going on.
If you have some specific checks in mind?

Otherwise I suggest we leave this to be addressed in another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only one that stood out to me, so I'm happy with keeping that on a follow-up PR, yes :)

@@ -101,7 +101,7 @@ class ArrowConan(ConanFile):
"with_glog": False,
"with_grpc": False,
"with_json": False,
"with_thrift": False,
"with_thrift": True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good reason to have this set to true? Is this just for testing the changes n the CI? We could then revert it now

Copy link
Contributor Author

@robomics robomics Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! with_thrift needs to be True in order to build Arrow with Parquet support (which is now enabled by default based on #24044 (comment)).

@@ -351,6 +360,7 @@ def generate(self):
tc.variables["ARROW_WITH_THRIFT"] = bool(self.options.with_thrift)
tc.variables["Thrift_SOURCE"] = "SYSTEM"
if self.options.with_thrift:
tc.variables["ARROW_THRIFT"] = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, this variable was not set, but maybe we would need to set ARROW_WITH_THRIFT instead? Am I missing something? Do you have any insight? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super familiar with Arrow's code base, so I am not 100% sure on what's the difference between ARROW_WITH_THRIFT and ARROW_THRIFT.

A quick search on Arrow's codebase suggests that ARROW_THRIFT is only referenced in

$ find arrow -type f -exec grep -P "\bARROW_THRIFT\b" {} +

arrow/cpp/cmake_modules/ThirdpartyToolchain.cmake:if(ARROW_THRIFT)
arrow/python/build/lib.linux-x86_64-cpython-312/cmake_modules/ThirdpartyToolchain.cmake:if(ARROW_THRIFT)

Where it seems to be used only to turn on ZLIB support - link

if(ARROW_THRIFT)
  set(ARROW_WITH_ZLIB ON)
endif()

So I think setting both ARROW_WITH_THRIFT and ARROW_THRIFT is ok (will push a commit to do so ASAP).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

@AbrilRBS AbrilRBS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't mean to approve just yet!

@conan-center-bot

This comment has been minimized.

Co-authored-by: Abril Rincón Blanco <[email protected]>
Co-authored-by: PerseoGI <[email protected]>
@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot
Copy link
Collaborator

Conan v1 pipeline ✔️

Warning

Conan Center will stop receiving updates for Conan 1.x packages soon - please see announcement.

All green in build 28 (b776db92e460f2c179a2140de91a76eb55e4d3d3):

  • arrow/15.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/16.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/16.1.0:
    Built 18 packages out of 22 (All logs)

  • arrow/17.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/11.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/7.0.0:
    All packages built successfully! (All logs)

  • arrow/10.0.1:
    Built 18 packages out of 22 (All logs)

  • arrow/12.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/14.0.2:
    Built 18 packages out of 22 (All logs)

  • arrow/8.0.0:
    All packages built successfully! (All logs)

  • arrow/14.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/12.0.1:
    Built 18 packages out of 22 (All logs)

  • arrow/13.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/10.0.0:
    Built 18 packages out of 22 (All logs)

  • arrow/14.0.1:
    Built 18 packages out of 22 (All logs)

  • arrow/8.0.1:
    All packages built successfully! (All logs)


Conan v2 pipeline ✔️

Note: Conan v2 builds are now mandatory. Please read our discussion about it.

All green in build 30 (b776db92e460f2c179a2140de91a76eb55e4d3d3):

  • arrow/17.0.0:
    All packages built successfully! (All logs)

  • arrow/14.0.1:
    All packages built successfully! (All logs)

  • arrow/16.1.0:
    All packages built successfully! (All logs)

  • arrow/16.0.0:
    All packages built successfully! (All logs)

  • arrow/14.0.0:
    All packages built successfully! (All logs)

  • arrow/13.0.0:
    All packages built successfully! (All logs)

  • arrow/15.0.0:
    All packages built successfully! (All logs)

  • arrow/14.0.2:
    All packages built successfully! (All logs)

  • arrow/11.0.0:
    All packages built successfully! (All logs)

  • arrow/10.0.1:
    All packages built successfully! (All logs)

  • arrow/8.0.0:
    All packages built successfully! (All logs)

  • arrow/12.0.1:
    All packages built successfully! (All logs)

  • arrow/7.0.0:
    Built 6 packages out of 10 (All logs)

  • arrow/10.0.0:
    All packages built successfully! (All logs)

  • arrow/12.0.0:
    All packages built successfully! (All logs)

  • arrow/8.0.1:
    All packages built successfully! (All logs)

@robomics
Copy link
Contributor Author

Things are looking good now.
Can you please have a look @AbrilRBS, @perseoGI?

Copy link
Member

@AbrilRBS AbrilRBS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now looks good on my end, will ask @perseoGI to re-review for tomorrow so we can finally get this merged, thanks a lot for your patience @robomics and for taking the time to address our reviews, we appreciate it :)

@@ -261,6 +264,11 @@ def validate(self):
if self.dependencies["jemalloc"].options.enable_cxx:
raise ConanInvalidConfiguration("jemmalloc.enable_cxx of a static jemalloc must be disabled")

if self.options.with_thrift and not self.options.with_zlib:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only one that stood out to me, so I'm happy with keeping that on a follow-up PR, yes :)

@@ -351,6 +360,7 @@ def generate(self):
tc.variables["ARROW_WITH_THRIFT"] = bool(self.options.with_thrift)
tc.variables["Thrift_SOURCE"] = "SYSTEM"
if self.options.with_thrift:
tc.variables["ARROW_THRIFT"] = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

@perseoGI perseoGI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally! This is merging! Thank you for your patience!!!

@conan-center-bot conan-center-bot merged commit e4ffd89 into conan-io:master Oct 30, 2024
69 checks passed
@robomics
Copy link
Contributor Author

Awesome!
Thank you both for your comments and help throughout the review :)

@robomics robomics deleted the fix/arrow branch October 30, 2024 10:49
@robomics robomics mentioned this pull request Nov 1, 2024
3 tasks
OMGtechy pushed a commit to OMGtechy/conan-center-index that referenced this pull request Dec 31, 2024
* Fix recipe when arrow/*:parquet=True

* Add patch description and type annotations

* Set parquet=True

* Do not look for static analyzers

* Bugfix

* Bugfix

* Refactor patches

* Backport fixes from upstream

* Patch v17.0.0

* Woops. Forgot to apply patch to v16.1.0

* Bump deps

* Bugfix

* Add patch_source

* Add check to ensure with_boost=True when with_thrift=True

* Fix typos

* Cleanups

* Set ARROW_THRIFT

Co-authored-by: Abril Rincón Blanco <[email protected]>
Co-authored-by: PerseoGI <[email protected]>

* Bugfix

* Bugfix

---------

Co-authored-by: Abril Rincón Blanco <[email protected]>
Co-authored-by: PerseoGI <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug] arrow 13.0.0 not building for msvc debug profile
5 participants