GH-18036: [Packaging] Build Python wheel for musllinux #45470

Open: wants to merge 2 commits into main


@nveloso nveloso commented Feb 9, 2025

Rationale for this change

Please check #18036.

What changes are included in this PR?

Almost everything needed to build and test Python wheels for musllinux.
The python-wheel-musllinux-test-unittests service currently fails (see the next section), and I still need to test running the alpine-linux-verify-rc Docker image.

Are these changes tested?

I was able to successfully generate a musllinux wheel by running the following:

docker-compose build python-wheel-musllinux-1-2
docker-compose run python-wheel-musllinux-1-2

I was also able to run python-wheel-musllinux-test-imports with no errors.
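
As a quick sanity check on the build output, the platform tag can be read straight from the wheel filename. This is a minimal sketch, assuming the standard wheel filename layout (name-version[-build]-python-abi-platform.whl). The helper names are hypothetical and the filenames are made up for illustration; they are not actual build artifacts from this PR.

```python
def wheel_platform_tags(filename: str) -> list:
    """Return the platform tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    # Layout: name-version[-build]-python-abi-platform.whl
    # The platform tag is always the last dash-separated field.
    platform_field = filename[: -len(".whl")].split("-")[-1]
    # A compressed tag set joins several platform tags with "."
    return platform_field.split(".")


def is_musllinux_wheel(filename: str) -> bool:
    """True if any platform tag in the filename is a musllinux tag."""
    return any(tag.startswith("musllinux_") for tag in wheel_platform_tags(filename))


# Illustrative filenames only (hypothetical versions).
print(is_musllinux_wheel("pyarrow-20.0.0.dev0-cp39-cp39-musllinux_1_2_aarch64.whl"))
print(is_musllinux_wheel("pyarrow-20.0.0.dev0-cp39-cp39-manylinux_2_28_x86_64.whl"))
```

This only inspects the filename; it does not verify that the shared libraries inside the wheel are actually linked against musl.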

I'm not able to get python-wheel-musllinux-test-unittests to pass because 2 tests are failing, and I don't think they are related to my changes. Can you please confirm?
The failing tests are:

  • test_uwsgi_integration
  • test_print_stats

I believe both failures have the same root cause, which is related to this:
/arrow/cpp/src/arrow/filesystem/s3fs.cc:3461: arrow::fs::FinalizeS3 was not called even though S3 was initialized. This could lead to a segmentation fault at exit
!!! uWSGI process 3487 got Segmentation Fault !!!

Do you have any idea what might be causing it?

Here are some logs of the failed tests:

====================================================================================== FAILURES ======================================================================================
_______________________________________________________________________________ test_uwsgi_integration _______________________________________________________________________________

    @pytest.mark.s3
    def test_uwsgi_integration():
        # GH-44071: using S3FileSystem under uwsgi shouldn't lead to a crash at shutdown
        try:
            subprocess.check_call(["uwsgi", "--version"])
        except FileNotFoundError:
            pytest.skip("uwsgi not installed on this Python")

        port = find_free_port()
        args = ["uwsgi", "-i", "--http", f"127.0.0.1:{port}",
                "--wsgi-file", os.path.join(here, "wsgi_examples.py")]
        proc = subprocess.Popen(args, stdin=subprocess.DEVNULL)
        # Try to fetch URL, it should return 200 Ok...
        try:
            url = f"http://127.0.0.1:{port}/s3/"
            start_time = time.time()
            error = None
            while time.time() < start_time + 5:
                try:
                    with urlopen(url) as resp:
                        assert resp.status == 200
                    break
                except OSError as e:
                    error = e
                    time.sleep(0.1)
            else:
                pytest.fail(f"Could not fetch {url!r}: {error}")
        finally:
            proc.terminate()
        # ... and uwsgi should gracefully shutdown after it's been asked above
>       assert proc.wait() == 30  # UWSGI_END_CODE = 30
E       AssertionError: assert -11 == 30
E        +  where -11 = wait()
E        +    where wait = <Popen: returncode: -11 args: ['uwsgi', '-i', '--http', '127.0.0.1:49245', '...>.wait

usr/local/lib/python3.9/site-packages/pyarrow/tests/test_fs.py:2052: AssertionError
-------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------
2.0.28
-------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------
*** Starting uWSGI 2.0.28 (64bit) on [Sat Feb  8 18:56:14 2025] ***
compiled with version: 13.2.1 20231014 on 31 October 2024 19:02:44
os: Linux-6.8.0-50-generic #51-Ubuntu SMP PREEMPT_DYNAMIC Sat Nov  9 18:03:35 UTC 2024
nodename: ae5a02215122
machine: aarch64
clock source: unix
pcre jit disabled
detected number of CPU cores: 4
current working directory: /
detected binary path: /usr/local/bin/python3.9
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
*** WARNING: you are running uWSGI without its master process manager ***
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on 127.0.0.1:49245 fd 4
spawned uWSGI http 1 (pid: 3488)
uwsgi socket 0 bound to TCP address 127.0.0.1:40033 (port auto-assigned) fd 3
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
Python version: 3.9.19 (main, Mar 20 2024, 20:45:15)  [GCC 12.2.1 20220924]
--- Python VM already initialized ---
Python main interpreter initialized at 0xeff90822b840
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 72904 bytes (71 KB) for 1 cores
*** Operational MODE: single process ***
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0xeff90822b840 pid: 3487 (default app)
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
spawned uWSGI worker 1 (and the only) (pid: 3487, cores: 1)
[pid: 3487|app: 0|req: 1/1] 127.0.0.1 () {30 vars in 346 bytes} [Sat Feb  8 18:56:14 2025] GET /s3/ => generated 12 bytes in 20 msecs (HTTP/1.1 200) 1 headers in 44 bytes (1 switches on core 0)
/arrow/cpp/src/arrow/filesystem/s3fs.cc:3461:  arrow::fs::FinalizeS3 was not called even though S3 was initialized.  This could lead to a segmentation fault at exit
!!! uWSGI process 3487 got Segmentation Fault !!!
________________________________________________________________________ test_print_stats[system_memory_pool] ________________________________________________________________________

pool_factory = <cyfunction system_memory_pool at 0xe04d9b5c3ad0>

    @pytest.mark.parametrize('pool_factory', supported_factories())
    def test_print_stats(pool_factory):
        code = f"""if 1:
            import pyarrow as pa

            pool = pa.{pool_factory.__name__}()
            buf = pa.allocate_buffer(64, memory_pool=pool)
            pool.print_stats()
            """
        res = subprocess.run([sys.executable, "-c", code], check=True,
                             universal_newlines=True, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        if sys.platform == "linux":
            # On Linux at least, all memory pools should emit statistics
>           assert res.stderr.strip() != ""
E           AssertionError: assert '' != ''
E            +  where '' = <built-in method strip of str object at 0xe04d9c3ec6f0>()
E            +    where <built-in method strip of str object at 0xe04d9c3ec6f0> = ''.strip
E            +      where '' = CompletedProcess(args=['/usr/local/bin/python', '-c', 'if 1:\n        import pyarrow as pa\n\n        pool = pa.system...= pa.allocate_buffer(64, memory_pool=pool)\n        pool.print_stats()\n        '], returncode=0, stdout='', stderr='').stderr

usr/local/lib/python3.9/site-packages/pyarrow/tests/test_memory.py:295: AssertionError

There are also a lot of skipped tests (603), and I'm not sure whether this is OK. Here is the final report:
============================================= 2 failed, 7200 passed, 603 skipped, 12 xfailed, 2 xpassed, 5 warnings in 80.21s (0:01:20) ==============================================

Are there any user-facing changes?

I don't think so.


github-actions bot commented Feb 9, 2025

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

@github-actions github-actions bot added the awaiting review label Feb 9, 2025

kou (Member) commented Feb 10, 2025

@github-actions crossbow submit wheel-musllinux-*

Revision: 732fc35

Submitted crossbow builds: ursacomputing/crossbow @ actions-d80189b582

Task Status
wheel-musllinux-1-2-cp310-cp310-amd64 GitHub Actions
wheel-musllinux-1-2-cp310-cp310-arm64 GitHub Actions
wheel-musllinux-1-2-cp311-cp311-amd64 GitHub Actions
wheel-musllinux-1-2-cp311-cp311-arm64 GitHub Actions
wheel-musllinux-1-2-cp312-cp312-amd64 GitHub Actions
wheel-musllinux-1-2-cp312-cp312-arm64 GitHub Actions
wheel-musllinux-1-2-cp313-cp313-amd64 GitHub Actions
wheel-musllinux-1-2-cp313-cp313-arm64 GitHub Actions
wheel-musllinux-1-2-cp313-cp313t-amd64 GitHub Actions
wheel-musllinux-1-2-cp313-cp313t-arm64 GitHub Actions
wheel-musllinux-1-2-cp39-cp39-amd64 GitHub Actions
wheel-musllinux-1-2-cp39-cp39-arm64 GitHub Actions

dev/tasks/python-wheels/github.musllinux.yml (outdated, resolved)
ci/scripts/python_wheel_musllinux_build.sh (outdated, resolved)
ci/docker/python-wheel-musllinux.dockerfile (outdated, resolved)
ci/docker/alpine-linux-3.18-verify-rc.dockerfile (outdated, resolved)
@kou kou changed the title from "GH-18036: [Packaging] Build python wheel for musl linux" to "GH-18036: [Packaging] Build Python wheel for musllinux" Feb 10, 2025

@raulcd raulcd (Member) left a comment

Thanks for the PR

dev/tasks/python-wheels/github.musllinux.yml (outdated, resolved)

pitrou (Member) commented Feb 10, 2025

I'm not able to get python-wheel-musllinux-test-unittests to pass because 2 tests are failing, and I don't think they are related to my changes. Can you please confirm? The failing tests are:

* test_uwsgi_integration

This one should be investigated, as it ends with a crash in uWSGI, even though the test is meant to check that uWSGI doesn't crash.
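
For context, the `-11` in the failed assertion (`assert -11 == 30`) is how Python's `subprocess` module reports a child that was killed by a signal on POSIX: a return code of `-N` means signal `N`, and signal 11 is SIGSEGV on Linux. That matches the "got Segmentation Fault" line in the captured stderr. A small sketch, where `describe_returncode` is a hypothetical helper and not part of the pyarrow test suite:

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Turn a Popen return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports "terminated by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"unknown signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


print(describe_returncode(-11))  # what the test observed
print(describe_returncode(30))   # what the test expected (UWSGI_END_CODE)
```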

* test_print_stats

This one looks like the test is too strict (it assumes that Linux implies glibc); we should probably relax it on musllinux.
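
One possible way to relax the check is sketched below, under the assumption that only glibc-based Linux is required to emit statistics. `expects_pool_stats` is a hypothetical helper, not something that exists in the pyarrow test suite. It relies on `platform.libc_ver()`, which reports the name `'glibc'` on glibc systems and, as far as I know, a different or empty name on musl.

```python
import platform
import sys


def expects_pool_stats(sys_platform: str, libc_name: str) -> bool:
    """Decide whether print_stats() output should be required.

    Assumption (not confirmed in this PR): only Linux with glibc is
    expected to write statistics to stderr.
    """
    return sys_platform == "linux" and libc_name == "glibc"


# Query the running interpreter; only the libc name is used here.
libc_name, _libc_version = platform.libc_ver()
print(expects_pool_stats(sys.platform, libc_name))
```

In the test, the existing `if sys.platform == "linux":` guard could then become a call to such a helper, so that the `res.stderr.strip() != ""` assertion is skipped on musllinux.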

Merge manylinux and musllinux build scripts into one
@nveloso nveloso force-pushed the python-wheel-for-alpine branch from 7c40829 to 8ce0f68 February 13, 2025 22:50