[ci] Python tests failing with error about numpy image #4204

Closed
jameslamb opened this issue Apr 19, 2021 · 6 comments · Fixed by #4238

Comments

@jameslamb
Collaborator

Description

For the last week or so, I have seen a few Python CI jobs fail with an error like the following:

__________ ERROR collecting tests/python_package_test/test_sklearn.py __________
ImportError while importing test module '/Users/runner/work/LightGBM/LightGBM/tests/python_package_test/test_sklearn.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../../miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/__init__.py:22: in <module>
    from . import multiarray
../../../../miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/multiarray.py:12: in <module>
    from . import overrides
../../../../miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/overrides.py:7: in <module>
    from numpy.core._multiarray_umath import (
E   ImportError: dlopen(/Users/runner/miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so, 2): Library not loaded: @rpath/libopenblas.dylib
E     Referenced from: /Users/runner/miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so
E     Reason: image not found

During handling of the above exception, another exception occurred:
../../../../miniconda/envs/test-env/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
../tests/python_package_test/test_sklearn.py:7: in <module>
    import numpy as np
../../../../miniconda/envs/test-env/lib/python3.9/site-packages/numpy/__init__.py:140: in <module>
    from . import core
../../../../miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/__init__.py:48: in <module>
    raise ImportError(msg)
E   ImportError: 
E   
E   IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
E   
E   Importing the numpy C-extensions failed. This error can happen for
E   many reasons, often due to issues with your setup or how NumPy was
E   installed.
E   
E   We have compiled some common reasons and troubleshooting tips at:
E   
E       https://numpy.org/devdocs/user/troubleshooting-importerror.html
E   
E   Please note and check the following:
E   
E     * The Python version is: Python3.9 from "/Users/runner/miniconda/envs/test-env/bin/python"
E     * The NumPy version is: "1.19.2"
E   
E   and make sure that they are the versions you expect.
E   Please carefully study the documentation linked above for further help.
E   
E   Original error was: dlopen(/Users/runner/miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so, 2): Library not loaded: @rpath/libopenblas.dylib
E     Referenced from: /Users/runner/miniconda/envs/test-env/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so
E     Reason: image not found
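
As a rough diagnostic for the "Library not loaded: @rpath/libopenblas.dylib" error above, the sketch below lists any libopenblas files under the active environment's lib directory. It assumes that @rpath resolves to <env>/lib, which is typical for conda-built packages but is not confirmed anywhere in this issue. find_shared_libs is a hypothetical helper name, not part of LightGBM's CI scripts.

```python
import sys
from pathlib import Path


def find_shared_libs(prefix, stem="libopenblas"):
    """Return sorted paths in <prefix>/lib whose file name starts with `stem`.

    Assumption: shared libraries for the environment live in <prefix>/lib.
    """
    lib_dir = Path(prefix) / "lib"
    if not lib_dir.is_dir():
        return []
    return sorted(p for p in lib_dir.iterdir() if p.name.startswith(stem))


# Check the environment of the interpreter that is running this script.
matches = find_shared_libs(sys.prefix)
if matches:
    for path in matches:
        print(f"found: {path}")
else:
    print(f"no libopenblas files under {Path(sys.prefix) / 'lib'}")
```

An empty result in a failing job would be consistent with the "image not found" message, although it would not by itself explain why the library is missing.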

I see the same stack trace (or a very similar one) for all test modules, not just test_sklearn.py:

=========================== short test summary info ============================
ERROR ../tests/python_package_test/test_basic.py
ERROR ../tests/python_package_test/test_consistency.py
ERROR ../tests/python_package_test/test_dask.py
ERROR ../tests/python_package_test/test_dual.py
ERROR ../tests/python_package_test/test_engine.py
ERROR ../tests/python_package_test/test_plotting.py
ERROR ../tests/python_package_test/test_sklearn.py
ERROR ../tests/python_package_test/test_utilities.py
!!!!!!!!!!!!!!!!!!! Interrupted: 8 errors during collection !!!!!!!!!!!!!!!!!!!!
============================== 8 errors in 2.18s ===============================

I most recently saw this on the sdist (macOS-latest, Python 3.9) job on GitHub Actions (https://github.com/microsoft/LightGBM/pull/4203/checks?check_run_id=2382402602).

Reproducible example

This has been happening sporadically on CI jobs. In every case where I've encountered it, I have found that re-running the job fixes it.

Environment info

LightGBM continuous integration.

Additional Comments

I have seen this error on several different PRs, most of which did not touch the Python package, so I do not think it is related to any one open PR. For example, the build linked above is from #4203, which only has R package changes.

I see one similar issue on numpy's issue tracker: numpy/numpy#18663. In that issue, the author reported that the error was resolved by preferring numpy from conda-forge over the Anaconda default channels.

I do see that the most recent version of numpy (1.20.2) was uploaded to conda-forge 21 days ago (https://anaconda.org/conda-forge/numpy)

That version has not made it to the defaults channel yet. The last update of numpy there was version 1.19.2 more than 4 months ago (https://anaconda.org/anaconda/numpy).

I didn't mark this blocking since it seems to only happen occasionally.

@jameslamb
Collaborator Author

I'm thinking about how we had a similar issue solved by #4054, where lag in updating the anaconda default channels caused some incompatible versions to be found in CI.

And I remember the comment that we shouldn't MIX different channels because it can make environment solves a lot slower: #4054 (review)

@StrikerRUS do you think we should change CI jobs to use ONLY conda-forge? All of the packages we care about are on conda-forge, and I'm getting the impression that the automation for conda-forge provides strong guarantees about timely updates to packages and a lower risk of conflicting packages being uploaded.

@StrikerRUS
Collaborator

I believe the cause of the error is the following line:

  libopenblas        pkgs/main::libopenblas-0.3.13-h7ddc91~ --> conda-forge::libopenblas-0.3.7-hd44dcd8_1

This line is present in the failing (red) CI jobs and absent from the passing (green) ones.
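
A minimal sketch of how a CI log could be scanned for this kind of change, where a package moves from one channel to another. It assumes both sides of the arrow carry a `channel::` prefix, as in the line quoted above; changes printed without that prefix are ignored. cross_channel_changes is a hypothetical helper, not something that exists in LightGBM's CI.

```python
import re

# Matches lines shaped like:
#   <name>   <old_channel>::<old_spec> --> <new_channel>::<new_spec>
CHANGE = re.compile(
    r"^\s*(?P<name>\S+)\s+"
    r"(?P<old_channel>\S+?)::(?P<old_spec>\S+)\s+-->\s+"
    r"(?P<new_channel>\S+?)::(?P<new_spec>\S+)\s*$"
)


def cross_channel_changes(log_text):
    """Return (name, old_channel, new_channel) for packages that switch channels."""
    changes = []
    for line in log_text.splitlines():
        m = CHANGE.match(line)
        if m and m["old_channel"] != m["new_channel"]:
            changes.append((m["name"], m["old_channel"], m["new_channel"]))
    return changes


# The line quoted in this comment, used as sample input.
sample = (
    "  libopenblas        pkgs/main::libopenblas-0.3.13-h7ddc91~ "
    "--> conda-forge::libopenblas-0.3.7-hd44dcd8_1"
)
print(cross_channel_changes(sample))
# [('libopenblas', 'pkgs/main', 'conda-forge')]
```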

> And I remember the comment that we shouldn't MIX different channels because it can make environment solves a lot slower

Also, we should avoid mixing channels because they have different policies for handling native libraries: conda-forge/graphviz-feedstock#35 (comment).

I would prefer installing graphviz via pip or some other way, which would allow us to drop the conda-forge channel from our CI. I believe that conda install ... is more common than conda install -c conda-forge ... among users, and doing so would let us replicate the most common use cases in our CI.
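
To check whether an environment actually draws packages from more than one channel, something like the following could group installed packages by channel. This is only a sketch: it assumes each record has "name" and "channel" keys, which is the shape expected from `conda list --json`, and the sample records below are made up for illustration. packages_by_channel is a hypothetical helper.

```python
from collections import defaultdict


def packages_by_channel(records):
    """Group package names by the channel each one was installed from."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.get("channel", "<unknown>")].append(rec["name"])
    return {channel: sorted(names) for channel, names in grouped.items()}


# Hypothetical records. Real ones could be loaded with, for example:
#   json.loads(subprocess.check_output(["conda", "list", "--json"]))
records = [
    {"name": "numpy", "channel": "pkgs/main"},
    {"name": "libopenblas", "channel": "conda-forge"},
    {"name": "scipy", "channel": "pkgs/main"},
]

for channel, names in packages_by_channel(records).items():
    print(f"{channel}: {', '.join(names)}")
```

More than one channel in the output would indicate the kind of mixing discussed in this thread.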

@jameslamb
Collaborator Author

> conda install is more common than conda install -c conda-forge

I feel that conda-forge is just as popular as the default channel these days, but I don't have any evidence of that to offer. By the way, NumPy's installation docs do recommend using conda-forge on Windows and Mac: https://numpy.org/install/

Anyway, can you clarify...do you think that this issue would be solved by switching to pip install-ing graphviz? Or is that an unrelated comment?

@StrikerRUS
Collaborator

Yeah, I wish I knew some statistics about channel usage.

> NumPy's installation docs do recommend using conda-forge on Windows and Mac

Sorry, I can't find this. I searched for "forge" on that page but didn't see where they recommend conda-forge. Do you mean this line in the Advanced users section?

> Unless you’re fine with only the packages in the defaults channel, make conda-forge your default channel via setting the channel priority.

> Anyway, can you clarify...do you think that this issue would be solved by switching to pip install-ing graphviz? Or is that an unrelated comment?

My comment was about this issue: install graphviz by another method -> drop the conda-forge channel -> fix this issue.

@jameslamb
Collaborator Author

Yes, I meant that line in the Advanced Users section. Sorry, they don't have anchors set up on that page to link to specific sections.

Thanks for clarifying. I'll experiment with other ways to install graphviz then.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023