I just installed from pip #47

Open
renjiechao88 opened this issue Dec 17, 2020 · 11 comments

Comments

@renjiechao88

>>> from thundergbm import TGBMClassifier
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dell/anaconda3/lib/python3.7/site-packages/thundergbm/__init__.py", line 10, in <module>
    from .thundergbm import *
  File "/home/dell/anaconda3/lib/python3.7/site-packages/thundergbm/thundergbm.py", line 32, in <module>
    thundergbm = CDLL(lib_path)
  File "/home/dell/anaconda3/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10.0: cannot open shared object file: No such file or directory

@rivershah

I am getting the same error. Here is the cuda version on the machine.

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

@Kurt-Liuhf
Collaborator

Kurt-Liuhf commented Dec 21, 2020

Hi, can you show me which commands you used to install ThunderGBM? Note that you should use the .whl file here if you want to use ThunderGBM on a Windows system.

@renjiechao88
Author

I installed on Ubuntu 16.04 using pip install thundergbm, but when I run
from thundergbm import *
it fails.

@Kurt-Liuhf
Collaborator

Hi @renjiechao88, thanks for your feedback. I tried the same ThunderGBM installation command on CentOS and it passed the test. After reading your error report, I recommend the following:
cd to /usr/local/cuda (or your self-defined CUDA installation path).
Run find -name libcus* to see whether you can find libcusparse.so.10.0 or another version of libcusparse.so.* that matches your CUDA version.
If you can find it, make sure your environment variables are configured correctly (check with echo $LD_LIBRARY_PATH).
If you cannot find it, please try installing the corresponding CUDA Toolkit and configuring the environment variables.
Hope it helps.
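
The check described above can also be scripted. The following is a minimal sketch, not part of ThunderGBM: the helper name find_shared_libs is hypothetical, and it assumes the directories of interest are the ones listed in LD_LIBRARY_PATH. It only lists matching file names; it does not verify that the dynamic loader would actually pick them up.

```python
import glob
import os


def find_shared_libs(pattern, search_path):
    """Return sorted paths matching `pattern` in each directory of a
    path-separator-delimited search path (':' on Linux)."""
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        matches.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(matches)


if __name__ == "__main__":
    # Assumption: LD_LIBRARY_PATH holds the CUDA lib64 directory.
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    for path in find_shared_libs("libcusparse.so*", ld_path):
        print(path)
```

If the output contains no libcusparse.so.10.0 entry, the installed CUDA version probably differs from the one the wheel was built against.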

@renjiechao88
Author

Using the find -name libcus* command, my result is:
./doc/man/man7/libcusolver.so.7
./doc/man/man7/libcusparse.7
./doc/man/man7/libcusparse.so.7
./doc/man/man7/libcusolver.7
./lib64/libcusparse.so.9.0
./lib64/libcusolver.so.9.0.176
./lib64/stubs/libcusparse.so
./lib64/stubs/libcusolver.so
./lib64/libcusolver.so.9.0
./lib64/libcusparse_static.a
./lib64/libcusparse.so
./lib64/libcusolver_static.a
./lib64/libcusparse.so.9.0.176
./lib64/libcusolver.so
My CUDA version is also 9.0 and I can use ThunderSVM normally, but when I use ThunderGBM it fails with this error:
OSError: libcusparse.so.10.0: cannot open shared object file: No such file or directory

@renjiechao88
Author

The result of echo $LD_LIBRARY_PATH is
/usr/local/cuda-9.0/lib64

Thank you for your response, @Kurt-Liuhf.

@Kurt-Liuhf
Collaborator

Hi @renjiechao88, can you try installing ThunderGBM with this wheel file? I have rebuilt it with CUDA 9.0. Thank you.

@rivershah

Here is the sequence of commands I used. Please let me know if further information is needed. Is it a CUDA 11 issue? Thanks.
$ /apps/python/3.7.9/bin/python3.7 -m pip install --upgrade thundergbm
$ echo $LD_LIBRARY_PATH
/apps/python/3.7.9/lib/:/usr/local/cuda/lib64

/usr/local/cuda/lib64 $ ls
libaccinj64.so            libcufft_static_nocallback.a  libcusolver.so            libnppicc.so.11.2.1.68   libnppist.so.11.2.1.68  libnvjpeg_static.a
libaccinj64.so.11.2       libcufftw.so                  libcusolver.so.11         libnppicc_static.a       libnppist_static.a      libnvperf_host.so
libaccinj64.so.11.2.67    libcufftw.so.10               libcusolver.so.11.0.2.68  libnppidei.so            libnppisu.so            libnvperf_host_static.a
libcublasLt.so            libcufftw.so.10.4.0.72        libcusolver_static.a      libnppidei.so.11         libnppisu.so.11         libnvperf_target.so
libcublasLt.so.11         libcufftw_static.a            libcusparse.so            libnppidei.so.11.2.1.68  libnppisu.so.11.2.1.68  libnvptxcompiler_static.a
libcublasLt.so.11.3.1.68  libcuinj64.so                 libcusparse.so.11         libnppidei_static.a      libnppisu_static.a      libnvrtc-builtins.so
libcublasLt_static.a      libcuinj64.so.11.2            libcusparse.so.11.3.1.68  libnppif.so              libnppitc.so            libnvrtc-builtins.so.11.2
libcublas.so              libcuinj64.so.11.2.67         libcusparse_static.a      libnppif.so.11           libnppitc.so.11         libnvrtc-builtins.so.11.2.67
libcublas.so.11           libculibos.a                  liblapack_static.a        libnppif.so.11.2.1.68    libnppitc.so.11.2.1.68  libnvrtc.so
libcublas.so.11.3.1.68    libcupti.so                   libmetis_static.a         libnppif_static.a        libnppitc_static.a      libnvrtc.so.11.2
libcublas_static.a        libcupti.so.11.2              libnppc.so                libnppig.so              libnpps.so              libnvrtc.so.11.2.67
libcudadevrt.a            libcupti.so.2020.3.0          libnppc.so.11             libnppig.so.11           libnpps.so.11           libnvToolsExt.so
libcudart.so              libcupti_static.a             libnppc.so.11.2.1.68      libnppig.so.11.2.1.68    libnpps.so.11.2.1.68    libnvToolsExt.so.1
libcudart.so.11.0         libcurand.so                  libnppc_static.a          libnppig_static.a        libnpps_static.a        libnvToolsExt.so.1.0.0
libcudart.so.11.2.72      libcurand.so.10               libnppial.so              libnppim.so              libnvblas.so            libOpenCL.so
libcudart_static.a        libcurand.so.10.2.3.68        libnppial.so.11           libnppim.so.11           libnvblas.so.11         libOpenCL.so.1
libcufft.so               libcurand_static.a            libnppial.so.11.2.1.68    libnppim.so.11.2.1.68    libnvblas.so.11.3.1.68  libOpenCL.so.1.0
libcufft.so.10            libcusolverMg.so              libnppial_static.a        libnppim_static.a        libnvjpeg.so            libOpenCL.so.1.0.0
libcufft.so.10.4.0.72     libcusolverMg.so.11           libnppicc.so              libnppist.so             libnvjpeg.so.11         nvrtc-prev
libcufft_static.a         libcusolverMg.so.11.0.2.68    libnppicc.so.11           libnppist.so.11          libnvjpeg.so.11.3.1.68  stubs

Python 3.7.9 (default, Dec 15 2020, 09:47:30)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import thundergbm
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxx/.local/lib/python3.7/site-packages/thundergbm/__init__.py", line 10, in <module>
    from .thundergbm import *
  File "/home/xxx/.local/lib/python3.7/site-packages/thundergbm/thundergbm.py", line 32, in <module>
    thundergbm = CDLL(lib_path)
  File "/apps/python/3.7.9/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10.0: cannot open shared object file: No such file or directory

@Kurt-Liuhf
Collaborator

Hi @rivershah, thanks for the information. I think CUDA 11.0 causes the issue, because it does not provide libcusparse.so.10.0. You can try building a suitable ThunderGBM .whl for your machine from scratch by following the commands listed in How to build the Python wheel file for Linux. Then you can use pip to install the wheel you built.
Enjoy~
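
To confirm this kind of mismatch before rebuilding, the loader error can be compared against the files that are actually installed. The sketch below is illustrative only: try_load and missing_library are hypothetical helper names, the regular expression assumes the Linux dlopen message format shown in the tracebacks in this thread, and the installed list is copied by hand from the ls output posted earlier.

```python
import ctypes
import re


def try_load(lib_path):
    """Try to load a shared library; return None on success,
    or the loader's error text on failure."""
    try:
        ctypes.CDLL(lib_path)
        return None
    except OSError as exc:
        return str(exc)


def missing_library(message):
    """Pull the library name out of a
    'cannot open shared object file' message, or return None."""
    match = re.match(r"(\S+): cannot open shared object file", message)
    return match.group(1) if match else None


# Error text and file names copied from the reports in this thread.
message = "libcusparse.so.10.0: cannot open shared object file: No such file or directory"
installed = ["libcusparse.so", "libcusparse.so.11", "libcusparse.so.11.3.1.68"]

needed = missing_library(message)
print(needed, "installed" if needed in installed else "not installed")
```

With the values above, the script reports libcusparse.so.10.0 as not installed, which matches the diagnosis. In practice, try_load could be pointed at the shared library inside the installed thundergbm package to obtain the message; that path differs per installation and is not assumed here.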

@rivershah

@Kurt-Liuhf Thanks for looking at this. Unfortunately I work in a fairly locked-down cluster environment, and it will be difficult to get these dependencies onto each machine without a simple pip install process. Is there any chance that pip install thundergbm could be made compatible with CUDA 11 out of the box?

@arilwan

arilwan commented Nov 12, 2021

I got this error when trying to rebuild:

$ mkdir build && cd build && cmake .. && make -j
....
CMakeFiles/Makefile2:126: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/all' failed
make[1]: *** [src/thundergbm/CMakeFiles/thundergbm.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

I couldn't trace both errors back in the log, but here is one:

~/thundergbm/src/thundergbm/sparse_columns.cu(52): error: identifier "cusparseScsr2csc" is undefined

1 error detected in the compilation of "~/thundergbm/src/thundergbm/sparse_columns.cu".
CMake Error at thundergbm_generated_sparse_columns.cu.o.cmake:266 (message):
  Error generating file
  ~/thundergbm/build/src/thundergbm/CMakeFiles/thundergbm.dir//./thundergbm_generated_sparse_columns.cu.o

src/thundergbm/CMakeFiles/thundergbm.dir/build.make:154: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/thundergbm_generated_sparse_columns.cu.o' failed

CUDA version:

$ whereis cuda
cuda: /usr/local/cuda

$ ls -l /usr/local/ | grep cuda
lrwxrwxrwx  1 root root   22 nov 11 16:38 cuda -> /etc/alternatives/cuda
lrwxrwxrwx  1 root root   25 nov 11 16:38 cuda-11 -> /etc/alternatives/cuda-11
drwxr-xr-x 16 root root 4096 nov 11 16:38 cuda-11.5

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_Sep_13_19:13:29_PDT_2021
Cuda compilation tools, release 11.5, V11.5.50
Build cuda_11.5.r11.5/compiler.30411180_0
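
A small pre-build check can flag this situation early. This is a sketch under stated assumptions: cuda_release is a hypothetical helper, and the premise that the legacy cusparseScsr2csc routine is unavailable from CUDA 11 onward (superseded by cusparseCsr2cscEx2) is my reading of NVIDIA's cuSPARSE documentation, not something confirmed in this thread, so verify it against the release notes for your toolkit.

```python
import re


def cuda_release(nvcc_output):
    """Return (major, minor) parsed from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Line copied from the nvcc output reported above.
sample = "Cuda compilation tools, release 11.5, V11.5.50"
release = cuda_release(sample)

# Assumption: sources calling cusparseScsr2csc need a toolkit older than CUDA 11.
if release is not None and release[0] >= 11:
    print("CUDA %d.%d detected: legacy cusparseScsr2csc is likely unavailable" % release)
```

To use it on a real machine, the text could come from subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout instead of the hard-coded sample.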
