Test suite hangs with -DGPU_SOLVE #130
Comments
How many MPI processes are you using?
1
I can reproduce the issue on Summit at OLCF.
    #!/bin/bash
    rm -rf build
    export PARMETIS_ROOT=$OLCF_PARMETIS_ROOT
    cmake ..
Run script:
    #!/bin/bash
    cd build
Result:
    Test project /ccs/home/jeanluc/GIT/superlu_dist/build
Thanks for providing these helpful instructions; I can reproduce the issue now. The problem is that calling pdgssvx with nrhs=0 skips some of the setup for GPU solves, which causes a hang when it is called later with nrhs>0 and options->Fact=FACTORED. This commit should fix the problem: However, the GPU solve in the master branch only supports nmpi=1. You will still see the failures reported by "make test" when running with mpirun -np >1. I recommend not enabling GPU solve for the smoke/regression tests.
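To make the failure mode described above concrete, here is a minimal toy sketch in C. It is not SuperLU_DIST code: the type `toy_solver_t` and the functions `toy_factor_buggy`, `toy_factor_fixed`, and `toy_solve_factored` are hypothetical names made up for illustration, and the "fixed" variant is only one plausible shape of a fix, not necessarily what the commit does. The toy solve also returns an error code where the real code would hang.

```c
#include <stdbool.h>

/* Hypothetical stand-in for solver state; NOT a SuperLU_DIST type. */
typedef struct {
    bool factored;        /* a factorization is available for reuse */
    bool gpu_solve_ready; /* GPU solve setup has been performed */
} toy_solver_t;

typedef enum { TOY_OK = 0, TOY_NOT_READY = 1 } toy_status_t;

/* Buggy pattern: the GPU solve setup is skipped when nrhs == 0,
 * because there is nothing to solve during this call. */
static void toy_factor_buggy(toy_solver_t *s, int nrhs)
{
    s->factored = true;
    if (nrhs > 0) {
        s->gpu_solve_ready = true;
    }
}

/* Assumed fix: always perform the setup so a later solve can reuse it. */
static void toy_factor_fixed(toy_solver_t *s, int nrhs)
{
    (void)nrhs; /* setup no longer depends on the number of right-hand sides */
    s->factored = true;
    s->gpu_solve_ready = true;
}

/* Solve that reuses existing factors (analogous to Fact=FACTORED).
 * Missing setup is reported here instead of hanging. */
static toy_status_t toy_solve_factored(const toy_solver_t *s, int nrhs)
{
    if (nrhs <= 0) {
        return TOY_OK; /* nothing to solve */
    }
    if (!s->factored || !s->gpu_solve_ready) {
        return TOY_NOT_READY;
    }
    return TOY_OK;
}
```

With this sketch, calling `toy_factor_buggy(&s, 0)` followed by `toy_solve_factored(&s, 1)` yields `TOY_NOT_READY`, which mirrors the sequence of a first call with nrhs=0 followed by a second call with nrhs>0 that reuses the factors. An explicit readiness check like this is also a way to turn a silent hang into a diagnosable error.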
When I build superlu_dist with -DGPU_SOLVE in the C flags, the test suite seems to fail after printing out:
.. B to X redistribute time 0.0001
.. Setup L-solve time 0.0000
.. L-solve time 0.0003
.. L-solve time (MAX) 0.0003
.. Setup U-solve time 0.0000
Test time = 1500.04 sec
(It seems to reach the time limit of 1500 seconds.)
When I take -DGPU_SOLVE out of the build, the test suite runs fine.