Conflict with 'do' loop from CUDArt #3
Comments
Thanks for reporting this issue. Tomorrow I'll take a look at possible causes. Note, though, that my GPU-enabled computer is currently being repaired, so I'll only be able to test a solution in 4-5 days. I'll try to come up with some useful ideas earlier. |
I have a Windows machine and a Mac, both with a GPU enabled, so you can count on me to test whatever you want. |
Can you please try the following 2 snippets:
and
My current guess is that |
In fact, the problem may be in the default random number generator, which may reference a device that has already been reset. Can you replace:
with
? |
Just tested #1 and #2 as first described; neither works on the second run. As for:
and
both return this different error:
in IJulia
|
Forgot to ask: how many devices do you have? I.e. what is the length of |
Only 1 device! |
OK, one more experiment. Please enter the following and check the result:
If I understand it correctly, at least some of these pairs should give an error, but better to make sure. |
Well,
gives an error anyway, since the only device I have is
Finally,
gives no error |
Ah, sorry for providing incorrect commands. Today I should get back my laptop, so I will be able to make further experiments myself. |
It becomes curiouser and curiouser. I modified your example just a little bit and the error seems to be gone:
Note that this time call to Exact criteria function doesn't matter, even stub function |
Well, it makes sense, because However, I was able to get some more interesting error message:
Last call results in the following stack trace:
|
Some more updates. I can reproduce this error in pure C:
Compiling and running it gives:
Where 77 stands for If, however, we move |
My bad, in the C code I provided I didn't recreate the RNG. With the RNG recreated, the code works. But what's interesting is that the Julia code from one of the previous examples works fine for me too!
I can run this code any number of times in the REPL without any error. @joaquimg Could you please open a fresh Julia session (without any previous commands) and try it out? It looks like once the CUDA runtime has hit a memory error, it gets into an inconsistent state, and any further operations, including reallocation of memory and recreation of the RNG, start behaving incorrectly, producing strange and illogical errors. So any testing should always be done in a fresh session: once an error has occurred, all further results are compromised. |
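For readers following along, the "recreated RNG" idea discussed above can be sketched in Julia roughly as follows. This is an illustration only, not one of the snippets originally posted in this thread. It assumes that CURAND.jl provides `create_generator`, `destroy_generator` and a `curand(rng, Float64, n)` method, and that CUDArt's `device(dev)` selects the active device; `sample_with_fresh_rng` is a hypothetical helper name. Whether this avoids the error on the affected Windows setup is not established here.

```julia
using CUDArt
using CURAND

# Hypothetical helper: draw n uniform Float64 values using a generator that
# is created and destroyed inside the `devices ... do` block, so it never
# outlives the device reset performed when the block exits.
function sample_with_fresh_rng(n)
    devices(dev->capability(dev)[1]>=2) do devlist
        device(devlist[1])                  # assumed: select the first matching device
        rng = CURAND.create_generator()     # assumed CURAND.jl call, not the default RNG
        try
            d_a = curand(rng, Float64, n)   # assumed method taking an explicit generator
            to_host(d_a)                    # copy to the host before the device is reset
        finally
            CURAND.destroy_generator(rng)   # release the generator while the device is alive
        end
    end
end

a = sample_with_fresh_rng(1000)
```

As noted above, this should be run in a fresh Julia session, since results obtained after a CUDA error has already occurred are not reliable.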
I started the REPL, and right on the second run I got the error:
WARNING: cuRAND operation failed with status:
in statuscheck at C:\Users\joaquimgarcia.julia\v0.4\CURAND\src\wrappers.jl:28
|
Can you try the C version?
Copy and paste this code into a file called "curand_test.cu", then compile and run it with:
|
OK, I could build the program, but how do I run it on Windows? Sorry for
|
OK, I generated an .exe and got this as the result:
0.7402 0.9210 0.0390 0.9690 0.9251 0.4464 0.6673 0.1099 0.4702 0.5132
|
So it's the same error as before. Now we know that:
Is there anybody with another Windows machine who can test it and see whether the issue is reproduced? |
That's a good result! Maybe it's time to ask for help on julia-users or julia-dev?
|
Good idea, I posted a request for experiment in the old topic. As for my config, yes, I'm on Linux with CUDA 6.5. |
The C code returned the same on my Mac with CUDA 7.0:
0.7402 0.9210 0.0390 0.9690 0.9251 0.4464 0.6673 0.1099 0.4702 0.5132
|
I just got the C code running on my Mac with CUDA 7.5, and someone else did it on Windows with CUDA 7.5: |
I found the following problem while using CURAND combined with CUDArt's 'do' loop.
If I load:
using CUDArt
using CURAND
and then I run either:
d_a = curand(Float64, 1000)
a = to_host(d_a)
OR:
result = devices(dev->capability(dev)[1]>=2) do devlist
end
I can repeat either block as many times as I want (as long as it's just one of them).
However, if I run both combined (or alternately, in any order):
d_a = curand(Float64, 1000)
a = to_host(d_a)
result = devices(dev->capability(dev)[1]>=2) do devlist
end
I get the error in IJulia:
WARNING: CUDA error triggered from:
LoadError: "unspecified launch failure"
while loading In[3], in expression starting on line 2
in checkerror at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\libcudart-6.5.jl:16
in checkerror at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\libcudart-6.5.jl:15
in copy! at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\arrays.jl:152
in to_host at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\arrays.jl:87
in include_string at loading.jl:266
in execute_request_0x535c5df2 at C:\Users\joaquimgarcia.julia\v0.4\IJulia\src\execute_request.jl:177
in eventloop at C:\Users\joaquimgarcia.julia\v0.4\IJulia\src\IJulia.jl:141
in anonymous at task.jl:447
and in cmd:
WARNING: CUDA error triggered from:
in checkerror at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\libcudart-6.5.jl:15
in copy! at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\arrays.jl:152
in to_host at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\arrays.jl:87
ERROR: "unspecified launch failure"
in checkerror at C:\Users\joaquimgarcia.julia\v0.4\CUDArt\src\libcudart-6.5.jl:16
This issue is also discussed in the following julia-users post:
https://groups.google.com/forum/#!topic/julia-users/mJjjTyU7cQ0