Large amount of error output can lead to crashes #2067
Comments
That's ultimately due to the CUDA driver not handling the amount of output we throw at it. It shouldn't crash, but we probably also shouldn't be emitting that much output in the first place. Maybe we could have a global atomic flag that keeps track of how many exceptions have been reported, so that we can stop emitting output once that count exceeds a limit.
Hi, I would be happy to address this issue. I think the bug could be fixed by adjusting the method linked below: https://github.com/JuliaGPU/CUDA.jl/blob/master/src/device/runtime.jl#L29-L48 If you agree, I can start a PR for it.
Yes, that may work. Also, no need to ask for permission to work on issues!
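A minimal sketch of the atomic-flag idea, assuming a plain device-side counter rather than CUDA.jl's actual runtime machinery (the kernel, `MAX_REPORTS`, and the counter array are all illustrative, not existing API):

```julia
using CUDA

const MAX_REPORTS = Int32(10)  # illustrative cap on how many exceptions get printed

function report_errors(counter)
    i = threadIdx().x
    # pretend every thread hits an error and wants to report it
    prev = CUDA.atomic_add!(pointer(counter), Int32(1))  # returns the old count
    if prev < MAX_REPORTS
        @cuprintln("thread $i: exception encountered")
    end
    return
end

counter = CUDA.zeros(Int32, 1)
@cuda threads=256 report_errors(counter)
synchronize()
```

Only the first few threads to increment the counter actually print, so the total amount of output handed to the driver stays bounded regardless of how many threads hit an exception.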
https://developer.nvidia.com/blog/asynchronous-error-reporting-when-printf-just-wont-do/ may contain some useful information for this |
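The approach described there is essentially to record error information into a preallocated device buffer instead of printing from the kernel, and to inspect that buffer from the host afterwards. A rough sketch of that idea in Julia (all names and the error condition are illustrative):

```julia
using CUDA

struct ErrorRecord
    thread::Int32
    code::Int32
end

function record_errors(records, count)
    i = Int32(threadIdx().x)
    if i % Int32(7) == Int32(0)                      # pretend some threads fail
        slot = CUDA.atomic_add!(pointer(count), Int32(1)) + Int32(1)
        if slot <= length(records)                   # drop records past capacity
            @inbounds records[slot] = ErrorRecord(i, Int32(42))
        end
    end
    return
end

records = CuArray{ErrorRecord}(undef, 16)
count = CUDA.zeros(Int32, 1)
@cuda threads=256 record_errors(records, count)
synchronize()

# read back only the slots that were actually filled
n = min(Int(Array(count)[1]), length(records))
for rec in Array(records)[1:n]
    println("thread $(rec.thread) reported error code $(rec.code)")
end
```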
Describe the bug
I'm not sure whether this is helpful, but the error message explicitly asked to be submitted, so here it is. I produced it by mistake.
To reproduce
This throws an extremely long error message that took quite some time to finish. VSCode could not fit it all, so the beginning of the message is not included here. Each of the lines below was repeated many times.
And finally
Manifest.toml
Expected behavior
Same as if you drop CUDA in the above code.
Version info
Details on Julia:
Details on CUDA: