-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap _local_size
into atomic to avoid unwanted modifications while operator()
is used with set()
in petsc vector.
#3809
Conversation
I'm scared of whether these are the only fixes we need, but if this fixes your MOOSE case then it's definitely a step in the right direction. Could you add something like your MOOSE failure case to a new unit test in |
I can confirm that this change fixes the issue, based on 100 runs with 10 threads. The issue usually pooped up in 5-10 runs with this many threads. |
I'm curious how you figured out that this was the fix, for example did you use a debugging tool that suggested it? Also, you made _first and _last std::atomic but we don't .load() or store them anywhere? Finally, general question: is this (declaring a variable atomic) something that needs to be done differently depending on if TBB is used vs. std::threads? |
To be honest I am not a threading expert, but I'll try to answer your questions as accurately as I can.
The debugger pointed to the The reason I used I have no idea if this would change based on different threading libraries, I was wrapping the member variables in atomic based on the examples on the c++ standard website so I hope oneTBB respects that. I am also open to other suggestions for assigning a value to an already existing petsc vector that doesn't rely on other operations which manipulate the size. |
I created a test for this and it exhibits out of memory access in the petsc vector. It is not deterministic, so I think this change did not solve all the problems. The new error is coming from
with
In restore array, this gets set to nullptr too. I am not sure I can pull an atomic here (I am fairly certain I cant considering that this member is used with petsc functions directly). I'll push the test just in case. |
d6fbb50
to
3c82b0a
Compare
I prefer the phrase "a debugging assistant", thank you very much! But seriously, this whole section of code terrifies me; I don't understand the C++ memory-order standards well at all. I tried to figure out all the variables that might get caught out by a read-only get-array transitioning to a read-write one, but aside from the miss we've already caught, I'm not even 100% certain I'm not yet sure what to do about |
Sigh, we'll have to dig into the double checked locking pattern again. With this PR, I still get these errors out of thread sanitizer on the MOOSE tests (using grmnptr/moose@63f614deae)
|
Thanks for running the sanitizer on the PR. I disabled threading on that new capability for the time being. |
I'll close this PR for now. I don't think I'll dive deeper, threading is not yet a requirement for the capability I am working on. |
Closes #3808