-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enqueue/dequeue operation ordering issue #129
Comments
We do have issues in the queue library to do with atomics and memory orderings. This is a current work in progress and we plan to increment and read the head and tail index using atomic operations with read aquire semantics. (rather than standalone fences). However, I think in the example you gave the compiler/CPU would be unable to reorder those two memory operations due to a data dependency of using the tail in the first line. |
I'm not a compiler expert, but based on personal experience (I have seen gcc generate code that reordered stores) and on conversations with someone more knowledgeable, I believe prudence is in order. While there is a dependence that the compiler must honor (use of the original value of Perhaps the following quote from the C specification is the most relevant: "an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries" (C11 specification, section 5.1.2.3 [I'm reading a draft of the standard close to actual publication; I expect the actual standard to be very close]). In other words, as long as the “observable behavior” remains the same, then the compiler can perform various optimizations. It is as if the optimizer sees our code as equivalent to the following single line:
And in that line of code, an implied order is harder to discern. Obviously things changes if the accesses are volatile: "at a sequence point all previous accesses to volatile objects have stabilized and no subsequent accesses have occurred" (from the GCC documentation). But that's not currently the case here. If you change your code to use atomics, then you may introduce "volatile" statements (atomics are often defined as volatile inlined assembly statements) that will prevent the compiler from applying certain optimizations (e.g., store instruction reordering). |
There needs to be a memory barrier here even on UP to prevent the reordering. |
A C compiler can and will reorder non-volatile operations (even across sequence points). So I'm wondering if the enqueue/dequeue functions (e.g., in sddf/include/sddf/network/queue.h) need annotations to prevent such reorderings (I ran into such an issue in a previous project).
In other words, I'm wondering if a code sequence like
might be necessary even in the uniprocessor case (the sequence above is gcc specific; a different sequence might be required for a different compiler).
In summary, the potentially necessary change prevents the compiler from reordering some operations; while THREAD_MEMORY_RELEASE() prevents the hardware from doing analogous things at run time.
The text was updated successfully, but these errors were encountered: