Enqueue/dequeue operation ordering issue #129

alainkaegi · 2024-06-07T23:01:45Z

A C compiler can and will reorder non-volatile operations (even across sequence points). So I'm wondering if the enqueue/dequeue functions (e.g., in sddf/include/sddf/network/queue.h) need annotations to prevent such reorderings (I ran into such an issue in a previous project).

In other words, I'm wondering if a code sequence like

queue->free->buffers[queue->free->tail % queue->size] = buffer;
__asm__ __volatile("":::"memory");
queue->free->tail++;

might be necessary even in the uniprocessor case (the sequence above is gcc specific; a different sequence might be required for a different compiler).

In summary, the potentially necessary change prevents the compiler from reordering some operations; while THREAD_MEMORY_RELEASE() prevents the hardware from doing analogous things at run time.

The text was updated successfully, but these errors were encountered:

Courtney3141 · 2024-06-19T23:41:05Z

We do have issues in the queue library to do with atomics and memory orderings. This is a current work in progress and we plan to increment and read the head and tail index using atomic operations with read aquire semantics. (rather than standalone fences).

However, I think in the example you gave the compiler/CPU would be unable to reorder those two memory operations due to a data dependency of using the tail in the first line.

alainkaegi · 2024-06-22T06:14:30Z

I'm not a compiler expert, but based on personal experience (I have seen gcc generate code that reordered stores) and on conversations with someone more knowledgeable, I believe prudence is in order.

While there is a dependence that the compiler must honor (use of the original value of tail and its update later on), there is no dependence that compels the compiler to update the content of tail and buffers in a particular order (at least as far as the code snippet we are considering).

Perhaps the following quote from the C specification is the most relevant: "an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries" (C11 specification, section 5.1.2.3 [I'm reading a draft of the standard close to actual publication; I expect the actual standard to be very close]).

In other words, as long as the “observable behavior” remains the same, then the compiler can perform various optimizations. It is as if the optimizer sees our code as equivalent to the following single line:

queue->free->buffers[queue->free->tail++ % queue->size] = buffer;

And in that line of code, an implied order is harder to discern.

Obviously things changes if the accesses are volatile: "at a sequence point all previous accesses to volatile objects have stabilized and no subsequent accesses have occurred" (from the GCC documentation). But that's not currently the case here.

If you change your code to use atomics, then you may introduce "volatile" statements (atomics are often defined as volatile inlined assembly statements) that will prevent the compiler from applying certain optimizations (e.g., store instruction reordering).

wom-bat · 2024-06-27T22:51:29Z

There needs to be a memory barrier here even on UP to prevent the reordering.

alainkaegi changed the title ~~Enqueue/dequeue opperation ordering issue~~ Enqueue/dequeue operation ordering issue Jun 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enqueue/dequeue operation ordering issue #129

Enqueue/dequeue operation ordering issue #129

alainkaegi commented Jun 7, 2024

Courtney3141 commented Jun 19, 2024

alainkaegi commented Jun 22, 2024

wom-bat commented Jun 27, 2024

Enqueue/dequeue operation ordering issue #129

Enqueue/dequeue operation ordering issue #129

Comments

alainkaegi commented Jun 7, 2024

Courtney3141 commented Jun 19, 2024

alainkaegi commented Jun 22, 2024

wom-bat commented Jun 27, 2024