-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split schedule many #54
Split schedule many #54
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has worked really nicely. A few minor comments, and some notes for getting multi-schedule and read-only working together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjp41 next_slot_reader, next_slot_writer: The comment added in commit doesn't follow the assert. The assert was there to ensure that last two bits of the pointer are zero (since we are using those bits for other purposes). I think the comment about specific bits being set in the status variable should be removed as there is no precondition for those bit to be in a particular state when the function is called.
@marioskogias @mjp41: I think there might be a deadlock scenario after the refactoring. In the release phase, if the current one is a reader, it will not wakeup the next slot if that is the reader. The assumption is that all contiguous readers in the queue (without any writer in between) will be woken up simultaneously: Either by the reader being the first one in the queue or writer waking up all readers until it reaches the end or finds another writer.
Before the refactoring, a reader slot when entering the queue (first one in the queue) will mark itself as read_available before setting the slot as ready. This will ensure that when a new reader joins the queue, it will spin on the 2PL bit and once the loop breaks, the new reader will see that the previous reader is available and will mark itself as ready for execution (decrement the exec_count).
After the refactoring, the steps done when reader A is the first one in the queue is moved after setting the slot as ready. This can create a scenario where the next reader B sees that the previous reader A is blocked, so it will assume that some writer will wake them up (That is the only possibility). And the reader A just marks itself as available and continue. Now B is blocked as no one will wake it up.
I'm not able to replicate this scenario via asserts but I think it can result in a deadlock. One possible solution is to wakeup reader B when reader A sees that the next slot is a reader and it is set before A marks its slot as available.
So I think Vishal is right. So there are two cases where we need to specify that the read_available flag:
The second of these is after the release phase, so I think could cause a lost wake up. The first one is in the correct place. I think we need to flag the read_available for read only chains in: verona-rt/src/rt/sched/behaviourcore.h Lines 849 to 853 in 45e9371
|
The read reference count also needs to be taken along with marking the slot as being read available before marking the slot as ready otherwise read and writes can happen in parallel. |
I am not sure I fully understand the concern. I see the difference before and after the refactoring, but it is not clear to me why in this example A will not wake up B. Is the problem that B will never be scheduled or that it will not be scheduled as early as it could? I think that given that we call resolve() at the very end of the function, none of the behaviours are run before this function finishes. |
@mjp41 @vishalgupta97 I made some changes now set the read_available before the release. I also tried to consolidate and remove duplicate code with the Please take a look and let me know what you think. |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -759,6 +793,9 @@ namespace verona::rt | |||
Slot* last_slot; | |||
size_t transfer_count; | |||
bool had_no_predecessor; | |||
// The last two are only use for reads only chains | |||
size_t ref_count; | |||
size_t ex_count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be merged with the ec
array? Or the ec
array be removed and just use this?
Attempt to refactor schedule many as proposed in #49