-
Notifications
You must be signed in to change notification settings - Fork 993
[embassy-executor]: Upstream "Earliest Deadline First" Scheduler #4035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
CC @hawkw for thoughts |
284187b
to
8510b48
Compare
https://crates.io/crates/cordyceps/0.3.3 is now published, I'll get this PR updated to use the published crate tomorrow. |
b718466
to
46a975b
Compare
Updated with released |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No objection from me, seems like a nice improvement with the cordyceps usage as well.
Some concerns:
|
Appreciate the reviews!
Yep, this could be addressed in a future PR, right now we've only been using this on targets with atomics. How we do this might depend on what we decide to do with Cordyceps.
I'd like to try and make the case for keeping the cordyceps dependency! In it's favor:
We certainly COULD "hard-fork" cordyceps, but by the time that we've brought over the datastructures and tests, we will have essentially vendored a significant part of the crate. I could also narrowly "backport" the changes needed for DRS on top of the hand-rolled impl that embassy used, but that would be disappointing, IMO, for the testing and shared usage reasons mentioned above.
I'd definitely look at the Linked trait used in cordyceps, IMO it is still a bit of an unsafe API, but DOES do a good job of being the basics you need to abstract over intrusive structures. This commonality allowed TaskHeaders to be moved between
The reason that it is done that way is that the TransferStack is atomically shared (with interrupts!), while the local It's maybe possible to do "search and extract" without a mutex, but it would definitely be "spicier" to reason about. Especially without miri and loom testing.
It comes from an internal proposal document, though I don't think there's any attachment to the name. We could rename it if you prefer. There's a chance for confusion if we name it something that doesn't work exactly the same, but I'm not too stressed about it. I'll ask internally if there's prior art for the chosen name. Either way, I'll resolve the merge conflict so it's clean! |
2c4d045
to
2d5003e
Compare
* Start hacking in cordyceps This adds a third kind of runqueue, for now it should work the same as the current "atomics" runqueue, but uses a cordyceps TransferStack instead of the existing home-rolled linked list. * Clean up, use new cordyceps feature * A bit more cleanup * Update docs to be more clear
This implements a minimal version of Deadline Rank Scheduling, as well as ways to access and set Deadlines. This still needs some UX improvements, but is likely Enough for testing.
2d5003e
to
f914341
Compare
Rebased the PR, and changed the name to "Earliest Deadline First" scheduler. I also added an example using the scheduler and Happy to update if there's a better way though! |
The current basic demo gives this as output, by the way:
|
I'm not sure I understand the end goal, is it to test that the example works, or that the EDF works on nrf? For the latter, I think adding a test in tests/nrf/ with the appropriate
This should work I think, you add a feature in tests/nrf/Cargo.toml named 'edf' which enables it on the embassy-executor dep, and add the As for the other comments on the PR, I'm fine with the cordyceps deps, and actually think it's good to not roll our own (To me, it seems even better to reuse something outside for the additional 'hardening' you get, you can reduce risk by pinning the version). |
Hmm, probably not, mostly just ensuring there is some basic regression testing for the feature, in case something happens in future refactoring. I'll think about it a bit and do a follow-up PR if it makes sense.
Agreed :) I'd like to get an ok (at at least a "meh, whatever") from @Dirbaio since he raised objections before I actually hit merge tho. |
I would prefer everything to land together. Reasons:
re cordyceps: it's not about its quality (it's great), it's about the complexity and amount of code embassy-executor contributors and maintainers have to deal with. Status quo is a straightforward implementation of a transfer stack with two atomic ops. Also, using cordyceps won't magically make the executor correct. For example, one key requirement is you mustn't enqueue tasks that are already enqueued. This is ensured by If we want to catch these bugs we'd have to add Loom testing to the whole executor, and if we do that then the fact that Cordyceps has loom testing of the data structures in isolation doesn't help that much. We already do have miri testing btw.
This is a very good point. It should be doable though. When a task is enqueued it's "exclusively owned" by the executor thread until the RUN_QUEUED bit is cleared, so it should be fine to iterate and pop. You only have to worry about contention in the queue head I think. The reason I think "iterate queue to choose which task to poll" is better is because it makes it easier to decouple the scheduler from the runqueue. A "scheduler" could be a thing that gets to iterate the runqueue and emits a verdict of which task to poll next. The default scheduler would decide to poll the first task its sees, the deadline scheduler would iterate to find the earliest deadline and decide to poll that. |
This was why I re-implemented the existing atomic runqueue on top of the TransferStack, which is how I originally interpreted your request to de-duplicate. We're would be in the same state if this PR is merged as we are today, there are two runqueues, one atomic, and one non-atomic. The current code is (I believe) a proof of concept we are already at the point where you can implement different schedulers with the same runqueue - both "regular" and "edf" atomic variants share the same data structure, constructor, and enqueue method, only the dequeue method is different.
Performance wise, I think they will be equivalent, pop sort pop (cordyceps) is basically I worry about edge cases and complexity when it comes to the search + pop. We'd need to:
Basically, we would need to reimplement this code: With the added complexity that those two initial head special cases are ALSO now CAS loops instead of just regular pointer ops, and we might have to start over or do a more complicated removal if HEAD was moved at all while we do the first two steps. In general: I think trying to search in-place is going to be MUCH more complex for little gain, performance or API clarity wise. I'd probably push back on this request for the EDF scheduler. Regarding the "one true waitqueue", would you be interested in a CS-capable version of If not, there are two alternatives I can think of: The first would be to "inline" the functionality in embassy-executor, keeping two separate runqueues, and inventing a new api, essentially identical to cordyceps' TransferQueue, to allow abstraction of the single runqueue for different scheduler implementations. The second would be to rework the So, overall:
I'm more than happy to implement a CS-based version of cordyceps::TransferStack to allow for the simplification of the RunQueue interface, if having cordyceps isn't a hard no from you. Otherwise I'll pursue re-implementing SortedList manually to match the previous atomic runqueue. |
This PR builds on top of #4033, and shouldn't be merged until that PR is.This PR also requires hawkw/mycelium#520 to be merged and released before this PR can be merged.This PR adds the
"Deadline Rank Scheduler""Earliest Deadline First Scheduler", a cooperative (non-preemptive) scheduling algorithm that takes into account the deadline of a task when deciding which tasks in the WAITING state to poll first.This PR also re-implements the
run_queue_atomics
implementation to use thecordyceps::TransferStack
instead of the previously home-rolled atomic linked list/stack, in order to reduce the duplicated code.