-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add support for growing allocations #1340
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍔
goto error; | ||
} | ||
} else { | ||
if (strcmp ("grow_allocate", cmd) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to solidify my understanding here:
- When there is a call to the match request callback, the only condition (match op) this is allowed for is
grow_allocate
- In the cast that the jobid exists but it's not
grow_allocate
, that is an error case. Only that one supports an existing id. - In the case we don't have a job id and it is
grow_allocate
that's another error because we require one! - Shrinking an allocation is done via partial cancel (thus not in this function)
One case I'm interested to test is if it's possible to shrink down to 0 and then go back up, and also what happens if you try to grow or shrink in unreasonable ways (where those ways are TBA!).
@@ -99,7 +100,10 @@ int dfu_traverser_t::request_feasible (detail::jobmeta_t const &meta, | |||
get_graph_db ()->metadata.graph_duration.graph_end.time_since_epoch ()) | |||
.count (); | |||
// only the initial time matters for allocate | |||
const int64_t target_time = op == match_op_t::MATCH_ALLOCATE ? meta.at : graph_end - 1; | |||
const int64_t target_time = | |||
(op == match_op_t::MATCH_ALLOCATE || op == match_op_t::MATCH_GROW_ALLOCATION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this syntax say "if the statement is true (if it's allocate or grow) default to the meta.at, otherwise the graph end -1 (or is it the other way around)? I know this syntax exists in JavaScript and I always get it wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if that is the case, it's saying that if we grow or match allocate, we want the time to be the job metadata "at" which I assume is the time matched at? Just to check - for grow do we have a new at time for "at" set here in the job metadata (otherwise it would be the previous at associated with the jobid when the first allocation was done)?
@@ -118,7 +122,8 @@ int dfu_traverser_t::request_feasible (detail::jobmeta_t const &meta, | |||
} | |||
if (feasible_nodes < target_nodes) { | |||
// no chance, don't even try | |||
if (op == match_op_t::MATCH_ALLOCATE_ORELSE_RESERVE || op == match_op_t::MATCH_ALLOCATE) { | |||
if (op == match_op_t::MATCH_ALLOCATE_ORELSE_RESERVE || op == match_op_t::MATCH_ALLOCATE | |||
|| op == match_op_t::MATCH_GROW_ALLOCATION) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EEEEEEEEEEEEBUSY continues doing nodey things ⌨️
@@ -217,6 +222,7 @@ int dfu_traverser_t::schedule (Jobspec::Jobspec &jobspec, | |||
break; | |||
} | |||
case match_op_t::MATCH_ALLOCATE: | |||
case match_op_t::MATCH_GROW_ALLOCATION: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand this syntax - does this mean do nothing for match allocate, or that match allocate is the same as match grow allocate? I would guess the latter since one line was just added (so we want EBUSY for both cases).
In that case, this is saying if no resources are available, just return busy. I think this answers my question about reserve - we can't reserve with an update. It's a yes / no, and right now, if we don't have it ask again later!
} else { | ||
jobid = static_cast<int64_t> (std::strtoll (args[3].c_str (), NULL, 10)); | ||
if (!ctx->jobs.contains (jobid)) { | ||
std::cerr << "ERROR: nonexistent jobid " << jobid << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is (I think) a third check for existence - I am guessing this is done because the state might change, and better to be safe than sorry.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1340 +/- ##
========================================
- Coverage 75.7% 75.6% -0.1%
========================================
Files 112 112
Lines 16410 16450 +40
========================================
+ Hits 12426 12444 +18
- Misses 3984 4006 +22
|
This PR adds
grow_allocate
functionality, which attempts to allocate new resources specified by a jobspec to an existing jobid. The PR adds necessary functionality to the traverser, resource module, and REAPI, and adds testsuite tests for correct behavior.This PR is WIP pending client testing in FluxQueue, Fluence, and additional testsuite tests.