-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shell: add extra info to doom exceptions #6453
Conversation
ce9d74d
to
7f51a84
Compare
We might be able to shorten it by just prefixing the existing exception note with Now that I'm thinking about it, I wonder if |
Oh that's a good idea, that just slipped my mind. |
7f51a84
to
c00ed88
Compare
re-pushed with tweaks per comments above
so now it looks like this (example from t2614-job-shell-doom)
|
Again, it would be shorter to just put the jobid as a prefix:
Since there's no mistaking that's the jobid. However, that's pretty minor and probably not worth quibbling about. Edit: moved jobid after job.exception in example output. |
9449739
to
69e5ceb
Compare
ok, re-pushed with the output message tweak. Also had to tweak two tests that had to be updated for the expected output. Edit: output is now
|
src/shell/doom.c
Outdated
static char buf[PATH_MAX+1] = {0}; | ||
json_t *s = json_array_get (doom->shell->info->jobspec->command, 0); | ||
const char *path = json_string_value (s); | ||
char *rv; | ||
snprintf (buf, PATH_MAX, "%s", path); | ||
rv = basename (buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be simpler to just do something like
const char *p = strrchr (s, "/");
const char *base = p ? p + 1 : s;
And avoid the copy? Maybe we should have our own implementation of basename()
that doesn't modify its argument (I think some places in the codebase may already assume GNU basename(3)
, which doesn't modify its arg, but since we support alpine and someday BSD, that's probably not safe).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that would be better. When I read the manpage for basename(3) I was like "huh, it needs to modify the buffer?" Lets go with your simpler approach.
Ah, bummer. The change I suggested breaks some flux-sched checks which are matching Sorry if that was a bad suggestion. |
those tests are easily fixable (as is one I missed in |
69e5ceb
to
31800a8
Compare
Up to you if you don't mind submitting a flux-sched PR |
31800a8
to
8be29d8
Compare
rebased and re-pushed, should pass flux-sched now that flux-framework/flux-sched#1315 merged. Although unsure if @trws wants to comment on output format. |
Nah, looks good to me, just helped to have the context. Thanks @chu11! |
setting WIP, I'm not sure what happened to my branch w/ my tweak from comments above. Ugh, maybe i lost it during pre-thanksgiving break madness. |
Problem: Exceptions from the doom plugin do not specify what job failed. It can be confusing to users which job failed amongst multiple in a batch job, or if the broker from a batch job itself failed. Add the job executable name to the doom exception outputs. Fixes flux-framework#6357
Problem: In a batch job, it may be difficult to discern which job had an exception. When flux job attach outputs an exception, also output the jobid of the job. Update tests in t2608-job-shell-log.t and t2304-sched-simple-alloc-check.t for change in expected output.
8be29d8
to
57f5124
Compare
removed WIP, fixed things per comments above |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6453 +/- ##
==========================================
+ Coverage 83.62% 83.63% +0.01%
==========================================
Files 523 523
Lines 87547 87552 +5
==========================================
+ Hits 73212 73226 +14
+ Misses 14335 14326 -9
|
Problem: Exceptions from the doom plugin do not specify what job failed. It can be confusing to users which job failed amongst multiple in a batch job, or if the broker from a batch job itself failed.
Add the job executable name and jobid to the doom exception outputs.
Fixes #6357
we could obviously debate the message format, I went back and forth and just ended up at this. Tweak suggestions welcome.