tutorials: Add a flux cancel tutorial #210
base: master
@@ -0,0 +1,114 @@
.. _flux-job-cancel:
.. _flux-job-cancelall:
Just a high level note - Dan and I were trying to remember how to cancel the other day and the intuitive thing (that follows other tools) would be `flux job cancel --all`. Maybe we could deprecate `cancelall` or (so the intuitive one is available) just provide both?
I didn't happen to write `cancelall`, although I suspect it was because `flux job cancel` takes jobids as input while `flux job cancelall` does not. So separating them out made more sense than having to deal with mixing up the logic of both together.
But it's a fair point, perhaps bring up the `--all` option for a discussion in `flux-core`.
I think those commands were modeled after Unix/Linux `kill` vs `killall`.
Considering we're still building interfaces, this command could be considered a "plumbing" command in the future, and so I wonder if a whole tutorial on `flux job cancel` is necessary.
Feels like a "canceling jobs" howto might be better?
I think for the time being, in that `flux job cancel` is a command (and one we badly needed yesterday!) it should be included here and we can move it / change our minds about it when the CLI is refactored.
With respect to that, I have some pretty strong opinions about cli design so I hope you invite me to the conversation, of course bringing a silver spike and garlic in case I get out of hand 🧛
$ flux mini submit sleep 100
ƒh35Dh5qRyq
Maybe between here we could add like:
If you've just submitted the job, the id will be readily available. If you need to see a list of your current jobs, you can use `flux jobs`.
We also just merged `flux job last` which might be even easier than `flux jobs` in this case
In this particular case I was mostly trying to show that the job was running, since `flux mini submit` already output the jobid. Then later `flux jobs` showed the job was canceled.
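For readers following the thread, the single-job flow described in this comment can be sketched roughly as below. This is only an illustration: `single_job_flow` is a hypothetical helper that prints the commands instead of running them, and the jobid is the sample value from the tutorial output, used as a placeholder.

```shell
#!/bin/sh
# Sketch only: prints the commands of the single-job flow in order.
# Nothing is submitted or cancelled, so no Flux instance is required.
# single_job_flow is a hypothetical helper, not part of Flux.
single_job_flow() {
    jobid=$1
    echo "flux jobs"               # before: check that the job is listed
    echo "flux job cancel $jobid"  # cancel the job by its explicit jobid
    echo "flux jobs"               # after: check the listing again
}

# Placeholder jobid copied from the tutorial's sample output.
single_job_flow "ƒh35Dh5qRyq"
```

The jobid argument is whatever `flux mini submit` printed; whether `flux job last` could supply it directly, as suggested above, is not shown in this thread.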
The special ``flux job cancelall`` command allows you to cancel many jobs. Several options allow you to
target a specific set of jobs.

To start off, let's create 100 jobs that will sleep infinitely. We will use the special ``--cc`` option
oh no, sleeping beauty jobs!
What does `--cc` mean?
i assume "carbon copy", which we can mention (and hopefully it's obvious what it does given what it's called)
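One way to picture the option, under the assumption (not confirmed in this thread) that `--cc=IDSET` submits one copy of the job per id in the set, so that the tutorial's `--cc=1-100` yields 100 submissions. The sketch below only prints the equivalent individual submissions; `expand_cc` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: emulates the assumed effect of --cc=1-N by printing one
# plain submission per id. It does not call flux and submits nothing.
# expand_cc is a hypothetical helper, not part of Flux.
expand_cc() {
    last=$1
    id=1
    while [ "$id" -le "$last" ]; do
        echo "flux mini submit sleep inf  # copy $id of $last"
        id=$((id + 1))
    done
}

# Show the expansion for a small range instead of all 100 copies.
expand_cc 3
```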
As you can see, we have a lot of jobs waiting to run (state ``S``) and a lot of running jobs (state ``R``).

Lets first ``flux job cancell`` without any options.
- Lets first ``flux job cancell`` without any options.
+ Lets first ``flux job cancel`` without any options.
Or this should be "cancelall" ?
.. code-block:: console

    $ flux job cancelall
    flux-job: Command matched 100 jobs (-f to confirm)
I would expect this to be `flux job cancel --all --force` for some future update - I could guess that easily!
As you can see, all the jobs are now canceled after passing the ``-f`` option to ``flux job cancelall``.

``flux job cancelall`` has several options to filter the jobs to cancel. Perhaps the most commonly used
option is ``-S``. The ``-S`` option specifies the state of a job to cancel. The most
Does this have an equivalent long form? It might be easier to read if it's like `--state` and then the user would remember that (and be able to guess the short form) vs. learning to use `-S` but not knowing what it means.
Force-pushed from 0aa0cef to 78cd7f0.
Cancelling a lot of jobs
------------------------

The special ``flux job cancelall`` command allows you to cancel many jobs. Several options allow you to
This might be misleading since `flux job cancel` can also cancel many jobs.
oh good point, multiple jobids vs filtering mechanism.
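The distinction drawn here (explicit jobids versus a filtering mechanism) can be sketched as follows. This is a minimal illustration, not the tutorial's text: `cancel_ids` and `cancel_by_state` are hypothetical helpers that only print the command line they stand for, the jobids are placeholders, and the `--states` and `-f` spellings are taken from the tutorial excerpts in this thread.

```shell
#!/bin/sh
# Sketch only: each function prints the command line it represents,
# so nothing is cancelled and no Flux instance is needed.
# Both helper names are hypothetical and not part of Flux.

# Explicit selection: flux job cancel is given one or more jobids.
cancel_ids() {
    echo "flux job cancel $*"
}

# Filtered selection: flux job cancelall picks jobs by state.
# $1 is a state name such as pending or running. Pass "force" as $2 to
# add -f, which the tutorial output says is needed to confirm.
cancel_by_state() {
    if [ "$2" = force ]; then
        echo "flux job cancelall --states=$1 -f"
    else
        echo "flux job cancelall --states=$1"
    fi
}

cancel_ids JOBID1 JOBID2 JOBID3   # placeholder jobids
cancel_by_state pending           # would only report the match count
cancel_by_state running force     # would actually cancel
```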
As you can see ``flux job cancelall --states=pending`` would target the 52 pending jobs for cancellation and
``flux job cancelall --states=running`` would target the current 48 running jobs for cancellation.

And that's it! If you have any questions, please
Since cancelall is mentioned, might as well mention `flux pkill`?
i thought about that, but `pkill` is really about signaling. So it's a bit different?
I didn't know Flux resorted to violence. 😱
No, `pkill` cancels jobs
well crud, i guess i was just thinking `pkill(1)`... yeah, i guess it should go here too
Force-pushed from 78cd7f0 to 2ce9597.
re-pushed with tweaks given the comments above, thanks!
.. code-block:: console

    $ flux mini submit --cc=1-100 sleep inf
    <snip - many job ids printed out>
Optional: could use --quiet to suppress jobids on standard output
yeah, I didn't use `--quiet` in the #195 fast job submission guide either. I just didn't want to have to explain another option I used :P
maybe it's not a net win??
Yeah, I only mentioned it since you had to snip the output. Wondering how that will work if output is ever autogenerated.
I didn't get a chance to review the other tutorial pr
Force-pushed from 2ce9597 to 8087105.
ok, re-pushed adding a
Force-pushed from 8087105 to caeb736.
Force-pushed from b1f4132 to 2fb96cf.
Force-pushed from 8b08ea5 to 3286fad.
re-pushed adding a note given work on flux-framework/flux-core#5055
No description provided.