tutorials: Add a flux cancel tutorial #210

Open: chu11 wants to merge 1 commit into master from flux_job_cancel

Member

@chu11 chu11 commented Feb 17, 2023

No description provided.

@@ -0,0 +1,114 @@
.. _flux-job-cancel:
.. _flux-job-cancelall:

Member

Just a high level note - Dan and I were trying to remember how to cancel the other day and the intuitive thing (that follows other tools) would be flux job cancel --all. Maybe we could deprecate cancelall or (so the intuitive one is available) just provide both?

Member Author

I didn't happen to write cancelall, although I suspect it was because flux job cancel takes jobids as input while flux job cancelall does not. So separating them out made more sense than having to deal with mixing up the logic of both together.

But it's a fair point, perhaps bring up the --all option for a discussion in flux-core.

Contributor

I think those commands were modeled after Unix/Linux kill vs killall.

Considering we're still building interfaces, this command could be considered a "plumbing" command in the future, and so I wonder if a whole tutorial on flux job cancel is necessary.
Feels like a "canceling jobs" howto might be better?

Member

I think for the time being, in that flux job cancel is a command (and one we badly needed yesterday!) it should be included here and we can move it / change our minds about it when the CLI is refactored.

With respect to that, I have some pretty strong opinions about cli design so I hope you invite me to the conversation, of course bringing a silver spike and garlic in case I get out of hand 🧛


$ flux mini submit sleep 100
ƒh35Dh5qRyq

Member

Maybe between here we could add like:

If you've just submitted the job, the id will be readily available. If you need to see a list of your current jobs, you can use `flux jobs`.

Contributor

We also just merged flux job last which might be even easier than flux jobs in this case

Member Author

@chu11 chu11 Feb 17, 2023

In this particular case I was mostly trying to show that the job was running, since flux mini submit already output the jobid. Then later flux jobs shows that the job was canceled.
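
For readers following along, the submit-then-cancel flow discussed in this thread can be sketched as a small shell function. This is an illustrative sketch and not part of the PR: the function name `submit_and_cancel` is hypothetical, and it assumes `flux mini submit` prints the new jobid on standard output, as the console snippet under review shows.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): submit one job, then cancel it by id.
submit_and_cancel() {
    # Assumption: `flux mini submit` prints the new jobid on stdout.
    id=$(flux mini submit sleep 100) || return 1
    # Cancel the job we just submitted, using the captured jobid.
    flux job cancel "$id"
}

# Usage, inside a running Flux instance:
#   submit_and_cancel
```

Capturing the jobid at submit time avoids a separate lookup with `flux jobs` or `flux job last`.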

The special ``flux job cancelall`` command allows you to cancel many jobs. Several options allow you to
target a specific set of jobs.

To start off, let's create 100 jobs that will sleep infinitely. We will use the special ``--cc`` option

Member

oh no, sleeping beauty jobs!

Member

What does --cc mean?

Member Author

i assume "carbon copy", which we can mention (and hopefully it's obvious what it does given what it's called)
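
As a rough illustration of what ``--cc`` does in the tutorial's example: as I understand it, the option takes an idset such as ``1-100`` and submits one copy of the job per id. The wrapper below is a hypothetical sketch (the name `submit_copies` is not from the PR) that builds that idset from a count.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): submit N copies of an endless sleep job.
submit_copies() {
    n=$1
    # Assumption: --cc takes an idset; "1-N" asks for one copy of the job per id.
    flux mini submit --cc="1-${n}" sleep inf
}

# Usage, inside a running Flux instance:
#   submit_copies 100
```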


As you can see, we have a lot of jobs waiting to run (state ``S``) and a lot of running jobs (state ``R``).

Lets first ``flux job cancell`` without any options.

Member

Suggested change
- Lets first ``flux job cancell`` without any options.
+ Lets first ``flux job cancel`` without any options.

Member

Or should this be "cancelall"?

.. code-block:: console

    $ flux job cancelall
    flux-job: Command matched 100 jobs (-f to confirm)

Member

I would expect this to be flux job cancel --all --force for some future update - I could guess that easily!

As you can see, all the jobs are now canceled after passing the ``-f`` option to ``flux job cancelall``.

``flux job cancelall`` has several options to filter the jobs to cancel. Perhaps the most commonly used
option is ``-S``. The ``-S`` option specifies the state of a job to cancel. The most

Member

Does this have an equivalent long form? It might be easier to read if it's like --state and then the user would remember that (and be able to guess the short form) vs. learning to use -S but not knowing what it means.

Cancelling a lot of jobs
------------------------

The special ``flux job cancelall`` command allows you to cancel many jobs. Several options allow you to

Contributor

This might be misleading since flux job cancel can also cancel many jobs.

Member Author

oh good point, multiple jobids vs filtering mechanism.

As you can see, ``flux job cancelall --states=pending`` would target the 52 pending jobs for cancellation and
``flux job cancelall --states=running`` would target the 48 currently running jobs for cancellation.
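
The check-then-confirm pattern described above (run without ``-f`` to see how many jobs match, then add ``-f`` to cancel them) can be sketched with a small helper that only prints the command lines. This is an illustration under stated assumptions: the function name `cancelall_cmd` is hypothetical, and the ``--states`` and ``-f`` options are taken from the tutorial text rather than verified here.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): build a `flux job cancelall`
# command line for one job state. It prints the command; it does not run it.
cancelall_cmd() {
    state=$1    # e.g. "pending" or "running"
    force=$2    # pass "-f" to confirm the cancel; leave empty to only report matches
    echo "flux job cancelall --states=${state}${force:+ ${force}}"
}

# Print both steps, so this is safe to try outside a Flux instance.
cancelall_cmd pending       # step 1: report how many pending jobs match
cancelall_cmd pending -f    # step 2: confirm and cancel them
```

Printing the command first mirrors the tool's own behavior of refusing to cancel until ``-f`` is given.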

And that's it! If you have any questions, please

Contributor

Since cancelall is mentioned, might as well mention flux pkill?

Member Author

i thought about that, but pkill is really about signaling. So it's a bit different?

Member

I didn't know Flux resorted to violence. 😱

Contributor

No, pkill cancels jobs

Member Author

@chu11 chu11 Feb 17, 2023

well crud, i guess i was just thinking pkill(1) ... yeah, i guess it should go here too

Member Author

chu11 commented Feb 17, 2023

re-pushed with tweaks given the comments above, thanks!

.. code-block:: console

    $ flux mini submit --cc=1-100 sleep inf
    <snip - many job ids printed out>

Contributor

Optional: could use --quiet to suppress jobids on standard output

Member Author

yeah, I didn't use --quiet in the #195 fast job submission guide either. I just didn't want to have to explain another option I used :P

maybe it's not a net win??

Contributor

Yeah, I only mentioned it since you had to snip the output. Wondering how that will work if output is ever autogenerated.

I didn't get a chance to review the other tutorial pr

Member Author

chu11 commented Feb 17, 2023

ok, re-pushed adding a flux pkill section.

@chu11 chu11 mentioned this pull request Feb 17, 2023
16 tasks
@chu11 chu11 force-pushed the flux_job_cancel branch 3 times, most recently from b1f4132 to 2fb96cf Compare March 27, 2023 14:00
@chu11 chu11 changed the title tutorials: Add a flux job cancel tutorial tutorials: Add a flux cancel tutorial Mar 30, 2023
@chu11 chu11 force-pushed the flux_job_cancel branch 7 times, most recently from 8b08ea5 to 3286fad Compare May 11, 2023 16:20
Member Author

chu11 commented May 11, 2023

re-pushed adding a note given work on flux-framework/flux-core#5055

@chu11 chu11 force-pushed the flux_job_cancel branch from 7b164dc to 7d510f9 Compare June 8, 2023 15:35
3 participants