tutorials: Add a flux job cancel tutorial
Al Chu11 committed Feb 17, 2023
1 parent 7412532 commit 2ce9597
Showing 4 changed files with 124 additions and 0 deletions.
Binary file modified auto_examples/auto_examples_jupyter.zip
Binary file not shown.
Binary file modified auto_examples/auto_examples_python.zip
Binary file not shown.
122 changes: 122 additions & 0 deletions tutorials/commands/flux-job-cancel.rst
@@ -0,0 +1,122 @@
.. _flux-job-cancel:
.. _flux-job-cancelall:

========================
How to Cancel a Flux Job
========================

Inevitably, some submitted jobs will have to be canceled for one reason or another. This tutorial
shows you how.

----------------------------
How to Cancel a Job by Jobid
----------------------------

The basic way to cancel a job is through ``flux job cancel``. All you have to do is specify
the jobid on the command line. Here is a simple example after submitting a job.

.. code-block:: console

    $ flux mini submit sleep 100
    ƒh35Dh5qRyq
    $ flux jobs ƒh35Dh5qRyq
          JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
    ƒh35Dh5qRyq achu     sleep       R      1      1   13.33s corona174
    $ flux job cancel ƒh35Dh5qRyq
    <snip wait a little bit>
    $ flux jobs ƒh35Dh5qRyq
          JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
    ƒh35Dh5qRyq achu     sleep      CA      1      1   20.18s corona174

In the above example we submitted a simple job via ``flux mini submit`` that simply
runs ``sleep``. Passing the resulting jobid to ``flux jobs`` shows that it is
running (state is ``R``).

We cancel the job simply by passing the jobid to ``flux job cancel``. After waiting
a little bit, we see that the job is now canceled in ``flux jobs`` (state is ``CA``).

While we only passed one jobid to ``flux job cancel`` in this example, multiple jobids can be
passed on the command line to cancel many jobs at once.

Note that in this particular example we happened to know the jobid of our job. If you do
not know the jobid of your job, you can always use ``flux jobs`` to see a list of all
your currently active jobs.
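
If you want to drive the same command from a script, the following is a minimal sketch that wraps the ``flux job cancel`` invocation shown above with Python's standard ``subprocess`` module. The helper names ``build_cancel_command`` and ``cancel_jobs`` are hypothetical and not part of Flux; the sketch only assumes that ``flux job cancel`` accepts one or more jobids, as described above.

```python
import shutil
import subprocess


def build_cancel_command(jobids):
    """Return the argv list for `flux job cancel` with one or more jobids.

    Hypothetical helper for illustration; it only assembles the command
    line described in this tutorial.
    """
    if isinstance(jobids, str):
        # Accept a single jobid passed as a plain string.
        jobids = [jobids]
    jobids = list(jobids)
    if not jobids:
        raise ValueError("at least one jobid is required")
    return ["flux", "job", "cancel", *jobids]


def cancel_jobs(jobids):
    """Run `flux job cancel` and return its exit code.

    Returns None without running anything when `flux` is not on PATH.
    """
    command = build_cancel_command(jobids)
    if shutil.which("flux") is None:
        return None
    return subprocess.run(command, check=False).returncode


# Usage (inside a Flux instance), with the jobid printed by `flux mini submit`:
#     cancel_jobs(["<jobid>"])
```

Keeping the command construction separate from the execution makes it easy to log or inspect exactly what would be run before any job is canceled.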

-----------------------
Canceling All Your Jobs
-----------------------

The ``flux job cancelall`` command allows you to cancel jobs without specifying jobids.
By default it cancels all of your active jobs, but several options allow you to target a subset of the jobs.

To start off, let's create 100 jobs that will sleep indefinitely. We will use the special ``--cc`` (carbon copy)
option to ``flux mini submit``, which submits 100 duplicate copies of the ``sleep`` job.

.. code-block:: console

    $ flux mini submit --cc=1-100 sleep inf
    <snip - many job ids printed out>
    $ flux jobs
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
    ƒjTWS5m3 achu     sleep       S      1      -        -
    ƒjTWS5m4 achu     sleep       S      1      -        -
    ƒjTWS5m5 achu     sleep       S      1      -        -
    ƒjTWS5m6 achu     sleep       S      1      -        -
    <snip - there are many jobs waiting to be run>
    ƒjTWS5m2 achu     sleep       R      1      1   8.858s corona212
    ƒjTWS5m1 achu     sleep       R      1      1   8.860s corona212
    ƒjTUx6Um achu     sleep       R      1      1   8.870s corona212
    ƒjTUx6Uk achu     sleep       R      1      1   8.870s corona212
    ƒjTUx6Uj achu     sleep       R      1      1   8.870s corona212
    ƒjTUx6Ui achu     sleep       R      1      1   8.871s corona212
    <snip - there are many jobs running>

As you can see, we have a lot of jobs waiting to run (state ``S``) and a lot of running jobs (state ``R``).

Let's first run ``flux job cancelall`` without any options.

.. code-block:: console

    $ flux job cancelall
    flux-job: Command matched 100 jobs (-f to confirm)

As you can see, ``flux job cancelall`` found all 100 jobs to cancel, but it hasn't canceled them yet. In order to go through
with the cancellation you must specify the ``-f`` (or ``--force``) option.

.. code-block:: console

    $ flux job cancelall -f
    flux-job: Canceled 100 jobs (0 errors)
    $ flux jobs
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO

As you can see, all the jobs are now canceled after passing the ``-f`` option to ``flux job cancelall``. ``flux jobs``
confirms there are no longer any of our jobs running or waiting to run.
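
The preview-then-confirm pattern above can also be checked from a script. The sketch below parses the two status messages shown in the transcripts so that a script can inspect the match count before deciding to rerun with ``-f``. The function names are hypothetical, and the regular expressions assume the message wording is exactly as printed above; it may differ between Flux versions.

```python
import re

# Patterns based on the messages shown in this tutorial (an assumption:
# other Flux versions may word these messages differently).
MATCHED_RE = re.compile(r"Command matched (\d+) jobs?")
CANCELED_RE = re.compile(r"Canceled (\d+) jobs? \((\d+) errors?\)")


def parse_matched(output):
    """Return the number of jobs a dry run matched, or None if not found."""
    match = MATCHED_RE.search(output)
    return int(match.group(1)) if match else None


def parse_canceled(output):
    """Return (canceled, errors) from a forced run, or None if not found."""
    match = CANCELED_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A script could, for example, refuse to pass ``-f`` when ``parse_matched`` reports more jobs than expected.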

``flux job cancelall`` has several options to filter the jobs to cancel. Perhaps the most commonly used
is the ``-S`` (or ``--states``) option, which specifies the state(s) of the jobs to cancel. The most
common states to target are ``pending`` and ``running``. Let's resubmit our 100 jobs and see the result
of trying to cancel ``pending`` vs ``running`` jobs.

.. code-block:: console

    $ flux mini submit --cc=1-100 sleep inf
    <snip - many job ids printed out>
    $ flux job cancelall --states=pending
    flux-job: Command matched 52 jobs (-f to confirm)
    $ flux job cancelall --states=running
    flux-job: Command matched 48 jobs (-f to confirm)

As you can see, ``flux job cancelall --states=pending`` would target the 52 pending jobs for cancellation and
``flux job cancelall --states=running`` would target the 48 currently running jobs for cancellation. Because ``-f``
was not given, neither command has canceled anything yet.
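
To make the combination of options concrete, here is a small sketch that assembles a ``flux job cancelall`` command line from an optional state filter and a force flag. ``build_cancelall_command`` is a hypothetical helper, not part of Flux; it uses only the ``--states`` and ``-f`` options described in this tutorial and takes a single state name to stay within what is shown above.

```python
def build_cancelall_command(state=None, force=False):
    """Return the argv list for `flux job cancelall`.

    state: optional state name such as "pending" or "running".
    force: when True, append -f so the cancellation is actually performed;
           when False, the command only reports how many jobs match.
    """
    command = ["flux", "job", "cancelall"]
    if state:
        command.append(f"--states={state}")
    if force:
        command.append("-f")
    return command
```

Defaulting ``force`` to ``False`` mirrors the behavior of the command itself: nothing is canceled unless the caller explicitly asks for it.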

And that's it! If you have any questions, please
`let us know <https://github.com/flux-framework/flux-docs/issues>`_.
2 changes: 2 additions & 0 deletions tutorials/commands/index.rst
@@ -7,6 +7,7 @@ Welcome to the Command Tutorials! These tutorials should help you to map specifi
with your use case, and then see detailed usage.

- ``flux mini submit/flux mini run`` (:ref:`flux-mini-submit`): "Submit a job in a Flux instance"
- ``flux job cancel/flux job cancelall`` (:ref:`flux-job-cancel`): "Cancel a job you submitted"
- ``flux proxy`` (:ref:`ssh-across-clusters`): "Send commands to a Flux instance across clusters using ssh"

This section is currently 🚧️ under construction 🚧️, so please come back later to see more command tutorials!
@@ -17,4 +18,5 @@ This section is currently 🚧️ under construction 🚧️, so please come bac
:caption: Command Tutorials

flux-mini-submit
flux-job-cancel
ssh-across-clusters
