From fd4b96bb14841b2a457390a97ee441bc226acc4b Mon Sep 17 00:00:00 2001
From: Jim Garlick
Date: Wed, 29 Mar 2023 12:05:07 -0700
Subject: [PATCH] flux-job(1): update WAIT section

Problem: the man page entry for flux job wait does not
adequately describe the design of waitable jobs.

Rework this section to emphasize the underlying design.

Fixes #5038
---
 doc/man1/flux-job.rst | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/doc/man1/flux-job.rst b/doc/man1/flux-job.rst
index 9f8523f44752..df51c3710c8d 100644
--- a/doc/man1/flux-job.rst
+++ b/doc/man1/flux-job.rst
@@ -103,24 +103,35 @@ Wait for job(s) to complete and exit with the largest exit code.
 WAIT
 ====
 
-A waitable job may be waited on with ``flux job wait``. A specific job
-can be waited on by specifying a jobid. If no jobid is specified, the
-command will wait for any waitable user job to complete, outputting that
-jobid before exiting. This command will exit with error if the job is not
-successful.
-
-Compared to ``flux job status``, there are several advantages /
-disadvantages of using ``flux job wait``. For a large number of jobs,
-``flux job wait`` is far more efficient, especially when used with the
-``--all`` option below. In addition, job ids do not have to be specified
-to ``flux job wait``.
-
-The two major limitations are that jobs must be submitted with the
-waitable flag, which can only be done in user instances. In addition,
-``flux job wait`` can only be called once per job.
+``flux job wait`` behaves like the UNIX :linux:man2:`wait` system call,
+for jobs submitted with the ``waitable`` flag. Compared to other methods
+of synchronizing on job completion and obtaining results, it is very
+lightweight.
+
+The result of a waitable job may only be consumed once. This is a design
+feature that makes it possible to call ``flux job wait`` in a loop until all
+results are consumed.
+
+.. note::
+   Only the instance owner is permitted to submit jobs with the ``waitable``
+   flag.
+
+When run with a jobid argument, ``flux job wait`` blocks until the specified
+job completes. If the job was successful, it silently exits with a code of
+zero. If the job has failed, an error is printed on stderr, and it exits with
+a code of one. It is an error if the job was not submitted with the
+``waitable`` flag.
+
+When run without arguments, ``flux job wait`` blocks until the next waitable
+job completes and behaves as above except that the jobid is printed to stdout.
+If there are no waitable jobs, it exits with a code of 1.
+
+``flux job wait --all`` loops through waitable jobs as they complete, printing
+their jobids. If all jobs are successful, it exits with a code of zero. If
+any jobs have failed, it exits with a code of one.
 
 **-a, --all**
-   Wait for all waitable jobs. Will exit with error if any jobs are
+   Wait for all waitable jobs and exit with error if any jobs are
    not successful.
 
 **-v, --verbose**