5.8.0
DailyDreaming released this 04 Jan 23:01
Changelog
Highlighted Features Added
- Toil server now exposes workflow tasks via WES (#4046).
- Toil server now has a `--wes_dialect agc` option that will hide any tasks that don't have Amazon Batch job IDs, and put the IDs in the task names for those that do (#4047).
- Toil jobs now accept an `accelerators` requirement, like `accelerators=1` or `accelerators={'kind': 'gpu', 'brand': 'nvidia', 'count': 2}` (#4163)
- Include total requested cores for each job type in `toil stats` (#4173)
- Toil jobs now expose `job.accelerators` to workflows
- Add prefix and suffix params to `AbstractFileStore.getLocalTempFile` and `AbstractFileStore.getLocalTempFileName` (#4273)
- CWL: `--no-compute-checksum`, `--strict-cpu-limit`, `--disable-validate`, and `--fast-parser` are now available
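To make the two `accelerators` forms above concrete, here is a minimal sketch of how a bare count and a dict could be brought into one common shape. The helper name `normalize_accelerators` is hypothetical and this is not Toil's own parsing code; it only illustrates that `accelerators=1` and the dict form describe the same kind of requirement.

```python
from typing import Any, Dict, Union

AcceleratorSpec = Dict[str, Any]


def normalize_accelerators(req: Union[int, AcceleratorSpec]) -> AcceleratorSpec:
    """Hypothetical helper (not part of Toil): turn either form into a dict."""
    if isinstance(req, int):
        # A bare integer is read here as "this many accelerators, any kind".
        spec: AcceleratorSpec = {"count": req}
    elif isinstance(req, dict):
        # Copy so the caller's dict is not modified; default to one device.
        spec = dict(req)
        spec.setdefault("count", 1)
    else:
        raise TypeError(f"Unsupported accelerator requirement: {req!r}")
    if spec["count"] < 0:
        raise ValueError("Accelerator count cannot be negative")
    return spec


print(normalize_accelerators(1))
print(normalize_accelerators({"kind": "gpu", "brand": "nvidia", "count": 2}))
```

How Toil itself interprets a bare count (for example, which kind of accelerator it assumes) is not stated in these notes, so the sketch leaves those fields unset.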
Breaking Changes
- Toil's built-in autoscaler now guesses that some memory and disk space on nodes will not actually be available for jobs; pass `--assumeZeroOverhead` to revert to the old behavior (#2103)
CWL
- CWL job unit and display names have been changed to make more sense as task names, and management of them has been unified into a `CWLNamedJob`. (#4046/#4047)
- CWL `CUDARequirement` is parsed by `cwltool` and turned into a requirement for the minimum requested number of nvidia GPU accelerators (#3982)
- fix false warning when outputSource contains only one None value (#4300)
Kubernetes
- `KubernetesBatchSystem` can add `nvidia.com/gpu` and `amd.com/gpu` resource requests for jobs that request those accelerators (#4163)
- `KubernetesBatchSystem` can request GPUs by `model` key, if nodes are labeled appropriately (#4163)
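As a rough illustration of the mapping described above, the sketch below turns a list of accelerator requirements into Kubernetes-style resource entries keyed by `nvidia.com/gpu` and `amd.com/gpu`. The function name and the lookup table are hypothetical and are not taken from `KubernetesBatchSystem`; the accelerator dicts are assumed to use the `kind`, `brand`, and `count` keys shown in the feature list.

```python
from typing import Dict, List

# Assumed brand-to-resource mapping; only the two resource names come from the notes.
GPU_RESOURCE_KEYS = {
    "nvidia": "nvidia.com/gpu",
    "amd": "amd.com/gpu",
}


def gpu_resource_requests(accelerators: List[dict]) -> Dict[str, int]:
    """Hypothetical helper: sum requested GPUs per Kubernetes resource name."""
    requests: Dict[str, int] = {}
    for acc in accelerators:
        if acc.get("kind") != "gpu":
            continue  # Only GPU accelerators map to these resource names.
        key = GPU_RESOURCE_KEYS.get(acc.get("brand"))
        if key is None:
            raise ValueError(f"No known resource name for brand {acc.get('brand')!r}")
        requests[key] = requests.get(key, 0) + acc.get("count", 1)
    return requests


print(gpu_resource_requests([{"kind": "gpu", "brand": "nvidia", "count": 2}]))
```

Selecting GPUs by `model` depends on how the cluster's nodes are labeled, which these notes do not specify, so it is left out of the sketch.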
Dependencies
Misc
- Toil WES server now accepts requests that leave out workflow_params. (#4037)
- The `MessageBus` has been expanded to use `pypubsub`, and now has `MessageInbox` and `MessageOutbox` objects to represent connections to it. (#4046/#4047)
- `ToilMetrics` now rides on the `MessageBus` rails. (#4046/#4047)
- Toil workflows now have a `--writeMessages` option, which takes a file to which a line-oriented stream of `MessageBus` messages will be written. Reading this file will allow you to recover the current state of the workflow. (#4046/#4047)
- Add code for warning check to be used when launching cluster with AWS. (#3514)
- Use a CI prebake image for gitlab testing. (#4185)
- Toil clusters now have `/var/tmp` as the default temporary directory, since they often make large temporary files (#4148)
- Adds basic testing for slurm using a slurm docker cluster by running sample workflows. (#3856)
- Add message bus documentation (#4239)
- `SingleMachineBatchSystem` can schedule nvidia GPU accelerators, limiting the concurrent jobs to no more than there are accelerators to support, and setting `CUDA_VISIBLE_DEVICES` in the tasks' environments to tell them which nvidia GPU(s) to use. (#4163)
- `AWSBatchBatchSystem` can use AWS Batch's GPU resource to provide nvidia GPU accelerators (#4163)
- Toil jobs no longer need to re-run after their child/followOn/service jobs in order to delete themselves. (#3188)
- Message bus is now thread safe (#4276)
- Docker build has been updated with new Aventer Mesos deb URL (fixes #4290)
- `docker` binary in the container has been updated to that included in the Ubuntu repos (fixes #4282)
- Singularity in the appliance has been updated to 3.10 which is >=3.9, for cgroups v2 support.
- Base Ubuntu container image for the appliance has been updated to 22.04, which has a new enough libc for Debian's Singularity 3.10 debs.
- Safer type usage checking for systems without boto3 installed
- Tests are now more runnable post-installation. Temporary paths are not selected based upon the location of the tests themselves. (#4287)
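Since `SingleMachineBatchSystem` communicates the assigned GPUs through `CUDA_VISIBLE_DEVICES`, a task may want to inspect that variable. The following is a minimal sketch of doing so from Python; the function name `visible_cuda_devices` is hypothetical, and the only assumption about the variable is its usual comma-separated form.

```python
import os
from typing import List, Mapping, Optional


def visible_cuda_devices(environ: Optional[Mapping[str, str]] = None) -> Optional[List[str]]:
    """Hypothetical helper: list the device identifiers this task may use.

    Returns None when the variable is unset, meaning no restriction was
    communicated through the environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Identifiers are kept as strings, since they may be indices or UUIDs.
    return [dev.strip() for dev in raw.split(",") if dev.strip()]


print(visible_cuda_devices({"CUDA_VISIBLE_DEVICES": "0,2"}))
```

Passing a mapping explicitly, as in the last line, makes the helper easy to try without changing the real environment.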
Bug Fixes
- Only use `/var/run/user` if XDG tells us we have it in our session. Otherwise we will try other places, including `/run/lock/toil`. (#4170)
- `toil destroy-cluster`: terminate stopped instances when destroying the cluster (#4271)
- fileJobStore: handle arbitrary `os.link` errors to work on some filesystems (#2232)
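The `os.link` fix above concerns filesystems where hard links fail in unexpected ways. A common pattern for this situation, sketched below, is to attempt the link and fall back to a plain copy on error. This is an illustration of the general technique under that assumption, not the code from fileJobStore, and `link_or_copy` is a hypothetical name.

```python
import os
import shutil


def link_or_copy(src: str, dst: str) -> str:
    """Hypothetical helper: hard-link src to dst, copying if linking fails.

    Returns "linked" or "copied" so callers can tell which path was taken.
    """
    try:
        os.link(src, dst)
        return "linked"
    except FileExistsError:
        # An existing destination is a real conflict, not a filesystem quirk.
        raise
    except OSError:
        # Any other link failure (unsupported filesystem, cross-device link,
        # permission quirks) falls back to copying the file contents.
        shutil.copy2(src, dst)
        return "copied"
```

Catching the broad `OSError` rather than specific error codes is the design choice that matters here: different filesystems report link failures with different errors, so enumerating them tends to miss cases.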
Thank you to our contributors!