From d0b9a3dfab51efb6c67b8f70facca8a30bac3805 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Fri, 19 Jul 2024 16:15:03 -0500 Subject: [PATCH 1/8] Enable spras in CHTC with new executor and profile --- docker-wrappers/SPRAS/README.md | 67 ++++++++++++++++--- .../SPRAS/spras_profile/config.yaml | 11 +++ environment.yml | 5 +- pyproject.toml | 2 +- 4 files changed, 72 insertions(+), 13 deletions(-) create mode 100644 docker-wrappers/SPRAS/spras_profile/config.yaml diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index ca047558..592fb539 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -7,12 +7,12 @@ This image comes bundled with all of the necessary software packages to run SPRA To create the Docker image, make sure you are in this repository's root directory, and from your terminal run: -``` +```bash docker build -t /: -f docker-wrappers/SPRAS/Dockerfile . ``` For example, to build this image with the intent of pushing it to DockerHub as reedcompbio/spras:v0.1.0, you'd run: -``` +```bash docker build -t reedcompbio/spras:v0.1.0 -f docker-wrappers/SPRAS/Dockerfile . ``` @@ -21,7 +21,7 @@ is being installed with `pip`, it's also possible to specify that you want devel spras package that receives changes without re-installation, change the `pip` installation line to: -``` +```bash pip install -e .[dev] ``` @@ -48,16 +48,59 @@ DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t reedcompbio/spras:v0.1.0 -f The folder `docker-wrappers/SPRAS` also contains several files that can be used to test this container on HTCondor. To test the `spras` container in this environment, first login to an HTCondor Access Point (AP). Then, from the AP clone this repo: -``` +```bash git clone https://github.com/Reed-CompBio/spras.git ``` -When you're ready to run SPRAS as an HTCondor workflow, navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. 
Then run -`condor_submit spras.sub`, which will submit SPRAS to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using -the value of `EXAMPLE_CONFIG` as the SPRAS configuration file. Note that you can alter the configuration file to test various workflows, but you should leave -`unpack_singularity = true`, or it is likely the job will be unsuccessful. By default, the `example_config.yaml` runs everything except for `cytoscape`, which -appears to fail periodically in HTCondor. +There are currently two options for running SPRAS with HTCondor. The first is to submit all SPRAS jobs to a single remote Execution Point (EP). The second +is to use the the snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP. + +### Submitting All Jobs to a Single EP + +Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. Then run `condor_submit spras.sub`, which will submit SPRAS +to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS +configuration file. Note that you can alter the configuration file to test various workflows, but you should leave `unpack_singularity = true`, or it +is likely the job will be unsuccessful. By default, the `example_config.yaml` runs everything except for `cytoscape`, which appears to fail periodically +in HTCondor. +**Note**: The `spras.sub` submit file is an example of how this workflow could be submitted from a CHTC Access Point (AP) to the OSPool. To run in the local +CHTC pool, omit the `+WantGlideIn` and `requirements` lines + +### Submitting Parallel Jobs + +Parallelizing SPRAS workflows with HTCondor currently requires an experimental executor for HTCondor that has been forked from the upstream [HTCondor Snakemake executor](https://github.com/jhiemstrawisc/snakemake-executor-plugin-htcondor/tree/spras-feature-dev). 
+To get this executor, clone the forked repository using the following: +```bash +git clone -b spras-feature-dev https://github.com/jhiemstrawisc/snakemake-executor-plugin-htcondor.git +``` + +Then, from your activated `spras` conda environment (important), run: +```bash +pip install snakemake-executor-plugin-htcondor/ +``` + +Currently, this executor requires that all input to the workflow is scoped to the current working directory. Therefore, you'll need to copy the +Snakefile and your input directory (as specified by `example_config.yaml`) to this directory: +```bash +cp ../../Snakefile . && \ +cp -r ../../input . +``` + +It's also necessary for this workflow to create an Apptainer image from the published Docker image. See [Creating an Apptainer image for SPRAS](#creating-an-apptainer-image-for-spras) +for instructions. + +To start the workflow with HTCondor, run: +```bash +snakemake --profile spras_profile +``` + +Resource requirements can be adjusted as needed in `spras_profile/config.yaml`, and HTCondor logs for this workflow can be found in `.snakemake/htcondor`. +You can set a different log directory by adding `htcondor-jobdir: /path/to/dir` to the profile's configuration. + +**Note**: This workflow requires that the terminal session responsible for running snakemake stays active. Closing the terminal will suspend jobs, +but the workflow can use Snakemakes checkpointing to pick up any jobs where they left off. + +### Job Monitoring To monitor the state of the job, you can run `condor_q` for a snapshot of how the job is doing, or you can run `condor_watch_q` if you want realtime updates. Upon completion, the `output` directory from the workflow should be returned as `spras/docker-wrappers/SPRAS/output`, along with several files containing the workflow's logging information (anything that matches `logs/spras_*` and ending in `.out`, `.err`, or `.log`). 
If the job was unsuccessful, these files should @@ -67,9 +110,11 @@ contain useful debugging clues about what may have gone wrong. the version of SPRAS you want to test, and push the image to your image repository. To use that container in the workflow, change the `container_image` line of `spras.sub` to point to the new image. -**Note**: In some cases, especially if you're encountering an error like `/srv//spras.sh: line 10: snakemake: command not found`, it may be necessary to convert +## Creating an Apptainer image for SPRAS + +In some cases, especially if you're encountering an error like `/srv//spras.sh: line 10: snakemake: command not found`, it may be necessary to convert the SPRAS image to a `.sif` container image before running someplace like the OSPool. To do this, run: -``` +```bash apptainer build spras.sif docker://reedcompbio/spras:v0.1.0 ``` to produce the file `spras.sif`. Then, substitute this value as the `container_image` in the submit file. diff --git a/docker-wrappers/SPRAS/spras_profile/config.yaml b/docker-wrappers/SPRAS/spras_profile/config.yaml new file mode 100644 index 00000000..3d72043f --- /dev/null +++ b/docker-wrappers/SPRAS/spras_profile/config.yaml @@ -0,0 +1,11 @@ +jobs: 30 +executor: htcondor +configfile: example_config.yaml +shared-fs-usage: none +default-resources: + job_wrapper: "spras.sh" + # If running in CHTC, this only works with apptainer images + container_image: "spras.sif" + universe: "container" + request_disk: "16GB" + request_memory: "8GB" diff --git a/environment.yml b/environment.yml index bcbb69c0..47ae801d 100644 --- a/environment.yml +++ b/environment.yml @@ -3,7 +3,7 @@ channels: - conda-forge dependencies: - adjusttext=0.7.3.1 - - bioconda::snakemake-minimal=8.11.6 + - bioconda::snakemake-minimal=8.16.0 - docker-py=5.0 - matplotlib=3.6 - networkx=2.8 @@ -27,3 +27,6 @@ dependencies: - pip: - graphspace_python==1.3.1 - sphinx-rtd-theme==2.0.0 + # This installs the current directory as an editable module + # 
Needed if running spras from another location + - -e . diff --git a/pyproject.toml b/pyproject.toml index d19a5988..def29b28 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -20,7 +20,7 @@ requires-python = ">=3.11" dependencies = [ "adjusttext==0.7.3", # A bug was introduced in older versions of snakemake that prevent it from running. Update to fix - "snakemake==8.11.6", + "snakemake==8.16.0", "docker==5.0.3", # Switched from docker-py to docker because docker-py is not maintained in pypi. This appears to have no effect "matplotlib==3.6", "networkx==2.8", From 10538cdb33b2471e7d11bdd4829d2fc96f50b059 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Fri, 19 Jul 2024 16:53:56 -0500 Subject: [PATCH 2/8] Update README.md --- docker-wrappers/SPRAS/README.md | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index 592fb539..9b070c34 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -89,7 +89,7 @@ cp -r ../../input . It's also necessary for this workflow to create an Apptainer image from the published Docker image. See [Creating an Apptainer image for SPRAS](#creating-an-apptainer-image-for-spras) for instructions. -To start the workflow with HTCondor, run: +To start the workflow with HTCondor in the CHTC pool, run: ```bash snakemake --profile spras_profile ``` @@ -97,6 +97,13 @@ snakemake --profile spras_profile Resource requirements can be adjusted as needed in `spras_profile/config.yaml`, and HTCondor logs for this workflow can be found in `.snakemake/htcondor`. You can set a different log directory by adding `htcondor-jobdir: /path/to/dir` to the profile's configuration. 
+To run this same workflow in the OSPool, add the following to the profile's default-resources block: +``` + classad_WantGlideIn: true + requirements: | + '(HAS_SINGULARITY == True) && (Poolname =!= "CHTC")' +``` + **Note**: This workflow requires that the terminal session responsible for running snakemake stays active. Closing the terminal will suspend jobs, but the workflow can use Snakemakes checkpointing to pick up any jobs where they left off. From 59cdb840480501a8b601924ea7633cf57cf56c1f Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Fri, 19 Jul 2024 17:07:48 -0500 Subject: [PATCH 3/8] Remove extraneous whitespace --- docker-wrappers/SPRAS/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index 9b070c34..350ec95c 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -59,7 +59,7 @@ is to use the the snakemake HTCondor executor to parallelize the workflow by sub Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. Then run `condor_submit spras.sub`, which will submit SPRAS to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS -configuration file. Note that you can alter the configuration file to test various workflows, but you should leave `unpack_singularity = true`, or it +configuration file. Note that you can alter the configuration file to test various workflows, but you should leave `unpack_singularity = true`, or it is likely the job will be unsuccessful. By default, the `example_config.yaml` runs everything except for `cytoscape`, which appears to fail periodically in HTCondor. 
From c280d4e9e99bea17e66103e32adbb21776f77cb0 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Mon, 12 Aug 2024 09:52:14 -0500 Subject: [PATCH 4/8] Don't pip-install self on conda build --- environment.yml | 3 --- 1 file changed, 3 deletions(-) diff --git a/environment.yml b/environment.yml index 47ae801d..60ae3043 100644 --- a/environment.yml +++ b/environment.yml @@ -27,6 +27,3 @@ dependencies: - pip: - graphspace_python==1.3.1 - sphinx-rtd-theme==2.0.0 - # This installs the current directory as an editable module - # Needed if running spras from another location - - -e . From 4955b537f060e13a572a80c9576766a3ea7e7221 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Mon, 19 Aug 2024 09:49:32 -0500 Subject: [PATCH 5/8] Update comments/READMEs based on PR feedback --- .gitignore | 3 +++ README.md | 3 +++ docker-wrappers/SPRAS/README.md | 18 ++++++++++++------ .../SPRAS/spras_profile/config.yaml | 12 ++++++++++++ 4 files changed, 30 insertions(+), 6 deletions(-) diff --git a/.gitignore b/.gitignore index 51f1997f..6df8c5e9 100644 --- a/.gitignore +++ b/.gitignore @@ -141,3 +141,6 @@ TempMat.mat # OSX-specific stuff **/.DS_Store + +# SPRAS singularity container +spras.sif \ No newline at end of file diff --git a/README.md b/README.md index 8dc81a2b..2e6c6b54 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,9 @@ Output files will be written to the `output` directory. You do not need to manually download Docker images from DockerHub before running SPRAS. The workflow will automatically download any missing images as long as Docker is running. +### Running SPRAS with HTCondor +Large SPRAS workflows may benefit from execution with HTCondor, a scheduler/manager for distributed high-throughput computing workflows that allows many Snakemake steps to be run in parallel. For instructions on running SPRAS in this setting, see `docker-wrappers/SPRAS/README.md`. 
+ ## Components **Configuration file**: Specifies which pathway reconstruction algorithms to run, which hyperparameter combinations to use, and which datasets to run them on. diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index 350ec95c..43d0d490 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -53,7 +53,7 @@ git clone https://github.com/Reed-CompBio/spras.git ``` There are currently two options for running SPRAS with HTCondor. The first is to submit all SPRAS jobs to a single remote Execution Point (EP). The second -is to use the the snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP. +is to use the Snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP. ### Submitting All Jobs to a Single EP @@ -68,10 +68,13 @@ CHTC pool, omit the `+WantGlideIn` and `requirements` lines ### Submitting Parallel Jobs -Parallelizing SPRAS workflows with HTCondor currently requires an experimental executor for HTCondor that has been forked from the upstream [HTCondor Snakemake executor](https://github.com/jhiemstrawisc/snakemake-executor-plugin-htcondor/tree/spras-feature-dev). -To get this executor, clone the forked repository using the following: +Parallelizing SPRAS workflows with HTCondor requires two additional pieces of setup. First, it requires an activated SPRAS conda environment with a `pip install`-ed version of the SPRAS module (see the main `README.md` for detailed instructions on pip installation of SPRAS). + +Second, it requires an experimental executor for HTCondor that has been forked from the upstream [HTCondor Snakemake executor](https://github.com/htcondor/snakemake-executor-plugin-htcondor). 
+ +To get install this executor in the spras conda environment, clone the forked repository using the following: ```bash -git clone -b spras-feature-dev https://github.com/jhiemstrawisc/snakemake-executor-plugin-htcondor.git +git clone https://github.com/htcondor/snakemake-executor-plugin-htcondor.git ``` Then, from your activated `spras` conda environment (important), run: @@ -105,10 +108,13 @@ To run this same workflow in the OSPool, add the following to the profile's defa ``` **Note**: This workflow requires that the terminal session responsible for running snakemake stays active. Closing the terminal will suspend jobs, -but the workflow can use Snakemakes checkpointing to pick up any jobs where they left off. +but the workflow can use Snakemake's checkpointing to pick up any jobs where they left off. + +**Note**: If you encounter an error that says `No module named 'spras'`, make sure you've `pip install`-ed the SPRAS module into your conda environment. ### Job Monitoring -To monitor the state of the job, you can run `condor_q` for a snapshot of how the job is doing, or you can run `condor_watch_q` if you want realtime updates. +To monitor the state of the job, you can use a second terminal to run `condor_q` for a snapshot of how the workflow is doing, or you can run `condor_watch_q` for realtime updates. + Upon completion, the `output` directory from the workflow should be returned as `spras/docker-wrappers/SPRAS/output`, along with several files containing the workflow's logging information (anything that matches `logs/spras_*` and ending in `.out`, `.err`, or `.log`). If the job was unsuccessful, these files should contain useful debugging clues about what may have gone wrong. 
diff --git a/docker-wrappers/SPRAS/spras_profile/config.yaml b/docker-wrappers/SPRAS/spras_profile/config.yaml index 3d72043f..0cfb2bba 100644 --- a/docker-wrappers/SPRAS/spras_profile/config.yaml +++ b/docker-wrappers/SPRAS/spras_profile/config.yaml @@ -1,11 +1,23 @@ +# Default configuration for the SPRAS/HTCondor executor profile. Each of these values +# can also be passed via command line flags, e.g. `--jobs 30 --executor htcondor`. + +# 'jobs' specifies the maximum number of HTCondor jobs that can be in the queue at once. jobs: 30 executor: htcondor configfile: example_config.yaml +# Indicate to the plugin that jobs running on various EPs do not share a filesystem with +# each other, or with the AP. shared-fs-usage: none + +# Default resources will apply to all workflow steps. If a single workflow step fails due +# to insufficient resources, it can be re-run with modified values. Snakemake will handle +# picking up where it left off, and won't re-run steps that have already completed. default-resources: job_wrapper: "spras.sh" # If running in CHTC, this only works with apptainer images container_image: "spras.sif" universe: "container" + # The value for request_disk should be large enough to accommodate the runtime container + # image, any additional PRM container images, and your input data. request_disk: "16GB" request_memory: "8GB" From 2655d1ba71c3f2437f5c05dbf31edd5721c4ce2f Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Tue, 3 Sep 2024 14:48:25 -0500 Subject: [PATCH 6/8] Use Apptainer with HTCondor by default and point to SPRAS v0.2.0 Two issues solved with this commit -- first, Apptainer seems to be the way to go when working with HTCondor, both in and out of the OSPool. Instead of having instructions that say "if you encounter problem XXX, use Apptainer", this just tells the user to build the Apptainer image in the first place. 
Secondly, Neha encountered an issue while testing HTCondor compatibility where the container's Snakefile version was incompatible with the version of the Snakefile being transferred from the AP to the EP. This resulted in a confusing error message and wouldn't be straightforward to recognize for most people, so I decided I'd urge the users to build their own containers in the first place. If they do this, then they can be sure there are no compatibility issues. --- docker-wrappers/SPRAS/README.md | 63 ++++++++++++++++++--------------- docker-wrappers/SPRAS/spras.sub | 15 ++++---- 2 files changed, 42 insertions(+), 36 deletions(-) diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index 43d0d490..8105de4f 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -1,19 +1,19 @@ # SPRAS Docker image -## Building +## Building Images A Docker image for SPRAS that is available on [DockerHub](https://hub.docker.com/repository/docker/reedcompbio/spras) This image comes bundled with all of the necessary software packages to run SPRAS, and can be used for execution in distributed environments (like HTCondor). -To create the Docker image, make sure you are in this repository's root directory, and from your terminal run: +To create the Docker image locally, make sure you are in this repository's root directory, and from your terminal run: ```bash docker build -t /: -f docker-wrappers/SPRAS/Dockerfile . ``` -For example, to build this image with the intent of pushing it to DockerHub as reedcompbio/spras:v0.1.0, you'd run: +For example, to build this image with the intent of pushing it to DockerHub as reedcompbio/spras:v0.2.0, you'd run: ```bash -docker build -t reedcompbio/spras:v0.1.0 -f docker-wrappers/SPRAS/Dockerfile . +docker build -t reedcompbio/spras:v0.2.0 -f docker-wrappers/SPRAS/Dockerfile . ``` This will copy the entire SPRAS repository into the container and install SPRAS with `pip`. 
As such, any changes you've made to the current SPRAS repository will be reflected in version of SPRAS installed in the container. Since SPRAS @@ -38,27 +38,47 @@ Or to temporarily override your system's default during the build, prepend your DOCKER_DEFAULT_PLATFORM=linux/amd64 ``` -For example, to build reedcompbio/spras:v0.1.0 on Apple Silicon as a linux/amd64 container, you'd run: +For example, to build reedcompbio/spras:v0.2.0 on Apple Silicon as a linux/amd64 container, you'd run: ``` -DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t reedcompbio/spras:v0.1.0 -f docker-wrappers/SPRAS/Dockerfile . +DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t reedcompbio/spras:v0.2.0 -f docker-wrappers/SPRAS/Dockerfile . ``` -## Testing +### Converting Docker Images to Apptainer/Singularity Images -The folder `docker-wrappers/SPRAS` also contains several files that can be used to test this container on HTCondor. To test the `spras` container +It may be necessary in some cases to create an Apptainer image for SPRAS, especially if you intend to run your workflow using distributed systems like HTCondor. Apptainer (formerly known as Singularity) uses image files with `.sif` extensions. Assuming you have Apptainer installed, you can create your own sif image from an already-built Docker image with the following command: +```bash +apptainer build .sif docker:// +``` + +For example, creating an Apptainer image for the `v0.2.0` SPRAS image might look like: +```bash +apptainer build spras-v0.2.0.sif docker://reedcompbio/spras:v0.2.0 +``` + +After running this command, a new file called `spras-v0.2.0` will exist in the directory where the command was run. + +## Working with HTCondor + +The folder `docker-wrappers/SPRAS` also contains several files that can be used to run workflows with this container on HTCondor. To use the `spras` image in this environment, first login to an HTCondor Access Point (AP). 
Then, from the AP clone this repo: ```bash git clone https://github.com/Reed-CompBio/spras.git ``` +**Note:** To work with SPRAS in HTCondor, it is recommended that you build an Apptainer image instead of using Docker. See [Converting Docker Images to Apptainer/Singularity Images](#converting-docker-images-to-apptainersingularity-images) for instructions. Importantly, the Apptainer image must be built for the linux/amd64 architecture. Most HTCondor APs will have `apptainer` installed, but they may not have `docker`. If this is the case, you can build the image with Docker on your local machine, push the image to Docker Hub, and then convert it to Apptainer's `sif` format on the AP. + There are currently two options for running SPRAS with HTCondor. The first is to submit all SPRAS jobs to a single remote Execution Point (EP). The second is to use the Snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP. ### Submitting All Jobs to a Single EP -Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. Then run `condor_submit spras.sub`, which will submit SPRAS -to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS +Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory (`mkdir logs`). Next, modify `spras.sub` so that it uses the SPRAS apptainer image you created: +``` +container_image = < your spras image >.sif +``` + +Then run `condor_submit spras.sub`, which will submit SPRAS to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS configuration file. Note that you can alter the configuration file to test various workflows, but you should leave `unpack_singularity = true`, or it is likely the job will be unsuccessful. 
By default, the `example_config.yaml` runs everything except for `cytoscape`, which appears to fail periodically in HTCondor. @@ -68,18 +88,13 @@ CHTC pool, omit the `+WantGlideIn` and `requirements` lines ### Submitting Parallel Jobs -Parallelizing SPRAS workflows with HTCondor requires two additional pieces of setup. First, it requires an activated SPRAS conda environment with a `pip install`-ed version of the SPRAS module (see the main `README.md` for detailed instructions on pip installation of SPRAS). +Parallelizing SPRAS workflows with HTCondor requires the same setup as the previous section, but with two additions. First, it requires an activated SPRAS conda environment with a `pip install`-ed version of the SPRAS module (see the main `README.md` for detailed instructions on pip installation of SPRAS). Second, it requires an experimental executor for HTCondor that has been forked from the upstream [HTCondor Snakemake executor](https://github.com/htcondor/snakemake-executor-plugin-htcondor). -To get install this executor in the spras conda environment, clone the forked repository using the following: -```bash -git clone https://github.com/htcondor/snakemake-executor-plugin-htcondor.git -``` - -Then, from your activated `spras` conda environment (important), run: +After activating your `spras` conda environment and `pip`-installing SPRAS, you can install the HTCondor Snakemake executor with the following: ```bash -pip install snakemake-executor-plugin-htcondor/ +pip install git+https://github.com/htcondor/snakemake-executor-plugin-htcondor.git ``` Currently, this executor requires that all input to the workflow is scoped to the current working directory. Therefore, you'll need to copy the @@ -89,8 +104,7 @@ cp ../../Snakefile . && \ cp -r ../../input . ``` -It's also necessary for this workflow to create an Apptainer image from the published Docker image. See [Creating an Apptainer image for SPRAS](#creating-an-apptainer-image-for-spras) -for instructions. 
+**Note:** It is best practice to make sure that the Snakefile you copy for your workflow is the same version as the Snakefile baked into your workflow's container image. When this workflow runs, the Snakefile you just copied will be used during remote execution instead of the Snakefile from the container. As a result, difficult-to-diagnose versioning issues may occur if the version of SPRAS in the remote container doesn't support the Snakefile on your current branch. The safest bet is always to create your own image so you always know what's inside of it. To start the workflow with HTCondor in the CHTC pool, run: ```bash @@ -123,15 +137,6 @@ contain useful debugging clues about what may have gone wrong. the version of SPRAS you want to test, and push the image to your image repository. To use that container in the workflow, change the `container_image` line of `spras.sub` to point to the new image. -## Creating an Apptainer image for SPRAS - -In some cases, especially if you're encountering an error like `/srv//spras.sh: line 10: snakemake: command not found`, it may be necessary to convert -the SPRAS image to a `.sif` container image before running someplace like the OSPool. To do this, run: -```bash -apptainer build spras.sif docker://reedcompbio/spras:v0.1.0 -``` -to produce the file `spras.sif`. Then, substitute this value as the `container_image` in the submit file. - ## Versions: The versions of this image match the version of the spras package within it. diff --git a/docker-wrappers/SPRAS/spras.sub b/docker-wrappers/SPRAS/spras.sub index 49d064d5..01dee32c 100644 --- a/docker-wrappers/SPRAS/spras.sub +++ b/docker-wrappers/SPRAS/spras.sub @@ -13,15 +13,16 @@ SNAKEFILE = ../../Snakefile ############################################################ # Specify that the workflow should run in the SPRAS # -# container. 
In the OSPool, this image is usually # -# converted automatically to an Apptainer/Singularity # -# image, which is why the example config has # -# `unpack_singularity = true`. # +# container. You can either use a docker:// URL, or point # +# directly to an Apptainer image (recommended). Note that # +# if running in the OSPool, most docker images are first # +# automatically converted to Apptainer images, but it's # +# generally recommended that you build your own image # +# first # ############################################################ universe = container -container_image = docker://reedcompbio/spras:v0.2.0 -# container_image = spras.sif - +container_image = .sif +# container_image = docker://reedcompbio/spras:v0.2.0 ############################################################ # Specify names for log/stdout/stderr files generated by # From ac805d64ac31840e72d40e9b72fbd9974403ebb9 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Tue, 3 Sep 2024 14:56:26 -0500 Subject: [PATCH 7/8] Fix trailing whitespace --- docker-wrappers/SPRAS/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index 8105de4f..137599a6 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -66,7 +66,7 @@ in this environment, first login to an HTCondor Access Point (AP). Then, from th git clone https://github.com/Reed-CompBio/spras.git ``` -**Note:** To work with SPRAS in HTCondor, it is recommended that you build an Apptainer image instead of using Docker. See [Converting Docker Images to Apptainer/Singularity Images](#converting-docker-images-to-apptainersingularity-images) for instructions. Importantly, the Apptainer image must be built for the linux/amd64 architecture. Most HTCondor APs will have `apptainer` installed, but they may not have `docker`. 
If this is the case, you can build the image with Docker on your local machine, push the image to Docker Hub, and then convert it to Apptainer's `sif` format on the AP. +**Note:** To work with SPRAS in HTCondor, it is recommended that you build an Apptainer image instead of using Docker. See [Converting Docker Images to Apptainer/Singularity Images](#converting-docker-images-to-apptainersingularity-images) for instructions. Importantly, the Apptainer image must be built for the linux/amd64 architecture. Most HTCondor APs will have `apptainer` installed, but they may not have `docker`. If this is the case, you can build the image with Docker on your local machine, push the image to Docker Hub, and then convert it to Apptainer's `sif` format on the AP. There are currently two options for running SPRAS with HTCondor. The first is to submit all SPRAS jobs to a single remote Execution Point (EP). The second is to use the Snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP. 
From be8904dd75a5533398716e290b854aa90e507478 Mon Sep 17 00:00:00 2001 From: Justin Hiemstra Date: Wed, 4 Sep 2024 16:05:22 -0500 Subject: [PATCH 8/8] Add missing file extension in docs and add missing newline to gitignore --- .gitignore | 2 +- docker-wrappers/SPRAS/README.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.gitignore b/.gitignore index 6df8c5e9..c85b22cc 100644 --- a/.gitignore +++ b/.gitignore @@ -143,4 +143,4 @@ TempMat.mat **/.DS_Store # SPRAS singularity container -spras.sif \ No newline at end of file +spras.sif diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md index 137599a6..a163e2b8 100644 --- a/docker-wrappers/SPRAS/README.md +++ b/docker-wrappers/SPRAS/README.md @@ -55,7 +55,7 @@ For example, creating an Apptainer image for the `v0.2.0` SPRAS image might look apptainer build spras-v0.2.0.sif docker://reedcompbio/spras:v0.2.0 ``` -After running this command, a new file called `spras-v0.2.0` will exist in the directory where the command was run. +After running this command, a new file called `spras-v0.2.0.sif` will exist in the directory where the command was run. ## Working with HTCondor