Improve docs #441

Merged
merged 21 commits into from
Nov 21, 2024
Changes from 8 commits
110 changes: 88 additions & 22 deletions README.md
@@ -296,7 +296,14 @@ You can delete the <writable_image_dir> now.

## 2. Configuration

Adjust `config/config.yaml` for your particular use case.
Adjust `config/config.yaml` for your particular use case. You can use pre-calculated features (e.g. [downloaded from our features database](https://github.com/KosinskiLab/AlphaPulldown?tab=readme-ov-file#installation)) by adding the paths to the feature directories to your `config/config.yaml`:

```yaml
feature_directory :
- "/path/to/directory/with/features/"
```
> [!NOTE]
> If your folders contain compressed features, you must set the `--compress-features` flag to True; otherwise AlphaPulldown will not recognize these features and will start the calculations from scratch!

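Before deciding whether the flag is needed, it can help to check what a feature directory actually contains. The following is a minimal sketch, not part of AlphaPulldown: the helper name `has_compressed_features` is hypothetical, and it assumes that compressed feature files end in `.pkl.xz`. Adjust the pattern if your files are named differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with AlphaPulldown): reports whether a
# feature directory holds compressed feature files.
# Assumption: compressed features end in ".pkl.xz".
has_compressed_features() {
    local dir="$1"
    # compgen -G succeeds only if the glob matches at least one path
    compgen -G "${dir%/}/*.pkl.xz" > /dev/null
}

# Placeholder path taken from the config example above
feature_dir="/path/to/directory/with/features/"
if has_compressed_features "$feature_dir"; then
    echo "compressed features found: enable compress-features"
else
    echo "no compressed features found in $feature_dir"
fi
```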
If you want to use CCP4 for analysis, open `config/config.yaml` in a text editor and change the path to the analysis container to:

@@ -320,6 +327,12 @@ example:2:1-50
example:1-50_example:1-50
example:1:1-50_example:1:1-50
```
One can also specify several amino acid ranges in one line to be modeled together:

```
example:1-50:70-100
example:2:1-50:70-100
```
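To make the syntax concrete, here is a minimal bash sketch that splits such a line into chains and each chain into a name, a copy number, and its residue ranges. It only illustrates the format shown above and is not AlphaPulldown's own parser. The function names `parse_chain_spec` and `parse_fold_line` are hypothetical, and the sketch assumes that names contain neither `:` nor `_`, that a bare integer field is the copy number, and that a missing copy number means 1.

```bash
#!/usr/bin/env bash
# Illustrative only: these helpers are hypothetical, not part of AlphaPulldown.
# Splits "name[:copies][:start-stop]..." into its parts.
parse_chain_spec() {
    local spec="$1"
    local -a fields
    IFS=':' read -r -a fields <<< "$spec"
    local name="${fields[0]}"
    local copies=1
    local ranges=()
    local field
    for field in "${fields[@]:1}"; do
        if [[ "$field" =~ ^[0-9]+$ ]]; then
            copies="$field"        # a bare integer is read as the copy number
        elif [[ "$field" =~ ^[0-9]+-[0-9]+$ ]]; then
            ranges+=("$field")     # start-stop residue range
        else
            echo "unrecognized field: $field" >&2
            return 1
        fi
    done
    local joined="all"             # no range given: whole sequence
    if (( ${#ranges[@]} > 0 )); then
        joined=$(IFS=','; echo "${ranges[*]}")
    fi
    echo "name=$name copies=$copies ranges=$joined"
}

# One line describes one fold; chains are separated by "_".
parse_fold_line() {
    local line="$1" chain
    local -a chains
    IFS='_' read -r -a chains <<< "$line"
    for chain in "${chains[@]}"; do
        parse_chain_spec "$chain" || return 1
    done
}

parse_fold_line "example:2:1-50:70-100"
parse_fold_line "example:1-50_example:1-50"
```

The first call reports one chain with two copies and both ranges; the second reports two chains, each restricted to residues 1-50.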

This format similarly extends for the folding of heteromers:

@@ -367,9 +380,78 @@ Slurm specific parameters that do not need to be modified by non-expert users.
**only_generate_features**
If set to True, stops after generating features and does not perform structure prediction and reporting.


## 3. Execution

After following the Installation and Configuration steps, you are now ready to run the snakemake pipeline. To do so, navigate into the cloned pipeline directory and run:
After following the Installation and Configuration steps, you are now ready to run the Snakemake pipeline. To do so, navigate into the cloned pipeline directory and run:

```bash
snakemake \
--use-singularity \
--singularity-args "-B /scratch:/scratch \
-B /g/kosinski:/g/kosinski \
--nv " \
--jobs 200 \
--restart-times 5 \
--profile slurm_noSidecar \
--rerun-incomplete \
--rerun-triggers mtime \
--latency-wait 30 \
-n
```
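The trailing `-n` is Snakemake's dry-run flag: it lists the jobs that would run without executing them, so the real run is the same command without `-n`. One way to avoid maintaining two copies of the long option list is sketched below. The function `run_pipeline` and the `SNAKEMAKE_BIN` variable are hypothetical names introduced for this illustration, and the bind paths are the site-specific ones from the command above.

```bash
#!/usr/bin/env bash
# Sketch: keep the shared options in one array and toggle the dry run.
# SNAKEMAKE_BIN is a hypothetical override, e.g. for testing with `echo`.
SNAKEMAKE_BIN="${SNAKEMAKE_BIN:-snakemake}"

common_args=(
    --use-singularity
    --singularity-args "-B /scratch:/scratch -B /g/kosinski:/g/kosinski --nv"
    --jobs 200
    --restart-times 5
    --profile slurm_noSidecar
    --rerun-incomplete
    --rerun-triggers mtime
    --latency-wait 30
)

# run_pipeline dry   -> preview only (appends -n)
# run_pipeline real  -> actually submit the jobs
run_pipeline() {
    local mode="${1:-dry}"
    if [[ "$mode" == "dry" ]]; then
        "$SNAKEMAKE_BIN" "${common_args[@]}" -n
    else
        "$SNAKEMAKE_BIN" "${common_args[@]}"
    fi
}
```

After sourcing this file, `run_pipeline dry` previews the jobs and `run_pipeline real` launches them. Because the options live in an array, no line ends in a dangling `\`.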

> [!Warning]
> Running Snakemake in the foreground on a remote server can cause the process to terminate if the session is disconnected. To avoid this, you can run Snakemake in the background and redirect the output to log files. Here are two approaches depending on your environment:

- **For SLURM clusters:** Use `srun` to submit the job in the background:

```bash
srun --job-name=snakemake_job \
snakemake \
--use-singularity \
--singularity-args "-B /scratch:/scratch \
-B /g/kosinski:/g/kosinski \
--nv " \
--jobs 200 \
--restart-times 5 \
--profile slurm_noSidecar \
--rerun-incomplete \
--rerun-triggers mtime \
--latency-wait 30 \
&> log.txt &
```

- **For non-SLURM systems:** You can use `screen` to run the process in a persistent session:

1. Start a `screen` session:
```bash
screen -S snakemake_session
```
2. Run Snakemake as usual:
```bash
snakemake \
--use-singularity \
--singularity-args "-B /scratch:/scratch \
-B /g/kosinski:/g/kosinski \
--nv " \
--jobs 200 \
--restart-times 5 \
--profile slurm_noSidecar \
--rerun-incomplete \
--rerun-triggers mtime \
--latency-wait 30 \
&> log.txt &
```

Collaborator: I think the command cannot end with `\`. Is the line with `-n` missing?

Collaborator Author: Likewise

Collaborator: You have:

    --profile slurm_noSidecar \
    --rerun-incomplete \
    --rerun-triggers mtime \
    --latency-wait 30 \

but it should be either

    --profile slurm_noSidecar \
    --rerun-incomplete \
    --rerun-triggers mtime \
    --latency-wait 30

or

    --profile slurm_noSidecar \
    --rerun-incomplete \
    --rerun-triggers mtime \
    --latency-wait 30 \
    -n

3. Detach from the `screen` session by pressing `Ctrl + A` then `D`. You can later reattach with:
```bash
screen -r snakemake_session
```

By following these methods, you ensure that Snakemake continues running even if the remote session disconnects.


```bash
snakemake \
@@ -426,13 +508,10 @@ AlphaPulldown can be used as a set of scripts for every particular step.

### 0.1. Create Anaconda environment

**Firstly**, install [Anaconda](https://www.anaconda.com/) and create an AlphaPulldown environment, gathering necessary dependencies. We recommend to use mamba to speed up solving of dependencies:
**Firstly**, install [Anaconda](https://www.anaconda.com/) and create an AlphaPulldown environment, gathering necessary dependencies. To speed up dependency resolution, we recommend using Mamba.

```bash
conda create -n AlphaPulldown -c omnia -c bioconda -c conda-forge python==3.11 openmm==8.0 pdbfixer==1.9 kalign2 hhsuite hmmer modelcif
```

```bash
source activate AlphaPulldown
```
This usually works, but on some compute systems, users may prefer to use other versions or optimized builds of HMMER and HH-suite that are already installed.
@@ -710,7 +789,7 @@ Create the `create_individual_features_SLURM.sh` script and place the following
#SBATCH -o logs/create_individual_features_%A_%a_out.txt

#qos sets priority
#SBATCH --qos=low
#SBATCH --qos=normal

#Limit the run to a single node
#SBATCH -N 1
@@ -719,11 +798,7 @@ Create the `create_individual_features_SLURM.sh` script and place the following
#SBATCH --ntasks=8
#SBATCH --mem=64000

module load HMMER/3.4-gompi-2023a
module load HH-suite/3.3.0-gompi-2023a
eval "$(conda shell.bash hook)"
module load CUDA/11.8.0
module load cuDNN/8.7.0.84-CUDA-11.8.0
conda activate AlphaPulldown

# CUSTOMIZE THE FOLLOWING SCRIPT PARAMETERS FOR YOUR SPECIFIC TASK:
@@ -740,13 +815,7 @@ create_individual_features.py \
#####
```

Make the script executable by running:

```bash
chmod +x create_individual_features_SLURM.sh
```

Next, execute the following commands, replacing `<sequences.fasta>` with the path to your input FASTA file:
Execute the following commands, replacing `<sequences.fasta>` with the path to your input FASTA file:

```bash
mkdir logs
@@ -1157,10 +1226,7 @@ Create the `run_multimer_jobs_SLURM.sh` script and place the following code in i
#Adjust this depending on the node
#SBATCH --ntasks=8
#SBATCH --mem=64000

module load Anaconda3
module load CUDA/11.8.0
module load cuDNN/8.7.0.84-CUDA-11.8.0
eval "$(conda shell.bash hook)"
source activate AlphaPulldown

MAXRAM=$(echo `ulimit -m` '/ 1024.0'|bc)