replace paths; GPFS > CephFS (#169)
sellth authored Aug 16, 2024
1 parent 2e2e56c commit 2c43287
Showing 12 changed files with 67 additions and 64 deletions.
2 changes: 1 addition & 1 deletion bih-cluster/docs/best-practice/env-modules.md
@@ -97,7 +97,7 @@ case "${HOSTNAME}" in

# Define path for temporary directories, don't forget to cleanup!
# Also, this will only work after /fast is available.
export TMPDIR=/fast/users/$USER/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp
;;
esac
```
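For orientation, the hunk above sits inside a `case "${HOSTNAME}" in … esac` block of `~/.bashrc`; a minimal sketch of the whole block looks roughly like this (the hostname pattern is an assumption, not part of the commit):

```bash
# Hedged sketch of the surrounding ~/.bashrc block; "hpc-*" is an assumed pattern.
case "${HOSTNAME}" in
    hpc-*)
        # Only valid once the cluster file system is mounted.
        export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp
        ;;
esac
```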
12 changes: 6 additions & 6 deletions bih-cluster/docs/best-practice/temp-files.md
@@ -16,17 +16,17 @@ When undefined, usually `/tmp` is used.

Generally, there are two locations where you could put temporary files:

- `/fast/users/$USER/scratch/tmp` -- inside your scratch folder on the fast GPFS file system; this location is available from all cluster nodes
- `/data/cephfs-1/home/users/$USER/scratch/tmp` -- inside your scratch folder on the CephFS file system; this location is available from all cluster nodes
- `/tmp` -- on the local node's temporary folder; this location is only available on the node itself.
The slurm scheduler uses Linux namespaces such that every **job** gets its private `/tmp` even when run on the same node.

### Best Practice: Use `/fast/users/$USER/scratch/tmp`
### Best Practice: Use `scratch/tmp`

!!! warning "Use GPFS-based TMPDIR"
!!! warning "Use CephFS-based TMPDIR"

Generally setup your environment to use `/fast/users/$USER/scratch/tmp` as filling the local disk of a node with forgotten files can cause a lot of problems.
Generally setup your environment to use `/data/cephfs-1/home/users/$USER/scratch/tmp` as filling the local disk of a node with forgotten files can cause a lot of problems.

Ideally, you append the following to your `~/.bashrc` to use `/fast/users/$USER/scratch/tmp` as the temporary directory.
Ideally, you append the following to your `~/.bashrc` to use `/data/cephfs-1/home/users/$USER/scratch/tmp` as the temporary directory.
This will also create the directory if it does not exist.
Further, it will create one directory per host name which prevents too many entries in the temporary directory.
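The snippet itself is collapsed in this view; a minimal sketch of such a `~/.bashrc` addition, assuming the per-host subdirectory is keyed on `$HOSTNAME` as described above:

```bash
# Hedged sketch of the ~/.bashrc addition described above (not the verbatim snippet).
export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp/$HOSTNAME
mkdir -p "$TMPDIR"   # create the directory if it does not exist
```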

@@ -40,7 +40,7 @@ mkdir -p $TMPDIR
## `TMPDIR` and the scheduler

In the older nodes, the local disk is a relatively slow spinning disk, in the newer nodes, the local disk is a relatively fast SSD.
Further, the local disk is independent from the GPFS file system, so I/O volume to it does not affect the network or any other job on other nodes.
Further, the local disk is independent from the CephFS file system, so I/O volume to it does not affect the network or any other job on other nodes.
Please note that by default, Slurm will not change your environment variables.
This includes the environment variable `TMPDIR`.
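Because the variable is simply inherited, a job will use whatever `TMPDIR` your login shell exported; if you want a particular job to use the node-local disk instead, override it inside the job script. A minimal sketch (the `#SBATCH` values and the tool name are placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=local-tmp-example
#SBATCH --time=01:00:00

# Override the inherited TMPDIR with the node-local, job-private /tmp.
export TMPDIR=/tmp
my_io_heavy_tool --tmp-dir "$TMPDIR"   # hypothetical command, for illustration only
```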

16 changes: 8 additions & 8 deletions bih-cluster/docs/help/faq.md
@@ -189,11 +189,11 @@ JobId=863089 JobName=pipeline_job.sh
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/fast/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/pipeline_job.sh
WorkDir=/fast/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export
StdErr=/fast/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out
Command=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/pipeline_job.sh
WorkDir=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export
StdErr=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out
StdIn=/dev/null
StdOut=/fast/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out
StdOut=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out
Power=
MailUser=(null) MailType=NONE
```
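For reference, a job record like the one above is printed by Slurm's `scontrol`, using the job ID shown in the output:

```bash
scontrol show job 863089
```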
@@ -290,11 +290,11 @@ JobId=4225062 JobName=C2371_2
MinCPUsNode=1 MinMemoryNode=150G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=YES Contiguous=0 Licenses=(null) Network=(null)
Command=/fast/work/users/user_c/SCZ_replic/JR_sims/GS_wrapy/wrap_y0_VP_2371_GS_chunk2_C02.sh
WorkDir=/fast/work/users/user_c/SCZ_replic/JR_sims
StdErr=/fast/work/users/user_c/SCZ_replic/JR_sims/E2371_2.txt
Command=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/GS_wrapy/wrap_y0_VP_2371_GS_chunk2_C02.sh
WorkDir=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims
StdErr=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/E2371_2.txt
StdIn=/dev/null
StdOut=/fast/work/users/user_c/SCZ_replic/JR_sims/slurm-4225062.out
StdOut=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/slurm-4225062.out
Power=
```

10 changes: 5 additions & 5 deletions bih-cluster/docs/how-to/software/cell-ranger.md
@@ -11,7 +11,7 @@ requires registration before download from [here](https://support.10xgenomics.co
to unpack Cell Ranger, its dependencies and the `cellranger` script:

```
cd /fast/users/$USER/work
cd /data/cephfs-1/home/users/$USER/work
mv /path/to/cellranger-3.0.2.tar.gz .
tar -xzvf cellranger-3.0.2.tar.gz
```
@@ -22,7 +22,7 @@ will be provided in `/data/cephfs-1/work/projects/cubit/current/static_data/app_

# cluster support SLURM

add a file `slurm.template` to `/fast/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template` with the following contents:
add a file `slurm.template` to `/data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template` with the following contents:

```
#!/usr/bin/env bash
@@ -61,7 +61,7 @@ add a file `slurm.template` to `/fast/users/$USER/work/cellranger-3.0.2/martian-
__MRO_CMD__
```
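Most of the template body is collapsed in this diff. As rough orientation, Martian cluster templates are shell scripts in which `__MRO_*__` placeholders are substituted per pipeline stage; an abbreviated, hypothetical sketch (the `#SBATCH` lines here are assumptions, not the file shipped with the docs):

```bash
#!/usr/bin/env bash
# Hypothetical, abbreviated slurm.template sketch -- not the full file from the docs.
# The __MRO_*__ placeholders are filled in by Cell Ranger's Martian runtime.
#SBATCH --job-name=__MRO_JOB_NAME__
#SBATCH --output=__MRO_STDOUT__
#SBATCH --error=__MRO_STDERR__
#SBATCH --cpus-per-task=__MRO_THREADS__
#SBATCH --mem=__MRO_MEM_GB__G

__MRO_CMD__
```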

**note**: on newer cellranger version, `slurm.template` needs to go to `/fast/users/$USER/work/cellranger-XX/external/martian/jobmanagers/`
**note**: on newer cellranger version, `slurm.template` needs to go to `/data/cephfs-1/home/users/$USER/work/cellranger-XX/external/martian/jobmanagers/`

# demultiplexing

@@ -74,7 +74,7 @@ create a script `run_cellranger.sh` with these contents (consult the [documentat
```
#!/bin/bash
/fast/users/$USER/work/cellranger-3.0.2/cellranger count \
/data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/cellranger count \
--id=sample_id \
--transcriptome=/data/cephfs-1/work/projects/cubit/current/static_data/app_support/cellranger/refdata-cellranger-${species}-3.0.0\
--fastqs=/path/to/fastqs \
@@ -93,7 +93,7 @@ sbatch --ntasks=1 --mem-per-cpu=4G --time=8:00:00 -p medium -o cellranger.log ru

# cluster support SGE (outdated)

add a file `sge.template` to `/fast/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template` with the following contents:
add a file `sge.template` to `/data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template` with the following contents:

```
# =============================================================================
6 changes: 3 additions & 3 deletions bih-cluster/docs/how-to/software/scientific-software.md
@@ -154,7 +154,7 @@ proc ModulesHelp { } {
module-whatis {Gromacs molecular simulation toolkit (non-MPI)}
set root /fast/users/YOURUSER/work/software/gromacs-mpi/2018.3
set root /data/cephfs-1/home/users/YOURUSER/work/software/gromacs-mpi/2018.3
prereq gcc/7.2.0-0
@@ -183,7 +183,7 @@ proc ModulesHelp { } {
module-whatis {Gromacs molecular simulation toolkit (MPI)}
set root /fast/users/YOURUSER/work/software/gromacs-mpi/2018.3
set root /data/cephfs-1/home/users/YOURUSER/work/software/gromacs-mpi/2018.3
prereq openmpi/4.0.3-0
prereq gcc/7.2.0-0
@@ -210,7 +210,7 @@ You can verify the result:
```bash
med0127:~$ module avail

------------------ /fast/users/YOURUSER/local/modules ------------------
------------------ /data/cephfs-1/home/users/YOURUSER/local/modules ------------------
gromacs/2018.3 gromacs-mpi/2018.3

-------------------- /usr/share/Modules/modulefiles --------------------
14 changes: 7 additions & 7 deletions bih-cluster/docs/hpc-tutorial/episode-1.md
@@ -28,20 +28,20 @@ You can find the data here:
## Creating a Project Directory

First, you should create a folder where the output of this tutorial will go.
It would be good to have it in your `work` directory in `/fast/users/$USER`, because it is faster and there is more space available.
It would be good to have it in your `work` directory in `/data/cephfs-1/home/users/$USER`, because it is faster and there is more space available.

```terminal
(first-steps) $ mkdir -p /fast/users/$USER/work/tutorial/episode1
(first-steps) $ pushd /fast/users/$USER/work/tutorial/episode1
(first-steps) $ mkdir -p /data/cephfs-1/home/users/$USER/work/tutorial/episode1
(first-steps) $ pushd /data/cephfs-1/home/users/$USER/work/tutorial/episode1
```

!!! important "Quotas / File System limits"

- Note well that you have a quota of 1 GB in your home directory at `/fast/users/$USER`.
- Note well that you have a quota of 1 GB in your home directory at `/data/cephfs-1/home/users/$USER`.
The reason for this is that nightly snapshots and backups are created for this directory which are precious resources.
- This limit does not apply to your work directory at `/fast/users/$USER/work`.
- This limit does not apply to your work directory at `/data/cephfs-1/home/users/$USER/work`.
The limits are much higher here but no snapshots or backups are available.
- There is no limit on your scratch directory at `/fast/users/$USER/scratch`.
- There is no limit on your scratch directory at `/data/cephfs-1/home/users/$USER/scratch`.
However, **files placed here are automatically removed after 2 weeks.**
This is only appropriate for files during download or temporary files.

@@ -51,7 +51,7 @@ In general it is advisable to have a proper temporary directory available.
You can create one in your `~/scratch` folder and make it available to the system.

```terminal
(first-steps) $ export TMPDIR=/fast/users/$USER/scratch/tmp
(first-steps) $ export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp
(first-steps) $ mkdir -p $TMPDIR
```

8 changes: 4 additions & 4 deletions bih-cluster/docs/hpc-tutorial/episode-2.md
@@ -62,7 +62,7 @@ The content of the file:
# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS
#SBATCH --time=30:00

export TMPDIR=/fast/users/${USER}/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp
mkdir -p ${TMPDIR}
```

@@ -72,13 +72,13 @@ Slurm will create a log file with a file name composed of the job name (`%x`) an
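In practice this naming corresponds to an `#SBATCH --output` directive in the wrapper script; a minimal sketch (the path and job name are illustrative; `%x` expands to the job name, `%j` to the job ID):

```bash
#!/bin/bash
#SBATCH --job-name=my-job        # becomes %x in the log-file name (name illustrative)
#SBATCH --output=logs/%x-%j.log  # %j is the Slurm job ID
```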
To start now with our tutorial, create a new tutorial directory with a log directory, e.g.,

```terminal
(first-steps) $ mkdir -p /fast/users/$USER/work/tutorial/episode2/logs
(first-steps) $ mkdir -p /data/cephfs-1/home/users/$USER/work/tutorial/episode2/logs
```

and copy the wrapper script to this directory:

```terminal
(first-steps) $ pushd /fast/users/$USER/work/tutorial/episode2
(first-steps) $ pushd /data/cephfs-1/home/users/$USER/work/tutorial/episode2
(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_job.sh .
(first-steps) $ chmod u+w submit_job.sh
```
@@ -116,7 +116,7 @@ Your file should look something like this:
# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS
#SBATCH --time=30:00

export TMPDIR=/fast/users/${USER}/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp
mkdir -p ${TMPDIR}

BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta
8 changes: 4 additions & 4 deletions bih-cluster/docs/hpc-tutorial/episode-3.md
Original file line number Diff line number Diff line change
@@ -30,8 +30,8 @@ Every Snakemake run requires a `Snakefile` file. Create a new folder inside your
copy the skeleton:

```terminal
(first-steps) $ mkdir -p /fast/users/${USER}/work/tutorial/episode3
(first-steps) $ pushd /fast/users/${USER}/work/tutorial/episode3
(first-steps) $ mkdir -p /data/cephfs-1/home/users/${USER}/work/tutorial/episode3
(first-steps) $ pushd /data/cephfs-1/home/users/${USER}/work/tutorial/episode3
(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/Snakefile .
(first-steps) $ chmod u+w Snakefile
```
@@ -53,7 +53,7 @@ rule alignment:
bai='alignment/test.bam.bai',
shell:
r"""
export TMPDIR=/fast/users/${{USER}}/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp
mkdir -p ${{TMPDIR}}
BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta
@@ -154,7 +154,7 @@ rule alignment:
bai='alignment/{id}.bam.bai',
shell:
r"""
export TMPDIR=/fast/users/${{USER}}/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp
mkdir -p ${{TMPDIR}}
BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta
8 changes: 4 additions & 4 deletions bih-cluster/docs/hpc-tutorial/episode-4.md
@@ -20,8 +20,8 @@ to call Snakemake. We run the script and the magic will start.
First, create a new folder for this episode:

```terminal
(first-steps) $ mkdir -p /fast/users/${USER}/work/tutorial/episode4/logs
(first-steps) $ pushd /fast/users/${USER}/work/tutorial/episode4
(first-steps) $ mkdir -p /data/cephfs-1/home/users/${USER}/work/tutorial/episode4/logs
(first-steps) $ pushd /data/cephfs-1/home/users/${USER}/work/tutorial/episode4
```

And copy the wrapper script to this folder as well as the Snakefile (you can also reuse the one with the adjustments from the previous [episode](episode-3.md)):
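A hedged sketch of that copy step (the wrapper script's exact name is cut off in this view; `submit_snakejob.sh` and the episode-3 path are assumptions):

```terminal
(first-steps) $ cp ../episode3/Snakefile .
(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_snakejob.sh .
(first-steps) $ chmod u+w Snakefile submit_snakejob.sh
```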
@@ -60,7 +60,7 @@ The `Snakefile` is already known to you but let me explain the wrapper script `s
#SBATCH --time=30:00


export TMPDIR=/fast/users/${USER}/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp
export LOGDIR=logs/${SLURM_JOB_NAME}-${SLURM_JOB_ID}
mkdir -p $LOGDIR

@@ -120,7 +120,7 @@ rule alignment:
time='12:00:00',
shell:
r"""
export TMPDIR=/fast/users/${{USER}}/scratch/tmp
export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp
mkdir -p ${{TMPDIR}}
BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta
5 changes: 5 additions & 0 deletions bih-cluster/docs/ondemand/quotas.md
@@ -1,5 +1,10 @@
# OnDemand: Quota Inspection

!!! info "Outdated"

This document is only valid for the old, third-generation file system and will be removed soon.
Quotas of our new CephFS storage are communicated via the [HPC Access](https://hpc-access.cubi.bihealth.org/) web portal.

Accessing the quota report by selecting `Files` and then `Quotas` in the top menu
will provide you with a detailed list of all quotas for directories that you are assigned to.

11 changes: 6 additions & 5 deletions bih-cluster/docs/overview/for-the-impatient.md
@@ -1,6 +1,6 @@
# Overview
## HPC 4 Research
**HPC 4 Research** is located in the BIH data center in Buch and connected via the BIH research network.
## BIH HPC 4 Research
**BIH HPC 4 Research** is located in the BIH data center in Buch and connected via the BIH research network.
Connections can be made from Charite, MDC, and BIH networks.
The cluster is open for users with either Charite or MDC accounts after [getting access through the gatekeeper proces](../admin/getting-access.md).
The system has been designed to be suitable for the processing of human genetics data from research contexts (and of course data without data privacy concerns such as public and mouse data).
Expand All @@ -13,9 +13,9 @@ The cluster consists of the following major components:
- 2 nodes for file transfers `hpc-transfer-1` and `hpc-transfer-2`,
- a scheduling system using Slurm,
- 228 general purpose compute nodes `hpc-cpu-{1..228}`
- a few high memory nodes `hpc-mem-{1..4}`,
- a few high memory nodes `hpc-mem-{1..5}`,
- 7 nodes with 4 Tesla V100 GPUs each (!) `hpc-gpu-{1..7}` and 1 node with 10x A40 GPUs (!) `hpc-gpu-8`,
- a high-performance, parallel GPFS file system with 2.1 PB, by DDN mounted at `/fast`,
- a legacy parallel GPFS file system with 2.1 PB, by DDN mounted at `/fast`,
- a next generation high-performance storage system based on Ceph/CephFS
- a tier 2 (slower) storage system based on Ceph/CephFS

@@ -67,7 +67,7 @@ This addresses a lot of suboptimal (yet not dangerous, of course) points we obse
Despite this, it is your responsibility to keep important files in the snapshot/backup protected home, ideally even in copy (e.g., a git repository) elsewhere.
Also, keeping safe copies of primary data files, your published results, and the steps in between reproducible is your responsibility.
- A place to store data indefinitely.
The fast GPFS storage is expensive and "sparse" in a way.
The fast CephFS Tier 1 storage is expensive and "rare".
CephFS Tier 2 is bigger in volume, but still not unlimited.
The general workflow is: (1) copy data to cluster, (2) process it, creating intermediate and final results, (3) copy data elsewhere and remove it from the cluster
- Generally suitable for primary software development.
The I/O system might get overloaded and saving scripts might take some time.