diff --git a/admin/getting-access/index.html b/admin/getting-access/index.html
index 9b6286e0..b552719c 100644
--- a/admin/getting-access/index.html
+++ b/admin/getting-access/index.html
@@ -3395,7 +3395,7 @@
~/.bashrc
Guide
- September 27, 2024
+ October 7, 2024
diff --git a/best-practice/env-modules/index.html b/best-practice/env-modules/index.html
index 8e989a8d..ad99b3cf 100644
--- a/best-practice/env-modules/index.html
+++ b/best-practice/env-modules/index.html
@@ -3374,7 +3374,7 @@ Please see the section Connection Problems.
-The most probable cause for this is a conda installation which defaults to loading the (Base) environment on login.
-To disable this behaviour you can run:
-$ conda config --set auto_activate_base false
+The most probable cause for this is a conda installation whose initialisation code runs on every login.
+To disable this behaviour, we can put the conda initialisation code behind a bash function and run it manually later:
+In your ~/.bashrc, find the conda block:
+# >>> conda initialize >>>
+# !! Contents within this block are managed by 'conda init' !!
+...
+# <<< conda initialize <<<
+Encapsulate the entire section in a bash function like this:
+conda_init() {
+ # >>> conda initialize >>>
+ # !! Contents within this block are managed by 'conda init' !!
+ ...
+ # <<< conda initialize <<<
+}
+
+From now on, to use conda you must first run conda_init, which then loads the necessary files.
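+For example, in a new login shell (the environment name my-env is only a placeholder here):
+$ conda_init
+$ conda activate my-env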
You can also run the bash shell in verbose mode to find out exactly which command is slowing down login:
$ ssh user@hpc-login-1.cubi.bihealth.org bash -iv
@@ -4285,10 +4298,10 @@ How
The reverse does not work.
In other words, you have to log into the MAX cluster and then initiate your file copies to or from the BIH HPC from there.
E.g., use rsync -avP some/path user_m@hpc-transfer-1.cubi.bihealth.org:/another/path
to copy files from the MAX cluster to BIH HPC and rsync -avP user_m@hpc-transfer-1.cubi.bihealth.org:/another/path some/path
to copy data from the BIH HPC to the MAX cluster.
-How can I copy data between the Charite Network and BIH HPC?¶
-
-In general, connections can only be initiated from the Charite network to the BIH network.
+
+How can I copy data between the Charité Network and BIH HPC?¶
+In general, connections can only be initiated from the Charité network to the BIH network.
The reverse does not work.
-In other words, you have to be on a machine inside the Charite network and then initiate your file copies to or from the BIH HPC from there.
+In other words, you have to be on a machine inside the Charité network and then initiate your file copies to or from the BIH HPC from there.
E.g., use rsync -avP some/path user_c@hpc-transfer-1.cubi.bihealth.org:/another/path
to copy files from the Charité network to BIH HPC and rsync -avP user_c@hpc-transfer-1.cubi.bihealth.org:/another/path some/path
to copy data from the BIH HPC to the Charité network.
My jobs are slow/die on the login/transfer node!¶
As of December 3, 2020 we have established a policy to limit you to 512 files and 128MB of RAM.
@@ -4376,7 +4389,7 @@
How can I exchange
- September 27, 2024
+ October 7, 2024
diff --git a/help/good-tickets/index.html b/help/good-tickets/index.html
index 1ea4135e..4ff57118 100644
--- a/help/good-tickets/index.html
+++ b/help/good-tickets/index.html
@@ -3244,7 +3244,7 @@ Problems Submitting Jobs
- September 27, 2024
+ October 7, 2024
diff --git a/help/helpdesk/index.html b/help/helpdesk/index.html
index ee2c50ae..0f19cc61 100644
--- a/help/helpdesk/index.html
+++ b/help/helpdesk/index.html
@@ -3226,7 +3226,7 @@ Helpdesk Non-Scope
- September 27, 2024
+ October 7, 2024
diff --git a/help/hpc-talk/index.html b/help/hpc-talk/index.html
index 499ecce2..6f165924 100644
--- a/help/hpc-talk/index.html
+++ b/help/hpc-talk/index.html
@@ -3126,7 +3126,7 @@ HPC Talk
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/connect/gpu-nodes/index.html b/how-to/connect/gpu-nodes/index.html
index fa7de5aa..88ce9109 100644
--- a/how-to/connect/gpu-nodes/index.html
+++ b/how-to/connect/gpu-nodes/index.html
@@ -3400,7 +3400,7 @@ Fair Share / Fair Use
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/connect/high-memory/index.html b/how-to/connect/high-memory/index.html
index 87730aa0..6f40b4d7 100644
--- a/how-to/connect/high-memory/index.html
+++ b/how-to/connect/high-memory/index.html
@@ -3207,7 +3207,7 @@ How-To
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/misc/contribute/index.html b/how-to/misc/contribute/index.html
index c391f099..9adf3288 100644
--- a/how-to/misc/contribute/index.html
+++ b/how-to/misc/contribute/index.html
@@ -3128,7 +3128,7 @@ How-To: Contribute to this Document<
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/misc/debug-at-hpc/index.html b/how-to/misc/debug-at-hpc/index.html
index 70f5447c..d015d897 100644
--- a/how-to/misc/debug-at-hpc/index.html
+++ b/how-to/misc/debug-at-hpc/index.html
@@ -3346,7 +3346,7 @@ Don't Despair
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/misc/debug-software/index.html b/how-to/misc/debug-software/index.html
index e742f79e..ed5c3294 100644
--- a/how-to/misc/debug-software/index.html
+++ b/how-to/misc/debug-software/index.html
@@ -3672,7 +3672,7 @@ Reading Material on Debuggers
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/misc/hpc-talk/index.html b/how-to/misc/hpc-talk/index.html
index a9ec03ef..e0a12488 100644
--- a/how-to/misc/hpc-talk/index.html
+++ b/how-to/misc/hpc-talk/index.html
@@ -3282,7 +3282,7 @@ Closing Remarks
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/service/file-exchange/index.html b/how-to/service/file-exchange/index.html
index c3510b2e..c58715e7 100644
--- a/how-to/service/file-exchange/index.html
+++ b/how-to/service/file-exchange/index.html
@@ -3540,7 +3540,7 @@ Security
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/apptainer/index.html b/how-to/software/apptainer/index.html
index 1af8e4e2..f0b54a5a 100644
--- a/how-to/software/apptainer/index.html
+++ b/how-to/software/apptainer/index.html
@@ -3373,7 +3373,7 @@ Conversion Compatibility
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/cell-ranger/index.html b/how-to/software/cell-ranger/index.html
index ea19dff0..c7341ab6 100644
--- a/how-to/software/cell-ranger/index.html
+++ b/how-to/software/cell-ranger/index.html
@@ -3214,7 +3214,7 @@ cluster support SGE (outdated)
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/jupyter/index.html b/how-to/software/jupyter/index.html
index 48cfce7a..7f0164b5 100644
--- a/how-to/software/jupyter/index.html
+++ b/how-to/software/jupyter/index.html
@@ -3360,7 +3360,7 @@ Advanced
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/keras/index.html b/how-to/software/keras/index.html
index 682afdb8..1f994f82 100644
--- a/how-to/software/keras/index.html
+++ b/how-to/software/keras/index.html
@@ -3151,7 +3151,7 @@ Conda environment
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/matlab/index.html b/how-to/software/matlab/index.html
index 26d23c36..55cea2e2 100644
--- a/how-to/software/matlab/index.html
+++ b/how-to/software/matlab/index.html
@@ -3394,7 +3394,7 @@ A Working Example
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/openmpi/index.html b/how-to/software/openmpi/index.html
index b5f44ed1..a875a68c 100644
--- a/how-to/software/openmpi/index.html
+++ b/how-to/software/openmpi/index.html
@@ -3487,7 +3487,7 @@ Running Hybrid Software (MPI+
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/scientific-software/index.html b/how-to/software/scientific-software/index.html
index c3d203ba..0751e40f 100644
--- a/how-to/software/scientific-software/index.html
+++ b/how-to/software/scientific-software/index.html
@@ -3604,7 +3604,7 @@ Launching Gromacs
- September 27, 2024
+ October 7, 2024
diff --git a/how-to/software/tensorflow/index.html b/how-to/software/tensorflow/index.html
index a9739e5a..cfac6d43 100644
--- a/how-to/software/tensorflow/index.html
+++ b/how-to/software/tensorflow/index.html
@@ -3363,7 +3363,7 @@ Writing TensorFlow Slurm Jobs
- September 27, 2024
+ October 7, 2024
diff --git a/hpc-tutorial/episode-0/index.html b/hpc-tutorial/episode-0/index.html
index c05159b8..9679cdbc 100644
--- a/hpc-tutorial/episode-0/index.html
+++ b/hpc-tutorial/episode-0/index.html
@@ -3291,7 +3291,7 @@ Preparation
- September 27, 2024
+ October 7, 2024
diff --git a/hpc-tutorial/episode-1/index.html b/hpc-tutorial/episode-1/index.html
index 840e348e..39b01020 100644
--- a/hpc-tutorial/episode-1/index.html
+++ b/hpc-tutorial/episode-1/index.html
@@ -3409,7 +3409,7 @@ Outlook: More Programs and Static
- September 27, 2024
+ October 7, 2024
diff --git a/hpc-tutorial/episode-2/index.html b/hpc-tutorial/episode-2/index.html
index 2173363f..db763904 100644
--- a/hpc-tutorial/episode-2/index.html
+++ b/hpc-tutorial/episode-2/index.html
@@ -3402,7 +3402,7 @@ Job Queues
- September 27, 2024
+ October 7, 2024
diff --git a/hpc-tutorial/episode-3/index.html b/hpc-tutorial/episode-3/index.html
index 56faa201..29399373 100644
--- a/hpc-tutorial/episode-3/index.html
+++ b/hpc-tutorial/episode-3/index.html
@@ -3334,7 +3334,7 @@ First Steps: Episode 3
- September 27, 2024
+ October 7, 2024
diff --git a/hpc-tutorial/episode-4/index.html b/hpc-tutorial/episode-4/index.html
index b6a3a0f3..279cf642 100644
--- a/hpc-tutorial/episode-4/index.html
+++ b/hpc-tutorial/episode-4/index.html
@@ -3321,7 +3321,7 @@ First Steps: Episode 4
- September 27, 2024
+ October 7, 2024
diff --git a/index.html b/index.html
index 63006a9b..91ed7a48 100644
--- a/index.html
+++ b/index.html
@@ -3279,7 +3279,7 @@ Documentation Structure
- September 27, 2024
+ October 7, 2024
diff --git a/misc/external-resources/index.html b/misc/external-resources/index.html
index 2da1da9d..53dcbf92 100644
--- a/misc/external-resources/index.html
+++ b/misc/external-resources/index.html
@@ -3257,7 +3257,7 @@
- September 27, 2024
+ October 7, 2024
diff --git a/misc/provided-software/index.html b/misc/provided-software/index.html
index 3ae9a7dd..6f917098 100644
--- a/misc/provided-software/index.html
+++ b/misc/provided-software/index.html
@@ -3150,7 +3150,7 @@ Administration-Provided Software
- September 27, 2024
+ October 7, 2024
diff --git a/misc/publication-list/index.html b/misc/publication-list/index.html
index b6faafff..f0f7a965 100644
--- a/misc/publication-list/index.html
+++ b/misc/publication-list/index.html
@@ -3638,7 +3638,7 @@ 2017
- September 27, 2024
+ October 7, 2024
diff --git a/ondemand/interactive/index.html b/ondemand/interactive/index.html
index 264a3196..f3774338 100644
--- a/ondemand/interactive/index.html
+++ b/ondemand/interactive/index.html
@@ -3349,7 +3349,7 @@ Example
- September 27, 2024
+ October 7, 2024
diff --git a/ondemand/overview/index.html b/ondemand/overview/index.html
index 476551d7..c6e776ec 100644
--- a/ondemand/overview/index.html
+++ b/ondemand/overview/index.html
@@ -3333,7 +3333,7 @@ Portal Dashboard
- September 27, 2024
+ October 7, 2024
diff --git a/ondemand/quotas/index.html b/ondemand/quotas/index.html
index 4d8d4fb1..bca32e41 100644
--- a/ondemand/quotas/index.html
+++ b/ondemand/quotas/index.html
@@ -3160,7 +3160,7 @@ OnDemand: Quota Inspection
- September 27, 2024
+ October 7, 2024
diff --git a/overview/architecture/index.html b/overview/architecture/index.html
index edc0e593..ca4ecef4 100644
--- a/overview/architecture/index.html
+++ b/overview/architecture/index.html
@@ -3325,7 +3325,7 @@ Common Use Case
- September 27, 2024
+ October 7, 2024
diff --git a/overview/for-the-impatient/index.html b/overview/for-the-impatient/index.html
index 748675bc..edb00b3e 100644
--- a/overview/for-the-impatient/index.html
+++ b/overview/for-the-impatient/index.html
@@ -3302,7 +3302,7 @@ What the Cluster Is and Is NOT
- September 27, 2024
+ October 7, 2024
diff --git a/overview/job-scheduler/index.html b/overview/job-scheduler/index.html
index f3913e6c..148652d8 100644
--- a/overview/job-scheduler/index.html
+++ b/overview/job-scheduler/index.html
@@ -3462,7 +3462,7 @@ critical
- September 27, 2024
+ October 7, 2024
diff --git a/overview/monitoring/index.html b/overview/monitoring/index.html
index 100f7e43..beca69cf 100644
--- a/overview/monitoring/index.html
+++ b/overview/monitoring/index.html
@@ -3241,7 +3241,7 @@ Aggregate GPU Utilization Visua
- September 27, 2024
+ October 7, 2024
diff --git a/overview/storage/index.html b/overview/storage/index.html
index e012d359..60d2d1d8 100644
--- a/overview/storage/index.html
+++ b/overview/storage/index.html
@@ -3305,7 +3305,7 @@ Cluster Volumes and Locations
- September 27, 2024
+ October 7, 2024
diff --git a/search/search_index.json b/search/search_index.json
index 96ccea89..7f342b3e 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"Welcome to the user documentation of the BIH high-performance computing (HPC) cluster, also called HPC 4 Research. The BIH HPC cluster is managed by CUBI (Core Unit Bioinformatics). This documentation is maintained by BIH CUBI and the user community. It is a living document that you can update and add to. See How-To: Contribute to this Document for details.
The global table of contents is on the left, the one of the current page is on the right.
Additional resources
- User discussion forum
- Performance and workload monitoring
"},{"location":"#getting-started","title":"Getting Started","text":"Read the following set of pages (in order) to learn how to get access and connect to the cluster.
- Getting Access
- Connecting
- Storage
- Slurm
- Getting Help (Writing Good Tickets; if no answer found, contact the HPC Helpdesk).
- HPC Tutorial
Acknowledging BIH HPC Usage
Acknowledge usage of the cluster in your manuscript as \"Computation has been performed on the HPC for Research/Clinic cluster of the Berlin Institute of Health\". Please add your publications using the cluster to this list.
"},{"location":"#news-maintenance-announcements","title":"News & Maintenance Announcements","text":" - July 16th: New high-memory node
hpc-mem-5
with 4 TB of RAM. - Until autumn 2024: Operation Exodus \u2013 Migration of all data from GPFS to CephFS storage.
- September 30th 2024: Unmounting of
/fast
on all non-transfer nodes. - October 31st 2024: Retirement of GPFS/DDN storage.
See Maintenance for a detailed list of current, planned, and previous maintenance and update work.
"},{"location":"#technical-details","title":"Technical Details","text":"If you are interested in how this HPC cluster is set up on a technical level, we got you covered. There is an entire section on this.
"},{"location":"#documentation-structure","title":"Documentation Structure","text":"The documentation is structured as follows:
- Administrative information about administrative processes such as how to get access, register users, work groups, and projects.
- Connecting technical help for connecting to the cluster.
- Storage describes how and where files are stored.
- HPC tutorial a first demo project for getting you started quickly.
- Cluster Scheduler technical help for using the Slurm scheduler.
- OnDemand Portal introduces web HPC access.
- Best Practice guidelines on recommended usage of certain aspects of the system.
- Static Data (Cubit) documentation about the static data (files) collection on the cluster.
- How-To short(ish) solutions for specific technical problems.
- Getting Help explains how you can obtain help in using the BIH HPC.
- Miscellaneous contains a growing list of pages that don't fit anywhere else.
"},{"location":"admin/getting-access/","title":"Getting Access","text":"Access to the BIH HPC cluster is conceptually based on user groups (also known as labs or units) and projects. Users have a relatively limited storage quota within their private home folder and store big data primarily within their group's work space or in project folders. Projects are collaborative efforts involving multiple PIs/groups and are allocated separate storage space on the cluster.
Independent group leaders at BIH/Charit\u00e9/MDC can request a group on the cluster and name group members. The work group leader (the group PI) bears the responsibility for the group's members and ensures that cluster policies and etiquette are followed. In brief: Fair usage rules apply and the cluster is not to be abused for unethical or illegal purposes. Major and/or continued violations may lead to exclusion of the entire group.
The group leader may also name one delegate (typically an IT-savvy Post-Doc) who is thereby allowed to take decisions about cluster usage and work group management on behalf of the group leader. The above-mentioned responsibilities stay with the group leader.
Note
- A Charit\u00e9 or MDC user account is required for accessing HPC 4 Research.
- Please only use email addresses from the institutions Charite, BIH, or MDC in the forms below.
"},{"location":"admin/getting-access/#work-groups-and-users","title":"Work Groups and Users","text":"All cluster users are member of exactly one primary work group. This affiliation is usually defined by real life organisational structures within Charit\u00e9/BIH/MDC. Leaders of independent research groups (PIs) can apply for a new cluster work group as follows:
- The group leader sends an email to hpc-helpdesk@bih-charite.de and includes the filled-out form below. Please read the notes box before sending.
- The HPC helpdesk decides on the request and creates corresponding objects on the cluster (users, groups, directories).
- New users are notified and sent further instructions via email.
Important
Changes to an existing group (adding new users, changes in resources, etc.) can only be requested by group leaders and delegates.
"},{"location":"admin/getting-access/#form-new-group","title":"Form: New Group","text":"Example values are given in curly braces.
# Group \"ag-{doe}\"\nGroup leader/PI: {John Doe}\nDelegate [optional]: {Max Mustermann}\nPurpose of cluster usage [short]: {RNA-seq analysis in colorectal cancer}\n\nRequired resources:\n- Tier 1 storage: {1 TB}\n- Tier 1 scratch: {10 TB}\n- Tier 2 storage: {10 TB}\n\n# Users\n## User 1\n- first name: {John}\n- last name: {Doe}\n- affiliation: {Charit\u00e9, Department of Oncology}\n- institute email: {john.doe@charite.de}\n- user has account with\n - [ ] BIH\n - [x] Charite\n - [ ] MDC\n- BIH/Charit\u00e9/MDC user name: {doej}\n\n## User 2\n[etc.]\n
"},{"location":"admin/getting-access/#form-add-user-to-group","title":"Form: Add User to Group","text":"Example values are given in curly braces.
# New user of AG {Doe}\n- first name: {Mia}\n- last name: {Smith}\n- affiliation: {Charit\u00e9, Department of Oncology}\n- institute email: {mia.smith@charite.de}\n- user has account with\n - [ ] BIH\n - [x] Charite\n - [ ] MDC\n- BIH/Charit\u00e9/MDC user name: {smithm}\n
Notes
- All cluster groups must have an owner and may have one delegate.
- Group ownership implies control but also accountability for their group files and members.
- Users can only be members of one primary work group.
- We strongly discourage on-boarding non-lab members into your group. This causes biases in usage accounting, may raise concerns in IT security and data privacy audits, and also puts unfair responsibilities on the group leader.
"},{"location":"admin/getting-access/#projects","title":"Projects","text":"Projects are secondary user groups to enable:
- collaboration and data sharing across different work groups,
- fine-grained allocation of additional storage resources,
- organising data in a fine-grained manner for better data lifecycle management.
Project creation can be initiated by group leaders and group delegates as follows:
- Send an email to hpc-helpdesk@bih-charite.de and include the filled-out form below. Please read the notes box before sending.
- The HPC helpdesk decides on the request and creates corresponding objects on the cluster (groups, directories).
Important
Changes to an existing project (adding new users, changes in resources, etc.) can only be requested by project owners and delegates. Please send us cluster user names for adding new project members.
"},{"location":"admin/getting-access/#form","title":"Form","text":"Example values are given in curly braces.
# Project \"{doe-dbgap-rna}\"\nProject owner: {John Doe}, {doej_c}\nDelegate [optional]: {Max Mustermann}, {musterm_c}\nPurpose of cluster usage [short]: {RNA-seq data from dbGAP}\n\nRequired resources:\n- Tier 1 work: {0 TB}\n- Tier 1 scratch: {0 TB}\n- Tier 2 storage: {1 TB}\n\nAdditional members (cluster user names):\n- {sorgls_c}\n- ...\n
Notes
- All projects must have one owner and may have one delegate.
- Please note that we will enforce kebab case for all project names and folders.
- Tier 1 project storage will be supplemented with 10 TB of T1 scratch by default.
- Users can be associated with multiple projects.
- Project membership does not grant cluster access. A primary group affiliation is still required.
"},{"location":"admin/maintenance/","title":"Next Maintenance Window","text":"This page documents the current and known upcoming maintenance windows.
"},{"location":"admin/maintenance/#login-compute-and-storage-maintenance-december-13-14-2022","title":"Login, Compute and Storage Maintenance, December 13-14, 2022","text":"All informationand updates regarding maintenance will be circulated on our forum https://hpc-talk.cubi.bihealth.org/c/announcements/5.
"},{"location":"admin/maintenance/#login-compute-and-storage-maintenance-march-22-23-2022","title":"Login, Compute and Storage Maintenance, March 22-23, 2022","text":"All COMPUTE nodes and STORAGE resources won't be reachable!
All nodes will be running in RESERVATION mode. This means you are still able to schedule new jobs on these nodes if their potential/allowed runtime does not extend into the maintenance window (Tuesday and Wednesday, March 22 and 23, all-day). For example, if you submit a job that can run up to 7 days after March 15 then the job will remain in \"pending/PD\" state giving the explanation of \"all nodes being reserved or unavailable\".
Issues of today's maintenance:
- Mounting of storage to
/tmp
on login nodes - Changing mount options of the root partition on the compute nodes
- Upgrading all nodes kernels and further packages
- This implies an upgrade of CUDA, Singularity, and further packages
- Cold reboot (\"power off, power on\") of storage system
- Exchanging
cephfs-2
switches (Tier 2 storage, not relevant for most users)
IMPORTANT
- All nodes will reboot
- All running jobs will die
- All sessions on login nodes will die
Progress Thread on hpc-talk
"},{"location":"admin/maintenance/#drmaa-deprecation-march-2-2022","title":"DRMAA Deprecation, March 2, 2022","text":" - The usage of DRMAA on the HPC is deprecated.
- In Snakemake, it has been deprecated in favor of using Snakemake Profiles as documented.
- We will support DRMAA at least until June 30, 2022 but ask all users to migrate away from it as soon as possible.
- Background:
- With DRMAA, the status of each job is queried for using
scontrol show job JOBID
and sacct -j JOBID
. - This leads to regular remote procedure calls (RPC) to the slurm control daemon.
- It leads to a lot of such calls.
- It leads to so many calls that it prevents the scheduler from working correctly and leads to service degradation for all users.
- Using Snakemake profiles is easy.
- Call Snakemake with
snakemake --profile=cubi-v1
instead of snakemake --drmaa \"...\"
. - In your rules, specify threads, running time and memory as:
rule myrule:\n # ...\n threads: 8\n resources:\n time=\"12:00:00\",\n memory=\"8G\",\n # ...\n
"},{"location":"admin/maintenance/#cluster-setting-tuning-march-1-2022","title":"Cluster Setting Tuning, March 1, 2022","text":" - We have adjusted the scheduler settings to address high number of jobs by users:
SchedulerParameters+=bf_max_job_user=50
: backfill scheduler only considers 50 jobs of each user. This mitigates an issue with some users having too many jobs and thus other users' jobs don't get ahead in the queue EnforcePartLimits=ALL
: jobs that don't fit into their partition are rejected DependencyParameters=kill_invalid_depend
: jobs that have dependencies set that cannot be fulfilled will be killed
"},{"location":"admin/maintenance/#limiting-global-memory-usage-february-14-2022","title":"Limiting Global Memory Usage, February 14, 2022","text":" - A global memory allocation limit per user is set per partition.
- The value is set to \"max CPU count per user * 7GB\".
- Users can allocate up to \"max cpu count\" CPUs or \"max cpu count * 7GB\" RAM.
- This is enforced globally (users could allocate \u2154 of their global CPU limit with 3.5 GB RAM and \u2153 with 7GB of RAM, for example).
"},{"location":"admin/maintenance/#ganglia-fixes-docs-february-3-2022","title":"Ganglia Fixes & Docs, February 3, 2022","text":" - Reparing GPFS and NVIDIA GPU monitoring in Ganglia
- Root cause was that the Python modules in Ganglia were removed from EPEL. We now have a local package build of Ganglia, if you are interested, here is the patch and Docker based build instructions.
- You can find some documentation about our Ganglia here.
"},{"location":"admin/maintenance/#misc-changes-january-29-2022","title":"Misc Changes, January 29, 2022","text":" - We have reduced oversubscription to 2x from 4x.
- We have setup the user quota on /tmp on the login nodes to 20MB to improve stability of the nodes.
"},{"location":"admin/maintenance/#enabling-oversubscription-january-6-2022","title":"Enabling Oversubscription, January 6, 2022","text":" - Many resources remain unused as users allocate too many cores to their jobs. Slurm will now oversubscribe jobs in terms of CPUs, i.e., schedule more than one allocated core per physical core/thread.
"},{"location":"admin/maintenance/#enforcing-usage-of-localtmp-resource-january-31-2022","title":"Enforcing Usage of localtmp
Resource, January 31, 2022","text":" - We will enforce using
localtmp
resource for local storage above 100MB. - See Slurm: Temporary Files for details.
"},{"location":"admin/maintenance/#temporary-file-handling-changes-december-27-2021","title":"Temporary File Handling Changes, December 27, 2021","text":" - Each job gets its private
/tmp
using Linux namespaces/cgroups. This greatly improves the reliability of cleaning up after jobs. (Technically, this is implemented using the Slurm job_container/tmpfs) plugin. - We are starting to track available local temporary space with Slurm in the general resource (
Gres
) \"localtmp\". In the future this will become a requirement. Also see Slurm: Temporary Files.
"},{"location":"admin/maintenance/#cluster-node-upgrades-december-22-23-2021","title":"Cluster Node Upgrades, December 22-23, 2021","text":" - Renaming of cluster head nodes to:
hpc-login-1.cubi.bihealth.org
hpc-login-2.cubi.bihealth.org
hpc-portal.cubi.bihealth.org
hpc-transfer-1.cubi.bihealth.org
hpc-transfer-2.cubi.bihealth.org
- Upgraded cluster operating system from CentOS 7.9 to Rocky Linux 8.5.
- Added three more GPU nodes with Tesla V100 GPUS:
hpc-gpu-{5..7}
. - Slurm has been upgraded to
28.08.5
. - Ganglia monitoring generally available at https://hpc-ganglia.cubi.bihealth.org, from internal networks.
- We have applied a number of changes to maximal running times in Slurm configuration.
"},{"location":"admin/maintenance/#gpfs-upgrade-december-20-21-2021","title":"GPFS Upgrade, December 20-21, 2021","text":"The GPFS storage system has been upgraded to the latest version to make compatible with Enterprise Linux version 8.
"},{"location":"admin/maintenance/#slurm-upgrade-to-21080-september-8-2021","title":"Slurm upgrade to 21.08.0
, September 8, 2021","text":"Slurm has been upgraded to version 21.08.0
.
"},{"location":"admin/maintenance/#network-re-cabling-september-7-8-2021","title":"Network re-cabling, September 7-8, 2021","text":"All servers/nodes won't be reachable!
All nodes will be running in reservation mode. This means you are still able to schedule new jobs on these nodes if their potential/allowed runtime does not extend into the maintenance window (Tuesday and Wednesday, September 7 and 8, all-day). For example, if you submit a job that can run up to 7 days after August 30 then the job will remain in \"pending/PD\" state giving the explanation of \"all nodes being reserved or unavailable\".
If you already have a job running on any nodes that goes beyond September 7, 12:00 am (00:00 Uhr), this job will die.
"},{"location":"admin/maintenance/#renaming-of-gpu-high-memory-machines-scheduler-changes-september-7-2021","title":"Renaming of GPU & High Memory Machines & Scheduler Changes, September 7, 2021","text":"The GPU machines med030[1-4]
have been renamed to hpc-gpu-[1-4]
. The high memory machines med040[1-4]
have been renamed to hpc-mem-[1-4]
. It will probably take us some time to update all places in the documentation.
Further, the long
partition has been changed to allow jobs with a maximum running time of 14 days.
"},{"location":"admin/maintenance/#new-nodes-in-the-staging-partition-august-31-2021","title":"New Nodes in the staging
partition, August 31, 2021","text":"We have installed 36 new nodes (in BETA mode) in the cluster called hpc-node-[1-36]
. They have 48 cores (thus 96 hardware threads) each and have 360GiB of main memory available (for the hardware nerds, it's Intel(R) Xeon(R) Gold 6240R CPUs at 2.40GHz, featuring the cascadelake
architecture).
Right now, they are only available in the staging
partition. After some testing we will move them to the other partitions. We'd like to ask you to test them as well and report any issues to hpc-helpdesk@bih-charite.de. The nodes have been setup identically to the existing med0xxx
nodes. We do not expect big changes but the nodes might not be as stable as other oness.
Here is how you can reach them.
hpc-login-1 # srun --immediate=5 --pty --time=24:00:00 --partition=staging bash -i\n[...]\nhpc-cpu-1 #\n
Note that I'm specifying a maximal running time of 24h so the scheduler will end the job after 24 hours which is before the upcoming maintenance reservation begins. By default, the scheduler allocates 28 days to the job which means that the job cannot end before the reservation and will be scheduled to start after it. See Reservations / Maintenances for more information about maintenance reservations.
"},{"location":"admin/maintenance/#reservation-maintenance-display-on-login-august-30-2021","title":"Reservation / Maintenance Display on Login, August 30, 2021","text":"User will now be notified on login about maintenance, for example:
NOTE: scheduled maintenance(s)\n\n 1: 2021-09-07 00:00:00 to 2021-09-09 00:00:00 ALL nodes\n\nSlurm jobs will only start if they do not overlap with scheduled reservations.\nMore information:\n\n - https://bihealth.github.io/bih-cluster/slurm/reservations/\n - https://bihealth.github.io/bih-cluster/admin/maintenance/\n
"},{"location":"admin/maintenance/#update-to-job-sumission-script-august-23-2021","title":"Update to Job Sumission Script, August 23, 2021","text":"The srun
command will now behave as if --immediate=60
has been specified by default. It explains how to override this behaviour and possible reasons for job scheduling to fail within 60 seconds (reservations and full cluster).
"},{"location":"admin/maintenance/#slurm-upgrade-august-6-2021","title":"Slurm upgrade, August 6, 2021","text":"We upgrade from 20.11.2
to 20.11.8
which contains some fixes for bugs that our users actually stumbled over. The change should be non-intrusive as it's only a patch-level update.
"},{"location":"admin/maintenance/#networking-hardware-exchange-august-3-2021","title":"Networking hardware exchange, August 3, 2021","text":"Following servers won't be reachable:
- GPU nodes (med03xx)
- computing nodes (med0233-0248)
These nodes are running in reservation mode now. This means you are still able to schedule new jobs on these nodes if their potential/allowed runtime does not extend into the maintenance window (Tuesday, August 3, all-day). For example, if you submit a job that can run up to 7 days after July 26 then the job will remain in \"pending/PD\" state giving the explanation of \"all nodes being reserved or unavailable\". If you have a job running on any of the before mentioned nodes that goes beyond August 3, 12:00 am (00:00 Uhr), this job will die. We do not expect the remaining nodes to be affected. However, there remains a minor risk of unexpected downtime of other nodes.
"},{"location":"admin/maintenance/#server-reorganization-july-13-2021","title":"Server reorganization, July 13, 2021","text":"Affected servers are:
- med02xx
- med07xx
"},{"location":"admin/maintenance/#server-reorganization-june-22-23-2021","title":"Server reorganization, June 22 + 23, 2021","text":"If you have a job running on any of the before mentioned nodes that goes beyond June 22, 6am, this job will die. We put a so-called Slurm reservation for the maintenance period. Any job that is scheduled before the maintenance and whose end time (start time + max running time) is not before the start of the maintenance will not be scheduled with the message ReqNodeNotAvail, Reserved for maintenance.
Affected servers are:
- med01xx
- med05xx
- med06xx
- med03xx
- med0405
"},{"location":"admin/maintenance/#memory-and-psu-exchange-may-31-2021","title":"Memory and PSU exchange, May 31, 2021","text":" - Memory exchange
- transfer-1.research (OK)
- med0143 (OK)
- med0147 (OK)
- med0206 (FAIL - exchange part broken)
- med0233 (FAIL - exchange part broken)
- med0254 (FAIL - exchange part broken)
- PSU exchange
- med-host024 (OK)
"},{"location":"admin/maintenance/#moving-servers-may-20-may-25-2021","title":"Moving servers, May 20 + May 25, 2021","text":" - Physically moving proxmox-{2,4} and transfer-2.research (May 20)
- Physically moving proxmox-{1,3} and transfer-1.research (May 25)
"},{"location":"admin/maintenance/#miscellaneous-maintenances-december-23-25-2020","title":"Miscellaneous Maintenances, December 23-25, 2020","text":"HPC 4 Research
- Separate HPC 4 Research group GID space from other organization's.
- Fully Unavailable
- Reboot login nodes to increase RAM on hpc-login-2.research
- Update firmwares of transfer-{1,2}.research
"},{"location":"admin/maintenance/#centos-8-migration-in-planning","title":"CentOS 8 Migration (in planning)","text":"Note
This task is currently being planned. No schedule has been fixed yet.
- All nodes will be upgraded to CentOS 8.
- This will be done in a rolling fashion over the course of 1 month.
- The login nodes must be rebooted which we will do with a break of 2 days (one node will remain running).
"},{"location":"admin/maintenance/#finalize-unification-of-mass-data-mounts","title":"Finalize unification of Mass Data Mounts","text":"Note
This task is currently being planned. No schedule has been fixed yet.
- We will remove the bind mount
/fast
that currently points to /data/gpfs-1
on HPC 4 Research. - Users should use
/data
instead of /fast
everywhere, e.g., /data/users/$NAME
etc.
"},{"location":"admin/maintenance/#previous-maintenance-windows","title":"Previous Maintenance Windows","text":""},{"location":"admin/maintenance/#hpc-4-research-miscellaneous-maintenances-december-1-2020","title":"HPC 4 Research: Miscellaneous Maintenances, December 1, 2020","text":"Time: 6am-12am
- Exchange GPFS Controller
- We need to exchange a central piece of hardware in the storage system.
- We do not expect a downtime, only a degradation of service.
- Access to the GPFS will be degraded
- Slurm Scheduler
- Upgrade to the latest and greatest version.
- Restructure scheduler installation to ease rolling upgrades without future downtimes.
- Archival of old accounting information to improve schedule performance.
- Slurm will be unavailable.
- Re-Mounting of GPFS
- The
/fast
file system will be re-mounted to /data/gpfs-1
. /fast
becomes a symbolic link to /data
on all of the cluster. - GPFS access will disappear for some time.
- Login & Transfer Node Migration
- The login nodes will be moved from physical machines to virtual machines in high-availability mode.
- Further, they will be available as
hpc-login-1.cubi.bihealth.org
and login-2...
instead of hpc-login-{1,2}
. - The same is true for,
hpc-transfer-{1,2}
which will be replaced by transfer-1.research.hpc.bihealth.org
and transfer-2...
. - The aim is to improve stability and make everything easier to manage by administration.
"},{"location":"admin/maintenance/#current-status-result","title":"Current Status / Result","text":" - We had to clear the accounting information database to make the update work within an acceptable time (we have 4M+ jobs in there). From now on we will only keep the last 31 days in the database (updated nightly).
- The old login and transfer nodes have been made available as nodes
med010[1-3]
and med012[5-6]
. - All nodes are available again.
- The maintenance is complete.
"},{"location":"admin/maintenance/#slurm-scheduler-updates-september-8-2020","title":"Slurm Scheduler Updates: September 8, 2020","text":" - To improve the scheduling behaviour we will need to restart the Slurm scheduler at ~8am.
- If everything runs well, this will finish after 30minutes (8:30 am).
- Planned Scheduler Changes:
- Introduce automatic routing of jobs to partitions.
- Make Slurm scheduler and accounting run more robustly.
"},{"location":"admin/maintenance/#network-maintenance-june-3-2020","title":"Network Maintenance: June 3, 2020","text":"On June 3, we need to perform a network maintenance at 8 am.
If everything goes well, there might be a short delay in network packages and connections will survive. In this case, the maintenance will end 8:30 am.
Otherwise, the maintenance will finish by noon.
"},{"location":"admin/maintenance/#cluster-maintenance-with-downtime-june-16","title":"Cluster Maintenance with Downtime: June 16","text":"We need to schedule a full cluster downtime on June 16.
"},{"location":"admin/maintenance/#slurm-migration","title":"Slurm Migration","text":"We will switch to the Slurm workload scheduler (from the legacy SGE). The main reason is that Slurm allows for better scheduling of GPUs (and has loads of improvements over SGE), but the syntax is a bit different. Currently, our documentation is in an transient state. We are currently extending our Slurm-specific documentation.
- March 7, 2020 (test stage): Slurm will provide 16 CPU and 3 GPU nodes (with 4 Tesla V100 each), and two high memory nodes, the remaining nodes are available in SGE. We ask users to look into scheduling with Slurm.
- March 31, 2020 (intermediate stage): Half of the nodes will be migrated to the Slurm cluster (~100), all high memory and GPU nodes will be moved to Slurm. New users are advised to use not learn SGE any more but directly use Slurm. Support for SGE is limited to bug fixing only (documentation and tips are phased out).
- May 31, 2020 (sunsetting SGE): All but 16 nodes will remain in the SGE cluster.
- June 31, 2020 (the end): SGE has reached its end of life on hpc4research.
"},{"location":"admin/maintenance/#ssh-key-management","title":"SSH Key Management","text":"SSH Key Management has switched to using Charite and MDC ActiveDirectory servers. You need to upload all keys by the end of April 2020.
- MDC Key Upload
- Charite Key Upload
Schedule
Feb 4, 2020:
Keys are now also taken from central MDC/Charite servers. You do not need to contact us any more to update your keys (we cannot accelerate the process at MDC). May 1, 2020:
Keys are now only taken from central MDC/Charite servers. You must upload your keys to central servers by then.
"},{"location":"admin/maintenance/#switch-update-location-flip-of-hpc-login-2-and-hpc-transfer-1","title":"Switch update, Location Flip of hpc-login-2 and hpc-transfer-1","text":" - Monday, February 23, 9am-15am.
Affected systems:
hpc-transfer-1
hpc-transfer-2
hpc-login-2
- a few compute nodes
The compute nodes are non-critical as we are taking them out of the queues now.
"},{"location":"admin/maintenance/#centos-76-upgrade-january-29-february-5","title":"CentOS 7.6 Upgrade, January 29, February 5","text":" - Wednesday, January 29, 2018: Reboot hpc-login-1, hpc-transfer-1
- Wednesday, February 5, 2018: Reboot hpc-login-2, hpc-transfer-2
"},{"location":"admin/maintenance/#september-03-30-2018","title":"September 03-30, 2018","text":"Starting monday 03.09.2018 we will be performing rolling update of the cluster from CentOS 7.4 to CentOS 7.5. Since update will be performed in small bunches of nodes, the only impact you should notice is smaller number of nodes available for computation.
Also, for around two weeks, you can expect that your jobs can hit both CentOS 7.4 & CentOS 7.5 nodes. This should not impact you in any way, but if you encounter any unexpected behavior of the cluster during this time, please let us know.
At some point we will have to update the transfer, and login nodes. We will do this also in parts, so the you can switch to the other machine.
Key dates are:
18.09.2018 - hpc-login-1 & hpc-transfer-1 will not be available, and you should switch to hpc-login-2 & hpc-transfer-2 respectively.
25.09.2018 - hpc-login-2 & hpc-transfer-2 will not be available, and you should switch to hpc-login-1 & hpc-transfer-1 respectively.
Please also be informed that non-invasive maintenance this weekend which we announced has been canceled, so cluster will operate normally.
In case of any concerns, issues, do not hesitate to contact us via hpc-admin@bih-charite.de, or hpc-helpdesk@bih-charite.de.
"},{"location":"admin/maintenance/#june-18-2018-0600-1500","title":"June 18, 2018, 0600-1500","text":"Due to tasks we need to perform on BIH cluster, we have planned maintenance:
- Maintenance start: 18.06.2018 06:00 AM
- Maintenance end: 18.06.2018 3:00 PM
During maintenance we will perform several actions:
- GPFS drives re-balancing to improve performance
- OS update on cluster, transfer, and login nodes
During maintenance whole cluster will not be usable, this includes:
- you will not be able to run jobs on cluster (SGE queuing system will be shutdown)
- hpc-login-{1,2} nodes will not work reliably during this time
- hpc-transfer-{1-2} nodes, and resources shared by them will be not available
The maintenance window is quite long, since we depend on an external vendor. However, we will recover services as soon as possible.
We will keep you posted during maintenance with services status.
"},{"location":"admin/maintenance/#march-16-18-2018-mdc-it","title":"March 16-18, 2018 (MDC IT)","text":"MDC IT has a network maintenance from Friday, March 16 18:00 hours until Sunday March 18 18:00 hours.
This will affect connections to the cluster but no connections within the cluster.
"},{"location":"admin/maintenance/#january-17-2018-complete","title":"January 17, 2018 (Complete)","text":"STATUS: complete
The first aim of this window is to upgrade the cluster to CentOS 7.4 to patch against the Meltdown/Spectre vulnerabilities. For this, the login and transfer nodes have to be rebooted.
The second aim of this window is to reboot the file server to mitigate some NFS errors. For this, the SGE master has to be stopped for some time.
"},{"location":"admin/maintenance/#planprogress","title":"Plan/Progress","text":" - reboot med-file1
- update to CentOS 7.4
- front nodes
- hpc-login-1
- hpc-login-2
- hpc-login-3 (admin use only)
- hpc-transfer-1
- hpc-transfer-2
- infrastructure nodes
- qmaster*
- install-srv
- compute nodes
- med0100 to med0246
- med0247 to med0764
- special purpose compute nodes
- med0401 (high-memory)
- med0402 (high-memory)
- med0403 (high-memory)
- med0404 (high-memory)
- med0405 (GPU)
"},{"location":"admin/maintenance/#previous-maintenance","title":"Previous Maintenance","text":"(since January 2010)
- none
"},{"location":"admin/policies/","title":"Policies","text":"This page describes strictly enforced policies valid on the BIH HPC clusters.
The aim of the HPC systems is to support the users in their scientific work and relies on their cooperation. First and foremost, the administration team enforces state of the art IT security and reliability practices through their organizational and operational processes and actions. We kindly ask user to follow the Cluster Etiquette describe below to allow for fair use and flexible access to the shared resources. Beyond this, policies are introduced or enforced only when required to ensure non-restrictive access to the resources themselves. Major or recurrent breaches of policies may lead to exclusion from service.
We will update this list of policies over time. Larger changes will be announced through the mailing list.
"},{"location":"admin/policies/#cluster-etiquette","title":"Cluster Etiquette","text":" - The clusters are soft-partitioned shared resources that are made available under a \"fair use\" policy as far as possible.
- The general assumption that if a user interferes with the work of others (e.g., by blocking compute slots) then this happens accidentally.
- Please do not do this.
- If you see this happening try to contact the user yourself (use
getent passswd $USER
to find out the user's office contact details). - Send an email to hpc-helpdesk@bih-charite.de if you need administrative intervention.
- All users must be subscribed to the cluster mailing list (they are subscribed automatically when the account is created).
- When leaving please send an email to hpc-helpdesk@bih-charite.de such that we can shutdown your account in an organized fashion. We also need to arrange for cleaning up your data.
- The cluster mailing list bih-cluster@charite.de is the primary contact channel for announcements by administration to users. Users must be subscribed to the mailing list. Users must follow the announcements, failure to do so can lead to missing important policy changes and thus losing access to the cluster or data.
- Do not perform any computation on the login nodes. This includes: running
conda
, archive management tools such as tar
, (un)zip
, or gzip
. You should probably only run screen
/tmux
and maybe a text editor there. - Do not perform file transfers through the login nodes. Rather use the transfer nodes
hpc-transfer-1
and hpc-transfer-2
.
"},{"location":"admin/policies/#cluster-policies","title":"Cluster Policies","text":""},{"location":"admin/policies/#file-system-policies","title":"File System Policies","text":"In the case of violations marked with a shield () administration reserves the right to remove write and possibly read permission to the given locations. Policies marked with a robot () are automatically enforced.
- Storage on the GPFS file system is a sparse resource try to use both data volume and file sparingly. Note well that small files above ~4KB take up at least 8MB of space.
- Default quotas are as follows (each user, group, project has a
home
, work
, and scratch
volume). You can request an increase by an email to hpc-helpdesk@bih-charite.de for groups and projects. home
10k files, 1GB space work
2M files, 1TB space scratch
20M files, 200TB space
- The overall throughput limit is 10GB/sec. Try not to overload the cluster I/O wise.
- User home/work/group file sets have to be owned by the user, group is
hpc-users
and mode is u=rwx,go=
; POSIX ACLs are prohibited. This policy is automatically enforced every 5 minutes. - Group and project home/work/group file sets have to be owned by the owner, group set to the corresponding unix group and mode is
u=rwx,g=rwxs,o=
; POSIX ACLs are prohibited. This policy is automatically enforced every 5 minutes. - All files in scratch will be moved into a read-only \"trash can\" inside
scratch/BIH_TRASH
after 14 days (by mtime
) over night. Trash directories will be removed after 14 further days. - Users can arrange with hpc-helpdesk@bih-charite.de to keep files longer by using
touch
on files in scratch
and subsequently bumping the mtime
. - In the case of abuse of this mechanism / failure to communicate with hpc-helpdesk, administration reserves the right to drastically reduce scratch quota of affected users and employ other measures to ensure stability of operations.
- You can learn more in the Automated Scratch Cleanup section.
- Administration will not delete any files (outside of
/tmp
). In the case that users need to delete files that they can access but not update/delete, administration will either give write permissions to the Unix group of the work group or project or change the owner to the owner/delegate of this group. This can occur in a group/project directory of a user who has left the organization. In the case that a user leaves the organization, the owner/delegate of the hosting group can request getting access to the user's files with the express agreement of this user. - Only use
/tmp
in Slurm-controlled jobs. This will enforce that Slurm can clean up after you.
"},{"location":"admin/policies/#connections","title":"Connections","text":"Network connections are a topic important in security. In the case of violations marked with a shield () administration reserves the right to terminate connections without notice and perform other actions.
- Data transfers should happen through the transfer nodes (HPC 4 Research) and/or the compute nodes themselves.
- The cluster is not meant as a \"hop node\". Do not use it to connect to the login node first and then jump to another host outside of the cluster network. Doing so is a breach of cluster policies and quite possibly your organization's IT security policies
- As a corollary, SSH reverse tunnels are strictly prohibited.
- Outgoing connections are meant for data transfers only (in other words: using SSH/SCP to download file is fine).
- Do not leave outgoing connections open longer than necessary.
- Sessions of
screen
and tmux
are only allowed to run on the head nodes. They will be terminated automatically on the compute nodes.
"},{"location":"admin/policies/#interactive-use","title":"Interactive Use","text":" - Interactive sessions block resources to the scheduler. Reduce interactive use to the minimal time and resources possible.
- The cluster is optimized for batch processing. Interactive use is a secondary aim. Administration attempts to strike a good balance here but batch usage is most important. Consider using our Open on Demand service for interactive use.
- Interactive use should happen through the Slurm scheduler (
srun
). - SSH connections to the nodes are allowed for monitoring purposes but not meant for computation. Administration enforces this by restricting all jobs outside of Slurm to use at most 1 core and 128 MB of RAM. This limit is enforced per node per user with Linux cgroups.
- Interactive Slurm sessions on scarce resources (GPU/Highmem partitions) are limited to 24 h.
"},{"location":"admin/policies/#gpu-use","title":"GPU Use","text":" - Interactive sessions block resources to the scheduler. Interactive GPU use is discouraged.
- Accessing GPU cores outside of the Slurm scheduler has been disabled by administration.
"},{"location":"admin/policies/#account-policies","title":"Account Policies","text":" - Sharing accounts and/or credentials is strictly prohibited. Doing so is a breach of cluster policies and certainly also of your organization's IT security policies.
- Hosting shared services on the cluster is also strictly prohibited. - This includes Jupyter servers that shall only be used by the user starting them, this also includes work schedulers such as Dask. - You can assume that the cluster internal network is secure and you do not have to encrypt connections between nodes. - Connections towards outside of the cluster must be encrypted (e.g., via SSH tunnels; incoming ones as reverse tunneling is prohibited, see above). - Access to any service must be protected by appropriate means, e.g., passwords, tokens or client certificates.
"},{"location":"admin/policies/#maintenance","title":"Maintenance","text":" - Maintenance that are expected to cause major service interruptions (the whole system becomes unusable and/or jobs might be prevented to run etc.) are announced 14 days in advance.
- Maintenance of login nodes (e.g., reboot one node while the other is still available) are announced 7 days in advance.
- Maintenance of transfer nodes are announced 1 day in advance. Rationale: transfer nodes expected to not have any interactive sessions running.
"},{"location":"admin/policies/#credentials-policies","title":"Credentials Policies","text":" - Login is currently based on SSH keys only.
- SSH keys must be deposited with the host organizations (Charite/MDC) as documented.
- For technical reasons, the compute nodes also use the
~/.ssh/authorized_keys
file but their usage is discouraged.
"},{"location":"best-practice/bashrc-guide/","title":"~/.bashrc
Guide","text":"You can find the current default content of newly created user homes in /etc/skel.bih
:
hpc-login-1:~$ head /etc/skel.bih/.bash*\n==> /etc/skel.bih/.bash_logout <==\n# ~/.bash_logout\n\n==> /etc/skel.bih/.bash_profile <==\n# .bash_profile\n\n# Get the aliases and functions\nif [ -f ~/.bashrc ]; then\n . ~/.bashrc\nfi\n\n# User specific environment and startup programs\n\nPATH=$PATH:$HOME/.local/bin:$HOME/bin\n\n==> /etc/skel.bih/.bashrc <==\n# .bashrc\n\n# Source global definitions\nif [ -f /etc/bashrc ]; then\n . /etc/bashrc\nfi\n\n# Uncomment the following line if you don't like systemctl's auto-paging feature:\n# export SYSTEMD_PAGER=\n
"},{"location":"best-practice/env-modules/","title":"Custom Environment Modules","text":"This document contains a few tips for helping you using environment modules more effectively. As the general online documentation is lacking a bit, we also give the most popular commands here.
"},{"location":"best-practice/env-modules/#how-does-it-work","title":"How does it Work?","text":"Environment modules are descriptions of software packages. The module
command is provided which allows the manipulation of environment variables such as PATH
, MANPATH
, etc., such that programs are available without passing the full path. Environment modules also allow specifying dependencies between packages and conflicting packages (e.g., when the same binary is available in two packages). Further, environment variables allow the parallel installation of different software versions in parallel and then using software \"a la carte\" in your projects.
"},{"location":"best-practice/env-modules/#popular-commands","title":"Popular Commands","text":""},{"location":"best-practice/env-modules/#querying","title":"Querying","text":"List currently loaded modules:
$ module list\n
Show all available modules
$ module avail\n
"},{"location":"best-practice/env-modules/#loadingunloading-modules","title":"Loading/Unloading Modules","text":"Load one module, make sure to use a specific version to avoid ambiguities.
$ module load Jannovar/0.16-Java-1.7.0_80\n
Unload one module
$ module unload Jannovar\n
Unload all modules
$ module purge\n
"},{"location":"best-practice/env-modules/#getting-help","title":"Getting Help","text":"Get help for environment modules
$ module help\n
Get help for a particular environment module
$ module help Jannovar/0.16-Java-1.7.0_80\n
"},{"location":"best-practice/env-modules/#using-your-own-module-files","title":"Using your own Module Files","text":"You can also create your own environment modules. Simply create a directory with module files and then use module use
for using the modules from the directory tree.
$ module use path/to/modules\n
"},{"location":"best-practice/env-modules/#faq-why-bash-module-command-not-found","title":"FAQ: Why -bash: module: command not found
?","text":"On the login nodes, the module
command is not installed. You should not run any computations there, so why would you need environment modules there? ;)
meg-login2$ module\n-bash: module: command not found\n
Use srun --pty bash -i
to get to one of the compute nodes.
"},{"location":"best-practice/env-modules/#auto-loading-a-set-of-modules","title":"Auto-loading a set of Modules","text":"You will certainly finding yourself using a set of programs regularly without it being part of the core cluster installation, e.g., SAMtools, or Python 3. Just putting the appropriate module load
lines in your ~/.bashrc
will generate warnings when logging into the login node. It is thus recommended to use the following snippet for loading modules automatically on logging into a compute node:
case \"${HOSTNAME}\" in\n login-*)\n ;;\n *)\n # load Python3 environment module\n module load Python/3.4.3-foss-2015a\n\n # Define path for temporary directories, don't forget to cleanup!\n # Also, this will only work after /fast is available.\n export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp\n ;;\nesac\n
"},{"location":"best-practice/project-structure/","title":"Project File System Structure","text":"Under Construction
This guide was written for the old GPFS file system and is in the process of being updated.
"},{"location":"best-practice/project-structure/#general-aims","title":"General Aims","text":"Mostly, you can separate the files in your projects/pipelines into one of the following categories:
- scripts (and their documentation)
- configuration
- data
Ideally, scripts and documentation are independent of a given project and can be separated from the rest. Configuration is project-dependent, small, and mostly does not contain any sensitive information (such as genotypes that allow for reidentification of donors). In most cases, data is large and is either also stored elsewhere or can easily be regenerated from the scripts and configuration.
There is no backup of work
and scratch
The cluster GPFS file system /fast
is not appropriate for keeping around single \"master\" copies of data. You should have a backup and archival strategy for your valuable \"master\" copy data.
"},{"location":"best-practice/project-structure/#best-practices","title":"Best Practices","text":""},{"location":"best-practice/project-structure/#scripts","title":"Scripts","text":" - Your scripts should go into version control, e.g., a Git repository.
- Your scripts should be driven by command line parameters and/or configuration such that no paths etc. are hard-coded. If, for a second data set, you need to make a copy of your scripts and adjust some variables, e.g., at the top of the script, you are doing something in a suboptimal fashion. Rather, get these values from the command line or a configuration file and only store (sensible) defaults in your script where appropriate.
- Thus, ideally your scripts are not project-specific.
"},{"location":"best-practice/project-structure/#configuration","title":"Configuration","text":" - Your configuration usually is project-specific.
- Your configuration should also go into version control, e.g., a Git repository.
In addition, you might need project-specific \"wrapper\" scripts that just call your project-independent script with the correct paths for your project. These scripts rather fall into the \"configuration\" category and should then live together with your configuration.
"},{"location":"best-practice/project-structure/#data","title":"Data","text":" - Your data should go into a location separate from your scripts and configuration.
- Ideally, the raw input data is separated from the work and output files so that you can make the input files and directories read-only and cannot accidentally damage them (see the example below).
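As a minimal example, once the raw data has been copied (the path is the one used in the examples below), you can remove all write permissions on it:
$ chmod -R a-w project/work/input\n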
Temporary files
You really should keep temporary files in a temporary directory, set the environment variable TMPDIR
appropriately and automatically clean them up (see Useful Tips: Temporary Files)
"},{"location":"best-practice/project-structure/#best-practices-in-practice","title":"Best Practices in Practice","text":"But how can we put this into practice? Below, we give some examples of how to do this. Note that for simplicity's sake we put all scripts and configuration into one directory/repository contrary to the best practices above. This is for educational purposes only and you should strive for reuseable scripts where it makes sense and separate scripts and configuration.
We will limit this to simple Bash scripts for educational purposes. You should be able to easily adapt this to your use cases.
Thus, the aim is to separate the data from the non-data part of the project such that we can put the non-data part of the project into a separate location and under version control. We call the location for non-data part of the project the home location of your project and the location for the data part of the project the work location of your project.
Overall, we have three options:
- Your processes are run in the home location and the sub directories used for execution are links into the work location using symlinks.
- Your processes are run in the work location and
- the scripts are linked into the work location using symlinks, OR
- the scripts are called from the home location, maybe through project-specific wrapper scripts.
"},{"location":"best-practice/project-structure/#example-link-configscripts-into-work-location-option-1","title":"Example: Link config/scripts into work location (Option 1)","text":"Creating the work directory and copy the input files into work/input
.
$ mkdir -p project/work/input\n$ cp /data/cephfs-1/work/projects/cubit/tutorial/input/* project/work/input\n
Creating the home space. We initialize a Git repository, properly configure the .gitignore
file and add a README.md
file.
$ mkdir -p project/home\n$ cd project/home\n$ cat <<EOF >.gitignore\n*~\n.*.sw?\nEOF\n$ cat <<EOF >README.md\n# Example Project\n\nThis is an example project with config/scripts linked into work location.\nEOF\n$ git init\n$ git add .gitignore README.md\n$ git commit -m 'Initial project'\n
We then create a simple script for executing the mapping step and a configuration file that gives the path to the index and the list of samples to process.
$ mkdir scripts\n$ cat <<\"EOF\" >scripts/run-mapping.sh\n#!/bin/bash\n\n# Unofficial Bash script mode, see:\n# http://redsymbol.net/articles/unofficial-bash-strict-mode/\nset -euo pipefail\n\n# Get directory to bash file, see\n# https://stackoverflow.com/a/4774063/84349\nSCRIPTPATH=\"$( cd \"$(dirname \"$0\")\" ; pwd -P )\"\n\n# Helper function to print help to stderr.\nhelp()\n{\n >&2 echo \"Run Mapping Step\"\n >&2 echo \"\"\n >&2 echo \"run-mapping.sh [-c config.sh] [-h]\"\n}\n\n# Parse command line arguments into bash variables.\nCONFIG=\nwhile getopts \"hs:\" arg; do\n case $arg in\n h)\n help()\n exit\n ;;\n s)\n CONFIG=$OPTARG\n ;;\n esac\ndone\n\n# Print the executed commands.\nset -x\n\n# Load default configuration, then load configuration file if any was given.\nsource $SCRIPTPATH/../config/default-config.sh\nif [[ -z \"$CONFIG\" ]]; then\n source $CONFIG\nfi\n\n# Create output directory.\nmkdir -p output\n\n# Actually perform the mapping. This assumes that you have\n# made the bwa and samtools commands available, e.g., using conda.\nfor sample in $SAMPLES; do\n bwa mem \\\n $BWA_INDEX \\\n input/${sample}_R1.fq.gz \\\n input/${sample}_R2.fq.gz \\\n | samtools sort \\\n -o output/${sample}.bam \\\n /dev/stdin\ndone\n\nEOF\n$ chmod +x scripts/run-mapping.sh\n$ mkdir -p config\n$ cat <<\"EOF\" >config/default-config.sh\nBWA_INDEX=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/hs37d5/hs37d5.fa\nSAMPLES=\nEOF\n$ cat <<\"EOF\" >config/project-config.sh\n$ BWA_INDEX comes from default configuration already\nSAMPLES=test\nEOF\n
This concludes the basic project setup. Now, to the symlinks:
$ cd ../work\n$ ln -s ../home/scripts ../home/config .\n
And, to the execution...
$ ./scripts/run-mapping.sh -c config/project-config.sh\n[...]\n
"},{"location":"best-practice/project-structure/#example-link-data-into-home-option-21","title":"Example: Link Data Into Home (Option 2.1).","text":"We can reuse the project up to the statement \"This concludes the basic project setup\" in the example for option 1.
Then, we can do the following:
$ cd ../work\n$ mkdir -p output\n\n$ cd ../home\n$ cat <<\"EOF\" >>.gitignore\n\n# Ignore all data\ninput/\nwork/\noutput/\nEOF\n$ git add .gitignore\n$ git commit -m 'Ignoring data files in .gitignore'\n$ ln -s ../work/input ../work/output .\n
And we can execute everything in the home directory.
$ ./scripts/run-mapping.sh -c config/project-config.sh\n[...]\n
"},{"location":"best-practice/project-structure/#example-wrapper-scripts-in-home-option-22","title":"Example: Wrapper Scripts in Home (Option 2.2)","text":"Again, we can reuse the project up to the statement \"This concludes the basic project setup\" in the example for option 1.
Then, we do the following:
$ cd ../work\n$ cat <<\"EOF\" >do-run-mapping.sh\n#!/bin/bash\n\n../home/scripts/run-mapping.sh \\\n -c ../home/config/project-config.sh\nEOF\n$ chmod +x do-run-mapping.sh\n
Note that the do-run-mapping.sh
script could also go into the project-specific Git repository and be linked into the work directory.
Finally, we can run our pipeline:
$ cd ../work\n$ ./do-run-mapping.sh\n[...]\n
"},{"location":"best-practice/screen-tmux/","title":"Screen and Tmux Best Pratice","text":"The program screen
allows you to detach your session from your current login session. So in case you get disconnected your screen session will stay alive.
Hint
You have to reconnect to screen on the machine where you started it. We thus recommend starting it only on the login nodes and not on a compute node.
"},{"location":"best-practice/screen-tmux/#start-and-terminat-a-screen-session","title":"Start and terminat a screen session","text":"You start a new screen
session by
$ screen\n
When you are in a screen session you can terminate it with $ exit\n
so it is gone."},{"location":"best-practice/screen-tmux/#detach-a-screen-session","title":"Detach a screen session","text":"If you want to detach your screen session, press Ctrl+a d
"},{"location":"best-practice/screen-tmux/#list-screen-sessions","title":"List screen sessions","text":"To list all your screen sessions run
$ screen -ls\n\nThere is a screen on:\n 2441.pts-1.med0236 (Detached)\n1 Socket in /var/run/screen/S-kbentel.\n
"},{"location":"best-practice/screen-tmux/#reattach-screen-session","title":"Reattach screen session","text":"To reattach a screen session run
$ screen -r screen_session_id\n
If you do not know the screen_session_id
you can get it with screen -ls
, e.g. 2441.pts-1.med0236
in the example above. You do not have to type the whole screen_session_id
only as much as is necessary to identify it uniquely. In case there is only one screen session detached it is enough to run screen -r
"},{"location":"best-practice/screen-tmux/#kill-a-detached-screen-session","title":"Kill a detached screen session","text":"Sometimes it is necessary to kill a detached screen session. This is done with the command
$ screen -X -S screen_session_id quit\n
"},{"location":"best-practice/screen-tmux/#multiple-windows-in-a-screen-session","title":"Multiple windows in a screen session","text":"It is possible to have multiple windows in a screen session. So suppose you are logged into a screen session, these are the relevant shortcuts
new win: Ctrl+a c\nnext/previous win: Ctrl+a n/p\n
To terminate a window just enter
$ exit\n
"},{"location":"best-practice/screen-tmux/#configuration-file","title":"Configuration file","text":"Here is a sensible screen configuration. Save it as ~/.screenrc
.
screenrc
"},{"location":"best-practice/screen-tmux/#fix-a-broken-screen-session","title":"Fix a broken screen session","text":"In case your screen session doesn't write to the terminal correctly, i.e. the formatting of the output is broken, you can fix it by typing to the terminal:
$ tput smam\n
"},{"location":"best-practice/software-craftmanship/","title":"General Software Craftmanship","text":"Computer software, or simply software, is a generic term that refers to a collection of data or computer instructions that tell the computer how to work, in contrast to the physical hardware from which the system is built, that actually performs the work. -- Wikipedia: Software
As you will most probably never have contact with the HPC system hardware, everything you interact with on the HPC is software. All of your scripts, your configuration files, programs installed by you or administration, and all of your data.
This should also answer the question why you should care about software and why you should try to create and use software of a minimal quality.
Software craftsmanship is an approach to software development that emphasizes the coding skills of the software developers themselves. -- Wikipedia: Software Craftmanship
This Wiki page is not mean to give you an introduction of creating good software but rather collect a (growing) list of easy-to-use and high-impact points to improve software quality. Also, it provides pointers to resources elsewhere on the internet.
"},{"location":"best-practice/software-craftmanship/#use-version-control","title":"Use Version Control","text":"Use a version control system for your configuration and your code. Full stop. Modern version control systems are Git and Subversion.
- Official Git Documentation
- Github Help
- Fix Common Git Problems
"},{"location":"best-practice/software-craftmanship/#do-not-share-gitsvn-checkouts-for-multiple-users","title":"Do not Share Git/SVN Checkouts for Multiple Users","text":"Every user should have their own Git/Subversion checkout. Otherwise you are inviting a large number of problems.
"},{"location":"best-practice/software-craftmanship/#document-your-code","title":"Document Your Code","text":"This includes
- programmer-level documentation in your source code, both inline and per code unit (e.g., function/class)
- top-level documentation, e.g., in README files.
"},{"location":"best-practice/software-craftmanship/#document-your-data","title":"Document Your Data","text":"Document where you got things from, how to re-download, etc. E.g., put a README file into each of your data top level directories.
"},{"location":"best-practice/software-craftmanship/#use-checksums","title":"Use Checksums","text":"Use MD5 or other checksums for your data. For example, md5sum
and hashdeep
are useful utilities for computing and checking them:
md5sum
How-To (tools such as sha256sum
work the same...) hashdeep
How-To
"},{"location":"best-practice/software-craftmanship/#use-a-workflow-management-system","title":"Use a Workflow Management System","text":"Use some system for managing your workflows. These systems support you by
- Detect failures and don't continue working with broken data,
- continue where you left off when someting breaks,
- make things more reproducible,
- allow distribution of jobs on the cluster.
Snakemake is a popular workflow management system widely used in Bioinformatics. A minimal approach is using Makefiles.
"},{"location":"best-practice/software-craftmanship/#understand-bash-and-shell-exit-codes","title":"Understand Bash and Shell Exit Codes","text":"If you don't want to use a workflow management system, e.g., for one-step jobs, you should at least understand Bash job management and exit codes. For example, you can use if/then/fi
in Bash together with exit codes to:
- Only call a command if the previous command succeded.
- Remove incomplete output files in case of errors.
if [[ ! -e file.md5 ]]; then\n md5sum file >file.md5 \\\n || rm -f file.md5\nfi\n
Also, learn about the inofficial Bash strict mode.
"},{"location":"best-practice/software-installation-with-conda/","title":"Software Installation with Conda","text":""},{"location":"best-practice/software-installation-with-conda/#conda","title":"Conda","text":"Users do not have the rights to install system packages on the BIH HPC cluster. For the management of bioinformatics software we therefore recommend using the conda package manager. Conda provides software in different \u201cchannels\u201d and one of those channels contains a huge selection of bioinformatics software (bioconda). Generally packages are pre-compiled and conda just downloads the binaries from the conda servers.
You are in charge of managing your own software stack, but conda makes it easy to do so. We will provide you with a description on how to install conda and how to use it. Of course there are many online resources that you can also use. Please find a list at the end of the document.
Warning
Following a change in their terms of service Anaconda Inc. has started to demand payment from research institutions for using both Anaconda, Miniconda, and the defaults channel. As a consequence, usage of this software is prohibited and we're recommending the alternative free \"miniforge\" distribution instead.
"},{"location":"best-practice/software-installation-with-conda/#premise","title":"Premise","text":"When you logged into the cluster, please make sure that you also executed srun
to log into a computation node and perform the software installation there.
"},{"location":"best-practice/software-installation-with-conda/#installing-conda","title":"Installing conda","text":"hpc-login-1:~$ srun --mem=5G --pty bash -i\nhpc-cpu-123:~$ wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh\nhpc-cpu-123:~$ bash Miniforge3-Linux-x86_64.sh -b -f -p $HOME/work/miniforge\nhpc-cpu-123:~$ eval \"$(/$HOME/work/miniforge/bin/conda shell.bash hook)\"\nhpc-cpu-123:~$ conda init\nhpc-cpu-123:~$ conda config --set auto_activate_base false\n
This will install conda to $HOME/work/miniforge
. You can change the path to your liking, but please note that your $HOME
folder has limited space. The work
subfolder however has a bigger quota. More about this here.
To make bioinformatics software available, we have to add the bioconda
channel to the conda configuration:
hpc-cpu-123:~$ conda config --add channels bioconda\n
"},{"location":"best-practice/software-installation-with-conda/#installing-software-with-conda","title":"Installing software with conda","text":"Installing packages with conda is straight forward:
hpc-cpu-123:~$ conda install <package>\n
This will install a package into the conda base environment. We will explain environments in detail in the next section. To search for a package, e.g. to find the correct name in conda or if it exists at all, issue the command:
hpc-cpu-123:~$ conda search <string>\n
To choose a specific version (conda will install the latest version that is compatible with the current installed Python version), you can provide the version as follows:
hpc-cpu-123:~$ conda install <package>=<version>\n
Please note that new conda installs may ship with a recently update Python version and not all packages might have been adapted. E.g., if you find out that some packages don't work after starting out/upgrading to Python 3.8, simply try to downgrade Python to 3.7 with conda install python=3.7
.
Hint
As resolving the dependency tree of an installation candidate can take a lot of time in Conda, especially when you are installing software from an environment.yaml
file, an alternative resolver has been presented that you can use to install software into your Conda environment. The time savings are immense and an installation that took more than an hour can be resolved in seconds.
Simply run
hpc-cpu-123:~$ conda install mamba\n
With that, you can install software into your environment using the same syntax as for Conda:
hpc-cpu-123:~$ mamba install <package>\n
"},{"location":"best-practice/software-installation-with-conda/#creating-an-environment","title":"Creating an environment","text":"Conda lets you create environments, such that you can test things in a different environment or group your software. Another common use case is to have different environments for the different Python versions. Since conda is Python-based, conflicting packages will mostly struggle with the Python version.
By default, conda will install packages into its root environment. Please note that software that does not depend on Python and is installed in the root environment, is is available in all other environments.
To create a Python 2.7 environment and activate it, issue the following commands:
hpc-cpu-123:~$ conda create -n py27 python=2.7\nhpc-cpu-123:~$ source activate py27\n(py27) hpc-cpu-123:~$\n
From now on, conda will install packages into the py27
environment when you issue the install
command. To switch back to the root environment, simply deactivate the py27
environment:
(py27) hpc-cpu-123:~$ source deactivate py27\nhpc-cpu-123:~$\n
But of course, as Python 2.7 is not supported any more by the Python Software Foundation, you should switch over to Python 3 already!
"},{"location":"best-practice/temp-files/","title":"Temporary Files","text":"Temporary Files and Slurm
See Slurm: Temporary Files for information how Slurm controls access to local temporary storage.
Often, it is necessary to use temporary files, i.e., write something out in the middle of your program, read it in again later, and then discard these files. For example, samtools sort
has to write out chunks of sorted read alignments for allowing to sort files larger than main memory.
"},{"location":"best-practice/temp-files/#environment-variable-tmpdir","title":"Environment Variable TMPDIR
","text":"Traditionally, in Unix, the environment variables TMPDIR
is used for storing the location of the temporary directory. When undefined, usually /tmp
is used.
"},{"location":"best-practice/temp-files/#temporary-directories-on-the-bih-cluster","title":"Temporary Directories on the BIH Cluster","text":"Generally, there are two locations where you could put temporary files:
/data/cephfs-1/home/users/$USER/scratch/tmp
-- inside your scratch folder on the CephFS file system; this location is available from all cluster nodes /tmp
-- on the local node's temporary folder; this location is only available on the node itself. The slurm scheduler uses Linux namespaces such that every job gets its private /tmp
even when run on the same node.
"},{"location":"best-practice/temp-files/#best-practice-use-scratchtmp","title":"Best Practice: Use scratch/tmp
","text":"Use CephFS-based TMPDIR
Generally setup your environment to use /data/cephfs-1/home/users/$USER/scratch/tmp
as filling the local disk of a node with forgotten files can cause a lot of problems.
Ideally, you append the following to your ~/.bashrc
to use /data/cephfs-1/home/users/$USER/scratch/tmp
as the temporary directory. This will also create the directory if it does not exist. Further, it will create one directory per host name which prevents too many entries in the temporary directory.
export TMPDIR=$HOME/scratch/tmp/$(hostname)\nmkdir -p $TMPDIR\n
Prepending this to your job scripts is also recommended as it will ensure that the temporary directory exists.
"},{"location":"best-practice/temp-files/#tmpdir-and-the-scheduler","title":"TMPDIR
and the scheduler","text":"In the older nodes, the local disk is a relatively slow spinning disk, in the newer nodes, the local disk is a relatively fast SSD. Further, the local disk is independent from the CephFS file system, so I/O volume to it does not affect the network or any other job on other nodes. Please note that by default, Slurm will not change your environment variables. This includes the environment variable TMPDIR
.
Slurm will automatically update temporary files in a job's /tmp
on the local file system when the job terminates. To automatically clean up temporary directories on the shared file system, use the following tip.
"},{"location":"best-practice/temp-files/#use-bash-traps","title":"Use Bash Traps","text":"You can use the following code at the top of your job script to set TMPDIR
to the location in your home directory and get the directory automatically cleaned when the job is done (regardless of successful or erroneous completion):
# First, point TMPDIR to the scratch in your home as mktemp will use thi\nexport TMPDIR=$HOME/scratch/tmp\n# Second, create another unique temporary directory within this directory\nexport TMPDIR=$(mktemp -d)\n# Finally, setup the cleanup trap\ntrap \"rm -rf $TMPDIR\" EXIT\n
"},{"location":"connecting/connecting-windows/","title":"Connecting via SSH on Windows","text":""},{"location":"connecting/connecting-windows/#install-ssh-client-for-windows","title":"Install SSH Client for Windows","text":"We recommend to use the program MobaXterm on Windows. MobaXterm is a software that allows you to connect to an SSH server, much like PuTTy, but also maintains your SSH key.
Alternative SSH Clients for Windows
- Another popular option is PuTTy but many users have problems configuring it correctly with SSH keys.
- On Windows 10, you can also install Windows Subsystem for Linux, e.g., together with WSL Terminal. This is not for the faint of heart (but great if you're a Unix head).
- Navigate to https://mobaxterm.mobatek.net/download-home-edition.html
- Download either the
- Portable edition (blue button lefthand-side, if you have no admin rights, e.g. on a Charite or MDC workstation), or
- Installer edition (green button righthand-side, requires admin rights on your computer).
- Install or unpack MobaXterm and start the software. As a Charite user, please cancel any firewall warnings that pop up.
"},{"location":"connecting/connecting-windows/#software-for-transfering-data-fromto-windows","title":"Software for transfering data from/to Windows","text":"For transfering data from/to Windows, we recommand using WinSCP. Install the latest version from here: https://winscp.net/eng/download.php
On the Login
screen of WinSCP create a new login by selecting New Site
.
Fill in the following parameters:
File protocol
: SFTP
Host name
: hpc-transfer-1.cubi.bihealth.org
or hpc-transfer-2.cubi.bihealth.org
User name
: your user name
Go to Advanced
> SSH
> Authentication
> Authentication parameters
> Private key file
and select your private ssh key file (in .ppk
format).
Press Ok
then Save
.
Press Login
to connect. It will ask for your private key passphrase, if you set one up.
If you need to convert your private ssh key file the .ppk
format, on the WinSCP login screen go to Tools
> PuTTYgen
and follow the steps here: https://docs.acquia.com/cloud-platform/manage/ssh/sftp-key/
"},{"location":"connecting/connecting-windows/#connecting-from-within-mdccharite-network","title":"Connecting from within MDC/Charite Network","text":"Click on Session
.
Click on SSH
.
In Basic SSH settings, enter a hostname (hpc-login-X.cubi.bihealth.org
, where X
is 1 or 2), check Specify username and enter your username in the textfield. Select the tab Advanced SSH settings, check Use private key and select your private SSH key file (possible choices described with the next to figures).
Select the id_rsa
file generated in Linux OR
select the id_rsa.ppk
file generated in Windows with MobaXterm.
Afterwards hit the OK button and MobaXterm will connect.
The session will be stored automatically and you can establish new connections later on, or also multiple ones at the same time, if you like.
"},{"location":"connecting/connecting/","title":"Connecting to HPC 4 Research","text":"HPC 4 Research is only available via the Charit\u00e9, MDC, and BIH internal networks. VPN access requires additional measures which are described in Connecting from External Networks.
There are two primary methods for interacting with BIH HPC:
- Through the \u201cOndemand\u201d web portal.
- Via SSH and Slurm.
This part of the documentation only described direct console access via SSH. For information regarding the web portal, please read OnDemand Portal. In case you're not familiar with SSH, you should probably start via the web portal or (if you are determined to learn) read through our SSH basics page.
"},{"location":"connecting/connecting/#in-brief","title":"In brief","text":"Follow these steps to connect to BIH HPC via the command line:
- Register an account via your PI.
- Generate a SSH key pair in Linux or Windows
- Submit your public key to Charite or to MDC.
-
Connect to one of the two login nodes.
# Charite Users\n$ ssh user_c@hpc-login-1.cubi.bihealth.org\n$ ssh user_c@hpc-login-2.cubi.bihealth.org\n\n# MDC Users\n$ ssh user_m@hpc-login-1.cubi.bihealth.org\n$ ssh user_m@hpc-login-2.cubi.bihealth.org\n
Hint
There are two login nodes, hpc-login-1
and hpc-login-2
. There are two for redundancy reasons. Please do not perform big file transfers or an sshfs
mount via the login nodes. For this purpose, we have hpc-transfer-1
and hpc-transfer-2
.
Please also read Advanced SSH for more custom scenarios how to connect to BIH HPC. If you are using a Windows PC to access BIH HPC, please read Connecting via SSH on Windows
-
Allocate resources on a computation node using Slurm. Do not compute on the login node!
# Start interactive shell on computation node\n$ srun --pty bash -i\n
-
Bonus: Configure your SSH client on Linux and Mac or Windows.
- Bonus: Connect from external networks .
tl;dr
- Web Access: https://hpc-portal.cubi.bihealth.org
-
SSH-Based Access:
# Interactive login (choose one)\nssh username@hpc-login-1.cubi.bihealth.org\nssh username@hpc-login-2.cubi.bihealth.org\nsrun --pty bash -i\n\n# File Transfer (choose one)\nsftp local/file username@hpc-transfer-1.cubi.bihealth.org:remote/file\nsftp username@hpc-transfer-2.cubi.bihealth.org:remote/file local/file\n\n# Interactive login into the transfer nodes (choose one)\nssh username@hpc-transfer-1.cubi.bihealth.org\nssh username@hpc-transfer-2.cubi.bihealth.org\n
"},{"location":"connecting/connecting/#what-is-my-username","title":"What is my username?","text":"Your username for accessing the cluster are composed of your username at your primary organization (Charit\u00e9/MDC) and a suffix:
- Charite user:
<Charite username>_c -> doej_c
- MDC user:
<MDC username>_m -> jdoe_m
"},{"location":"connecting/connecting/#how-can-i-connect-from-the-outside","title":"How can I connect from the outside?","text":"Please read Connecting from External Networks
"},{"location":"connecting/connecting/#i-have-problems-connecting","title":"I have problems connecting","text":"Please read Debugging Connection Problems
"},{"location":"connecting/connection-problems/","title":"Debugging Connection Problems","text":"When you encounter problems with the login to the cluster although we indicated that you should have access, depending on the issue, here is a list of how to solve the problem:
"},{"location":"connecting/connection-problems/#im-getting-a-connection-refused","title":"I'm getting a \"connection refused\"","text":"The full error message looks as follows:
ssh: connect to host hpc-login-1.cubi.bihealth.org port 22: Connection refused\n
This means that your computer could not open a network connection to the server.
- HPC 4 Research can be connected to from:
- Charite (cabled) network
- Charite VPN but only with Zusatzantrag B.
- MDC (cabled) network
- MDC VPN
- BIH (cabled) network
- If you think that there is no problem with any of this then please include the output of the following command in your ticket (use the server that you want to read instead of
<DEST>
): - Linux/Mac
ifconfig\ntraceroute <DEST>\n
- Windows
ipconfig\ntracepath <DEST>\n
"},{"location":"connecting/connection-problems/#i-can-connect-but-it-seems-that-my-account-has-no-access-yet","title":"I can connect, but it seems that my account has no access yet","text":"You're logging into BIH HPC cluster! (login-1)\n\n ***Your account has not been granted cluster access yet.***\n\n If you think that you should have access, please contact\n hpc-helpdesk@bih-charite.de for assistance.\n\n For applying for cluster access, contact hpc-helpdesk@bih-charite.de.\n\nuser@login-1's password:\n
Hint
This is the most common error, and the main cause for this is a wrong username. Please take a couple of minutes to read the What is my username?!
If you encounter this message although we told you that you have access and you checked the username as mentioned above, please write to hpc-helpdesk@bih-charite.de, always indicating the message you get and a detailed description of what you did.
"},{"location":"connecting/connection-problems/#im-getting-a-passphrase-prompt","title":"I'm getting a passPHRASE prompt","text":"You're logging into BIH HPC cluster! (login-1)\n\n *** It looks like your account has access. ***\n\n Login is based on **SSH keys only**, if you are getting a password prompt\n then please contact hpc-helpdesk@bih-charite.de for assistance.\n\nEnter passphrase for key '/home/USER/.ssh/id_rsa':\n
Here you have to enter the passphrase that was used for encrypting your private key. Read SSH Basics for further information of what is going on here.
"},{"location":"connecting/connection-problems/#i-can-connect-but-i-get-a-password-prompt","title":"I can connect, but I get a passWORD prompt","text":"You're logging into BIH HPC cluster! (login-1)\n\n *** It looks like your account has access. ***\n\n Login is based on **SSH keys only**, if you are getting a password prompt\n then please contact hpc-helpdesk@bih-charite.de for assistance.\n\nuser@login-1's password:\n
This is diffeerent from passPHRASE prompt
Please see I'm getting a passPHRASE prompt for more information.
When you encounter this message during a login attempt, there is an issue with your SSH key. In this case, please connect with increased verbosity to the cluster (ssh -vvv ...
) and mail the output and a detailed description to hpc-helpdesk@bih-charite.de.
"},{"location":"connecting/from-external/","title":"Connecting from External Networks","text":"This page describes how to connect to the BIH HPC from external networks (e.g., another university or from your home). The options differ depending on your home organization and are described in detail below.
- MDC users can use
- the MDC SSH gateway/hop node, or
- MDC VPN.
- Charite users can use
- the Charite VPN with \"VPN Zusatzantrag B\".
Getting Help with VPN and Gateway Nodes
Please note that the VPNs and gateway nodes are maintained by the central IT departments of Charite/MDC. BIH HPC IT cannot assist you in problems with these serves. Authorative information and documentation is provided by the central IT departments as well.
SSH Key Gotchas
You should use separate SSH key pairs for your workstation, laptop, home computer etc. As a reminder, you will have to register the SSH keys with your home IT organization (MDC or Charite). When using gateway nodes, please make sure to use SSH key agents and agent forwarding (ssh
flag \"-A
\").
"},{"location":"connecting/from-external/#mdc-users","title":"MDC Users","text":""},{"location":"connecting/from-external/#via-gateway-node","title":"Via Gateway Node","text":"Use the following command to perform a proxy jump via the MDC SSH gateway (ssh1
aka jail1
) when connecting to a login node. Note that for logging into the jail, the <MDC_USER>
is required.
$ ssh -J <MDC_USER>@ssh1.mdc-berlin.de <HPC_USER>@hpc-login-1.cubi.bihealth.org\n
Note
Please Note that the cluster login is independent of access to the MDC jail node ssh1.mdc-berlin.de.
- Access to the cluster is granted by BIH HPC IT through hpc-helpdesk@bih-charite.de.
- Access to the MDC jail node is managed by MDC IT.
"},{"location":"connecting/from-external/#via-mdc-vpn","title":"Via MDC VPN","text":"You can find the instructions for getting MDC VPN access here in the MDC intranet below the \"VPN\" heading. Please contact helpdesk@mdc-berlin.de for getting VPN access.
Install the VPN client and then start it. Once VPN has been activated you can SSH to the HPC just as from your workstation.
$ ssh user_m@hpc-login-1.cubi.bihealth.org\n
"},{"location":"connecting/from-external/#charite-users","title":"Charit\u00e9 Users","text":"Access to BIH HPC from external networks (including Eduroam) requires a Charit\u00e9 VPN connection with special access permissions.
"},{"location":"connecting/from-external/#general-charite-vpn-access","title":"General Charit\u00e9 VPN Access","text":"You need to apply for general Charit\u00e9 VPN access if you haven't done so already. The form can be found in the Charite Intranet and contains further instructions. Charit\u00e9 IT Helpdesk can help you with any questions.
"},{"location":"connecting/from-external/#zusatzantrag-b","title":"Zusatzantrag B","text":"Special permissions form B is also required for HPC access. You can find Zusatzantrag B in the Charit\u00e9 intranet. Fill it out and send it to the same address as the general VPN access form above.
Once you have been granted VPN access, start the client and connect to VPN. You will then be able to connect from your client in the VPN just as you do from your workstation.
$ ssh jdoe_c@hpc-login-1.cubi.bihealth.org\n
"},{"location":"connecting/from-external/#charite-vdi-not-recommended","title":"Charit\u00e9 VDI (Not recommended)","text":"Alternative to using Zusatzantrag B, you can also get access to the Charit\u00e9 VDI (Virtual Desktop Infrastructure). Here, you connect to a virtual desktop computer which is in the Charit\u00e9 network. From there, you can connect to the BIH HPC system.
You need to apply for extended VPN access to be able to access the BIH VDI. The form can be found here. It is important to tick Dienst(e), enter HTTPS and as target view.bihealth.org
. Please write to helpdesk@charite.de with the request to access the BIH VDI.
When the access has been set up, follow the instructions on client configuration for Windows, after logging in to the BIH VDI.
"},{"location":"connecting/ssh-basics/","title":"SSH Basics","text":""},{"location":"connecting/ssh-basics/#what-is-ssh","title":"What is SSH?","text":"SSH stands for S ecure Sh ell. It is a software that allows to establish a user-connection to a remote UNIX/Linux machine over the network and remote-control it from your local work-station.
Let's say you have an HPC cluster with hundreds of machines somewhere in a remote data-center and you want to connect to those machines to issue commands and run jobs. Then you would use SSH.
"},{"location":"connecting/ssh-basics/#getting-started","title":"Getting Started","text":""},{"location":"connecting/ssh-basics/#installation","title":"Installation","text":"Simply install your distributions openssh-client
package. You should be able to find plenty of good tutorials online. On Windows you can consider using MobaXterm (recommended) or Putty.
"},{"location":"connecting/ssh-basics/#connecting","title":"Connecting","text":"Let's call your local machine the client and the remote machine you want to connect to the server.
You will usually have some kind of connection information, like a hostname, IP address and perhaps a port number. Additionally, you should also have received your user-account information stating your user-name, your password, etc.
Follow the instructions below to establish a remote terminal-session.
If your are on Linux
Open a terminal and issue the following command while replacing all the <...>
fields with the actual data:
# default port\nssh <username>@<hostname-or-ip-address>\n\n# non-default port\nssh <username>@<hostname-or-ip-address> -p <port-number>\n
If you are on windows
Start putty.exe
, go into the Session
category and fill out the form, then click the Connect
button. Putty also allows to save the connection information in different profiles so you don't have to memorize and retype all fields every time you want to connect.
"},{"location":"connecting/ssh-basics/#ssh-keys","title":"SSH-Keys","text":"When you connect to a remote machine via SSH, you will be prompted for your password. This will happen every single time you connect and can feel a bit repetitive at times, especially if you feel that your password is hard to memorize. For those who don't want to type in their password every single time they connect, SSH keys are an alternative way of authentication.
Instead if being prompted for a password, SSH will simply use the key to authenticate. As this key file should be device specific, this also increases security of the login process.
You can generate a new key by issuing:
client:~$ ssh-keygen -t ed25519\n\n# 1. Choose file in which to save the key *(leave blank for default)*\n# 2. Choose a passphrase of at least five characters\n
"},{"location":"connecting/ssh-basics/#how-do-ssh-keys-work","title":"How do SSH-Keys work?","text":"An SSH key consists of two files, one private and one public key. The public key is installed on remote machines and can only be validated with the matching private key, which is stored on client computers. During the login process this is achieved via public-key cryptography.
Traditionally the algorithm used for this was RSA. Recently elliptic curve cryptography has been developed as a more secure and more performant alternative. We recommend the ed25519
type of SSH key.
"},{"location":"connecting/ssh-basics/#passphrase","title":"Passphrase","text":"The security problem with SSH keys is that anyone with access to the private key has full access to all machines that have the public key installed. Loosing the key or getting it compromised in another way imposes a serious security threat. Therefore, it is best to secure the private key with a passphrase. This passphrase is needed to unlock and use the private key.
Once you have your key-pair generated, you can easily change the passphrase of that key by issuing:
client:~$ ssh-keygen -p\n
"},{"location":"connecting/ssh-basics/#ssh-agent","title":"SSH-Agent","text":"In order to avoid having to type the passphrase of the key every time we want to use it, the key can be loaded into an SSH-Agent.
For instance, if you have connected to a login-node via Putty and want to unlock your private key in order to be able to access cluster nodes, you cant configure the SSH-Agent.
client:~$ source <(ssh-agent)\n
(The above command will load the required environment variables of the SSH-Agent into your shell environment, effectively making the agent available for your consumption.)
Next, you can load your private key:
client:~$ ssh-add\n
(You will be prompted for the passphrase of the key)
You can verify that the agent is running and your key is loaded by issuing:
client:~$ ssh-add -l\n# 'l' as in list-all-loaded-keys\n
(The command should print at least one key, showing the key-size, the hash of the key-fingerprint and the location of the file in the file-system.)
Since all home-directories are shared across the entire cluster and you created your key-pair inside your home-directory, you public-key (which is also in your home-directory) is automatically installed on all other cluster nodes, immediately. Try connecting to any cluster node. It should not prompt your for a password.
There is nothing you have to do to \"unload\" or \"lock\" the key-file. Simply disconnect.
"},{"location":"connecting/advanced-ssh/linux/","title":"Connecting via SSH on Unix","text":""},{"location":"connecting/advanced-ssh/linux/#activating-your-key-in-the-ssh-key-agent","title":"Activating your Key in the SSH Key Agent","text":"Note
The big Linux distributions automatically manage ssh-agent for you and unlock your keys at login time. If this doesn't work for you, read on.
ssh-agent
caches your SSH keys so that you do not need to type your passphrase every time it is used. Activate it by making sure ssh-agent
runs in the background and add your key:
$ eval \"$(ssh-agent -s)\"\n$ ssh-add\n
or if you chose a custom key name, specify the file like so:
$ ssh-add ~/.ssh/mdc_id_rsa\n
"},{"location":"connecting/advanced-ssh/linux/#macos","title":"MacOS","text":"If you run into problems that your key is not accepted when connecting from MacOS, please use:
$ ssh-add --apple-use-keychain\n
"},{"location":"connecting/advanced-ssh/linux/#configure-ssh-client","title":"Configure SSH Client","text":"You can define a personal SSH configuration file to make connecting to the cluster more comfortable by reducing the typing necessary by a lot. Add the following lines to the file ~/.ssh/config
file. Replace USER_NAME
with your cluster user name. You can also adapt the Host naming as you like.
Host bihcluster\n HostName hpc-login-1.cubi.bihealth.org\n User USER_NAME\n\nHost bihcluster2\n HostName hpc-login-1.cubi.bihealth.org\n User USER_NAME\n
Now, you can do type the following (and you don't have to remember the host name of the login node any more).
$ ssh bihcluster\n
This configuration works if you are inside Charit\u00e9, the Charit\u00e9 VPN, or MDC.
"},{"location":"connecting/advanced-ssh/linux/#mdc-users-jail-node","title":"MDC users: Jail node","text":"If you have an MDC user account and want to connect from the outside, you can use the following ~/.ssh/config
lines to set up a ProxyJump via the MDC SSH jail.
Host mdcjail\n HostName ssh1.mdc-berlin.de\n User MDC_USER_NAME\n
Now you can run
$ ssh -J mdcjail bihcluster1\n
If you are always connecting from outside the internal network, you can also add a permanent ProxyJump to the SSH configuration like so:
Host bihcluster\n HostName hpc-login-1.cubi.bihealth.org\n User USER_NAME\n ProxyJump mdcjail\n
"},{"location":"connecting/advanced-ssh/linux/#connecting-with-another-computerlaptop","title":"Connecting with another computer/laptop","text":"If you need to connect to the cluster from another computer than the one that contains the SSH keys that you submitted for the cluster login, you have two possibilities.
- Generate another SSH key pair and submit the public part as described beforehand.
- Copy your private part of the SSH key (
~/.ssh/id_rsa
) to the second computer into the same location.
Danger
Do not leave the key on any USB stick. Delete it after file transfer. This is a sensible part of data. Make sure that the files are only readable for you.
$ cd ~/.ssh\n$ chmod g-rwx id_rsa*\n$ ssh-add id_rsa\n
"},{"location":"connecting/advanced-ssh/linux/#file-system-mount-via-sshfs","title":"File System mount via sshfs","text":"$ sshfs <USERNAME>@hpc-transfer-1.cubi.bihealth.org:/ <MOUNTPOINT>\n
hpc-transfer-1:
follows the structure <host>:<directory>
starting in the user home. <MOUNTPOINT>
must be an empty but existing and readable directory on your local computer
"},{"location":"connecting/advanced-ssh/linux/#macos_1","title":"MacOS","text":"Make sure you have both OSXFUSE and SSHFS installed. You can get both from here: https://osxfuse.github.io/ or the most recent version via Homebrew:
$ brew cask install osxfuse; brew install sshfs; brew link --overwrite sshfs\n
The last command is optional and unlinks any pre-existing links to older versions of sshfs. Now you can run $ sshfs -o follow_symlinks <USERNAME>@hpc-transfer-1<X>.cubi.bihealth.org:<directory_relative_to_Cluster_root> <MOUNTPOINT> -o volname=<BIH-FOLDER> -o allow_other,noapplexattr,noappledouble\n
"},{"location":"connecting/advanced-ssh/linux/#x11","title":"X11","text":"Do you really need to run a graphical application on the cluster?
Please note that running more complex Java applications, such as IGV may be not very efficient because of the connection speed. In most cases you can run them on your local workstation by mounting them via SSHFS.
Connect to one of the login nodes using X11 forwarding:
$ ssh -X -C -t <USERNAME>@hpc-login-1.bihealth.org\n
Once you get a login prompt, you can use the srun
command with the --x11
parameter to open a X11 session to a cluster node:
$ srun --pty --x11 bash\n
And finally you can start your X11 application, e.g.:
$ xterm\n
After a while Visual Terminal should start:
"},{"location":"connecting/advanced-ssh/overview/","title":"Advanced SSH usage","text":"Here we describe custom scenarios for using SSH to connect to BIH HPC. To keep it consise, this section is divided into separate documents for
- Linux and
- Windows users.
"},{"location":"connecting/advanced-ssh/windows/","title":"Windows","text":""},{"location":"connecting/advanced-ssh/windows/#mounting-the-fs-from-within-the-mdccharite-network","title":"Mounting the FS from within the MDC/Charite Network","text":"Danger
Mounting ssh on Windows is currently discouraged since relevant software is outdated (see also hpc-talk). Also, in most cases it is not really necessary to have a constant mount. For normal data transfer please use WinSCP instead.
Once WinSshFS is started, an icon will be added to your taskbar:
Left-clicking that icon will bring up a window. If not, right click the taskbar icon, select Show Manager
and click Add
in the menu.
Fill out the marked fields:
- Drive Name: Name that will show up in the windows explorer
- Host:
hpc-transfer-1.cubi.bihealth.org
- Username: Your cluster username
- Authentication method:
PrivateKey
. Select the id_rsa
private key, not the .ppk
format that is provided by PuTTY. Enter the password that you used to secure your key with. - Directory: Cluster directory that will be mounted, you can choose any directory you have access to on the cluster.
Then click Save
and then Mount
.
Open the explorer. A new drive with the name you gave should show up:
Finished!
"},{"location":"connecting/advanced-ssh/windows/#connecting-via-mdc-jail-node","title":"Connecting via MDC Jail Node","text":" -
This requires an active MDC account!
-
Additional to the steps above, click on the tab Network settings
.
- Check Connect through SSH gateway (jump host) and in the text field Gateway SSH server enter
ssh1.mdc-berlin.de
and in the field User your MDC username. - Check Use private key and select the SSH key that you uploaded to the MDC persdb (this might differ from your cluster key!).
- Click OK
"},{"location":"connecting/advanced-ssh/windows/#x11","title":"X11","text":"Do you really need to run a graphical application on the cluster?
Please note that running more complex Java applications, such as IGV may be not very efficient because of the connection speed. In most cases you can run them on your local workstation by mounting them via SSHFS.
Start MobaXterm, it should automatically fetch your saved Putty sessions as you can see on screen below:
Connect to one of the login nodes, by double-click on saved profile, and then use srun --pty --x11 bash
command to start X11 session to one of the nodes:
Finally, start X11 application (below example of starting Visual Terminal):
"},{"location":"connecting/generate-key/linux/","title":"Generating an SSH Key in Linux","text":" - You might already have one, check whether a file
~/.ssh/id_xxx.pub
is present. - Otherwise, create key using the following command (marking your key with your email address will make it easier to reidentify your key later on):
$ ssh-keygen -t ed25519 -C \"your_email@example.com\"\n
- Use the default location for your key
- Enter a passphrase twice to encrypt your key
What is a key passphrase?
You should set a passphrase when generating your key pair. It is used for encrypting your private key in case it is stolen or lost. When using the key for login, you will have to enter the passphrase. Many desktop environments offer ways to automatically unlock your key on login.
Read SSH Basics for more information.
The whole session should look something like this:
host:~$ ssh-keygen -t ed25519 -C \"your_email@example.com\"\nGenerating public/private ed25519 key pair.\nEnter file in which to save the key (/home/USER/.ssh/id_ed25519): \nCreated directory '/home/USER/.ssh'.\nEnter passphrase (empty for no passphrase):\nEnter same passphrase again: \nYour identification has been saved in /home/USER/.ssh/id_ed25519.\nYour public key has been saved in /home/USER/.ssh/id_ed25519.pub.\nThe key fingerprint is:\nSHA256:Z6InW1OYt3loU7z14Kmgy87iIuYNr1gJAN1tG71D7Jc your_email@example.com\nThe key's randomart image is:\n+--[ED25519 256]--+\n|.. . . o |\n|. . . + + |\n|. . = . . |\n|. . +oE. |\n|. So= o o |\n| . . . * = + + |\n| + o + B o o .|\n| oo+. .B + + . |\n|.ooooooo*. . |\n+----[SHA256]-----+\n
The file content of ~/.ssh/id_ed25519.pub
should look something like this):
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFzuiaSVD2j5y6RlFxOfREB/Vbd+47ABlxF7du5160ZH your_email@example.com\n
"},{"location":"connecting/generate-key/linux/#submit-your-key","title":"Submit Your Key","text":"As a next step you need to submit the SSH key use these links as:
- Charite user
- MDC user
"},{"location":"connecting/generate-key/windows/","title":"Generating an SSH Key in Windows","text":"Prerequisite: Installing an SSH Client
Please install an SSH client for Windows first.
"},{"location":"connecting/generate-key/windows/#generate-the-key","title":"Generate the Key","text":"Click on Tools
and MobaKeyGen (SSH key generator)
In the section Parameters make sure to set the following properties:
- Type of key to generate:
RSA
(this is the SSH-2
protocol) - Number of bits in a generated key:
4096
If all is set, hit the Generate button.
During generation, move the mouse cursor around in the blank area.
When finished, make sure to protect your generated key with a passphrase. Save the private and public key. The default name under Linux for the public key is id_rsa.pub
and id_rsa
for the private key, but you can name them however you want (the .pub
is NOT automatically added). Note that in the whole cluster wiki we will use this file naming convention. Also note that the private key will be stored in Putty format (.ppk
, this extension is added automatically).
What is your key's passphrase?
You should set a passphrase when generating your private key. This passphrase is used for encrypting you private key to protect it against the private key file theft/being lost. When using the key for login, you will have to enter it (or the first time you load it into the SSH key agent). Note that when being asked for the passphrase this does not occur on the cluster (and is thus unrelated to it) but on your local computer.
Also see SSH Basics for more information.
The gibberish in the textbox is your public key in the format how it has to be submitted to the MDC and Charite (links for this step below). Thus, copy this text and paste it to the SSH-key-submission-web-service of your institution.
Store the private key additionally in the OpenSSH format. To do so, click Conversions
and select Export OpenSSH key
. To be consistent, give the file the same name as your .ppk
private key file above (just without the .ppk
).
"},{"location":"connecting/generate-key/windows/#summary","title":"Summary","text":"To summarize, you should end up with three files:
id_rsa.pub
The public key file, it is not required if you copy and submit the SSH public key as described above and in the links below. id_rsa.ppk
This file is only needed if you plan to use Putty. id_rsa
This is your private key and the one and only most important file to access the cluster. It will be added to the sessions in MobaXterm and WinSSHFS (if required).
"},{"location":"connecting/generate-key/windows/#submit-your-key","title":"Submit Your Key","text":"As a next step you need to submit the SSH key use these links as:
- Charite user
- MDC user
"},{"location":"connecting/submit-key/charite/","title":"Submitting an SSH Key to Charite","text":"As of February 2020, SSH key submission not accepted via email anymore. Instead, use the process outline here.
For any help, please contact helpdesk@charite.de (as this site is maintained by Charite GB IT).
"},{"location":"connecting/submit-key/charite/#charite-zugangsportal","title":"Charite Zugangsportal","text":"Key are submitted in the Charite Zugangsportal. As of Feb 4, you have to use the \"test\" version for this.
Go to zugang.charite.de and login.
Follow through the login page until you reach the main menu (it's tedious but we belive in you ;) Click the \"SSH Keys\" button.
Paste your SSH key (starting with ssh-rsa
) and ending with the label (usually your email, e.g., john.doe@charite.de
) into the box (1) and press append (2). By default, the key can be found in the file ~/.ssh/id_rsa.pub
in Linux. If you generated the key in Windows, please paste the copied key from the text box. Repeat as necessary. Optionally, go back to the main menu (3) when done.
If you have generated your SSH key with PuTTy, you must right click on the ppk-file, then choose \"Edit with PuTTYgen\" in the right click menu. Enter your passphrase. Then copy the SSH key out of the upper box (already highlighted in blue).
Check if the key has been added
After you clicked append
, your key will be printed back to you (as shown in the blurred picture above).
If your key is not printed back to you then adding the SSH key to zugang.charite.de was not successful. In this case please contact helpdesk@charite.de for assistance as they (Charite GB IT) maintains that system and it is out of our (BIH HPC IT) control.
Once your key has been added, it will take a few minutes for the changes to go live.
"},{"location":"connecting/submit-key/mdc/","title":"Submitting an SSH Key to MDC","text":"For MDC users, SSH keys are submitted through the MDC PersDB interface (see below). PersDB is not maintained by BIH HPC IT but by MDC IT.
Warning
The SSH keys are only activated over night (but automatically). This is out of our control. Contact helpdesk@mdc-berlin.de for more information.
"},{"location":"connecting/submit-key/mdc/#detour-using-mdc-vmware-view-to-get-into-mdc-intranet","title":"Detour: Using MDC VMWare View to get into MDC Intranet","text":"In case you are not within the MDC network, connect to MDC VMWare view first and use the web brower in the Window session.
- Go to the MDC VMWare View
- Click \"VMWare Web Viewer\"
- Login with MDC username and password.
- Select Windows 7.
- Open Firefox or Internet Browser
"},{"location":"connecting/submit-key/mdc/#enter-mdc-persdb","title":"Enter MDC PersDB","text":" - If you are inside MDC network, you can start here, OR
- If you have the MDC VMWare Web Viewer open, start here.
"},{"location":"connecting/submit-key/mdc/#log-into-mdc-persdb","title":"Log into MDC PersDB","text":" - Open https://persdb.mdc-berlin.net/login
- Login with MDC username and password again
"},{"location":"connecting/submit-key/mdc/#click-on-mein-profil","title":"Click on \"Mein Profil\"","text":""},{"location":"connecting/submit-key/mdc/#click-on-zusaetzliches-ssh-public-key-bearbeiten","title":"Click on \"Zusaetzliches (ssh public key) -> Bearbeiten\"","text":" - This is the middle item.
"},{"location":"connecting/submit-key/mdc/#click-neue-zusaetzliche-eigenschaft-anlegen","title":"Click \"Neue zusaetzliche Eigenschaft anlegen\"","text":" - Most probably, you don't have any entries yet.
"},{"location":"connecting/submit-key/mdc/#activate-the-vmware-view-menu","title":"Activate the VMWare View Menu","text":" - This is the only way to get your SSH key into the clipboard of the Windows instance that has MDC PersDB open. :rolleyes:
"},{"location":"connecting/submit-key/mdc/#activate-clipboard-window","title":"Activate Clipboard Window","text":" - Click (middle) clipboard button.
- The clipboard window appears.
- Close the VMWare View window again.
"},{"location":"connecting/submit-key/mdc/#register-ssh-key","title":"Register SSH key","text":" - Paste SSH key from
~/.ssh/id_rsa.pub
into the clipboard window. Ensure that the whole file content is there (it should end with your email address). If you generated the key in Windows, please paste the copied key from the text box. - Left-click the \"Inhalt\" text box to put the cursor there
- Right-click the \"Inhalt\" text box, make the context menu appear, and click \"Einfuegen\"
- Click Submit
"},{"location":"connecting/submit-key/mdc/#youre-done","title":"You're Done","text":"Thus, you will only be able to connect the next day. - Bask in the glory of having completed this process.
"},{"location":"cubit/","title":"Overview","text":"The static data installation can be found at /data/cephfs-1/work/projects/cubit/18.12/static_data
.
The static data directory contains a sub-directory for the genomes, the precomputed index files for several different popular mapping tools and associated annotation (GFF and GTF files) from Ensembl and GENCODE for each of the available genomes. The top-level directory structure is as follows:
static_data/
annotations
app_support
db
exome_panel
exon_list
precomputed
reference
"},{"location":"cubit/annotations/","title":"Annotation Data","text":"The following Ensembl and GENCODE versions corresponding to the indicated reference genomes will be made available on the cluster.
Database Version Reference Genome Ensembl 65 NCBIM37 (Ensembl release corresponding to GENCODE M1) Ensembl 67 NCBIM37 (Ensembl release for sanger mouse genome assembly) Ensembl 68 GRCm38 (Ensembl release for sanger mouse genome assembly) Ensembl 74 GRCh37 (Ensembl release for GENCODE 19) Ensembl 75 GRCh37 (Latest release for GRCh37) Ensembl 79 GRCh38 (Ensembl release for GENCODE 22) Ensembl 80 GRCh38 (Ensembl release corresponding to GENCODE 22) Ensembl 80 GRCm38 (Ensembl release corresponding to GENCODE M1) GENCODE M1 NCBIM37 (No gff3 file) GENCODE M5 GRCm38 GENCODE 19 current for GRCh37 GENCODE 22 current for GRCh38 The annotation files associated with the indicated genomes can be accessed in the following directories:
static_data/annotation\n\u251c\u2500\u2500 ENSEMBL\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 65\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 67\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 68\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCm38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 74\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 75\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 79\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 80\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCm38\n\u2514\u2500\u2500 GENCODE\n \u251c\u2500\u2500 19\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n \u251c\u2500\u2500 22\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n \u251c\u2500\u2500 M1\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n \u2514\u2500\u2500 M5\n \u2514\u2500\u2500 GRCm38\n
"},{"location":"cubit/app-support/","title":"Cubit Static Data: Application Support","text":"The static_data/app_support
directory contains all data files that are shipped with a software package installed in cubit. For blast
this is not complete and more databases can be added upon request.
static_data/app_support\n\u251c\u2500\u2500 blast\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 variable\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 nt\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 refseq_protein\n\u251c\u2500\u2500 Delly\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.6.5\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.6.7\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.7.5\n\u251c\u2500\u2500 GATK_bundle\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2.8\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u251c\u2500\u2500 Jannovar\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.14\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.16\n\u251c\u2500\u2500 kraken\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.10.5-cubi20160426\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 bacvir\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 minikraken_20141208\n\u251c\u2500\u2500 Oncotator\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 v1_ds_Jan262015\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 1000genome_db\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 achilles\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cancer_gene_census\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ccle_by_gene\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ccle_by_gp\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 clinvar\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cosmic\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cosmic_fusion\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cosmic_tissue\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dbNSFP_ds\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dbsnp\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dna_repair_genes\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 esp6500SI_v2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 esp6500SI_v2_coverage\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 familial\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 gencode_out2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 gencode_xrefseq\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hgnc\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mutsig\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 oreganno\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 override_lists\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ref_hg\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 simple_uniprot\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 so_terms\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 tcgascape\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 tumorscape\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 uniprot_aa_annotation\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 uniprot_aa_xform\n\u2514\u2500\u2500 SnpEff\n \u2514\u2500\u2500 4.1\n \u2514\u2500\u2500 data\n \u251c\u2500\u2500 GRCh37.75\n \u251c\u2500\u2500 GRCh38.79\n \u251c\u2500\u2500 GRCm38.79\n \u251c\u2500\u2500 hg19\n \u251c\u2500\u2500 hg38\n \u2514\u2500\u2500 mm10\n
"},{"location":"cubit/databases/","title":"Databases","text":"The file formats in the static_data/db
folder are mostly .vcf
or .bed
files. We provide the following databases:
Database Version Reference genome COSMIC v72 GRCh37 dbNSFP 2.9 GRCh37/hg19 dbSNP b128 mm9 dbSNP b128 NCBIM37 dbSNP b142 GRCh37 dbSNP b144 GRCh38 dbSNP b147 GRCh37 dbSNP b147 GRCh38 dbSNP b150 GRCh37 dbSNP b150 GRCh38 DGV 2015-07-23 GRCh37 ExAC release0.3 GRCh37/hg19 ExAC release0.3.1 GRCh37/hg19 giab NA12878_HG001/NISTv2.19 GRCh37 goldenpath variable GRCh37 goldenpath variable hg19 goldenpath variable mm9 goldenpath variable NCBIM37 SangerMousegenomesProject REL-1211-SNPs_Indels mm9 SangerMousegenomesProject REL-1211-SNPs_Indels NCBIM37 UK10K cohort REL-2012-06-02 GRCh37 The directory structure is as follows:
static_data/db\n\u251c\u2500\u2500 COSMIC\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 v72\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u251c\u2500\u2500 dbNSFP\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2.9\n\u251c\u2500\u2500 dbSNP\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b128\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b142\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b144\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 b147\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n\u251c\u2500\u2500 DGV\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2015-07-23\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u251c\u2500\u2500 ExAC\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 release0.3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 release0.3.1\n\u251c\u2500\u2500 giab\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NA12878_HG001\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NISTv2.19\n\u251c\u2500\u2500 goldenpath\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 variable\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u251c\u2500\u2500 SangerMouseGenomesProject\n\u2502 \u2514\u2500\u2500 REL-1211-SNPs_Indels\n\u2502 \u251c\u2500\u2500 mm9\n\u2502 \u2514\u2500\u2500 NCBIM37\n\u2514\u2500\u2500 UK10K_cohort\n \u2514\u2500\u2500 REL-2012-06-02\n
"},{"location":"cubit/exomes-panels/","title":"Exomes and Panels","text":"These exome panel data are proprietary and downloaded after registration. In case you want to use them, be sure you have access to them by creating an account at Agilent or Roche to not run into legal trouble.
static_data/exome_panel\n\u251c\u2500\u2500 Agilent\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 SureSelect_Human_All_Exon_V4\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 SureSelect_Human_All_Exon_V5\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 SureSelect_Human_All_Exon_V6\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 SureSelect_Mouse_All_Exon_V1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2514\u2500\u2500 Roche\n \u2514\u2500\u2500 SeqCap_EZ_MedExome\n \u2514\u2500\u2500 GRCh37\n
"},{"location":"cubit/exon-lists/","title":"Exon Lists","text":"Here we provide exon lists for some human genome assemblies in the .bed
-file format. Each file exists with the original coordinates contained and as a version with 10 bp padded on each site (suffix: _plus_10bp.bed
). The folder structure is self-explanatory:
static_data/exon_list\n\u251c\u2500\u2500 CCDS\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 15\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 18\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hg38\n\u2514\u2500\u2500 ENSEMBL\n \u251c\u2500\u2500 74\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n \u2514\u2500\u2500 75\n \u2514\u2500\u2500 GRCh37\n
"},{"location":"cubit/index-files/","title":"Precomputed Index Files","text":"Index files for
- BWA version 0.7.12 and 0.7.15,
- bowtie2 version 2.2.5 and
- STAR version 2.4.1d
have been precomputed. The index corresponding to each genome is stored in the following directory structure with the above mentioned reference genomes as subfolders (listed here only for Bowtie/1.1.2
, same subfolders for the remaining programs):
static_data/precomputed\n\u251c\u2500\u2500 Bowtie\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 1.1.2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 danRer10\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dm6\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ecoli\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCm38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg18\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm10\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 phix\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 sacCer3\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 UniVec\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 UniVec_Core\n\u251c\u2500\u2500 Bowtie2\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2.2.5\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n\u251c\u2500\u2500 BWA\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.12\n\u2502\u00a0\u00a0 \u2502 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.7.15\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n\u2514\u2500\u2500 STAR\n \u2514\u2500\u2500 2.4.1d\n \u2514\u2500\u2500 default\n \u00a0\u00a0 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n
"},{"location":"cubit/references/","title":"Reference Sequences","text":""},{"location":"cubit/references/#ncbi-mouse-reference-genome-assemblies","title":"NCBI mouse reference genome assemblies","text":"We provide the NCBI mouse reference assembly used by the Sanger Mouse Genomics group for NCBIM37 and GRCm38. This is a reliable source where the appropriate contigs have already been selected by experts. NCBIM37 is annotated with Ensembl release 67 and GRCm38 with Ensembl release 68.
"},{"location":"cubit/references/#ucsc-mouse-reference-genome-assemblies","title":"UCSC mouse reference genome assemblies","text":"The assembly sequence is in one file per chromosome and is available for mm9 and mm10. We concatenated all the chromosome files to one final fasta file for each genome assembly.
"},{"location":"cubit/references/#ncbi-human-reference-genome-assemblies","title":"NCBI human reference genome assemblies","text":" - GRCh37: We provide the version used by the 1000genomes project as it is widely used and recommended. The chromosomes and contigs are already concatenated.
- g1k_phase1/hs37: This reference sequence contains the autosomal and both sex chromosomes, an updated mitochondrial chromosome as well as \"non-chromosomal supercontigs\". The README explains the method of construction.
- g1k_phase2/hs37d5: In addition to these sequences the phase 2 reference sequence contains the herpes virus genome and decoy sequences for improving SNP calling.
- GRCh38: The GRCh38 assembly offers an \"analysis set\" that was created to accommodate next generation sequencing read alignment pipelines. We provide the three analysis sets from the NCBI.
- hs38/no_alt_analysis_set: The chromosomes, mitochondrial genome, unlocalized scaffolds, unplaced scaffolds and the Epstein-Barr virus sequence which has been added as a decoy to attract contamination in samples.
- hs38a/full_analysis_set: the alternate locus scaffolds in addition to all the sequences present in the no_alt_analysis_set.
- hs38DH/full_plus_hs38d1_analysis_set: contains the human decoy sequences from hs38d1 in addition to all the sequences present in the full_analysis set. More detailed information is available in the README.
"},{"location":"cubit/references/#ucsc-human-reference-genome-assemblies","title":"UCSC human reference genome assemblies","text":"The assembly sequence is in one file per chromosome is available for hg18, hg19 and hg38. We concatenated all the chromosome files to one final fasta file for each genome assembly. Additionally, in the subfolder chromosomes
we keep the chromosome fasta files separately for hg18 and hg19.
"},{"location":"cubit/references/#other-reference-genomes","title":"Other reference genomes","text":" - danRer10: UCSC/GRC zebrafish build 10
- dm6: UCSC/GRC Drosophila melanogaster build 6
- ecoli:
- GCA_000005845.2_ASM584v2: Genbank Escherichia coli K-12 subst. MG1655 genome
- genomemedley:
- 1: Concatenated genome of hg19, dm6, mm10; Chromosomes are tagged with corresponding organism
- PhiX: Control genome that is used by Illumina for sequencing runs
- sacCer3: UCSC's Saccharomyces cerevisiae genome build 3
- UniVec:
- 9: NCBI's non redundant reference of vector sequences, adapters, linkers and primers commonly used in the process of cloning cDNA or genomic DNA (build 9)
- UniVec_Core
- 9: A subset of UniVec build 9
The following directory structure indicates the available genomes. Where there isn't a name for the data set, either the source (e.g. sanger - from the Sanger Mouse Genomes project) or the download date is used to name the sub-directory.
static_data/reference\n\u251c\u2500\u2500 danRer10\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 dm6\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 ecoli\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GCA_000005845.2_ASM584v2\n\u251c\u2500\u2500 genomemedley\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 1\n\u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 g1k_phase1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 g1k_phase2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hs37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hs37d5\n\u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hs38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hs38a\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hs38DH\n\u251c\u2500\u2500 GRCm38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 sanger\n\u251c\u2500\u2500 hg18\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 hg38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 mm10\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 sanger\n\u251c\u2500\u2500 phix\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 illumina\n\u251c\u2500\u2500 sacCer3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 UniVec\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 9\n\u2514\u2500\u2500 UniVec_Core\n \u2514\u2500\u2500 9\n
"},{"location":"help/faq/","title":"Frequently Asked Questions","text":""},{"location":"help/faq/#where-can-i-get-help","title":"Where can I get help?","text":" - Talk to your colleagues!
- Have a look at our forums at HPC-talk to see if someone already solved the same problem. If not, create a new topic. Administrators, CUBI, and other users can see and answer your question.
- For problems while connecting and logging in, please contact helpdesk@mdc-berlin.de or helpdesk@charite.de.
- For problems with BIH HPC please contact [hpc-helpdesk@bih-charite.de].
"},{"location":"help/faq/#i-cannot-connect-to-the-cluster-whats-wrong","title":"I cannot connect to the cluster. What's wrong?","text":"Please see the section Connection Problems.
"},{"location":"help/faq/#connecting-to-the-cluster-takes-a-long-time","title":"Connecting to the cluster takes a long time.","text":"The most probable cause for this is a conda installation which defaults to loading the (Base) environment on login. To disable this behaviour you can run:
$ conda config --set auto_activate_base false\n
You can also run the bash shell in verbose mode to find out exactly which command is slowing down login:
$ ssh user@hpc-login-1.cubi.bihealth.org bash -iv\n
"},{"location":"help/faq/#what-is-the-difference-between-max-and-bih-cluster-what-is-their-relation","title":"What is the difference between MAX and BIH cluster? What is their relation?","text":"Administrativa
- The BIH HPC 4 Research cluster of the Berlin Institute of Health (BIH) is located in Buch and operated by BIH HPC IT. The cluster is open for users of both BIH/Charite and MDC.
- The MAX cluster is the cluster of the Max Delbrueck Center (MDC) in Buch. This cluster is used by the researchers at MDC and integrates with a lot of infrastructure of the MDC.
Request for both systems are handled separately, depending on the user's affiliation with research/service groups.
Hardware and Systems
- Both clusters consist of similar hardware for the compute nodes and both feature a DDN system at different number of nodes and different storage volume.
- Both clusters run CentOS/rocky but at potentially different version.
- BIH HPC uses the Slurm workload manager whereas MAX uses Univa Grid Engine.
- The BIH cluster has a significantly faster internal network (40GB/s optical).
Bioinformatics Software
- On the BIH cluster, users can install their own (bioinformatics) software in their user directory.
- On the MAX cluster, users can also install their own software or use software provided by Altuna Akalin's group at MDC.
"},{"location":"help/faq/#my-ssh-sessions-break-with-packet_write_wait-connection-to-xxx-broken-pipe-how-can-i-fix-this","title":"My SSH sessions break with \"packet_write_wait: Connection to XXX : Broken pipe
\". How can I fix this?","text":"Try to put the following line at the top of your ~/.ssh/config
.
ServerAliveInterval 30\n
This will make ssh
send an empty network package to the server. This will prevent network hardware from thinking your connection is unused/broken and terminating it.
If the problem persists, please report it to hpc-helpdesk@bih-charite.de.
"},{"location":"help/faq/#my-job-terminated-before-being-done-what-happened","title":"My job terminated before being done. What happened?","text":"First of all, look into your job logs. In the case that the job was terminated by Slurm (e.g., because it ran too long), you will find a message like this at the bottom. Please look at the end of the last line in your log file.
slurmstepd: error: *** JOB <your job id> ON med0xxx CANCELLED AT 2020-09-02T21:01:12 DUE TO TIME LIMIT ***\n
This indicates that you need to need to adjust the --time
limit to your sbatch
command.
slurmstepd: error: Detected 2 oom-kill event(s) in step <your job id>.batch cgroup.\nSome of your processes may have been killed by the cgroup out-of-memory handler\n
This indicates that your job tries to use more memory than has been allocated to it. Also see Slurm Scheduler: Memory Allocation
Otherwise, you can use sacct -j JOBID
to read the information that the job accounting system has recorded for your job. A job that was canceled (indicated by CANCELED
) by the Slurm job scheduler looks like this (ignore the COMPLETED
step that is just some post-job step added by Slurm automatically).
# sacct -j _JOBID_\n JobID JobName Partition Account AllocCPUS State ExitCode\n------------ ---------- ---------- ---------- ---------- ---------- --------\n_JOBID_ snakejob.+ medium hpc-ag-xx+ 4 TIMEOUT 0:0\n_JOBID_.bat+ batch hpc-ag-xx+ 4 CANCELLED 0:15\n_JOBID_.ext+ extern hpc-ag-xx+ 4 COMPLETED 0:0\n
Use the --long
flag to see all fields (and probably pipe it into less
as: sacct -j JOBID --long | less -S
). Things to look out for:
- What is the exit code?
- Is the highest recorded memory usage too high/higher than expected (field
MaxRSS
)? - Is the running time too long/longer than expected (field
Elapsed
)?
Note that --long
does not show all fields. For example, the following tells us that the given job was above its elapsed time which caused it to be killed.
# sacct -j _JOBID_ --format Timelimit,Elapsed\n Timelimit Elapsed\n---------- ----------\n 01:00:00 01:00:12\n 01:00:13\n 01:00:12\n
Use man sacct
, sacct --helpformat
, or see the Slurm Documentation for options for the --format
field of sacct
.
"},{"location":"help/faq/#im-getting-a-bus-error-core-dumped","title":"I'm getting a \"Bus error (core dumped)\"","text":"This is most probably caused by your job being allocated insufficient memory. Please see the memory part of the answer to My job terminated before being done. What happened?
"},{"location":"help/faq/#how-can-i-create-a-new-project","title":"How can I create a new project?","text":"You can create a project if you are either a group leader of an AG or a delegate of an AG. If this is the case, please follow these instructions.
"},{"location":"help/faq/#i-cannot-create-pngs-in-r","title":"I cannot create PNGs in R","text":"For using the png
method, you need to have an X11 session running. This might be the case if you logged into a cluster node using srun --x11
if configured correctly but is not the case if you submitted a bash job. The solution is to use xvfb-run
(xvfb = X11 virtual frame-buffer).
Here is the content of an example script:
$ cat img.R\n#!/usr/bin/env Rscript\n\npng('cars.png')\ncars <- c(1, 3, 6, 4, 9)\nplot(cars)\ndev.off()\n
Here, it fails without X11:
$ ./img.R\nError in .External2(C_X11, paste(\"png::\", filename, sep = \"\"), g$width, :\n unable to start device PNG\nCalls: png\nIn addition: Warning message:\nIn png(\"cars.png\") : unable to open connection to X11 display ''\nExecution halted\n
Here, it works with xvfb-run
:
$ xvfb-run ./img.R\nnull device\n 1\n$ ls\ncars.png foo.png img.R Rplots.pdf\n
"},{"location":"help/faq/#my-jobs-dont-get-scheduled","title":"My jobs don't get scheduled","text":"You can use scontrol show job JOBID
to get the details displayed about your jobs. In the example below, we can see that the job is in the PENDING
state. The Reason
field tells us that the job did not scheduled because the specified dependency was neverfulfilled. You can find a list of all job reason codes in the Slurm squeue
documentation.
JobId=863089 JobName=pipeline_job.sh\n UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(5272) MCS_label=N/A\n Priority=1 Nice=0 Account=(null) QOS=normal\n JobState=PENDING Reason=DependencyNeverSatisfied Dependency=afterok:863087(failed)\n Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n RunTime=00:00:00 TimeLimit=08:00:00 TimeMin=N/A\n SubmitTime=2020-05-03T18:57:34 EligibleTime=Unknown\n AccrueTime=Unknown\n StartTime=Unknown EndTime=Unknown Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-05-03T18:57:34\n Partition=debug AllocNode:Sid=hpc-login-1:28797\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=(null)\n NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=1,node=1,billing=1\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n Command=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/pipeline_job.sh\n WorkDir=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export\n StdErr=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out\n StdIn=/dev/null\n StdOut=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out\n Power=\n MailUser=(null) MailType=NONE\n
If you see a Reason=ReqNodeNotAvail,_Reserved_for_maintenance
then also see Reservations / Maintenances.
For GPU jobs also see \"My GPU jobs don't get scheduled\".
"},{"location":"help/faq/#my-gpu-jobs-dont-get-scheduled","title":"My GPU jobs don't get scheduled","text":"There are only four GPU machines in the cluster (with four GPUs each, hpc-gpu-1 to hpc-gpu-4). Please inspect first the number of running jobs with GPU resource requests:
hpc-login-1:~$ squeue -o \"%.10i %20j %.2t %.5D %.4C %.10m %.16R %.13b\" \"$@\" | grep hpc-gpu- | sort -k7,7\n 1902163 ONT-basecalling R 1 2 8G hpc-gpu-1 gpu:tesla:2\n 1902167 ONT-basecalling R 1 2 8G hpc-gpu-1 gpu:tesla:2\n 1902164 ONT-basecalling R 1 2 8G hpc-gpu-2 gpu:tesla:2\n 1902166 ONT-basecalling R 1 2 8G hpc-gpu-2 gpu:tesla:2\n 1902162 ONT-basecalling R 1 2 8G hpc-gpu-3 gpu:tesla:2\n 1902165 ONT-basecalling R 1 2 8G hpc-gpu-3 gpu:tesla:2\n 1785264 bash R 1 1 1G hpc-gpu-4 gpu:tesla:2\n
This indicates that there are two free GPUs on hpc-gpu-4.
Second, inspect the node states:
hpc-login-1:~$ sinfo -n hpc-gpu-[1-4]\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST\ndebug* up 8:00:00 0 n/a\nmedium up 7-00:00:00 0 n/a\nlong up 28-00:00:0 0 n/a\ncritical up 7-00:00:00 0 n/a\nhighmem up 14-00:00:0 0 n/a\ngpu up 14-00:00:0 1 drng hpc-gpu-4\ngpu up 14-00:00:0 3 mix med[0301-0303]\nmpi up 14-00:00:0 0 n/a\n
This tells you that hpc-gpu-1 to hpc-gpu-3 have jobs running (\"mix\" indicates that there are free resources, but these are only CPU cores not GPUs). hpc-gpu-4 is shown to be in \"draining state\". Let's look what's going on there.
hpc-login-1:~$ scontrol show node hpc-gpu-4\nNodeName=hpc-gpu-4 Arch=x86_64 CoresPerSocket=16\n CPUAlloc=2 CPUTot=64 CPULoad=1.44\n AvailableFeatures=skylake\n ActiveFeatures=skylake\n Gres=gpu:tesla:4(S:0-1)\n NodeAddr=hpc-gpu-4 NodeHostName=hpc-gpu-4 Version=20.02.0\n OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020\n RealMemory=385215 AllocMem=1024 FreeMem=347881 Sockets=2 Boards=1\n State=MIXED+DRAIN ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A\n Partitions=gpu\n BootTime=2020-06-30T20:33:36 SlurmdStartTime=2020-07-01T09:31:51\n CfgTRES=cpu=64,mem=385215M,billing=64\n AllocTRES=cpu=2,mem=1G\n CapWatts=n/a\n CurrentWatts=0 AveWatts=0\n ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s\n Reason=deep power-off required for PSU [root@2020-07-17T13:21:02]\n
The \"State\" attribute indicates the node has jobs running but is currenlty being \"drained\" (accepts no new jobs). The \"Reason\" gives that it has been scheduled for power-off for maintenance of the power supply unit.
"},{"location":"help/faq/#when-will-my-job-be-scheduled","title":"When will my job be scheduled?","text":"You can use the scontrol show job JOBID
command to inspect the scheduling information for your job. For example, the following job is scheduled to start at 2022-09-19T07:53:29
(StartTime
) and will be terminated if it does not stop before 2022-09-19T15:53:29
(EndTime
) For further information, it has been submitted at 2022-09-15T12:24:57
(SubmitTime
) and has been last considered by the scheduler at 2022-09-19T07:53:15
(LastSchedEval
).
# scontrol show job 4225062\nJobId=4225062 JobName=C2371_2\n UserId=user_c(133196) GroupId=hpc-ag-group(1030014) MCS_label=N/A\n Priority=805 Nice=0 Account=hpc-ag-group QOS=normal\n JobState=PENDING Reason=QOSMaxCpuPerUserLimit Dependency=(null)\n Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n RunTime=00:00:00 TimeLimit=08:00:00 TimeMin=N/A\n SubmitTime=2022-09-15T12:24:57 EligibleTime=2022-09-15T12:24:57\n AccrueTime=2022-09-15T12:24:57\n StartTime=2022-09-19T07:53:29 EndTime=2022-09-19T15:53:29 Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-09-19T07:53:15 Scheduler=Main\n Partition=medium AllocNode:Sid=hpc-login-1:557796\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=(null)\n NumNodes=1-1 NumCPUs=25 NumTasks=25 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=25,mem=150G,node=1,billing=25\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=150G MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=YES Contiguous=0 Licenses=(null) Network=(null)\n Command=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/GS_wrapy/wrap_y0_VP_2371_GS_chunk2_C02.sh\n WorkDir=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims\n StdErr=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/E2371_2.txt\n StdIn=/dev/null\n StdOut=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/slurm-4225062.out\n Power=\n
"},{"location":"help/faq/#my-jobs-dont-run-in-the-partition-i-expect","title":"My jobs don't run in the partition I expect","text":"You can see the partition that your job runs in with squeue -j JOBID
:
hpc-login-1:~$ squeue -j 877092\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 877092 medium snakejob holtgrem R 0:05 1 med0626\n
See Job Scheduler for information about the partition's properties and how jbos are routed to partitions. You can force jobs to run in a particular partition by specifying the --partition
parameter, e.g., by adding --partition=medium
or -p medium
to your srun
and sbatch
calls.
"},{"location":"help/faq/#my-jobs-get-killed-after-four-hours","title":"My jobs get killed after four hours","text":"This is probably answered by the answer to My jobs don't run in the partition I expect.
"},{"location":"help/faq/#how-can-i-mount-a-network-volume-from-elsewhere-on-the-cluster","title":"How can I mount a network volume from elsewhere on the cluster?","text":"You cannot.
"},{"location":"help/faq/#how-can-i-make-workstationserver-files-available-to-the-hpc","title":"How can I make workstation/server files available to the HPC?","text":"You can transfer files to the cluster through Rsync over SSH or through SFTP to the hpc-transfer-1
or hpc-transfer-2
node.
Do not transfer files through the login nodes. Large file transfers through the login nodes can cause performance degradation for the users with interactive SSH connections.
"},{"location":"help/faq/#how-can-i-circumvent-invalid-instruction-signal-4-errors","title":"How can I circumvent \"invalid instruction\" (signal 4) errors?","text":"Make sure that software is compiled with \"sandy bridge\" optimizations and no later one. E.g., use the -march=sandybridge
argument to the GCC/LLVM compiler executables.
If you absolutely need it, there are some boxes with more recent processors in the cluster (e.g., Haswell architecture). Look at the /proc/cpuinfo
files for details.
"},{"location":"help/faq/#i-have-problems-connecting-to-the-gpu-node-whats-wrong","title":"I have problems connecting to the GPU node! What's wrong?","text":"Please check whether there might be other jobs waiting in front of you! The following squeue
call will show the allocated GPUs of jobs in the gpu
queue. This is done by specifying a format string and using the %b
field.
squeue -o \"%.10i %9P %20j %10u %.2t %.10M %.6D %10R %b\" -p gpu\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(R TRES_PER_NODE\n 872571 gpu bash user1 R 15:53:25 1 hpc-gpu-3 gpu:tesla:1\n 862261 gpu bash user2 R 2-16:26:59 1 hpc-gpu-4 gpu:tesla:4\n 860771 gpu kidney.job user3 R 2-16:27:12 1 hpc-gpu-2 gpu:tesla:1\n 860772 gpu kidney.job user3 R 2-16:27:12 1 hpc-gpu-2 gpu:tesla:1\n 860773 gpu kidney.job user3 R 2-16:27:12 1 hpc-gpu-2 gpu:tesla:1\n 860770 gpu kidney.job user3 R 4-03:23:08 1 hpc-gpu-1 gpu:tesla:1\n 860766 gpu kidney.job user3 R 4-03:23:11 1 hpc-gpu-3 gpu:tesla:1\n 860767 gpu kidney.job user3 R 4-03:23:11 1 hpc-gpu-1 gpu:tesla:1\n 860768 gpu kidney.job user3 R 4-03:23:11 1 hpc-gpu-1 gpu:tesla:1\n
In the example above, user1 has one job with one GPU running on hpc-gpu-3, user2 has one job running with 4 GPUs on hpc-gpu-4 and user3 has 7 jobs in total running of different machines with one GPU each.
"},{"location":"help/faq/#how-can-i-access-graphical-user-interfaces-such-as-for-matlab-on-the-cluster","title":"How can I access graphical user interfaces (such as for Matlab) on the cluster?","text":" - First of all, you will need an X(11) server on your local machine (see Wikipedia: X Window System. This server offers a \"graphical surface\" that the programs on the cluster can then paint on.
- You need to make sure that the programs running on the cluster can access this graphical surface.
- Generally, you need to connect to the login nodes with X forwarding. Refer to the manual of your SSH client on how to do this (
-X
for Linux/Mac ssh
- As you should not run compute-intensive programs on the login node, connect to a cluster node with X forwarding. With Slurm, this is done using
srun --pty --x11 bash -i
(instead of srun --pty --x11 bash -i
).
Also see:
- Running graphical(X11) applications on Windows
- Running graphical(X11) applications on Linux
"},{"location":"help/faq/#how-can-i-log-into-a-node-outside-of-the-scheduler","title":"How can I log into a node outside of the scheduler?","text":"This is sometimes useful, e.g., for monitoring the CPU/GPU usage of your job interactively.
No Computation Outside of Slurm
Do not perform any computation outside of the scheduler as (1) this breaks the purpose of the scheduling system and (2) administration is not aware and might kill you jobs.
The answer is simple, just SSH into this node.
hpc-login-1:~$ ssh hpc-cpu-xxx\n
"},{"location":"help/faq/#why-am-i-getting-multiple-nodes-to-my-job","title":"Why am I getting multiple nodes to my job?","text":"Classically, jobs on HPC systems are written in a way that they can run on multiple nodes at once, using the network to communicate. Slurm comes from this world and when allocating more than one CPU/core, it might allocate them on different nodes. Please use --nodes=1
to force Slurm to allocate them on a single node.
"},{"location":"help/faq/#how-can-i-select-a-certain-cpu-architecture","title":"How can I select a certain CPU architecture?","text":"You can select the CPU architecture by using the -C
/--constraint
flag to sbatch
and srun
. The following are available (as detected by the Linux kernel):
ivybridge
(96 nodes, plus 4 high-memory nodes) haswell
(16 nodes) broadwell
(112 nodes) skylake
(16 nodes, plus 4 GPU nodes)
You can specify contraints with OR such as --constraint=haswell|broadwell|skylake
. You can see the assignment of architectures to nodes using the sinfo -o \"%8P %.5a %.10l %.6D %.6t %10f %N\"
command. This will also display node partition, availability etc.
"},{"location":"help/faq/#help-im-getting-a-quota-warning-email","title":"Help, I'm getting a Quota Warning Email!","text":"No worries!
As documented in the Storage Locations section, each user/project/group has three storage volumes: A small home
, a larger work
and a large (but temporary) scratch
. There are limits on the size of these volumes. You get a nightly warning email in case you are over the soft limit and you will not be able to write any more data if you get above the hard limit. When you login to the login nodes, the quotas and current usage is displayed to you.
Please note that not all files will be displayed when using ls
. You have to add the -a
parameter to also show files and directory starting with a dot. Often, users are confused if these dot directories take up all of their home
quota.
Use the following command to list all files and directories in your home:
hpc-login-1:~$ ls -la ~/\n
For more information on how to keep your home directory clean and avoid quota warnings, please read Home Folder Quota.
"},{"location":"help/faq/#im-getting-a-disk-quota-exceeded-error","title":"I'm getting a \"Disk quota exceeded\" error.","text":"Most probably you are running into the same problem as described above: Help, I'm getting a Quota Warning Email!
"},{"location":"help/faq/#environment-modules-dont-work-and-i-get-module-command-not-found","title":"Environment modules don't work and I get \"module: command not found\"","text":"First of all, ensure that you are on a compute node and not on one of the login nodes. One common reason is that the system-wide Bash configuration has not been loaded, try to execute source /etc/bashrc
and then re-try using module
. In the case that the problem persists, please contact hpc-helpdesk@bih-charite.de.
"},{"location":"help/faq/#what-should-my-bashrc-look-like","title":"What should my ~/.bashrc look like?","text":"All users get their home directory setup using a skelleton files. These file names start with a dot .
and are hidden when you type ls
, you have to type ls -a
to see them. You can find the current skelleton in /etc/skel.bih
and inspect the content of the Bash related files as follows:
hpc-login-1:~$ head /etc/skel.bih/.bash*\n==> /etc/skel.bih/.bash_logout <==\n# ~/.bash_logout\n\n==> /etc/skel.bih/.bash_profile <==\n# .bash_profile\n\n# Get the aliases and functions\nif [ -f ~/.bashrc ]; then\n . ~/.bashrc\nfi\n\n# User specific environment and startup programs\n\nPATH=$PATH:$HOME/.local/bin:$HOME/bin\n\n==> /etc/skel.bih/.bashrc <==\n# .bashrc\n\n# Source global definitions\nif [ -f /etc/bashrc ]; then\n . /etc/bashrc\nfi\n\n# Uncomment the following line if you don't like systemctl's auto-paging feature:\n# export SYSTEMD_PAGER=\n
There actually are a couple of more files by default. The original copy in /etc/skel.bih
might slightly change over time during improvements but we will not touch your home directory in an unsolicited way at any time!
hpc-login-1:~$ tree -a /etc/skel.bih/\n/etc/skel.bih/\n\u251c\u2500\u2500 .bash_logout\n\u251c\u2500\u2500 .bash_profile\n\u251c\u2500\u2500 .bashrc\n\u251c\u2500\u2500 .screenrc\n\u2514\u2500\u2500 .vimrc\n
"},{"location":"help/faq/#my-program-crashes-what-should-i-do","title":"My program crashes! What should I do?","text":"Have a look at our How-To: Debug Software and How-To: Debug Software on HPC Systems guides!
But it works on my workstation!
Yes, please also refer to these guides on possible approaches to find the problem.
"},{"location":"help/faq/#my-r-sessions-is-always-crashing","title":"My R sessions is always crashing!","text":"We are aware of issues for macOS users that are related to locales. You might see something like the following in your R session or logs.
Note that this is not macOS specific by itself but by macOS using a locale that R does not understand.
Try using export LC_ALL=C
before calling R. If the problem goes away, you can add the following lines to your ~/.bashrc
file to make this permanent. Note that this is actually a good idea so your locale settings are fixed.
# Fix all locale settings to \"C\" for robustness.\nexport LC_ALL=C\n
ChatGPT on Unix Locales
Let's se what ChatGPT 3.5 tells us about Unix locale.
Me: Please explain the concept of Unix locales to me.
Certainly! Unix locales are a set of environment variables and settings that define how various aspects of software and user interactions should be handled to accommodate different languages, regions, and cultural conventions. They play a crucial role in enabling software to adapt to local preferences for displaying text, formatting numbers, handling dates, and more.
In a Unix-like operating system (such as Linux or macOS), locales provide a way to customize how programs behave based on the user's language, country, and cultural norms. Here are some of the key components of Unix locales:
-
Language: The primary language used for messages, menus, prompts, and system output. For instance, if you set your locale to \"en_US\", messages and interfaces will be in English (United States).
-
Region: The country or region that defines conventions for aspects like date and time formatting, number formatting (decimal point or comma, thousands separator, etc.), and currency symbols. For example, \"en_GB\" represents English (Great Britain).
-
Character Encoding: Specifies how characters are encoded, which is vital for correctly displaying and handling text in various scripts and languages. Common encodings include UTF-8 for Unicode text.
-
Collation: Defines how strings should be sorted. Different languages have different sorting orders for characters, and the locale determines the appropriate collation rules.
-
Numeric and Date Formats: Specifies how numbers, dates, and times should be formatted. Different regions have different conventions for date formats (dd/mm/yyyy vs. mm/dd/yyyy) and numerical representations (commas vs. periods as decimal points).
-
Currency Symbols: Defines the symbol or abbreviation used for representing currency. This is important for currency formatting in software.
Unix locales are typically identified by a combination of language and region codes, like \"en_US\" for American English or \"fr_FR\" for French (France). These locales are stored in files on the system, and they contain the necessary rules and data for proper localization.
Users and administrators can set the system's default locale, and individual users can override this setting in their environment. Here are some common commands used to manage locales:
locale
: Displays the current locale settings. locale -a
: Lists all available locales on the system. export LC_ALL=locale_code
: Temporarily sets the locale for the current session. update-locale locale_code
: Changes the system-wide default locale.
Programs that are designed to be locale-aware use these settings to display messages, format numbers and dates, and handle other language-dependent features. This allows software to be more user-friendly and relevant to different linguistic and cultural contexts.
In summary, Unix locales provide a framework for adapting software behavior to match the language, region, and cultural preferences of users, ensuring a more inclusive and globally accessible computing experience.
"},{"location":"help/faq/#which-cuda-version-is-installed","title":"Which CUDA version is installed?","text":"For this, connect to the node you want to query (via SSH but do not perform any computation via SSH!)
hpc-login-1:~$ ssh hpc-gpu-1\nhpc-gpu-1:~$ yum list installed 2>/dev/null | grep cuda.x86_64\ncuda.x86_64 10.2.89-1 @local-cuda\nnvidia-driver-latest-dkms-cuda.x86_64 3:440.64.00-1.el7 @local-cuda\n
"},{"location":"help/faq/#can-i-use-docker-on-the-cluster","title":"Can I use Docker on the Cluster?","text":"No, as Docker essentially gives you access as the root user.
However, you can use Apptainer (former Singularity) to run containers (and even many Docker contains if they are \"properly built\"). Also see Using Apptainer (with Docker Images).
"},{"location":"help/faq/#how-can-i-copy-data-between-the-max-cluster-mdc-network-and-bih-hpc","title":"How can I copy data between the MAX Cluster (MDC Network) and BIH HPC?","text":"The MAX cluster is the HPC system of the MDC. It is located in the MDC network. The BIH HPC is located in the BIH network.
In general, connections can only be initiated from the MDC network to the BIH network. The reverse does not work. In other words, you have to log into the MAX cluster and then initiate your file copies to or from the BIH HPC from there. E.g., use rsync -avP some/path user_m@hpc-transfer-1.cubi.bihealth.org:/another/path
to copy files from the MAX cluster to BIH HPC and rsync -avP user_m@hpc-transfer-1.cubi.bihealth.org:/another/path some/path
to copy data from the BIH HPC to the MAX cluster.
"},{"location":"help/faq/#how-can-i-copy-data-between-the-charite-network-and-bih-hpc","title":"How can I copy data between the Charite Network and BIH HPC?","text":"In general, connections can only be initiated from the Charite network to the BIH network. The reverse does not work. In other words, you have to be on a machine inside the Charite network and then initiate your file copies to or from the BIH HPC from there. E.g., use rsync -avP some/path user_c@hpc-transfer-1.cubi.bihealth.org:/another/path
to copy files from the MAX cluster to BIH HPC and rsync -avP user_c@hpc-transfer-1.cubi.bihealth.org:/another/path some/path
to copy data from the BIH HPC to the MAX cluster.
"},{"location":"help/faq/#my-jobs-are-slowdie-on-the-logintransfer-node","title":"My jobs are slow/die on the login/transfer node!","text":"As of December 3, 2020 we have established a policy to limit you to 512 files and 128MB of RAM. Further, you are limited to using the equivalent of one core. This limit is enforced for all processes originating from an SSH session and the limit is enforced on all jobs. This was done to prevent users from thrashing the head nodes or using SSH based sessions for computation.
"},{"location":"help/faq/#slurm-complains-about-execve-no-such-file-or-directory","title":"Slurm complains about execve
/ \"No such file or directory\"","text":"This means that the program that you want to execute does not exist. Consider the following example:
[user@hpc-login-1 ~]$ srun --time 2-0 --nodes=1 --ntasks-per-node=1 \\\n --cpus-per-task=12 --mem 96G --partition staging --immediate 5 \\\n --pty bash -i\nslurmstepd: error: execve(): 5: No such file or directory\nsrun: error: hpc-cpu-2: task 0: Exited with exit code 2\n
Can you spot the problem? In this case, the problem is that for long arguments such as --mem
you must use the equal sign for --arg=value
with Slurm. This means that instead of writing --mem 96G --partition staging --immediate 5
, you must use `--mem=96G --partition=staging --immediate=5
.
In this respect, Slurm deviates from the GNU argument syntax where the equal sign is optional for long arguments.
"},{"location":"help/faq/#slurmstepd-says-that-hwloc_get_obj_below_by_type-fails","title":"slurmstepd
says that hwloc_get_obj_below_by_type
fails","text":"You can ignore the following problem:
slurmstepd: error: hwloc_get_obj_below_by_type() failing, task/affinity plugin may be required to address bug fixed in HWLOC version 1.11.5\nslurmstepd: error: task[0] unable to set taskset '0x0'\n
This is a minor failure related to Slurm and cgroups. Your job should run through successfully despite this error (that is more of a warning for end-users).
"},{"location":"help/faq/#how-can-i-share-filescollaborate-with-users-from-another-work-group","title":"How can I share files/collaborate with users from another work group?","text":"Please use projects as documented here. Projects were created for this particular purpose.
"},{"location":"help/faq/#whats-the-relation-of-charite-mdc-and-cluster-accounts","title":"What's the relation of Charite, MDC, and cluster accounts?","text":"For HPC 4 Research either an active and working Charite or MDC account is required (that is, you can login e.g., into email.charite.de or mail.mdc-berlin.de). The system has a separate meta directory that is used for the authorization of users (in other words, whether the user is active, has access to the system, and which groups the user belongs to). Charite and MDC accounts map to accounts <Charite user name>_c
and <MDC user name>_m
accounts in this meta directory. In the case that a user has both Charite and MDC accounts these are completely separate entities in the meta directory. For authentication (veryfing that a user has acccess to an account), the Charite and MDC account systems (MS Active Directory) are used. Authentication currently only uses the SSH keys deposited into Charite (via zugang.charite.de) and MDC (via MDC persdb). Users have to obtain a suitable Charite/MDC account via Charite and MDC central IT departments and upload their SSH keys through the host organization systems on their own. The hpc-helpdesk process is then used for getting their accounts setup on the HPC 4 Research system (the home/work/scratch shares being setup), becoming part of the special hpc-users
group that controls access to the system and organizing users into work groups and projects.
The process of submitting keys to Charite and MDC is documented in the \"Connecting\" section.
"},{"location":"help/faq/#how-do-charitemdccluster-accounts-interplay-with-vpn-and-the-mdc-jail-node","title":"How do Charite/MDC/Cluster accounts interplay with VPN and the MDC jail node?","text":"Charite users have to obtain a VPN account with the appropriate VPN access permissions, i.e., Zusatzantrag B as documented here. For Charite VPN, as for all Charite IT systems, users must use their Charite user name (e.g., jdoe
and not jdoe_c
).
MDC users either have to use MDC VPN or the MDC jail node, as documented here. For MDC VPN and jail node, as for all MDC IT systems, users must use their MDC user name (e.g., jdoe
and not jdoe_m
).
For help with VPN or jail node, please contact the central Charite or MDC helpdesks as appropriate.
Only when connecting from the host organizations' VPN or from the host organizations' jail node, the users use the HPC 4 Research user name that is jdoe_c
or jdoe_m
and not jdoe
!
"},{"location":"help/faq/#how-can-i-exchange-data-with-external-collaborators","title":"How can I exchange data with external collaborators?","text":"BIH HPC IT does not have the resources to offer such a service to normal users.
In particular, for privacy sensitive data this comes with a large number of strings attached to fulfill all regulatory requirements. If you need to exchange such data then you need to contact the central IT departments of your home organisation:
- Charite GB IT: heldpesk@charite.de
- MDC: helpdesk@mdc-berlin.de
If your data is not privacy sensitive or you can guarantee strong encryption of the data then the Gigamove service of RWTH Aachen might come in handy:
- https://gigamove.rwth-aachen.de/en
- https://help.itc.rwth-aachen.de/en/service/1jeqhtat4k0o3/faq/
You can login via Charite/MDC credentials (or most German academic institutions) and store up to 1TB of data at a time in the account with each file having up to 100GB.
As a note, Charite GB IT has a (German) manual on how to use 7-Zip with AES256 and strong passwords for encrypting data such that it is fit for transfer over unencrypted channels. You can find it here (Charite Intranet only) at point 2.12.
- https://intranet.charite.de/it/helpdesk/anleitungen/
The key point is using a strong password (e.g. with the pwgen
utility), creating an encrypted file with AES256 encryption, using distinct password for each recipient, and exchanging the password over a second channel (SMS or voice phone). Note that the central manual remains the ground truth of information and this FAQ entry may not reflect the current process recommended by GB IT if it changes without us noticing.
"},{"location":"help/good-tickets/","title":"How-To: Write a Good Ticket","text":"Can you solve the question yourself?
Please try to solve the question yourself with this manual and Google.
If the problem turns out to be hard, we're happy to help.
This page describes how to write a good help request ticket.
- Write a descriptive summary.
- Which cluster are you on? We only support HPC 4 Research.
- Put in a short summary into the Subject.
- Expand on this in a first paragraph. Try to answer the following questions:
- What are you trying to achieve?
- When did the problem start?
- Did it work before?
- Which steps did you attempt to achieve this?
- Give us your basic information.
- Please give us your user name on the cluster.
- Put enough details in the details section.
- Please give us the exact commands you type into your console.
- What are the symptoms/is the error message
- Never put your password into the ticket. In the case that you handle person-related data of patients/study participants, never write any of this information into the ticket or sequent email.
- Please do not send us screenshot images of what you did but copy and paste the text instead.
There is more specific questions for common issues given below.
"},{"location":"help/good-tickets/#problems-connecting-to-the-cluster","title":"Problems Connecting to the Cluster","text":" - From which machine/IP do you try to connect (
ifconfig
on Linux/Mac, ipconfig
on Windows)? - Did it work before?
- What is your user name?
- Please send us the output of
ssh-add -l
and add -vvv
to the SSH command that fails for you. - What is the response of the server?
"},{"location":"help/good-tickets/#problems-submitting-jobs","title":"Problems Submitting Jobs","text":" - Please give us the directory that you run things in.
- Please send us the submission script that you have problems with.
- If the job was submitted, Slurm will give you a job ID. We will need this ID.
- Please send us the output of
scontrol show job <jobid>
or sacct --long -j <jobid>
of your job.
"},{"location":"help/helpdesk/","title":"HPC IT Helpdesk","text":"Getting Help
Our helpdesk can be reached via email to hpc-helpdesk@bih-charite.de. Please read our guide on how to write good tickets first.
Please also use the handy figure below on general problem resolution.
But before contacting the helpdesk, try to get help in the HPC Talk BIH HPC user self-help forum!
"},{"location":"help/helpdesk/#helpdesk-scope","title":"Helpdesk Scope","text":"Our helpdesk can support you in the following areas:
- Problems/questions with connecting to the clusters.
- Problems/questions with using the cluster scheduler or operating system.
- Requests for the installation of common software.
- Problems with running your software that works in other environments.
We will try our best to resolve these issues. Please note that all other questions can only be answered in a \"best effort way\".
"},{"location":"help/helpdesk/#helpdesk-non-scope","title":"Helpdesk Non-Scope","text":"The following topics are out of scope for the BIH HPC Helpdesk:
- Generic Linux or programming questions (try stackoverflow.com).
- Managing users, groups, and projects on the clusters (use hpc-helpdesk@bih-charite.de).
- Generic help with Snakemake or other workflow engines (See Stackoverflow for getting help with Snakemake).
- Help with bioinformatics or other scientific software. Please contact the authors/communities of these software for help (also known as \"upstream\").
We're happy to see if we can help when there is a concrete problem with the software, e.g.,
- something that breaks from one week to another without you changing anything and you assume a change on the cluster, or
- you need a generic dependency that you cannot install via conda or on your own. Please read the section Administration-Provided Software to learn about the kinds of software that we will install and the kinds that we will not.
"},{"location":"help/hpc-talk/","title":"HPC Talk","text":"Another community-driven possibility to get help is our \u201cHPC Talk\u201d forum. After this manual, it should be the first place to consult.
https://hpc-talk.cubi.bihealth.org/
Its main purpose is to serve as a FAQ, so with time and more people participating, you will more likely find an answer to your question. We also use it to make announcements and give an up-to-date status of current problems with the cluster, so it is worth logging in every once in a while. It is also a great first place to look at if you're experiencing problems with the cluster. Maybe it's a known issue.
Despite users also being able to answer questions, our admins do participate on a regular basis.
"},{"location":"how-to/connect/gpu-nodes/","title":"How-To: Connect to GPU Nodes","text":"The cluster has seven nodes with four Tesla V100 GPUs each: hpc-gpu-{1..7}
and one node with 10 A40 GPUs: hpc-gpu-8
.
Connecting to a node with GPUs is easy. You request one or more GPU cores by adding a generic resources flag to your Slurm job submission via srun
or sbatch
.
--gres=gpu:tesla:COUNT
will request NVIDIA V100 cores. --gres=gpu:a40:COUNT
will request NVIDIA A40 cores. --gres=gpu:COUNT
will request any available GPU cores.
Your job will be automatically placed in the Slurm gpu
partition and allocated a number of COUNT
GPUs.
Info
Fair use rules apply. As GPU nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
Interactive Use of GPU Nodes is Discouraged
While interactive computation on the GPU nodes is convenient, it makes it very easy to forget a job after your computation is complete and let it run idle. While your job is allocated, it blocks the allocated GPUs and other users cannot use them although you might not be actually using them. Please prefer batch jobs for your GPU jobs over interactive jobs.
Furthermore, interactive GPU jobs are currently limited to 24 hours. We will monitor the situation and adjust that limit to optimize GPU usage and usability.
Please also note that allocation of GPUs through Slurm is mandatory, in other words: Using GPUs via SSH sessions is prohibited. The scheduler is not aware of manually allocated GPUs and this interferes with other users' jobs.
"},{"location":"how-to/connect/gpu-nodes/#usage-example","title":"Usage example","text":""},{"location":"how-to/connect/gpu-nodes/#preparation","title":"Preparation","text":"We will setup a miniforge installation with pytorch
testing the GPU. If you already have this setup then you can skip this step
hpc-login-1:~$ srun --pty bash\nhpc-cpu-1:~$ wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh\nhpc-cpu-1:~$ bash Miniforge3-Linux-x86_64.sh -b -p ~/work/miniforge\nhpc-cpu-1:~$ source ~/work/miniforge/bin/activate\nhpc-cpu-1:~$ conda create -y -n gpu-test pytorch cudatoolkit=10.2 -c pytorch\nhpc-cpu-1:~$ conda activate gpu-test\nhpc-cpu-1:~$ python -c 'import torch; print(torch.cuda.is_available())'\nFalse\nhpc-cpu-1:~$ exit\nhpc-login-1:~$\n
The False
shows that CUDA is not available on the node but that is to be expected. We're only warming up!
"},{"location":"how-to/connect/gpu-nodes/#allocating-gpus","title":"Allocating GPUs","text":"Let us now allocate a GPU. The Slurm schedule will properly allocate GPUs for you and setup the environment variable that tell CUDA which devices are available. The following dry run shows these environment variables (and that they are not available on the login node).
hpc-login-1:~$ export | grep CUDA_VISIBLE_DEVICES\nhpc-login-1:~$ srun --gres=gpu:tesla:1 --pty bash\nhpc-gpu-1:~$ export | grep CUDA_VISIBLE_DEVICES\ndeclare -x CUDA_VISIBLE_DEVICES=\"0\"\nhpc-gpu-1:~$ exit\nhpc-login-1:~$ srun --gres=gpu:tesla:2 --pty bash\nhpc-gpu-1:~$ export | grep CUDA_VISIBLE_DEVICES\ndeclare -x CUDA_VISIBLE_DEVICES=\"0,1\"\n
As you see, you can also reserve multiple GPUs. If we were to open two concurrent connections (e. g. in a screen
) to the same node when allocating one GPU each, the allocated GPUs would be non-overlapping. Note that any two jobs are isolated using Linux cgroups (\"container\" technology) so you cannot accidentally use a GPU of another job.
Now to the somewhat boring part where we show that CUDA actually works.
hpc-login-1:~$ srun --gres=gpu:tesla:1 --pty bash\nhpc-gpu-1:~$ nvcc --version\nnvcc: NVIDIA (R) Cuda compiler driver\nCopyright (c) 2005-2019 NVIDIA Corporation\nBuilt on Wed_Oct_23_19:24:38_PDT_2019\nCuda compilation tools, release 10.2, V10.2.89\nhpc-gpu-1:~$ source ~/work/miniforge/bin/activate\nhpc-gpu-1:~$ conda activate gpu-test\nhpc-gpu-1:~$ python -c 'import torch; print(torch.cuda.is_available())'\nTrue\n
Note
If scheduling a GPU fails, consider explicitely requesting the GPU partion via --partition gpu
(or #SBATCH --partition gpu
).
Also make sure to read the FAQ entry \"I have problems connecting to the GPU node! What's wrong?\" if you encounter problems.
"},{"location":"how-to/connect/gpu-nodes/#bonus-1-who-is-using-the-gpus","title":"Bonus #1: Who is using the GPUs?","text":"Use squeue
to find out about currently queued jobs (the egrep
only keeps the header and entries in the gpu
partition).
hpc-login-1:~$ squeue | egrep -iw 'JOBID|gpu'\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 33 gpu bash holtgrem R 2:26 1 hpc-gpu-1\n
"},{"location":"how-to/connect/gpu-nodes/#bonus-2-is-the-gpu-running","title":"Bonus #2: Is the GPU running?","text":"To find out how active the GPU nodes actually are, you can connect to the nodes (without allocating a GPU; you can do this even if the node is full) and then use nvidia-smi
.
hpc-login-1:~$ ssh hpc-gpu-1 bash\nhpc-gpu-1:~$ nvidia-smi\nFri Mar 6 11:10:08 2020\n+-----------------------------------------------------------------------------+\n| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |\n|-------------------------------+----------------------+----------------------+\n| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n|===============================+======================+======================|\n| 0 Tesla V100-SXM2... Off | 00000000:18:00.0 Off | 0 |\n| N/A 62C P0 246W / 300W | 16604MiB / 32510MiB | 99% Default |\n+-------------------------------+----------------------+----------------------+\n| 1 Tesla V100-SXM2... Off | 00000000:3B:00.0 Off | 0 |\n| N/A 61C P0 270W / 300W | 16604MiB / 32510MiB | 100% Default |\n+-------------------------------+----------------------+----------------------+\n| 2 Tesla V100-SXM2... Off | 00000000:86:00.0 Off | 0 |\n| N/A 39C P0 55W / 300W | 0MiB / 32510MiB | 0% Default |\n+-------------------------------+----------------------+----------------------+\n| 3 Tesla V100-SXM2... Off | 00000000:AF:00.0 Off | 0 |\n| N/A 44C P0 60W / 300W | 0MiB / 32510MiB | 4% Default |\n+-------------------------------+----------------------+----------------------+\n\n+-----------------------------------------------------------------------------+\n| Processes: GPU Memory |\n| GPU PID Type Process name Usage |\n|=============================================================================|\n| 0 43461 C python 16593MiB |\n| 1 43373 C python 16593MiB |\n+-----------------------------------------------------------------------------+\n
"},{"location":"how-to/connect/gpu-nodes/#fair-share-fair-use","title":"Fair Share / Fair Use","text":"Note that allocating a GPU makes it unavailable for everyone else, so please behave nicely and be cooperative. If you see someone blocking the GPU nodes for a long time, first find out who it is. You can type getent passwd USER_NAME
on any cluster node to see their email address (and work phone number if added). Send a friendly email, most likely they blocked the node accidentally. If you cannot resolve the issue (e. g. the user is not reachable) then please contact hpc-helpdesk@bih-charite.de.
"},{"location":"how-to/connect/high-memory/","title":"How-To: Connect to High-Memory Nodes","text":"The cluster has 4 high-memory nodes with 1.5 TB of RAM. You can connect to these nodes using the highmem
SLURM partition (see below). Jobs allocating more than 200 GB of RAM are automatically routed to the highmem
nodes.
Info
Fair use rules apply. As high-memory nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
"},{"location":"how-to/connect/high-memory/#how-to","title":"How-To","text":"In the cluster there are four High-memory used which can be used:
hpc-login-1:~$ sinfo -p highmem\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST \nhighmem up 14-00:00:0 3 idle med040[1-4] \n
To connect to one of them, simply allocate more than 200GB of RAM in your job.
hpc-login-1:~$ srun --pty --mem=300GB bash -i\nmed0401:~$\n
You can also pick one of the hostnames:
hpc-login-1:~$ srun --pty --mem=300GB --nodelist=med0403 bash -i\nmed0403:~$\n
After successfull login, you can see that you are in \"highmem\" queue:
med0403:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \n[...]\n 270 highmem bash holtgrem R 1:25 1 med0403 \n
"},{"location":"how-to/misc/contribute/","title":"How-To: Contribute to this Document","text":"Click on the edit link at the top of each page as shown below.
- Sign in to github (or create a new account).
- Fork the repository and add your changes (more details: https://docs.github.com/en/github/getting-started-with-github/fork-a-repo )
- Add a pull request
"},{"location":"how-to/misc/debug-at-hpc/","title":"How-To: Debug Software on HPC Systems","text":"Please Contribute!
This guide is far from complete. Please feel free to contribute, e.g., refer to How-To: Contribute to this Document.
Please make sure that you have read How-To: Debug Software as a general primer.
As debugging is hard enough already, it makes one wonder how to do this on the HPC system in batch mode. Here is a list of pointers.
"},{"location":"how-to/misc/debug-at-hpc/#attempt-1-run-it-interactively","title":"Attempt 1: Run it interactively!","text":"First of all, you can of course get an interactive session using srun --pty bash -i
and then run your program interactively. Make sure to allocate appropriate memory and cores for your purpose. You might also want to first start a screen
or tmux
session on the login node such that network interruptions to the login node don't harm your hard debugging work!
Does the program work correctly if you do this? If yes, and it only fails when run in batch mode, consider the following behaviour of the scheduler.
The scheduler takes your resource requirements and tries to find a free slot. Once it has found a free slot, it will attempt to run the program. This mainly differs in running it interactively in standard input, output, and error streams.
- By default, stdin is connected to
/dev/null
such that no input is read. You can change this with the --input=
flag to specify a file. - By default stdout and stderr are joint and written to the file specified as
--output=
. You can use certain wildcards to make the output (but also the input files) depend on things like the job ID or job name. - Please note that the directory name to the output file but exist before the job is launched. It is not sufficient to
mkdir
it in the job script itself.
Please refer to the sbatch documentation for details.
If your program fails without leaving any log file or any other trace, make sure that the path to the output file exists. To the best of the author's knowledge, there is no way to tell apart a crash because this does not exist and a program failure (except maybe for the running time of 0 seconds and memory usage of 0 bytes).
"},{"location":"how-to/misc/debug-at-hpc/#attempt-2-inspect-the-logs","title":"Attempt 2: Inspect the logs","text":"Do you see any exception in your log files? If not, continue.
If your job is canceled by scancel
or stopped because it exhausted it maximal running time or allocated resources then you will find a note in the last line of your error output log (usually folded into the standard output). Please note that if the previous output line did not include a line ending, the message might be at the very end of the last line.
The message will look similar to:
slurmstepd: error: *** JOB <your job id> ON med0xxx CANCELLED AT 2020-09-02T21:01:12 DUE TO TIME LIMIT ***\n
"},{"location":"how-to/misc/debug-at-hpc/#attempt-3-increase-loggingprinting","title":"Attempt 3: Increase logging/printing","text":"Ideally, you can add one or more --verbose
/-v
flags to your program to increase verbosity. See how far your program gets, see where it fails. This attempt will be greatly helped by reproducible running on a minimal working example.
"},{"location":"how-to/misc/debug-at-hpc/#attempt-4-use-sattach","title":"Attempt 4: Use sattach
","text":"You can use sattach
for attaching your terminal to your running job. This way, you can perform an interactive inspection of the commands.
You can combine this with one of the next attempst of using debuggers to e.g., get an pdb
debugger at an important position of your program. However, please note that pdb
and ipdb
will stop the program's execution if the standard input stream is at end of file (which /dev/null
is and this is used by default in sbatch
jobs).
"},{"location":"how-to/misc/debug-at-hpc/#attempt-5-inspect-program-activity","title":"Attempt 5: Inspect Program Activity","text":"Log into the node that your program runs on either using srun --pty --nodelist=NODE
or using ssh
. Please note that you should never perform computational intensive things when logging into the node directly. You can then use all activity inspection tips from How-To: Debug Software.
"},{"location":"how-to/misc/debug-at-hpc/#attempt-6-use-debuggers","title":"Attempt 6: Use Debuggers","text":"After having logged into the node running your program, you can of course also attach to the program with gdb -p PID
or cgdb -p PID
.
"},{"location":"how-to/misc/debug-at-hpc/#dont-despair","title":"Don't Despair","text":"Here are some final remarks:
- Don't despair!
- The longer you search for the problem, the more fundamental it is. Chances are that you are just overlooking something obvious which is actually easy to fix.
- Keep old log files!
- Really, really, make sure that your program runs deterministically. You will save yourself a world of pain.
"},{"location":"how-to/misc/debug-software/","title":"How-To: Debug Software","text":"Please Contribute!
This guide is far from complete. Please feel free to contribute, e.g., refer to How-To: Contribute to this Document.
Software development in general or even debugging of software are very broad topics. As such, we will not be able to handle them here comprehensively. Rather, we will give a tour de force on practical and minimal approaches of debugging of software. Here, debugging refers to the process of locating errors in your program and removing them.
Origin of the term debugging
The terms \"bug\" and \"debugging\" are popularly attributed to Admiral Grace Hopper in the 1940s. While she was working on a Mark II computer at Harvard University, her associates discovered a moth stuck in a relay and thereby impeding operation, whereupon she remarked that they were \"debugging\" the system. However, the term \"bug\", in the sense of \"technical error\", dates back at least to 1878 and Thomas Edison (see software bug for a full discussion).
-- Wikipedia: Debugging
When forgetting a moment about everything known about software engineering, programming roughly work sin the following cycle:
You run your program. In the case of failure, you need to remove the problem until the program runs through. You then start implementing the next change or feature. But how do you actually locate the problem? Let us walk through a couple of steps.
"},{"location":"how-to/misc/debug-software/#step-1-find-out-that-there-is-an-error","title":"Step 1: Find out that there is an error","text":"This might seem trivial but let us think about this for a moment. For this
- you will have to run your program on some input and observe its behaviour and output,
- you will need to have an expectation of its behaviour and output, and
- observe unexpected behaviour, including but not limited to:
- the program crashes,
- the program produce wrong or corrupted output, or
- the program produces incomplete output.
You could make this step a bit more comfortable by writing a little checker script that compares expected and actual output.
"},{"location":"how-to/misc/debug-software/#step-2-reproduce-your-error","title":"Step 2: Reproduce your error","text":"You will have to find out how often or regularly the problem occurs. Does the problem occur on all inputs or only specific ones? Does it occur with all parameters? Make sure that you can reproduce the problem, otherwise the problem will be hard to track down.
Discard randomness
In most applications, true randomness is neither required nor used in programs. Rather, pseudo random number generators are used that are usually seeded with a special value. In many cases, the current time is used which makes it hard to reproduce problems. Rather, use a fixed seed, e.g., by calling srand(42)
in C. You could also make this a parameter of your program, but make sure that you can fix all pseudo randomness in your program so you can deterministically reproduce its behaviour.
"},{"location":"how-to/misc/debug-software/#step-3-create-a-minimal-working-example-mwe","title":"Step 3: Create a minimal working example (MWE)","text":"Try to find a minimal input set on which you can produce your problem. For example, you could use samtools view FILE.bam chr1:90,000-100,000
to cut out regions from a BAM file. The next step is to nail down the problem. Ideally, you can deactivate or comment out parts of your program that are irrelevant to the problem.
This will allow you to get to the problematic point in your program quicker and make the whole debugging exercise easier on yourself.
"},{"location":"how-to/misc/debug-software/#interlude-what-we-have-up-to-here","title":"Interlude: What we have up to here","text":"We can now
- tell expected and \"other\" behaviour and output apart (ideally semi-automatically),
- reproduce the problem,
- and reproduce the problem quickly.
If you reached the points above, you have probably cut the time to resolve the problem by 90% already.
Let us now consider a few things that you can do from here to find the source of your problems.
"},{"location":"how-to/misc/debug-software/#method-1-stare-at-your-source-code","title":"Method 1: Stare at your source code","text":"Again, this is trivial, but: look at your code and try to follow through what it does with your given input. This is nicely complemented with the following methods. ;-)
There is a class of tools to help you in doing this, so-called static code analysis tools. They analyze the source code for problematic patterns. The success and power of such analysis tools tends to corellate strongly with how strictly typed the targeted programming language is. E.g., there are very powerful tools for Java, C/C++. However, there is some useful tool support out there for dynamic languages such as Python.
Here is a short list of pointers to static code analysis tools (feel free to extend the list):
- Python Static Analysis Tools
"},{"location":"how-to/misc/debug-software/#method-2-inspect-your-codes-activity","title":"Method 2: Inspect your code's activity","text":""},{"location":"how-to/misc/debug-software/#print-it","title":"Print it!","text":"The most simple approach is to use print
statements (or similar) to print the current line or value of parameters. While sometimes frowned upon, this certainly is one of the most robust ways to see what is happening in your program. However, beware that too much output might slow down your program or actually make your problem disappear in the case of subtle threading/timing issues (sometimes referred to as \"Heisenbugs\").
Standard output vs. error
Classically, Linux/Unix programs can print back to the user's terminal in two ways: standard output and standard errors. By convention, logging should go to stderr. The standard error stream also has the advantage that writing to it has a more direct effect. In contrast to stdout which is usually setup to be (line) buffered (you will only see output after the next newline character), stderr is unbuffered.
"},{"location":"how-to/misc/debug-software/#look-at-tophtop","title":"Look at top
/htop
","text":"The tools top
and htop
are useful tools for inspecting the activity on the current computer. The following parameters are useful (and are actually also available as key strokes when they are running).
-c
-- show the programs' command lines -u USER
-- show the processes of the user
You can exit either tool by pressing q
or Ctrl-C
.
Use the man
, Luke!
Besides searching the internet for a unix command, you can also read its manual page by running man TOOL
. If this does not work, try TOOL --help
to see its builtin help function. Also, doing an internet search for \"man tool\" might help.
"},{"location":"how-to/misc/debug-software/#look-at-strace","title":"Look at strace
","text":"The program strace
allows you to intercept the calls of your program to the kernel. As the kernel is needed for actions such as accessing the network or file system. Thus this is not so useful if your program gets stuck in \"user land\", but this might be useful to see which files it is accessing.
Pro-Tip: if you move the selection line of htop
to a process then you can strace the program by pressing s
.
"},{"location":"how-to/misc/debug-software/#look-at-lsof","title":"Look at lsof
","text":"The lsof
program lists all open files with the processes that are accessing them. This is useful for seeing which files you program has opened.
You can even build a progress bar with lsof, although that requires sudo
privileges which you might not have on the system that you are using.
Pro-Tip: if you move the selection line of htop
to a process then you can list the open files by pressing l
.
"},{"location":"how-to/misc/debug-software/#more-looking","title":"More looking","text":"There are more ways of inspecting your program, here are some:
- Google Perftools
- Linux
perf
"},{"location":"how-to/misc/debug-software/#interactive-debuggers","title":"Interactive Debuggers","text":"Let us now enter the world of interactive debuggers. Integrated development environment (IDEs) generally consist of an editor, a compiler/interpreter, and an ineractive/visual debugger. Usually, they have a debugger program at their core that can also be used on their command line.
"},{"location":"how-to/misc/debug-software/#old-but-gold-gdb","title":"Old but gold: gdb
","text":"On Unix systems, a widely used debugger is gdb
the GNU debugger. gdb
is a command line program and if you are not used to it, it might be hard to use. However, here are some pointers on how to use it:
The commands in interactive mode include:
quit
or Ctrl-D
to exit the debugger b file.ext:123
set breakpoint in file.ext
on line 123
r
run the program p var_name
print the value of the variable var_name
display var_name
print the value of the variable var_name
every time execution stops l
print the source code around the current line (multiple calls will show the next 10 lines or so, and so on) l 123
print lines around line 123
f
show information about the current frame (that is the current source location) bt
show the backtrace (that is all functions above the current one) n
step to the next line s
step into function calls finish
run the current function until it returns help
to get more help
You can call your program directly with command line arguments using cgdb [cgdb-args] --args path/to/program -- [program-args
.You can also attach to running programs using
cgdb -p PIDonce you have found out the process ID to attach to using
htopor
ps`.
Pro-tip: use cgdb
for an easier to use version that displays the source code in split screen and stores command line histories over sessions.
"},{"location":"how-to/misc/debug-software/#interactive-python-debuggers","title":"Interactive Python Debuggers","text":"You can get a simple REPL (read-execute-print loop) at virtually any position in your program by adding:
import pdb; pdb.set_trace()\n
You will get a prompt at the current position and can issue several commands including:
quit
or Ctrl-D
to exit the debugger p var_name
to print the variable with var_name
f
show information about the current frame (that is the current source location) bt
show the backtrace (that is all functions called above the current one) continue
to continue running help
to get more help
Pro-tip: use import ipdb; ipdb.set_trace()
(after installing the ipdb
package, of course) to get an IPython-based prompt that is much more comfortable to use.
"},{"location":"how-to/misc/debug-software/#pro-tip-version-control-your-code","title":"Pro-Tip: Version control your code!","text":"Here is a free bonus pro-tip: learn how to use version control, e.g., Git. This will allow you to go back to previous versions without problems and see current changes to your source code.
- 10 Free Online Git Courses
- Github: Resources to learn Git
"},{"location":"how-to/misc/debug-software/#pro-tip-write-automated-tests","title":"Pro-Tip: Write automated tests!","text":"Combine the pro tip on using version control (learn Git already!) with this one: learn how to write automated tests. This will allow you to quickly narrow down problematic changes in your version control history.
Again, testing is another topic alltogether, so here are just some links to testing frameworks to get you started:
- pytest: testing framework for Python
- testthat: testing framework for R
"},{"location":"how-to/misc/debug-software/#reading-material-on-debuggers","title":"Reading Material on Debuggers","text":"The following web resources can serve as a starting point on how to use debuggers.
- Chapter Debugger from Wikibook: Introduction to Software Engineering
- The Python Debugger
- Debugging with GDB
"},{"location":"how-to/misc/hpc-talk/","title":"Accessing HPC Talk","text":"We provide a user forum using the Discourse software at
- https://hpc-talk.cubi.bihealth.org
"},{"location":"how-to/misc/hpc-talk/#log-in","title":"Log In!","text":"First of all, visit the website for the first time: https://hpc-talk.cubi.bihealth.org
You will then be directed to our Single-Sign-On Page.
Use the appropriate button for your host organisation (MDC / Charite) where also your cluster account belongs to.
Then use the usual of your host organisation.
Clicked wrong organisation?
If you accidentally clicked the wrong institution then you need to clear your browser history up to the point where you clicked (e.g., for the last hour).
- Delete your Chrome browsing history
- How do I delete browsing history in firefox
"},{"location":"how-to/misc/hpc-talk/#first-steps","title":"First Steps","text":"You will be shown the following screen after the first login.
You can proceed with reading the notification or split it. The site is mostly self-explanatory. let us point you at a couple of interesting things for first steps.
Here you can setup your preferences
Use the \"New Topic\" button to create a new topic. Set a meaningful title, select a suitable category (we will update the list of categories over time), and write down your question or discussion item. Finally, click \"Create Topic\" to create the new topic.
You will be directed to the page with your new topic.
You can enable email notifications to receive emails if someone answers.
"},{"location":"how-to/misc/hpc-talk/#disabling-browser-notifications","title":"Disabling Browser Notifications","text":"In your settings, you will find an option to disable browser notifications in this browser.
Or you can use the do not disturb button.
"},{"location":"how-to/misc/hpc-talk/#closing-remarks","title":"Closing Remarks","text":"We established the HPC Talk forum as a self-help forum for users. Alas, there is a number of such sites out there already that are populated by more users.
How does HPC Talk fit in?
We think it is most useful for asking questions and discussing points that are directly related to the BIH HPC system.
What alternatives do I have?
For example:
- Stack Overflow for general programming questions, including Python/R programming
- Cross Validated for questions regading statistics
- Unix & Linux Stack Exchange for discussing all sorts of Linux/Unix questions
- Super User for certain more advanced Unix topics
"},{"location":"how-to/service/file-exchange/","title":"How-To: Use File Exchange","text":"Obtaining File Boxes
At the moment, file boxes are only available to members of core facilities (e.g., genomics, bioinformatics, or metabolomics) for exchanging files for their collaboration partners. Currently, HPC users cannot use the file box mechanism on their own.
BIH HPC IT provides a file exchange server to be used by the BIH core facilities and their users. The server is located in the BIH DMZ in Buch. Users authenticate using their Charite/BIH (user@CHARITE
) or MDC accounts (user@MDC-BERLIN
). File exchange is organized using \"file boxes\", directories created on the server to which selected users are granted access. Access control list maintenance is done with audit-trails (\"Revisionssicherheit\") and the file access itself is also logged to comply with data protection standards.
- Jump to \"From Windows\"
- Jump to \"From Linux\"
- Jump to \"From Mac\"
Access from Charite Network
Access from the Charite network (IP ranges 141.x.x.x
and 10.x.x.x
) must connect through the Charite proxy (http://proxy.charite.de:8080
). Depending on the client software that you are using, you might have to configure the proxy.
"},{"location":"how-to/service/file-exchange/#file-box-management","title":"File Box Management","text":"File boxes are created by the core facilities (e.g., the genomics facilities at Charite and MDC). The facility members also organize the access control. Please talk to your core facility contact on file exchange.
External users must obtain a Charite or MDC account first. Account creation is handled by the core facilities that the external user is a customer of.
"},{"location":"how-to/service/file-exchange/#file-access","title":"File Access","text":"Generally, you will be given a URL to your file box similar to https://file-exchange.bihealth.org/<file-box-id>/
. The files are served over an encrypted connection using WebDAV (which uses HTTPS).
The following describes how to access the files in the box from different platforms.
"},{"location":"how-to/service/file-exchange/#from-linux","title":"From Linux","text":"We describe how to access the files on the command line using the lftp
program. The program is preinstalled on the BIH (and the MDC cluster) and you should be able to just install it with yum install lftp
on CentOS/Red Hat or apt-get install lftp
on Ubuntu/Debian.
When using lftp
, you have to add some configuration first:
# cat >>~/.lftprc <<\"EOF\"\nset ssl:verify-certificate no\nset ftp:ssl-force yes\nEOF\n
In case that you want to access the files using a graphical user interface, search Google for \"WebDAV\" and your operating system or desktop environment. File browsers such as Nautilus and Thunar have built-in WebDAV support.
"},{"location":"how-to/service/file-exchange/#connecting","title":"Connecting","text":"First, log into the machine that has lftp
installed. The login nodes of the BIH cluster do not have it installed but all compute and file transfer nodes have it. Go to the data download location.
host:~$ mkdir -p ~/scratch/download_dir\nhost:~$ cd ~/scratch/download_dir\n
Next, start lftp
. You can open the connection using open -u <user>@<DOMAIN> https://file-exchange.bihealth.org/<file-box-id>/
(NB: there is a trailing slash) where
<user>
is your user name, e.g., holtgrem
, <domain>
is either MDC-BERLIN
or CHARITE
, and <file-box-id>
the file box ID from the URL provided to you.
When prompted, use your normal Charite/MDC password to login.
host:download_dir$ lftp\nlftp :~> open -u holtgrem@CHARITE https://file-exchange.bihealth.org/c62910b3-c1ba-49a5-81a6-a68f1f15aef6\nPassword:\ncd ok, cwd=/c62910b3-c1ba-49a5-81a6-a68f1f15aef6\nlftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6>\n
"},{"location":"how-to/service/file-exchange/#browsing-data","title":"Browsing Data","text":"You can find a full reference of lftp
on the lftp man page. You could also use help COMMAND
on the lftp prompt. For example, to look at the files of the server for a bit...
lftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> ls\ndrwxr-xr-x -- /\ndrwxr-xr-x -- dir\n-rw-r--r-- -- file1\nlftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> find\n./\n./dir/\n./dir/file2\n./file1\n
"},{"location":"how-to/service/file-exchange/#downloading-data","title":"Downloading Data","text":"To download all data use mirror
, e.g. with -P 4
to use four download threads.
lftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> mirror .\nTotal: 2 directories, 3 files, 0 symlinks\nNew: 3 files, 0 symlinks\nlftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> exit\nhost:download_dir$ tree\n.\n\u251c\u2500\u2500 dir\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 file2\n\u251c\u2500\u2500 file1\n\u2514\u2500\u2500 file.txt\n\n1 directory, 3 files\n
Ignoring gnutls_record_recv
errors.
A common error to see is mirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.
. You can just ignore this.
"},{"location":"how-to/service/file-exchange/#uploading-data","title":"Uploading Data","text":"To upload data, you can use mirror -R .
which is essentially the \"reverse\" of the mirror command.
lftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> mirror -R\nmirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.\nmirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.\nmirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.\nTotal: 2 directories, 3 files, 0 symlinks\nModified: 3 files, 0 symlinks\n4 errors detected\n
"},{"location":"how-to/service/file-exchange/#from-windows","title":"From Windows","text":"We recommend to use WinSCP for file transfer.
- Pre-packaged WinSCP on Charite Workstations. Charite IT has packaged WinSCP and you can install it using Matrix24 Empirum on Windows 10 using these instructions in the Charite intranet.
- Installing WinSCP yourself. You can obtain it from the WinSCP Download Page. A \"portable\" version is available that comes as a ZIP archive that you just have to extract without an installer.
"},{"location":"how-to/service/file-exchange/#connecting_1","title":"Connecting","text":"After starting WinSCP, you will see a window titled Login
. Just paste the URL (e.g., https://file-exchange.bihealth.org/c62910b3-c1ba-49a5-81a6-a68f1f15aef6/
) of the file box into the Host name
entry field. In this case, the fields File protocol
etc. will be filled automatically. Next, enter your user name as user@CHARITE
or user@MDC-BERLIN
(the capitalization of the part behind the @
is important). The window should now look similar to the one below.
Proxy Configuration on Charite Network
If you are on the Charite network then you have to configure the proxy. Otherwise, you have to skip this step.
Click Advanced
and a window titled Advanced Site Settings
will pop up. Here, select Connection / Proxy
in the left side. Select HTTP
for the Proxy type
. Then, enter proxy.charite.de
as the Proxy host name
and set the Port number
to 8080
. The window should nwo look as below. Then, click OK
to apply the proxy settings.
Finally, click Login
. You can now transfer files between the file exchange server and your local computer using drag and drop between WinSCP and your local Windows File Explorer. Alternatively, you can use the two-panel view of WinSCP to transfer files as described here.
"},{"location":"how-to/service/file-exchange/#from-mac","title":"From Mac","text":"For Mac, we you can also use lftp
as described above in From Linux. You can find install instructions here online.
Proxy Configuration on Charite Network
If you are on the Charite network then you must have configured the proxy appropriately. Otherwise, you have to skip this step.
You can find them in your System Preference
in the Network
section, in the Advanced
tab of your network (e.g., WiFi
).
If you want to use a graphical interface then we recommend the usage of Cyberduck. After starting the program, click Open Connection
on the top left, then select WebDAV (HTTPS)
and fill out the form as in the following way. Paste the file box URL into the server field and use your login name (user@CHARITE
or user@MDC-BERLIN
) with your usual password.
If you need to perform access through a graphical user interface on your Mac, please contact hpc-helpdesk@bihealth.org for support.
"},{"location":"how-to/service/file-exchange/#security","title":"Security","text":"The file exchange server has the fail2ban
software installed and configured (Charite, MDC, and BIH IPs are excluded from this).
If you are entering your user/password incorrectly for more than 5 times in 10 minutes then your machine will be banned for one hour. This means someone else that has the same IP address from the side of the file exchange server can get you blocked. This can happen if you are in the same home or university network with NAT or if you are behind a proxy. In this case you get a \"connection refused\" error. In this case, try again in one hour.
"},{"location":"how-to/software/apptainer/","title":"Using Apptainer (with Docker Images)","text":"Note
Singularity is now Apptainer! While Apptainer provides an singularity
alias for backwards compatibility, it is recommanded to adapt all workflows to use the new binary apptainer
.
Apptainer (https://apptainer.org/) is a popular alternative to docker, because it does not require to run as a privileged user. Apptainer can run Docker images out-of-the-box by converting them to the apptainer image format. The following guide gives a quick dive into using docker images with apptainer.
Build on your workstation, run on the HPC
Building images using Apptainer requires root privileges. We cannot give you these permissions on the BIH HPC. Thus, you will have to build the images on your local workstation (or anywhere where you have root access). You can then run the built images on the BIH HPC.
This is also true for the --writeable
flag. Apparently it needs root permissions which you don't have on the cluster.
"},{"location":"how-to/software/apptainer/#quickstart","title":"Quickstart","text":"Link ~/.apptainer to ~/work/.apptainer
Because you only have a quota of 1 GB in your home directory, you should symlink ~/.apptainer
to ~/work/.apptainer
.
host:~$ mkdir -p ~/work/.apptainer && ln -sr ~/work/.apptainer ~/.apptainer\n
In case you already have a apptainer directory:
host:~$ mv ~/.apptainer ~/work/.apptainer && ln -sr ~/work/.apptainer ~/.apptainer\n
Run a bash in a docker image:
host:~$ apptainer shell docker://godlovedc/lolcow\n
Run a command in a docker image:
host:~$ apptainer exec docker://godlovedc/lolcow echo \"hello, hello!\"\n
Run a bash in a docker image, enable access to the cuda driver (--nv) and mount a path (--bind or -B):
host:~$ apptainer shell --nv --bind /path_on_host/:/path_inside_container/ docker://godlovedc/lolcow\n
"},{"location":"how-to/software/apptainer/#some-caveats-and-notes","title":"Some Caveats and Notes","text":"Caveats
- The default apptainer images format (.sif) is read-only.
- By default apptainer mounts /home/$USER, /tmp, and $PWD in the container.
Notes
- Environment variables can be provided by setting them in the bash and adding the prefix
APPTAINERENV_
: host:~$ APPTAINERENV_HELLO=123 apptainer shell docker://godlovedc/lolcow echo $HELLO\n
- Calling
apptainer shell
or apptainer exec
uses as cwd the host callers cwd not the one set in the Dockerfile. One can change this by setting --pwd
.
"},{"location":"how-to/software/apptainer/#referencingproviding-docker-images","title":"Referencing/Providing Docker Images","text":""},{"location":"how-to/software/apptainer/#option-1-use-docker-images-via-docker-hub","title":"Option 1: Use Docker Images via Docker Hub","text":"The easiest variant to run a docker image available via a docker hub is by specifying its url. This causes apptainer to download the image and convert it to a apptainer image:
host:~$ apptainer run docker://godlovedc/lolcow\n
or to open a shell inside the image
host:~$ apptainer shell docker://godlovedc/lolcow\n
Furthermore, similar to docker, one can pull (and convert) remote image with the following call:
host:~$ apptainer pull docker://godlovedc/lolcow\n
In case your registry requires authentication you can provide it via a prompt by adding the option --docker-login
:
host:~$ apptainer pull --docker-login docker://ilumb/mylolcow\n
or by setting the following environment variables:
host:~$ export APPTAINER_DOCKER_USERNAME=ilumb\nhost:~$ export APPTAINER_DOCKER_PASSWORD=<redacted>\nhost:~$ apptainer pull docker://ilumb/mylolcow\n
More details can be found in the Apptainer documentation.
"},{"location":"how-to/software/apptainer/#option-2-converting-docker-images","title":"Option 2: Converting Docker Images","text":"Another option is to convert your docker image into the Apptainer/Singularity image format. This can be easily done using the docker images provided by docker2singularity.
To convert the docker image docker_image_name
to the apptainer image apptainer_image_name
one can use the following command line. The output image will be located in output_directory_for_images
.
host:~$ docker run -v /var/run/docker.sock:/var/run/docker.sock -v /output_directory_for_images/:/output --privileged -t --rm quay.io/singularity/docker2singularity --name apptainer_image_name docker_image_name\n
The resulting image can then directly be used as image:
host:~$ apptainer exec apptainer_image_name.sif bash\n
"},{"location":"how-to/software/apptainer/#conversion-compatibility","title":"Conversion Compatibility","text":"Here are some tips for making Docker images compatible with Apptainer taken from docker2singulrity:
- Define all environmental variables using the ENV instruction set. Do not rely on
~/.bashrc
, ~/.profile
, etc. - Define an
ENTRYPOINT
instruction set pointing to the command line interface to your pipeline. - Do not define
CMD
- rely only on ENTRYPOINT
. - You can interactively test the software inside the container by overriding the
ENTRYPOINT docker run -i -t --entrypoint /bin/bash bids/example
. - Do not rely on being able to write anywhere other than the home folder and /scratch. Make sure your container runs with the
--read-only --tmpfs /run --tmpfs /tmp parameters
(this emulates the read-only behavior of Apptainer). - Don't rely on having elevated user permissions.
- Don't use the
USER
instruction set.
"},{"location":"how-to/software/cell-ranger/","title":"How-To: Run CellRanger","text":""},{"location":"how-to/software/cell-ranger/#what-is-cell-ranger","title":"what is Cell Ranger?","text":"from the official website: \"Cell Ranger is a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate feature-barcode matrices and perform clustering and gene expression analysis\"
"},{"location":"how-to/software/cell-ranger/#installation","title":"installation","text":"requires registration before download from here
to unpack Cell Ranger, its dependencies and the cellranger
script:
cd /data/cephfs-1/home/users/$USER/work\nmv /path/to/cellranger-3.0.2.tar.gz .\ntar -xzvf cellranger-3.0.2.tar.gz\n
"},{"location":"how-to/software/cell-ranger/#reference-data","title":"reference data","text":"will be provided in /data/cephfs-1/work/projects/cubit/current/static_data/app_support/cellranger
"},{"location":"how-to/software/cell-ranger/#cluster-support-slurm","title":"cluster support SLURM","text":"add a file slurm.template
to /data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template
with the following contents:
#!/usr/bin/env bash\n#\n# Copyright (c) 2016 10x Genomics, Inc. All rights reserved.\n#\n# =============================================================================\n# Setup Instructions\n# =============================================================================\n#\n# 1. Add any other necessary Slurm arguments such as partition (-p) or account\n# (-A). If your system requires a walltime (-t), 24 hours (24:00:00) is\n# sufficient. We recommend you do not remove any arguments below or Martian\n# may not run properly.\n#\n# 2. Change filename of slurm.template.example to slurm.template.\n#\n# =============================================================================\n# Template\n# =============================================================================\n#\n#SBATCH -J __MRO_JOB_NAME__\n#SBATCH --export=ALL\n#SBATCH --nodes=1 --ntasks-per-node=__MRO_THREADS__\n#SBATCH --signal=2\n#SBATCH --no-requeue\n#SBATCH --partition=medium\n#SBATCH --time=24:00:00\n### Alternatively: --ntasks=1 --cpus-per-task=__MRO_THREADS__\n### Consult with your cluster administrators to find the combination that\n### works best for single-node, multi-threaded applications on your system.\n#SBATCH --mem=__MRO_MEM_GB__G\n#SBATCH -o __MRO_STDOUT__\n#SBATCH -e __MRO_STDERR__\n\n__MRO_CMD__\n
note: on newer cellranger version, slurm.template
needs to go to /data/cephfs-1/home/users/$USER/work/cellranger-XX/external/martian/jobmanagers/
"},{"location":"how-to/software/cell-ranger/#demultiplexing","title":"demultiplexing","text":"if that hasn't been done yet, you can use cellranger mkfastq
(details to be added)
"},{"location":"how-to/software/cell-ranger/#run-the-pipeline-count","title":"run the pipeline (count
)","text":"create a script run_cellranger.sh
with these contents (consult the documentation for help:
#!/bin/bash\n\n/data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/cellranger count \\\n --id=sample_id \\\n --transcriptome=/data/cephfs-1/work/projects/cubit/current/static_data/app_support/cellranger/refdata-cellranger-${species}-3.0.0\\\n --fastqs=/path/to/fastqs \\\n --sample=sample_name \\\n --expect-cells=n_cells \\\n --jobmode=slurm \\\n --maxjobs=100 \\\n --jobinterval=1000\n
and then submit the job via
sbatch --ntasks=1 --mem-per-cpu=4G --time=8:00:00 -p medium -o cellranger.log run_cellranger.sh\n
"},{"location":"how-to/software/cell-ranger/#cluster-support-sge-outdated","title":"cluster support SGE (outdated)","text":"add a file sge.template
to /data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template
with the following contents:
# =============================================================================\n# Template\n# =============================================================================\n#\n#$ -N __MRO_JOB_NAME__\n#$ -V\n#$ -pe smp __MRO_THREADS__\n#$ -cwd\n#$ -P medium\n#$ -o __MRO_STDOUT__\n#$ -e __MRO_STDERR__\n#$ -l h_vmem=__MRO_MEM_GB_PER_THREAD__G\n#$ -l h_rt=08:00:00\n\n#$ -m a\n#$ -M user@email.com\n\n__MRO_CMD__\n
and submit the job via
qsub -cwd -V -pe smp 1 -l h_vmem=8G -l h_rt=24:00:00 -P medium -m a -j y run_cellranger.sh\n
"},{"location":"how-to/software/jupyter/","title":"How-To: Run Jupyter","text":"SSH Tunnels Considered Harmful
Please use our Open OnDemand Portal for running Jupyter notebooks!
The information below is still accurate. However, many users find it tricky to get SSH tunnels working correctly. A considerable number of parts is involved and you have to get each step 100% correct. Helpdesk cannot support you in problems with SSH tunnels that are caused by incorrect usage.
"},{"location":"how-to/software/jupyter/#what-is-jupyter","title":"What is Jupyter","text":"Project Jupyter is a networking protocol for interactive computing that allows the user to write and execute code for a high number of different programming languages. The most used client is Jupyter Notebook that can be encountered in various form all over the web. Its basic principle is a document consisting of different cells, each of which contains either code (executed in place) or documentation (written in markdown). This allows one to handily describe the processed workflow.
"},{"location":"how-to/software/jupyter/#setup-and-running-jupyter-on-the-cluster","title":"Setup and running Jupyter on the cluster","text":"Install Jupyter on the cluster (via conda, by creating a custom environment)
hpc-cpu-x:~$ conda create -n jupyter jupyter\nhpc-cpu-x:~$ conda activate jupyter\n
(If you want to work in a language other than python, you can install more Jupyter language kernel, see the kernel list)
Now you can start the Jupyter server session (you may want to do this in a screen
& srun --pty bash -i
session as jupyter keeps running while you are doing computations)
hpc-cpu-x:~$ jupyter notebook --no-browser\n
Check the port number (usually 8888
) in the on output and remember it for later:
[I 23:39:40.860 NotebookApp] The Jupyter Notebook is running at:\n[I 23:39:40.860 NotebookApp] http://localhost:8888/\n
By default, Jupyter will create an access token (a link stated in the output) to protect your notebook against unauthorized access which you have to save and enter in the accessing browser. You can change this to password base authorization via jupyter notebook password
. If you are running multiple server on one or more nodes, one can separate them by changing the port number by adding --port=$PORT
.
"},{"location":"how-to/software/jupyter/#connecting-to-the-running-session","title":"Connecting to the Running Session","text":"This is slightly trickier as we have to create a SSH connection/tunnel with potentially multiple hops in between. The easiest way is probably to configure your .ssh/config
to automatically route your connection via the login node (and possibly MDC jail). This is described in our Advanced SSH config documentation
In short,add these lines to ~/.ssh/config
(replace curly parts):
Host bihcluster\n user {USER_NAME}\n HostName hpc-login-2.cubi.bihealth.org\n\nHost hpc-cpu*\n user {USER_NAME}\n ProxyJump bihcluster\n
For MDC users outside the MDC network:
Host mdcjail\n HostName ssh1.mdc-berlin.de\n User {MDC_USER_NAME}\n\nHost bihcluster\n user {USER_NAME}\n HostName hpc-login-2.cubi.bihealth.org\n\nHost hpc-cpu*\n user {USER_NAME}\n ProxyJump bihcluster\n
Check that this config is working by connecting like this: ssh hpc-cpu-1
. Please note that you cannot use any resources on this node without a valid Slurm session.
Now you setup a tunnel for your running Jupyter session:
workstation:~$ ssh -N -f -L 127.0.0.1:8888:localhost:{PORT} hpc-cpu-x\n
The port of your Jupyter server is usually 8888
. The cluster node srun
has sent you to determines the last argument. You should now be able to connect to your Jupyter server by typing localhost:8888
in your webbrowser (see the note about token and password above).
"},{"location":"how-to/software/jupyter/#losing-connection","title":"Losing connection","text":"It can and will happen that will lose connection, either due to network problems or due to shut-down of your computer. This is not a problem at all and you will not lose data, just reconnect to your session. If your notebooks are also losing connection (you will see a colorful remark in the top right corner), reconnect and click the colorful button. If this does not work, your work is still not lost as all cells that have been executed are automatically saved anyways. Copy all unexecuted cells (those are only saved periodically) and reload the browser page (after reconnecting) with F5
. (you can also open a copy of the notebook in another tab, just be aware that there may be synchronisation problems)
"},{"location":"how-to/software/jupyter/#ending-a-session","title":"Ending a Session","text":"There are two independent steps in ending a session:
Canceling the SSH tunnel
- Identify the running SSH process
hpc-cpu-x:~$ ps aux | grep \"$PORT\"\n
This will give you something like this:
user 54 0.0 0.0 43104 784 ? Ss 15:06 0:00 ssh -N -f -L 127.0.0.1:8888:localhost:8888 hpc-cpu-x\nuser 58 0.0 0.0 41116 1024 tty1 S 15:42 0:00 grep --color=auto 8888\n
from which you need the process ID (here 54
)
- Terminate it the process
hpc-cpu-x:~$ kill -9 $PID\n
Shutdown the Jupyter server
Open the Jupyter session, cancel the process with {Ctrl} + {C} and confirm {y}. Make sure you saved your notebooks beforehand (though auto-save catches most things).
"},{"location":"how-to/software/jupyter/#advanced","title":"Advanced","text":" - List of available Jupyter Kernels for different programming languages
- Jupyterlab is a further development in the Jupyter ecosystem that creates a display similar to RStudio with panels for the current file system and different notebooks in different tabs.
- One can install Jupyter kernels or python packages while running a server or notebook without restrictions
If anyone has figured out, the following might also be interesting (please add):
- create a Jupyter-Hub
- multi-user support
"},{"location":"how-to/software/keras/","title":"How-To: Run Keras (Multi-GPU)","text":"Because the GPU nodes med030[1-4]
has four GPU units we can train a model by using multiple GPUs in parallel. This How-To gives an example with Keras 2.2.4 together and tensorflow. Finally soem hints how you can submit a job on the cluster.
Hint
With tensorflow > 2.0 and newer keras version the multi_gpu_model
is deprecated and you have to use the MirroredStrategy
.
"},{"location":"how-to/software/keras/#keras-code","title":"Keras code","text":"we need to import the multi_gpu_model
model from keras.utils
and have to pass our actual model (maybe sequential Keras model) into it. In general Keras automatically configures the number of available nodes (gpus=None
). This seems not to work on our system. So we have to specify the numer of GPUs, e.g. two with gpus=2
. We put this in a try catch environment that it will also work on CPUs.
from keras.utils import multi_gpu_model\n\ntry: \n model = multi_gpu_model(model, gpus=2) \nexcept:\n pass\n
That's it!
Please read here on how to submit jobs to the GPU nodes.
"},{"location":"how-to/software/keras/#conda-environment","title":"Conda environment","text":"All this was tested with the following conda environment:
name: cuda channels: \n- conda-forge\n- bioconda\n- defaults\ndependencies:\n- keras=2.2.4\n- python=3.6.7\n- tensorboard=1.12.0\n- tensorflow=1.12.0\n- tensorflow-base=1.12.0\n- tensorflow-gpu=1.12.0\n
"},{"location":"how-to/software/matlab/","title":"How-To: Use Matlab","text":"Note
This information is outdated and will soon be removed.
GNU Octave as Matlab alternative
Note that GNU Octave is an Open Source alternative to Matlab. While both packages are not 100% compatible, Octave is an alternative that does not require any license management. Further, you can easily install it yourself using Conda.
Want to use the Matlab GUI?
Make sure you understand X forwarding as outline in this FAQ entry.
You can also use Open OnDemand Portal to run Matlab.
"},{"location":"how-to/software/matlab/#pre-requisites","title":"Pre-requisites","text":"You have to register with hpc-helpdesk@bih-charite.de for requesting access to the Matlab licenses. Afterwards, you can connect to the High-Memory using the license_matlab_r2016b
resource (see below).
"},{"location":"how-to/software/matlab/#how-to-use","title":"How-To Use","text":"BIH has a license of Matlab R2016b for 16 seats and various licensed packages (see below). To display the available licenses:
hpc-login-1:~$ scontrol show lic\nLicenseName=matlab_r2016b\n Total=16 Used=0 Free=16 Remote=no\n
Matlab is installed on all of the compute nodes:
# The following is VITAL so the scheduler allocates a license to your session.\nhpc-login-1:~$ srun -L matlab_r2016b:1 --pty bash -i\nmed0127:~$ scontrol show lic\nLicenseName=matlab_r2016b\n Total=16 Used=1 Free=15 Remote=no\nmed0127:~$ module avail\n----------------- /usr/share/Modules/modulefiles -----------------\ndot module-info null\nmodule-git modules use.own\n\n----------------------- /opt/local/modules -----------------------\ncmake/3.11.0-0 llvm/6.0.0-0 openmpi/3.1.0-0\ngcc/7.2.0-0 matlab/r2016b-0\nmed0127:~$ module load matlab/r2016b-0\nStart matlab without GUI: matlab -nosplash -nodisplay -nojvm\n Start matlab with GUI (requires X forwarding (ssh -X)): matlab\nmed0127:~$ matlab -nosplash -nodisplay -nojvm\n < M A T L A B (R) >\n Copyright 1984-2016 The MathWorks, Inc.\n R2016b (9.1.0.441655) 64-bit (glnxa64)\n September 7, 2016\n\n\nFor online documentation, see http://www.mathworks.com/support\nFor product information, visit www.mathworks.com.\n\n\n Non-Degree Granting Education License -- for use at non-degree granting, nonprofit,\n educational organizations only. Not for government, commercial, or other organizational use.\n\n>> ver\n--------------------------------------------------------------------------------------------\nMATLAB Version: 9.1.0.441655 (R2016b)\nMATLAB License Number: 1108905\nOperating System: Linux 3.10.0-862.3.2.el7.x86_64 #1 SMP Mon May 21 23:36:36 UTC 2018 x86_64\nJava Version: Java is not enabled\n--------------------------------------------------------------------------------------------\nMATLAB Version 9.1 (R2016b)\nBioinformatics Toolbox Version 4.7 (R2016b)\nGlobal Optimization Toolbox Version 3.4.1 (R2016b)\nImage Processing Toolbox Version 9.5 (R2016b)\nOptimization Toolbox Version 7.5 (R2016b)\nParallel Computing Toolbox Version 6.9 (R2016b)\nPartial Differential Equation Toolbox Version 2.3 (R2016b)\nSignal Processing Toolbox Version 7.3 (R2016b)\nSimBiology Version 5.5 (R2016b)\nStatistics and Machine Learning Toolbox Version 11.0 (R2016b)\nWavelet Toolbox Version 4.17 (R2016b)\n>> exit\n
"},{"location":"how-to/software/matlab/#running-matlab-ui","title":"Running MATLAB UI","text":"For starting the Matlab with GUI, make sure that your client is running a X11 server and you connect with X11 forwarding enabled (e.g., ssh -X hpc-login-1.cubi.bihealth.org
from the Linux command line). Then, make sure to use srun -L matlab_r2016b:1 --pty --x11 bash -i
for connecting to a node with X11 forwarding enabled.
client:~$ ssh -X hpc-login-1.cubi.bihealth.org\n[...]\nhpc-login-1:~ $ srun -L matlab_r2016b:1 --pty --x11 bash -i\n[...]\nmed0203:~$ module load matlab/r2016b-0\nStart matlab without GUI: matlab -nosplash -nodisplay -nojvm\n Start matlab with GUI (requires X forwarding (ssh -X)): matlab\nmed0203:~$ matlab\n[UI will start]\n
Forcing Matlab to start in text mode can be done (as stated after module load
): matlab -nosplash -nodisplay -nojvm
.
Also see this FAQ entry.
"},{"location":"how-to/software/matlab/#see-available-matlab-licenses","title":"See Available Matlab Licenses","text":"You can use scontrol show lic
to see the currently available MATLAB licenses. E.g., here I am running an interactive shell in which I have requested 1 of the 16 MATLAB licenses, so 15 more remain.
$ scontrol show lic\nLicenseName=matlab_r2016b\n Total=16 Used=1 Free=15 Remote=no\n
"},{"location":"how-to/software/matlab/#a-working-example","title":"A Working Example","text":"Get a checkout of our MATLAB example. Then, look around at the contents of this repository.
hpc-login-1:~$ git clone https://github.com/bihealth/bih-cluster-matlab-example.git\nhpc-login-1:~$ cd bih-cluster-matlab-example\nhpc-login-1:~$ cat job_script.sh\n#!/bin/bash\n\n# Logging goes to directory sge_log\n#SBATCH -o slurm_log/%x-%J.log\n# Keep current environment variables\n#SBATCH --export=ALL\n# Name of the script\n#SBATCH --job-name MATLAB-example\n# Allocate 4GB of RAM per core\n#SBATCH --mem 4G\n# Maximal running time of 2 hours\n#SBATCH --time 02:00:00\n# Allocate one Matlab license\n#SBATCH -L matlab_r2016b:1\n\nmodule load matlab/r2016b-0\n\nmatlab -r example\n$ cat example.m\n% Example Hello World script for Matlab.\n\ndisp('Hello world!')\ndisp('Thinking...')\n\npause(10)\n\ndisp(sprintf('The square root of 2 is = %f', sqrt(2)))\nexit\n
For submitting the script, you can do the following
hpc-login-1:~$ sbatch job_script.sh\n
This will submit a job with one Matlab license requested. If you were to submit 17 of these jobs, then at least one of them would have to wait until a license becomes free again.
Matlab License Server
Note that there is a Matlab license server running that will check whether 16 or fewer Matlab sessions are currently running. If a Matlab session is running but this was not made known to the scheduler via -L matlab_r2016b
then this can lead to scripts crashing as not enough licenses are available. If this happens to you, double-check that you have specified the license requirements correctly and notify hpc-helpdesk@bih-charite.de in case of any problems. We will try to sort out the situation then.
"},{"location":"how-to/software/openmpi/","title":"How-To: Build and Run OpenMPI Program","text":"This article describes how to build and run an OpenMPI program. We will build a simple C program that uses the OpenMPI message passing interface and run it in parallel. You should be able to go from here with other languages and more complex programs. We will use a simple Makefile for building the software.
"},{"location":"how-to/software/openmpi/#loading-openmpi-environment","title":"Loading OpenMPI Environment","text":"First, load the OpenMPI package.
hpc-login-1:~$ srun --pty bash -i\nmed0127:~$ module load openmpi/4.0.3-0\n
Then, check that the installation works
med0127:~$ ompi_info | head\n Package: Open MPI root@med0127 Distribution\n Open MPI: 4.0.3\n Open MPI repo revision: v4.0.3\n Open MPI release date: Mar 03, 2020\n Open RTE: 4.0.3\n Open RTE repo revision: v4.0.3\n Open RTE release date: Mar 03, 2020\n OPAL: 4.0.3\n OPAL repo revision: v4.0.3\n OPAL release date: Mar 03, 2020\n
"},{"location":"how-to/software/openmpi/#building-the-example","title":"Building the example","text":"Next, clone the OpenMPI example project from Gitlab.
med0127:~$ git clone git@github.com:bihealth/bih-cluster-openmpi-example.git\nmed0127:~$ cd bih-cluster-openmpi-example/src\n
Makefile
.PHONY: default clean\n\n# configure compilers\nCC=mpicc\nCXX=mpicxx\n# configure flags\nCCFLAGS += $(shell mpicc --showme:compile)\nLDFLAGS += $(shell mpicc --showme:link)\n\ndefault: openmpi_example\n\nopenmpi_example: openmpi_example.o\n\nclean:\n rm -f openmpi_example.o openmpi_example\n
openmpi_example.c
#include <stdio.h>\n#include <mpi.h>\n\nint main(int argc, char** argv) {\n // Initialize the MPI environment\n MPI_Init(NULL, NULL);\n\n // Get the number of processes\n int world_size;\n MPI_Comm_size(MPI_COMM_WORLD, &world_size);\n\n // Get the rank of the process\n int world_rank;\n MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);\n\n // Get the name of the processor\n char processor_name[MPI_MAX_PROCESSOR_NAME];\n int name_len;\n MPI_Get_processor_name(processor_name, &name_len);\n\n // Print off a hello world message\n printf(\"Hello world from processor %s, rank %d\"\n \" out of %d processors\\n\",\n processor_name, world_rank, world_size);\n\n // Finalize the MPI environment.\n MPI_Finalize();\n\n return 0;\n}\n
run_mpi.sh
#!/bin/bash\n\n# Example job script for (single-threaded) MPI programs.\n\n# Generic arguments\n\n# Job name\n#SBATCH --job-name openmpi_example\n# Maximal running time of 10 min\n#SBATCH --time 00:10:00\n# Allocate 1GB of memory per node\n#SBATCH --mem 1G\n# Write logs to directory \"slurm_log\"\n#SBATCH -o slurm_log/slurm-%x-%J.log\n\n# MPI-specific parameters\n\n# Run 64 tasks (threads/on virtual cores)\n#SBATCH --ntasks 64\n\n# Make sure to source the profile.d file (not available on head nodes).\nsource /etc/profile.d/modules.sh\n\n# Load the OpenMPI environment module to get the runtime environment.\nmodule load openmpi/4.0.3-0\n\n# Launch the program.\nmpirun -np 64 ./openmpi_example\n
The next step is building the software
med0127:~$ make\nmpicc -c -o openmpi_example.o openmpi_example.c\nmpicc -pthread -Wl,-rpath -Wl,/opt/local/openmpi-4.0.3-0/lib -Wl,--enable-new-dtags -L/opt/local/openmpi-4.0.3-0/lib -lmpi openmpi_example.o -o openmpi_example\nmed0127:~$ ls -lh\ntotal 259K\n-rw-rw---- 1 holtgrem_c hpc-ag-cubi 287 Apr 7 23:29 Makefile\n-rwxrwx--- 1 holtgrem_c hpc-ag-cubi 8.5K Apr 8 00:15 openmpi_example\n-rw-rw---- 1 holtgrem_c hpc-ag-cubi 760 Apr 7 23:29 openmpi_example.c\n-rw-rw---- 1 holtgrem_c hpc-ag-cubi 2.1K Apr 8 00:15 openmpi_example.o\n-rwxrwx--- 1 holtgrem_c hpc-ag-cubi 1.3K Apr 7 23:29 run_hybrid.sh\n-rwxrwx--- 1 holtgrem_c hpc-ag-cubi 663 Apr 7 23:35 run_mpi.sh\ndrwxrwx--- 2 holtgrem_c hpc-ag-cubi 4.0K Apr 7 23:29 sge_log\n
The software will run outside of the MPI environment -- but in a single process only, of course.
med0127:~$ ./openmpi_example\nHello world from processor med0127, rank 0 out of 1 processors\n
"},{"location":"how-to/software/openmpi/#running-openmpi-software","title":"Running OpenMPI Software","text":"All of the arguments are already in the run_mpi.sh
script.
med01247:~# sbatch run_mpi.sh\n
Explanation of the OpenMPI-specific arguments
--ntasks 64
: run 64 processes in the MPI environment.
Let's look at the slurm log file, e.g., in slurm_log/slurm-openmpi_example-3181.log
.
med0124:~$ cat slurm_log/slurm-openmpi_example-*.log\nHello world from processor med0133, rank 6 out of 64 processors\nHello world from processor med0133, rank 25 out of 64 processors\nHello world from processor med0133, rank 1 out of 64 processors\nHello world from processor med0133, rank 2 out of 64 processors\nHello world from processor med0133, rank 3 out of 64 processors\nHello world from processor med0133, rank 7 out of 64 processors\nHello world from processor med0133, rank 9 out of 64 processors\nHello world from processor med0133, rank 12 out of 64 processors\nHello world from processor med0133, rank 13 out of 64 processors\nHello world from processor med0133, rank 15 out of 64 processors\nHello world from processor med0133, rank 16 out of 64 processors\nHello world from processor med0133, rank 17 out of 64 processors\nHello world from processor med0133, rank 18 out of 64 processors\nHello world from processor med0133, rank 23 out of 64 processors\nHello world from processor med0133, rank 24 out of 64 processors\nHello world from processor med0133, rank 26 out of 64 processors\nHello world from processor med0133, rank 27 out of 64 processors\nHello world from processor med0133, rank 31 out of 64 processors\nHello world from processor med0133, rank 0 out of 64 processors\nHello world from processor med0133, rank 4 out of 64 processors\nHello world from processor med0133, rank 5 out of 64 processors\nHello world from processor med0133, rank 8 out of 64 processors\nHello world from processor med0133, rank 10 out of 64 processors\nHello world from processor med0133, rank 11 out of 64 processors\nHello world from processor med0133, rank 14 out of 64 processors\nHello world from processor med0133, rank 19 out of 64 processors\nHello world from processor med0133, rank 20 out of 64 processors\nHello world from processor med0133, rank 21 out of 64 processors\nHello world from processor med0133, rank 22 out of 64 processors\nHello world from processor med0133, rank 28 out of 64 processors\nHello world from processor med0133, rank 29 out of 64 processors\nHello world from processor med0133, rank 30 out of 64 processors\nHello world from processor med0134, rank 32 out of 64 processors\nHello world from processor med0134, rank 33 out of 64 processors\nHello world from processor med0134, rank 34 out of 64 processors\nHello world from processor med0134, rank 38 out of 64 processors\nHello world from processor med0134, rank 39 out of 64 processors\nHello world from processor med0134, rank 42 out of 64 processors\nHello world from processor med0134, rank 44 out of 64 processors\nHello world from processor med0134, rank 45 out of 64 processors\nHello world from processor med0134, rank 46 out of 64 processors\nHello world from processor med0134, rank 53 out of 64 processors\nHello world from processor med0134, rank 54 out of 64 processors\nHello world from processor med0134, rank 55 out of 64 processors\nHello world from processor med0134, rank 60 out of 64 processors\nHello world from processor med0134, rank 62 out of 64 processors\nHello world from processor med0134, rank 35 out of 64 processors\nHello world from processor med0134, rank 36 out of 64 processors\nHello world from processor med0134, rank 37 out of 64 processors\nHello world from processor med0134, rank 40 out of 64 processors\nHello world from processor med0134, rank 41 out of 64 processors\nHello world from processor med0134, rank 43 out of 64 processors\nHello world from processor med0134, rank 47 out of 64 processors\nHello world 
from processor med0134, rank 48 out of 64 processors\nHello world from processor med0134, rank 49 out of 64 processors\nHello world from processor med0134, rank 50 out of 64 processors\nHello world from processor med0134, rank 51 out of 64 processors\nHello world from processor med0134, rank 52 out of 64 processors\nHello world from processor med0134, rank 56 out of 64 processors\nHello world from processor med0134, rank 57 out of 64 processors\nHello world from processor med0134, rank 59 out of 64 processors\nHello world from processor med0134, rank 61 out of 64 processors\nHello world from processor med0134, rank 63 out of 64 processors\nHello world from processor med0134, rank 58 out of 64 processors\n
"},{"location":"how-to/software/openmpi/#running-hybrid-software-mpimultithreading","title":"Running Hybrid Software (MPI+Multithreading)","text":"In some cases, you want to mix multithreading (e.g., via OpenMP) with MPI to run one process with multiple threads that then can communicate via shared memory. Note that OpenMPI will let processes on the same node communicate via shared memory anyway, so this might not be necessary in all cases.
The file run_hybrid.sh
shows how to run an MPI job with 8 MPI processes using 4 threads each.
Note well that memory is allocated on a per-slot (thus per-thread) basis!
run_hybrid.sh
#!/bin/bash\n\n# Example job script for multi-threaded MPI programs, sometimes\n# called \"hybrid\" MPI computing.\n\n# Generic arguments\n\n# Job name\n#SBATCH --job-name openmpi_example\n# Maximal running time of 10 min\n#SBATCH --time 00:10:00\n# Allocate 1GB of memory per node\n#SBATCH --mem 1G\n# Write logs to directory \"slurm_log\"\n#SBATCH -o slurm_log/slurm-%x-%J.log\n\n# MPI-specific parameters\n\n# Run 8 tasks (threads/on virtual cores)\n#SBATCH --ntasks 8\n# Allocate 4 CPUs per task (cores/threads)\n#SBATCH --cpus-per-task 4\n\n# Make sure to source the profile.d file (not available on head nodes).\nsource /etc/profile.d/modules.sh\n\n# Load the OpenMPI environment module to get the runtime environment.\nmodule load openmpi/4.0.3-0\n\n# Launch the program.\nmpirun -n 8 ./openmpi_example\n
We changed the following
- run 8 tasks (\"processes\")
- allocate 4 threads each
Let's look at the log output:
# cat slurm_log/slurm-openmpi_example-3193.log\nHello world from processor med0133, rank 1 out of 8 processors\nHello world from processor med0133, rank 3 out of 8 processors\nHello world from processor med0133, rank 2 out of 8 processors\nHello world from processor med0133, rank 6 out of 8 processors\nHello world from processor med0133, rank 0 out of 8 processors\nHello world from processor med0133, rank 4 out of 8 processors\nHello world from processor med0133, rank 5 out of 8 processors\nHello world from processor med0133, rank 7 out of 8 processors\n
Each process can now launch 4 threads (e.g., by defining export OMP_NUM_THREADS=4
before the program call).
"},{"location":"how-to/software/scientific-software/","title":"How-To: Install Custom Scientific Software","text":"This page gives an end-to-end example how to build and install Gromacs as an example for managing complex scientific software installs in user land. You don't have to learn or understand the specifics of Gromacs. We use it as an example as there are some actual users on the BIH cluster. However, installing it is out of scope of BIH HPC administration.
Gromacs is a good example as it is a sufficiently complex piece of software. Quite some configuration is done on the command line and there is no current software package of it in the common RPM repositories. However, for scientific software it is quite well-documented and easy to install, so there is a lot to be learned.
"},{"location":"how-to/software/scientific-software/#related-documents","title":"Related Documents","text":" - How-To: Build and Run OpenMPI Programs
"},{"location":"how-to/software/scientific-software/#steps-for-installing-scientific-software","title":"Steps for Installing Scientific Software","text":"We will perform the following step:
- Download and extract the source of the software
- Configure the software (i.e., create the actual build system
Makefile
s) - Compile the software
- Install the software
- Create environment module files so the software is easy to use
Many scientific software packages will have more dependencies. If the dependencies are available as CentOS Core or EPEL packages (such as zlib), HPC IT administration can install them. Otherwise, you will have to install them on your own.
Warning
Do not perform the compilation on the login nodes but go to a compute node instead.
"},{"location":"how-to/software/scientific-software/#downloading-and-extracting-software","title":"Downloading and Extracting Software","text":"This is best done in your scratch
directory as we don't have to keep these files around for long. Note that the files in your scratch
directory will automatically be removed after 2 weeks. You can also use your work
directory here.
hpc-login-1:~$ srun --pty bash -i\nmed0127:~$ mkdir $HOME/scratch/gromacs-install\nmed0127:~$ cd $HOME/scratch/gromacs-install\nmed0127:~$ wget http://ftp.gromacs.org/pub/gromacs/gromacs-2018.3.tar.gz\nmed0127:~$ tar xf gromacs-2018.3.tar.gz\nmed0127:~$ ls gromacs-2018.3\nadmin cmake COPYING CTestConfig.cmake INSTALL scripts src\nAUTHORS CMakeLists.txt CPackInit.cmake docs README share tests\n
So far so good!
"},{"location":"how-to/software/scientific-software/#perform-the-configure-step","title":"Perform the Configure Step","text":"This is the most critical step. Most scientific C/C++ software has a build step and allows for, e.g., disabling and enabling features or setting installation paths. Here, you can configure the software depending on your needs and environment. Also, it is the easiest step to mess up.
Gromacs' documentation is actually quite good, but the author had problems following it to the letter. Gromacs recommends creating an MPI and a non-MPI build, but the precise way described there did not work. This installation creates two flavours of Gromacs 2018.3, but in a different way than the Gromacs documentation proposes.
First, here is how to configure the non-MPI flavour. Gromacs wants a modern compiler, so we load gcc
. We will need to note down the precise version we used so later we can load it for running Gromacs with the appropriate libraries. We will install gromacs into $HOME/work/software
, which is appropriate for user-installed software, but it could also go into a group or project directory. Note that we install the software into your work directory as software installations are quite large and might go above your home quota. Also, software installations are usually not precious enough to waste resources on snapshots and backups. Note also that we force Gromacs to use AVX_256
for SIMD support (Intel sandy bridge architecture) to not get unsupported CPU instruction errors.
med0127:~$ module load gcc/7.2.0-0 cmake/3.11.0-0\nmed0127:~$ module list\nCurrently Loaded Modulefiles:\n 1) gcc/7.2.0-0 2) cmake/3.11.0-0\nmed0127:~$ mkdir gromacs-2018.3-build-nompi\nmed0127:~$ cd gromacs-2018.3-build-nompi\nmed0127:~$ cmake ../gromacs-2018.3 \\\n -DGMX_BUILD_OWN_FFTW=ON \\\n -DGMX_MPI=OFF \\\n -DGMX_SIMD=AVX_256 \\\n -DCMAKE_INSTALL_PREFIX=$HOME/work/software/gromacs/2018.3\n
Second, here is how to configure the MPI flavour. Note that we are also enabling the openmpi
module. We will also need the precise version here so we can later load the correct libraries. Note that we install the software into the directory gromacs-mpi
but switch off shared library building as recommended by the Gromacs documentation.
med0127:~$ module load openmpi/3.1.0-0\nmed0127:~$ module list\nCurrently Loaded Modulefiles:\n 1) gcc/7.2.0-0 2) cmake/3.11.0-0 3) openmpi/4.0.3-0\nmed0127:~$ mkdir gromacs-2018.3-build-mpi\nmed0127:~$ cd gromacs-2018.3-build-mpi\nmed0127:~$ cmake ../gromacs-2018.3 \\\n -DGMX_BUILD_OWN_FFTW=ON \\\n -DGMX_MPI=ON \\\n -DGMX_SIMD=AVX_256 \\\n -DCMAKE_INSTALL_PREFIX=$HOME/work/software/gromacs-mpi/2018.3 \\\n -DCMAKE_C_COMPILER=$(which mpicc) \\\n -DCMAKE_CXX_COMPILER=$(which mpicxx) \\\n -DBUILD_SHARED_LIBS=off\n
"},{"location":"how-to/software/scientific-software/#perform-the-build-and-install-steps","title":"Perform the Build and Install Steps","text":"This is simple, using -j 32
allows us to build with 32 threads. If something goes wrong: meh, the \"joys\" of compiling C software.
Getting Support for Building Software
BIH HPC IT cannot provide support for compiling scientific software. Please contact the appropriate mailing lists or forums for your scientific software. You should contact the BIH HPC IT helpdesk only if you are sure that the problem is with the BIH HPC cluster. You should try to resolve the issue on your own and with the developers of the software that you are trying to build/use.
For the non-MPI version:
med0127:~$ cd ../gromacs-2018.3-build-nompi\nmed0127:~$ make -j 32\n[...]\nmed0127:~$ make install\n
For the MPI version:
med0127:~$ cd ../gromacs-2018.3-build-mpi\nmed0127:~$ make -j 32\n[...]\nmed0127:~$ make install\n
"},{"location":"how-to/software/scientific-software/#create-environment-modules-files","title":"Create Environment Modules Files","text":"For Gromacs 2018.3, the following is appropriate. You should be able to use this as a template for your environment module files:
med0127:~$ mkdir -p $HOME/local/modules/gromacs\nmed0127:~$ cat >$HOME/local/modules/gromacs/2018.3 <<\"EOF\"\n#%Module\nproc ModulesHelp { } {\n puts stderr {\n Gromacs molecular simulation toolkit (non-MPI version)\n\n - http://www.gromacs.org\n }\n}\n\nmodule-whatis {Gromacs molecular simulation toolkit (non-MPI)}\n\nset root /data/cephfs-1/home/users/YOURUSER/work/software/gromacs/2018.3\n\nprereq gcc/7.2.0-0\n\nconflict gromacs\nconflict gromacs-mpi\n\nprepend-path LD_LIBRARY_PATH $root/lib64\nprepend-path LIBRARY_PATH $root/lib64\nprepend-path MANPATH $root/share/man\nprepend-path PATH $root/bin\nsetenv GMXRC $root/bin/GMXRC\nEOF\n
med0127:~$ mkdir -p $HOME/local/modules/gromacs-mpi\nmed0127:~$ cat >$HOME/local/modules/gromacs-mpi/2018.3 <<\"EOF\"\n#%Module\nproc ModulesHelp { } {\n puts stderr {\n Gromacs molecular simulation toolkit (MPI version)\n\n - http://www.gromacs.org\n }\n}\n\nmodule-whatis {Gromacs molecular simulation toolkit (MPI)}\n\nset root /data/cephfs-1/home/users/YOURUSER/work/software/gromacs-mpi/2018.3\n\nprereq openmpi/4.0.3-0\nprereq gcc/7.2.0-0\n\nconflict gromacs\nconflict gromacs-mpi\n\nprepend-path LD_LIBRARY_PATH $root/lib64\nprepend-path LIBRARY_PATH $root/lib64\nprepend-path MANPATH $root/share/man\nprepend-path PATH $root/bin\nsetenv GMXRC $root/bin/GMXRC\nEOF\n
With the next command, make your local module files path known to the environment modules system.
med0127:~$ module use $HOME/local/modules\n
You can verify the result:
med0127:~$ module avail\n\n------------------ /data/cephfs-1/home/users/YOURUSER/local/modules ------------------\ngromacs/2018.3 gromacs-mpi/2018.3\n\n-------------------- /usr/share/Modules/modulefiles --------------------\ndot module-info null\nmodule-git modules use.own\n\n-------------------------- /opt/local/modules --------------------------\ncmake/3.11.0-0 llvm/6.0.0-0 openmpi/3.1.0-0\ngcc/7.2.0-0 matlab/r2016b-0 openmpi/4.0.3-0\n
"},{"location":"how-to/software/scientific-software/#interlude-convenient-module-use","title":"Interlude: Convenient module use
","text":"You can add this to your ~/.bashrc
file to always execute the module use
after login. Note that module
is not available on the login or transfer nodes, but the following should work fine:
med0127:~$ cat >>~/.bashrc <<\"EOF\"\ncase \"${HOSTNAME}\" in\n login-*|transfer-*)\n ;;\n *)\n module use $HOME/local/modules\n ;;\nesac\nEOF\n
Note that the paths chosen above are sensible but arbitrary. You can install any software anywhere you have permission to -- somewhere in your user and group home, maybe a project home makes most sense on the BIH HPC, no root permissions required. You can also place the module files anywhere, as long as the module use
line is appropriate.
As a best practice, you could use the following location:
- User-specific installation:
$HOME/work/software
as a root to install software to $HOME/work/software/$PKG/$VERSION
for installing a given software package in a given version $HOME/work/software/modules
as the root for modules to install $HOME/work/software/modules/$PKG/$VERSION
for the module file to load the software in a given version $HOME/work/software/modules.sh
as a Bash script to contain the line module use $HOME/work/software/modules
- Group/project specific installation for a shared setup. Don't forget to give the group and yourself read permission only so you don't accidentally damage files after installation (
chmod ug=rX,o= $GROUP/work/software
, the upper case X
is essential to only set +x
on directories and not files): $GROUP/work/software
as a root to install software to $GROUP/work/software/$PKG/$VERSION
for installing a given software package in a given version $GROUP/work/software/modules
as the root for modules to install $GROUP/work/software/modules/$PKG/$VERSION
for the module file to load the software in a given version $GROUP/work/software/modules.sh
as a Bash script to contain the case
Bash snippet from above but with module use $GROUP/work/software/modules
- This setup allows multiple users to provide software installations and share it with others.
"},{"location":"how-to/software/scientific-software/#going-on-with-gromacs","title":"Going on with Gromacs","text":"Every time you want to use Gromacs, you can now do
med0127:~$ module load gcc/7.2.0-0 gromacs/2018.3\n
or, if you want to have the MPI version:
med0127:~$ module load gcc/7.2.0-0 openmpi/4.0.3-0 gromacs-mpi/2018.3\n
"},{"location":"how-to/software/scientific-software/#launching-gromacs","title":"Launching Gromacs","text":"Something along the lines of the following job script should be appropriate. See How-To: Build Run OpenMPI Programs for more information.
#!/bin/bash\n\n# Example job script for (multi-threaded) MPI programs.\n\n# Generic arguments\n\n# Job name\n#SBATCH --job-name gromacs\n# Maximal running time of 10 min\n#SBATCH --time 00:10:00\n# Allocate 1GB of memory per CPU\n#SBATCH --mem 1G\n# Write logs to directory \"slurm_log/<name>-<job id>.log\" (dir must exist)\n#SBATCH --output slurm_log/%x-%J.log\n\n# MPI-specific parameters\n\n# Run 8 tasks (MPI processes)\n#SBATCH --ntasks 8\n# Allocate 4 CPUs per task (threads per MPI process)\n#SBATCH --cpus-per-task 4\n\n# Load the OpenMPI and GCC environment module to get the runtime environment.\nmodule load gcc/7.2.0-0\nmodule load openmpi/4.0.3-0\n\n# Make custom environment modules known. Alternatively, you can \"module use\"\n# them in the session you use for submitting the job.\nmodule use $HOME/local/modules\nmodule load gromacs-mpi/2018.3\n\n# Launch the program with 8 MPI processes and tell Gromacs to use 4 threads for each\n# invocation.\nexport OMP_NUM_THREADS=4\nmpirun -n 8 gmx_mpi mdrun -deffnm npt_1000\n
med0127:~$ mkdir slurm_log\nmed0127:~$ sbatch job_script.sh\nSubmitted batch job 3229\n
"},{"location":"how-to/software/tensorflow/","title":"How-To: Setup TensorFlow","text":"TensorFlow is a package for deep learning with optional support for GPUs. You can find the original TensorFlow installation instructions here.
This article describes how to set up TensorFlow with GPU support using Conda. This how-to assumes that you have just connected to a GPU node via srun --mem=10g --partition=gpu --gres=gpu:tesla:1 --pty bash -i
(for Tesla V100 GPUs; for A40 GPUs use --gres=gpu:a40:1
). Note that you will need to allocate \"enough\" memory, otherwise your python session will be Killed
because of too little memory. You should read the How-To: Connect to GPU Nodes tutorial for an explanation of how to do this.
This tutorial assumes that conda has been set up as described in Software Management.
"},{"location":"how-to/software/tensorflow/#create-conda-environment","title":"Create conda environment","text":"We recommend that you install mamba first with conda install -y mamba
and use this C++ reimplementation of the conda command
as follows.
$ conda create -y -n python-tf tensorflow-gpu\n$ conda activate python-tf\n
Let us verify that we have Python and TensorFlow installed. You might get different versions; you could pin the versions when installing with conda create -y -n python-tf python==3.9.10 tensorflow-gpu==2.6.2
$ python --version\nPython 3.9.10\n$ python -c 'import tensorflow; print(tensorflow.__version__)'\n2.6.2\n
We thus end up with an installation of Python 3.9.10 with tensorflow 2.6.2.
"},{"location":"how-to/software/tensorflow/#run-tensorflow-example","title":"Run TensorFlow Example","text":"Let us now see whether TensorFlow has recognized our GPU correctly.
$ python\n>>> import tensorflow as tf\n>>> print(\"TensorFlow version:\", tf.__version__)\nTensorFlow version: 2.6.2\n>>> print(tf.config.list_physical_devices())\n[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]\n
Yay, we can proceed to run the Quickstart Tutorial.
>>> mnist = tf.keras.datasets.mnist\n>>> (x_train, y_train), (x_test, y_test) = mnist.load_data()\n>>> x_train, x_test = x_train / 255.0, x_test / 255.0\n>>> model = tf.keras.models.Sequential([\n... tf.keras.layers.Flatten(input_shape=(28, 28)),\n... tf.keras.layers.Dense(128, activation='relu'),\n... tf.keras.layers.Dropout(0.2),\n... tf.keras.layers.Dense(10)\n... ])\n>>> predictions = model(x_train[:1]).numpy()\n>>> predictions\narray([[-0.50569224, 0.26386747, 0.43226188, 0.61226094, 0.09630793,\n 0.34400576, 0.9819117 , -0.3693726 , 0.5221357 , 0.3323232 ]],\n dtype=float32)\n>>> tf.nn.softmax(predictions).numpy()\narray([[0.04234391, 0.09141268, 0.10817807, 0.12951255, 0.07731011,\n 0.09903987, 0.18743432, 0.04852816, 0.11835073, 0.09788957]],\n dtype=float32)\n>>> loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n>>> loss_fn(y_train[:1], predictions).numpy()\n2.3122327\n>>> model.compile(optimizer='adam',\n... loss=loss_fn,\n... metrics=['accuracy'])\n>>> model.fit(x_train, y_train, epochs=5)\n2022-03-09 17:53:47.237997: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)\nEpoch 1/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.2918 - accuracy: 0.9151\nEpoch 2/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.1444 - accuracy: 0.9561\nEpoch 3/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.1082 - accuracy: 0.9674\nEpoch 4/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.0898 - accuracy: 0.9720\nEpoch 5/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.0773 - accuracy: 0.9756\n<keras.callbacks.History object at 0x154e81360190>\n>>> model.evaluate(x_test, y_test, verbose=2)\n313/313 - 0s - loss: 0.0713 - accuracy: 0.9785\n[0.0713074803352356, 0.9785000085830688]\n>>> probability_model = tf.keras.Sequential([\n... model,\n... tf.keras.layers.Softmax()\n... ])\n>>> probability_model(x_test[:5])\n<tf.Tensor: shape=(5, 10), dtype=float32, numpy=\narray([[1.2339272e-06, 6.5599060e-10, 1.0560590e-06, 5.9356184e-06,\n 5.3691075e-12, 1.4447859e-07, 5.4218874e-13, 9.9996936e-01,\n 1.0347234e-07, 2.2147648e-05],\n [2.9887938e-06, 6.8461006e-05, 9.9991941e-01, 7.2003731e-06,\n 2.9751782e-13, 8.2818183e-08, 1.4307782e-06, 2.3203837e-13,\n 4.7433215e-07, 2.9504194e-14],\n [1.8058477e-06, 9.9928612e-01, 7.8716243e-05, 3.9140195e-06,\n 3.0842333e-05, 9.4537208e-06, 2.2774333e-05, 4.5549971e-04,\n 1.1015874e-04, 6.9138093e-07],\n [9.9978787e-01, 3.0206781e-08, 2.8528208e-05, 8.5581682e-08,\n 1.3851340e-07, 2.3634559e-06, 1.8480707e-05, 1.0153375e-04,\n 1.1583331e-07, 6.0887167e-05],\n [6.4914235e-07, 2.5808356e-08, 1.8225538e-06, 2.3215563e-09,\n 9.9588013e-01, 4.6049720e-08, 3.8903639e-07, 2.9772724e-05,\n 4.3141077e-07, 4.0867776e-03]], dtype=float32)>\n>>> exit()\n
"},{"location":"how-to/software/tensorflow/#writing-tensorflow-slurm-jobs","title":"Writing TensorFlow Slurm Jobs","text":"Writing Slurm jobs using TensorFlow is as easy as creating the following scripts.
tf_script.py
#!/usr/bin/env python\n\nimport tensorflow as tf\nprint(\"TensorFlow version:\", tf.__version__)\nprint(tf.config.list_physical_devices())\n\nmnist = tf.keras.datasets.mnist\n\n(x_train, y_train), (x_test, y_test) = mnist.load_data()\nx_train, x_test = x_train / 255.0, x_test / 255.0\n\n\nmodel = tf.keras.models.Sequential([\n tf.keras.layers.Flatten(input_shape=(28, 28)),\n tf.keras.layers.Dense(128, activation='relu'),\n tf.keras.layers.Dropout(0.2),\n tf.keras.layers.Dense(10)\n])\n\npredictions = model(x_train[:1]).numpy()\nprint(predictions)\n\nprint(tf.nn.softmax(predictions).numpy())\n\n# ... and so on ;-)\n
tf_job.sh
#!/usr/bin/bash\n\n#SBATCH --job-name=tf-job\n#SBATCH --mem=10g\n#SBATCH --partition=gpu\n#SBATCH --gres=gpu:tesla:1\n\nsource $HOME/work/miniforge/bin/activate\nconda activate python-tf\n\npython tf_script.py &>tf-out.txt\n
And then calling
$ sbatch tf_job.sh\n
You can find the results in tf-out.txt
after completion.
$ cat tf-out.txt \n2022-03-09 18:05:54.628846: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA\nTo enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n2022-03-09 18:05:56.999848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30988 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:18:00.0, compute capability: 7.0\nTensorFlow version: 2.6.2\n[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]\n[[-0.07757086 0.04676083 0.9420195 -0.59902835 -0.26286742 -0.392514\n 0.3231195 -0.17169198 0.3480805 0.37013203]]\n[[0.07963609 0.09017922 0.22075593 0.04727634 0.06616627 0.05812084\n 0.11888511 0.07248258 0.12188996 0.12460768]]\n
"},{"location":"hpc-tutorial/episode-0/","title":"First Steps: Episode 0","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm?"},{"location":"hpc-tutorial/episode-0/#prerequisites","title":"Prerequisites","text":"This tutorial assumes familiarity with Linux/Unix operating systems. It also assumes that you have already connected to the cluster. We have collected some links to tutorials and manuals on the internet.
"},{"location":"hpc-tutorial/episode-0/#legend","title":"Legend","text":"Before we start with our first steps tutorial, we would like to introduce the following convention that we use throughout the series:
$ Commands are prefixed with a little dollar sign\n
While file paths are highlighted like this: /data/cephfs-1/work/projects/cubit/current
.
"},{"location":"hpc-tutorial/episode-0/#instant-gratification","title":"Instant Gratification","text":"After connecting to the cluster, you are located on a login node. To get to your first compute node, type srun --time 7-00 --mem=8G --cpus-per-task=8 --pty bash -i
which will launch an interactive Bash session on a free remote node running for up to 7 days, enabling you to use 8 cores and 8 GB of memory. Typing exit
will bring you back to the login node.
hpc-login-1$ srun -p long --time 7-00 --mem=8G --cpus-per-task=8 --pty bash -i\nhpc-cpu-1$ exit\n$\n
See? That was easy!
"},{"location":"hpc-tutorial/episode-0/#preparation","title":"Preparation","text":"In preparation for our first steps tutorial series, we would like you to install the software for this tutorial. In general the users on the cluster will manage their own software with the help of conda. If you haven't done so already, please follow the instructions in installing conda first. The only prerequisite is that you are able to log into the cluster. Also make sure that you are logged in to a compute node using srun -p medium --time 1-00 --mem=4G --cpus-per-task=1 --pty bash -i
.
Now we will create a new environment, so as to not interfere with your current or planned software stack, and install into it all the software that we need during the tutorial. Run the following commands:
$ conda create -n first-steps python=3 snakemake bwa delly samtools gatk4\n$ conda activate first-steps\n(first-steps) $\n
"},{"location":"hpc-tutorial/episode-1/","title":"First Steps: Episode 1","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? This is part one of the \"First Steps\" BIH Cluster Tutorial. Here we will build a small pipeline with alignment and variant calling. The premise is that you have the tools installed as described in Episode 0. For this episode, please make sure that you are on a compute node. As a reminder, the command to access a compute node with the required resources is
$ srun --time 7-00 --mem=8G --cpus-per-task=8 --pty bash -i\n
"},{"location":"hpc-tutorial/episode-1/#tutorial-input-files","title":"Tutorial Input Files","text":"We will provide you with some example FASTQ files, but you can use your own if you like. You can find the data here:
/data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz
/data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz
"},{"location":"hpc-tutorial/episode-1/#creating-a-project-directory","title":"Creating a Project Directory","text":"First, you should create a folder where the output of this tutorial will go. It would be good to have it in your work
directory in /data/cephfs-1/home/users/$USER
, because it is faster and there is more space available.
(first-steps) $ mkdir -p /data/cephfs-1/home/users/$USER/work/tutorial/episode1\n(first-steps) $ pushd /data/cephfs-1/home/users/$USER/work/tutorial/episode1\n
Quotas / File System limits
- Note well that you have a quota of 1 GB in your home directory at
/data/cephfs-1/home/users/$USER
. The reason for this is that nightly snapshots and backups are created for this directory which are precious resources. - This limit does not apply to your work directory at
/data/cephfs-1/home/users/$USER/work
. The limits are much higher here but no snapshots or backups are available. - There is no limit on your scratch directory at
/data/cephfs-1/home/users/$USER/scratch
. However, files placed here are automatically removed after 2 weeks. This is only appropriate for files during download or temporary files.
"},{"location":"hpc-tutorial/episode-1/#creating-a-directory-for-temporary-files","title":"Creating a Directory for Temporary Files","text":"In general it is advisable to have a proper temporary directory available. You can create one in your ~/scratch
folder and make it available to the system.
(first-steps) $ export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp\n(first-steps) $ mkdir -p $TMPDIR\n
"},{"location":"hpc-tutorial/episode-1/#using-the-cubit-static-data","title":"Using the Cubit Static Data","text":"The static data is located in /data/cephfs-1/work/projects/cubit/current/static_data
. For our small example, the required reference genome and index can be found at:
/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta
/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta
"},{"location":"hpc-tutorial/episode-1/#aligning-the-reads","title":"Aligning the Reads","text":"Let's align our data:
(first-steps) $ bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n /data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz \\\n| samtools view -b \\\n| samtools sort -O BAM -T $TMPDIR -o aln.bam\n\n(first-steps) $ samtools index aln.bam\n
"},{"location":"hpc-tutorial/episode-1/#perform-structural-variant-calling","title":"Perform Structural Variant Calling","text":"And do the structural variant calling:
(first-steps) $ delly call \\\n -g /data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta \\\n aln.bam\n
Note that delly will not find any variants.
"},{"location":"hpc-tutorial/episode-1/#small-variant-calling-snv-indel","title":"Small Variant Calling (SNV, indel)","text":"And now for the SNP calling (this step will take ~ 20 minutes):
(first-steps) $ gatk HaplotypeCaller \\\n -R /data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta \\\n -I aln.bam \\\n -ploidy 2 \\\n -O test.GATK.vcf\n
"},{"location":"hpc-tutorial/episode-1/#outlook-more-programs-and-static-data","title":"Outlook: More Programs and Static Data","text":"So this is it! We used the tools that we installed previously, accessed the reference data and ran a simple alignment and variant calling pipeline. You can access a list of all static data through this wiki, follow this link to the Static Data. You can also have a peek via:
(first-steps) $ tree -L 3 /data/cephfs-1/work/projects/cubit/current/static_data | less\n
"},{"location":"hpc-tutorial/episode-2/","title":"First Steps: Episode 2","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? Welcome to the second episode of our tutorial series!
Once you are logged in to the cluster, you have the possibility to distribute your jobs to all the nodes that are available. But how can you do this easily? The key command to this magic is sbatch
. This tutorial will show you how you can use this efficiently.
"},{"location":"hpc-tutorial/episode-2/#the-sbatch-command","title":"The sbatch
Command","text":"So what is sbatch
doing for you?
You use the sbatch
command in front of the script you actually want to run. sbatch
then puts your job into the job queue. The job scheduler looks at the current status of the whole system and will assign the first job in the queue to a node that is free in terms of computational load. If all machines are busy, yours will wait. But your job will sooner or later get assigned to a free node.
We strongly recommend using this process for starting your computationally intensive tasks because you will get the best performance for your job and the whole system won't be disturbed by jobs that are locally blocking nodes. Thus, everybody using the cluster benefits.
You may have noticed that you run sbatch
with a script, not with regular commands. The reason is that sbatch
only accepts bash scripts. If you give sbatch
a normal shell command or binary, it won't work. This means that we have to put the command(s) we want to use in a bash script. A skeleton script can be found at /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_job.sh
The content of the file:
#!/bin/bash\n\n# Set a name for the job (-J or --job-name).\n#SBATCH --job-name=tutorial\n\n# Set the file to write the stdout and stderr to (if -e is not set; -o or --output).\n#SBATCH --output=logs/%x-%j.log\n\n# Set the number of cores (-c or --cpus-per-task).\n#SBATCH --cpus-per-task=8\n\n# Force allocation of the two cores on ONE node.\n#SBATCH --nodes=1\n\n# Set the total memory. Units can be given in T|G|M|K.\n#SBATCH --mem=8G\n\n# Optionally, set the partition to be used (-p or --partition).\n#SBATCH --partition=medium\n\n# Set the expected running time of your job (-t or --time).\n# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS\n#SBATCH --time=30:00\n\nexport TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp\nmkdir -p ${TMPDIR}\n
The lines starting with #SBATCH
are actually setting parameters for a sbatch
command, so #SBATCH --job-name=tutorial
is equal to sbatch --job-name=tutorial
. Slurm will create a log file with a file name composed of the job name (%x
) and the job ID (%j
), e.g. logs/tutorial-XXXX.log
. It will not automatically create the logs
directory, so we need to create it manually first. Here, we emphasize the importance of the log files! They are the first place to look if anything goes wrong.
To start now with our tutorial, create a new tutorial directory with a log directory, e.g.,
(first-steps) $ mkdir -p /data/cephfs-1/home/users/$USER/work/tutorial/episode2/logs\n
and copy the wrapper script to this directory:
(first-steps) $ pushd /data/cephfs-1/home/users/$USER/work/tutorial/episode2\n(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_job.sh .\n(first-steps) $ chmod u+w submit_job.sh\n
Now open this file and copy the same commands we executed in the last tutorial to this file.
To keep it simple, we will put everything into one script. This is perfectly fine because the alignment and indexing are sequential. But there are two steps that could be run in parallel, namely the variant calling, because they don't depend on each other. We will learn how to do that in a later tutorial. Your file should look something like this:
#!/bin/bash\n\n# Set a name for the job (-J or --job-name).\n#SBATCH --job-name=tutorial\n\n# Set the file to write the stdout and stderr to (if -e is not set; -o or --output).\n#SBATCH --output=logs/%x-%j.log\n\n# Set the number of cores (-c or --cpus-per-task).\n#SBATCH --cpus-per-task=8\n\n# Force allocation of the two cores on ONE node.\n#SBATCH --nodes=1\n\n# Set the total memory. Units can be given in T|G|M|K.\n#SBATCH --mem=8G\n\n# Optionally, set the partition to be used (-p or --partition).\n#SBATCH --partition=medium\n\n# Set the expected running time of your job (-t or --time).\n# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS\n#SBATCH --time=30:00\n\nexport TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp\nmkdir -p ${TMPDIR}\n\nBWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\nREF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\nbwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n $BWAREF \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz \\\n| samtools view -b \\\n| samtools sort -O BAM -T $TMPDIR -o aln.bam\n\nsamtools index aln.bam\n\ndelly call -g \\\n $REF \\\n aln.bam\n\ngatk HaplotypeCaller \\\n -R $REF \\\n -I aln.bam \\\n -ploidy 2 \\\n -O test.GATK.vcf\n
Let's run it (make sure that you are in the tutorial/episode2
directory!):
(first-steps) $ sbatch submit_job.sh\n
And wait for the response which will tell you that your job was submitted and which job id number it was assigned. Note that sbatch
only tells you that the job has been submitted, but nothing about it starting or finishing. You won't get any response at the terminal when the job finishes. It will take approximately 20 minutes to finish the job.
"},{"location":"hpc-tutorial/episode-2/#monitoring-jobs","title":"Monitoring Jobs","text":"You'll probably want to see how your job is doing. You can get a list of your jobs using:
(first-steps) $ squeue --me\n
Note that logins are also considered as jobs.
Identify your job by the <JOBID>
(1st column) or the name of the script (3rd column). The most likely states you will see (5th column of the table):
PD
pending, waiting to be scheduled R
running - disappeared, either because of an error or because it finished
In the 8th column you can see that your job is very likely running on a different machine than the one you are on!
Do not use Slurm and watch
or loops
The watch
command is a useful tool for running commands in a loop every N
seconds. For example, on your workstation you could do watch 'ping -c 3 google.com'
to execute three network pings to Google every two seconds.
\ud83d\udc4e Using watch
or manual loops in a cluster environment can have bad effects when querying Slurm or the shared file system. Both are shared resources and \"expensive\" queries should not be run in loops. For Slurm, this includes running squeue
. The same would be true for running squeue -i
which performs an internal loop.
\ud83d\udc4d Use the Slurm query commands only when you actually need the output. If you run them in an (implicit or explicit) loop, then do so only for a short time and don't leave this open in a screen.
Get more information about your jobs by either passing the job id:
(first-steps) $ sstat <JOBID>\n
And of course, watch what the logs are telling you:
(first-steps) $ tail -f logs/tutorial-<JOBID>.log\n
There will be no notification when your job is done, so it is best to watch the squeue --me
command. To repeatedly run the squeue
command, there is the Linux command watch
to which you give a command to execute every few seconds. This is useful for looking for changes in the output of a command. The seconds between two executions can be set with the -n
option. It is best to use -n 60
to minimize unnecessary load on the file system:
(first-steps) $ watch -n 60 squeue --me\n
If for some reason your job is hanging, you can delete your job using scancel
with your job-ID: (first-steps) $ scancel <job-ID>\n
"},{"location":"hpc-tutorial/episode-2/#job-queues","title":"Job Queues","text":"The cluster has a special way of organizing itself and by telling the cluster how long and with which priority you want your jobs to run, you can help it in this. There is a system set up on the cluster where you can enqueue your jobs to so-called partitions. partitions have different prioritites and are allowed for different running times. To get to know what partitions are available, and how to use them properly, we highly encourage you to read the cluster queues wiki page.
"},{"location":"hpc-tutorial/episode-3/","title":"First Steps: Episode 3","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? In this episode we will discuss how we can parallelize steps in a pipeline that are not dependent on each other. In the last episode we saw a case (the variant calling) that could have been potentially parallelized.
We will take care of that today. Please note that we are not going to use the sbatch
command we learned earlier. Thus, this tutorial will run on the same node where you execute the script. We will introduce you to Snakemake, a tool with which we can model dependencies and run things in parallel. In the next tutorial we will learn how to submit the jobs with sbatch
and Snakemake combined.
For those who know make
already, Snakemake will be familiar. You can think of Snakemake as a bunch of dedicated bash scripts that you can make dependent on each other. Snakemake will start the next script when a previous one finishes, and potentially it will run things in parallel if the dependencies allow.
Snakemake can get confusing, especially if the project gets big. This tutorial will only cover the very basics of this powerful tool. For more, we highly recommend digging into the Snakemake documentation:
- https://snakemake.readthedocs.io/en/stable/
- http://slides.com/johanneskoester/deck-1#/
Every Snakemake run requires a Snakefile
file. Create a new folder inside your tutorial folder and copy the skeleton:
(first-steps) $ mkdir -p /data/cephfs-1/home/users/${USER}/work/tutorial/episode3\n(first-steps) $ pushd /data/cephfs-1/home/users/${USER}/work/tutorial/episode3\n(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/Snakefile .\n(first-steps) $ chmod u+w Snakefile\n
Your Snakefile
should look as follows:
rule all:\n input:\n 'snps/test.vcf',\n 'structural_variants/test.vcf'\n\nrule alignment:\n input:\n '/data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz',\n '/data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz',\n output:\n bam='alignment/test.bam',\n bai='alignment/test.bam.bai',\n shell:\n r\"\"\"\n export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp\n mkdir -p ${{TMPDIR}}\n\n BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n ${{BWAREF}} \\\n {input} \\\n | samtools view -b \\\n | samtools sort -O BAM -T ${{TMPDIR}} -o {output.bam}\n\n samtools index {output.bam}\n \"\"\"\n\nrule structural_variants:\n input:\n 'alignment/test.bam'\n output:\n 'structural_variants/test.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n delly call -o {output} -g ${{REF}} {input}\n \"\"\"\n\nrule snps:\n input:\n 'alignment/test.bam'\n output:\n 'snps/test.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n gatk HaplotypeCaller \\\n -R ${{REF}} \\\n -I {input} \\\n -ploidy 2 \\\n -O {output}\n \"\"\"\n
Let me explain. The content resembles the same steps we took in the previous tutorials. Although every step has its own rule (alignment, snp calling, structural variant calling), we could instead have written everything in one rule. It is up to you to design your rules! Note that the rule names are arbitrary and not mentioned anywhere else in the file.
But there is one primary rule: the rule all
. This is the kickoff rule that makes everything run.
As you might have noticed, every rule has three main parameters: input
, output
and shell
. input
defines the files that are going into the rule, output
those that are produced when executing the rule, and shell
is the bash script that processes input
to produce output
.
Rule all
does not have any output
or shell
, it uses input
to start the chain of rules. Note that the input files of this rule are the output files of rule snps
and structural_variants
. The input of those rules is the output of rule alignment
. This is how Snakemake processes the rules: It looks for rule all
(or a rule that just has input
files) and figures out how it can create the required input files with other rules by looking at their output
files (the input
files of one rule must be the output
files of another rule). In our case it traces the workflow back to rule snps
and structural_variants
as they have the matching output files. They in turn depend on the alignment, so the alignment
rule must be executed, and this is the first thing that will be done by Snakemake.
There are also some peculiarities about Snakemake:
- You can name files in
input
or output
as is done in rule alignment
with the output files. - You can access the
input
and output
files in the script by writing {input}
or {output}
. - If they are not named, they will be concatenated, separated by white space
- If they are named, access them with their name, e.g.,
{output.bam}
- Curly braces must be escaped by doubling them, e.g., for bash variables:
${{VAR}}
instead of ${VAR}
; this does not apply to Snakemake's own variables like {input}
or {output}
- In the rule
structural_variants
we cheat a bit because delly does not produce output files if it can't find variants. - We do this by
touching
(i.e., creating) the required output file. Snakemake has a function for doing so (call touch()
on the filename; see the sketch after this list).
- Intermediate folders in the path to output files are always created if they don't exist.
- Because Snakemake is Python based, you can write your own functions for it to use, e.g. for creating file names automatically.
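To make the last two points concrete, here is a minimal sketch. It is not part of the skeleton Snakefile, the rule and function names are made up for illustration, and the reference path is a placeholder:
# touch() ensures the output file exists even if the tool wrote nothing
# (e.g., delly finding no variants).
rule structural_variants_sketch:
    input: 'alignment/{id}.bam'
    output: touch('structural_variants/{id}.vcf')
    shell: 'delly call -o {output} -g /path/to/reference.fasta {input}'

# A plain Python function can compute file names from the {id} wildcard;
# Snakemake calls it with the matched wildcards when it is used as an input,
# e.g. "input: fastq_paths" inside a rule.
def fastq_paths(wildcards):
    base = '/data/cephfs-1/work/projects/cubit/tutorial/input'
    return ['%s/%s_R1.fq.gz' % (base, wildcards.id),
            '%s/%s_R2.fq.gz' % (base, wildcards.id)]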
But Snakemake can do more. It is able to parse the paths of the output files and set wildcards if you want. For this your input (and output) file names have to follow a parsable scheme. In our case they do! Our FASTQ files, our only initial input files, start with test
. The output of the alignment as well as the variant calling is also prefixed with test
. We can now modify the Snakefile accordingly, by exchanging every occurrence of test
in each input
or output
field with {id}
(note that you could also choose a different name for your wildcard). Only the input files of rule all should not be touched, otherwise Snakemake would not know which value the wildcard should take. Your Snakefile
should now look like this:
rule all:\n input:\n 'snps/test.vcf',\n 'structural_variants/test.vcf'\n\nrule alignment:\n input:\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R1.fq.gz',\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R2.fq.gz',\n output:\n bam='alignment/{id}.bam',\n bai='alignment/{id}.bam.bai',\n shell:\n r\"\"\"\n export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp\n mkdir -p ${{TMPDIR}}\n\n BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n ${{BWAREF}} \\\n {input} \\\n | samtools view -b \\\n | samtools sort -O BAM -T ${{TMPDIR}} -o {output.bam}\n\n samtools index {output.bam}\n \"\"\"\n\nrule structural_variants:\n input:\n 'alignment/{id}.bam'\n output:\n 'structural_variants/{id}.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n delly call -o {output} -g ${{REF}} {input}\n \"\"\"\n\nrule snps:\n input:\n 'alignment/{id}.bam'\n output:\n 'snps/{id}.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n gatk HaplotypeCaller \\\n -R ${{REF}} \\\n -I {input} \\\n -ploidy 2 \\\n -O {output}\n \"\"\"\n
Before we finally run this, we can make a dry run. Snakemake will show you what it would do:
(first-steps) $ snakemake -n\n
If everything looks green, you can run it for real. We provide it two cores to allow two single-threaded jobs to be run simultaneously:
(first-steps) $ snakemake -j 2\n
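You can also ask Snakemake for a single target file instead of the whole workflow; it will then only run the rules needed to produce that file. For example (optional, using one of the output files defined above):
(first-steps) $ snakemake -j 2 -p snps/test.vcf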
"},{"location":"hpc-tutorial/episode-4/","title":"First Steps: Episode 4","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? In the last episodes we learned about distributing a job among the cluster nodes using sbatch
and how to automate and parallelize our pipeline with Snakemake. We are lucky that those two powerful commands can be combined. What is the result? You will have an automated pipeline with Snakemake that uses sbatch
to distribute jobs among the cluster nodes instead of running everything on the same node.
The best thing is that we can reuse our Snakefile
as it is and just write a wrapper script to call Snakemake. We run the script and the magic will start.
First, create a new folder for this episode:
(first-steps) $ mkdir -p /data/cephfs-1/home/users/${USER}/work/tutorial/episode4/logs\n(first-steps) $ pushd /data/cephfs-1/home/users/${USER}/work/tutorial/episode4\n
And copy the wrapper script to this folder as well as the Snakefile (you can also reuse the one with the adjustments from the previous episode):
(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_snakejob.sh .\n(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/Snakefile .\n(first-steps) $ chmod u+w submit_snakejob.sh Snakefile\n
The Snakefile
is already known to you but let me explain the wrapper script submit_snakejob.sh
:
#!/bin/bash\n\n# Set a name for the job (-J or --job-name).\n#SBATCH --job-name=tutorial\n\n# Set the file to write the stdout and stderr to (if -e is not set; -o or --output).\n#SBATCH --output=logs/%x-%j.log\n\n# Set the number of cores (-c or --cpus-per-task).\n#SBATCH --cpus-per-task=2\n\n# Force allocation of the two cores on ONE node.\n#SBATCH --nodes=1\n\n# Set the total memory. Units can be given in T|G|M|K.\n#SBATCH --mem=1G\n\n# Optionally, set the partition to be used (-p or --partition).\n#SBATCH --partition=medium\n\n# Set the expected running time of your job (-t or --time).\n# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS\n#SBATCH --time=30:00\n\n\nexport TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp\nexport LOGDIR=logs/${SLURM_JOB_NAME}-${SLURM_JOB_ID}\nmkdir -p $LOGDIR\n\neval \"$($(which conda) shell.bash hook)\"\nconda activate first-steps\n\nset -x\n\nsnakemake --profile=cubi-v1 -j 2 -k -p --restart-times=2\n
In the beginning you see the #SBATCH
lines that introduce the parameters used when you provide this script to sbatch
as described in the second episode. Please make sure that the logs
folder exists before starting the run! We then set and export the TMPDIR
and LOGDIR
variables. Note that LOGDIR
points to a subfolder of logs named $SLURM_JOB_NAME-$SLURM_JOB_ID
that will be created for you. Snakemake will store its logfiles for this very Snakemake run in this folder. The next new thing is set -x
. This simply prints to the terminal every command that is executed within the script. This is useful for debugging.
Finally, the Snakemake call takes place. With the --profile
option we define that Snakemake uses the Snakemake profile at /etc/xdg/snakemake/cubi-v1
. The profile will take care of creating appropriate calls to sbatch
and interpret the following settings from your Snakemake rule:
threads
: the number of threads to execute the job on - memory in megabytes or with a suffix of
k
, M
, G
, or T
. You can specify EITHER resources.mem
/resources.mem_mb
: the memory to allocate for the whole job, OR resources.mem_per_thread
: the memory to allocate for each thread.
resources.time
: the running time of the rule, in a syntax supported by Slurm, e.g. HH:MM:SS
or D-HH:MM:SS
resources.partition
: the partition to submit your job into (Slurm will pick a fitting partition for you by default) resources.nodes
: the number of nodes to schedule your job on (defaults to 1
and you will want to keep that value unless you want to use MPI)
The other options to snakemake
have the following meaning:
-j 2
: run at most two jobs at the same time -k
: keep going even if a rule execution fails -p
: print the executed shell commands --restart-times=2
: restart failing jobs up to two times
It is now time to update your Snakefile
such that it actually specifies the resources mentioned above:
rule all:\n input:\n 'snps/test.vcf',\n 'structural_variants/test.vcf'\n\nrule alignment:\n input:\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R1.fq.gz',\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R2.fq.gz',\n output:\n bam='alignment/{id}.bam',\n bai='alignment/{id}.bam.bai',\n threads: 8\n resources:\n mem='8G',\n time='12:00:00',\n shell:\n r\"\"\"\n export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp\n mkdir -p ${{TMPDIR}}\n\n BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n ${{BWAREF}} \\\n {input} \\\n | samtools view -b \\\n | samtools sort -O BAM -T ${{TMPDIR}} -o {output.bam}\n\n samtools index {output.bam}\n \"\"\"\n\nrule structural_variants:\n input:\n 'alignment/{id}.bam'\n output:\n 'structural_variants/{id}.vcf'\n threads: 1\n resources:\n mem='4G',\n time='2-00:00:00',\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n delly call -o {output} -g ${{REF}} {input}\n \"\"\"\n\ndef snps_mem(wildcards, attempt):\n mem = 2 * attempt\n return '%dG' % mem\n\nrule snps:\n input:\n 'alignment/{id}.bam'\n output:\n 'snps/{id}.vcf'\n threads: 1\n resources:\n mem=snps_mem,\n time='04:00:00',\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n gatk HaplotypeCaller \\\n -R ${{REF}} \\\n -I {input} \\\n -ploidy 2 \\\n -O {output}\n \"\"\"\n
We thus configure the resource consumption of the rules as follows:
alignment
with 8 threads and up to 8GB of memory in total with a running time of up to 12 hours, structural_variants
with one thread and up to 4GB of memory with a running time of up to 2 days, snps
with one thread and running up to four hours. Instead of passing a static amount of memory, we pass a resource callable. The attempt
parameter will be passed a value of 1
on the initial invocation. If variant calling with the GATK HaplotypeCaller fails then it will retry and attempt
will have an incremented value on each invocation (2
on the first retry and so on). Thus, with --restart-times=2 we try to do small variant calling with 2, 4, and finally 6 GB of memory.
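The same callable pattern works for other resources as well. For illustration only, a hypothetical helper (not part of the tutorial Snakefile) could scale the running time with the attempt number in the same way and would be used just like snps_mem, i.e. as resources: time=snps_time:
def snps_time(wildcards, attempt):
    # 4 hours on the first attempt, 8 hours on the first retry, and so on.
    return '%d:00:00' % (4 * attempt)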
Finally, run the script:
(first-steps) $ sbatch submit_snakejob.sh\n
If you watch squeue --me
now, you will see that the jobs are distributed to the system:
(first-steps) $ squeue --me\n
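To follow the progress of the Snakemake master job you can also look at the log file configured in the wrapper script (the job ID below is a placeholder; take it from the squeue output):
(first-steps) $ tail -f logs/tutorial-<jobid>.log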
Please refer to the Snakemake documentation for more details on using Snakemake, in particular how to use the cluster configuration to specify the resource requirements on a per-rule basis.
"},{"location":"misc/external-resources/","title":"External Resources","text":""},{"location":"misc/external-resources/#basic-linux","title":"Basic Linux","text":"The BIH HPC uses CentOS Linux. A basic understanding of Linux is required. Even better, you should already have intermediate to advanced Linux/Unix skills.
BIH HPC IT cannot provide you with basic Unix training. Please ask your home organization (e.g., Charité or MDC) to provide you with basic Linux training.
That said, here are some resources that we find useful:
"},{"location":"misc/external-resources/#internet-tutorials","title":"Internet Tutorials","text":"There is a large number of Linux tutorials online including:
- Ryans Linux Tutorial
- Digital Ocean Tutorials
- Linux Basics
- Environment Variables
- Using Jupyter Notebooks to manage SLURM jobs
"},{"location":"misc/external-resources/#internet-forums","title":"Internet Forums","text":" - Unix & Linux Stack Exchange
"},{"location":"misc/external-resources/#global-organisation-for-bioinformatics-learning-education-and-training","title":"Global Organisation for Bioinformatics Learning, Education, and Training","text":"GOBLET has a number of Bioinformatics-focused tutorials. This includes
- \"A Critical Guide to Unix\"
"},{"location":"misc/provided-software/","title":"Administration-Provided Software","text":"Some software is provided by HPC Administration based on the criteria that it is:
- system-near or system-level,
- very commonly used.
Currently, this includes:
- GCC v7.2.0
- CMake v3.11.0
- LLVM v6.0.0
- OpenMPI v4.0.3
On the GPU node, this also includes a recent NVIDIA CUDA version.
To see which software is available, use module avail
on a compute node (this will not work on login nodes):
$ module avail\n--------------------- /opt/local/modules ---------------------\ncmake/3.11.0-0 llvm/6.0.0-0\ngcc/7.2.0-0 openmpi/4.0.3-0\n
To load software, use module load
. This will adjust the environment variables accordingly, in particular update PATH
such that the executable are available.
$ which gcc\n/bin/gcc\n$ module load gcc/7.2.0-0\n$ which gcc\n/opt/local/gcc-7.2.0-0/bin/gcc\n
Problems with executing module
?
See the corresponding FAQ entry in the case that you get a -bash: module: command not found
when calling module
.
"},{"location":"misc/publication-list/","title":"Publication List","text":"The BIH Cluster is a valuable resource. It has been used to support the publications listed below.
- Please add your publications here.
- Acknowledge usage of the cluster in your manuscript as \"Computation has been performed on the HPC for Research cluster of the Berlin Institute of Health\".
"},{"location":"misc/publication-list/#articles-preprints","title":"Articles & Preprints","text":""},{"location":"misc/publication-list/#2024","title":"2024","text":"Hollunder, B., Ostrem, J.L., Sahin, I.A., Rajamani, N., Oxenford, S., Butenko, K., Neudorfer, C., Reinhardt, P., Zvarova, P., Polosan, M., Akram, H., Vissani, M., Zhang, C., Sun, B., Navratil, P., Reich, M.M., Volkmann, J., Yeh, F.-C., Baldermann, J.C., Dembek, T.A., Visser-Vandewalle, V., Alho, E.J.L., Franceschini, P.R., Nanda, P., Finke, C., K\u00fchn, A.A., Dougherty, D.D., Richardson, R.M., Bergman, H., DeLong, M.R., Mazzoni, A., Romito, L.M., Tyagi, H., Zrinzo, L., Joyce, E.M., Chabardes, S., Starr, P.A., Li, N., Horn, A., 2024. Mapping dysfunctional circuits in the frontal cortex using deep brain stimulation. Nat. Neurosci. 1\u201314. doi: 10.1038/s41593-024-01570-1
"},{"location":"misc/publication-list/#2022","title":"2022","text":"Kossen T, Hirzel MA, Madai VI, Boenisch F, Hennemuth A, Hildebrand K, Pokutta S, Sharma K, Hilbert A, Sobesky J, Galinovic I, Khalil AA, Fiebach JB and Frey D. Toward Sharing Brain Images: Differentially Private TOF-MRA Images With Segmentation Labels Using Generative Adversarial Networks. Frontiers in Artificial Intelligence. 5 (2022). issn: 2624-8212. doi: 10.3389/frai.2022.813842
"},{"location":"misc/publication-list/#2021","title":"2021","text":"Li, N., Hollunder, B., Baldermann, J. C., Kibleur, A., Treu, S., Akram, H., Al-Fatly, B., Strange, B. A., Barcia, J. A., Zrinzo, L., Joyce, E. M., Chabardes, S., Visser-Vandewalle, V., Polosan, M., Kuhn, J., K\u00fchn, A. A., & Horn, A. (2021). A Unified Functional Network Target for Deep Brain Stimulation in Obsessive-Compulsive Disorder. Biological Psychiatry. doi: 10.1016/j.biopsych.2021.04.006
Bressem KK, Vahldiek JL, Adams L, Niehues SM, Haibel H, Rodriguez VR, Torgutalp M, Protopopov M, Proft F, Rademacher J, Sieper J, Rudwaleit M, Hamm B, Makowski MR, Hermann KG, Poddubnyy D. Deep learning for detection of radiographic sacroiliitis: achieving expert-level performance. Arthritis Res Ther. 2021 Apr 8;23(1):106. doi: 10.1186/s13075-021-02484-0
Kossen T, Subramaniam P, Madai VI, Hennemuth A, Hildebrand K, Hilbert A, Sobesky J, Livne M, Galinovic I, Khalil AA, Fiebach JB, Frey D. Synthesizing anonymized and labeled TOF-MRA patches for brain vessel segmentation using generative adversarial networks. Computers in Biology and Medicine. 2021 Apr 131,104254. doi: 10.1016/j.compbiomed.2021.104254
Paraskevopoulou S., K\u00e4fer S., Zirkel F., Donath A., Petersen M., Liu S., Zhou X., Drosten C., Misof B., Junglen S. (2021). \"Viromics of extant insect orders unveil the evolution of the flavi-like superfamily.\" Virus Evolution 2021 Mar 30. doi: 10.1093/ve/veab030
Thomas Krannich, W Timothy J White, Sebastian Niehus, Guillaume Holley, Bjarni V Halld\u00f3rsson, Birte Kehr, Population-scale detection of non-reference sequence variants using colored de Bruijn graphs, Bioinformatics, 2021, btab749, doi: 10.1093/bioinformatics/btab749
Julia Markowski, Rieke Kempfer, Alexander Kukalev, Ibai Irastorza-Azcarate, Gesa Loof, Birte Kehr, Ana Pombo, Sven Rahmann, Roland F Schwarz, GAMIBHEAR: whole-genome haplotype reconstruction from Genome Architecture Mapping data, Bioinformatics, Volume 37, Issue 19, 1 October 2021, Pages 3128\u20133135. doi: 10.1093/bioinformatics/btab238
"},{"location":"misc/publication-list/#2020","title":"2020","text":"Kr\u00fctzfeldt LM, Schubach M, Kircher M. The impact of different negative training data on regulatory sequence predictions. PLoS One. 2020 Dec 1;15(12):e0237412. doi: 10.1371/journal.pone.0237412.
Klotz-Noack K, Klinger B, Rivera M, Bublitz N, Uhlitz F, Riemer P, L\u00fcthen M, Sell T, Kasack K, Gastl B, Ispasanie SSS, Simon T, Janssen N, Schwab M, Zuber J, Horst D, Bl\u00fcthgen N, Sch\u00e4fer R, Morkel M, Sers C. SFPQ Depletion Is Synthetically Lethal with BRAFV600E in Colorectal Cancer Cells. Cell Rep. 2020 Sep 22;32(12):108184. doi: 10.1016/j.celrep.2020.108184.
Kleinert, P., Martin, B., & Kircher, M. (2020). \"HemoMIPs\u2014Automated analysis and result reporting pipeline for targeted sequencing data.\" PLOS Computational Biology, 16(6), e1007956. doi: 10.1371/journal.pcbi.1007956
Ehmke, N.; Cusmano-Ozog, K.; Koenig, R.; Holtgrewe, M.; Nur, B.; Mihci, E.; Babcock, H.; Gonzaga-Jauregui, C.; Overton, J. D.; Xiao, J.; et al. Biallelic Variants in KYNU Cause a Multisystemic Syndrome with Hand Hyperphalangism. Bone 2020, 115219. doi: 10.1016/j.bone.2019.115219.
Niehus, S.; J\u00f3nsson, H.; Sch\u00f6nberger, J.; Bj\u00f6rnsson, E.; Beyter, D.; Eggertsson, H.P.; Sulem, P.; Stef\u00e1nsson, K.; Halld\u00f3rsson, B.V.; Kehr, B. PopDel identifies medium-size deletions jointly in tens of thousands of genomes. bioRxiv 2020, 10.1101/740225 doi: 10.1101/740225
Gordon, M. G., Inoue, F., Martin, B., Schubach, M., Agarwal, V., Whalen, S., ... & Kreimer, A. (2020). \"lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements.\" Nature Protocols, 15(8), 2387-2412. doi: 10.1038/s41596-020-0333-5
Paraskevopoulou S., Pirzer F., Goldmann N., Schmid J., Corman V.M., Gottula L.T.,Schroeder S., Rasche A., Muth D., Drexler J.F., Heni A.C., Eibner G.J., Page R.A., Jones T.C., M\u00fcllerM.A., Sommer S., Glebe D., and Drosten C. (2020). \"Mammalian deltavirus without hepadnavirus coinfection in the neotropical rodent Proechimys semispinosus.\" Proceedings of the National Academy of Sciences 2020 Jul 28;117(30):17977-17983. doi: 10.1073/pnas.2006750117.
"},{"location":"misc/publication-list/#2019","title":"2019","text":"Kircher, M., Xiong, C., Martin, B. et al. \"Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution.\" Nat Commun 10, 3583 (2019). doi: 10.1038/s41467-019-11526-w
Stefanovski L, Triebkorn P, Spiegler A, Diaz-Cortes M-A, Solodkin A, Jirsa V, McIntosh RA and Ritter P (2019). \"Linking Molecular Pathways and Large-Scale Computational Modeling to Assess Candidate Disease Mechanisms and Pharmacodynamics in Alzheimer's Disease.\" Front. Comput. Neurosci.. 13:54. doi: 10.3389/fncom.2019.00054
Boeddrich A., Babila J.T., Wiglenda T., Diez L., Jacob M., Nietfeld W., Huska M.R., Haenig C., Groenke N., Buntru A., Blanc E., Meier J.C., Vannoni E., Erck C., Friedrich B., Martens H., Neuendorf N., Schnoegl S., Wolfer DP., Loos M., Beule D., Andrade-Navarro M.A., Wanker E.E. (2019). \"The Anti-amyloid Compound DO1 Decreases Plaque Pathology and Neuroinflammation-Related Expression Changes in 5xFAD Transgenic Mice.\" Cell Chem Biol. 2019 Jan 17;26(1):109-120.e7. doi: 10.1016/j.chembiol.2018.10.013.
Fountain M.D., Oleson, D.S., Rech. M.E., Segebrecht, L., Hunter, J.V., McCarthy, J.M., Lupo, P.J., Holtgrewe, M., Mora, R., Rosenfeld, J.A., Isidor, B., Le Caignec, C., Saenz, M.S., Pedersen, R.C., Morgen, T.M., Pfotenhauer, J.P., Xia, F., Bi, W., Kang, S.-H.L., Patel, A., Krantz, I.D., Raible, S.E., Smith, W.E., Cristian, I., Tori, E., Juusola, J., Millan, F., Wentzensen, I.M., Person, R.E., K\u00fcry, S., B\u00e9zieau, S., Uguen, K., F\u00e9rec, C., Munnich, A., van Haelst, M., Lichtenbelt, K.D., van Gassen, K., Hagelstrom, T., Chawla, A., Perry, D.L., Taft, R.J., Jones, M., Masser-Frye, D., Dyment, D., Venkateswaran, S., Li, C., Escobar, L,.F., Horn, D., Spillmann, R.C., Pe\u00f1a, L., Wierzba, J., Strom, T.M. Parent, I. Kaiser, F.J., Ehmke, N., Schaaf, C.P. (2019). \"Pathogenic variants in USP7 cause a neurodevelopmental disorder with speech delays, altered behavior, and neurologic anomalies.\" Genet. Med. 2019 Jan 25. doi: 10.1038/s41436-019-0433-1
Holtgrewe,M., Messerschmidt,C., Nieminen,M. and Beule,D. (2019) DigestiFlow: from BCL to FASTQ with ease. Bioinformatics, 10.1093/bioinformatics/btz850.
K\u00e4fer S., Paraskevopoulou S., Zirkel F., Wieseke N., Donath A., Petersen M., Jones T.C., Liu S., Zhou X., Middendorf M., Junglen S., Misof B., Drosten C. (2019). \"Re-assessing the diversity of negative strand RNA viruses in insects.\" PLOS Pathogens 2019 Dec 12. doi: 10.1371/journal.ppat.1008224
K\u00fchnisch,J., Herbst,C., Al\u2010Wakeel\u2010Marquard,N., Dartsch,J., Holtgrewe,M., Baban,A., Mearini,G., Hardt,J., Kolokotronis,K., Gerull,B., et al. (2019) Targeted panel sequencing in pediatric primary cardiomyopathy supports a critical role of TNNI3. Clin Genet, 96, 549\u2013559. https://doi.org/10.1111/cge.13645
Marklewitz M., Dutari L.C., Paraskevopoulou S., Page R.A., Loaiza J.R., Junglen S. (2019). \"Diverse novel phleboviruses in sandflies from the Panama Canal area, Central Panama.\" Journal of General Virology 2019 May 3. doi: 10.1099/jgv.0.001260
Quade,A., Thiel,A., Kurth,I., Holtgrewe,M., Elbracht,M., Beule,D., Eggermann,K., Scholl,U.I. and H\u00e4usler,M. (2019) Paroxysmal tonic upgaze: A heterogeneous clinical condition responsive to carbonic anhydrase inhibition. European Journal of Paediatric Neurology, 10.1016/j.ejpn.2019.11.002.
"},{"location":"misc/publication-list/#2018","title":"2018","text":"Blanc, E., Holtgrewe, M., Dhamodaran, A., Messerschmidt, C., Willimsky, G., Blankenstein, T., Beule, D. (2018). \"Identification and Ranking of Recurrent Neo-Epitopes in Cancer\". bioRxiv. 2018/389437, 2018. doi: 10.1101/389437
Brandt, R., Uhlitz, F., Riemer, P., Giesecke, C., Schulze, S., El-Shimy, I.A., Fauler, B., Mielke, T., Mages, N., Herrmann, B.G., Sers, C., Bl\u00fcthgen, N., Morkel, M. (2018). \"Cell type-dependent differential activation of ERK by oncogenic KRAS or BRAF in the mouse intestinal epithelium\". bioRxiv. 2018/340844. doi: 10.1101/340844.
Holtgrewe, M., Knaus, A., Hildebrand, G., Pantel, J.-T., Rodriguesz de los Santos, M., Neveling, K., Goldmann, J., Schubach, M., J\u00e4ger, M., Couterier, M., Mundlos, S., Beule, D., Sperling, K., Krawitz, P. (2018). \"Multisite de novo mutations in human offspring after paternal exposure to ionizing radiation\", Nature Scientific Reports. 2018 Oct 2;8(1):14611. doi: 10.1038/s41598-018-33066-x.
Kircher M., Xiong C., Martin B, Schubach M, Inoue F, Bell R.JA., Costello J.F., Shendure J., Ahituv N. (2018). \"Saturation mutagenesis of disease-associated regulatory elements.\" bioRxiv (2018): 505362. doi: 10.1101/505362
PCAWG Transcriptome Core Group, Calabrese, C., Davidson, N.R., Fonseca1, N.A., He, Y., Kahles, A., Lehmann, K.-V., Liu, F., Shiraishi, Y., Soulette, C.M., Urban, L., Demircio\u011flu, D., Greger, L., Li, S., Liu, D., Perry, M.D., Xiang, L., Zhang, F., Zhang, J., Bailey, P., Erkek, S., Hoadley, K.A., Hou, Y., Kilpinen, H., Korbel, J.O., Marin, M.G., Markowski, J., Nandi11, T., Pan-Hammarstr\u00f6m, Q., Pedamallu, C.S., Siebert, R., Stark, S.G., Su, H., Tan, P., Waszak, S.M., Yung, C., Zhu, S., PCAWG Transcriptome Working Group, Awadalla, P., Creighton, C.J., Meyerson, M., Ouellette, B.F.F., Wu, K., Yang, H., ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Network, Brazma1, A., Brooks, A.N., G\u00f6ke, J., R\u00e4tsch, G., Schwarz, R.F., Stegle, O., Zhang, Z. (2018). \"Genomic basis for RNA alterations revealed by whole-genome analyses of 27 cancer types\". bioRxiv. 2018/183889. doi: 10.1101/183889
Guneykaya D., Ivanov A., Hernandez D.P., Haage V., Wojtas B., Meyer N., Maricos M., Jordan P., Buonfiglioli A., Gielniewski B., Ochocka N., C\u00f6mert, C., Friedrich, C., Artiles, L. S., Kaminska, B., Mertins, P., Beule, D., Kettenmann, H. (2018). \"Transcriptional and translational differences of microglia from male and female brains\", Cell reports. 2018 Sep 4;24(10):2773-83. doi: 10.1016/j.celrep.2018.08.001.
Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. (2018). \"CADD: predicting the deleteriousness of variants throughout the human genome\", Nucleic Acids Res. 2018 Oct 29. doi: 10.1093/nar/gky1016.
Salatzki J., Foryst-Ludwig A., Bentele K., Blumrich A., Smeir E., Ban Z., Brix S., Grune J., Beyhoff N., Klopfleisch R., Dunst S., Surma, M.A., Klose, C., Rothe, M., Heinzel, F.R., Krannich, A., Kershaw, E.E., Beule, D., Schulze, P.C., Marx, N., Kintscher, U. (2018). \"Adipose tissue ATGL modifies the cardiac lipidome in pressure-overload-induced left ventricular failure\", PLoS genetics. 2018 Jan 10;14(1):e1007171. doi: 10.1371/journal.pgen.100717.
Schubach M., Re M., Robinson P.N., Valentini G. (2017) \"Imbalance-aware machine learning for predicting rare and common disease-associated non-coding variants\", Scientific reports 7:1, 2959. doi: 10.1038/s41598-017-03011-5.
Schubert M., Klinge, B., Kl\u00fcnemann M., Sieber A., Uhlitz F., Sauer S., Garnett M., Bl\u00fcthgen N., Saez-Rodriguez J. (2018). \"Perturbation-response genes reveal signaling footprints in cancer gene expression\". Nature Communications. 9: 20, 2018. doi: 10.1038/s41467-017-02391-6
"},{"location":"misc/publication-list/#2017","title":"2017","text":"Euskirchen, P., Bielle, F., Labreche, K., Kloosterman, W.P., Rosenberg, S., Daniau, M., Schmitt, C., Masliah-Planchon, J., Bourdeaut, F., Dehais, C., et al. (2017). Same-day genomic and epigenomic diagnosis of brain tumors using real-time nanopore sequencing. Acta Neuropathol 1\u201313. doi: 10.1007/s00401-017-1743-5
Euskirchen, P., Radke, J., Schmidt, M.S., Heuling, E.S., Kadikowski, E., Maricos, M., Knab, F., Grittner, U., Zerbe, N., Czabanka, M., et al. (2017). Cellular heterogeneity contributes to subtype-specific expression of ZEB1 in human glioblastoma. PLOS ONE 12, e0185376. doi: 10.1371/journal.pone.0185376
Mattei D., Ivanov A., Ferrai C., Jordan P., Guneykaya D., Buonfiglioli A., Schaafsma W., Przanowski P., Deuther-Conrad W., Brust P., Hesse S., Patt, M., Sabri, O., Ross, T.L., Eggen, B.J.L., Boddeke E.W.G.M., Kaminska, B., Beule, D., Pombo, A., Kettenmann, H., Wolf, S.A. (2017). \"Maternal immune activation results in complex microglial transcriptome signature in the adult offspring that is reversed by minocycline treatment.\" Translational psychiatry. 2017 May;7(5):e1120. doi: 10.1038/tp.2017.80.
Mamlouk, S., Childs, L. H., Aust, D., Heim, D., Melching, F., Oliveira, C., Wolf, T., Durek, P., Schumacher, D., Bl\u00e4ker, H., von Winterfeld, M., Gastl, B., M\u00f6hr, K., Menne, A., Zeugner, S., Redmer, T., Lenze, D., Tierling, S., M\u00f6bs, M., Weichert, W., Folprecht, G., Blanc, E., Beule, D., Sch\u00e4fer, R., Morkel, M., Klauschen, F., Leser, U. and Sers, C. (2017). \"DNA copy number changes define spatial patterns of heterogeneity in colorectal cancer\", Nature Communications. 2017; 8, p. 14093. doi: 10.1038/ncomms14093.
Messerschmidt, C., Holtgrewe, M. and Beule, D. (2017). \"HLA-MA: simple yet powerful matching of samples using HLA typing results\". Bioinformatics. 28, pp. 2592\u20132599. doi: 10.1093/bioinformatics/btx132.
Kammertoens, T., Friese, C., Arina, A., Idel, C., Briesemeister, D., Rothe, M., Ivanov, A., Szymborska, A., Patone, G., Kunz, S., Sommermeyer, D., Engels, B., Leisegang, M., Textor, A., Fehling, H. J., Fruttiger, M., Lohoff, M., Herrmann, A., Yu, H., Weichselbaum, R., Uckert, W., H\u00fcbner, N., Gerhardt, H., Beule, D., Schreiber, H. and Blankenstein, T. (2017). \"Tumour ischaemia by interferon-\u03b3 resembles physiological blood vessel regression\". Nature. 545(7652), pp. 98\u2013102. doi: 10.1038/nature22311.
Schulze Heuling, E., Knab, F., Radke, J., Eskilsson, E., Martinez-Ledesma, E., Koch, A., Czabanka, M., Dieterich, C., Verhaak, R.G., Harms, C., et al. (2017). Prognostic Relevance of Tumor Purity and Interaction with MGMT Methylation in Glioblastoma. Mol. Cancer Res. 15, 532\u2013540. doi: 10.1158/1541-7786.MCR-16-0322
Yaakov, G., Lerner, D., Bentele, K., Steinberger, J., Barkai, N., Bigger, J., Maisonneuve, E., Gerdes, K., Lewis, K., Dhar, N., McKinney, J. D., Gefen, O., Balaban, N. Q., Jayaraman, R., Balaban, N. Q., Merrin, J., Chait, R., Kowalik, L., Leibler, S., Balaban, N. Q., Allison, K. R., Brynildsen, M. P., Collins, J. J., Nathan, C., Lewis, K., Glickman, M. S., Sawyers, Knoechel, B., Welch, A. Z., Gibney, P. A., Botstein, D., Koshland, D. E., Levy, S. F., Ziv, N., Siegal, M. L., Stewart-Ornstein, J., Weissman, J. S., El-Samad, H., Gasch, A. P., Weinert, T., Hartwell, L., Weinert, T. A., Hartwell, L. H., Lisby, M., Rothstein, R., Mortensen, U. H., Lisby, M., Mortensen, U. H., Rothstein, R., Domkin, V., Thelander, L., Chabes, A., Hendry, J. A., Tan, G., Ou, J., Boone, C., Brown, G. W., Berry, D. B., Gasch, A. P., Lynch, M., Nishant, K. T., Serero, A., Jubin, C., Loeillet, S., Legoix-Ne, P., Nicolas, A. G., Huh, W. K., Janke, C., Lee, S. E., Blecher-Gonen, R., Martin, M., Cherry, J. M., McKenna, A., DePristo, M. A., Lawrence, M., Obenchain, V., Ye, K., Schulz, M. H., Long, Q., Apweiler, R., Ning, Z., Layer, R. M., Chiang, C., Quinlan, A. R., Hall, I. M., Faust, G. G., Hall, I. M., Boeva, V., Boeva, V., Li, H., Koren, A., Soifer, I. and Barkai, N. (2017). \"Coupling phenotypic persistence to DNA damage increases genetic diversity in severe stress\". Nature Ecology & Evolution. 1(1), pp. 497\u2013500. doi: 10.1038/s41559-016-0016.
Uhlitz, F., Sieber, A., Wyler, E., Fritsche-Guenther, R., Meisig, J., Landthaler, M., Klinger, B., Bl\u00fcthgen, N. (2017). \"An immediate-late gene expression module decodes ERK signal duration\". Molecular Systems Biology. 13: 928, 2017. doi: 10.15252/msb.20177554.
"},{"location":"misc/publication-list/#theses","title":"Theses","text":""},{"location":"misc/publication-list/#2019_1","title":"2019","text":"Schumann F. (2019). \"Establishing a pipeline for stable mutational signature detection and evaluation of variant filter effects\". Freie Universit\u00e4t Berlin. Bachelor Thesis, Bioinformatics.
"},{"location":"misc/publication-list/#2018_1","title":"2018","text":"Borgsm\u00fcller N. (2018). \"Optimization of data processing in GC-MS metabolomics\", Technische Universit\u00e4t Berlin. Master Thesis, Biotechnology.
Kuchenbecker, S.-L. (2018). \"Analysis of Antigen Receptor Repertoires Captured by High Throughput Sequencing\". Freie Universit\u00e4t Universit\u00e4t Berlin. PhD Thesis, Dr. rer. nat. URN:NBN: urn:nbn:de:kobv:188-refubium-22171-8
Schubach M. (2018). \"Learning the Non-Coding Genome\", Freie Universit\u00e4t Universit\u00e4t Berlin. PhD Thesis, Dr. rer. nat. URN:NBN: urn:nbn:de:kobv:188-refubium-23332-7
"},{"location":"misc/publication-list/#posters","title":"Posters","text":""},{"location":"misc/publication-list/#2018_2","title":"2018","text":"Roskosch, S., Hald\u00f3rsson B., Kehr, B. (2018). \"PopDel: Population-Scale Detection of Genomic Deletions\" ECCB 2018. Poster.
White T., Kehr B. (2018). \"Comprehensive extraction of structural variations from long-read DNA sequences\" WABI 2018. Poster.
"},{"location":"misc/publication-list/#2017_1","title":"2017","text":"Schubach M., Re R., Robinson P.N., Valentini G. (2017). \"Variant relevance prediction in extremely imbalanced training sets\" ISMB/ECCB 2017. Poster.
White T., Kehr B. (2017). \"Improving long-read mapping with simple lossy sequence transforms\" ISMB/ECCB 2017. Poster.
"},{"location":"ondemand/interactive/","title":"OnDemand: Interactive Sessions","text":"Interactive sessions allow you to start and manage selected apps. Depending on the app they run as servers or GUIs. Selecting My Interactive Sessions
in the top menu will direct you to the overview of currently running sessions. The left-hand panel provides a short cut to start a new session of one of the provided apps.
Each running interactive session is listed. Each card corresponds to one session. The title of each card provides the name, allocated resources and the current status. Furthermore, detailed information and links are available:
- Host: Provides the name of the node the session is running on. Click on the host name to open a shell to the given cluster node.
- Time remaining: Time until session till terminate.
- Session ID: Click to open the session directory in the interactive file browser (see below).
- Connect to: This will open the app in your browser (opens a new tab).
- Delete: Terminate the session.
Don't hit reload in your apps
Please note that the portal will use the authentication mechanisms of the apps to ensure that nobody except for you can connect to the session. This means that hitting the browsers \"reload\" button in your app will most likely not work.
Just go back to the interactive session list and click on the \"connect\" button.
"},{"location":"ondemand/interactive/#session-directories","title":"Session Directories","text":"The portal software will create a folder ondemand
in your home directory. Inside, it will create session directories for each started interactive job. For technical reasons, these folders have very long names, for example:
$HOME/ondemand/data/sys/dashboard/batch_connect/sys/ood-bih-rstudio-server/output/e40e03b3-11ca-458a-855b-98e6f148c99a/
This follows the pattern:
$HOME/${application name}/output/${job UUID}
The job identifier used is not the Slurm job ID but an identifier internal to OnDemand. Inside this directory you will find log files and a number of scripts that are used to start your job.
If you need to debug any interactive job, start here. Also, the helpdesk will need the path to this folder to help you with interactive jobs.
You can find the name of the latest output folder with the following command:
$ ls -lhtr $HOME/${application name}/output | tail -n 1\n
For example, for RStudio Server:
$ ls -lhtr $HOME/ondemand/data/sys/dashboard/batch_connect/sys/ood-bih-rstudio-server/output | tail -n 1\n
Prevent Home From Filling Up
You should probably move ~/ondemand
to your work volume with the following:
$ mv ~/ondemand ~/work/ondemand\n$ ln -sr ~/work/ondemand ~/ondemand\n
Make sure to delete potential interactive sessions and to logout from the Ondemand Portal first. Otherwise, the ~/ondemand
folder is constantly recreated and the symlink will be just created within this folder as ~/ondemand/ondemand
and thus not be used as intended.
Also, clear out ~/work/ondemand/*
from time to time but take care that you don't remove the directory of any running job.
"},{"location":"ondemand/interactive/#example-1-default-rstudio-session","title":"Example 1: Default RStudio Session","text":"This description of starting an RStudio session is a showcase for starting other interactive apps as well.
To start the session, please go to Interactive Apps
in the top menu bar and select RStudio Server
or click RStudio Server
in the left-hand panel.
Allocate appropriate resources and click Launch
.
An info card for the RStudio Server will be added to My Interactive Sessions
, and during start, it will change its state from Queued
to Starting
to Running
. Depending on the app, resources allocated and current cluster usage, this will take a couple of seconds.
When in the final state (Running
), one can directly connect to the RStudio Server to get an interactive session by clicking Connect to RStudio Server
:
"},{"location":"ondemand/interactive/#example-2-rstudio-session-with-custom-r-installation-from-conda","title":"Example 2: RStudio Session with custom R-installation from conda","text":"To use the OnDemand portal with a specific R installation including a stable set of custom packages you can use a conda enviroment from the cluster as a R source.
For this you may first need to create this conda environment including your R version of choice and all necessary packages. Specific installations of i.e. python from conda can be used similarly in other interactive apps.
- For reproducibility this environment should clearly define all package versions and include dependencies. This is easiest to achieve by first collecting all packages you need into a primary collection (i.e. a yaml file, potentially including a specific R version for r-base if needed) and creating an environment from there. Exporting this environment will generate a file with all used packages and their version numbers, that can be used to recreate the same environment.
- Example code:
Click to expand * Commands: + `conda env create -n R-example -f R-example.yaml` + `conda activate R-example` + `conda env export -f R-fixed-versions.yaml` + `conda env create -n R-fixed-versions -f R-fixed-versions.yaml` * R-example.yaml channels:\n - conda-forge\n - bioconda\n - defaults\ndependencies:\n - r-base\n - r-essentials\n - r-devtools\n - bioconductor-deseq2\n - r-tidyverse\n - r-rmarkdown\n - r-knitr\n - r-dt\n
- R packages only available from github
Some packages (i.e. several single-cell-RNAseq analysis tools) are only available from github and not on Cran/Bioconductor. There are two ways to install such packages into a conda enviroment.
Click to expand 1) Install from inside R \\[easier option, but not pure conda\\] * First setup the conda env, ideally including all dependencies for the desired package from github (and do include r-devtools) * Then within R run `devtools::install_github('owner/repo', dependencies=F, upgrade=F, lib='/path/to/conda/env-name/lib/R/library')` * if you don't have all dependencies already installed you will have to omit dependencies=F and risk a mix of conda & native R installed packages (or just have to redo the conda env). * github_install involves a build process and still needs a bit of memory, so this might crash on the default `srun --pty bash -i` shell 2) Build packages into a local conda channel \\[takes longer, but pure conda\\]\\ This approach is mostly taken from the answers given [here](https://stackoverflow.com/questions/52061664/install-r-package-from-github-using-conda). These steps must be taken _before_ building the final env used with Rstudio * use `conda skeleton cran https://github.com/owner/repo [--git-tag vX.Y]` to generate build files * conda skeleton only works for repositories with a release/version tag. If the package you want to install does not have that, you either need to create a fork and add a such a tag, or find a fork that already did that. Downloading the code directly from github and building the package from that is also possible, but you will the need to manually set up the `meta.yaml` and `build.sh` files that conda skeleton would create. * If there is more than one release tag, do specify which one you want, it may not automatically take the most recent one. * If any r-packages from bioconductor are dependencies, conda will not find them during the build process. You will need to change the respective entries in the `meta.yaml` file created by conda skeleton. I.e. change `r-deseq2` to `bioconductor-deseq2` * Build the package with `conda build --R= [--use-local] r-` * You need to specifying the same R-version used in the final conda env * If the github package has additional dependencies from github, build those first and then add `--use-local` so the build process can find them. * The build process definitely needs more memory than the default `srun --pty bash -i` shell. It also takes quite a bit of time (much longer than installing through devtools::install_github) * Finally add the packages (+versions) you built to the environment definition (i.e. yaml file) and create the (final) conda environment. Don't forget to tell conda to use locally build packages (either supply `--use-local` or add `- local` to the channel list in the yaml file) Starting the Rstudio session via the OnDemand portal works almost as described above (see Example 1). However, you do have to select `miniforge` as R source and provide the path to your miniforge installation and (separated by a colon) the name of the (newly created) conda enviroment you want to use.
Additional notes:
- Updating the conda env, that an already running rstudio instance is using, does work but does requires a restart of the R session to take effect
- If you are starting a new interactive Rstudio session but with a different conda environment than before, Rstudio will still start from the same project as before. In this case the 'old' project likely still contains the previous
.libPaths()
entries and therefore a link to your previous conda installation. Creating a new project cleans .libPaths()
to only the env specified in setting up the Rstudio session.
"},{"location":"ondemand/overview/","title":"The Open OnDemand Portal","text":"Status / Stability
OnDemand Support is currently in beta phase on the BIH HPC. In case of any issues, please send an email to hpc-helpdesk@bih-charite.de.
To allow for better interactive works, BIH HPC administration has setup an Open OnDemand (OOD) portal web server.
You can find the OnDemand Portal for HPC 4 Research at:
- https://hpc-portal.cubi.bihealth.org
"},{"location":"ondemand/overview/#background","title":"Background","text":"OOD allows you to access cluster resources using a web-based graphical interface in addition to traditional SSH connections. You can then connect to jobs running graphical applications either to virtual desktops (such as Matlab) or to web apps (such as Jupyter and RStudio Server).
The following figure illustrates this.
The primary way to the cluster continues to be SSH which has several advantages. By the nature of the cluster being based on Linux servers, it will offer more features through the \"native\" access and through its lower complexity, it will offer higher stability. However, we all like to have the option of a graphical interface, at least from time to time .
The main features are:
- Easy web-based access to Jupyter and RStudio Server on the cluster.
- Generally lower the entry barrier of using the HPC system.
"},{"location":"ondemand/overview/#logging-into-the-portal","title":"Logging into the Portal","text":"The first prerequisite is to have a cluster account already (see Getting Access). Once you have done your first SSH connection to the cluster successfully you can start using the portal. For this you perform the following steps:
- Go to https://hpc-portal.cubi.bihealth.org - you will be redirected to the login page shown below. If you have an account with Charite (ends in
_c
) then please use the \"Charit\u00e9 - Universit\u00e4tmedizin Berlin\" button, for MDC Accounts please use the \"Max Delbr\u00fcck Center Berlin\" button. - Login with your home organization's SSO system. Please note that depending on whether you are accessing the system via the wired network in your home organization or via VPN the SSO might look differently.
Clicked the Wrong Login Button?
If you clicked the wrong button then please clear your cookies to force a logout of the system.
"},{"location":"ondemand/overview/#prepare-ondemand-folder","title":"Prepare OnDemand Folder","text":"The ondemand
folder is automatically created in your home directory, and the OnDemand service searches for this folder in your home directory, i.e. it has to stay there. But as the quota in the home directory is very limited, you can easily hit the hard quota which might prevent you from working on the cluster.
To prevent this, move the ~/ondemand
folder to the ~/work
folder and create a symlink for the now dislocated ~/ondemand
folder:
hpc-login-1:~$ mv ~/ondemand ~/work/ondemand\nhpc-login-1:~$ ln -sr ~/work/ondemand ~/ondemand\n
Important
Make sure to delete potential interactive sessions and to logout from the Ondemand Portal first. Otherwise, the ~/ondemand
folder is constantly recreated and the symlink will be just created within this folder as ~/ondemand/ondemand
and thus not be used as intended.
"},{"location":"ondemand/overview/#portal-dashboard","title":"Portal Dashboard","text":"Problems with Open OnDemand?
First try to log out and login again. Next, try to clear all cookies for the domain hpc-portal.cubi.bihealth.org
. Finally, try the Help > Restart Web Server
link to restart the per-user nginx (PUN) server.
You will then be redirected to the dashboard screen.
Here you have access to the following actions. We will not go into detail of all of them and expect them to be self-explanatory.
Important
Please note that when using the portal then you are acting as your HPC user. Use standard best practice. Consider carefully what you do as you would from the command line (e.g., don't use the portal to browse the web from the cluster).
- Files
- Home Directory - Access a file browser.
- Quotas - Display quota information (only available on HPC 4 Research).
- Jobs
- Active Jobs - List your jobs.
- Job Composer - Start a new job.
- Clusters
- Shell Access - Shell access in your browser.
- Interactive Apps
- Mate and Xfce Desktops - Start virtual desktops on the HPC.
- Matlab - Run a virtual desktop that has Matlab installed.
- MaxQuant - Run a virtual desktop that has MaxQuant installed.
- Jupyter - Run Jupyter on the HPC and easily connect to it from your browser without setting up any SSH tunnels.
- RStudio Server - Run RStudio Server on the HPC and easily connect to it from your browser without setting up any SSH tunnels.
- My Interactive Sessions - See details of your currently running interactive sessions.
- Help
- Contact Support - Links ot the \"Getting Help\" page in this documentation.
- Online Documentation - Links to this documentation.
- Restart Web Server - Try this if the portal acts weird before contacting the helpdesk. OnDemand runs a web server per user, so this does not affect any other user.
- Log Out - Log out of the system.
"},{"location":"ondemand/quotas/","title":"OnDemand: Quota Inspection","text":"Outdated
This document is only valid for the old, third-generation file system and will be removed soon. Quotas of our new CephFS storage are communicated via the HPC Access web portal.
Accessing the quota report by selecting Files
and then Quotas
in the top menu will provide you with a detailed list of all quotas for directories that you are assigned to.
There are two types of quotas: for (a) size of and (b) number of files in a directory.
Every row in the table corresponds to a directory that you have access to. This implies your home directory (fast/users
) as well as the group directory of your lab (fast/groups
) and possible projects (fast/projects
) (if any). Quotas are not directly implied on these directories but on the home
, scratch
and work
subdirectories that each of subdirectory of the beforementioned directories has (for a detailed explanation see Storage and Volumes).
The following list explains the columns of the table:
- path resembles the path to the directory the quota is displayed for. Please note that this is not actually a path but the fileset name the cluster uses internally to handle the associated directory/path. The \"real\" path can be derived by preceding the name with a slash (
/
) and substituting the underscores with a slash in the (users|groups|projects)_
and _(home|scratch|work)
substring. The corresponding path for name fast/users_stolpeo_c_home
would be /fast/users/stolpeo_c/home
. - block usage gives the current size of the directory/fileset. The unit is variable and directly attached to the number.
- block soft limit gives the soft quota for the directory/fileset. Exceeding the soft quota (and staying below the hard quota) will trigger the grace period. The unit is variable and directly attached to the number.
- block hard limit gives the hard quota for the directory/fileset. Exceeding the hard quota is not possible and will prevent you from writing any data to the directory. That might cause trouble even deleting files as logging in and browsing the file system may create data. The unit is variable and directly attached to the number.
- block grace gives the grace period in days when exceeding the soft quota.
- files usage gives the number of files in the directory tree.
- files soft limit gives the soft quota for the allowed number of files in the directory/fileset. Exceeding the soft quota (and staying below the hard quota) will trigger the grace period.
- files hard limit gives the hard quota for the allowed number of files. Exceeding the hard quota is not possible and will prevent you from writing any data to the directory. That might cause trouble even deleting files as logging in and browsing the file system may create data.
- files grace gives the grace period in days when exceeding the soft quota for files.
"},{"location":"overview/architecture/","title":"Cluster Architecture","text":"BIH HPC IT provides acess to high-performance compute (HPC) cluster systems. A cluster system bundles a high number of nodes and in the case of HPC, the focus is on performance (with contrast to high availability clusters).
"},{"location":"overview/architecture/#hpc-4-research","title":"HPC 4 Research","text":""},{"location":"overview/architecture/#cluster-hardware","title":"Cluster Hardware","text":" - approx. 256 nodes (from three generations),
- 4 high-memory nodes (2 nodes with 512 GB RAM, 2 nodes with 1 TB RAM),
- 7 GPU nodes with 4 Tesla GPUs each, 1 GPU node with 10 A40 GPUs, and
- a high-performance Tier 1 parallel CephFS file system with a larger but slower Tier 2 CephFS file system, and
- a legacy parallel GPFS files system.
"},{"location":"overview/architecture/#network-interconnect","title":"Network Interconnect","text":" - Older nodes are interconnected with 2x10GbE/2x40GbE
- Recent nodes are interconnected with 2x25GbE/2x100GbE
"},{"location":"overview/architecture/#cluster-management","title":"Cluster Management","text":"Users don't connect to nodes directly but rather create interactive or batch jobs to be executed by the cluster job scheduler Slurm.
- Interactive jobs open interactive sessions on compute nodes (e.g., R or iPython sessions). These jobs are run directly in the user's terminal.
- Batch jobs consist a job script with execution instructions (a name, resource requirements etc.) These are submitted to the cluster and then assigned to compute hosts by the job scheduler. Users can configure the scheduler to send them an email upon completion. Users can submit many batch jobs at the same time and the scheduler will execute them once the cluster offers sufficient resources.
- Web-based access can be achieved using the OnDemand Portal
"},{"location":"overview/architecture/#head-vs-compute-nodes","title":"Head vs. Compute Nodes","text":"As common with HPC systems, users cannot directly access the compute nodes but rather connect to so-called head nodes. The BIH HPC system provides the following head nodes:
login-1
and login-2
that accept SSH connections and are meant for low intensity, interactive work such as editing files, running screen/tmux sessions, and logging into the compute nodes. Users should run no computational tasks and no large-scale data transfer on these nodes. transfer-1
and transfer-2
also accept SSH connections. Users should run all large-scale data transfer through these nodes.
"},{"location":"overview/architecture/#common-use-case","title":"Common Use Case","text":"After registration and client configurations, users with typically connect to the HPC system through the login nodes:
local:~$ ssh -l jdoe_c hpc-login-1.cubi.bihealth.org\nhpc-login-1:~$\n
Subsequently, they might submit batch jobs to the cluster for execution through the Slurm scheduling system or open interactive sessions:
hpc-login-1:~$ sbatch job_script.sh\nhpc-login-1:~$ srun --pty bash -i\nmed0104:~$\n
"},{"location":"overview/for-the-impatient/","title":"Overview","text":""},{"location":"overview/for-the-impatient/#bih-hpc-4-research","title":"BIH HPC 4 Research","text":"BIH HPC 4 Research is located in the BIH data center in Buch and connected via the BIH research network. Connections can be made from Charite, MDC, and BIH networks. The cluster is open for users with either Charite or MDC accounts after getting access through the gatekeeper proces. The system has been designed to be suitable for the processing of human genetics data from research contexts (and of course data without data privacy concerns such as public and mouse data).
"},{"location":"overview/for-the-impatient/#cluster-hardware-and-scheduling","title":"Cluster Hardware and Scheduling","text":"The cluster consists of the following major components:
- 2 login nodes for users
hpc-login-1
and hpc-login-2
(for interactive sessions only), - 2 nodes for file transfers
hpc-transfer-1
and hpc-transfer-2
, - a scheduling system using Slurm,
- 228 general purpose compute nodes
hpc-cpu-{1..228}
- a few high memory nodes
hpc-mem-{1..5}
, - 7 nodes with 4 Tesla V100 GPUs each (!)
hpc-gpu-{1..7}
and 1 node with 10x A40 GPUs (!) hpc-gpu-8
, - a legacy parallel GPFS file system with 2.1 PB, by DDN mounted at
/fast
, - a next generation high-performance storage system based on Ceph/CephFS
- a tier 2 (slower) storage system based on Ceph/CephFS
This is shown by the following picture:
"},{"location":"overview/for-the-impatient/#differences-between-workstations-and-clusters","title":"Differences Between Workstations and Clusters","text":"The differences include:
- The directly reachable login nodes are not meant for computation! Use
srun
to go to a compute node. - Every time you type
srun
to go to a compute node you might end up on a different host. - Most directories on the nodes are not shared, including
/tmp
. - The
/fast
directory is shared throughout the cluster which contains your home, group home, and project directories. - You will not get
root
or sudo
permissions on the cluster. - You should prefer batch jobs (
sbatch
) over calling programs interactively.
"},{"location":"overview/for-the-impatient/#what-the-cluster-is-and-is-not","title":"What the Cluster Is and Is NOT","text":"NB: the following might sound a bit harsh but is written with everyone's best intentions in mind (we actually like you, our user!) This addresses a lot of suboptimal (yet not dangerous, of course) points we observed in our users.
IT IS
- It is scientific infrastructure just like a lab workbench or miscroscope. It is there to be used for you and your science. We trust you to behave in a collaboratively. We will monitor usage, though, and call out offenders.
- With its ~200 nodes, ~6400 threads and fast parallel I/O, it is a powerful resource for life science high performance computation, originally optimized at bioinformatics sequence processing.
- A place for data move data at the beginning of your project. By definition, every project has an end. Your project data needs to leave the cluster at the end of the project.
- A collaborative resource with central administration managed by BIH HPC IT and supported via hpc-helpdesk@bih-charite.de
IT IS NOT
- A self-administrated workstation or servers.
- You will not get
sudo
. - We will not install software beyond those in broad use and available in CentOS Core or EPEL repositories.
- You can install software in your user/group/project directories, for example using Conda.
- A place to store primary copies of your data. You only get 1 GB of storage in your home for scripts, configuration, and documents.
- A safe place to store data. Only your 1 GB of home is in snapshots and backup. While data is stored on redundant disks, technical or administrative failure might eventually lead to data loss. We do everything humanly possible to prevent this. Despite this, it is your responsibility to keep important files in the snapshot/backup protected home, ideally even in copy (e.g., a git repository) elsewhere. Also, keeping safe copies of primary data files, your published results, and the steps in between reproducible is your responsibility.
- A place to store data indefinitely. The fast CephFS Tier 1 storage is expensive and \"rare\". CephFS Tier 2 is bigger in volume, but still not unlimited. The general workflow is: (1) copy data to cluster, (2) process it, creating intermediate and final results, (3) copy data elsewhere and remove it from the cluster
- Generally suitable for primary software development. The I/O system might get overloaded and saving scripts might take some time. We know of people who do this and it works for them. Your mileage might vary.
"},{"location":"overview/job-scheduler/","title":"Job Scheduler","text":"Once logged into the cluster through the login nodes, the Slurm scheduler needs to be used to submit computing jobs. In Slurm nomenclature, cluster compute nodes are assigned to one or more partitions. Submitted jobs are assigned to nodes according to the partition's configuration.
"},{"location":"overview/job-scheduler/#partitions","title":"Partitions","text":"The BIH HPC has the partitions described below. The cluster focuses on life science applications and not \"classic HPC\" with numerical computations using MPI. Thus, all partitions except for mpi
only allow reserving resources on a single node. This makes the cluster easier to use as users don't have to explicitly specify this limit when submitting their jobs.
"},{"location":"overview/job-scheduler/#standard","title":"standard
","text":"Jobs are submitted to the standard
partition by default. From there, the scheduler will route the jobs to their actual partition using the routing rules described below. You can override this routing by explicitly assigning a partition (but this is discouraged).
- Jobs requesting a GPU resources are routed to the
gpu
queue. - Else, jobs requesting more than 200 GB of RAM are routed to the
highmem
queue. - Else, jobs are assigned to the partitions
debug
, short
, medium
, and long
depending on their configured maximal running time. The partitions are evaluated in the order given above and the first fitting partition will be used.
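As an illustration, a hedged sketch of how run time and memory requests drive the routing (job.sh and JOBID are placeholders):
hpc-login-1:~$ sbatch --time=02:00:00 --mem=8G job.sh       # fits the short limits\nhpc-login-1:~$ sbatch --time=3-00:00:00 --mem=300G job.sh   # requests more than 200 GB RAM, goes to highmem\nhpc-login-1:~$ scontrol show job JOBID | grep -o \"Partition=[^ ]*\"   # check where the job was routed\n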
"},{"location":"overview/job-scheduler/#debug","title":"debug
","text":"This partition is for very short jobs that should be executed quickly, e.g., for tests. The job running time is limited to one hour and at most 128 cores can be used per user but the jobs are submitted with highest priority.
- maximum run time: 1 hour
- maximum cores: 128 cores per user
- partition name:
debug
- argument string: maximum run time:
--time 01:00:00
"},{"location":"overview/job-scheduler/#short","title":"short
","text":"This partition is for jobs running only few hours. The priority of short jobs is high and many cores can be used at once to reward users for splitting their jobs into smaller parts.
- maximum run time: 4 hours
- maximum cores: 2000 cores
- partition name:
short
- argument string: maximum run time:
--time 04:00:00
"},{"location":"overview/job-scheduler/#medium","title":"medium
","text":"This partition is for jobs running for multiple days. Users can only allocate the equivalent of 4 nodes.
- maximum run time: 7 days
- maximum cores: 128 cores/slots (4 nodes)
- partition name:
medium
- argument string: maximum run time:
--time 7-00:00:00
"},{"location":"overview/job-scheduler/#long","title":"long
","text":"This partition is for long-running tasks. Only one node can be reserved for so long to discourage really long-running jobs and encourage users for splitting their jobs into smaller parts.
- maximum run time: 14 days
- maximum cores: 32 cores/slots (1 node)
- partition name:
long
- argument string: maximum run time:
--time 14-00:00:00
"},{"location":"overview/job-scheduler/#gpu","title":"gpu
","text":"Jobs requesting GPU resources are automatically assigned to the gpu
partition.
The GPU nodes are only part of the gpu
partition so they are not blocked by normal compute jobs. Maximum run time is relatively high (14 days) to allow for longer training jobs. Contact hpc-helpdesk@bih-charite.de for assistance if you have longer running jobs that you really cannot make run any shorter.
Info
Fair use rules apply. As GPU nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
- maximum run time: 14 days
- partition name:
gpu
- argument string: select
$count
GPUs: -p gpu --gres=gpu:$card:$count
(card=tesla
or card=a40
), maximum run time: --time 14-00:00:00
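For illustration, a hedged sketch of allocating a single A40 GPU for an interactive session (the run time and the compute node prompt shown are placeholders):
hpc-login-1:~$ srun -p gpu --gres=gpu:a40:1 --time 1-00 --pty bash -i\nmed0301:~$ nvidia-smi    # verify that the GPU is visible inside the job\n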
"},{"location":"overview/job-scheduler/#highmem","title":"highmem
","text":"Jobs requesting more than 200 GB of RAM are automatically routed to the highmem
partition.
The high memory nodes are only part of the highmem
partition so they are not blocked by normal compute jobs. Maximum run time is relatively high (14 days) to allow for longer jobs. Contact hpc-helpdesk@bih-charite.de for assistance if you have longer running jobs that you really cannot make run any shorter.
Info
Fair use rules apply. As high-memory nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
- maximum run time: 14 days
- partition name:
highmem
- argument string:
-p highmem
, maximum run time: --time 14-00:00:00
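A hedged example of a high-memory batch submission (assembly.sh is a placeholder); note that requesting more than 200 GB of RAM routes the job to highmem even without -p highmem:
hpc-login-1:~$ sbatch -p highmem --mem=400G --ntasks=1 --cpus-per-task=16 --time 2-00 assembly.sh\n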
"},{"location":"overview/job-scheduler/#mpi","title":"mpi
","text":"Jobs are not routed automatically to the mpi
partition; you have to explicitly request it. This is the only partition in which more than one node can be allocated to a job.
You can submit multi-node jobs into the mpi
partition. Maximum run time is relatively high (14 days) to allow for longer jobs. Don't abuse this. Contact hpc-helpdesk@bih-charite.de for assistance if you have longer running jobs that you really cannot make run any shorter.
- maximum run time: 14 days
- partition name:
mpi
- argument string:
-p mpi
, maximum run time: --time 14-00:00:00
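A minimal multi-node job script sketch for this partition (my_mpi_program is a placeholder and a working MPI installation in your environment is assumed):
#!/bin/bash\n#SBATCH --partition=mpi\n#SBATCH --nodes=4\n#SBATCH --ntasks-per-node=16\n#SBATCH --time=1-00:00:00\n\n# mpirun picks up the Slurm allocation and starts one rank per task\nmpirun ./my_mpi_program\n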
"},{"location":"overview/job-scheduler/#critical","title":"critical
","text":"Jobs are not routed into critial
automatically and the partition has to be selected manually.
This partition is for time-critical jobs with deadlines. As long as the cluster is not very busy, requests for critical jobs will be granted most of the time. However, do not use this partition without prior arrangement with hpc-helpdesk, as killing jobs will be used as the ultima ratio in case of such policy violations.
- maximum run time: 7 days
- maximum cores: 2000 cores/slots (48 nodes)
- partition name:
critical
- argument string: maximum run time:
--time 7-00:00:00
"},{"location":"overview/monitoring/","title":"Monitoring","text":"We currently provide you only with Ganglia for monitoring the cluster status.
"},{"location":"overview/monitoring/#using-ganglia","title":"Using Ganglia","text":"Go to the following address and login with your home organization (Charite or MDC):
- https://hpc-ganglia.cubi.bihealth.org
Ganglia does not know about Slurm
Ganglia will not show you anything about the Slurm job scheduling system. If a job occupies a whole node but uses no CPUs then this will be displayed as unused in Ganglia. However, Slurm would not schedule another job on this node.
You will be shown a screen like the one below. This allows you to get a good idea of what is going on on the HPC.
By default you will be shown the cluster usage of the last day. You can quickly switch to reports for the last two or four hours as well, etc.
In the first row of pictures you see the number of total CPUs (actually hardware threads), number of hosts seen as up and down by Ganglia, and cluster load/utilization. You will then see the overall cluster load, memory usage, CPU usage, and network utilization across the selected time period.
Linux load is not intuitive
Note that the technical definition of Linux load is not very intuitive. It incorporates much more than just the CPU usage. You can find a quite comprehensive treatment of Linux Load here.
We are using a fast shared storage system and almost no local storage (except in /tmp
). Also, almost no jobs use MPI or other heavy network communication. Thus, the network utilization is a good measure of the I/O on the cluster.
Below, you can drill down into various metrics and visualize them historically. Just try it out and find your way around, you cannot break anything. Sadly, there is no good documentation of Ganglia online.
"},{"location":"overview/monitoring/#aggregate-gpu-utilization-visualization","title":"Aggregate GPU Utilization Visualization","text":"Ganglia allows you to obtain metrics in several interesting and useful ways. If you click on \"Aggregate Graphs\" then you could enter the following values to get an overview of the live GPU utilization.
- Title:
Aggregate GPU Utilization
- Host Regular expression:
hpc-gpu-.*
- Metric Regular Expressions:
gpu._util
- Graph Type:
Stacked
- Legend Options:
Hide legend
Then click Create Graph
.
If a GPU is fully used, it will contribute 100 points on the vertical axis. See above for an example, and here is a direct link:
- Aggregate GPU Utilization
"},{"location":"overview/storage/","title":"Nodes and Storage Volumes","text":"No mounting on the cluster itself.
For various technical and security-related reasons it is not possible to mount anything on the cluster nodes by users. For mounting the cluster storage on your computer, please read Connecting: SSHFS Mounts.
This document gives an overview of the nodes and volumes on the cluster.
"},{"location":"overview/storage/#cluster-layout","title":"Cluster Layout","text":""},{"location":"overview/storage/#cluster-nodes","title":"Cluster Nodes","text":"The following groups of nodes are available to cluster users. There are a number of nodes that are invisible to non-admin staff, hosting the queue master and monitoring tools and providing backup storage for key critical data, but these are not shown here.
hpc-login-{1,2}
- available as
hpc-login-{1,2}.cubi.bihealth.org
- do not perform any computation on these nodes!
- each process may at most use 1 GB of RAM
med0101..0124,0127
- 25 standard nodes
- Intel Xeon E5-2650 v2 @2.60Ghz, 16 cores x2 threading
- 128 GB RAM
med0133..0164
- 32 standard nodes
- Intel Xeon E5-2667 v4 @3.20GHz, 16 cores x 2 threading
- 192 GB RAM
med0201..0264
- 64 nodes with Infiniband interconnect
- Intel Xeon E5-2650 v2 @2.60Ghz, 16 cores x2 threading
- 128 GB RAM
med0301..0304
- 4 nodes with 4 Tesla V100 GPUs each
med0401..0405
special purpose/high-memory machines - Intel Xeon E5-4650 v2 @2.40GHz, 40 cores x2 threading
med0401
and med0402
- 1 TB RAM
med0403
and med0404
- 500 GB RAM
med0405
- 2x \"Tesla K20Xm\" GPU accelleration cards (cluster resource
gpu
) - access limited to explicit GPU users
med0601..0616
- 16 nodes owned by CUBI
- Intel Xeon E5-2640 v3 @2.60Ghz
- 192 GB RAM
med0618..0633
- 16 nodes owned by CUBI
- Intel Xeon E5-2667 v4 @3.20GHz, 16 cores x 2 threading
- 192 GB RAM
med0701..0764
- 64 standard nodes
- Intel Xeon E5-2667 v4 @3.20GHz, 16 cores x 2 threading
- 192 GB RAM
"},{"location":"overview/storage/#cluster-volumes-and-locations","title":"Cluster Volumes and Locations","text":"The cluster has 2.1 PB of legacy fast storage, currently available at /fast
, as well as 1.6 PB of next-generation fast storage, available at /data/cephfs-1
. Additionally 7.4 PB of slower \"Tier 2\" storage is available at /data/cephfs-2
. Storage is provided by a Ceph storage cluster and designed for massively parallel access from an HPC system. In contrast to \"single server\" NFS systems, the system can provide large bandwidth to all cluster nodes in parallel as long as large data means relatively \"few\" files are read and written.
Storage is split into three sections:
home
-- small, persistent, and safe storage, e.g., for documents and configuration files (default quota of 1 GB). work
-- larger and persistent storage, e.g., for your large data files (default quota of 1 TB). scratch
-- large and non-persistent storage, e.g., for temporary files; files are automatically deleted after 2 weeks (default quota of 10 TB; deletion not implemented yet).
Each user, group, and project has one or more of these sections each, e. g. for users:
/data/cephfs-1/home/users/$NAME
/data/cephfs-1/home/users/$NAME/work
/data/cephfs-1/home/users/$USER/scratch
See Storage and Volumes: Locations for more informatin.
"},{"location":"slurm/background/","title":"Introduction to Scheduling","text":"As explained elsewhere in more detail, an HPC cluster consists of multiple computers connected via a network and working together. Multiple users can use the system simultaneously to do their work. This means that the system needs to join multiple computers (nodes) to provide a coherent view of them and the same time partition the system to allow multiple users to work concurrently.
user 1 user 2 ...\n\n .---. .---. .---. .---.\n | J | | J | | J | | J |\n | o | | o | | o | | o | ...\n | b | | b | | b | | b |\n | 1 | | 2 | | 3 | | 4 |\n '---' '---' '---' '---'\n\n.------------------------------------------.\n| Cluster Scheduler |\n'------------------------------------------'\n\n.----------. .------------. .------------.\n| multiple | | separate | | computers |\n'----------' '------------' '------------'\n
"},{"location":"slurm/background/#interlude-partitioning-single-computers","title":"Interlude: Partitioning Single Computers","text":"Overall, this partitioning is not so different from how your workstation or laptop works. Most likely, your computer (or even your smartphone) has multiple processors (or cores). You can run multiple programs on the same computer and the fact that (a) there is more than one core and (b) there is more than one program running is not known to the running programs (unless they explicitly communicate with each other). Different programs can explicitly take advantage of the multiple processor cores. The main difference is that you normally use your computer in an interactive fashion (you perform an action and expect an immediate reaction).
Even with a single processor (and core), your computer manages to run more than one program at the same time. This is done with the so-called time-slicing approach where the operating system lets each programs run in turn for a short time (a few milliseconds). A program with a higher priority will get more time slices than one with a lower (e.g., your audio player has real-time requirements and you will hear artifacts if it is starved for compute resources). Your operating system protects programs from each other by creating an address space for each. When two programs are running, the value of the memory at any given position in one program is independent from the value in the other program. Your operating system offers explicit functionality for sharing certain memory areas that two programs can use to exchange data efficiently.
Similarly, file permissions with Unix users/groups or Unix/Windows ACLs (access control lists) are used to isolate users from each other. Programs can share data by accessing the same file if they can both access it. There are special files called sockets that allow for network-like inter-process communication but of course two programs on the same computer can also connect (virtually) via the computer network (no data will actually go through a cable).
"},{"location":"slurm/background/#interlude-resource-types","title":"Interlude: Resource Types","text":"As another diversion, let us consider how Unix manages its resources. This is important to understand when requesting resources from the scheduler later on.
First of all, a computer might offer a certain feature such as a specific hardware platform or special network connection. Examples for this on the BIH HPC are specific Intel processor generations such as haswell
or the availability of Infiniband networking. You can request these with so-called constraints; they are not allocated to specific jobs.
Second, there are resources that are allocated to specific jobs. The most important resources here are:
- computing resources (processors/CPUs (central progressing units) and cores, details are explained below),
- main memory / RAM,
- special hardware such as GPUs, and
- (wall-clock) time that a job wants to run as an upper bound.
Generally, once a resource has been allocated to one job, it is not available to another. This means if you allocating more resources to your job that you actually need (overallocation) then those resources are not available to other jobs (whether they are your jobs or those of other users). This will be explained further below.
Another example of resource allocation are licenses. The BIH HPC has a few Matlab 2016b licenses that users can request. As long as a license is allocated to one job, it is unavailable to another.
"},{"location":"slurm/background/#nodes-sockets-processors-cores-threads","title":"Nodes, Sockets, Processors, Cores, Threads","text":"Regarding compute resources, Slurm differentiates between:
- nodes: a compute server,
- sockets: a socket in the compute server that hosts one physical processor,
- processor: a CPU or a CPU core in a multi-core computer (all CPUs in the BIH HPC are multi-core), and
- (hardware) threads: most Intel CPUs feature hardware threads (also known as \"hyperthreading\") where each core appears to be two cores.
In most cases, you will use one compute node only. When using more than one node, you will need to use some form of message passing, e.g., MPI, so processes on different nodes can communicate. On a single node you would mostly use single- or multi-threaded processes, or multiple processes.
Above: Slurm's nomenclature for sockets, processors, cores, and threads (from Slurm Documentation).
Co-locating processes/threads on the same socket has certain implications that are mostly useful for numerical applications. We will not further go into detail here. Slurm provides many different features of ways to specify allocation of \"pinning\" to specific process locations. If you need this feature, we trust that you find sufficient explanation in the Slurm documentation.
Usually, you would allocate multiple cores (a term Slurm uses synonymously with processors) on a single node (allocation on a single node is the default).
"},{"location":"slurm/background/#how-scheduling-works","title":"How Scheduling Works","text":"Slurm is an acronym for \"Simple Linux Unix Resource Manager\" (note that the word \"scheduler\" does not occur here). Actually, one classically differentiates between the managing of resources and the scheduling of jobs that use them. The resource manager allocates resources according to a user's request for a job and ensures that there are no conflicts. If the required resources are not available, the scheduler puts the user's job into a queue. Later, when then requested resources become available the scheduler assigns them to the job and runs it. In the following, both resource allocation and the running of the job are described as being done by the scheduler.
The interesting case occurs when there are not enough resources available for at least two jobs submitted to the scheduler. The scheduler has to decide how to proceed. Consider the simplified case of only scheduling cores. Each job will request a number of cores. The scheduler will then generate a scheduling plan that might look as follows.
core\n ^\n4 | |---job2---|\n3 | |---job2---|\n2 | |---job2---|\n1 | |--job1--|\n +--------------------------> t time\n 5 1 1 2\n 0 5 0\n
job1
has been allocated one core and job2
has been allocated two cores. When job3
, requesting one core, is submitted at t = 5, it has to wait at least until job1
is finished. If job3
requested two or more cores, it would have to wait at least until job2
also finished.
We can now ask several questions, including the following:
- What if a job runs for less than the allocated time? -- In this case, resources become free and the scheduler will attempt to select the next job(s) to run.
- What if a job runs longer than the allocated time? -- In this case, the scheduler will send an informative Unix signal to the process first. The job will be given a bit more time and if it does not exit it will be forcibly terminated. You will find a note about this at the end of your job log file.
- What if multiple jobs compete for resources? -- The scheduler will prefer certain jobs over others using the Slurm Multifactor Priority Plugin. In practice, small jobs will be preferred over large, users with few used resources in the last month will be favored over heavy consumers, long-waiting jobs will be favored over recently submitted jobs, and many other factors. You can use the sprio utility to inspect these factors in real-time.
- How does the scheduler handle new requests? -- Generally, the scheduler will add new jobs to the waiting queue. The scheduler regularly adjusts its planning by recalculating job priorities. Slurm is configured to perform computationally simple schedule recalculations quite often and larger recalculations more infrequently.
Also see the Slurm Frequently Asked Questions.
Please note that even if all jobs were known at the start of time, scheduling is still a so-called NP-complete problem. Entire computer science journals and books are dedicated only to scheduling. Things get more complex in the case of online scheduling, in which new jobs can appear at any time. In practice, Slurm does a fantastic job with its heuristics but it heavily relies on parameter tuning. HPC administration is constantly working on optimizing the scheduler settings. Note that you can use the --format
option to the squeue
command to request that it shows you information about job scheduling (in particular, see the %S
field, which will show you the expected start time for a job, assuming Slurm has calculated it). See man squeue
for details. If you observe inexplicable behavior, please notify us at hpc-helpdesk@bih-charite.de
.
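For example, to see the start time Slurm currently predicts for your own pending jobs (one possible format string; the %S column stays empty until Slurm has computed an estimate):
hpc-login-1:~$ squeue -u $USER -t PENDING -o \"%.10i %.9P %.20j %.2t %.19S %R\"\n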
"},{"location":"slurm/background/#slurm-partitions","title":"Slurm Partitions","text":"In Slurm, the nodes of a cluster are split into partitions. Nodes are assigned to one or more partition (see the Job Scheduler section for details). Jobs can also be assigned to one or more partitions and are executed on nodes of the given partition.
In the BIH HPC, partitions are used to stratify jobs of certain running times and to provide different quality of service (e.g., maximal number of CPU cores available to a user for jobs of a certain running time and size). The partitions gpu
and highmem
provide special hardware (the nodes are not assigned to other partitions) and the mpi
partition allows MPI-parallelism and the allocation of jobs to more than one node. The Job Scheduler provides further details.
"},{"location":"slurm/cheat-sheet/","title":"Slurm Cheat Sheet","text":"This page contains assorted Slurm commands and Bash snippets that should be helpful.
man
pages!
$ man sinfo\n$ man scontrol\n$ man squeue\n# etc...\n
interactive sessions
hpc-login-1:~$ srun --pty bash\nmed0740:~$ echo \"Hello World\"\nmed0740:~$ exit\n
batch submission
hpc-login-1:~$ sbatch script.sh\nSubmitted batch job 2\nhpc-login-1:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 27 debug script.s holtgrem R 0:06 1 med0703\n
listing nodes
$ sinfo -N\nNODELIST NODES PARTITION STATE\nmed0740 1 debug* idle\nmed0741 1 debug* down*\nmed0742 1 debug* down*\n\n$ scontrol show nodes\nNodeName=med0740 Arch=x86_64 CoresPerSocket=8\n CPUAlloc=0 CPUTot=32 CPULoad=0.06\n AvailableFeatures=(null)\n[...]\n\n$ scontrol show nodes med0740\nNodeName=med0740 Arch=x86_64 CoresPerSocket=8\n CPUAlloc=0 CPUTot=32 CPULoad=0.06\n AvailableFeatures=(null)\n ActiveFeatures=(null)\n Gres=(null)\n NodeAddr=med0740 NodeHostName=med0740 Version=20.02.0\n OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020\n RealMemory=1 AllocMem=0 FreeMem=174388 Sockets=2 Boards=1\n State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A\n Partitions=debug\n BootTime=2020-03-05T00:54:15 SlurmdStartTime=2020-03-05T16:23:25\n CfgTRES=cpu=32,mem=1M,billing=32\n AllocTRES=\n CapWatts=n/a\n CurrentWatts=0 AveWatts=0\n ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s\n
queue states
$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n
node resources
$ sinfo -o \"%20N %10c %10m %25f %10G \"\n
additional resources such as GPUs
$ sinfo -o \"%N %G\"\n
listing job details
$ scontrol show job 225\nJobId=225 JobName=bash\n UserId=XXX(135001) GroupId=XXX(30069) MCS_label=N/A\n Priority=4294901580 Nice=0 Account=(null) QOS=normal\n JobState=FAILED Reason=NonZeroExitCode Dependency=(null)\n Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=130:0\n RunTime=00:16:27 TimeLimit=14-00:00:00 TimeMin=N/A\n SubmitTime=2020-03-23T11:34:26 EligibleTime=2020-03-23T11:34:26\n AccrueTime=Unknown\n StartTime=2020-03-23T11:34:26 EndTime=2020-03-23T11:50:53 Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-03-23T11:34:26\n Partition=gpu AllocNode:Sid=hpc-login-1:1918\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=med0301\n BatchHost=med0301\n NumNodes=1 NumCPUs=2 NumTasks=0 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=2,node=1,billing=2\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n Command=bash\n WorkDir=XXX\n Power=\n TresPerNode=gpu:tesla:4\n MailUser=(null) MailType=NONE\n
host:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \n 1177 medium bash jweiner_ R 4-21:52:24 1 med0127 \n 1192 medium bash jweiner_ R 4-07:08:40 1 med0127 \n 1209 highmem bash mkuhrin_ R 2-01:07:17 1 med0402 \n 1210 gpu bash hilberta R 1-10:30:34 1 med0304 \n 1213 long bash schubacm R 1-09:42:27 1 med0127 \n 2401 gpu bash ramkem_c R 1-05:14:53 1 med0303 \n 2431 medium ngs_mapp holtgrem R 1-05:01:41 1 med0127 \n 2437 critical snakejob holtgrem R 1-05:01:34 1 med0135 \n 2733 debug bash schubacm R 7:36:42 1 med0127 \n 3029 critical ngs_mapp holtgrem R 5:59:07 1 med0127 \n 3030 critical snakejob holtgrem R 5:56:23 1 med0134 \n 3031 critical snakejob holtgrem R 5:56:23 1 med0137 \n 3032 critical snakejob holtgrem R 5:56:23 1 med0137 \n 3033 critical snakejob holtgrem R 5:56:23 1 med0138 \n 3034 critical snakejob holtgrem R 5:56:23 1 med0138 \n 3035 critical snakejob holtgrem R 5:56:20 1 med0139 \n 3036 critical snakejob holtgrem R 5:56:20 1 med0139 \n 3037 critical snakejob holtgrem R 5:56:20 1 med0140 \n 3038 critical snakejob holtgrem R 5:56:20 1 med0140 \n 3039 critical snakejob holtgrem R 5:56:20 1 med0141 \n 3040 critical snakejob holtgrem R 5:56:20 1 med0141 \n 3041 critical snakejob holtgrem R 5:56:20 1 med0142 \n 3042 critical snakejob holtgrem R 5:56:20 1 med0142 \n 3043 critical snakejob holtgrem R 5:56:20 1 med0143 \n 3044 critical snakejob holtgrem R 5:56:20 1 med0143 \n 3063 long bash schubacm R 4:12:37 1 med0127 \n 3066 long bash schubacm R 4:11:47 1 med0127 \n 3113 medium ngs_mapp holtgrem R 1:52:33 1 med0708 \n 3118 medium snakejob holtgrem R 1:50:38 1 med0133 \n 3119 medium snakejob holtgrem R 1:50:38 1 med0703 \n 3126 medium snakejob holtgrem R 1:50:38 1 med0706 \n 3127 medium snakejob holtgrem R 1:50:38 1 med0144 \n 3128 medium snakejob holtgrem R 1:50:38 1 med0144 \n 3133 medium snakejob holtgrem R 1:50:35 1 med0147 \n 3134 medium snakejob holtgrem R 1:50:35 1 med0147 \n 3135 medium snakejob holtgrem R 1:50:35 1 med0148 \n 3136 medium snakejob holtgrem R 1:50:35 1 med0148 \n 3138 medium snakejob holtgrem R 1:50:35 1 med0104 \n
host:~$ squeue -o \"%.10i %9P %20j %10u %.2t %.10M %.6D %10R %b\"\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(R TRES_PER_NODE\n 1177 medium bash jweiner_m R 4-21:52:22 1 med0127 N/A\n 1192 medium bash jweiner_m R 4-07:08:38 1 med0127 N/A\n 1209 highmem bash mkuhrin_m R 2-01:07:15 1 med0402 N/A\n 1210 gpu bash hilberta_c R 1-10:30:32 1 med0304 gpu:tesla:4\n 1213 long bash schubacm_c R 1-09:42:25 1 med0127 N/A\n 2401 gpu bash ramkem_c R 1-05:14:51 1 med0303 gpu:tesla:1\n 2431 medium ngs_mapping holtgrem_c R 1-05:01:39 1 med0127 N/A\n 2437 critical snakejob.ngs_mapping holtgrem_c R 1-05:01:32 1 med0135 N/A\n 2733 debug bash schubacm_c R 7:36:40 1 med0127 N/A\n 3029 critical ngs_mapping holtgrem_c R 5:59:05 1 med0127 N/A\n 3030 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0134 N/A\n 3031 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0137 N/A\n 3032 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0137 N/A\n 3033 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0138 N/A\n 3034 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0138 N/A\n 3035 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0139 N/A\n 3036 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0139 N/A\n 3037 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0140 N/A\n 3038 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0140 N/A\n 3039 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0141 N/A\n 3040 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0141 N/A\n 3041 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0142 N/A\n 3042 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0142 N/A\n 3043 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0143 N/A\n 3044 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0143 N/A\n 3063 long bash schubacm_c R 4:12:35 1 med0127 N/A\n 3066 long bash schubacm_c R 4:11:45 1 med0127 N/A\n 3113 medium ngs_mapping holtgrem_c R 1:52:31 1 med0708 N/A\n 3118 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0133 N/A\n 3119 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0703 N/A\n 3126 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0706 N/A\n 3127 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0144 N/A\n 3128 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0144 N/A\n 3133 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0147 N/A\n 3134 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0147 N/A\n 3135 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0148 N/A\n 3136 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0148 N/A\n 3138 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0104 N/A\n
host:~$ sinfo\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST \ndebug* up 8:00:00 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \ndebug* up 8:00:00 8 mix med[0104,0127,0133-0135,0703,0706,0708] \ndebug* up 8:00:00 10 alloc med[0137-0144,0147-0148] \ndebug* up 8:00:00 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \nmedium up 7-00:00:00 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \nmedium up 7-00:00:00 8 mix med[0104,0127,0133-0135,0703,0706,0708] \nmedium up 7-00:00:00 10 alloc med[0137-0144,0147-0148] \nmedium up 7-00:00:00 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \nlong up 28-00:00:0 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \nlong up 28-00:00:0 8 mix med[0104,0127,0133-0135,0703,0706,0708] \nlong up 28-00:00:0 10 alloc med[0137-0144,0147-0148] \nlong up 28-00:00:0 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \ncritical up 7-00:00:00 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \ncritical up 7-00:00:00 8 mix med[0104,0127,0133-0135,0703,0706,0708] \ncritical up 7-00:00:00 10 alloc med[0137-0144,0147-0148] \ncritical up 7-00:00:00 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \nhighmem up 14-00:00:0 1 mix med0402 \nhighmem up 14-00:00:0 3 idle med[0401,0403-0404] \ngpu up 14-00:00:0 2 mix med[0303-0304] \ngpu up 14-00:00:0 2 idle med[0301-0302] \n
"},{"location":"slurm/commands-sacct/","title":"Slurm Command: sacct
","text":"Perform queries to the Slurm accounting information.
Representative Example
hpc-login-1:~$ sacct -j 1607103\n JobID JobName Partition Account AllocCPUS State ExitCode\n------------ ---------- ---------- ---------- ---------- ---------- --------\n1607103 wgs_sv_an+ medium 1 PENDING 0:0\n
The sacct
command displays information from the Slurm accounting service. The Slurm scheduler only knows about active or completing (very recently active) jobs. The accouting system also knows about currently running jobs so it is the more robust way to query information about jobs. However, not all information is available to the accouting system, so scontrol show job
and squeue
provide more information about current and pending jbos.
Slurm Documentation: sacct
Please also see the official Slurm documentation on sacct.
"},{"location":"slurm/commands-sacct/#important-arguments","title":"Important Arguments","text":"Also see all important arguments of the sbatch
command.
--jobs
-- The job(s) to query for. --format
-- Define attributes to retrieve. --long
-- Get a lot of information from the database, consider to pipe into | less -S
.
"},{"location":"slurm/commands-sacct/#notes","title":"Notes","text":" - If you need to get information about a job regardless of it being in the past, present, or future execution, use
sacct
over scontrol
and squeue
.
"},{"location":"slurm/commands-sattach/","title":"Slurm Command: sattach
","text":"The sattach
command allows you to connect the standard input, output, and error streams to your current terminals ession.
Representative Example
hpc-login-1:~$ sattach 12345.0\n[...output of your job...]\nmed0211:~$ [Ctrl-C]\nhpc-login-1:~$\n
Press Ctrl-C
to detach from the current session. Please note that you will have to give the job ID as well as step step ID. For most cases, simply append \".0\"
to your job ID.
Slurm Documentation: sattach
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-sattach/#important-arguments","title":"Important Arguments","text":" --pty
-- Execute task zero in pseudo terminal. --verbose
-- Increase verbosity of sattach
.
"},{"location":"slurm/commands-sbatch/","title":"Slurm Command: sbatch
","text":"The sbatch
command allows you to put a job into the scheduler's queue to be executed at a later time.
Representative Example
# Execute job.sh in partition medium with 4 threads and 4GB of RAM total for a\n# running time of up to one day.\nhpc-login-1:~$ sbatch --partition=medium --mem=4G --ntasks 4 --time=1-00 job.sh\nSubmitted batch job JOB_ID\n
The command will create a batch job and add it to the queue to be executed at a later point in time.
Slurm Documentation: sbatch
Please also see the official Slurm documentation on sbatch.
"},{"location":"slurm/commands-sbatch/#important-arguments","title":"Important Arguments","text":" --array
-- Submit jobs as array jobs. Also see the section [#array-jobs] below. --nodes
-- The number of nodes to allocate. This is only given here as an important argument as the maximum number of nodes allocatable to any partition but mpi
is set to one (1). This is done as there are few users on the BIH HPC that actually use multi-node paralleilsm. Rather, most users will use multi-core parallelism and might forget to limit the number of nodes which causes inefficient allocation of resources. --cpus-per-task
-- This corresponds to the number of CPU cores allocated to each task. --mem
-- The memory to allocate for the job. As you can define minimal and maximal number of tasks/CPUs/cores, you could also specify --mem-per-cpu
and get more flexible scheduling of your job. --gres
-- Generic resource allocation. On the BIH HPC, this is only used for allocating GPUS, e.g., with --gres=gpu:tesla:2
, a user could allocate two NVIDIA Tesla GPUs on the same host (use a40
instead of tesla
for the A40 GPUs). --licenses
-- On the BIH HPC, this is used for the allocation of MATLAB 2016b licenses only. --partition
-- The partition to run in. Also see the Job Scheduler section. --time
-- Specify the running time, see man sbatch
or the official Slurm documentation on srun for supported formats. **Please note that the DRMA API only accepts the hours:minutes
format. --dependency
-- Specify dependencies on other jobs, e.g., using --dependency afterok:JOBID
to only execute if the job with ID JOBID
finished successfully or --dependency after:JOBID
to wait for a job to finish regardless of its termination status. --constraint
-- Require one or more features from your node. On the BIH HPC, the processor generation is defined as a feature on the nodes, e.g., haswell
, or special networking such as infiniband
. You can have a look at /etc/slurm/slurm.conf
on all configured features. --output
-- The path to the output log file (by default joining stdout and stderr, see the man page on --error
on how to redirect stderr separately). A various number of placeholders is available, see the \"filename pattern\" section of man sbatch
or the official Slurm documentation on srun. --mail-type=<type>
-- Send out notifications by email when an event occurs. Use FAIL
to get emails when your job fails. Also see the documentation of sbatch in the Slurm manual. --mail-user=<email>
-- The email address to send to. Must end in @charite.de
, @mdc-berlin.de
, or @bih-charite.de
.
Ensure your --output
directory exists!
In the case that the path to the log/output file does not exist, the job will just fail. scontrol show job ID
will report JobState=FAILED Reason=NonZeroExitCode
. Regrettably, no further information is displayed to you as the user. Always check that the path to the directories in StdErr
and StdOut
exists when checking scontrol show job ID
.
"},{"location":"slurm/commands-sbatch/#other-arguments","title":"Other Arguments","text":" --job-name
"},{"location":"slurm/commands-sbatch/#job-scripts","title":"Job Scripts","text":"Also see the section Slurm Job Scripts on how to embed the sbatch
parameters in #SBATCH
lines.
"},{"location":"slurm/commands-sbatch/#array-jobs","title":"Array Jobs","text":"If you have many (say, more than 10) similar jobs (e.g., when performing a grid search), you can also use array jobs. However, you should also consider whether it would make sense to increase the time of your jobs, e.g, to be at least ~10min.
You can submit array jobs by specifying -a EXPR
or --array EXPR
where EXPR
is a range or a list (of course, you can also add this as an #SBATCH
header in your job script). For example:
hpc-login-1 ~# sbatch -a 1-3 grid_search.sh\nhpc-login-1 ~# sbatch -a 1,2,5-10 grid_search.sh\n
This will submit grid_search.sh
with certain variables set:
SLURM_ARRAY_JOB_ID
-- the ID of the first job SLURM_ARRAY_TASK_ID
-- the index of the job in the array SLURM_ARRAY_TASK_COUNT
-- number of submitted jobs in array SLURM_ARRAY_TASK_MAX
-- highest job array index value SLURM_ARRAY_TASK_MIN
-- lowest job array index value
Using array jobs has several advantages:
- It greatly reduces the load on the Slurm scheduler.
- You do not need to submit in a loop, but rather
- You can use a single command line.
Also see Slurm documentation on job arrays.
For example, if you submit sbatch --array=1-3 grid_search.sh
and slurm responsds with Submitted batch job 36
then the script will be run three times with the following prameters set:
SLURM_JOB_ID=36\nSLURM_ARRAY_JOB_ID=36\nSLURM_ARRAY_TASK_ID=1\nSLURM_ARRAY_TASK_COUNT=3\nSLURM_ARRAY_TASK_MAX=3\nSLURM_ARRAY_TASK_MIN=1\n\nSLURM_JOB_ID=37\nSLURM_ARRAY_JOB_ID=36\nSLURM_ARRAY_TASK_ID=2\nSLURM_ARRAY_TASK_COUNT=3\nSLURM_ARRAY_TASK_MAX=3\nSLURM_ARRAY_TASK_MIN=1\n\nSLURM_JOB_ID=38\nSLURM_ARRAY_JOB_ID=36\nSLURM_ARRAY_TASK_ID=3\nSLURM_ARRAY_TASK_COUNT=3\nSLURM_ARRAY_TASK_MAX=3\nSLURM_ARRAY_TASK_MIN=1\n
"},{"location":"slurm/commands-sbatch/#notes","title":"Notes","text":" - This is the primary entry point for creating batch jobs to be executed at a later point in time.
- As with all jobs allocated by Slurm, interactive sessions executed with
sbatch
are governed by resource allocations, in particular: sbatch
jobs have a maximal running time set, sbatch
jobs have a maximal memory and number of cores set, and - also see
scontrol show job JOBID
.
"},{"location":"slurm/commands-scancel/","title":"Slurm Command: scancel
","text":"Terminate a running Slurm job.
Representative Example
hpc-login-1:~$ scancel 1703828\nhpc-login-1:~$\n
This command allows to terminate one or more running jobs (of course, non-superusers can only terminate their own jobs).
Slurm Documentation: scancel
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-scontrol/","title":"Slurm Command: scontrol
","text":"The scontrol
allows to query detailed information from the scheduler and perform manipulation. Object manipulation is less important for normal users.
Representative Example
hpc-login-1:~$ scontrol show job 1607103\nJobId=1607103 JobName=wgs_sv_annotation\n UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(5272) MCS_label=N/A\n Priority=748 Nice=0 Account=(null) QOS=normal\n [...]\nhpc-login-1:~$ scontrol show node med02[01-32]\nNodeName=med0201 Arch=x86_64 CoresPerSocket=8\n CPUAlloc=0 CPUTot=32 CPULoad=0.01\n AvailableFeatures=ivybridge,infiniband\n ActiveFeatures=ivybridge,infiniband\n [...]\nhpc-login-1:~$ scontrol show partition medium\nPartitionName=medium\n AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL\n AllocNodes=ALL Default=NO QoS=medium\n DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO\n [...]\n
This command allows to query all information for an object from Slurm, e.g., jobs, nodes, or partitions. The command also accepts ranges of jobs and hosts. It is most useful to get the information of one or a few objects from the scheduler.
Slurm Documentation: scontrol
Please also see the official Slurm documentation on scontrol.
"},{"location":"slurm/commands-scontrol/#important-sub-commands","title":"Important Sub commands","text":" scontrol show job
-- Show details on jobs. scontrol show partition
-- Show details on partitions. scontrol show node
-- Show details on nodes. scontrol help
-- Show help. scontrol
-- Start an interactive scontrol shell / REPL (read-eval-print loop).
"},{"location":"slurm/commands-scontrol/#notes","title":"Notes","text":" scontrol
can only work on jobs that are pending (in the queue), running, or in \"completing' state. - For jobs that have finished, you have to use Slurm's accounting features, e.g., with the
sacct
command.
"},{"location":"slurm/commands-sinfo/","title":"Slurm Command: sinfo
","text":"The sinfo
command allows you to query the current cluster status.
Representative Example
hpc-login-1:~$ sinfo\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST\n[...]\nmedium up 7-00:00:00 10 drain* med[0101-0103,0125-0126,0128-0132]\nmedium up 7-00:00:00 1 down* med0243\nmedium up 7-00:00:00 31 mix med[0104,0106-0122,0124,0133,0232-0233,0237-0238,0241-0242,0244,0263-0264,0503,0506]\nmedium up 7-00:00:00 5 alloc med[0105,0123,0127,0239-0240]\nmedium up 7-00:00:00 193 idle med[0134-0164,0201-0231,0234-0236,0245-0262,0501-0502,0504-0505,0507-0516,0601-0632,0701-0764]\n[...]\nhpc-login-1:$ sinfo --summarize\nPARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST\ndebug* up 8:00:00 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\nmedium up 7-00:00:00 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\nlong up 28-00:00:0 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\ncritical up 7-00:00:00 25/141/10/176 med[0101-0164,0501-0516,0601-0632,0701-0764]\nhighmem up 14-00:00:0 1/2/1/4 med[0401-0404]\ngpu up 14-00:00:0 3/0/1/4 med[0301-0304]\nmpi up 14-00:00:0 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n
This command will summaries the state of nodes by different criteria (e.g., by partition or globally).
Slurm Documentation: sinfo
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-sinfo/#important-arguments","title":"Important Arguments","text":"Also see all important arguments of the sinfo
command.
--summarize
-- Summarize the node state by partition. --nodes
-- Select the nodes to show the status for, e.g., display the status of all GPU nodes with sinfo -n med030[1-4]
.
"},{"location":"slurm/commands-sinfo/#node-states","title":"Node States","text":"The most important node states are:
down
-- node is marked as offline draining
-- node will not accept any more jobs but has jobs running on it drained
-- node will not accept any more jobs and has no jobs running on it, but is not offline yet idle
-- node is ready to run jobs allocated
-- node is fully allocated (e.g., CPU, RAM, or GPU limit has been reached) mixed
-- node is running jobs but there is space for more
"},{"location":"slurm/commands-sinfo/#notes","title":"Notes","text":" - Also see the Slurm Format Strings section.
"},{"location":"slurm/commands-squeue/","title":"Slurm Command: squeue
","text":"The squeue
command allows you to view currently running and pending jobs.
Representative Example
hpc-login-1:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 1583165 highmem 20200702 usr PD 0:00 1 (DependencyNeverSatisfied)\n 1605901 critical variant_ holtgrem PD 0:00 1 (DependencyNeverSatisfied)\n 1605902 critical variant_ holtgrem PD 0:00 1 (Dependency)\n 1605905 critical variant_ holtgrem PD 0:00 1 (DependencyNeverSatisfied)\n 1605916 critical wgs_sv_c holtgrem PD 0:00 1 (Dependency)\n 1607103 medium wgs_sv_a holtgrem PD 0:00 1 (DependencyNeverSatisfied)\n[...]\n
Slurm Documentation: squeue
Please also see the official Slurm documentation on squeue.
"},{"location":"slurm/commands-squeue/#important-arguments","title":"Important Arguments","text":" --nodelist
-- Only display jobs running on certain nodes (e.g., GPU nodes). --format
-- Define the format to print, see man squeue
for details. See below for a format string that includes the jobid, partition, job name, user name, job status, running time, number of nodes, number of CPU cores, and allocated GPUs.
"},{"location":"slurm/commands-squeue/#notes","title":"Notes","text":"The following aliases in ~/.bashrc
will allow you to print a long and informative squeue
output with sq
, pipe it into less with sql
, get only your jobs (adjust the alias
to your account) using sqme
and pipe that into less with sqmel
.
alias sq='squeue -o \"%.10i %9P %60j %10u %.2t %.10M %.6D %.4C %10R %b\" \"$@\"'\nalias sql='sq \"$@\" | less -S'\nalias sqme='sq -u YOURUSER_c_or_m \"$@\"'\nalias sqmel='sqme \"$@\" | less -S'\n
"},{"location":"slurm/commands-srun/","title":"Slurm Command: srun
","text":"The srun
command allows you to run a command now.
Representative Example
hpc-login-1:~$ srun --pty bash -i\nmed0201:~$\n
The command will perform a resource allocation with the scheduler (and wait until it has allocated the requested resources) first. Most importantly, you can specify the --pty
argument which will connect the current terminal's standard output, error, and input to your current one. This allows you to run interactive jobs such as shells with srun --pty bash -i
.
Slurm Documentation: srun
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-srun/#important-arguments","title":"Important Arguments","text":"Also see all important arguments of the sbatch
command.
--pty
-- Connect current terminal to the job's stdoud/stderr/stdin. --x11
-- Setup X11 forwarding. --immediate
-- Immediately terminate if the resources to run the job are not available, do not wait. --test-only
-- Don't run anything, but only estimate when the job would be scheduled.
"},{"location":"slurm/commands-srun/#notes","title":"Notes","text":" - This is the primary entry point for creating interactive shell sessions on the cluster.
- As with all jobs allocated by Slurm, interactive sessions executed with
srun
are governed by resource allocations, in particular: srun
jobs have a maximal running time set, srun
jobs have a maximal memory and number of cores set, and - also see
scontrol show job JOBID
.
"},{"location":"slurm/format-strings/","title":"Slurm Command Format Strings","text":"In the sections Slurm Quickstart and Slurm Cheat Sheet, we have seen that sinfo
and squeue
allow for the compact display partitions/nodes and node information. In contrast, scontrol show job <id>
and scontrol show partition <id>
and scontrol show node <id>
show comprehensive information that quickly gets hard to comprehend for multiple entries.
Now you might ask: is there anything in between? And: yes, there is.
You can tune the output of sinfo
and squeue
using parameters, in particular by providing format strings. All of this is described in the man pages of the commands that you can display with man sinfo
and man squeue
on the cluster.
"},{"location":"slurm/format-strings/#tuning-sinfo-output","title":"Tuning sinfo
Output","text":"Notable arguments of sinfo
are:
-N, --Node
-- uncompress the usual lines and display one line per node and partition. -s, --summarize
-- compress the node state, more compact display. -R, --list-reasons
-- for nodes that are not up, display reason string provided by admin. -o <fmt>, --format=<fmt>
-- use format string for display.
The most interesting argument is -o/--format
. The man page lists the following values that are used when using other arguments. In other words, many of the display modifications could also be applied with -o/--format
.
default \"%#P %.5a %.10l %.6D %.6t %N\"\n--summarize \"%#P %.5a %.10l %.16F %N\"\n--long \"%#P %.5a %.10l %.10s %.4r %.8h %.10g %.6D %.11T %N\"\n--Node \"%#N %.6D %#P %6t\"\n--long --Node \"%#N %.6D %#P %.11T %.4c %.8z %.6m %.8d %.6w %.8f %20E\"\n--list-reasons \"%20E %9u %19H %N\"\n--long --list-reasons\n \"%20E %12U %19H %6t %N\"\n
The best way to learn more about this is to play around with sinfo -o
, starting out with one of the format strings above. Details about the format strings are described in man sinfo
. Some remarks here:
%<num><char>
displays the value represented by <char>
padded with spaces to the right such that a width of <num>
is reached, %.<num><char>
displays the value represented by <char>
padded with spaces to the left such that a width of <num>
is reached, and %#<char>
displays the value represented by <char>
padded with spaces to the max length of the value represented by <char>
(this is a \"virtual\" value, used internally only, you cannot use this and you will have to place an integer here).
For example, to create a grouped display with reasons for being down use:
hpc-login-1:~$ sinfo -o \"%10P %.5a %.10l %.16F %40N %E\"\nPARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST REASON\ndebug* up 8:00:00 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\ndebug* up 8:00:00 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\nmedium up 7-00:00:00 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\nmedium up 7-00:00:00 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\nlong up 28-00:00:0 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\nlong up 28-00:00:0 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\ncritical up 7-00:00:00 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\ncritical up 7-00:00:00 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\nhighmem up 14-00:00:0 0/4/0/4 med[0401-0404] none\ngpu up 14-00:00:0 3/1/0/4 med[0301-0304] none\n
"},{"location":"slurm/format-strings/#tuning-squeue-output","title":"Tuning squeue
Output","text":"The standard squeue output might yield the following
hpc-login-1:~$ squeue | head\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 3149 medium variant_ holtgrem PD 0:00 1 (Dependency)\n 1177 medium bash jweiner_ R 6-03:32:41 1 med0127\n 1192 medium bash jweiner_ R 5-12:48:57 1 med0127\n 1210 gpu bash hilberta R 2-16:10:51 1 med0304\n 1213 long bash schubacm R 2-15:22:44 1 med0127\n 2401 gpu bash ramkem_c R 2-10:55:10 1 med0303\n 3063 long bash schubacm R 1-09:52:54 1 med0127\n 3066 long bash schubacm R 1-09:52:04 1 med0127\n 3147 medium ngs_mapp holtgrem R 1-03:13:42 1 med0148\n
Looking at man squeue
, we learn that the default format strings are:
default \"%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R\"\n-l, --long \"%.18i %.9P %.8j %.8u %.8T %.10M %.9l %.6D %R\"\n-s, --steps \"%.15i %.8j %.9P %.8u %.9M %N\"\n
This looks a bit wasteful. Let's cut down on the padding of the job ID and expand on the job name and remove some right paddings.
hpc-login-1:~$ squeue -o \"%.6i %9P %30j %.10u %.2t %.10M %.6D %R %b\" | head\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 3149 medium variant_calling holtgrem_c PD 0:00 1 (Dependency)\n 1177 medium bash jweiner_m R 6-03:35:55 1 med0127\n 1192 medium bash jweiner_m R 5-12:52:11 1 med0127\n 1210 gpu bash hilberta_c R 2-16:14:05 1 med0304\n 1213 long bash schubacm_c R 2-15:25:58 1 med0127\n 2401 gpu bash ramkem_c R 2-10:58:24 1 med0303\n 3063 long bash schubacm_c R 1-09:56:08 1 med0127\n 3066 long bash schubacm_c R 1-09:55:18 1 med0127\n 3147 medium ngs_mapping holtgrem_c R 1-03:16:56 1 med0148\n
"},{"location":"slurm/format-strings/#displaying-resources","title":"Displaying Resources","text":"Now display how many of our internal projects still exist.
hpc-login-1:~$ squeue -o \"%.6i %9P %30j %.10u %.2t %.10M %.6D %10R %s\" | head\n
The next steps are (TODO):
- setup of certificate for containers
- opening firewall apropriately
- integrate with openmpi documentation
"},{"location":"slurm/job-scripts/","title":"Slurm Job Scripts","text":"This page describes how to create SLURM job scripts.
SLURM job scripts look as follows. On the top you have lines starting with #SBATCH
. These appear as comments to bash scripts. These lines are interpreted by sbatch
in the same way as command line arguments. That is, when later submitting the script with sbatch my-job.sh
you can either have the parameter to the sbatch
call or in the file.
Multi-Node Allocation in Slurm
Classically, jobs on HPC systems are written in a way that they can run on multiple nodes at once, using the network to communicate. Slurm comes from this world and when allocating more than one CPU/core, it might allocate them on different nodes. Please use --nodes=1
to force Slurm to allocate them on a single node.
Creating the Script
host:example$ cat >my-job.sh <<\"EOF\"\n#!/bin/bash\n#\n#SBATCH --job-name=this-is-my-job\n#SBATCH --output=output.txt\n#\n#SBATCH --ntasks=1\n#SBATCH --nodes=1\n#SBATCH --time=10:00\n#SBATCH --mem-per-cpu=100M\n\ndate\n\nhostname\n>&2 echo \"Hello World\"\n\nsleep 1m\n\ndate\nEOF\n
Also see the SLURM Rosetta Stone for more options.
Submit, Look at Queue & Result
host:example$ sbatch script.sh \nSubmitted batch job 315\nhost:example$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \n 315 debug this-is- holtgrem R 0:40 1 med0127 \nhost:example$ sleep 2m\nhost:example$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \nhost:example$ cat output.txt \nWed Mar 25 13:30:56 CET 2020\nmed0127\nHello World\nWed Mar 25 13:31:56 CET 2020\n
"},{"location":"slurm/memory-allocation/","title":"Memory Allocation","text":"Memory allocation is one of the topics that users find confusing most often. This section first gives some technical background and then explains how to implement this properly with Slurm on the BIH HPC.
"},{"location":"slurm/memory-allocation/#technical-background","title":"Technical Background","text":"Technical Background Summary
- virtual memory is what your programs tells the operating system it wants to use
- resident set size is the amount of memory that your program actually uses
- most memory will be allocated on the heap
Main memory used to be one of the most important topics when programming, as computers had so little. There is the infamous quote \"640KB ought ot be enough for anybody\" wrongly attribute to Bill Gates which refers to the fact that early computers could only address that amount of memory. In MS DOS, one had to use special libraries for a program to use more memory. Today, computers are very fast and memory is plentiful and people can (rightfully) forget about memory allocation ... as long as they don't use \"much\" memory by today's standards.
The Linux operating system differentiates between the following types of memory:
- virtual memory size (vsize), the amount of memory that a process (virtually) allocates,
- resident set size (rss), the amount of memory actually used and currently in the computer's main memory,
- the swap memory usage, the amount of active memory that is not present in main memory but on the computer's disk,
- sometimes, the shared memory is also interesting, and
- it might be interesting to know about heap and stack size.
Note that above we are talking about processes, not Slurm jobs yet. Let us look at this in detail:
Each program uses some kind of memory management. For example, in C the malloc
and free
functions manually allocate and free memory while in Java, R, and Python, memory allocation and release is done automatically using a concept called garbage collection. Each program starts with a certain virtual memory size, that is the amount of memory it can address, say 128MB. When the program allocates memory, the memory allocation mechanism will check whether it has sufficient space left. If not, it will request an increase in virtual memory from the operating system, e.g., to 256MB. If this fails then the program can try to handle the error, e.g., terminate gracefully, but many programs will just panic and stop. Otherwise, the program will get access to more memory and happily continue to run.
However, programs can allocate humonguous amounts of virtual memory and only use a little. Memory is organized in \"pages\" (classically these are 4096 bytes each, but can be larger using so-called \"huge page\" features). The operating system tracks which memory pages are actually used by a process. The total size of these pages is called the resident set size: the amount of memory that is actually currently used by a program. Programs can also mark pages as unused again, thus freeing resident memory and can also decrease their virtual memory.
In some cases it is still interesting to use swap memory. Here, the contents of resident memory are copied to disk by the operating system. This process is completely transparent to the program; the data remains available at the original positions in the virtual memory! However, accessing it will take some time as it must be read back into main memory from the disk. In this way, it was possible for a computer with 4MB of RAM and a disk of 100MB to run programs that used 8MB. Of course, this was only really usable for programs that ran in the background. One could really feel the latency if a graphical program was using swapped memory (you could actually hear the hard drive working). Today, swap storage is normally only relevant when putting your computer into hibernation. Given the large main memory on the cluster nodes, their small local hard drives (just used for loading the operating system), and the extreme slowness involved in using swapped memory, the BIH HPC nodes have no swap memory allocated.
Most HPC users will also use shared memory, at least implicitly. Whenever a program uses fork
to create a subprocess (BTW, this is not a thread), the program can choose to \"copy\" its current address space. The second process then has access to the same memory as the parent process in a copy-on-write fashion. This allows, for example, pre-loading a database, and also allows the child process to use already loaded library code. If the child process writes to the copy-on-write memory of the parent, the relevant memory page will be copied and attributed to the child.
Two or more processes can share the same memory explicitly. This is usually used for inter-process communication but the Bowtie program uses it for sharing the memory of indices. For example, the Python multiprocessing
module will use this, as will two MPI processes running on the same host.
Memory is also separated into segments, the most interesting ones are heap and stack memory. For compiled languages, memory can be allocated on either. For C, an int
variable will be allocated on the stack. Every time you call a function, a stack frame is created in memory to hold the local variables and other information for the duration of the function execution. The stack thus grows through function calls made by your program and shrinks when the functions return. The stack size for a process is limited (by ulimit -s
) and a program that goes too deep (e.g., via infinite recursion) will be terminated by the operating system if it exceeds this limit. Again in C, int * ptr = (int *)malloc(10 * sizeof(int));
will allocate memory for one variable (an integer pointer) on the stack and memory for 10 integers on the heap. When the function returns, the ptr
variable on the stack will be freed but to free the array of integers, you'd have to call free(ptr)
. If the memory is not freed then this constitutes a memory leak, but that is another topic.
Other relevant segments are code, where the compiled code lives, and data, where static data such as strings displayed to the user are stored. As a side note, in interpreted languages such as R or Python, the code and data segments will refer to the code and data of the interpreter while the actual program text will be on the heap.
"},{"location":"slurm/memory-allocation/#interlude-memory-in-java","title":"Interlude: Memory in Java","text":"Memory in Java Summary
- set
-XX:MaxHeapSize=<size>
(e.g., <size>=2G
) for your program and only tune the other parameters if needed - also consider the amount of memory that Java needs for heap management in your Slurm allocations
Java's memory management provides for some interesting artifacts. When running simple Java programs, you will never run into this but if you need to use gigabytes of memory in Java then you will have to learn a bit about Java memory management. This is the case when running GATK programs, for example.
As different operating systems handle memory management differently, the Java virtual machine does its own memory management to provide a consistent interface. The following three settings are important in governing memory usage of Java:
-Xmx<size>
/-XX:MaxHeapSize=<size>
-- the maximal Java heap size -Xms<size>
/-XX:InitialHeapSize=<size>
-- the initial Java heap size -Xss<size>
/-XX:ThreadStackSize=<size>
-- maximal stack size available to a Java thread (e.g., the main thread)
Above, <size>
is a memory specification, either in bytes or with a suffix, e.g., 80M
, or 1G
.
On startup, Java does roughly the following:
- Set up the core virtual machine, load libraries, etc. and allocate (vsize) and consume (rss) memory on the OS (operating system) heap.
- Set up the Java heap, allocating (vsize) and consuming (rss) memory on the OS heap. In particular, Java will need to set up data structures for the memory management of each individual object.
- Run the program where Java data and Java threads will lead to memory allocation (vsize) and consumption (rss) of memory.
Memory freed by the Java garbage collector can be re-used by other Java objects (rss remains the same) or be freed in the operating system (rss decreases). The Java VM program itself will also consume memory on the OS stack but that is negligible.
Overall, the Java VM needs to store in main memory:
- The Java VM, program code, Java thread stacks etc. (very little memory).
- The Java heap (potentially a lot of memory).
- The Java heap management data structures (so-called \"off-heap\", but of course on the OS heap) (potentially also considerable memory).
In the BIH HPC context, the following is recommended (see the sketch after this list):
- Set the Java heap to an appropriate size (use trial-and-error to determine the correct size or look through internet forums).
- Only tune initial heap size in the case of performance issues (unlikely in batch processing).
- Only bump the stack size when problems occur.
- Consider \"off-heap\" memory when performing Slurm allocations.
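As an illustration, here is a minimal sketch of a job script that leaves some headroom between the Slurm allocation and the Java heap for the off-heap memory; the 2G margin and the name my-tool.jar are placeholders, not fixed recommendations.
#!/bin/bash\n#SBATCH --job-name=java-example\n#SBATCH --mem=10G\n#SBATCH --cpus-per-task=2\n#SBATCH --time=08:00:00\n\n# Request less heap than the Slurm allocation so that the VM itself and the\n# off-heap management structures still fit below the 10G cgroup limit.\njava -XX:MaxHeapSize=8G -jar my-tool.jar\n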
"},{"location":"slurm/memory-allocation/#memory-allocation-in-slurm","title":"Memory Allocation in Slurm","text":"Memory Allocation in Slurm Summary
- most users will simply use
--mem=<size>
(e.g., <size>=3G
) to allocate memory per node - both interactive
srun
and batch sbatch
jobs are governed by Slurm memory allocation - the sum of all memory of all processes started by your job may not exceed the job reservation.
- please don't over-allocate memory, see \"Memory Accounting in Slurm\" below for details
Our Slurm configuration uses Linux cgroups to enforce a maximum amount of resident memory. You simply specify it using --mem=<size>
in your srun
and sbatch
command.
In the (rare) case that you provide a more flexible number of threads (Slurm tasks) or GPUs, you could also look into --mem-per-cpu
and --mem-per-gpu
. The official Slurm sbatch manual is quite helpful, as is man sbatch
on the cluster command line.
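For illustration, here are examples of the different flags; the sizes are arbitrary and you should adjust them to your own jobs:
# 8GB for the whole job on one node\n$ srun --mem=8G --pty bash -i\n# 2GB per allocated CPU, i.e., 8GB in total for the four CPUs\n$ srun --cpus-per-task=4 --mem-per-cpu=2G --pty bash -i\n# the same flags work for batch jobs\n$ sbatch --mem=8G job-script.sh\n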
Slurm (or rather Linux via cgroups) will track the memory used by all processes started by your job. If the processes work independently (e.g., you put the output through a pipe prog1 | prog2
) then the amount of memory consumed will at any given time be the sum of the RSS of both processes at that time. If your program uses fork
, which uses memory in a copy-on-write fashion, the shared memory is of course only counted once. Note that Python's multiprocessing does not use copy on write: its data will be explicitly copied and consume additional memory. Refer to the Scipy/Numpy/Pandas etc. documentation on how to achieve parallelism without copying too much data.
The amount of virtual memory that your program can reserve is only \"virtually\" unlimited (pun not intended). However, in practice, the operating system will not like you allocating more than is physically available. If your program attempts to use more memory than it requested via Slurm, it will be killed.
This is reported to you in the Slurm job output log as something like:
slurmstepd: error: Detected 1 oom-kill event(s) in step <JOB ID>.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.\n
You can inspect the amount of memory available on each node in total with sinfo --format \"%.10P %.10l %.6D %.6m %N\"
, as shown below.
$ sinfo --format \"%.10P %.10l %.6D %.6m %N\"\n PARTITION TIMELIMIT NODES MEMORY NODELIST\n debug* 8:00:00 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n medium 7-00:00:00 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n long 28-00:00:0 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n critical 7-00:00:00 176 128722 med[0101-0164,0501-0516,0601-0632,0701-0764]\n highmem 14-00:00:0 4 515762 med[0401-0404]\n gpu 14-00:00:0 4 385215 med[0301-0304]\n mpi 14-00:00:0 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n
"},{"location":"slurm/memory-allocation/#memorycpu-accounting-in-slurm","title":"Memory/CPU Accounting in Slurm","text":"Memory Accounting in Slurm Summary
- you can use Slurm accounting to see memory and CPU usage of your program
- use
sacct -j JOBID --format=JobID,MaxRSS
to display the RSS usage of your program - use
sacct -j JOBID --format=Elapsed,AllocCPUs,TotalCPU
to display information about CPU usage - consider using the helpful script below to compute overallocated memory
While Slurm runs your job, it collects information about the job such as the running time, exit status, and memory usage. This information is available through the scheduling system via the squeue
and scontrol
commands, but only while the job is pending execution, executing, or currently completing. After job completion, the information is only available through the Slurm accounting system.
You can query information about jobs, e.g., using sacct
:
$ sacct -j 1607166\n JobID JobName Partition Account AllocCPUS State ExitCode\n------------ ---------- ---------- ---------- ---------- ---------- --------\n1607166 snakejob.+ critical 16 COMPLETED 0:0\n1607166.bat+ batch 16 COMPLETED 0:0\n1607166.ext+ extern 16 COMPLETED 0:0\n
This shows that the job with ID 1607166
with a job name starting with snakejob.
has been run in the critical
partition, been allocated 16 cores and had an exit code of 0:0
. For technical reasons, there is a batch
and an extern
sub step. Actually, Slurm makes it possible to run various steps in one batch as documented in the Slurm documentation.
The sacct
command has various command-line options that you can read about via man sacct
or in the Slurm documentation. We can use --brief
/-b
to show only a brief summary.
$ sacct -j 1607166 --brief\n JobID State ExitCode\n------------ ---------- --------\n1607166 COMPLETED 0:0\n1607166.bat+ COMPLETED 0:0\n1607166.ext+ COMPLETED 0:0\n
Similarly, you can use --long
to display extended information (see the manual for the displayed columns). Very long report lines can be piped into less -S
for easier display. You can fine-tune the information to display with a format string to --format
:
$ sacct -j 1607166 --format=JobID,ReqMem,MaxRSS,Elapsed,TotalCPU,AllocCPUS\n JobID ReqMem MaxRSS Elapsed TotalCPU AllocCPUS\n------------ --------- ---------- ---------- ---------- ----------\n1607166 60Gn 13:07:31 7-16:21:29 16\n1607166.bat+ 60Gn 4314560K 13:07:31 7-16:21:29 16\n1607166.ext+ 60Gn 0 13:07:31 00:00.001 16\n
From this command, we can read that we allocated 60GB of memory per node (suffix n
, here Gn
for gigabytes per node) and the maximum RSS is reported as 4.3GB. You can use this information to fine-tune your memory allocations. As a side remark, the suffix c
indicates the memory per core (e.g., that could be 60Gc
)
Further, the program ran for 13 hours and 7 minutes with 16 allocated CPU cores and consumed a total of 7 days, 16 hours, and 21 minutes of CPU time. Thus, a total of 11,061 CPU minutes were spent in 787 minutes wall-clock time. This yields an overall empirical degree of parallelism of about 11061 / 787 = 14, and a parallel efficiency of 14 / 16 = 88%. The discussion of parallel efficiency is a topic not covered here.
However, you can use the awk
script below to compute the empirical parallelism (EmpPar
) and the parallel efficiency (ParEff
). The script also displays the difference between requested and used RSS (DiffRSS
). The script can be found here.
$ sacct -j 1607166 --format=JobID,ReqMem,MaxRSS,Elapsed,TotalCPU,AllocCPUS \\\n | awk -f quick-sacct.awk\n JobID ReqMem MaxRSS Elapsed TotalCPU AllocCPUS EmpPar ParEff DiffMEM\n------------ ---------- ---------- ---------- ---------- ---------- --------- -------- --------\n1607166 60Gn 13:07:31 7-16:21:29 16 0.00 0.00 -\n1607166.bat+ 60Gn 4314560K 13:07:31 7-16:21:29 16 14.05 0.88 55.89\n1607166.ext+ 60Gn 0 13:07:31 00:00.001 16 0.00 0.00 -\n
"},{"location":"slurm/overview/","title":"Scheduling Overview","text":"The BIH HPC uses the Slurm scheduling system for resource allocation. This section of the manual attempts to give an overview of what scheduling is and how to use the Slurm scheduler. For more detailed information, you will have to refer to the Slurm website and the Slurm man pages (e.g., by entering man sbatch
or man srun
on the HPC terminal's command line).
For a quick introduction and hands-on examples, please see the manual sections
- Overview, starting with Slurm Quickstart, and
- HPC Tutorial, starting with Episode 0.
Also, make sure that you are aware of our How-To: Debug Software and How-To: Debug Software on HPC Systems guides in the case that something goes wrong.
"},{"location":"slurm/overview/#annotated-contents","title":"Annotated Contents","text":" - Background on Scheduling -- some background on scheduling and the terminology used
- Quickstart -- explains the most important Slurm commands, with examples
- Cheat Sheet -- for quick reference
- Job Scripts -- how to setup job scripts with Slurm
- Memory Allocation -- memory allocation (one of the most important concepts and one that users often find confusing)
- Introduction to Slurm Commands
srun
-- running parallel jobs now sbatch
-- submission of batch jobs scancel
-- stop/kill jobs sinfo
-- display information about the Slurm cluster squeue
-- information about pending and running jobs scontrol
-- detailed information (and control) sacct
-- access Slurm accounting information (pending, running, and past jobs) - Format Strings in Slurm -- format strings allow to display extended information about Slurm scheduler objects
- Slurm and Snakemake -- how to use Snakemake with Slurm
- X11 Forwarding -- X11 forwarding in Slurm (simple; short)
- Rosetta Stone -- lookup table for SGE <-> Slurm
"},{"location":"slurm/overview/#a-word-on-elsewhere","title":"A Word on \"Elsewhere\"","text":"Many other facilities run Slurm clusters and make their documentation available on the internet. We list some that we found useful below. However, be aware that Slurm is a highly configurable and extensible system. Other sites may have different configurations and plugins enabled than we have (or might even have written custom plugins that are not available at BIH). In any case, it's always useful to look \"\u00fcber den Tellerrand\".
- Quick Start User Guide - the official guide from the Slurm creators.
- Slurm
man
Pages - web versions of Unix manual (man
) pages. - TU Dresden Slurm Compendium - nice documentation from the installation in Dresden. Note that their installation is highly customized, in particular, their partition selection is automated (but is not for us).
- Slurm at CECI - CECI is a HPC consortium from Belgium.
- Slurm at the Arctic University of Norway
- Slurm at Technical University of Denmark - if you want to get an insight in how this looks to administrator.
"},{"location":"slurm/quickstart/","title":"Slurm Quickstart","text":"Create an interactive bash session (srun
will run bash in real-time, --pty
connects its stdout
and stderr
to your current session).
hpc-login-1:~$ srun --pty bash -i\nmed0740:~$ echo \"Hello World\"\nHello World\nmed0740:~$ exit\nhpc-login-1:~$\n
Note you probably want to longer running time for your interactive jobs. This way, your jobs can run for up to 28 days. This will make your job be routed automatically into the long
partition as it is the only one that can fit your job.
hpc-login-1:~$ srun --pty --time 28-00 bash -i\nmed0740:~$\n
Pro-Tip: Using Bash aliases for quick access.
hpc-login-1:~$ alias slogin=\"srun --pty bash -i\"\nhpc-login-1:~$ slogin\nmed0740:~$ exit\nhpc-login-1:~$ cat >>~/.bashrc <<\"EOF\"\n# Useful aliases for logging in via Slurm\nalias slogin=\"srun --pty bash -i\"\nalias slogin-x11=\"srun --pty --x11 bash -i\"\nEOF\n
Create an interactive R session on the cluster (assuming conda is active and the environment my-r
is created, e.g., with conda create -n my-r r
).
hpc-login-1:~$ conda activate my-r\nhpc-login-1:~$ srun --pty R\nR version 3.6.2 (2019-12-12) -- \"Dark and Stormy Night\"\nCopyright (C) 2019 The R Foundation for Statistical Computing\n[...]\nType 'demo()' for some demos, 'help()' for on-line help, or\n'help.start()' for an HTML browser interface to help.\nType 'q()' to quit R.\n\n\n> Sys.info()[\"nodename\"]\n nodename\n\"med0740\"\n> q()\nSave workspace image? [y/n/c]:\nhpc-login-1:~$\n
Create an interactive iPython session on the cluster (assuming conda is active and the environment my-python
is created, e.g., with conda create -n my-python python=3 ipython
).
hpc-login-1:~$ conda activate my-python\nhpc-login-1:~$ srun --pty ipython\nPython 3.8.2 | packaged by conda-forge | (default, Mar 5 2020, 17:11:00)\nType 'copyright', 'credits' or 'license' for more information\nIPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.\n\nIn [1]: import socket; socket.gethostname()\nOut[1]: 'med0740'\n\nIn [2]: exit\nhpc-login-1:~$\n
Allocate 4 cores (default is 1 core), and a total of 4GB of RAM on one node (alternatively use --mem-per-cpu
to set RAM per CPU); sbatch
accepts the same argument.
hpc-login-1:~$ srun --cpus-per-task=4 --nodes=1 --mem=4G --pty bash\nmed0740:~$ export | grep SLURM_CPUS_ON_NODE\n4\nmed0740:~$ your-parallel-script --threads 4\n
Submit an R script to the cluster in batch mode (sbatch
schedules the job for later execution).
hpc-login-1:~$ cat >job-script.sh <<\"EOF\"\n#!/bin/bash\necho \"Hello, I'm running on $(hostname) and it's $(date)\"\nEOF\nhpc-login-1:~$ sbatch job-script.sh\nSubmitted batch job 7\n\n# Some time later:\nhpc-login-1:~$ cat slurm-7.out\nHello, I'm running on med0740 and it's Fri Mar 6 07:36:42 CET 2020\nhpc-login-1:~$\n
"},{"location":"slurm/reservations/","title":"Reservations / Maintenances","text":"Hint
Read this in particular if you want to know why your job does not get scheduled and you see Reason=ReqNodeNotAvail,_Reserved_for_maintenance
in scontrol show job
.
Administration registers maintenances with the Slurm scheduler as so-called reservations. You can see the current reservations with scontrol show reservation
. The following is a scheduled reservation affecting ALL nodes of the cluster.
# scontrol show reservation\nReservationName=root_13 StartTime=2021-09-07T00:00:00 EndTime=2021-09-09T00:00:00 Duration=2-00:00:00\n Nodes=hpc-cpu-[1-36],med[0101-0116,0201-0264,0301-0304,0401-0404,0501-0516,0601-0632,0701-0764]\n NodeCnt=236 CoreCnt=5344 Features=(null) PartitionName=(null)\n Flags=MAINT,IGNORE_JOBS,SPEC_NODES,ALL_NODES TRES=cpu=10176\n Users=root Groups=(null) Accounts=(null) Licenses=(null) State=INACTIVE BurstBuffer=(null) Watts=n/a\n MaxStartDelay=(null)\n
You will also be notified when logging into the login nodes, e.g.,
--\n ***NOTE: 1 scheduled maintenance(s)***\n\n 1: 2021-09-07 00:00:00 to 2021-09-09 00:00:00 ALL nodes\n\nYou jobs do not start because of \"Reserved_for_maintenance\"?\nSlurm jobs will only start if they do not overlap with scheduled reservations.\nMore information:\n\n - https://bihealth.github.io/bih-cluster/slurm/reservations/\n - https://bihealth.github.io/bih-cluster/admin/maintenance/\n--\n
"},{"location":"slurm/reservations/#what-is-the-effect-of-a-reservation","title":"What is the Effect of a Reservation?","text":"Maintenance reservations will block the affected nodes (or even the whole cluster) for jobs. If there is a maintenance in one week then your job must have an end time before the reservation starts. By this, the job gives a guarantee to the scheduler that it will not interfer with the maintenance reservation.
For example, scontrol show job JOBID
might report the following
JobId=4011580 JobName=snakejob\n UserId=USER(UID) GroupId=GROUP(GID) MCS_label=N/A\n Priority=1722 Nice=0 Account=GROUP QOS=normal\n JobState=PENDING Reason=ReqNodeNotAvail,_Reserved_for_maintenance Dependency=(null)\n Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n RunTime=00:00:00 TimeLimit=28-00:00:00 TimeMin=N/A\n SubmitTime=2021-08-30T09:01:01 EligibleTime=2021-08-30T09:01:01\n AccrueTime=2021-08-30T09:01:01\n StartTime=2021-09-09T00:00:00 EndTime=2021-10-07T00:00:00 Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-08-30T10:20:40\n Partition=long AllocNode:Sid=172.16.35.153:5453\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=(null)\n NumNodes=1-1 NumCPUs=8 NumTasks=8 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=8,mem=4G,node=1,billing=8\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=4G MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n Power=\n NtasksPerTRES:0\n
Look out for the Reason
line:
Reason=ReqNodeNotAvail,_Reserved_for_maintenance\n
This job is scheduled to run up to 4 weeks and has been submitted on 2021-08-30.
Right now the following reservation is active
# scontrol show reservation\nReservationName=root_13 StartTime=2021-09-07T00:00:00 EndTime=2021-09-09T00:00:00 Duration=2-00:00:00\n Nodes=hpc-cpu-[1-36],med[0101-0116,0201-0264,0301-0304,0401-0404,0501-0516,0601-0632,0701-0764]\n NodeCnt=236 CoreCnt=5344 Features=(null) PartitionName=(null)\n Flags=MAINT,IGNORE_JOBS,SPEC_NODES,ALL_NODES TRES=cpu=10176\n Users=root Groups=(null) Accounts=(null) Licenses=(null) State=INACTIVE BurstBuffer=(null) Watts=n/a\n MaxStartDelay=(null)\n
Thus, the scheduler decided to set a StartTime
of the job to 2021-09-09T00:00:00
, which is the end time of the reservation. Effectively, the job is forced to run outside the maintenance reservation.
You can resolve this by using a --time=
parameter to srun
or sbatch
such that the job ends before the maintenance reservation starts.
"},{"location":"slurm/rosetta-stone/","title":"Slurm Rosetta Stone","text":"Rosetta Stone?
The Rosetta Stone is a stone slab that carries the same text in Egyptian hieroglyphs and ancient Greek. This was key for decyphering Egyptian hieroglyphs in the 18th century. Nowadays, the term is often used to label translation tables such as the one below.
The table below shows some SGE commands and their Slurm equivalents.
User Command SGE Slurm remote login qrsh/qlogin
srun --pty bash
run interactively N/A srun --pty program
submit job qsub script.sh
sbatch script.sh
delete job qdel job-id
scancel job-id
job status by job id N/A squeue --job job-id
detailed job status qstat -u '*' -j job-id
sstat job-id
job status of your jobs qstat
squeue --me
job status by user qstat -u user
squeue -u user
hold job qhold job-id
scontrol hold job-id
release job qrls job-id
scontrol release job-id
queue list qconf -sql
scontrol show partitions
node list qhost
sinfo -N
OR scontrol show nodes
cluster status qhost -q
sinfo
show node resources N/A sinfo \"%n %G\"
Job Specification SGE Slurm script directive marker #$
#SBATCH
(run in queue) -q queue
-p queue
allocated nodes N/A -N min[-max]
allocate cores -pe smp count
-n count
limit running time -l h_rt=time
-t days-hh:mm:s
redirectd stdout -o file
-o file
redirect stderr -e file
-e file
combine stdout/stderr -j yes
-o without -e
copy environment -V
--export=ALL\\|NONE\\|variables
email notification -m abe
--mail-type=events
send email to -M email
--mail-user=email
job name -N name
--job-name=name
restart job -r yes|no
--requeue|--no-requeue
working directory -wd path
--workdir
run exclusively -l exclusive
--exclusive
OR --shared
allocate memory -l h_vmem=size
--mem=mem
OR --mem-per-cpu=mem
wait for job -hold_jid jid
--depend state:job
select target host -l hostname=host1\\|host1
--nodelist=nodes
AND/OR --exclude
allocate GPU -l gpu=1
--gres=gpu:tesla:count
or --gres=gpu:a40:count
"},{"location":"slurm/snakemake/","title":"Snakemake with Slurm","text":"This page describes how to use Snakemake with Slurm.
"},{"location":"slurm/snakemake/#prerequisites","title":"Prerequisites","text":" - This assumes that you have Miniforge properly setup with Bioconda.
- Also it assumes that you have already activated the Miniforge base environment with
source miniforge/bin/activate
.
"},{"location":"slurm/snakemake/#environment-setup","title":"Environment Setup","text":"We first create a new environment snakemake-slurm
and activate it. We need the snakemake
package for this.
host:~$ conda create -y -n snakemake-slurm snakemake\n[...]\n#\n# To activate this environment, use\n#\n# $ conda activate snakemake-slurm\n#\n# To deactivate an active environment, use\n#\n# $ conda deactivate\nhost:~$ conda activate snakemake-slurm\n(snakemake-slurm) host:~$\n
"},{"location":"slurm/snakemake/#snakemake-workflow-setup","title":"Snakemake Workflow Setup","text":"We create a workflow and ensure that it works properly with multi-threaded Snakemake (no cluster submission here!)
host:~$ mkdir -p snake-slurm\nhost:~$ cd snake-slurm\nhost:snake-slurm$ cat >Snakefile <<\"EOF\"\nrule default:\n input: \"the-result.txt\"\n\nrule mkresult:\n output: \"the-result.txt\"\n shell: r\"sleep 1m; touch the-result.txt\"\nEOF\nhost:snake-slurm$ snakemake --cores=1\n[...]\nhost:snake-slurm$ ls\nSnakefile the-result.txt\nhost:snake-slurm$ rm the-result.txt\n
"},{"location":"slurm/snakemake/#snakemake-and-slurm","title":"Snakemake and Slurm","text":"You have two options:
- Simply use
snakemake --profile=cubi-v1
and the Snakemake resource configuration as shown below. STRONGLY PREFERRED - Use the
snakemake --cluster='sbatch ...'
command.
Note that we sneaked in a sleep 1m
? In a second terminal session, we can see that the job has been submitted to SLURM indeed.
host:~$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 325 debug snakejob holtgrem R 0:47 1 med0127\n
"},{"location":"slurm/snakemake/#threads-resources","title":"Threads & Resources","text":"The cubi-v1
profile (stored in /etc/xdg/snakemake/cubi-v1
on all cluster nodes) supports the following specification in your Snakemake rule:
threads
: the number of threads to execute the job on - memory in a syntax understood by Slurm, EITHER
resources.mem
/resources.mem_mb
: the memory to allocate for the whole job, OR resources.mem_per_thread
: the memory to allocate for each thread.
resources.time
: the running time of the rule, in a syntax supported by Slurm, e.g. HH:MM:SS
or D-HH:MM:SS
resources.partition
: the partition to submit your job into (Slurm will pick a fitting partition for you by default) resources.nodes
: the number of nodes to schedule your job on (defaults to 1
and you will want to keep that value unless you want to use MPI)
You will need Snakemake >=7.0.2 for this.
Here is how to call Snakemake:
# snakemake --profile=cubi-v1 -j1\n
To set rule-specific resources:
rule myrule:\n threads: 1\n resources:\n mem='8G',\n time='04:00:00',\n input: # ...\n output: # ...\n shell: # ...\n
You can combine this with Snakemake resource callables, of course:
def myrule_mem(wildcards, attempt):\n mem = 2 * attempt\n return '%dG' % mem\n\nrule snps:\n threads: 1\n resources:\n mem=myrule_mem,\n time='04:00:00',\n input: # ...\n output: # ...\n shell: # ...\n
"},{"location":"slurm/snakemake/#custom-logging-directory","title":"Custom logging directory","text":"By default, slurm will write log files into the working directory of snakemake, which will look like slurm-$jobid.out
.
To change this behaviour, the environment variable SBATCH_DEFAULTS
can be set to re-route the --output
parameter. If you want to write your files into slurm_logs
with a filename pattern of $name-$jobid
for instance, consider the following snippet for your submission script:
#!/bin/bash\n#\n#SBATCH --job-name=snakemake_main_job\n#SBATCH --ntasks=1\n#SBATCH --nodes=1\n#SBATCH --time=48:10:00\n#SBATCH --mem-per-cpu=300M\n#SBATCH --output=slurm_logs/%x-%j.log\n\nmkdir -p slurm_logs\nexport SBATCH_DEFAULTS=\" --output=slurm_logs/%x-%j.log\"\n\ndate\nsrun snakemake --use-conda -j1 --profile=cubi-v1\ndate\n
The name of the snakemake slurm job will be snakemake_main_job
, the name of the jobs spawned from it will be called after the rule name in the Snakefile.
"},{"location":"slurm/temporary-files/","title":"Slurm and Temporary Files","text":"This section describes how Slurm handles temporary files on the local disk.
Temporary Files Best Practices
See Best Practices: Temporary Files for information how to use temporary files effectively.
"},{"location":"slurm/temporary-files/#slurm-behaviour","title":"Slurm Behaviour","text":"Our Slurm configuration has the following behaviour.
"},{"location":"slurm/temporary-files/#environment-variable-tmpdir","title":"Environment Variable TMPDIR","text":"Slurm itself will by default not change the TMPDIR
environment variable but retain the variable's value from the srun
or sbatch
call.
"},{"location":"slurm/temporary-files/#private-local-tmp-directories","title":"Private Local /tmp
Directories","text":"The only place where users can write data to on local storage of the compute nodes is /tmp
.
Storage is a consumable shared resource as the storage used by one job cannot use another job. It is thus critical that Slurm cleans up after each job such that all space on the local node is available to the next job. This is done using the job_container/tmpfs Slurm plugin.
This plugin creates a so-called Linux namespace for each job and creates a bind mount of /tmp
to a location on the local storage. This mount is only visible to the currently running job and each job, even of the same user, get their own /tmp
. After a job terminates, Slurm will remove the directory and all of its content.
There is a notable exception. If you use ssh
to connect to a node rather than using srun
or sbatch
, you will see the system /tmp
directory and can also write to it. This usage of storage is not tracked and consequently you can circumvent the Slurm quota management. Using /tmp
in this fashion (i.e., outside of Slurm-controlled jobs) is prohibited. If it cannot be helped (e.g., if you need to run some debugging application that needs to create FIFO or socket files) then keep usage of /tmp
outside of Slurm job below 100MB.
"},{"location":"slurm/temporary-files/#tracking-local-storage-localtmp","title":"Tracking Local Storage localtmp
","text":"Enforcing localtmp
Gres
From January 31, we will enforce the allocated storage in /tmp
on the local disk with quotas. Jobs writing to /tmp
beyond the quota in the job allocation will not function properly and probably crash with \"out of disk quota\" messages.
Slurm tracks the available local storage above 100MB on nodes in the localtmp
generic resource (aka Gres). The resource is counted in steps of 1MB, such that a node with 350GB of local storage would look as follows in scontrol show node
:
hpc-login-1 # scontrol show node hpc-cpu-1\nNodeName=hpc-cpu-1 Arch=x86_64 CoresPerSocket=24\n [...]\n Gres=localtmp:350K\n [...]\n CfgTRES=cpu=96,mem=360000M,billing=96,gres/localtmp=358400\n [...]\n
Each job is automaticaly granted 100MB of storage on the local disk which is sufficient for most standard programs. If your job needs more temporary storage then you should either
- use the
$HOME/scratch
volume (see Best Practices: Temporary Files) - specify a
localtmp
generic resource (described here)
You can allocate the resource with --gres=localtmp:SIZE
where SIZE
is given in MB.
hpc-login-1 # srun --gres=localtmp:100k --pty bash -i\nhpc-cpu-1 # scontrol show node hpc-cpu-1\nNodeName=hpc-cpu-1 Arch=x86_64 CoresPerSocket=24\n [...]\n Gres=localtmp:250K\n [...]\n CfgTRES=cpu=96,mem=360000M,billing=96,gres/localtmp=358400\n [...]\n AllocTRES=cpu=92,mem=351G,gres/localtmp=102400\n [...]\n
The first output tells us about the resource configured to be available to user jobs and the last line show us that 100k=102400
MB of local storage are allocated.
You can also see the used resources in the details of your job:
scontrol show job 14848\nJobId=14848 JobName=example.sh\n [...]\n TresPerNode=gres:localtmp:100k\n
"},{"location":"slurm/x11/","title":"Slurm and X11","text":"Make sure to connect to the login node with X11 forwarding.
host:~$ ssh -X -l user_c hpc-login-1.cubi.bihealth.org\n
Once connected to the login node, pass the --x11
flag.
hpc-login-1:~$ srun --pty --x11 xterm\n
"},{"location":"storage/home-quota/","title":"Keeping your home folder clean","text":"We set quite restrictive quotas for user homes, but in exchange you get file system snapshots and mirroring. Your home folder should therefore only be used for scripts, your user config, and other small files. Everything else should be stored in the work
or scratch
subdirectories, which effectively link to your group's shared storage space. This document describes some common pitfalls and how to circumvent them.
Hint
The tilde character (~
) is shorthand for your home directory.
"},{"location":"storage/home-quota/#code-libraries-and-other-big-folders","title":"Code libraries and other big folders","text":"Various programs are used to depositing large folders in a user's home and can quickly use up your allotted storage quota. These include:
- Python:
~/.local/lib/python*
- *conda: Location chosen by the user.
- R:
~/R/x86_64-pc-linux-gnu-library
- HPC portal:
~/ondemand
Please note that directories whose name is starting with a dot are not shown by the normal ls
command, but require the ls -a
flag. You can search your home folder for large directories like so:
$ du -shc ~/.* ~/* --exclude=.. --exclude=.\n
You should move these locations to your work
folder and create symbolic links in their place. Conda installations should be installed in work
from the very beginning as they do not react well to being moved around.
Here is an example for the .local
folder.
$ mv ~/.local ~/work/.local\n$ ln -s ~/work/.local ~/.local\n
"},{"location":"storage/home-quota/#temporary-files","title":"Temporary Files","text":"Another usual culprit is the hidden .cache
directory which contains temporary files. This folder can be moved to the scratch
volume in a similar manner as described above.
$ mv ~/.cache ~/scratch/.cache\n$ ln -s ~/scratch/.cache ~/.cache\n
Important
Files placed in your scratch
directory will be automatically removed after 2 weeks. Do not place any valuable files in there.
"},{"location":"storage/migration-faq/","title":"Data Migration Tips and tricks","text":"Please use hpc-transfer-1
and hpc-transfer-2
for moving large amounts of files. This not only leaves the compute notes available for actual computation, but also has no risk of your jobs being killed by Slurm. You should also use tmux
to not risk connection loss during long running transfers.
"},{"location":"storage/migration-faq/#moving-a-project-folder","title":"Moving a project folder","text":" -
Define source and target location and copy contents. Please replace the parts in curly brackets with your actual folder names. It is important to end paths with a trailing slash (/
) as this is interpreted by sync
as \u201call files in this folder\u201d.
$ SOURCE=/data/gpfs-1/work/projects/{my_project}/\n$ TARGET=/data/cephfs-2/unmirrored/projects/{my-project}/\n$ rsync -ahP --stats --dry-run $SOURCE $TARGET\n
-
Remove the --dry-run
flag to start the actual copying process.
Important
File ownership information will be lost during this process. This is due to non-root users not being allowed to change ownership of arbitrary files. If this is a problem for you, please contact our admins again after completing this step.
-
Perform a second rsync
to check if all files were successfully transferred. Paranoid users might want to add the --checksum
flag to rsync
or use hashdeep
. Please note the flag --remove-source-files
which will do exactly as the name suggests, but leaves empty directories behind.
$ rsync -ahX --stats --remove-source-files --dry-run $SOURCE $TARGET\n
- Again, remove the
--dry-run
flag to start the actual deletion. - Check if all files are gone from the SOURCE folder and remove the empty directories:
$ find $SOURCE -type f | wc -l\n0\n$ rm -r $SOURCE\n
Warning
When defining your SOURCE location, do not use the *
wildcard character. It will not match hidden (dot) files and leave them behind. Its better to use a trailing slash which matches \u201cAll files in this folder\u201d.
"},{"location":"storage/migration-faq/#moving-user-work-folders","title":"Moving user work folders","text":""},{"location":"storage/migration-faq/#work-data","title":"Work data","text":" -
All files within your own work directory can be transferred as follows. Please replace parts in curly braces with your cluster user name.
$ SOURCE=/data/gpfs-1/work/users/{username}/\n$ TARGET=/data/cephfs-1/home/users/{username}/work/\n$ rsync -ahP --stats --dry-run $SOURCE $TARGET\n
Note
The --dry-run
flag lets you check that rsync is working as expected without copying any files. Remove it to start the actual transfer.
-
Perform a second rsync
to check if all files were successfully transferred. Paranoid users might want to add the --checksums
flag or use hashdeep
. Please note the flag --remove-source-files
which will do exactly as the name suggests, but leaves empty directories behind.
$ rsync -ahP --stats --remove-source-files --dry-run $SOURCE $TARGET\n
- Check if all files are gone from the SOURCE folder:
$ find $SOURCE -type f | wc -l\n0\n
"},{"location":"storage/migration-faq/#conda-environments","title":"Conda environments","text":"Conda installations tend not to react well to moving their main folder from its original location. There are numerous ways around this problem which are described here.
A simple solution we can recommend is this:
-
Install a fresh version of conda or mamba in your new work folder. Don't forget to first remove the conda init block in ~/.bashrc
.
$ nano ~/.bashrc\n$ conda init\n$ conda config --set auto_activate_base false\n
-
You can then use your new conda to export your old environments by specifying a full path like so:
$ conda env export -p /fast/work/user/$USER/miniconda/envs/<env_name> -f <env_name>.yaml\n
If you run into errors it might be better to also use the --no-builds
flag. -
Finally re-create your old environments from the yaml files:
$ conda env create -f {environment.yml}\n
"},{"location":"storage/querying-storage/","title":"Querying Storage Quotas","text":"Outdated
This document is only valid for the old, third-generation file system and will be removed soon. Quotas of our new CephFS storage are communicated via the HPC Access web portal.
As described elsewhere, all data in your user, group, and project volumes is subject to quotas. This page quickly shows how to query for the current usage of data volume and file counts for your user, group, and projects.
"},{"location":"storage/querying-storage/#query-for-user-data-and-file-usage","title":"Query for User Data and File Usage","text":"The file /etc/bashrc.gpfs-quota
contains some Bash functions that you can use for querying the quota usage. This file is automatically sourced in all of your Bash sessions.
For querying your user's data and file usage, enter the following command:
# bih-gpfs-quota-user holtgrem_c\n
You will get a report as follows. As soon as usage reaches 90%, the data/file usage will be highlighted in yellow. If you pass 99%, the data/file usage will be highlighted in red.
=================================\nQuota Report for: user holtgrem_c\n=================================\n\n DATA quota GR- FILES quota GR-\nENTITY NAME FSET USED SOFT HARD ACE USED SOFT HARD ACE\n------- ---------- ------- ----- ---- ----- ----- --- ----- ---- ----- ----- ---\nusers holtgrem_c home 103M 10% 1.0G 1.5G - 2.5k 25% 10k 12k -\nusers holtgrem_c work 639G 62% 1.0T 1.1T - 1.0M 52% 2.0M 2.2M -\nusers holtgrem_c scratch 42G 0% 200T 220T - 207k 0.1% 200M 220M -\n[...]\n
"},{"location":"storage/querying-storage/#query-for-group-data-and-file-usage","title":"Query for Group Data and File Usage","text":"# bih-gpfs-report-quota group ag_someag\n=================================\nQuota Report for: group ag_someag\n=================================\n\n DATA quota GR- FILES quota GR-\nENTITY NAME FSET USED SOFT HARD ACE USED SOFT HARD ACE\n------- ---------- ------- ----- ---- ----- ----- --- ----- ---- ----- ----- ---\ngroups ag_someag home 0 0% 1.0G 1.5G - 4 0% 10k 12k -\ngroups ag_someag work 349G 34% 1.0T 1.5T - 302 0% 2.0M 2.2M -\ngroups ag_someag scratch 0 0% 200T 220T - 1 0% 200M 220M -\n\n[...]\n
"},{"location":"storage/querying-storage/#query-for-project-data-and-file-usage","title":"Query for Project Data and File Usage","text":"# bih-gpfs-report-quota project someproj\n==================================\nQuota Report for: project someproj\n==================================\n\n DATA quota GR- FILES quota GR-\nENTITY NAME FSET USED SOFT HARD ACE USED SOFT HARD ACE\n------- ---------- ------- ----- ---- ----- ----- --- ----- ---- ----- ----- ---\ngroups someproj home 0 0% 1.0G 1.5G - 4 0% 10k 12k -\ngroups someproj work 349G 34% 1.0T 1.5T - 302 0% 2.0M 2.2M -\ngroups someproj scratch 0 0% 200T 220T - 1 0% 200M 220M -\n\n[...]\n
"},{"location":"storage/scratch-cleanup/","title":"Automated Cleanup of Scratch","text":"The scratch
space is automatically cleaned up nightly with the following mechanism.
- Daily snapshots of the
scratch
folder are created and retained for 3 days. - Files which were not modified for the last 14 days are removed.
- Erroneously deleted files can be manually retrieved from the snapshots.
Warning
We specifically use the mtime
attribute to determine if files in scratch should be cleaned up. Copying or downloading files to scratch while preserving the original mtime
might lead to unexpected results.
"},{"location":"storage/storage-locations/","title":"Storage and Volumes: Locations","text":"This document describes the forth iteration of the file system structure on the BIH HPC cluster. It was made necessary because the previous file system was no longer supported by the manufacturer and we since switched to distributed Ceph storage.
Important
For now, the old, third-generation file system is still mounted at /fast
. It will be decommissioned soon, please consult this document describing the migration process!
"},{"location":"storage/storage-locations/#organizational-entities","title":"Organizational Entities","text":"There are the following three entities on the cluster:
- Users (real people)
- Groups (Arbeitsgruppen) with one leader and an optional delegate
- Projects with one owner and an optional delegate
Each user, group, and project can have storage folders in different locations.
"},{"location":"storage/storage-locations/#data-types-and-storage-tiers","title":"Data Types and Storage Tiers","text":"Files stored on the HPC fall into one of three categories:
-
Home folders store programs, scripts, and user config i.\u00a0e. long-lived and very important files. Loss of this data requires to redo manual work (like programming).
-
Work folders store data of potentially large size which has a medium life time and is important. Examples are raw sequencing data and intermediate results that are to be kept (e.\u00a0g. sorted and indexed BAM files). Work data requires time-consuming actions to be restored, such as downloading large amounts of data or long-running computation.
-
Scratch folder store temporary files with a short life-time. Examples are temporary files (e.\u00a0g. unsorted BAM files). Scratch data is created to be removed eventually.
Ceph storage comes in two types which differ in their I/O speed, total capacity, and cost. They are called Tier 1 and Tier 2 and sometimes hot storage and warm storage. In the HPC filesystem they are mounted in /data/cephfs-1
and /data/cephfs-2
.
- Tier 1 storage is fast, relatively small, expensive, and optimized for performance.
- Tier 2 storage is slow, big, cheap, and built for keeping large files for longer times.
Storage quotas are imposed in these locations to restrict the maximum size of folders. Amount and utilization of quotas is communicated via the HPC Access web portal.
"},{"location":"storage/storage-locations/#home-directories","title":"Home Directories","text":"Location: /data/cephfs-1/home/
Only users have home directories on Tier 1 storage. This is the starting point when starting a new shell or SSH session. Important config files are stored here as well as analysis scripts and small user files. Home folders have a strict storage quota of 1\u00a0GB.
"},{"location":"storage/storage-locations/#work-directories","title":"Work Directories","text":"Location: /data/cephfs-1/work/
Groups and projects have work directories on Tier 1 storage. User home folders contain a symlink to their respective group's work folder. Files shared within a group/project are stored here as long as they are in active use. Work folders are generally limited to 1\u00a0TB per group. Project work folders are allocated on an individual basis.
"},{"location":"storage/storage-locations/#scratch-space","title":"Scratch Space","text":"Location: /data/cephfs-1/scratch/
Groups and projects have scratch space on Tier 1 storage. User home folders contain a symlink to their respective group's scratch space. Meant for temporary, potentially large data e.\u00a0g. intermediate unsorted or unmasked BAM files, data downloaded from the internet etc. Scratch space is generally limited to 10\u00a0TB per group. Projects are allocated scratch on an individual basis. Files in scratch will be automatically removed 2 weeks after their creation.
"},{"location":"storage/storage-locations/#tier-2-storage","title":"Tier 2 Storage","text":"Location: /data/cephfs-2/
This is where big files go when they are not in active use. Groups are allocated 10 TB of Tier 2 storage by default. File quotas here can be significantly larger as space is much cheaper and more abundant than on Tier 1.
Note
Tier 2 storage is currently not accessible from HPC login nodes.
"},{"location":"storage/storage-locations/#overview","title":"Overview","text":"Tier Function Path Default Quota 1 User home /data/cephfs-1/home/users/<user>
1 GB 1 Group work /data/cephfs-1/work/groups/<group>
1 TB 1 Group scratch /data/cephfs-1/scratch/groups/<group>
10 TB 1 Project work /data/cephfs-1/work/projects/<project>
On request 1 Project scratch /data/cephfs-1/scratch/projects/<project>
On request 2 Group /data/cephfs-2/unmirrored/groups/<group>
10 TB 2 Project /data/cephfs-2/unmirrored/projects/<project>
On request 2 Group /data/cephfs-2/mirrored/groups/<group>
On request 2 Project /data/cephfs-2/mirrored/projects/<project>
On request"},{"location":"storage/storage-locations/#snapshots-and-mirroring","title":"Snapshots and Mirroring","text":"Snapshots are incremental copies of the state of the data at a particular point in time. They provide safety against various \"Ops, did I just delete that?\" scenarios, meaning they can be used to recover lost or damaged files. Depending on the location and Tier, CephFS creates snapshots in different frequencies and retention plans.
Location Path Retention policy Mirrored User homes /data/cephfs-1/home/users/
Hourly for 48 h, daily for 14 d yes Group/project work /data/cephfs-1/work/
Four times a day, daily for 5 d no Group/project scratch /data/cephfs-1/scratch/
Daily for 3 d no Group/project mirrored /data/cephfs-2/mirrored/
Daily for 30 d, weekly for 16 w yes Group/project unmirrored /data/cephfs-2/unmirrored/
Daily for 30 d, weekly for 16 w no Some parts of Tier 1 and Tier 2 snapshots are also mirrored into a separate fire compartment within the data center. This provides an additional layer of security i.\u00a0e. physical damage to the servers.
"},{"location":"storage/storage-locations/#accessing-snapshots","title":"Accessing Snapshots","text":"To access snapshots simply navigate to the .snap/
sub-folder of the respective location. This special folder exists on all levels of the CephFS file hierarchy, so even in your user home directory. Inside you will find one folder per snapshot created and in those a complete replica of the respective folder at the time of snapshot creation.
For example:
/data/cephfs-1/home/.snap/<some_snapshot>/users/<your_user>/
same as: /data/cephfs-1/home/users/<your_user>/.snap/<some_snapshot>
/data/cephfs-1/work/.snap/<some_snapshot>/groups/<your_group>/
/data/cephfs-2/unmirrored/.snap/<some_snapshot>/projects/<your_project>/
Here is a simple example of how to restore a file:
$ cd /data/cephfs-2/unmirrored/groups/cubi/.snap/scheduled-2024-03-11-00_00_00_UTC/\n$ ls -l\nimportant_file.txt\n$ cp important_file.txt /data/cephfs-2/unmirrored/groups/cubi/\n
"},{"location":"storage/storage-locations/#technical-implementation","title":"Technical Implementation","text":""},{"location":"storage/storage-locations/#tier-1","title":"Tier 1","text":" - Fast & expensive
- mounted on
/data/cephfs-1
- Currently 12 Nodes with 10 \u00d7 14 TB NVME SSD each
- 1.68 PB raw storage
- 1.45 PB erasure coded (EC 8:2)
- 1.23 PB usable (85 %, Ceph performance limit)
- For typical CUBI use case 3 to 5 times faster I/O then the old DDN
- Two more nodes in purchasing process
- Hardware costs:
- One node/chunk: 45.000 \u20ac (150 TB)
- ca. 300 \u20ac/TB
"},{"location":"storage/storage-locations/#tier-2","title":"Tier 2","text":" - Slower but more affordable
- mounted on
/data/cephfs-2
- Currently 10 nodes with 52 HDDs slots and SSD cache (~40 HDDs per node with 16\u201318 TB capacity)
- 6.6 PB raw storage
- 5.3 PB erasure coded (EC 8:2)
- 4.5 PB usable (85 %; Ceph performance limit)
- More nodes in purchasing process
- Hardware costs:
- ca. 50 \u20ac per TB, 100 \u20ac mirrored
- small chunk extension possible
"},{"location":"storage/storage-locations/#tier-2-mirror","title":"Tier 2 mirror","text":" - Similar in hardware and size (10 nodes, 6+ PB)
- Stored in separate fire compartment.
"},{"location":"storage/storage-migration/","title":"Migration from old GPFS to new CephFS","text":"Important
We will remove access to /fast
on most cluster nodes following September 30th.
"},{"location":"storage/storage-migration/#what-is-going-to-happen","title":"What is going to happen?","text":"Files on the cluster's main storage /data/gpfs-1
aka. /fast
will move to a new file system. That includes users' home directories, work directories, and work-group directories. Once files have been moved to their new locations, /fast
will be retired.
Simultaneously we will move towards a more unified naming scheme for project and group folder names. From now on, all such folders names shall be in kebab-case. This is Berlin after all. Group folders will also be renamed, removing the \"ag_\" prefix.
Detailed communication about the move will be communicated via the cluster mailinglist and the user forum. For technical help, please consult the Data Migration Tips and tricks.
"},{"location":"storage/storage-migration/#why-is-this-happening","title":"Why is this happening?","text":"/fast
is based on a high performance proprietary hardware (DDN) & file system (GPFS). The company selling it has terminated support which also means buying replacement parts will become increasingly difficult.
"},{"location":"storage/storage-migration/#the-new-storage","title":"The new storage","text":"There are two file systems set up to replace /fast
, named Tier 1 and Tier 2 after their difference in I/O speed:
- Tier 1 is faster than
/fast
ever was, but it only has about 75\u00a0% of its usable capacity. - Tier 2 is not as fast, but much larger, almost 3 times the current usable capacity.
The Hot storage Tier 1 is reserved for files requiring frequent random access, user homes, and scratch. Tier 2 (Warm storage) should be used for everything else. Both file systems are based on the open-source, software-defined Ceph storage platform and differ in the type of drives used. Tier 1 or Cephfs-1 uses NVME SSDs and is optimized for performance, Tier 2 or Cephfs-2 used traditional hard drives and is optimized for cost.
So these are the three terminologies in use right now:
- Cephfs-1 = Tier 1 = Hot storage =
/data/cephfs-1
- Cephfs-2 = Tier 2 = Warm storage =
/data/cephfs-2
More information about CephFS can be found here.
"},{"location":"storage/storage-migration/#new-file-locations","title":"New file locations","text":"Naturally, paths are going to change after files move to their new location. Due to the increase in storage quality options, there will be some more folders to consider.
"},{"location":"storage/storage-migration/#users","title":"Users","text":" - Home on Tier 1:
/data/cephfs-1/home/users/<user>
- Work on Tier 1:
/data/cephfs-1/work/groups/<doe>/users/<user>
- Scratch on Tier 1:
/data/cephfs-1/scratch/groups/<doe>/users/<user>
Important
User work
& scratch
spaces are now part of the user's group folder. This means, groups need to coordinate internally to distribute their allotted quota according to each user's needs.
The implementation is done via symlinks created by default when the user account is moved to its new destination:
~/work -> /data/cephfs-1/work/groups/<group>/users/<user>
~/scratch -> /data/cephfs-1/scratch/groups/<group>/users/<user>
"},{"location":"storage/storage-migration/#groups","title":"Groups","text":" - Work on Tier 1:
/data/cephfs-1/work/groups/<group>
- Scratch on Tier 1:
/data/cephfs-1/scratch/groups/<group>
- Tier 2 storage:
/data/cephfs-2/unmirrored/groups/<group>
- Mirrored space on Tier 2 is available on request.
"},{"location":"storage/storage-migration/#projects","title":"Projects","text":" - Work on Tier 1:
/data/cephfs-1/work/projects/<project>
- Scratch on Tier 1:
/data/cephfs-1/scratch/projects/<project>
- Tier 2 storage is available on request.
"},{"location":"storage/storage-migration/#recommended-practices","title":"Recommended practices","text":""},{"location":"storage/storage-migration/#data-locations","title":"Data locations","text":""},{"location":"storage/storage-migration/#tiers","title":"Tiers","text":" - Tier 1: Designed for many I/O operations. Store files here which are actively used by your compute jobs.
- Tier 2: Big, cheap storage. Fill with files not in active use.
- Tier 2 mirrored: Extra layer of security. Longer term storage of invaluable data.
"},{"location":"storage/storage-migration/#folders","title":"Folders","text":" - Home: Configuration files, templates, generic scripts, & small documents.
- Work: Conda environments, R packages, data actively processed or analyzed.
- Scratch: Non-persistent storage for temporary or intermediate files, caches, etc.
"},{"location":"storage/storage-migration/#project-life-cycle","title":"Project life cycle","text":" - Import raw data on Tier 2 for validation (checksums, \u2026)
- Stage raw data on Tier 1 for QC & processing.
- Save processing results to Tier 2.
- Continue analysis on Tier 1.
- Save analysis results on Tier 2.
- Reports & publications can remain on Tier 2.
- After publication (or the end of the project), files on Tier 1 should be deleted.
"},{"location":"storage/storage-migration/#example-use-cases","title":"Example use cases","text":"Space on Tier 1 is limited. Your colleagues, other cluster users, and admins will be very grateful if you use it only for files you actively need to perform read/write operations on. This means main project storage should probably always be on Tier 2 with workflows to stage subsets of data onto Tier 1 for analysis.
These examples are based on our experience of processing diverse NGS datasets. Your mileage may vary but there is a basic principle that remains true for all projects.
"},{"location":"storage/storage-migration/#dna-sequencing-wes-wgs","title":"DNA sequencing (WES, WGS)","text":"Typical Whole Genome Sequencing data of a human sample at 100x coverage requires about 150 GB of storage, Whole Exome Sequencing files occupy between 6 and 30 GB. These large files require considerable I/O resources for processing, in particular for the mapping step. A prudent workflow for these kind of analysis would therefore be the following:
- For one sample in the cohort, subsample its raw data files (
fastqs
) from the Tier 2 location to Tier 1. seqtk
is your friend! - Test, improve & check your processing scripts on those smaller files.
- Once you are happy with the scripts, copy the complete
fastq
files from Tier 2 to Tier 1. Run the your scripts on the whole dataset, and copy the results (bam
or cram
files) back to Tier 2. - Remove raw data & bam/cram files from Tier 1, unless the downstream processing of mapped files (variant calling, structural variants, ...) can be done immediately.
Tip
Don't forget to use your scratch
area for transient operations, for example to sort your bam
file after mapping. More information on how to efficiently set up your temporary directory here.
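As a rough sketch of the mapping and copy-back steps (sample name, index path, and group folders are placeholders; bwa and samtools are assumed to be available, e.g. via conda):
$ cd ~/work/mapping\n$ rsync -av /data/cephfs-2/unmirrored/groups/ag-doe/raw_data/sample1_R1.fq.gz /data/cephfs-2/unmirrored/groups/ag-doe/raw_data/sample1_R2.fq.gz .\n$ bwa mem -t 16 /path/to/index/genome.fa sample1_R1.fq.gz sample1_R2.fq.gz | samtools sort -T $TMPDIR/sample1 -o sample1.bam -\n$ rsync -av sample1.bam /data/cephfs-2/unmirrored/groups/ag-doe/mapping/\n$ rm sample1_R?.fq.gz sample1.bam\n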
"},{"location":"storage/storage-migration/#bulk-rna-seq","title":"bulk RNA-seq","text":"Analysis of RNA expression datasets are typically a long and iterative process, where the data must remain accessible for a significant period. However, there is usually no need to keep raw data files and mapping results available once the gene & transcripts counts have been generated. The count files are much smaller than the raw data or the mapped data, so they can live longer on Tier 1.
A typical workflow would be:
- Copy your
fastq
files from Tier 2 to Tier 1. - Perform raw data quality control, and store the outcome on Tier 2.
- Get expression levels, for example using
salmon
or STAR
, and store the results on Tier 2. - Import the expression levels into
R
, using tximport
and DESeq2
or featureCounts
& edgeR
, for example. - Save expression levels (
R
objects) and the output of salmon
, STAR
, or any mapper/aligner of your choice to Tier 2. - Remove raw data, bam & count files from Tier 1.
Tip
If using STAR
, don't forget to use your scratch
area for transient operations. More information on how to efficiently set up your temporary directory here.
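A minimal sketch of the quantification step with salmon (index path, sample name, and group folders are placeholders):
$ cd ~/work/rnaseq\n$ salmon quant -i /path/to/salmon/index -l A -1 sample1_R1.fq.gz -2 sample1_R2.fq.gz -p 8 -o salmon/sample1\n$ rsync -av salmon/ /data/cephfs-2/unmirrored/groups/ag-doe/rnaseq/salmon/\n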
"},{"location":"storage/storage-migration/#scrna-seq","title":"scRNA-seq","text":"The analysis workflow of bulk RNA & single cell dataset is conceptually similar: Large raw files need to be processed once and only the outcome of the processing (gene counts matrices) are required for downstream analysis. Therefore, a typical workflow would be:
- Copy your
fastq
files from Tier 2 to Tier 1. - Perform raw data QC, and store the results on Tier 2.
- Get the count matrix, e.\u00a0g. using
Cell Ranger
or alevin-fry
, perform count matrix QC and store the results on Tier 2. - Remove raw data, bam & count files from Tier 1.
- Downstream analysis with
seurat
, scanpy
, or Loupe Browser
.
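For example, with Cell Ranger (sample name, reference path, and group folders are placeholders; please check the options of your Cell Ranger version):
$ cd ~/work/scrnaseq\n$ cellranger count --id=sample1 --transcriptome=/path/to/refdata-gex --fastqs=fastqs/ --sample=sample1 --localcores=16 --localmem=64\n$ rsync -av sample1/outs/ /data/cephfs-2/unmirrored/groups/ag-doe/scrnaseq/sample1/\n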
"},{"location":"storage/storage-migration/#machine-learning","title":"Machine learning","text":"There is no obvious workflow that covers most used cases for machine learning. However,
- Training might be done on scratch, where data access is fast and data size is not as constrained as in the work space. Keep in mind that files there will disappear after 14 days.
- Some models can be updated with new data, without needing to keep the whole dataset on Tier 1.
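A minimal sketch of this pattern (train.py and its options are hypothetical, the paths are placeholders):
$ RUN_DIR=$HOME/scratch/ml-run-001\n$ mkdir -p $RUN_DIR/data $RUN_DIR/checkpoints\n$ rsync -av /data/cephfs-2/unmirrored/groups/ag-doe/ml/dataset/ $RUN_DIR/data/\n$ python train.py --data $RUN_DIR/data --checkpoint-dir $RUN_DIR/checkpoints\n$ rsync -av $RUN_DIR/checkpoints/best.ckpt /data/cephfs-2/unmirrored/groups/ag-doe/ml/models/\n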
"},{"location":"storage/storage-migration/#data-migration-process-from-old-fast-to-cephfs","title":"Data migration process from old /fast
to CephFS","text":" - After being contacted by HPC admins, delegates move project folders to Tier 2. Additional Tier 1 storage is granted on request.
- User homes and group folders are moved by HPC admins to Tier 1 and 2 as appropriate. This is done on a group-by-group basis.
- Users move contents of their work directories into the new shared group work space.
Best practices and tools will be provided.
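As an example, moving the contents of an old personal work folder into the new shared group work space could look like this (user and group names are placeholders; the sketch assumes the old folder is still reachable under /fast, and you should verify the copy before deleting the source):
$ rsync -av --progress /fast/users/doej_c/work/ /data/cephfs-1/work/groups/ag-doe/users/doej_c/\n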
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"Welcome to the user documentation of the BIH high-performance computing (HPC) cluster, also called HPC 4 Research. The BIH HPC cluster is managed by CUBI (Core Unit Bioinformatics). This documentation is maintained by BIH CUBI and the user community. It is a living document that you can update and add to. See How-To: Contribute to this Document for details.
The global table of contents is on the left, the one of the current page is on the right.
Additional resources
- User discussion forum
- Performance and workload monitoring
"},{"location":"#getting-started","title":"Getting Started","text":"Read the following set of pages (in order) to learn how to get access and connect to the cluster.
- Getting Access
- Connecting
- Storage
- Slurm
- Getting Help (Writing Good Tickets; if no answer found, contact the HPC Helpdesk).
- HPC Tutorial
Acknowledging BIH HPC Usage
Acknowledge usage of the cluster in your manuscript as \"Computation has been performed on the HPC for Research/Clinic cluster of the Berlin Institute of Health\". Please add your publications using the cluster to this list.
"},{"location":"#news-maintenance-announcements","title":"News & Maintenance Announcements","text":" - July 16th: New high-memory node
hpc-mem-5
with 4 TB of RAM. - Until autumn 2024: Operation Exodus \u2013 Migration of all data from GPFS to CephFS storage.
- September 30th 2024: Unmounting of
/fast
on all non-transfer nodes. - October 31st 2024: Retirement of GPFS/DDN storage.
See Maintenance for a detailed list of current, planned, and previous maintenance and update work.
"},{"location":"#technical-details","title":"Technical Details","text":"If you are interested in how this HPC cluster is set up on a technical level, we got you covered. There is an entire section on this.
"},{"location":"#documentation-structure","title":"Documentation Structure","text":"The documentation is structured as follows:
- Administrative information about administrative processes such as how to get access, register users, work groups, and projects.
- Connecting technical help for connecting to the cluster.
- Storage describes how and where files are stored.
- HPC tutorial a first demo project for getting you started quickly.
- Cluster Scheduler technical help for using the Slurm scheduler.
- OnDemand Portal introduces web HPC access.
- Best Practice guidelines on recommended usage of certain aspects of the system.
- Static Data (Cubit) documentation about the static data (files) collection on the cluster.
- How-To short(ish) solutions for specific technical problems.
- Getting Help explains how you can obtain help in using the BIH HPC.
- Miscellaneous contains a growing list of pages that don't fit anywhere else.
"},{"location":"admin/getting-access/","title":"Getting Access","text":"Access to the BIH HPC cluster is conceptually based on user groups (also known as labs or units) and projects. Users have a relatively limited storage quota within their private home folder and store big data primarily within their group's work space or in project folders. Projects are collaborative efforts involving multiple PIs/groups and are allocated separate storage space on the cluster.
Independent group leaders at BIH/Charit\u00e9/MDC can request a group on the cluster and name group members. The work group leader (the group PI) bears the responsibility for the group's members and ensures that cluster policies and etiquette are followed. In brief: Fair usage rules apply and the cluster is not to be abused for unethical or illegal purposes. Major and/or continued violations may lead to exclusion of the entire group.
The group leader may also name one delegate (typically an IT-savvy Post-Doc) who is thereby allowed to take decisions about cluster usage and work group management on behalf of the group leader. The above-mentioned responsibilities stay with the group leader.
Note
- A Charit\u00e9 or MDC user account is required for accessing HPC 4 Research.
- Please only use email addresses from the institutions Charite, BIH, or MDC in the forms below.
"},{"location":"admin/getting-access/#work-groups-and-users","title":"Work Groups and Users","text":"All cluster users are member of exactly one primary work group. This affiliation is usually defined by real life organisational structures within Charit\u00e9/BIH/MDC. Leaders of independent research groups (PIs) can apply for a new cluster work group as follows:
- The group leader sends an email to hpc-helpdesk@bih-charite.de and includes the filled-out form below. Please read the notes box before sending.
- The HPC helpdesk decides on the request and creates corresponding objects on the cluster (users, groups, directories).
- New users are notified and sent further instructions via email.
Important
Changes to an existing group (adding new users, changes in resources, etc.) can only be requested by group leaders and delegates.
"},{"location":"admin/getting-access/#form-new-group","title":"Form: New Group","text":"Example values are given in curly braces.
# Group \"ag-{doe}\"\nGroup leader/PI: {John Doe}\nDelegate [optional]: {Max Mustermann}\nPurpose of cluster usage [short]: {RNA-seq analysis in colorectal cancer}\n\nRequired resources:\n- Tier 1 storage: {1 TB}\n- Tier 1 scratch: {10 TB}\n- Tier 2 storage: {10 TB}\n\n# Users\n## User 1\n- first name: {John}\n- last name: {Doe}\n- affiliation: {Charit\u00e9, Department of Oncology}\n- institute email: {john.doe@charite.de}\n- user has account with\n - [ ] BIH\n - [x] Charite\n - [ ] MDC\n- BIH/Charit\u00e9/MDC user name: {doej}\n\n## User 2\n[etc.]\n
"},{"location":"admin/getting-access/#form-add-user-to-group","title":"Form: Add User to Group","text":"Example values are given in curly braces.
# New user of AG {Doe}\n- first name: {Mia}\n- last name: {Smith}\n- affiliation: {Charit\u00e9, Department of Oncology}\n- institute email: {mia.smith@charite.de}\n- user has account with\n - [ ] BIH\n - [x] Charite\n - [ ] MDC\n- BIH/Charit\u00e9/MDC user name: {smithm}\n
Notes
- All cluster groups must have an owner and may have one delegate.
- Group ownership implies control but also accountability for their group files and members.
- Users can only be members of one primary work group.
- We strongly discourage onboarding non-lab members into your group. This causes biases in usage accounting, may raise concerns in IT security and data privacy audits, and also puts unfair responsibilities on the group leader.
"},{"location":"admin/getting-access/#projects","title":"Projects","text":"Projects are secondary user groups to enable:
- collaboration and data sharing across different work groups,
- fine-grained allocation of additional storage resources,
- organising data in a fine-grained manner for better data lifecycle management.
Project creation can be initiated by group leaders and group delegates as follows:
- Send an email to hpc-helpdesk@bih-charite.de and include the filled-out form below. Please read the notes box before sending.
- The HPC helpdesk decides on the request and creates corresponding objects on the cluster (groups, directories).
Important
Changes to an existing project (adding new users, changes in resources, etc.) can only be requested by project owners and delegates. Please send us cluster user names for adding new project members.
"},{"location":"admin/getting-access/#form","title":"Form","text":"Example values are given in curly braces.
# Project \"{doe-dbgap-rna}\"\nProject owner: {John Doe}, {doej_c}\nDelegate [optional]: {Max Mustermann}, {musterm_c}\nPurpose of cluster usage [short]: {RNA-seq data from dbGAP}\n\nRequired resources:\n- Tier 1 work: {0 TB}\n- Tier 1 scratch: {0 TB}\n- Tier 2 storage: {1 TB}\n\nAdditional members (cluster user names):\n- {sorgls_c}\n- ...\n
Notes
- All projects must have one owner and may have one delegate.
- Please note that we will enforce kebab case for all project names and folders.
- Tier 1 project storage will be supplemented with 10 TB of T1 scratch by default.
- Users can be associated with multiple projects.
- Project membership does not grant cluster access. A primary group affiliation is still required.
"},{"location":"admin/maintenance/","title":"Next Maintenance Window","text":"This page documents the current and known upcoming maintenance windows.
"},{"location":"admin/maintenance/#login-compute-and-storage-maintenance-december-13-14-2022","title":"Login, Compute and Storage Maintenance, December 13-14, 2022","text":"All informationand updates regarding maintenance will be circulated on our forum https://hpc-talk.cubi.bihealth.org/c/announcements/5.
"},{"location":"admin/maintenance/#login-compute-and-storage-maintenance-march-22-23-2022","title":"Login, Compute and Storage Maintenance, March 22-23, 2022","text":"All COMPUTE nodes and STORAGE resources won't be reachable!
All nodes will be running in RESERVATION mode. This means you are still able to schedule new jobs on these nodes if their potential/allowed runtime does not extend into the maintenance window (Tuesday and Wednesday, March 22 and 23, all-day). For example, if you submit a job that can run up to 7 days after March 15 then the job will remain in \"pending/PD\" state giving the explanation of \"all nodes being reserved or unavailable\".
Issues of today's maintenance:
- Mounting of storage to
/tmp
on login nodes - Changing mount options of the root partition on the compute nodes
- Upgrading all nodes kernels and further packages
- This implies an upgrade of CUDA, Singularity, and further packages
- Cold reboot (\"power off, power on\") of storage system
- Exchanging
cephfs-2
switches (Tier 2 storage, not relevant for most users)
IMPORTANT
- All nodes will reboot
- All running jobs will die
- All sessions on login nodes will die
Progress Thread on hpc-talk
"},{"location":"admin/maintenance/#drmaa-deprecation-march-2-2022","title":"DRMAA Deprecation, March 2, 2022","text":" - The usage of DRMAA on the HPC is deprecated.
- In Snakemake, it has been deprecated in favor of using Snakemake Profiles as documented.
- We will support DRMAA at least until June 30, 2022 but ask all users to migrate away from it as soon as possible.
- Background:
- With DRMAA, the status of each job is queried for using
scontrol show job JOBID
and sacct -j JOBID
. - This leads to regular remote procedure calls (RPC) to the slurm control daemon.
- It leads to a lot of such calls.
- It leads to so many calls that it prevents the scheduler from working correctly and leads to service degradation for all users.
- Using Snakemake profiles is easy.
- Call Snakemake with
snakemake --profile=cubi-v1
instead of snakemake --drmaa \"...\"
. - In your rules, specify threads, running time and memory as:
rule myrule:\n # ...\n threads: 8\n resources:\n time=\"12:00:00\",\n memory=\"8G\",\n # ...\n
"},{"location":"admin/maintenance/#cluster-setting-tuning-march-1-2022","title":"Cluster Setting Tuning, March 1, 2022","text":" - We have adjusted the scheduler settings to address high number of jobs by users:
SchedulerParameters+=bf_max_job_user=50
: backfill scheduler only considers 50 jobs of each user. This mitigates an issue with some users having too many jobs and thus other users' jobs don't get ahead in the queue EnforcePartLimits=ALL
: jobs that don't fit into their partition are rejected DependencyParameters=kill_invalid_depend
: jobs that have dependencies set that cannot be fulfilled will be killed
"},{"location":"admin/maintenance/#limiting-global-memory-usage-february-14-2022","title":"Limiting Global Memory Usage, February 14, 2022","text":" - A global memory allocation limit per user is set per partition.
- The value is set to \"max CPU count per user * 7GB\".
- Users can allocate up to \"max cpu count\" CPUs or \"max cpu count * 7GB\" RAM.
- This is enforced globally (users could allocate \u2154 of their global CPU limit with 3.5 GB RAM and \u2153 with 7GB of RAM, for example).
"},{"location":"admin/maintenance/#ganglia-fixes-docs-february-3-2022","title":"Ganglia Fixes & Docs, February 3, 2022","text":" - Reparing GPFS and NVIDIA GPU monitoring in Ganglia
- Root cause was that the Python modules in Ganglia were removed from EPEL. We now have a local package build of Ganglia, if you are interested, here is the patch and Docker based build instructions.
- You can find some documentation about our Ganglia here.
"},{"location":"admin/maintenance/#misc-changes-january-29-2022","title":"Misc Changes, January 29, 2022","text":" - We have reduced oversubscription to 2x from 4x.
- We have setup the user quota on /tmp on the login nodes to 20MB to improve stability of the nodes.
"},{"location":"admin/maintenance/#enabling-oversubscription-january-6-2022","title":"Enabling Oversubscription, January 6, 2022","text":" - Many resources remain unused as users allocate too many cores to their jobs. Slurm will now oversubscribe jobs in terms of CPUs, i.e., schedule more than one allocated core per physical core/thread.
"},{"location":"admin/maintenance/#enforcing-usage-of-localtmp-resource-january-31-2022","title":"Enforcing Usage of localtmp
Resource, January 31, 2022","text":" - We will enforce using
localtmp
resource for local storage above 100MB. - See Slurm: Temporary Files for details.
"},{"location":"admin/maintenance/#temporary-file-handling-changes-december-27-2021","title":"Temporary File Handling Changes, December 27, 2021","text":" - Each job gets its private
/tmp
using Linux namespaces/cgroups. This greatly improves the reliability of cleaning up after jobs. (Technically, this is implemented using the Slurm job_container/tmpfs) plugin. - We are starting to track available local temporary space with Slurm in the general resource (
Gres
) \"localtmp\". In the future this will become a requirement. Also see Slurm: Temporary Files.
"},{"location":"admin/maintenance/#cluster-node-upgrades-december-22-23-2021","title":"Cluster Node Upgrades, December 22-23, 2021","text":" - Renaming of cluster head nodes to:
hpc-login-1.cubi.bihealth.org
hpc-login-2.cubi.bihealth.org
hpc-portal.cubi.bihealth.org
hpc-transfer-1.cubi.bihealth.org
hpc-transfer-2.cubi.bihealth.org
- Upgraded cluster operating system from CentOS 7.9 to Rocky Linux 8.5.
- Added three more GPU nodes with Tesla V100 GPUS:
hpc-gpu-{5..7}
. - Slurm has been upgraded to
28.08.5
. - Ganglia monitoring generally available at https://hpc-ganglia.cubi.bihealth.org, from internal networks.
- We have applied a number of changes to maximal running times in Slurm configuration.
"},{"location":"admin/maintenance/#gpfs-upgrade-december-20-21-2021","title":"GPFS Upgrade, December 20-21, 2021","text":"The GPFS storage system has been upgraded to the latest version to make compatible with Enterprise Linux version 8.
"},{"location":"admin/maintenance/#slurm-upgrade-to-21080-september-8-2021","title":"Slurm upgrade to 21.08.0
, September 8, 2021","text":"Slurm has been upgraded to version 21.08.0
.
"},{"location":"admin/maintenance/#network-re-cabling-september-7-8-2021","title":"Network re-cabling, September 7-8, 2021","text":"All servers/nodes won't be reachable!
All nodes will be running in reservation mode. This means you are still able to schedule new jobs on these nodes if their potential/allowed runtime does not extend into the maintenance window (Tuesday and Wednesday, September 7 and 8, all-day). For example, if you submit a job that can run up to 7 days after August 30 then the job will remain in \"pending/PD\" state giving the explanation of \"all nodes being reserved or unavailable\".
If you already have a job running on any nodes that goes beyond September 7, 12:00 am (00:00 Uhr), this job will die.
"},{"location":"admin/maintenance/#renaming-of-gpu-high-memory-machines-scheduler-changes-september-7-2021","title":"Renaming of GPU & High Memory Machines & Scheduler Changes, September 7, 2021","text":"The GPU machines med030[1-4]
have been renamed to hpc-gpu-[1-4]
. The high memory machines med040[1-4]
have been renamed to hpc-mem-[1-4]
. It will probably take us some time to update all places in the documentation.
Further, the long
partition has been changed to allow jobs with a maximum running time of 14 days.
"},{"location":"admin/maintenance/#new-nodes-in-the-staging-partition-august-31-2021","title":"New Nodes in the staging
partition, August 31, 2021","text":"We have installed 36 new nodes (in BETA mode) in the cluster called hpc-node-[1-36]
. They have 48 cores (thus 96 hardware threads) each and have 360GiB of main memory available (for the hardware nerds, it's Intel(R) Xeon(R) Gold 6240R CPUs at 2.40GHz, featuring the cascadelake
architecture).
Right now, they are only available in the staging
partition. After some testing we will move them to the other partitions. We'd like to ask you to test them as well and report any issues to hpc-helpdesk@bih-charite.de. The nodes have been setup identically to the existing med0xxx
nodes. We do not expect big changes but the nodes might not be as stable as other oness.
Here is how you can reach them.
hpc-login-1 # srun --immediate=5 --pty --time=24:00:00 --partition=staging bash -i\n[...]\nhpc-cpu-1 #\n
Note that I'm specifying a maximal running time of 24h so the scheduler will end the job after 24 hours which is before the upcoming maintenance reservation begins. By default, the scheduler allocates 28 days to the job which means that the job cannot end before the reservation and will be scheduled to start after it. See Reservations / Maintenances for more information about maintenance reservations.
"},{"location":"admin/maintenance/#reservation-maintenance-display-on-login-august-30-2021","title":"Reservation / Maintenance Display on Login, August 30, 2021","text":"User will now be notified on login about maintenance, for example:
NOTE: scheduled maintenance(s)\n\n 1: 2021-09-07 00:00:00 to 2021-09-09 00:00:00 ALL nodes\n\nSlurm jobs will only start if they do not overlap with scheduled reservations.\nMore information:\n\n - https://bihealth.github.io/bih-cluster/slurm/reservations/\n - https://bihealth.github.io/bih-cluster/admin/maintenance/\n
"},{"location":"admin/maintenance/#update-to-job-sumission-script-august-23-2021","title":"Update to Job Sumission Script, August 23, 2021","text":"The srun
command will now behave as if --immediate=60
has been specified by default. It explains how to override this behaviour and possible reasons for job scheduling to fail within 60 seconds (reservations and full cluster).
"},{"location":"admin/maintenance/#slurm-upgrade-august-6-2021","title":"Slurm upgrade, August 6, 2021","text":"We upgrade from 20.11.2
to 20.11.8
which contains some fixes for bugs that our users actually stumbled over. The change should be non-intrusive as it's only a patch-level update.
"},{"location":"admin/maintenance/#networking-hardware-exchange-august-3-2021","title":"Networking hardware exchange, August 3, 2021","text":"Following servers won't be reachable:
- GPU nodes (med03xx)
- computing nodes (med0233-0248)
These nodes are running in reservation mode now. This means you are still able to schedule new jobs on these nodes if their potential/allowed runtime does not extend into the maintenance window (Tuesday, August 3, all-day). For example, if you submit a job that can run up to 7 days after July 26 then the job will remain in \"pending/PD\" state giving the explanation of \"all nodes being reserved or unavailable\". If you have a job running on any of the before mentioned nodes that goes beyond August 3, 12:00 am (00:00 Uhr), this job will die. We do not expect the remaining nodes to be affected. However, there remains a minor risk of unexpected downtime of other nodes.
"},{"location":"admin/maintenance/#server-reorganization-july-13-2021","title":"Server reorganization, July 13, 2021","text":"Affected servers are:
- med02xx
- med07xx
"},{"location":"admin/maintenance/#server-reorganization-june-22-23-2021","title":"Server reorganization, June 22 + 23, 2021","text":"If you have a job running on any of the before mentioned nodes that goes beyond June 22, 6am, this job will die. We put a so-called Slurm reservation for the maintenance period. Any job that is scheduled before the maintenance and whose end time (start time + max running time) is not before the start of the maintenance will not be scheduled with the message ReqNodeNotAvail, Reserved for maintenance.
Affected servers are:
- med01xx
- med05xx
- med06xx
- med03xx
- med0405
"},{"location":"admin/maintenance/#memory-and-psu-exchange-may-31-2021","title":"Memory and PSU exchange, May 31, 2021","text":" - Memory exchange
- transfer-1.research (OK)
- med0143 (OK)
- med0147 (OK)
- med0206 (FAIL - exchange part broken)
- med0233 (FAIL - exchange part broken)
- med0254 (FAIL - exchange part broken)
- PSU exchange
- med-host024 (OK)
"},{"location":"admin/maintenance/#moving-servers-may-20-may-25-2021","title":"Moving servers, May 20 + May 25, 2021","text":" - Physically moving proxmox-{2,4} and transfer-2.research (May 20)
- Physically moving proxmox-{1,3} and transfer-1.research (May 25)
"},{"location":"admin/maintenance/#miscellaneous-maintenances-december-23-25-2020","title":"Miscellaneous Maintenances, December 23-25, 2020","text":"HPC 4 Research
- Separate HPC 4 Research group GID space from other organization's.
- Fully Unavailable
- Reboot login nodes to increase RAM on hpc-login-2.research
- Update firmwares of transfer-{1,2}.research
"},{"location":"admin/maintenance/#centos-8-migration-in-planning","title":"CentOS 8 Migration (in planning)","text":"Note
This task is currently being planned. No schedule has been fixed yet.
- All nodes will be upgraded to CentOS 8.
- This will be done in a rolling fashion over the course of 1 month.
- The login nodes must be rebooted which we will do with a break of 2 days (one node will remain running).
"},{"location":"admin/maintenance/#finalize-unification-of-mass-data-mounts","title":"Finalize unification of Mass Data Mounts","text":"Note
This task is currently being planned. No schedule has been fixed yet.
- We will remove the bind mount
/fast
that currently points to /data/gpfs-1
on HPC 4 Research. - Users should use
/data
instead of /fast
everywhere, e.g., /data/users/$NAME
etc.
"},{"location":"admin/maintenance/#previous-maintenance-windows","title":"Previous Maintenance Windows","text":""},{"location":"admin/maintenance/#hpc-4-research-miscellaneous-maintenances-december-1-2020","title":"HPC 4 Research: Miscellaneous Maintenances, December 1, 2020","text":"Time: 6am-12am
- Exchange GPFS Controller
- We need to exchange a central piece of hardware in the storage system.
- We do not expect a downtime, only a degradation of service.
- Access to the GPFS will be degraded
- Slurm Scheduler
- Upgrade to the latest and greatest version.
- Restructure scheduler installation to ease rolling upgrades without future downtimes.
- Archival of old accounting information to improve scheduler performance.
- Slurm will be unavailable.
- Re-Mounting of GPFS
- The
/fast
file system will be re-mounted to /data/gpfs-1
. /fast
becomes a symbolic link to /data
on all of the cluster. - GPFS access will disappear for some time.
- Login & Transfer Node Migration
- The login nodes will be moved from physical machines to virtual machines in high-availability mode.
- Further, they will be available as
hpc-login-1.cubi.bihealth.org
and login-2...
instead of hpc-login-{1,2}
. - The same is true for,
hpc-transfer-{1,2}
which will be replaced by transfer-1.research.hpc.bihealth.org
and transfer-2...
. - The aim is to improve stability and make everything easier to manage by administration.
"},{"location":"admin/maintenance/#current-status-result","title":"Current Status / Result","text":" - We had to clear the accounting information database to make the update work within an acceptable time (we have 4M+ jobs in there). From now on we will only keep the last 31 days in the database (updated nightly).
- The old login and transfer nodes have been made available as nodes
med010[1-3]
and med012[5-6]
. - All nodes are available again.
- The maintenance is complete.
"},{"location":"admin/maintenance/#slurm-scheduler-updates-september-8-2020","title":"Slurm Scheduler Updates: September 8, 2020","text":" - To improve the scheduling behaviour we will need to restart the Slurm scheduler at ~8am.
- If everything runs well, this will finish after 30minutes (8:30 am).
- Planned Scheduler Changes:
- Introduce automatic routing of jobs to partitions.
- Make Slurm scheduler and accounting run more robustly.
"},{"location":"admin/maintenance/#network-maintenance-june-3-2020","title":"Network Maintenance: June 3, 2020","text":"On June 3, we need to perform a network maintenance at 8 am.
If everything goes well, there might be a short delay in network packages and connections will survive. In this case, the maintenance will end 8:30 am.
Otherwise, the maintenance will finish by noon.
"},{"location":"admin/maintenance/#cluster-maintenance-with-downtime-june-16","title":"Cluster Maintenance with Downtime: June 16","text":"We need to schedule a full cluster downtime on June 16.
"},{"location":"admin/maintenance/#slurm-migration","title":"Slurm Migration","text":"We will switch to the Slurm workload scheduler (from the legacy SGE). The main reason is that Slurm allows for better scheduling of GPUs (and has loads of improvements over SGE), but the syntax is a bit different. Currently, our documentation is in an transient state. We are currently extending our Slurm-specific documentation.
- March 7, 2020 (test stage): Slurm will provide 16 CPU and 3 GPU nodes (with 4 Tesla V100 each), and two high memory nodes, the remaining nodes are available in SGE. We ask users to look into scheduling with Slurm.
- March 31, 2020 (intermediate stage): Half of the nodes will be migrated to the Slurm cluster (~100), all high memory and GPU nodes will be moved to Slurm. New users are advised to use not learn SGE any more but directly use Slurm. Support for SGE is limited to bug fixing only (documentation and tips are phased out).
- May 31, 2020 (sunsetting SGE): All but 16 nodes will remain in the SGE cluster.
- June 31, 2020 (the end): SGE has reached its end of life on hpc4research.
"},{"location":"admin/maintenance/#ssh-key-management","title":"SSH Key Management","text":"SSH Key Management has switched to using Charite and MDC ActiveDirectory servers. You need to upload all keys by the end of April 2020.
- MDC Key Upload
- Charite Key Upload
Schedule
Feb 4, 2020:
Keys are now also taken from central MDC/Charite servers. You do not need to contact us any more to update your keys (we cannot accelerate the process at MDC). May 1, 2020:
Keys are now only taken from central MDC/Charite servers. You must upload your keys to central servers by then.
"},{"location":"admin/maintenance/#switch-update-location-flip-of-hpc-login-2-and-hpc-transfer-1","title":"Switch update, Location Flip of hpc-login-2 and hpc-transfer-1","text":" - Monday, February 23, 9am-15am.
Affected systems:
hpc-transfer-1
hpc-transfer-2
hpc-login-2
- a few compute nodes
The compute nodes are non-critical as we are taking them out of the queues now.
"},{"location":"admin/maintenance/#centos-76-upgrade-january-29-february-5","title":"CentOS 7.6 Upgrade, January 29, February 5","text":" - Wednesday, January 29, 2018: Reboot hpc-login-1, hpc-transfer-1
- Wednesday, February 5, 2018: Reboot hpc-login-2, hpc-transfer-2
"},{"location":"admin/maintenance/#september-03-30-2018","title":"September 03-30, 2018","text":"Starting monday 03.09.2018 we will be performing rolling update of the cluster from CentOS 7.4 to CentOS 7.5. Since update will be performed in small bunches of nodes, the only impact you should notice is smaller number of nodes available for computation.
Also, for around two weeks, you can expect that your jobs can hit both CentOS 7.4 & CentOS 7.5 nodes. This should not impact you in any way, but if you encounter any unexpected behavior of the cluster during this time, please let us know.
At some point we will have to update the transfer, and login nodes. We will do this also in parts, so the you can switch to the other machine.
Key dates are:
18.09.2018 - hpc-login-1 & hpc-transfer-1 will not be available, and you should switch to hpc-login-2 & hpc-transfer-2 respectively.
25.09.2018 - hpc-login-2 & hpc-transfer-2 will not be available, and you should switch to hpc-login-1 & hpc-transfer-1 respectively.
Please also be informed that the non-invasive maintenance this weekend which we announced has been canceled, so the cluster will operate normally.
In case of any concerns, issues, do not hesitate to contact us via hpc-admin@bih-charite.de, or hpc-helpdesk@bih-charite.de.
"},{"location":"admin/maintenance/#june-18-2018-0600-1500","title":"June 18, 2018, 0600-1500","text":"Due to tasks we need to perform on BIH cluster, we have planned maintenance:
- Maintenance start: 18.06.2018 06:00 AM
- Maintenance end: 18.06.2018 3:00 PM
During maintenance we will perform several actions:
- GPFS drives re-balancing to improve performance
- OS update on cluster, transfer, and login nodes
During the maintenance the whole cluster will not be usable; this includes:
- you will not be able to run jobs on cluster (SGE queuing system will be shutdown)
- hpc-login-{1,2} nodes will not work reliably during this time
- hpc-transfer-{1-2} nodes, and resources shared by them, will not be available
The maintenance window is quite long, since we are dependent on an external vendor. However, we will recover services as soon as possible.
We will keep you posted during maintenance with services status.
"},{"location":"admin/maintenance/#march-16-18-2018-mdc-it","title":"March 16-18, 2018 (MDC IT)","text":"MDC IT has a network maintenance from Friday, March 16 18:00 hours until Sunday March 18 18:00 hours.
This will affect connections to the cluster but no connections within the cluster.
"},{"location":"admin/maintenance/#january-17-2018-complete","title":"January 17, 2018 (Complete)","text":"STATUS: complete
The first aim of this window is to upgrade the cluster to CentOS 7.4 to patch against the Meltdown/Spectre vulnerabilities. For this, the login and transfer nodes have to be rebooted.
The second aim of this window is to reboot the file server to mitigate some NFS errors. For this, the SGE master has to be stopped for some time.
"},{"location":"admin/maintenance/#planprogress","title":"Plan/Progress","text":" - reboot med-file1
- update to CentOS 7.4
- front nodes
- hpc-login-1
- hpc-login-2
- hpc-login-3 (admin use only)
- hpc-transfer-1
- hpc-transfer-2
- infrastructure nodes
- qmaster*
- install-srv
- compute nodes
- med0100 to med0246
- med0247 to med0764
- special purpose compute nodes
- med0401 (high-memory)
- med0402 (high-memory)
- med0403 (high-memory)
- med0404 (high-memory)
- med0405 (GPU)
"},{"location":"admin/maintenance/#previous-maintenance","title":"Previous Maintenance","text":"(since January 2010)
- none
"},{"location":"admin/policies/","title":"Policies","text":"This page describes strictly enforced policies valid on the BIH HPC clusters.
The aim of the HPC systems is to support the users in their scientific work and relies on their cooperation. First and foremost, the administration team enforces state of the art IT security and reliability practices through their organizational and operational processes and actions. We kindly ask user to follow the Cluster Etiquette describe below to allow for fair use and flexible access to the shared resources. Beyond this, policies are introduced or enforced only when required to ensure non-restrictive access to the resources themselves. Major or recurrent breaches of policies may lead to exclusion from service.
We will update this list of policies over time. Larger changes will be announced through the mailing list.
"},{"location":"admin/policies/#cluster-etiquette","title":"Cluster Etiquette","text":" - The clusters are soft-partitioned shared resources that are made available under a \"fair use\" policy as far as possible.
- The general assumption that if a user interferes with the work of others (e.g., by blocking compute slots) then this happens accidentally.
- Please do not do this.
- If you see this happening try to contact the user yourself (use
getent passswd $USER
to find out the user's office contact details). - Send an email to hpc-helpdesk@bih-charite.de if you need administrative intervention.
- All users must be subscribed to the cluster mailing list (they are subscribed automatically when the account is created).
- When leaving please send an email to hpc-helpdesk@bih-charite.de such that we can shutdown your account in an organized fashion. We also need to arrange for cleaning up your data.
- The cluster mailing list bih-cluster@charite.de is the primary contact channel for announcements by administration to users. Users must be subscribed to the mailing list. Users must follow the announcements, failure to do so can lead to missing important policy changes and thus losing access to the cluster or data.
- Do not perform any computation on the login nodes. This includes: running
conda
, archive management tools such as tar
, (un)zip
, or gzip
. You should probably only run screen
/tmux
and maybe a text editor there. - Do not perform file transfers through the login nodes. Rather use the transfer nodes
hpc-transfer-1
and hpc-transfer-2
.
"},{"location":"admin/policies/#cluster-policies","title":"Cluster Policies","text":""},{"location":"admin/policies/#file-system-policies","title":"File System Policies","text":"In the case of violations marked with a shield () administration reserves the right to remove write and possibly read permission to the given locations. Policies marked with a robot () are automatically enforced.
- Storage on the GPFS file system is a sparse resource try to use both data volume and file sparingly. Note well that small files above ~4KB take up at least 8MB of space.
- Default quotas are as follows (each user, group, project has a
home
, work
, and scratch
volume). You can request an increase by an email to hpc-helpdesk@bih-charite.de for groups and projects. home
10k files, 1GB space work
2M files, 1TB space scratch
20M files, 200TB space
- The overall throughput limit is 10GB/sec. Try not to overload the cluster I/O wise.
- User home/work/group file sets have to be owned by the user, group is
hpc-users
and mode is u=rwx,go=
; POSIX ACLs are prohibited. This policy is automatically enforced every 5 minutes. - Group and project home/work/group file sets have to be owned by the owner, group set to the corresponding unix group and mode is
u=rwx,g=rwxs,o=
; POSIX ACLs are prohibited. This policy is automatically enforced every 5 minutes. - All files in scratch will be moved into a read-only \"trash can\" inside
scratch/BIH_TRASH
after 14 days (by mtime
) over night. Trash directories will be removed after 14 further days. - Users can arrange with hpc-helpdesk@bih-charite.de to keep files longer by using
touch
on files in scratch
and subsequently bumping the mtime
. - In the case of abuse of this mechanism / failure to communicate with hpc-helpdesk, administration reserves the right to drastically reduce scratch quota of affected users and employ other measures to ensure stability of operations.
- You can learn more in the Automated Scratch Cleanup section.
- Administration will not delete any files (outside of
/tmp
). In the case that users need to delete files that they can access but not update/delete, administration will either give write permissions to the Unix group of the work group or project or change the owner to the owner/delegate of this group. This can occur in a group/project directory of a user who has left the organization. In the case that a user leaves the organization, the owner/delegate of the hosting group can request getting access to the user's files with the express agreement of this user. - Only use
/tmp
in Slurm-controlled jobs. This will enforce that Slurm can clean up after you.
"},{"location":"admin/policies/#connections","title":"Connections","text":"Network connections are a topic important in security. In the case of violations marked with a shield () administration reserves the right to terminate connections without notice and perform other actions.
- Data transfers should happen through the transfer nodes (HPC 4 Research) and/or the compute nodes themselves.
- The cluster is not meant as a \"hop node\". Do not use it to connect to the login node first and then jump to another host outside of the cluster network. Doing so is a breach of cluster policies and quite possibly your organization's IT security policies
- As a corollary, SSH reverse tunnels are strictly prohibited.
- Outgoing connections are meant for data transfers only (in other words: using SSH/SCP to download file is fine).
- Do not leave outgoing connections open longer than necessary.
- Sessions of
screen
and tmux
are only allowed to run on the head nodes. They will be terminated automatically on the compute nodes.
"},{"location":"admin/policies/#interactive-use","title":"Interactive Use","text":" - Interactive sessions block resources to the scheduler. Reduce interactive use to the minimal time and resources possible.
- The cluster is optimized for batch processing. Interactive use is a secondary aim. Administration attempts to strike a good balance here but batch usage is most important. Consider using our Open on Demand service for interactive use.
- Interactive use should happen through the Slurm scheduler (
srun
). - SSH connections to the nodes are allowed for monitoring purposes but not meant for computation. Administration enforces this by restricting all jobs outside of Slurm to use at most 1 core and 128 MB of RAM. This limit is enforced per node per user with Linux cgroups.
- Interactive Slurm sessions on scarce resources (GPU/Highmem partitions) are limited to 24 h.
"},{"location":"admin/policies/#gpu-use","title":"GPU Use","text":" - Interactive sessions block resources to the scheduler. Interactive GPU use is discouraged.
- Accessing GPU cores outside of the Slurm scheduler has been disabled by administration.
"},{"location":"admin/policies/#account-policies","title":"Account Policies","text":" - Sharing accounts and/or credentials is strictly prohibited. Doing so is a breach of cluster policies and certainly also of your organization's IT security policies.
- Hosting shared services on the cluster is also strictly prohibited. - This includes Jupyter servers that shall only be used by the user starting them, this also includes work schedulers such as Dask. - You can assume that the cluster internal network is secure and you do not have to encrypt connections between nodes. - Connections towards outside of the cluster must be encrypted (e.g., via SSH tunnels; incoming ones as reverse tunneling is prohibited, see above). - Access to any service must be protected by appropriate means, e.g., passwords, tokens or client certificates.
"},{"location":"admin/policies/#maintenance","title":"Maintenance","text":" - Maintenance that are expected to cause major service interruptions (the whole system becomes unusable and/or jobs might be prevented to run etc.) are announced 14 days in advance.
- Maintenance of login nodes (e.g., reboot one node while the other is still available) are announced 7 days in advance.
- Maintenance of transfer nodes are announced 1 day in advance. Rationale: transfer nodes expected to not have any interactive sessions running.
"},{"location":"admin/policies/#credentials-policies","title":"Credentials Policies","text":" - Login is currently based on SSH keys only.
- SSH keys must be deposited with the host organizations (Charite/MDC) as documented.
- For technical reasons, the compute nodes also use the
~/.ssh/authorized_keys
file but their usage is discouraged.
"},{"location":"best-practice/bashrc-guide/","title":"~/.bashrc
Guide","text":"You can find the current default content of newly created user homes in /etc/skel.bih
:
hpc-login-1:~$ head /etc/skel.bih/.bash*\n==> /etc/skel.bih/.bash_logout <==\n# ~/.bash_logout\n\n==> /etc/skel.bih/.bash_profile <==\n# .bash_profile\n\n# Get the aliases and functions\nif [ -f ~/.bashrc ]; then\n . ~/.bashrc\nfi\n\n# User specific environment and startup programs\n\nPATH=$PATH:$HOME/.local/bin:$HOME/bin\n\n==> /etc/skel.bih/.bashrc <==\n# .bashrc\n\n# Source global definitions\nif [ -f /etc/bashrc ]; then\n . /etc/bashrc\nfi\n\n# Uncomment the following line if you don't like systemctl's auto-paging feature:\n# export SYSTEMD_PAGER=\n
"},{"location":"best-practice/env-modules/","title":"Custom Environment Modules","text":"This document contains a few tips for helping you using environment modules more effectively. As the general online documentation is lacking a bit, we also give the most popular commands here.
"},{"location":"best-practice/env-modules/#how-does-it-work","title":"How does it Work?","text":"Environment modules are descriptions of software packages. The module
command is provided which allows the manipulation of environment variables such as PATH
, MANPATH
, etc., such that programs are available without passing the full path. Environment modules also allow specifying dependencies between packages and conflicting packages (e.g., when the same binary is available in two packages). Further, environment variables allow the parallel installation of different software versions in parallel and then using software \"a la carte\" in your projects.
"},{"location":"best-practice/env-modules/#popular-commands","title":"Popular Commands","text":""},{"location":"best-practice/env-modules/#querying","title":"Querying","text":"List currently loaded modules:
$ module list\n
Show all available modules
$ module avail\n
"},{"location":"best-practice/env-modules/#loadingunloading-modules","title":"Loading/Unloading Modules","text":"Load one module, make sure to use a specific version to avoid ambiguities.
$ module load Jannovar/0.16-Java-1.7.0_80\n
Unload one module
$ module unload Jannovar\n
Unload all modules
$ module purge\n
"},{"location":"best-practice/env-modules/#getting-help","title":"Getting Help","text":"Get help for environment modules
$ module help\n
Get help for a particular environment module
$ module help Jannovar/0.16-Java-1.7.0_80\n
"},{"location":"best-practice/env-modules/#using-your-own-module-files","title":"Using your own Module Files","text":"You can also create your own environment modules. Simply create a directory with module files and then use module use
for using the modules from the directory tree.
$ module use path/to/modules\n
"},{"location":"best-practice/env-modules/#faq-why-bash-module-command-not-found","title":"FAQ: Why -bash: module: command not found
?","text":"On the login nodes, the module
command is not installed. You should not run any computations there, so why would you need environment modules there? ;)
meg-login2$ module\n-bash: module: command not found\n
Use srun --pty bash -i
to get to one of the compute nodes.
"},{"location":"best-practice/env-modules/#auto-loading-a-set-of-modules","title":"Auto-loading a set of Modules","text":"You will certainly finding yourself using a set of programs regularly without it being part of the core cluster installation, e.g., SAMtools, or Python 3. Just putting the appropriate module load
lines in your ~/.bashrc
will generate warnings when logging into the login node. It is thus recommended to use the following snippet for loading modules automatically on logging into a compute node:
case \"${HOSTNAME}\" in\n login-*)\n ;;\n *)\n # load Python3 environment module\n module load Python/3.4.3-foss-2015a\n\n # Define path for temporary directories, don't forget to cleanup!\n # Also, this will only work after /fast is available.\n export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp\n ;;\nesac\n
"},{"location":"best-practice/project-structure/","title":"Project File System Structure","text":"Under Construction
This guide was written for the old GPFS file system and is in the process of being updated.
"},{"location":"best-practice/project-structure/#general-aims","title":"General Aims","text":"Mostly, you can separate the files in your projects/pipelines into one of the following categories:
- scripts (and their documentation)
- configuration
- data
Ideally, scripts and documentation are independent of a given project and can be separated from the rest. Configuration is project-dependent and small and mostly does not contain any sensitive information (such as genotypes that allows for reidentification of donors). In most cases, data might be large and is either also stored elsewhere or together with scripts and configuration can be regenerated easily.
There is no backup of work
and scratch
The cluster GPFS file system /fast
is not appropriate for keeping around single \"master\" copies of data. You should have a backup and archival strategy for your valuable \"master\" copy data.
"},{"location":"best-practice/project-structure/#best-practices","title":"Best Practices","text":""},{"location":"best-practice/project-structure/#scripts","title":"Scripts","text":" - Your scripts should go into version control, e.g., a Git repository.
- Your scripts should be driven by command line parameters and/or configuration such that no paths etc. are hard-coded. If for a second data set, you need to make a copy of your scripts and adjust some variables, e.g., at the top, you're doing something in a suboptimal fashion. Rather, get these values from the command line or a configuration file and only store (sensible) defaults in your script where appropriate.
- Thus, ideally your scripts are not project-specific.
"},{"location":"best-practice/project-structure/#configuration","title":"Configuration","text":" - Your configuration usually is project-specific.
- Your configuration should also go into version contro, e.g., a Git repository.
In addition, you might need project-specific \"wrapper\" scripts that just call your project-independent script with the correct paths for your project. These scripts rather fall into the \"configuration\" category and should then live together with your configuration.
"},{"location":"best-practice/project-structure/#data","title":"Data","text":" - Your data should go into a location separate from your scripts and configuration.
- Ideally, the raw input data is separated from the work and output files such that you can make these files and directories read-only and don't accidentally damage these files.
Temporary files
You really should keep temporary files in a temporary directory, set the environment variable TMPDIR
appropriately and automatically clean them up (see Useful Tips: Temporary Files)
"},{"location":"best-practice/project-structure/#best-practices-in-practice","title":"Best Practices in Practice","text":"But how can we put this into practice? Below, we give some examples of how to do this. Note that for simplicity's sake we put all scripts and configuration into one directory/repository contrary to the best practices above. This is for educational purposes only and you should strive for reuseable scripts where it makes sense and separate scripts and configuration.
We will limit this to simple Bash scripts for education's purposes. You should be able to easily adapt this to your use cases.
Thus, the aim is to separate the data from the non-data part of the project such that we can put the non-data part of the project into a separate location and under version control. We call the location for non-data part of the project the home location of your project and the location for the data part of the project the work location of your project.
Overall, we have three options:
- Your processes are run in the home location and the sub directories used for execution are links into the work location using symlinks.
- Your processes are run in the work location and
- the scripts are linked into the work location using symlinks, OR
- the scripts are called from the home location, maybe through project-specific wrapper scripts.
"},{"location":"best-practice/project-structure/#example-link-configscripts-into-work-location-option-1","title":"Example: Link config/scripts into work location (Option 1)","text":"Creating the work directory and copy the input files into work/input
.
$ mkdir -p project/work/input\n$ cp /data/cephfs-1/work/projects/cubit/tutorial/input/* project/work/input\n
Creating the home space. We initialize a Git repository, properly configure the .gitignore
file and add a README.md
file.
$ mkdir -p project/home\n$ cd project/home\n$ cat <<EOF >.gitignore\n*~\n.*.sw?\nEOF\n$ cat <<EOF >README.md\n# Example Project\n\nThis is an example project with config/scripts linked into work location.\nEOF\n$ git init\n$ git add .gitignore README.md\n$ git commit -m 'Initial project#\n
We then create the a simple script for executing the mapping step and a configuration file that gives the path to the index and list of samples to process.
$ mkdir scripts\n$ cat <<\"EOF\" >scripts/run-mapping.sh\n#!/bin/bash\n\n# Unofficial Bash script mode, see:\n# http://redsymbol.net/articles/unofficial-bash-strict-mode/\nset -euo pipefail\n\n# Get directory to bash file, see\n# https://stackoverflow.com/a/4774063/84349\nSCRIPTPATH=\"$( cd \"$(dirname \"$0\")\" ; pwd -P )\"\n\n# Helper function to print help to stderr.\nhelp()\n{\n >&2 echo \"Run Mapping Step\"\n >&2 echo \"\"\n >&2 echo \"run-mapping.sh [-c config.sh] [-h]\"\n}\n\n# Parse command line arguments into bash variables.\nCONFIG=\nwhile getopts \"hs:\" arg; do\n case $arg in\n h)\n help()\n exit\n ;;\n s)\n CONFIG=$OPTARG\n ;;\n esac\ndone\n\n# Print the executed commands.\nset -x\n\n# Load default configuration, then load configuration file if any was given.\nsource $SCRIPTPATH/../config/default-config.sh\nif [[ -z \"$CONFIG\" ]]; then\n source $CONFIG\nfi\n\n# Create output directory.\nmkdir -p output\n\n# Actually perform the mapping. This assumes that you have\n# made the bwa and samtools commands available, e.g., using conda.\nfor sample in $SAMPLES; do\n bwa mem \\\n $BWA_INDEX \\\n input/${sample}_R1.fq.gz \\\n input/${sample}_R2.fq.gz \\\n | samtools sort \\\n -o output/${sample}.bam \\\n /dev/stdin\ndone\n\nEOF\n$ chmod +x scripts/run-mapping.sh\n$ mkdir -p config\n$ cat <<\"EOF\" >config/default-config.sh\nBWA_INDEX=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/hs37d5/hs37d5.fa\nSAMPLES=\nEOF\n$ cat <<\"EOF\" >config/project-config.sh\n$ BWA_INDEX comes from default configuration already\nSAMPLES=test\nEOF\n
This concludes the basic project setup. Now, to the symlinks:
$ cd ../work\n$ ln -s ../home/scripts ../home/config .\n
And, to the execution...
$ ./scripts/run-mapping -c config/project-config.sh\n[...]\n
"},{"location":"best-practice/project-structure/#example-link-data-into-home-option-21","title":"Example: Link Data Into Home (Option 2.1).","text":"We can reuse the project up to the statement \"This concludes the basic project setup\" in the example for option 1.
Then, we can do the following:
$ cd ../work\n$ mkdir -p output\n\n$ cd ../home\n$ cat <<\"EOF\" >>.gitignore\n\n# Ignore all data\ninput/\nwork/\noutput/\nEOF\n$ git add .gitignore\n$ git commit -m 'Ignoring data file in .gitignore'\n$ ln -s ../work ../output .\n
And we can execute everything in the home directory.
$ ./scripts/run-mapping -c config/project-config.sh\n[...]\n
"},{"location":"best-practice/project-structure/#example-wrapper-scripts-in-home-option-22","title":"Example: Wrapper Scripts in Home (Option 2.2)","text":"Again, we can reuse the project up to the statement \"This concludes the basic project setup\" in the example for option 1.
Then, we do the following:
$ cd ../work\n$ cat <<\"EOF\" >do-run-mapping.sh\n#!/bin/bash\n\n../home/scripts/run-mapping.sh \\\n -c ../home/config/project-config.sh\nEOF\n$ chmod +x do-run-mapping.sh\n
Note that the the do-run.sh
script could also go into the project-specific Git repository and be linked into the work directory.
Finally, we can run our pipeline:
$ cd ../work\n$ ./do-run-mapping.sh\n[...]\n
"},{"location":"best-practice/screen-tmux/","title":"Screen and Tmux Best Pratice","text":"The program screen
allows you to detach your session from your current login session. So in case you get disconnected your screen session will stay alive.
Hint
You have to reconnect to screen on the machine that you started it. We thus recommend starting it only on the login nodes and not on a compute node.
"},{"location":"best-practice/screen-tmux/#start-and-terminat-a-screen-session","title":"Start and terminat a screen session","text":"You start a new screen
session by
$ screen\n
When you are in a screen session you can terminate it with $ exit\n
so its gone then."},{"location":"best-practice/screen-tmux/#detach-a-screen-session","title":"Detach a screen session","text":"If you want to detach your screen session press Ctrl+a d
"},{"location":"best-practice/screen-tmux/#list-screen-sessions","title":"List screen sessions","text":"To list all your screen sessions run
$ screen -ls\n\nThere is a screen on:\n 2441.pts-1.med0236 (Detached)\n1 Socket in /var/run/screen/S-kbentel.\n
"},{"location":"best-practice/screen-tmux/#reattach-screen-session","title":"Reattach screen session","text":"To reattach a screen session run
$ screen -r screen_session_id\n
If you do not know the screen_session_id
you can get it with screen -ls
, e.g. 2441.pts-1.med0236
in the example above. You do not have to type the whole screen_session_id
but only as much as is necessary to identify it uniquely. In case there is only one detached screen session, it is enough to run screen -r
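For example, for the session shown in the listing above, the numeric prefix is already enough to identify it:
$ screen -r 2441\n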
"},{"location":"best-practice/screen-tmux/#kill-a-detached-screen-session","title":"Kill a detached screen session","text":"Sometimes it is necessary to kill a detached screen session. This is done with the command
$ screen -X -S screen_session_id quit\n
"},{"location":"best-practice/screen-tmux/#multiple-windows-in-a-screen-session","title":"Multiple windows in a screen session","text":"It is possible to have multiple windows in a screen session. So suppose you are logged into a screen session, these are the relevant shortcuts
new win: Ctrl+a c\nnext/previous win: Ctrl+a n/p\n
To terminate a window just enter
$ exit\n
"},{"location":"best-practice/screen-tmux/#configuration-file","title":"Configuration file","text":"Here is a sensible screen configuration. Save it as ~/.screenrc
.
screenrc
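If you do not have the linked file at hand, a minimal example configuration might look like this (a sketch, adjust to taste):
# Do not show the startup message\nstartup_message off\n# Use a visual instead of an audible bell\nvbell on\n# Keep a larger scrollback buffer per window\ndefscrollback 10000\n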
"},{"location":"best-practice/screen-tmux/#fix-a-broken-screen-session","title":"Fix a broken screen session","text":"In case your screen session doesn't write to the terminal correctly, i.e. the formatting of the output is broken, you can fix it by typing to the terminal:
$ tput smam\n
"},{"location":"best-practice/software-craftmanship/","title":"General Software Craftmanship","text":"Computer software, or simply software, is a generic term that refers to a collection of data or computer instructions that tell the computer how to work, in contrast to the physical hardware from which the system is built, that actually performs the work. -- Wikipedia: Software
As you will most probably never have contact with the HPC system hardware, everything you interact with on the HPC is software. All of your scripts, your configuration files, programs installed by you or administration, and all of your data.
This should also answer the question of why you should care about software and why you should try to create and use software of at least minimal quality.
Software craftsmanship is an approach to software development that emphasizes the coding skills of the software developers themselves. -- Wikipedia: Software Craftsmanship
This Wiki page is not meant to give you an introduction to creating good software but rather collects a (growing) list of easy-to-use and high-impact points to improve software quality. Also, it provides pointers to resources elsewhere on the internet.
"},{"location":"best-practice/software-craftmanship/#use-version-control","title":"Use Version Control","text":"Use a version control system for your configuration and your code. Full stop. Modern version control systems are Git and Subversion.
- Official Git Documentation
- Github Help
- Fix Common Git Problems
"},{"location":"best-practice/software-craftmanship/#do-not-share-gitsvn-checkouts-for-multiple-users","title":"Do not Share Git/SVN Checkouts for Multiple Users","text":"Every user should have their own Git/Subversion checkout. Otherwise you are inviting a large number of problems.
"},{"location":"best-practice/software-craftmanship/#document-your-code","title":"Document Your Code","text":"This includes
- programmer-level documentation in your source code, both inline and per code unit (e.g., function/class)
- top-level documentation, e.g., in README files.
"},{"location":"best-practice/software-craftmanship/#document-your-data","title":"Document Your Data","text":"Document where you got things from, how to re-download, etc. E.g., put a README file into each of your data top level directories.
"},{"location":"best-practice/software-craftmanship/#use-checksums","title":"Use Checksums","text":"Use MD5 or other checksums for your data. For example, md5sum
and hashdeep
are useful utilities for computing and checking them:
md5sum
How-To (tools such as sha256sum
work the same...) hashdeep
How-To
"},{"location":"best-practice/software-craftmanship/#use-a-workflow-management-system","title":"Use a Workflow Management System","text":"Use some system for managing your workflows. These systems support you by
- Detect failures and don't continue working with broken data,
- continue where you left off when someting breaks,
- make things more reproducible,
- allow distribution of jobs on the cluster.
Snakemake is a popular workflow management system widely used in Bioinformatics. A minimal approach is using Makefiles.
"},{"location":"best-practice/software-craftmanship/#understand-bash-and-shell-exit-codes","title":"Understand Bash and Shell Exit Codes","text":"If you don't want to use a workflow management system, e.g., for one-step jobs, you should at least understand Bash job management and exit codes. For example, you can use if/then/fi
in Bash together with exit codes to:
- Only call a command if the previous command succeeded.
- Remove incomplete output files in case of errors.
if [[ ! -e file.md5 ]]; then\n md5sum file >file.md5 \\\n || rm -f file.md5\nfi\n
Also, learn about the inofficial Bash strict mode.
"},{"location":"best-practice/software-installation-with-conda/","title":"Software Installation with Conda","text":""},{"location":"best-practice/software-installation-with-conda/#conda","title":"Conda","text":"Users do not have the rights to install system packages on the BIH HPC cluster. For the management of bioinformatics software we therefore recommend using the conda package manager. Conda provides software in different \u201cchannels\u201d and one of those channels contains a huge selection of bioinformatics software (bioconda). Generally packages are pre-compiled and conda just downloads the binaries from the conda servers.
You are in charge of managing your own software stack, but conda makes it easy to do so. We will provide you with a description on how to install conda and how to use it. Of course there are many online resources that you can also use. Please find a list at the end of the document.
Warning
Following a change in their terms of service Anaconda Inc. has started to demand payment from research institutions for using both Anaconda, Miniconda, and the defaults channel. As a consequence, usage of this software is prohibited and we're recommending the alternative free \"miniforge\" distribution instead.
"},{"location":"best-practice/software-installation-with-conda/#premise","title":"Premise","text":"When you logged into the cluster, please make sure that you also executed srun
to log into a computation node and perform the software installation there.
"},{"location":"best-practice/software-installation-with-conda/#installing-conda","title":"Installing conda","text":"hpc-login-1:~$ srun --mem=5G --pty bash -i\nhpc-cpu-123:~$ wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh\nhpc-cpu-123:~$ bash Miniforge3-Linux-x86_64.sh -b -f -p $HOME/work/miniforge\nhpc-cpu-123:~$ eval \"$(/$HOME/work/miniforge/bin/conda shell.bash hook)\"\nhpc-cpu-123:~$ conda init\nhpc-cpu-123:~$ conda config --set auto_activate_base false\n
This will install conda to $HOME/work/miniforge
. You can change the path to your liking, but please note that your $HOME
folder has limited space. The work
subfolder however has a bigger quota. More about this here.
To make bioinformatics software available, we have to add the bioconda
channel to the conda configuration:
hpc-cpu-123:~$ conda config --add channels bioconda\n
"},{"location":"best-practice/software-installation-with-conda/#installing-software-with-conda","title":"Installing software with conda","text":"Installing packages with conda is straight forward:
hpc-cpu-123:~$ conda install <package>\n
This will install a package into the conda base environment. We will explain environments in detail in the next section. To search for a package, e.g. to find the correct name in conda or if it exists at all, issue the command:
hpc-cpu-123:~$ conda search <string>\n
To choose a specific version (conda will install the latest version that is compatible with the current installed Python version), you can provide the version as follows:
hpc-cpu-123:~$ conda install <package>=<version>\n
Please note that new conda installs may ship with a recently update Python version and not all packages might have been adapted. E.g., if you find out that some packages don't work after starting out/upgrading to Python 3.8, simply try to downgrade Python to 3.7 with conda install python=3.7
.
Hint
As resolving the dependency tree of an installation candidate can take a lot of time in Conda, especially when you are installing software from an environment.yaml
file, an alternative resolver has been presented that you can use to install software into your Conda environment. The time savings are immense and an installation that took more than an hour can be resolved in seconds.
Simply run
hpc-cpu-123:~$ conda install mamba\n
With that, you can install software into your environment using the same syntax as for Conda:
hpc-cpu-123:~$ mamba install <package>\n
"},{"location":"best-practice/software-installation-with-conda/#creating-an-environment","title":"Creating an environment","text":"Conda lets you create environments, such that you can test things in a different environment or group your software. Another common use case is to have different environments for the different Python versions. Since conda is Python-based, conflicting packages will mostly struggle with the Python version.
By default, conda will install packages into its root environment. Please note that software that does not depend on Python and is installed in the root environment, is is available in all other environments.
To create a Python 2.7 environment and activate it, issue the following commands:
hpc-cpu-123:~$ conda create -n py27 python=2.7\nhpc-cpu-123:~$ source activate py27\n(py27) hpc-cpu-123:~$\n
From now on, conda will install packages into the py27
environment when you issue the install
command. To switch back to the root environment, simply deactivate the py27
environment:
(py27) hpc-cpu-123:~$ source deactivate py27\nhpc-cpu-123:~$\n
But of course, as Python 2.7 is not supported any more by the Python Software Foundation, you should switch over to Python 3 already!
"},{"location":"best-practice/temp-files/","title":"Temporary Files","text":"Temporary Files and Slurm
See Slurm: Temporary Files for information how Slurm controls access to local temporary storage.
Often, it is necessary to use temporary files, i.e., write something out in the middle of your program, read it in again later, and then discard these files. For example, samtools sort
has to write out chunks of sorted read alignments for allowing to sort files larger than main memory.
"},{"location":"best-practice/temp-files/#environment-variable-tmpdir","title":"Environment Variable TMPDIR
","text":"Traditionally, in Unix, the environment variables TMPDIR
is used for storing the location of the temporary directory. When undefined, usually /tmp
is used.
"},{"location":"best-practice/temp-files/#temporary-directories-on-the-bih-cluster","title":"Temporary Directories on the BIH Cluster","text":"Generally, there are two locations where you could put temporary files:
/data/cephfs-1/home/users/$USER/scratch/tmp
-- inside your scratch folder on the CephFS file system; this location is available from all cluster nodes /tmp
-- on the local node's temporary folder; this location is only available on the node itself. The slurm scheduler uses Linux namespaces such that every job gets its private /tmp
even when run on the same node.
"},{"location":"best-practice/temp-files/#best-practice-use-scratchtmp","title":"Best Practice: Use scratch/tmp
","text":"Use CephFS-based TMPDIR
Generally setup your environment to use /data/cephfs-1/home/users/$USER/scratch/tmp
as filling the local disk of a node with forgotten files can cause a lot of problems.
Ideally, you append the following to your ~/.bashrc
to use /data/cephfs-1/home/users/$USER/scratch/tmp
as the temporary directory. This will also create the directory if it does not exist. Further, it will create one directory per host name which prevents too many entries in the temporary directory.
export TMPDIR=$HOME/scratch/tmp/$(hostname)\nmkdir -p $TMPDIR\n
Prepending this to your job scripts is also recommended as it will ensure that the temporary directory exists.
"},{"location":"best-practice/temp-files/#tmpdir-and-the-scheduler","title":"TMPDIR
and the scheduler","text":"In the older nodes, the local disk is a relatively slow spinning disk, in the newer nodes, the local disk is a relatively fast SSD. Further, the local disk is independent from the CephFS file system, so I/O volume to it does not affect the network or any other job on other nodes. Please note that by default, Slurm will not change your environment variables. This includes the environment variable TMPDIR
.
Slurm will automatically update temporary files in a job's /tmp
on the local file system when the job terminates. To automatically clean up temporary directories on the shared file system, use the following tip.
"},{"location":"best-practice/temp-files/#use-bash-traps","title":"Use Bash Traps","text":"You can use the following code at the top of your job script to set TMPDIR
to the location in your home directory and get the directory automatically cleaned when the job is done (regardless of successful or erroneous completion):
# First, point TMPDIR to the scratch in your home as mktemp will use thi\nexport TMPDIR=$HOME/scratch/tmp\n# Second, create another unique temporary directory within this directory\nexport TMPDIR=$(mktemp -d)\n# Finally, setup the cleanup trap\ntrap \"rm -rf $TMPDIR\" EXIT\n
"},{"location":"connecting/connecting-windows/","title":"Connecting via SSH on Windows","text":""},{"location":"connecting/connecting-windows/#install-ssh-client-for-windows","title":"Install SSH Client for Windows","text":"We recommend to use the program MobaXterm on Windows. MobaXterm is a software that allows you to connect to an SSH server, much like PuTTy, but also maintains your SSH key.
Alternative SSH Clients for Windows
- Another popular option is PuTTy but many users have problems configuring it correctly with SSH keys.
- On Windows 10, you can also install Windows Subsystem for Linux, e.g., together with WSL Terminal. This is not for the faint of heart (but great if you're a Unix head).
- Navigate to https://mobaxterm.mobatek.net/download-home-edition.html
- Download either the
- Portable edition (blue button lefthand-side, if you have no admin rights, e.g. on a Charite or MDC workstation), or
- Installer edition (green button righthand-side, requires admin rights on your computer).
- Install or unpack MobaXterm and start the software. As a Charite user, please cancel any firewall warnings that pop up.
"},{"location":"connecting/connecting-windows/#software-for-transfering-data-fromto-windows","title":"Software for transfering data from/to Windows","text":"For transfering data from/to Windows, we recommand using WinSCP. Install the latest version from here: https://winscp.net/eng/download.php
On the Login
screen of WinSCP create a new login by selecting New Site
.
Fill in the following parameters:
File protocol
: SFTP
Host name
: hpc-transfer-1.cubi.bihealth.org
or hpc-transfer-2.cubi.bihealth.org
User name
: your user name
Go to Advanced
> SSH
> Authentication
> Authentication parameters
> Private key file
and select your private ssh key file (in .ppk
format).
Press Ok
then Save
.
Press Login
to connect. It will ask for your private key passphrase, if you set one up.
If you need to convert your private ssh key file the .ppk
format, on the WinSCP login screen go to Tools
> PuTTYgen
and follow the steps here: https://docs.acquia.com/cloud-platform/manage/ssh/sftp-key/
"},{"location":"connecting/connecting-windows/#connecting-from-within-mdccharite-network","title":"Connecting from within MDC/Charite Network","text":"Click on Session
.
Click on SSH
.
In Basic SSH settings, enter a hostname (hpc-login-X.cubi.bihealth.org
, where X
is 1 or 2), check Specify username and enter your username in the textfield. Select the tab Advanced SSH settings, check Use private key and select your private SSH key file (possible choices described with the next to figures).
Select the id_rsa
file generated in Linux OR
select the id_rsa.ppk
file generated in Windows with MobaXterm.
Afterwards hit the OK button and MobaXterm will connect.
The session will be stored automatically and you can establish new connections later on, or also multiple ones at the same time, if you like.
"},{"location":"connecting/connecting/","title":"Connecting to HPC 4 Research","text":"HPC 4 Research is only available via the Charit\u00e9, MDC, and BIH internal networks. VPN access requires additional measures which are described in Connecting from External Networks.
There are two primary methods for interacting with BIH HPC:
- Through the \u201cOndemand\u201d web portal.
- Via SSH and Slurm.
This part of the documentation only described direct console access via SSH. For information regarding the web portal, please read OnDemand Portal. In case you're not familiar with SSH, you should probably start via the web portal or (if you are determined to learn) read through our SSH basics page.
"},{"location":"connecting/connecting/#in-brief","title":"In brief","text":"Follow these steps to connect to BIH HPC via the command line:
- Register an account via your PI.
- Generate a SSH key pair in Linux or Windows
- Submit your public key to Charite or to MDC.
-
Connect to one of the two login nodes.
# Charite Users\n$ ssh user_c@hpc-login-1.cubi.bihealth.org\n$ ssh user_c@hpc-login-2.cubi.bihealth.org\n\n# MDC Users\n$ ssh user_m@hpc-login-1.cubi.bihealth.org\n$ ssh user_m@hpc-login-2.cubi.bihealth.org\n
Hint
There are two login nodes, hpc-login-1
and hpc-login-2
. There are two for redundancy reasons. Please do not perform big file transfers or an sshfs
mount via the login nodes. For this purpose, we have hpc-transfer-1
and hpc-transfer-2
.
Please also read Advanced SSH for more custom scenarios how to connect to BIH HPC. If you are using a Windows PC to access BIH HPC, please read Connecting via SSH on Windows
-
Allocate resources on a computation node using Slurm. Do not compute on the login node!
# Start interactive shell on computation node\n$ srun --pty bash -i\n
-
Bonus: Configure your SSH client on Linux and Mac or Windows.
- Bonus: Connect from external networks .
tl;dr
- Web Access: https://hpc-portal.cubi.bihealth.org
-
SSH-Based Access:
# Interactive login (choose one)\nssh username@hpc-login-1.cubi.bihealth.org\nssh username@hpc-login-2.cubi.bihealth.org\nsrun --pty bash -i\n\n# File Transfer (choose one)\nsftp local/file username@hpc-transfer-1.cubi.bihealth.org:remote/file\nsftp username@hpc-transfer-2.cubi.bihealth.org:remote/file local/file\n\n# Interactive login into the transfer nodes (choose one)\nssh username@hpc-transfer-1.cubi.bihealth.org\nssh username@hpc-transfer-2.cubi.bihealth.org\n
"},{"location":"connecting/connecting/#what-is-my-username","title":"What is my username?","text":"Your username for accessing the cluster are composed of your username at your primary organization (Charit\u00e9/MDC) and a suffix:
- Charite user:
<Charite username>_c -> doej_c
- MDC user:
<MDC username>_m -> jdoe_m
"},{"location":"connecting/connecting/#how-can-i-connect-from-the-outside","title":"How can I connect from the outside?","text":"Please read Connecting from External Networks
"},{"location":"connecting/connecting/#i-have-problems-connecting","title":"I have problems connecting","text":"Please read Debugging Connection Problems
"},{"location":"connecting/connection-problems/","title":"Debugging Connection Problems","text":"When you encounter problems with the login to the cluster although we indicated that you should have access, depending on the issue, here is a list of how to solve the problem:
"},{"location":"connecting/connection-problems/#im-getting-a-connection-refused","title":"I'm getting a \"connection refused\"","text":"The full error message looks as follows:
ssh: connect to host hpc-login-1.cubi.bihealth.org port 22: Connection refused\n
This means that your computer could not open a network connection to the server.
- HPC 4 Research can be connected to from:
- Charite (cabled) network
- Charite VPN but only with Zusatzantrag B.
- MDC (cabled) network
- MDC VPN
- BIH (cabled) network
- If you think that there is no problem with any of this then please include the output of the following command in your ticket (use the server that you want to read instead of
<DEST>
): - Linux/Mac
ifconfig\ntraceroute <DEST>\n
- Windows
ipconfig\ntracepath <DEST>\n
"},{"location":"connecting/connection-problems/#i-can-connect-but-it-seems-that-my-account-has-no-access-yet","title":"I can connect, but it seems that my account has no access yet","text":"You're logging into BIH HPC cluster! (login-1)\n\n ***Your account has not been granted cluster access yet.***\n\n If you think that you should have access, please contact\n hpc-helpdesk@bih-charite.de for assistance.\n\n For applying for cluster access, contact hpc-helpdesk@bih-charite.de.\n\nuser@login-1's password:\n
Hint
This is the most common error, and the main cause for this is a wrong username. Please take a couple of minutes to read the What is my username?!
If you encounter this message although we told you that you have access and you checked the username as mentioned above, please write to hpc-helpdesk@bih-charite.de, always indicating the message you get and a detailed description of what you did.
"},{"location":"connecting/connection-problems/#im-getting-a-passphrase-prompt","title":"I'm getting a passPHRASE prompt","text":"You're logging into BIH HPC cluster! (login-1)\n\n *** It looks like your account has access. ***\n\n Login is based on **SSH keys only**, if you are getting a password prompt\n then please contact hpc-helpdesk@bih-charite.de for assistance.\n\nEnter passphrase for key '/home/USER/.ssh/id_rsa':\n
Here you have to enter the passphrase that was used for encrypting your private key. Read SSH Basics for further information of what is going on here.
"},{"location":"connecting/connection-problems/#i-can-connect-but-i-get-a-password-prompt","title":"I can connect, but I get a passWORD prompt","text":"You're logging into BIH HPC cluster! (login-1)\n\n *** It looks like your account has access. ***\n\n Login is based on **SSH keys only**, if you are getting a password prompt\n then please contact hpc-helpdesk@bih-charite.de for assistance.\n\nuser@login-1's password:\n
This is diffeerent from passPHRASE prompt
Please see I'm getting a passPHRASE prompt for more information.
When you encounter this message during a login attempt, there is an issue with your SSH key. In this case, please connect with increased verbosity to the cluster (ssh -vvv ...
) and mail the output and a detailed description to hpc-helpdesk@bih-charite.de.
"},{"location":"connecting/from-external/","title":"Connecting from External Networks","text":"This page describes how to connect to the BIH HPC from external networks (e.g., another university or from your home). The options differ depending on your home organization and are described in detail below.
- MDC users can use
- the MDC SSH gateway/hop node, or
- MDC VPN.
- Charite users can use
- the Charite VPN with \"VPN Zusatzantrag B\".
Getting Help with VPN and Gateway Nodes
Please note that the VPNs and gateway nodes are maintained by the central IT departments of Charite/MDC. BIH HPC IT cannot assist you in problems with these serves. Authorative information and documentation is provided by the central IT departments as well.
SSH Key Gotchas
You should use separate SSH key pairs for your workstation, laptop, home computer etc. As a reminder, you will have to register the SSH keys with your home IT organization (MDC or Charite). When using gateway nodes, please make sure to use SSH key agents and agent forwarding (ssh
flag \"-A
\").
"},{"location":"connecting/from-external/#mdc-users","title":"MDC Users","text":""},{"location":"connecting/from-external/#via-gateway-node","title":"Via Gateway Node","text":"Use the following command to perform a proxy jump via the MDC SSH gateway (ssh1
aka jail1
) when connecting to a login node. Note that for logging into the jail, the <MDC_USER>
is required.
$ ssh -J <MDC_USER>@ssh1.mdc-berlin.de <HPC_USER>@hpc-login-1.cubi.bihealth.org\n
Note
Please Note that the cluster login is independent of access to the MDC jail node ssh1.mdc-berlin.de.
- Access to the cluster is granted by BIH HPC IT through hpc-helpdesk@bih-charite.de.
- Access to the MDC jail node is managed by MDC IT.
"},{"location":"connecting/from-external/#via-mdc-vpn","title":"Via MDC VPN","text":"You can find the instructions for getting MDC VPN access here in the MDC intranet below the \"VPN\" heading. Please contact helpdesk@mdc-berlin.de for getting VPN access.
Install the VPN client and then start it. Once VPN has been activated you can SSH to the HPC just as from your workstation.
$ ssh user_m@hpc-login-1.cubi.bihealth.org\n
"},{"location":"connecting/from-external/#charite-users","title":"Charit\u00e9 Users","text":"Access to BIH HPC from external networks (including Eduroam) requires a Charit\u00e9 VPN connection with special access permissions.
"},{"location":"connecting/from-external/#general-charite-vpn-access","title":"General Charit\u00e9 VPN Access","text":"You need to apply for general Charit\u00e9 VPN access if you haven't done so already. The form can be found in the Charite Intranet and contains further instructions. Charit\u00e9 IT Helpdesk can help you with any questions.
"},{"location":"connecting/from-external/#zusatzantrag-b","title":"Zusatzantrag B","text":"Special permissions form B is also required for HPC access. You can find Zusatzantrag B in the Charit\u00e9 intranet. Fill it out and send it to the same address as the general VPN access form above.
Once you have been granted VPN access, start the client and connect to VPN. You will then be able to connect from your client in the VPN just as you do from your workstation.
$ ssh jdoe_c@hpc-login-1.cubi.bihealth.org\n
"},{"location":"connecting/from-external/#charite-vdi-not-recommended","title":"Charit\u00e9 VDI (Not recommended)","text":"Alternative to using Zusatzantrag B, you can also get access to the Charit\u00e9 VDI (Virtual Desktop Infrastructure). Here, you connect to a virtual desktop computer which is in the Charit\u00e9 network. From there, you can connect to the BIH HPC system.
You need to apply for extended VPN access to be able to access the BIH VDI. The form can be found here. It is important to tick Dienst(e), enter HTTPS and as target view.bihealth.org
. Please write to helpdesk@charite.de with the request to access the BIH VDI.
When the access has been set up, follow the instructions on client configuration for Windows, after logging in to the BIH VDI.
"},{"location":"connecting/ssh-basics/","title":"SSH Basics","text":""},{"location":"connecting/ssh-basics/#what-is-ssh","title":"What is SSH?","text":"SSH stands for S ecure Sh ell. It is a software that allows to establish a user-connection to a remote UNIX/Linux machine over the network and remote-control it from your local work-station.
Let's say you have an HPC cluster with hundreds of machines somewhere in a remote data-center and you want to connect to those machines to issue commands and run jobs. Then you would use SSH.
"},{"location":"connecting/ssh-basics/#getting-started","title":"Getting Started","text":""},{"location":"connecting/ssh-basics/#installation","title":"Installation","text":"Simply install your distributions openssh-client
package. You should be able to find plenty of good tutorials online. On Windows you can consider using MobaXterm (recommended) or Putty.
"},{"location":"connecting/ssh-basics/#connecting","title":"Connecting","text":"Let's call your local machine the client and the remote machine you want to connect to the server.
You will usually have some kind of connection information, like a hostname, IP address and perhaps a port number. Additionally, you should also have received your user-account information stating your user-name, your password, etc.
Follow the instructions below to establish a remote terminal-session.
If your are on Linux
Open a terminal and issue the following command while replacing all the <...>
fields with the actual data:
# default port\nssh <username>@<hostname-or-ip-address>\n\n# non-default port\nssh <username>@<hostname-or-ip-address> -p <port-number>\n
If you are on windows
Start putty.exe
, go into the Session
category and fill out the form, then click the Connect
button. Putty also allows to save the connection information in different profiles so you don't have to memorize and retype all fields every time you want to connect.
"},{"location":"connecting/ssh-basics/#ssh-keys","title":"SSH-Keys","text":"When you connect to a remote machine via SSH, you will be prompted for your password. This will happen every single time you connect and can feel a bit repetitive at times, especially if you feel that your password is hard to memorize. For those who don't want to type in their password every single time they connect, SSH keys are an alternative way of authentication.
Instead if being prompted for a password, SSH will simply use the key to authenticate. As this key file should be device specific, this also increases security of the login process.
You can generate a new key by issuing:
client:~$ ssh-keygen -t ed25519\n\n# 1. Choose file in which to save the key *(leave blank for default)*\n# 2. Choose a passphrase of at least five characters\n
"},{"location":"connecting/ssh-basics/#how-do-ssh-keys-work","title":"How do SSH-Keys work?","text":"An SSH key consists of two files, one private and one public key. The public key is installed on remote machines and can only be validated with the matching private key, which is stored on client computers. During the login process this is achieved via public-key cryptography.
Traditionally the algorithm used for this was RSA. Recently elliptic curve cryptography has been developed as a more secure and more performant alternative. We recommend the ed25519
type of SSH key.
"},{"location":"connecting/ssh-basics/#passphrase","title":"Passphrase","text":"The security problem with SSH keys is that anyone with access to the private key has full access to all machines that have the public key installed. Loosing the key or getting it compromised in another way imposes a serious security threat. Therefore, it is best to secure the private key with a passphrase. This passphrase is needed to unlock and use the private key.
Once you have your key-pair generated, you can easily change the passphrase of that key by issuing:
client:~$ ssh-keygen -p\n
"},{"location":"connecting/ssh-basics/#ssh-agent","title":"SSH-Agent","text":"In order to avoid having to type the passphrase of the key every time we want to use it, the key can be loaded into an SSH-Agent.
For instance, if you have connected to a login-node via Putty and want to unlock your private key in order to be able to access cluster nodes, you cant configure the SSH-Agent.
client:~$ source <(ssh-agent)\n
(The above command will load the required environment variables of the SSH-Agent into your shell environment, effectively making the agent available for your consumption.)
Next, you can load your private key:
client:~$ ssh-add\n
(You will be prompted for the passphrase of the key)
You can verify that the agent is running and your key is loaded by issuing:
client:~$ ssh-add -l\n# 'l' as in list-all-loaded-keys\n
(The command should print at least one key, showing the key-size, the hash of the key-fingerprint and the location of the file in the file-system.)
Since all home-directories are shared across the entire cluster and you created your key-pair inside your home-directory, you public-key (which is also in your home-directory) is automatically installed on all other cluster nodes, immediately. Try connecting to any cluster node. It should not prompt your for a password.
There is nothing you have to do to \"unload\" or \"lock\" the key-file. Simply disconnect.
"},{"location":"connecting/advanced-ssh/linux/","title":"Connecting via SSH on Unix","text":""},{"location":"connecting/advanced-ssh/linux/#activating-your-key-in-the-ssh-key-agent","title":"Activating your Key in the SSH Key Agent","text":"Note
The big Linux distributions automatically manage ssh-agent for you and unlock your keys at login time. If this doesn't work for you, read on.
ssh-agent
caches your SSH keys so that you do not need to type your passphrase every time it is used. Activate it by making sure ssh-agent
runs in the background and add your key:
$ eval \"$(ssh-agent -s)\"\n$ ssh-add\n
or if you chose a custom key name, specify the file like so:
$ ssh-add ~/.ssh/mdc_id_rsa\n
"},{"location":"connecting/advanced-ssh/linux/#macos","title":"MacOS","text":"If you run into problems that your key is not accepted when connecting from MacOS, please use:
$ ssh-add --apple-use-keychain\n
"},{"location":"connecting/advanced-ssh/linux/#configure-ssh-client","title":"Configure SSH Client","text":"You can define a personal SSH configuration file to make connecting to the cluster more comfortable by reducing the typing necessary by a lot. Add the following lines to the file ~/.ssh/config
file. Replace USER_NAME
with your cluster user name. You can also adapt the Host naming as you like.
Host bihcluster\n HostName hpc-login-1.cubi.bihealth.org\n User USER_NAME\n\nHost bihcluster2\n HostName hpc-login-1.cubi.bihealth.org\n User USER_NAME\n
Now, you can do type the following (and you don't have to remember the host name of the login node any more).
$ ssh bihcluster\n
This configuration works if you are inside Charit\u00e9, the Charit\u00e9 VPN, or MDC.
"},{"location":"connecting/advanced-ssh/linux/#mdc-users-jail-node","title":"MDC users: Jail node","text":"If you have an MDC user account and want to connect from the outside, you can use the following ~/.ssh/config
lines to set up a ProxyJump via the MDC SSH jail.
Host mdcjail\n HostName ssh1.mdc-berlin.de\n User MDC_USER_NAME\n
Now you can run
$ ssh -J mdcjail bihcluster1\n
If you are always connecting from outside the internal network, you can also add a permanent ProxyJump to the SSH configuration like so:
Host bihcluster\n HostName hpc-login-1.cubi.bihealth.org\n User USER_NAME\n ProxyJump mdcjail\n
"},{"location":"connecting/advanced-ssh/linux/#connecting-with-another-computerlaptop","title":"Connecting with another computer/laptop","text":"If you need to connect to the cluster from another computer than the one that contains the SSH keys that you submitted for the cluster login, you have two possibilities.
- Generate another SSH key pair and submit the public part as described beforehand.
- Copy your private part of the SSH key (
~/.ssh/id_rsa
) to the second computer into the same location.
Danger
Do not leave the key on any USB stick. Delete it after file transfer. This is a sensible part of data. Make sure that the files are only readable for you.
$ cd ~/.ssh\n$ chmod g-rwx id_rsa*\n$ ssh-add id_rsa\n
"},{"location":"connecting/advanced-ssh/linux/#file-system-mount-via-sshfs","title":"File System mount via sshfs","text":"$ sshfs <USERNAME>@hpc-transfer-1.cubi.bihealth.org:/ <MOUNTPOINT>\n
hpc-transfer-1:
follows the structure <host>:<directory>
starting in the user home. <MOUNTPOINT>
must be an empty but existing and readable directory on your local computer
"},{"location":"connecting/advanced-ssh/linux/#macos_1","title":"MacOS","text":"Make sure you have both OSXFUSE and SSHFS installed. You can get both from here: https://osxfuse.github.io/ or the most recent version via Homebrew:
$ brew cask install osxfuse; brew install sshfs; brew link --overwrite sshfs\n
The last command is optional and unlinks any pre-existing links to older versions of sshfs. Now you can run $ sshfs -o follow_symlinks <USERNAME>@hpc-transfer-1<X>.cubi.bihealth.org:<directory_relative_to_Cluster_root> <MOUNTPOINT> -o volname=<BIH-FOLDER> -o allow_other,noapplexattr,noappledouble\n
"},{"location":"connecting/advanced-ssh/linux/#x11","title":"X11","text":"Do you really need to run a graphical application on the cluster?
Please note that running more complex Java applications, such as IGV may be not very efficient because of the connection speed. In most cases you can run them on your local workstation by mounting them via SSHFS.
Connect to one of the login nodes using X11 forwarding:
$ ssh -X -C -t <USERNAME>@hpc-login-1.bihealth.org\n
Once you get a login prompt, you can use the srun
command with the --x11
parameter to open a X11 session to a cluster node:
$ srun --pty --x11 bash\n
And finally you can start your X11 application, e.g.:
$ xterm\n
After a while Visual Terminal should start:
"},{"location":"connecting/advanced-ssh/overview/","title":"Advanced SSH usage","text":"Here we describe custom scenarios for using SSH to connect to BIH HPC. To keep it consise, this section is divided into separate documents for
- Linux and
- Windows users.
"},{"location":"connecting/advanced-ssh/windows/","title":"Windows","text":""},{"location":"connecting/advanced-ssh/windows/#mounting-the-fs-from-within-the-mdccharite-network","title":"Mounting the FS from within the MDC/Charite Network","text":"Danger
Mounting ssh on Windows is currently discouraged since relevant software is outdated (see also hpc-talk). Also, in most cases it is not really necessary to have a constant mount. For normal data transfer please use WinSCP instead.
Once WinSshFS is started, an icon will be added to your taskbar:
Left-clicking that icon will bring up a window. If not, right click the taskbar icon, select Show Manager
and click Add
in the menu.
Fill out the marked fields:
- Drive Name: Name that will show up in the windows explorer
- Host:
hpc-transfer-1.cubi.bihealth.org
- Username: Your cluster username
- Authentication method:
PrivateKey
. Select the id_rsa
private key, not the .ppk
format that is provided by PuTTY. Enter the password that you used to secure your key with. - Directory: Cluster directory that will be mounted, you can choose any directory you have access to on the cluster.
Then click Save
and then Mount
.
Open the explorer. A new drive with the name you gave should show up:
Finished!
"},{"location":"connecting/advanced-ssh/windows/#connecting-via-mdc-jail-node","title":"Connecting via MDC Jail Node","text":" -
This requires an active MDC account!
-
Additional to the steps above, click on the tab Network settings
.
- Check Connect through SSH gateway (jump host) and in the text field Gateway SSH server enter
ssh1.mdc-berlin.de
and in the field User your MDC username. - Check Use private key and select the SSH key that you uploaded to the MDC persdb (this might differ from your cluster key!).
- Click OK
"},{"location":"connecting/advanced-ssh/windows/#x11","title":"X11","text":"Do you really need to run a graphical application on the cluster?
Please note that running more complex Java applications, such as IGV may be not very efficient because of the connection speed. In most cases you can run them on your local workstation by mounting them via SSHFS.
Start MobaXterm, it should automatically fetch your saved Putty sessions as you can see on screen below:
Connect to one of the login nodes, by double-click on saved profile, and then use srun --pty --x11 bash
command to start X11 session to one of the nodes:
Finally, start X11 application (below example of starting Visual Terminal):
"},{"location":"connecting/generate-key/linux/","title":"Generating an SSH Key in Linux","text":" - You might already have one, check whether a file
~/.ssh/id_xxx.pub
is present. - Otherwise, create key using the following command (marking your key with your email address will make it easier to reidentify your key later on):
$ ssh-keygen -t ed25519 -C \"your_email@example.com\"\n
- Use the default location for your key
- Enter a passphrase twice to encrypt your key
What is a key passphrase?
You should set a passphrase when generating your key pair. It is used for encrypting your private key in case it is stolen or lost. When using the key for login, you will have to enter the passphrase. Many desktop environments offer ways to automatically unlock your key on login.
Read SSH Basics for more information.
The whole session should look something like this:
host:~$ ssh-keygen -t ed25519 -C \"your_email@example.com\"\nGenerating public/private ed25519 key pair.\nEnter file in which to save the key (/home/USER/.ssh/id_ed25519): \nCreated directory '/home/USER/.ssh'.\nEnter passphrase (empty for no passphrase):\nEnter same passphrase again: \nYour identification has been saved in /home/USER/.ssh/id_ed25519.\nYour public key has been saved in /home/USER/.ssh/id_ed25519.pub.\nThe key fingerprint is:\nSHA256:Z6InW1OYt3loU7z14Kmgy87iIuYNr1gJAN1tG71D7Jc your_email@example.com\nThe key's randomart image is:\n+--[ED25519 256]--+\n|.. . . o |\n|. . . + + |\n|. . = . . |\n|. . +oE. |\n|. So= o o |\n| . . . * = + + |\n| + o + B o o .|\n| oo+. .B + + . |\n|.ooooooo*. . |\n+----[SHA256]-----+\n
The file content of ~/.ssh/id_ed25519.pub
should look something like this):
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFzuiaSVD2j5y6RlFxOfREB/Vbd+47ABlxF7du5160ZH your_email@example.com\n
"},{"location":"connecting/generate-key/linux/#submit-your-key","title":"Submit Your Key","text":"As a next step you need to submit the SSH key use these links as:
- Charite user
- MDC user
"},{"location":"connecting/generate-key/windows/","title":"Generating an SSH Key in Windows","text":"Prerequisite: Installing an SSH Client
Please install an SSH client for Windows first.
"},{"location":"connecting/generate-key/windows/#generate-the-key","title":"Generate the Key","text":"Click on Tools
and MobaKeyGen (SSH key generator)
In the section Parameters make sure to set the following properties:
- Type of key to generate:
RSA
(this is the SSH-2
protocol) - Number of bits in a generated key:
4096
If all is set, hit the Generate button.
During generation, move the mouse cursor around in the blank area.
When finished, make sure to protect your generated key with a passphrase. Save the private and public key. The default name under Linux for the public key is id_rsa.pub
and id_rsa
for the private key, but you can name them however you want (the .pub
is NOT automatically added). Note that in the whole cluster wiki we will use this file naming convention. Also note that the private key will be stored in Putty format (.ppk
, this extension is added automatically).
What is your key's passphrase?
You should set a passphrase when generating your private key. This passphrase is used for encrypting you private key to protect it against the private key file theft/being lost. When using the key for login, you will have to enter it (or the first time you load it into the SSH key agent). Note that when being asked for the passphrase this does not occur on the cluster (and is thus unrelated to it) but on your local computer.
Also see SSH Basics for more information.
The gibberish in the textbox is your public key in the format how it has to be submitted to the MDC and Charite (links for this step below). Thus, copy this text and paste it to the SSH-key-submission-web-service of your institution.
Store the private key additionally in the OpenSSH format. To do so, click Conversions
and select Export OpenSSH key
. To be consistent, give the file the same name as your .ppk
private key file above (just without the .ppk
).
"},{"location":"connecting/generate-key/windows/#summary","title":"Summary","text":"To summarize, you should end up with three files:
id_rsa.pub
The public key file, it is not required if you copy and submit the SSH public key as described above and in the links below. id_rsa.ppk
This file is only needed if you plan to use Putty. id_rsa
This is your private key and the one and only most important file to access the cluster. It will be added to the sessions in MobaXterm and WinSSHFS (if required).
"},{"location":"connecting/generate-key/windows/#submit-your-key","title":"Submit Your Key","text":"As a next step you need to submit the SSH key use these links as:
- Charite user
- MDC user
"},{"location":"connecting/submit-key/charite/","title":"Submitting an SSH Key to Charite","text":"As of February 2020, SSH key submission not accepted via email anymore. Instead, use the process outline here.
For any help, please contact helpdesk@charite.de (as this site is maintained by Charite GB IT).
"},{"location":"connecting/submit-key/charite/#charite-zugangsportal","title":"Charite Zugangsportal","text":"Key are submitted in the Charite Zugangsportal. As of Feb 4, you have to use the \"test\" version for this.
Go to zugang.charite.de and login.
Follow through the login page until you reach the main menu (it's tedious but we belive in you ;) Click the \"SSH Keys\" button.
Paste your SSH key (starting with ssh-rsa
) and ending with the label (usually your email, e.g., john.doe@charite.de
) into the box (1) and press append (2). By default, the key can be found in the file ~/.ssh/id_rsa.pub
in Linux. If you generated the key in Windows, please paste the copied key from the text box. Repeat as necessary. Optionally, go back to the main menu (3) when done.
If you have generated your SSH key with PuTTy, you must right click on the ppk-file, then choose \"Edit with PuTTYgen\" in the right click menu. Enter your passphrase. Then copy the SSH key out of the upper box (already highlighted in blue).
Check if the key has been added
After you clicked append
, your key will be printed back to you (as shown in the blurred picture above).
If your key is not printed back to you then adding the SSH key to zugang.charite.de was not successful. In this case please contact helpdesk@charite.de for assistance as they (Charite GB IT) maintains that system and it is out of our (BIH HPC IT) control.
Once your key has been added, it will take a few minutes for the changes to go live.
"},{"location":"connecting/submit-key/mdc/","title":"Submitting an SSH Key to MDC","text":"For MDC users, SSH keys are submitted through the MDC PersDB interface (see below). PersDB is not maintained by BIH HPC IT but by MDC IT.
Warning
The SSH keys are only activated over night (but automatically). This is out of our control. Contact helpdesk@mdc-berlin.de for more information.
"},{"location":"connecting/submit-key/mdc/#detour-using-mdc-vmware-view-to-get-into-mdc-intranet","title":"Detour: Using MDC VMWare View to get into MDC Intranet","text":"In case you are not within the MDC network, connect to MDC VMWare view first and use the web brower in the Window session.
- Go to the MDC VMWare View
- Click \"VMWare Web Viewer\"
- Login with MDC username and password.
- Select Windows 7.
- Open Firefox or Internet Browser
"},{"location":"connecting/submit-key/mdc/#enter-mdc-persdb","title":"Enter MDC PersDB","text":" - If you are inside MDC network, you can start here, OR
- If you have the MDC VMWare Web Viewer open, start here.
"},{"location":"connecting/submit-key/mdc/#log-into-mdc-persdb","title":"Log into MDC PersDB","text":" - Open https://persdb.mdc-berlin.net/login
- Login with MDC username and password again
"},{"location":"connecting/submit-key/mdc/#click-on-mein-profil","title":"Click on \"Mein Profil\"","text":""},{"location":"connecting/submit-key/mdc/#click-on-zusaetzliches-ssh-public-key-bearbeiten","title":"Click on \"Zusaetzliches (ssh public key) -> Bearbeiten\"","text":" - This is the middle item.
"},{"location":"connecting/submit-key/mdc/#click-neue-zusaetzliche-eigenschaft-anlegen","title":"Click \"Neue zusaetzliche Eigenschaft anlegen\"","text":" - Most probably, you don't have any entries yet.
"},{"location":"connecting/submit-key/mdc/#activate-the-vmware-view-menu","title":"Activate the VMWare View Menu","text":" - This is the only way to get your SSH key into the clipboard of the Windows instance that has MDC PersDB open. :rolleyes:
"},{"location":"connecting/submit-key/mdc/#activate-clipboard-window","title":"Activate Clipboard Window","text":" - Click (middle) clipboard button.
- The clipboard window appears.
- Close the VMWare View window again.
"},{"location":"connecting/submit-key/mdc/#register-ssh-key","title":"Register SSH key","text":" - Paste SSH key from
~/.ssh/id_rsa.pub
into the clipboard window. Ensure that the whole file contents is there (should end with your email address). If you generated the key in Windows, please paste the copied key from the text box. - Left-click the \"Inhalt\" text box to put the cursor there
- Right-click the \"Inhalt\" text box, make the context menu appear, and click \"Einfuegen\"
- Click Submit
"},{"location":"connecting/submit-key/mdc/#youre-done","title":"You're Done","text":"Thus, you will only be able to connect the next day. - Bask in the glory of having completed this process.
"},{"location":"cubit/","title":"Overview","text":"The static data installation can be found at /data/cephfs-1/work/projects/cubit/18.12/static_data
.
The static data directory contains a sub-directory for the genomes, the precomputed index files for several different popular mapping tools and associated annotation (GFF and GTF files) from Ensembl and GENCODE for each of the available genomes. The top-level directory structure is as follows:
static_data/
annotations
app_support
db
exome_panel
exon_list
precomputed
reference
"},{"location":"cubit/annotations/","title":"Annotation Data","text":"The following Ensembl and GENCODE versions corresponding to the indicated reference genomes will be made available on the cluster.
Database Version Reference Genome Ensembl 65 NCBIM37 (Ensembl release corresponding to GENCODE M1) Ensembl 67 NCBIM37 (Ensembl release for sanger mouse genome assembly) Ensembl 68 GRCm38 (Ensembl release for sanger mouse genome assembly) Ensembl 74 GRCh37 (Ensembl release for GENCODE 19) Ensembl 75 GRCh37 (Latest release for GRCh37) Ensembl 79 GRCh38 (Ensembl release for GENCODE 22) Ensembl 80 GRCh38 (Ensembl release corresponding to GENCODE 22) Ensembl 80 GRCm38 (Ensembl release corresponding to GENCODE M1) GENCODE M1 NCBIM37 (No gff3 file) GENCODE M5 GRCm38 GENCODE 19 current for GRCh37 GENCODE 22 current for GRCh38 The annotation files associated with the indicated genomes can be accessed in the following directories:
static_data/annotation\n\u251c\u2500\u2500 ENSEMBL\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 65\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 67\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 68\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCm38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 74\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 75\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 79\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 80\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCm38\n\u2514\u2500\u2500 GENCODE\n \u251c\u2500\u2500 19\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n \u251c\u2500\u2500 22\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n \u251c\u2500\u2500 M1\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n \u2514\u2500\u2500 M5\n \u2514\u2500\u2500 GRCm38\n
"},{"location":"cubit/app-support/","title":"Cubit Static Data: Application Support","text":"The static_data/app_support
directory contains all data files that are shipped with a software package installed in cubit. For blast
this is not complete and more databases can be added upon request.
static_data/app_support\n\u251c\u2500\u2500 blast\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 variable\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 nt\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 refseq_protein\n\u251c\u2500\u2500 Delly\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.6.5\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.6.7\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.7.5\n\u251c\u2500\u2500 GATK_bundle\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2.8\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u251c\u2500\u2500 Jannovar\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.14\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.16\n\u251c\u2500\u2500 kraken\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.10.5-cubi20160426\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 bacvir\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 minikraken_20141208\n\u251c\u2500\u2500 Oncotator\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 v1_ds_Jan262015\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 1000genome_db\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 achilles\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cancer_gene_census\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ccle_by_gene\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ccle_by_gp\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 clinvar\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cosmic\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cosmic_fusion\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 cosmic_tissue\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dbNSFP_ds\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dbsnp\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dna_repair_genes\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 esp6500SI_v2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 esp6500SI_v2_coverage\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 familial\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 gencode_out2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 gencode_xrefseq\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hgnc\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mutsig\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 oreganno\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 override_lists\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ref_hg\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 simple_uniprot\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 so_terms\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 tcgascape\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 tumorscape\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 uniprot_aa_annotation\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 uniprot_aa_xform\n\u2514\u2500\u2500 SnpEff\n \u2514\u2500\u2500 4.1\n \u2514\u2500\u2500 data\n \u251c\u2500\u2500 GRCh37.75\n \u251c\u2500\u2500 GRCh38.79\n \u251c\u2500\u2500 GRCm38.79\n \u251c\u2500\u2500 hg19\n \u251c\u2500\u2500 hg38\n \u2514\u2500\u2500 mm10\n
"},{"location":"cubit/databases/","title":"Databases","text":"The file formats in the static_data/db
folder are mostly .vcf
or .bed
files. We provide the following databases:
Database Version Reference genome COSMIC v72 GRCh37 dbNSFP 2.9 GRCh37/hg19 dbSNP b128 mm9 dbSNP b128 NCBIM37 dbSNP b142 GRCh37 dbSNP b144 GRCh38 dbSNP b147 GRCh37 dbSNP b147 GRCh38 dbSNP b150 GRCh37 dbSNP b150 GRCh38 DGV 2015-07-23 GRCh37 ExAC release0.3 GRCh37/hg19 ExAC release0.3.1 GRCh37/hg19 giab NA12878_HG001/NISTv2.19 GRCh37 goldenpath variable GRCh37 goldenpath variable hg19 goldenpath variable mm9 goldenpath variable NCBIM37 SangerMousegenomesProject REL-1211-SNPs_Indels mm9 SangerMousegenomesProject REL-1211-SNPs_Indels NCBIM37 UK10K cohort REL-2012-06-02 GRCh37 The directory structure is as follows:
static_data/db\n\u251c\u2500\u2500 COSMIC\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 v72\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u251c\u2500\u2500 dbNSFP\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2.9\n\u251c\u2500\u2500 dbSNP\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b128\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b142\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 b144\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 b147\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh38\n\u251c\u2500\u2500 DGV\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2015-07-23\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n\u251c\u2500\u2500 ExAC\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 release0.3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 release0.3.1\n\u251c\u2500\u2500 giab\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NA12878_HG001\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NISTv2.19\n\u251c\u2500\u2500 goldenpath\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 variable\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u251c\u2500\u2500 SangerMouseGenomesProject\n\u2502 \u2514\u2500\u2500 REL-1211-SNPs_Indels\n\u2502 \u251c\u2500\u2500 mm9\n\u2502 \u2514\u2500\u2500 NCBIM37\n\u2514\u2500\u2500 UK10K_cohort\n \u2514\u2500\u2500 REL-2012-06-02\n
"},{"location":"cubit/exomes-panels/","title":"Exomes and Panels","text":"These exome panel data are proprietary and downloaded after registration. In case you want to use them, be sure you have access to them by creating an account at Agilent or Roche to not run into legal trouble.
static_data/exome_panel\n\u251c\u2500\u2500 Agilent\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 SureSelect_Human_All_Exon_V4\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 SureSelect_Human_All_Exon_V5\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 SureSelect_Human_All_Exon_V6\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 SureSelect_Mouse_All_Exon_V1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 NCBIM37\n\u2514\u2500\u2500 Roche\n \u2514\u2500\u2500 SeqCap_EZ_MedExome\n \u2514\u2500\u2500 GRCh37\n
"},{"location":"cubit/exon-lists/","title":"Exon Lists","text":"Here we provide exon lists for some human genome assemblies in the .bed
-file format. Each file exists in a version with the original coordinates and in a version with 10 bp of padding on each side (suffix: _plus_10bp.bed
). The folder structure is self-explanatory:
static_data/exon_list\n\u251c\u2500\u2500 CCDS\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 15\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 18\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hg38\n\u2514\u2500\u2500 ENSEMBL\n \u251c\u2500\u2500 74\n \u2502\u00a0\u00a0 \u2514\u2500\u2500 GRCh37\n \u2514\u2500\u2500 75\n \u2514\u2500\u2500 GRCh37\n
"},{"location":"cubit/index-files/","title":"Precomputed Index Files","text":"Index files for
- BWA version 0.7.12 and 0.7.15,
- bowtie2 version 2.2.5 and
- STAR version 2.4.1d
have been precomputed. The index corresponding to each genome is stored in the following directory structure with the above-mentioned reference genomes as subfolders (listed here only for Bowtie/1.1.2
, same subfolders for the remaining programs):
static_data/precomputed\n\u251c\u2500\u2500 Bowtie\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 1.1.2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 danRer10\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 dm6\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 ecoli\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 GRCm38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg18\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hg38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm10\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 phix\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 sacCer3\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 UniVec\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 UniVec_Core\n\u251c\u2500\u2500 Bowtie2\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 2.2.5\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n\u251c\u2500\u2500 BWA\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0.7.12\n\u2502\u00a0\u00a0 \u2502 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0.7.15\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n\u2514\u2500\u2500 STAR\n \u2514\u2500\u2500 2.4.1d\n \u2514\u2500\u2500 default\n \u00a0\u00a0 \u2514\u2500\u2500 [see Bowtie/1.1.2]\n
"},{"location":"cubit/references/","title":"Reference Sequences","text":""},{"location":"cubit/references/#ncbi-mouse-reference-genome-assemblies","title":"NCBI mouse reference genome assemblies","text":"We provide the NCBI mouse reference assembly used by the Sanger Mouse Genomics group for NCBIM37 and GRCm38. This is a reliable source where the appropriate contigs have already been selected by experts. NCBIM37 is annotated with Ensembl release 67 and GRCm38 with Ensembl release 68.
"},{"location":"cubit/references/#ucsc-mouse-reference-genome-assemblies","title":"UCSC mouse reference genome assemblies","text":"The assembly sequence is in one file per chromosome and is available for mm9 and mm10. We concatenated all the chromosome files to one final fasta file for each genome assembly.
"},{"location":"cubit/references/#ncbi-human-reference-genome-assemblies","title":"NCBI human reference genome assemblies","text":" - GRCh37: We provide the version used by the 1000genomes project as it is widely used and recommended. The chromosomes and contigs are already concatenated.
- g1k_phase1/hs37: This reference sequence contains the autosomes and both sex chromosomes, an updated mitochondrial chromosome, as well as \"non-chromosomal supercontigs\". The README explains the method of construction.
- g1k_phase2/hs37d5: In addition to these sequences the phase 2 reference sequence contains the herpes virus genome and decoy sequences for improving SNP calling.
- GRCh38: The GRCh38 assembly offers an \"analysis set\" that was created to accommodate next generation sequencing read alignment pipelines. We provide the three analysis sets from the NCBI.
- hs38/no_alt_analysis_set: The chromosomes, mitochondrial genome, unlocalized scaffolds, unplaced scaffolds and the Epstein-Barr virus sequence which has been added as a decoy to attract contamination in samples.
- hs38a/full_analysis_set: the alternate locus scaffolds in addition to all the sequences present in the no_alt_analysis_set.
- hs38DH/full_plus_hs38d1_analysis_set: contains the human decoy sequences from hs38d1 in addition to all the sequences present in the full_analysis set. More detailed information is available in the README.
"},{"location":"cubit/references/#ucsc-human-reference-genome-assemblies","title":"UCSC human reference genome assemblies","text":"The assembly sequence is in one file per chromosome is available for hg18, hg19 and hg38. We concatenated all the chromosome files to one final fasta file for each genome assembly. Additionally, in the subfolder chromosomes
we keep the chromosome fasta files separately for hg18 and hg19.
"},{"location":"cubit/references/#other-reference-genomes","title":"Other reference genomes","text":" - danRer10: UCSC/GRC zebrafish build 10
- dm6: UCSC/GRC Drosophila melanogaster build 6
- ecoli:
- GCA_000005845.2_ASM584v2: GenBank Escherichia coli K-12 substr. MG1655 genome
- genomemedley:
- 1: Concatenated genome of hg19, dm6, mm10; Chromosomes are tagged with corresponding organism
- PhiX: Control genome that is used by Illumina for sequencing runs
- sacCer3: UCSC's Saccharomyces cerevisiae genome build 3
- UniVec:
- 9: NCBI's non-redundant reference of vector sequences, adapters, linkers and primers commonly used in the process of cloning cDNA or genomic DNA (build 9)
- UniVec_Core
- 9: A subset of UniVec build 9
The following directory structure indicates the available genomes. Where there isn't a name for the data set, either the source (e.g. sanger - from the Sanger Mouse Genomes project) or the download date is used to name the sub-directory.
static_data/reference\n\u251c\u2500\u2500 danRer10\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 dm6\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 ecoli\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 GCA_000005845.2_ASM584v2\n\u251c\u2500\u2500 genomemedley\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 1\n\u251c\u2500\u2500 GRCh37\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 g1k_phase1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 g1k_phase2\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hs37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hs37d5\n\u251c\u2500\u2500 GRCh38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hs38\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 hs38a\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 hs38DH\n\u251c\u2500\u2500 GRCm38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 sanger\n\u251c\u2500\u2500 hg18\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 hg19\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 hg38\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 mm10\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 mm9\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 NCBIM37\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 sanger\n\u251c\u2500\u2500 phix\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 illumina\n\u251c\u2500\u2500 sacCer3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 ucsc\n\u251c\u2500\u2500 UniVec\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 9\n\u2514\u2500\u2500 UniVec_Core\n \u2514\u2500\u2500 9\n
"},{"location":"help/faq/","title":"Frequently Asked Questions","text":""},{"location":"help/faq/#where-can-i-get-help","title":"Where can I get help?","text":" - Talk to your colleagues!
- Have a look at our forums at HPC-talk to see if someone already solved the same problem. If not, create a new topic. Administrators, CUBI, and other users can see and answer your question.
- For problems while connecting and logging in, please contact helpdesk@mdc-berlin.de or helpdesk@charite.de.
- For problems with BIH HPC please contact hpc-helpdesk@bih-charite.de.
"},{"location":"help/faq/#i-cannot-connect-to-the-cluster-whats-wrong","title":"I cannot connect to the cluster. What's wrong?","text":"Please see the section Connection Problems.
"},{"location":"help/faq/#connecting-to-the-cluster-takes-a-long-time","title":"Connecting to the cluster takes a long time.","text":"The most probable cause for this is a conda installation which loads files on login. To disable this behaviour we can put the conda intialisation code behind a bash alias to run it manually later:
In your ~/.bashrc
find the conda block:
# >>> conda initialize >>>\n# !! Contents within this block are managed by 'conda init' !!\n...\n# <<< conda initialize <<<\n
Encapsulate the entire section in a bash function like this:
conda_init() {\n # >>> conda initialize >>>\n # !! Contents within this block are managed by 'conda init' !!\n ...\n # <<< conda initialize <<<\n}\n
From now on, to use conda you must first run conda_init
which then loads the necessary files.
You can also run the bash shell in verbose mode to find out exactly which command is slowing down login:
$ ssh user@hpc-login-1.cubi.bihealth.org bash -iv\n
"},{"location":"help/faq/#what-is-the-difference-between-max-and-bih-cluster-what-is-their-relation","title":"What is the difference between MAX and BIH cluster? What is their relation?","text":"Administrativa
- The BIH HPC 4 Research cluster of the Berlin Institute of Health (BIH) is located in Buch and operated by BIH HPC IT. The cluster is open for users of both BIH/Charite and MDC.
- The MAX cluster is the cluster of the Max Delbrueck Center (MDC) in Buch. This cluster is used by the researchers at MDC and integrates with a lot of infrastructure of the MDC.
Requests for both systems are handled separately, depending on the user's affiliation with research/service groups.
Hardware and Systems
- Both clusters consist of similar hardware for the compute nodes and both feature a DDN storage system, albeit with a different number of nodes and different storage volumes.
- Both clusters run CentOS/Rocky Linux, but potentially at different versions.
- BIH HPC uses the Slurm workload manager whereas MAX uses Univa Grid Engine.
- The BIH cluster has a significantly faster internal network (40GB/s optical).
Bioinformatics Software
- On the BIH cluster, users can install their own (bioinformatics) software in their user directory.
- On the MAX cluster, users can also install their own software or use software provided by Altuna Akalin's group at MDC.
"},{"location":"help/faq/#my-ssh-sessions-break-with-packet_write_wait-connection-to-xxx-broken-pipe-how-can-i-fix-this","title":"My SSH sessions break with \"packet_write_wait: Connection to XXX : Broken pipe
\". How can I fix this?","text":"Try to put the following line at the top of your ~/.ssh/config
.
ServerAliveInterval 30\n
This will make ssh
send an empty network package to the server. This will prevent network hardware from thinking your connection is unused/broken and terminating it.
If the problem persists, please report it to hpc-helpdesk@bih-charite.de.
"},{"location":"help/faq/#my-job-terminated-before-being-done-what-happened","title":"My job terminated before being done. What happened?","text":"First of all, look into your job logs. In the case that the job was terminated by Slurm (e.g., because it ran too long), you will find a message like this at the bottom. Please look at the end of the last line in your log file.
slurmstepd: error: *** JOB <your job id> ON med0xxx CANCELLED AT 2020-09-02T21:01:12 DUE TO TIME LIMIT ***\n
This indicates that you need to need to adjust the --time
limit to your sbatch
command.
slurmstepd: error: Detected 2 oom-kill event(s) in step <your job id>.batch cgroup.\nSome of your processes may have been killed by the cgroup out-of-memory handler\n
This indicates that your job tries to use more memory than has been allocated to it. Also see Slurm Scheduler: Memory Allocation
Otherwise, you can use sacct -j JOBID
to read the information that the job accounting system has recorded for your job. A job that was canceled (indicated by CANCELED
) by the Slurm job scheduler looks like this (ignore the COMPLETED
step that is just some post-job step added by Slurm automatically).
# sacct -j _JOBID_\n JobID JobName Partition Account AllocCPUS State ExitCode\n------------ ---------- ---------- ---------- ---------- ---------- --------\n_JOBID_ snakejob.+ medium hpc-ag-xx+ 4 TIMEOUT 0:0\n_JOBID_.bat+ batch hpc-ag-xx+ 4 CANCELLED 0:15\n_JOBID_.ext+ extern hpc-ag-xx+ 4 COMPLETED 0:0\n
Use the --long
flag to see more fields (and you probably want to pipe the output into less
as: sacct -j JOBID --long | less -S
). Things to look out for:
- What is the exit code?
- Is the highest recorded memory usage too high/higher than expected (field
MaxRSS
)? - Is the running time too long/longer than expected (field
Elapsed
)?
Note that --long
does not show all fields. For example, the following tells us that the given job was above its elapsed time which caused it to be killed.
# sacct -j _JOBID_ --format Timelimit,Elapsed\n Timelimit Elapsed\n---------- ----------\n 01:00:00 01:00:12\n 01:00:13\n 01:00:12\n
Use man sacct
, sacct --helpformat
, or see the Slurm Documentation for options for the --format
field of sacct
.
"},{"location":"help/faq/#im-getting-a-bus-error-core-dumped","title":"I'm getting a \"Bus error (core dumped)\"","text":"This is most probably caused by your job being allocated insufficient memory. Please see the memory part of the answer to My job terminated before being done. What happened?
"},{"location":"help/faq/#how-can-i-create-a-new-project","title":"How can I create a new project?","text":"You can create a project if you are either a group leader of an AG or a delegate of an AG. If this is the case, please follow these instructions.
"},{"location":"help/faq/#i-cannot-create-pngs-in-r","title":"I cannot create PNGs in R","text":"For using the png
method, you need to have an X11 session running. This might be the case if you logged into a cluster node using srun --x11
if configured correctly, but it is not the case if you submitted a batch job. The solution is to use xvfb-run
(xvfb = X11 virtual frame-buffer).
Here is the content of an example script:
$ cat img.R\n#!/usr/bin/env Rscript\n\npng('cars.png')\ncars <- c(1, 3, 6, 4, 9)\nplot(cars)\ndev.off()\n
Here, it fails without X11:
$ ./img.R\nError in .External2(C_X11, paste(\"png::\", filename, sep = \"\"), g$width, :\n unable to start device PNG\nCalls: png\nIn addition: Warning message:\nIn png(\"cars.png\") : unable to open connection to X11 display ''\nExecution halted\n
Here, it works with xvfb-run
:
$ xvfb-run ./img.R\nnull device\n 1\n$ ls\ncars.png foo.png img.R Rplots.pdf\n
"},{"location":"help/faq/#my-jobs-dont-get-scheduled","title":"My jobs don't get scheduled","text":"You can use scontrol show job JOBID
to get the details displayed about your jobs. In the example below, we can see that the job is in the PENDING
state. The Reason
field tells us that the job was not scheduled because the specified dependency was never fulfilled. You can find a list of all job reason codes in the Slurm squeue
documentation.
JobId=863089 JobName=pipeline_job.sh\n UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(5272) MCS_label=N/A\n Priority=1 Nice=0 Account=(null) QOS=normal\n JobState=PENDING Reason=DependencyNeverSatisfied Dependency=afterok:863087(failed)\n Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n RunTime=00:00:00 TimeLimit=08:00:00 TimeMin=N/A\n SubmitTime=2020-05-03T18:57:34 EligibleTime=Unknown\n AccrueTime=Unknown\n StartTime=Unknown EndTime=Unknown Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-05-03T18:57:34\n Partition=debug AllocNode:Sid=hpc-login-1:28797\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=(null)\n NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=1,node=1,billing=1\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n Command=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/pipeline_job.sh\n WorkDir=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export\n StdErr=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out\n StdIn=/dev/null\n StdOut=/data/cephfs-1/work/projects/medgen_genomes/2019-06-05_genomes_reboot/GRCh37/wgs_cnv_export/slurm-863089.out\n Power=\n MailUser=(null) MailType=NONE\n
If you see a Reason=ReqNodeNotAvail,_Reserved_for_maintenance
then also see Reservations / Maintenances.
For GPU jobs also see \"My GPU jobs don't get scheduled\".
"},{"location":"help/faq/#my-gpu-jobs-dont-get-scheduled","title":"My GPU jobs don't get scheduled","text":"There are only four GPU machines in the cluster (with four GPUs each, hpc-gpu-1 to hpc-gpu-4). Please inspect first the number of running jobs with GPU resource requests:
hpc-login-1:~$ squeue -o \"%.10i %20j %.2t %.5D %.4C %.10m %.16R %.13b\" \"$@\" | grep hpc-gpu- | sort -k7,7\n 1902163 ONT-basecalling R 1 2 8G hpc-gpu-1 gpu:tesla:2\n 1902167 ONT-basecalling R 1 2 8G hpc-gpu-1 gpu:tesla:2\n 1902164 ONT-basecalling R 1 2 8G hpc-gpu-2 gpu:tesla:2\n 1902166 ONT-basecalling R 1 2 8G hpc-gpu-2 gpu:tesla:2\n 1902162 ONT-basecalling R 1 2 8G hpc-gpu-3 gpu:tesla:2\n 1902165 ONT-basecalling R 1 2 8G hpc-gpu-3 gpu:tesla:2\n 1785264 bash R 1 1 1G hpc-gpu-4 gpu:tesla:2\n
This indicates that there are two free GPUs on hpc-gpu-4.
Second, inspect the node states:
hpc-login-1:~$ sinfo -n hpc-gpu-[1-4]\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST\ndebug* up 8:00:00 0 n/a\nmedium up 7-00:00:00 0 n/a\nlong up 28-00:00:0 0 n/a\ncritical up 7-00:00:00 0 n/a\nhighmem up 14-00:00:0 0 n/a\ngpu up 14-00:00:0 1 drng hpc-gpu-4\ngpu up 14-00:00:0 3 mix med[0301-0303]\nmpi up 14-00:00:0 0 n/a\n
This tells you that hpc-gpu-1 to hpc-gpu-3 have jobs running (\"mix\" indicates that there are free resources, but these are only CPU cores, not GPUs). hpc-gpu-4 is shown to be in \"draining\" state. Let's see what's going on there.
hpc-login-1:~$ scontrol show node hpc-gpu-4\nNodeName=hpc-gpu-4 Arch=x86_64 CoresPerSocket=16\n CPUAlloc=2 CPUTot=64 CPULoad=1.44\n AvailableFeatures=skylake\n ActiveFeatures=skylake\n Gres=gpu:tesla:4(S:0-1)\n NodeAddr=hpc-gpu-4 NodeHostName=hpc-gpu-4 Version=20.02.0\n OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020\n RealMemory=385215 AllocMem=1024 FreeMem=347881 Sockets=2 Boards=1\n State=MIXED+DRAIN ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A\n Partitions=gpu\n BootTime=2020-06-30T20:33:36 SlurmdStartTime=2020-07-01T09:31:51\n CfgTRES=cpu=64,mem=385215M,billing=64\n AllocTRES=cpu=2,mem=1G\n CapWatts=n/a\n CurrentWatts=0 AveWatts=0\n ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s\n Reason=deep power-off required for PSU [root@2020-07-17T13:21:02]\n
The \"State\" attribute indicates the node has jobs running but is currenlty being \"drained\" (accepts no new jobs). The \"Reason\" gives that it has been scheduled for power-off for maintenance of the power supply unit.
"},{"location":"help/faq/#when-will-my-job-be-scheduled","title":"When will my job be scheduled?","text":"You can use the scontrol show job JOBID
command to inspect the scheduling information for your job. For example, the following job is scheduled to start at 2022-09-19T07:53:29
(StartTime
) and will be terminated if it does not stop before 2022-09-19T15:53:29
(EndTime
) For further information, it has been submitted at 2022-09-15T12:24:57
(SubmitTime
) and has been last considered by the scheduler at 2022-09-19T07:53:15
(LastSchedEval
).
# scontrol show job 4225062\nJobId=4225062 JobName=C2371_2\n UserId=user_c(133196) GroupId=hpc-ag-group(1030014) MCS_label=N/A\n Priority=805 Nice=0 Account=hpc-ag-group QOS=normal\n JobState=PENDING Reason=QOSMaxCpuPerUserLimit Dependency=(null)\n Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n RunTime=00:00:00 TimeLimit=08:00:00 TimeMin=N/A\n SubmitTime=2022-09-15T12:24:57 EligibleTime=2022-09-15T12:24:57\n AccrueTime=2022-09-15T12:24:57\n StartTime=2022-09-19T07:53:29 EndTime=2022-09-19T15:53:29 Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-09-19T07:53:15 Scheduler=Main\n Partition=medium AllocNode:Sid=hpc-login-1:557796\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=(null)\n NumNodes=1-1 NumCPUs=25 NumTasks=25 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=25,mem=150G,node=1,billing=25\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=150G MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=YES Contiguous=0 Licenses=(null) Network=(null)\n Command=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/GS_wrapy/wrap_y0_VP_2371_GS_chunk2_C02.sh\n WorkDir=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims\n StdErr=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/E2371_2.txt\n StdIn=/dev/null\n StdOut=/data/cephfs-1/home/users/user_c/work/SCZ_replic/JR_sims/slurm-4225062.out\n Power=\n
"},{"location":"help/faq/#my-jobs-dont-run-in-the-partition-i-expect","title":"My jobs don't run in the partition I expect","text":"You can see the partition that your job runs in with squeue -j JOBID
:
hpc-login-1:~$ squeue -j 877092\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 877092 medium snakejob holtgrem R 0:05 1 med0626\n
See Job Scheduler for information about the partition's properties and how jobs are routed to partitions. You can force jobs to run in a particular partition by specifying the --partition
parameter, e.g., by adding --partition=medium
or -p medium
to your srun
and sbatch
calls.
"},{"location":"help/faq/#my-jobs-get-killed-after-four-hours","title":"My jobs get killed after four hours","text":"This is probably answered by the answer to My jobs don't run in the partition I expect.
"},{"location":"help/faq/#how-can-i-mount-a-network-volume-from-elsewhere-on-the-cluster","title":"How can I mount a network volume from elsewhere on the cluster?","text":"You cannot.
"},{"location":"help/faq/#how-can-i-make-workstationserver-files-available-to-the-hpc","title":"How can I make workstation/server files available to the HPC?","text":"You can transfer files to the cluster through Rsync over SSH or through SFTP to the hpc-transfer-1
or hpc-transfer-2
node.
Do not transfer files through the login nodes. Large file transfers through the login nodes can cause performance degradation for the users with interactive SSH connections.
"},{"location":"help/faq/#how-can-i-circumvent-invalid-instruction-signal-4-errors","title":"How can I circumvent \"invalid instruction\" (signal 4) errors?","text":"Make sure that software is compiled with \"sandy bridge\" optimizations and no later one. E.g., use the -march=sandybridge
argument to the GCC/LLVM compiler executables.
If you absolutely need it, there are some boxes with more recent processors in the cluster (e.g., Haswell architecture). Look at the /proc/cpuinfo
files for details.
"},{"location":"help/faq/#i-have-problems-connecting-to-the-gpu-node-whats-wrong","title":"I have problems connecting to the GPU node! What's wrong?","text":"Please check whether there might be other jobs waiting in front of you! The following squeue
call will show the allocated GPUs of jobs in the gpu
queue. This is done by specifying a format string and using the %b
field.
squeue -o \"%.10i %9P %20j %10u %.2t %.10M %.6D %10R %b\" -p gpu\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(R TRES_PER_NODE\n 872571 gpu bash user1 R 15:53:25 1 hpc-gpu-3 gpu:tesla:1\n 862261 gpu bash user2 R 2-16:26:59 1 hpc-gpu-4 gpu:tesla:4\n 860771 gpu kidney.job user3 R 2-16:27:12 1 hpc-gpu-2 gpu:tesla:1\n 860772 gpu kidney.job user3 R 2-16:27:12 1 hpc-gpu-2 gpu:tesla:1\n 860773 gpu kidney.job user3 R 2-16:27:12 1 hpc-gpu-2 gpu:tesla:1\n 860770 gpu kidney.job user3 R 4-03:23:08 1 hpc-gpu-1 gpu:tesla:1\n 860766 gpu kidney.job user3 R 4-03:23:11 1 hpc-gpu-3 gpu:tesla:1\n 860767 gpu kidney.job user3 R 4-03:23:11 1 hpc-gpu-1 gpu:tesla:1\n 860768 gpu kidney.job user3 R 4-03:23:11 1 hpc-gpu-1 gpu:tesla:1\n
In the example above, user1 has one job with one GPU running on hpc-gpu-3, user2 has one job running with 4 GPUs on hpc-gpu-4 and user3 has 7 jobs in total running of different machines with one GPU each.
"},{"location":"help/faq/#how-can-i-access-graphical-user-interfaces-such-as-for-matlab-on-the-cluster","title":"How can I access graphical user interfaces (such as for Matlab) on the cluster?","text":" - First of all, you will need an X(11) server on your local machine (see Wikipedia: X Window System. This server offers a \"graphical surface\" that the programs on the cluster can then paint on.
- You need to make sure that the programs running on the cluster can access this graphical surface.
- Generally, you need to connect to the login nodes with X forwarding. Refer to the manual of your SSH client on how to do this (
-X
for Linux/Mac ssh
- As you should not run compute-intensive programs on the login node, connect to a cluster node with X forwarding. With Slurm, this is done using
srun --pty --x11 bash -i
(instead of srun --pty --x11 bash -i
).
Also see:
- Running graphical(X11) applications on Windows
- Running graphical(X11) applications on Linux
"},{"location":"help/faq/#how-can-i-log-into-a-node-outside-of-the-scheduler","title":"How can I log into a node outside of the scheduler?","text":"This is sometimes useful, e.g., for monitoring the CPU/GPU usage of your job interactively.
No Computation Outside of Slurm
Do not perform any computation outside of the scheduler as (1) this breaks the purpose of the scheduling system and (2) administration is not aware and might kill you jobs.
The answer is simple, just SSH into this node.
hpc-login-1:~$ ssh hpc-cpu-xxx\n
"},{"location":"help/faq/#why-am-i-getting-multiple-nodes-to-my-job","title":"Why am I getting multiple nodes to my job?","text":"Classically, jobs on HPC systems are written in a way that they can run on multiple nodes at once, using the network to communicate. Slurm comes from this world and when allocating more than one CPU/core, it might allocate them on different nodes. Please use --nodes=1
to force Slurm to allocate them on a single node.
"},{"location":"help/faq/#how-can-i-select-a-certain-cpu-architecture","title":"How can I select a certain CPU architecture?","text":"You can select the CPU architecture by using the -C
/--constraint
flag to sbatch
and srun
. The following are available (as detected by the Linux kernel):
ivybridge
(96 nodes, plus 4 high-memory nodes) haswell
(16 nodes) broadwell
(112 nodes) skylake
(16 nodes, plus 4 GPU nodes)
You can specify contraints with OR such as --constraint=haswell|broadwell|skylake
. You can see the assignment of architectures to nodes using the sinfo -o \"%8P %.5a %.10l %.6D %.6t %10f %N\"
command. This will also display node partition, availability etc.
"},{"location":"help/faq/#help-im-getting-a-quota-warning-email","title":"Help, I'm getting a Quota Warning Email!","text":"No worries!
As documented in the Storage Locations section, each user/project/group has three storage volumes: A small home
, a larger work
and a large (but temporary) scratch
. There are limits on the size of these volumes. You get a nightly warning email in case you are over the soft limit and you will not be able to write any more data if you get above the hard limit. When you login to the login nodes, the quotas and current usage is displayed to you.
Please note that not all files will be displayed when using ls
. You have to add the -a
parameter to also show files and directories starting with a dot. Often, users are confused when these dot directories take up all of their home
quota.
Use the following command to list all files and directories in your home:
hpc-login-1:~$ ls -la ~/\n
For more information on how to keep your home directory clean and avoid quota warnings, please read Home Folder Quota.
"},{"location":"help/faq/#im-getting-a-disk-quota-exceeded-error","title":"I'm getting a \"Disk quota exceeded\" error.","text":"Most probably you are running into the same problem as described above: Help, I'm getting a Quota Warning Email!
"},{"location":"help/faq/#environment-modules-dont-work-and-i-get-module-command-not-found","title":"Environment modules don't work and I get \"module: command not found\"","text":"First of all, ensure that you are on a compute node and not on one of the login nodes. One common reason is that the system-wide Bash configuration has not been loaded, try to execute source /etc/bashrc
and then re-try using module
. In the case that the problem persists, please contact hpc-helpdesk@bih-charite.de.
"},{"location":"help/faq/#what-should-my-bashrc-look-like","title":"What should my ~/.bashrc look like?","text":"All users get their home directory setup using a skelleton files. These file names start with a dot .
and are hidden when you type ls
, you have to type ls -a
to see them. You can find the current skelleton in /etc/skel.bih
and inspect the content of the Bash related files as follows:
hpc-login-1:~$ head /etc/skel.bih/.bash*\n==> /etc/skel.bih/.bash_logout <==\n# ~/.bash_logout\n\n==> /etc/skel.bih/.bash_profile <==\n# .bash_profile\n\n# Get the aliases and functions\nif [ -f ~/.bashrc ]; then\n . ~/.bashrc\nfi\n\n# User specific environment and startup programs\n\nPATH=$PATH:$HOME/.local/bin:$HOME/bin\n\n==> /etc/skel.bih/.bashrc <==\n# .bashrc\n\n# Source global definitions\nif [ -f /etc/bashrc ]; then\n . /etc/bashrc\nfi\n\n# Uncomment the following line if you don't like systemctl's auto-paging feature:\n# export SYSTEMD_PAGER=\n
There actually are a couple of more files by default. The original copy in /etc/skel.bih
might slightly change over time during improvements but we will not touch your home directory in an unsolicited way at any time!
hpc-login-1:~$ tree -a /etc/skel.bih/\n/etc/skel.bih/\n\u251c\u2500\u2500 .bash_logout\n\u251c\u2500\u2500 .bash_profile\n\u251c\u2500\u2500 .bashrc\n\u251c\u2500\u2500 .screenrc\n\u2514\u2500\u2500 .vimrc\n
"},{"location":"help/faq/#my-program-crashes-what-should-i-do","title":"My program crashes! What should I do?","text":"Have a look at our How-To: Debug Software and How-To: Debug Software on HPC Systems guides!
But it works on my workstation!
Yes, please also refer to these guides on possible approaches to find the problem.
"},{"location":"help/faq/#my-r-sessions-is-always-crashing","title":"My R sessions is always crashing!","text":"We are aware of issues for macOS users that are related to locales. You might see something like the following in your R session or logs.
Note that this is not macOS specific by itself but by macOS using a locale that R does not understand.
Try using export LC_ALL=C
before calling R. If the problem goes away, you can add the following lines to your ~/.bashrc
file to make this permanent. Note that this is actually a good idea so your locale settings are fixed.
# Fix all locale settings to \"C\" for robustness.\nexport LC_ALL=C\n
ChatGPT on Unix Locales
Let's se what ChatGPT 3.5 tells us about Unix locale.
Me: Please explain the concept of Unix locales to me.
Certainly! Unix locales are a set of environment variables and settings that define how various aspects of software and user interactions should be handled to accommodate different languages, regions, and cultural conventions. They play a crucial role in enabling software to adapt to local preferences for displaying text, formatting numbers, handling dates, and more.
In a Unix-like operating system (such as Linux or macOS), locales provide a way to customize how programs behave based on the user's language, country, and cultural norms. Here are some of the key components of Unix locales:
-
Language: The primary language used for messages, menus, prompts, and system output. For instance, if you set your locale to \"en_US\", messages and interfaces will be in English (United States).
-
Region: The country or region that defines conventions for aspects like date and time formatting, number formatting (decimal point or comma, thousands separator, etc.), and currency symbols. For example, \"en_GB\" represents English (Great Britain).
-
Character Encoding: Specifies how characters are encoded, which is vital for correctly displaying and handling text in various scripts and languages. Common encodings include UTF-8 for Unicode text.
-
Collation: Defines how strings should be sorted. Different languages have different sorting orders for characters, and the locale determines the appropriate collation rules.
-
Numeric and Date Formats: Specifies how numbers, dates, and times should be formatted. Different regions have different conventions for date formats (dd/mm/yyyy vs. mm/dd/yyyy) and numerical representations (commas vs. periods as decimal points).
-
Currency Symbols: Defines the symbol or abbreviation used for representing currency. This is important for currency formatting in software.
Unix locales are typically identified by a combination of language and region codes, like \"en_US\" for American English or \"fr_FR\" for French (France). These locales are stored in files on the system, and they contain the necessary rules and data for proper localization.
Users and administrators can set the system's default locale, and individual users can override this setting in their environment. Here are some common commands used to manage locales:
locale
: Displays the current locale settings. locale -a
: Lists all available locales on the system. export LC_ALL=locale_code
: Temporarily sets the locale for the current session. update-locale locale_code
: Changes the system-wide default locale.
Programs that are designed to be locale-aware use these settings to display messages, format numbers and dates, and handle other language-dependent features. This allows software to be more user-friendly and relevant to different linguistic and cultural contexts.
In summary, Unix locales provide a framework for adapting software behavior to match the language, region, and cultural preferences of users, ensuring a more inclusive and globally accessible computing experience.
"},{"location":"help/faq/#which-cuda-version-is-installed","title":"Which CUDA version is installed?","text":"For this, connect to the node you want to query (via SSH but do not perform any computation via SSH!)
hpc-login-1:~$ ssh hpc-gpu-1\nhpc-gpu-1:~$ yum list installed 2>/dev/null | grep cuda.x86_64\ncuda.x86_64 10.2.89-1 @local-cuda\nnvidia-driver-latest-dkms-cuda.x86_64 3:440.64.00-1.el7 @local-cuda\n
"},{"location":"help/faq/#can-i-use-docker-on-the-cluster","title":"Can I use Docker on the Cluster?","text":"No, as Docker essentially gives you access as the root user.
However, you can use Apptainer (former Singularity) to run containers (and even many Docker contains if they are \"properly built\"). Also see Using Apptainer (with Docker Images).
"},{"location":"help/faq/#how-can-i-copy-data-between-the-max-cluster-mdc-network-and-bih-hpc","title":"How can I copy data between the MAX Cluster (MDC Network) and BIH HPC?","text":"The MAX cluster is the HPC system of the MDC. It is located in the MDC network. The BIH HPC is located in the BIH network.
In general, connections can only be initiated from the MDC network to the BIH network. The reverse does not work. In other words, you have to log into the MAX cluster and then initiate your file copies to or from the BIH HPC from there. E.g., use rsync -avP some/path user_m@hpc-transfer-1.cubi.bihealth.org:/another/path
to copy files from the MAX cluster to BIH HPC and rsync -avP user_m@hpc-transfer-1.cubi.bihealth.org:/another/path some/path
to copy data from the BIH HPC to the MAX cluster.
"},{"location":"help/faq/#how-can-i-copy-data-between-the-charite-network-and-bih-hpc","title":"How can I copy data between the Charit\u00e9 Network and BIH HPC?","text":"In general, connections can only be initiated from the Charit\u00e9 network to the BIH network. The reverse does not work. In other words, you have to be on a machine inside the Charit\u00e9 network and then initiate your file copies to or from the BIH HPC from there. E.g., use rsync -avP some/path user_c@hpc-transfer-1.cubi.bihealth.org:/another/path
to copy files from the Charit\u00e9 network to BIH HPC and rsync -avP user_c@hpc-transfer-1.cubi.bihealth.org:/another/path some/path
to copy data from the BIH HPC to the Charit\u00e9 network.
"},{"location":"help/faq/#my-jobs-are-slowdie-on-the-logintransfer-node","title":"My jobs are slow/die on the login/transfer node!","text":"As of December 3, 2020 we have established a policy to limit you to 512 files and 128MB of RAM. Further, you are limited to using the equivalent of one core. This limit is enforced for all processes originating from an SSH session and the limit is enforced on all jobs. This was done to prevent users from thrashing the head nodes or using SSH based sessions for computation.
"},{"location":"help/faq/#slurm-complains-about-execve-no-such-file-or-directory","title":"Slurm complains about execve
/ \"No such file or directory\"","text":"This means that the program that you want to execute does not exist. Consider the following example:
[user@hpc-login-1 ~]$ srun --time 2-0 --nodes=1 --ntasks-per-node=1 \\\n --cpus-per-task=12 --mem 96G --partition staging --immediate 5 \\\n --pty bash -i\nslurmstepd: error: execve(): 5: No such file or directory\nsrun: error: hpc-cpu-2: task 0: Exited with exit code 2\n
Can you spot the problem? In this case, the problem is that for long arguments such as --mem
you must use the equal sign for --arg=value
with Slurm. This means that instead of writing --mem 96G --partition staging --immediate 5
, you must use --mem=96G --partition=staging --immediate=5
.
In this respect, Slurm deviates from the GNU argument syntax where the equal sign is optional for long arguments.
"},{"location":"help/faq/#slurmstepd-says-that-hwloc_get_obj_below_by_type-fails","title":"slurmstepd
says that hwloc_get_obj_below_by_type
fails","text":"You can ignore the following problem:
slurmstepd: error: hwloc_get_obj_below_by_type() failing, task/affinity plugin may be required to address bug fixed in HWLOC version 1.11.5\nslurmstepd: error: task[0] unable to set taskset '0x0'\n
This is a minor failure related to Slurm and cgroups. Your job should run through successfully despite this error (that is more of a warning for end-users).
"},{"location":"help/faq/#how-can-i-share-filescollaborate-with-users-from-another-work-group","title":"How can I share files/collaborate with users from another work group?","text":"Please use projects as documented here. Projects were created for this particular purpose.
"},{"location":"help/faq/#whats-the-relation-of-charite-mdc-and-cluster-accounts","title":"What's the relation of Charite, MDC, and cluster accounts?","text":"For HPC 4 Research either an active and working Charite or MDC account is required (that is, you can login e.g., into email.charite.de or mail.mdc-berlin.de). The system has a separate meta directory that is used for the authorization of users (in other words, whether the user is active, has access to the system, and which groups the user belongs to). Charite and MDC accounts map to accounts <Charite user name>_c
and <MDC user name>_m
accounts in this meta directory. In the case that a user has both Charite and MDC accounts these are completely separate entities in the meta directory. For authentication (veryfing that a user has acccess to an account), the Charite and MDC account systems (MS Active Directory) are used. Authentication currently only uses the SSH keys deposited into Charite (via zugang.charite.de) and MDC (via MDC persdb). Users have to obtain a suitable Charite/MDC account via Charite and MDC central IT departments and upload their SSH keys through the host organization systems on their own. The hpc-helpdesk process is then used for getting their accounts setup on the HPC 4 Research system (the home/work/scratch shares being setup), becoming part of the special hpc-users
group that controls access to the system and organizing users into work groups and projects.
The process of submitting keys to Charite and MDC is documented in the \"Connecting\" section.
"},{"location":"help/faq/#how-do-charitemdccluster-accounts-interplay-with-vpn-and-the-mdc-jail-node","title":"How do Charite/MDC/Cluster accounts interplay with VPN and the MDC jail node?","text":"Charite users have to obtain a VPN account with the appropriate VPN access permissions, i.e., Zusatzantrag B as documented here. For Charite VPN, as for all Charite IT systems, users must use their Charite user name (e.g., jdoe
and not jdoe_c
).
MDC users either have to use MDC VPN or the MDC jail node, as documented here. For MDC VPN and jail node, as for all MDC IT systems, users must use their MDC user name (e.g., jdoe
and not jdoe_m
).
For help with VPN or jail node, please contact the central Charite or MDC helpdesks as appropriate.
Only when connecting from the host organizations' VPN or from the host organizations' jail node, the users use the HPC 4 Research user name that is jdoe_c
or jdoe_m
and not jdoe
!
"},{"location":"help/faq/#how-can-i-exchange-data-with-external-collaborators","title":"How can I exchange data with external collaborators?","text":"BIH HPC IT does not have the resources to offer such a service to normal users.
In particular, for privacy sensitive data this comes with a large number of strings attached to fulfill all regulatory requirements. If you need to exchange such data then you need to contact the central IT departments of your home organisation:
- Charite GB IT: heldpesk@charite.de
- MDC: helpdesk@mdc-berlin.de
If your data is not privacy sensitive or you can guarantee strong encryption of the data then the Gigamove service of RWTH Aachen might come in handy:
- https://gigamove.rwth-aachen.de/en
- https://help.itc.rwth-aachen.de/en/service/1jeqhtat4k0o3/faq/
You can login via Charite/MDC credentials (or most German academic institutions) and store up to 1TB of data at a time in the account with each file having up to 100GB.
As a note, Charite GB IT has a (German) manual on how to use 7-Zip with AES256 and strong passwords for encrypting data such that it is fit for transfer over unencrypted channels. You can find it here (Charite Intranet only) at point 2.12.
- https://intranet.charite.de/it/helpdesk/anleitungen/
The key point is using a strong password (e.g. with the pwgen
utility), creating an encrypted file with AES256 encryption, using distinct password for each recipient, and exchanging the password over a second channel (SMS or voice phone). Note that the central manual remains the ground truth of information and this FAQ entry may not reflect the current process recommended by GB IT if it changes without us noticing.
"},{"location":"help/good-tickets/","title":"How-To: Write a Good Ticket","text":"Can you solve the question yourself?
Please try to solve the question yourself with this manual and Google.
If the problem turns out to be hard, we're happy to help.
This page describes how to write a good help request ticket.
- Write a descriptive summary.
- Which cluster are you on? We only support HPC 4 Research.
- Put in a short summary into the Subject.
- Expand on this in a first paragraph. Try to answer the following questions:
- What are you trying to achieve?
- When did the problem start?
- Did it work before?
- Which steps did you attempt to achieve this?
- Give us your basic information.
- Please give us your user name on the cluster.
- Put enough details in the details section.
- Please give us the exact commands you type into your console.
- What are the symptoms/is the error message
- Never put your password into the ticket. In the case that you handle person-related data of patients/study participants, never write any of this information into the ticket or sequent email.
- Please do not send us screenshot images of what you did but copy and paste the text instead.
There is more specific questions for common issues given below.
"},{"location":"help/good-tickets/#problems-connecting-to-the-cluster","title":"Problems Connecting to the Cluster","text":" - From which machine/IP do you try to connect (
ifconfig
on Linux/Mac, ipconfig
on Windows)? - Did it work before?
- What is your user name?
- Please send us the output of
ssh-add -l
and add -vvv
to the SSH command that fails for you. - What is the response of the server?
"},{"location":"help/good-tickets/#problems-submitting-jobs","title":"Problems Submitting Jobs","text":" - Please give us the directory that you run things in.
- Please send us the submission script that you have problems with.
- If the job was submitted, Slurm will give you a job ID. We will need this ID.
- Please send us the output of
scontrol show job <jobid>
or sacct --long -j <jobid>
of your job.
"},{"location":"help/helpdesk/","title":"HPC IT Helpdesk","text":"Getting Help
Our helpdesk can be reached via email to hpc-helpdesk@bih-charite.de. Please read our guide on how to write good tickets first.
Please also use the handy figure below on general problem resolution.
But before contacting the helpdesk, try to get help in the HPC Talk BIH HPC user self-help forum!
"},{"location":"help/helpdesk/#helpdesk-scope","title":"Helpdesk Scope","text":"Our helpdesk can support you in the following areas:
- Problems/questions with connecting to the clusters.
- Problems/questions with using the cluster scheduler or operating system.
- Requests for the installation of common software.
- Problems with running your software that works in other environments.
We will try our best to resolve these issues. Please note that all other questions can only be answered in a \"best effort way\".
"},{"location":"help/helpdesk/#helpdesk-non-scope","title":"Helpdesk Non-Scope","text":"The following topics are out of scope for the BIH HPC Helpdesk:
- Generic Linux or programming questions (try stackoverflow.com).
- Managing users, groups, and projects on the clusters (use hpc-helpdesk@bih-charite.de).
- Generic help with Snakemake or other workflow engines (See Stackoverflow for getting help with Snakemake).
- Help with bioinformatics or other scientific software. Please contact the authors/communities of these software for help (also known as \"upstream\").
We're happy to see if we can help when there is a concrete problem with the software, e.g.,
- something that breaks from one week to another without you changing anything and you assume a change on the cluster, or
- you need a generic dependency that you cannot install via conda or on your own. Please read the section Administration-Provided Software to learn about the kinds of software that we will install and the kinds that we will not.
"},{"location":"help/hpc-talk/","title":"HPC Talk","text":"Another community-driven possibility to get help is our \u201cHPC Talk\u201d forum. After this manual, it should be the first place to consult.
https://hpc-talk.cubi.bihealth.org/
Its main purpose is to serve as a FAQ, so with time and more people participating, you will more likely find an answer to your question. We also use it to make announcements and give an up-to-date status of current problems with the cluster, so it is worth logging in every once in a while. It is also a great first place to look at if you're experiencing problems with the cluster. Maybe it's a known issue.
Despite users also being able to answer questions, our admins do participate on a regular basis.
"},{"location":"how-to/connect/gpu-nodes/","title":"How-To: Connect to GPU Nodes","text":"The cluster has seven nodes with four Tesla V100 GPUs each: hpc-gpu-{1..7}
and one node with 10 A40 GPUs: hpc-gpu-8
.
Connecting to a node with GPUs is easy. You request one or more GPU cores by adding a generic resources flag to your Slurm job submission via srun
or sbatch
.
--gres=gpu:tesla:COUNT
will request NVIDIA V100 cores. --gres=gpu:a40:COUNT
will request NVIDIA A40 cores. --gres=gpu:COUNT
will request any available GPU cores.
Your job will be automatically placed in the Slurm gpu
partition and allocated a number of COUNT
GPUs.
Info
Fair use rules apply. As GPU nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
Interactive Use of GPU Nodes is Discouraged
While interactive computation on the GPU nodes is convenient, it makes it very easy to forget a job after your computation is complete and let it run idle. While your job is allocated, it blocks the allocated GPUs and other users cannot use them although you might not be actually using them. Please prefer batch jobs for your GPU jobs over interactive jobs.
Furthermore, interactive GPU jobs are currently limited to 24 hours. We will monitor the situation and adjust that limit to optimize GPU usage and usability.
Please also note that allocation of GPUs through Slurm is mandatory, in other words: Using GPUs via SSH sessions is prohibited. The scheduler is not aware of manually allocated GPUs and this interferes with other users' jobs.
"},{"location":"how-to/connect/gpu-nodes/#usage-example","title":"Usage example","text":""},{"location":"how-to/connect/gpu-nodes/#preparation","title":"Preparation","text":"We will setup a miniforge installation with pytorch
testing the GPU. If you already have this setup then you can skip this step
hpc-login-1:~$ srun --pty bash\nhpc-cpu-1:~$ wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh\nhpc-cpu-1:~$ bash Miniforge3-Linux-x86_64.sh -b -p ~/work/miniforge\nhpc-cpu-1:~$ source ~/work/miniforge/bin/activate\nhpc-cpu-1:~$ conda create -y -n gpu-test pytorch cudatoolkit=10.2 -c pytorch\nhpc-cpu-1:~$ conda activate gpu-test\nhpc-cpu-1:~$ python -c 'import torch; print(torch.cuda.is_available())'\nFalse\nhpc-cpu-1:~$ exit\nhpc-login-1:~$\n
The False
shows that CUDA is not available on the node but that is to be expected. We're only warming up!
"},{"location":"how-to/connect/gpu-nodes/#allocating-gpus","title":"Allocating GPUs","text":"Let us now allocate a GPU. The Slurm schedule will properly allocate GPUs for you and setup the environment variable that tell CUDA which devices are available. The following dry run shows these environment variables (and that they are not available on the login node).
hpc-login-1:~$ export | grep CUDA_VISIBLE_DEVICES\nhpc-login-1:~$ srun --gres=gpu:tesla:1 --pty bash\nhpc-gpu-1:~$ export | grep CUDA_VISIBLE_DEVICES\ndeclare -x CUDA_VISIBLE_DEVICES=\"0\"\nhpc-gpu-1:~$ exit\nhpc-login-1:~$ srun --gres=gpu:tesla:2 --pty bash\nhpc-gpu-1:~$ export | grep CUDA_VISIBLE_DEVICES\ndeclare -x CUDA_VISIBLE_DEVICES=\"0,1\"\n
As you see, you can also reserve multiple GPUs. If we were to open two concurrent connections (e. g. in a screen
) to the same node when allocating one GPU each, the allocated GPUs would be non-overlapping. Note that any two jobs are isolated using Linux cgroups (\"container\" technology) so you cannot accidentally use a GPU of another job.
Now to the somewhat boring part where we show that CUDA actually works.
hpc-login-1:~$ srun --gres=gpu:tesla:1 --pty bash\nhpc-gpu-1:~$ nvcc --version\nnvcc: NVIDIA (R) Cuda compiler driver\nCopyright (c) 2005-2019 NVIDIA Corporation\nBuilt on Wed_Oct_23_19:24:38_PDT_2019\nCuda compilation tools, release 10.2, V10.2.89\nhpc-gpu-1:~$ source ~/work/miniforge/bin/activate\nhpc-gpu-1:~$ conda activate gpu-test\nhpc-gpu-1:~$ python -c 'import torch; print(torch.cuda.is_available())'\nTrue\n
Note
If scheduling a GPU fails, consider explicitely requesting the GPU partion via --partition gpu
(or #SBATCH --partition gpu
).
Also make sure to read the FAQ entry \"I have problems connecting to the GPU node! What's wrong?\" if you encounter problems.
"},{"location":"how-to/connect/gpu-nodes/#bonus-1-who-is-using-the-gpus","title":"Bonus #1: Who is using the GPUs?","text":"Use squeue
to find out about currently queued jobs (the egrep
only keeps the header and entries in the gpu
partition).
hpc-login-1:~$ squeue | egrep -iw 'JOBID|gpu'\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 33 gpu bash holtgrem R 2:26 1 hpc-gpu-1\n
"},{"location":"how-to/connect/gpu-nodes/#bonus-2-is-the-gpu-running","title":"Bonus #2: Is the GPU running?","text":"To find out how active the GPU nodes actually are, you can connect to the nodes (without allocating a GPU; you can do this even if the node is full) and then use nvidia-smi
.
hpc-login-1:~$ ssh hpc-gpu-1 bash\nhpc-gpu-1:~$ nvidia-smi\nFri Mar 6 11:10:08 2020\n+-----------------------------------------------------------------------------+\n| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |\n|-------------------------------+----------------------+----------------------+\n| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n|===============================+======================+======================|\n| 0 Tesla V100-SXM2... Off | 00000000:18:00.0 Off | 0 |\n| N/A 62C P0 246W / 300W | 16604MiB / 32510MiB | 99% Default |\n+-------------------------------+----------------------+----------------------+\n| 1 Tesla V100-SXM2... Off | 00000000:3B:00.0 Off | 0 |\n| N/A 61C P0 270W / 300W | 16604MiB / 32510MiB | 100% Default |\n+-------------------------------+----------------------+----------------------+\n| 2 Tesla V100-SXM2... Off | 00000000:86:00.0 Off | 0 |\n| N/A 39C P0 55W / 300W | 0MiB / 32510MiB | 0% Default |\n+-------------------------------+----------------------+----------------------+\n| 3 Tesla V100-SXM2... Off | 00000000:AF:00.0 Off | 0 |\n| N/A 44C P0 60W / 300W | 0MiB / 32510MiB | 4% Default |\n+-------------------------------+----------------------+----------------------+\n\n+-----------------------------------------------------------------------------+\n| Processes: GPU Memory |\n| GPU PID Type Process name Usage |\n|=============================================================================|\n| 0 43461 C python 16593MiB |\n| 1 43373 C python 16593MiB |\n+-----------------------------------------------------------------------------+\n
"},{"location":"how-to/connect/gpu-nodes/#fair-share-fair-use","title":"Fair Share / Fair Use","text":"Note that allocating a GPU makes it unavailable for everyone else, so please behave nicely and be cooperative. If you see someone blocking the GPU nodes for a long time, first find out who it is. You can type getent passwd USER_NAME
on any cluster node to see their email address (and work phone number if added). Send a friendly email, most likely they blocked the node accidentally. If you cannot resolve the issue (e. g. the user is not reachable) then please contact hpc-helpdesk@bih-charite.de.
"},{"location":"how-to/connect/high-memory/","title":"How-To: Connect to High-Memory Nodes","text":"The cluster has 4 high-memory nodes with 1.5 TB of RAM. You can connect to these nodes using the highmem
SLURM partition (see below). Jobs allocating more than 200 GB of RAM are automatically routed to the highmem
nodes.
Info
Fair use rules apply. As high-memory nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
"},{"location":"how-to/connect/high-memory/#how-to","title":"How-To","text":"In the cluster there are four High-memory used which can be used:
hpc-login-1:~$ sinfo -p highmem\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST \nhighmem up 14-00:00:0 3 idle med040[1-4] \n
To connect to one of them, simply allocate more than 200 GB of RAM in your job.
hpc-login-1:~$ srun --pty --mem=300GB bash -i\nmed0401:~$\n
You can also pick one of the hostnames:
hpc-login-1:~$ srun --pty --mem=300GB --nodelist=med0403 bash -i\nmed0403:~$\n
After successful login, you can see that you are in the \"highmem\" queue:
med0403:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \n[...]\n 270 highmem bash holtgrem R 1:25 1 med0403 \n
"},{"location":"how-to/misc/contribute/","title":"How-To: Contribute to this Document","text":"Click on the edit link at the top of each page as shown below.
- Sign in to GitHub (or create a new account).
- Fork the repository and add your changes (more details: https://docs.github.com/en/github/getting-started-with-github/fork-a-repo ); a minimal command sketch is shown below.
- Open a pull request
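A minimal sketch of the fork-and-pull-request workflow on the command line (the repository and branch names below are placeholders):
host:~$ git clone https://github.com/YOUR_GITHUB_USER/YOUR_FORK.git\nhost:~$ cd YOUR_FORK\nhost:~$ git checkout -b fix-typo\nhost:~$ git add -u && git commit -m 'Fix typo in how-to'\nhost:~$ git push origin fix-typo\n
Afterwards, open the pull request from your fork on the GitHub website.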
"},{"location":"how-to/misc/debug-at-hpc/","title":"How-To: Debug Software on HPC Systems","text":"Please Contribute!
This guide is far from complete. Please feel free to contribute, e.g., refer to How-To: Contribute to this Document.
Please make sure that you have read How-To: Debug Software as a general primer.
As debugging is hard enough already, it makes one wonder how to do this on the HPC system in batch mode. Here is a list of pointers.
"},{"location":"how-to/misc/debug-at-hpc/#attempt-1-run-it-interactively","title":"Attempt 1: Run it interactively!","text":"First of all, you can of course get an interactive session using srun --pty bash -i
and then run your program interactively. Make sure to allocate appropriate memory and cores for your purpose. You might also want to first start a screen
or tmux
session on the login node such that network interruptions to the login node don't harm your hard debugging work!
Does the program work correctly if you do this? If yes, and it only fails when run in batch mode, consider the following behaviour of the scheduler.
The scheduler takes your resource requirements and tries to find a free slot. Once it has found a free slot, it will attempt to run the program. This mainly differs from running the program interactively in how the standard input, output, and error streams are handled.
- By default, stdin is connected to
/dev/null
such that no input is read. You can change this with the --input=
flag to specify a file. - By default stdout and stderr are joint and written to the file specified as
--output=
. You can use certain wildcards to make the output (but also the input files) depend on things like the job ID or job name. - Please note that the directory name to the output file but exist before the job is launched. It is not sufficient to
mkdir
it in the job script itself.
Please refer to the sbatch documentation for details.
If your program fails without leaving any log file or any other trace, make sure that the path to the output file exists. To the best of the author's knowledge, there is no way to tell apart a crash caused by a missing output directory from a genuine program failure (except maybe for the running time of 0 seconds and memory usage of 0 bytes).
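As a minimal sketch (the job script name and log directory are placeholders), create the log directory before submitting and use the %x/%j wildcards for job name and job ID in the output path:
host:~$ mkdir -p slurm_log\nhost:~$ sbatch --output=slurm_log/%x-%j.log my_job.sh\n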
"},{"location":"how-to/misc/debug-at-hpc/#attempt-2-inspect-the-logs","title":"Attempt 2: Inspect the logs","text":"Do you see any exception in your log files? If not, continue.
If your job is canceled by scancel
or stopped because it exhausted its maximal running time or allocated resources then you will find a note in the last line of your error output log (usually folded into the standard output). Please note that if the previous output line did not include a line ending, the message might be at the very end of the last line.
The message will look similar to:
slurmstepd: error: *** JOB <your job id> ON med0xxx CANCELLED AT 2020-09-02T21:01:12 DUE TO TIME LIMIT ***\n
"},{"location":"how-to/misc/debug-at-hpc/#attempt-3-increase-loggingprinting","title":"Attempt 3: Increase logging/printing","text":"Ideally, you can add one or more --verbose
/-v
flags to your program to increase verbosity. See how far your program gets, see where it fails. This attempt will be greatly helped by reproducible running on a minimal working example.
"},{"location":"how-to/misc/debug-at-hpc/#attempt-4-use-sattach","title":"Attempt 4: Use sattach
","text":"You can use sattach
for attaching your terminal to your running job. This way, you can perform an interactive inspection of the commands.
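A minimal sketch (the job ID is a placeholder and this assumes the job step you want to look at was started with srun inside the batch script):
hpc-login-1:~$ squeue -u $USER\nhpc-login-1:~$ sattach 1234567.0\n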
You can combine this with one of the next attempts of using debuggers to, e.g., get a pdb
debugger at an important position of your program. However, please note that pdb
and ipdb
will stop the program's execution if the standard input stream is at end of file (which /dev/null
is and this is used by default in sbatch
jobs).
"},{"location":"how-to/misc/debug-at-hpc/#attempt-5-inspect-program-activity","title":"Attempt 5: Inspect Program Activity","text":"Log into the node that your program runs on either using srun --pty --nodelist=NODE
or using ssh
. Please note that you should never perform computationally intensive things when logging into the node directly. You can then use all activity inspection tips from How-To: Debug Software.
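For example (the node name is just an example), hop onto the node and watch only your own processes:
hpc-login-1:~$ srun --pty --nodelist=hpc-cpu-1 bash -i\nhpc-cpu-1:~$ htop -u $USER\n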
"},{"location":"how-to/misc/debug-at-hpc/#attempt-6-use-debuggers","title":"Attempt 6: Use Debuggers","text":"After having logged into the node running your program, you can of course also attach to the program with gdb -p PID
or cgdb -p PID
.
"},{"location":"how-to/misc/debug-at-hpc/#dont-despair","title":"Don't Despair","text":"Here are some final remarks:
- Don't despair!
- The longer you search for the problem, the more fundamental it is. Chances are that you are just overlooking something obvious which is actually easy to fix.
- Keep old log files!
- Really, really, make sure that your program runs deterministically. You will save yourself a world of pain.
"},{"location":"how-to/misc/debug-software/","title":"How-To: Debug Software","text":"Please Contribute!
This guide is far from complete. Please feel free to contribute, e.g., refer to How-To: Contribute to this Document.
Software development in general or even debugging of software are very broad topics. As such, we will not be able to handle them here comprehensively. Rather, we will give a tour de force on practical and minimal approaches of debugging of software. Here, debugging refers to the process of locating errors in your program and removing them.
Origin of the term debugging
The terms \"bug\" and \"debugging\" are popularly attributed to Admiral Grace Hopper in the 1940s. While she was working on a Mark II computer at Harvard University, her associates discovered a moth stuck in a relay and thereby impeding operation, whereupon she remarked that they were \"debugging\" the system. However, the term \"bug\", in the sense of \"technical error\", dates back at least to 1878 and Thomas Edison (see software bug for a full discussion).
-- Wikipedia: Debugging
When forgetting for a moment about everything known about software engineering, programming roughly works in the following cycle:
You run your program. In the case of failure, you need to remove the problem until the program runs through. You then start implementing the next change or feature. But how do you actually locate the problem? Let us walk through a couple of steps.
"},{"location":"how-to/misc/debug-software/#step-1-find-out-that-there-is-an-error","title":"Step 1: Find out that there is an error","text":"This might seem trivial but let us think about this for a moment. For this
- you will have to run your program on some input and observe its behaviour and output,
- you will need to have an expectation of its behaviour and output, and
- observe unexpected behaviour, including but not limited to:
- the program crashes,
- the program produces wrong or corrupted output, or
- the program produces incomplete output.
You could make this step a bit more comfortable by writing a little checker script that compares expected and actual output.
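A minimal sketch of such a checker (the program name and file names are placeholders):
#!/bin/bash\n# run the program and compare its output to a known-good file\n./my_program input.txt > observed_output.txt\nif diff -u expected_output.txt observed_output.txt > /dev/null; then\n    echo OK: output matches expectation\nelse\n    echo MISMATCH: run diff expected_output.txt observed_output.txt to inspect\nfi\n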
"},{"location":"how-to/misc/debug-software/#step-2-reproduce-your-error","title":"Step 2: Reproduce your error","text":"You will have to find out how often or regularly the problem occurs. Does the problem occur on all inputs or only specific ones? Does it occur with all parameters? Make sure that you can reproduce the problem, otherwise the problem will be hard to track down.
Discard randomness
In most applications, true randomness is neither required nor used in programs. Rather, pseudo random number generators are used that are usually seeded with a special value. In many cases, the current time is used which makes it hard to reproduce problems. Rather, use a fixed seed, e.g., by calling srand(42)
in C. You could also make this a parameter of your program, but make sure that you can fix all pseudo randomness in your program so you can deterministically reproduce its behaviour.
"},{"location":"how-to/misc/debug-software/#step-3-create-a-minimal-working-example-mwe","title":"Step 3: Create a minimal working example (MWE)","text":"Try to find a minimal input set on which you can produce your problem. For example, you could use samtools view FILE.bam chr1:90,000-100,000
to cut out regions from a BAM file. The next step is to nail down the problem. Ideally, you can deactivate or comment out parts of your program that are irrelevant to the problem.
This will allow you to get to the problematic point in your program quicker and make the whole debugging exercise easier on yourself.
"},{"location":"how-to/misc/debug-software/#interlude-what-we-have-up-to-here","title":"Interlude: What we have up to here","text":"We can now
- tell expected and \"other\" behaviour and output apart (ideally semi-automatically),
- reproduce the problem,
- and reproduce the problem quickly.
If you reached the points above, you have probably cut the time to resolve the problem by 90% already.
Let us now consider a few things that you can do from here to find the source of your problems.
"},{"location":"how-to/misc/debug-software/#method-1-stare-at-your-source-code","title":"Method 1: Stare at your source code","text":"Again, this is trivial, but: look at your code and try to follow through what it does with your given input. This is nicely complemented with the following methods. ;-)
There is a class of tools to help you in doing this, so-called static code analysis tools. They analyze the source code for problematic patterns. The success and power of such analysis tools tends to corellate strongly with how strictly typed the targeted programming language is. E.g., there are very powerful tools for Java, C/C++. However, there is some useful tool support out there for dynamic languages such as Python.
Here is a short list of pointers to static code analysis tools (feel free to extend the list):
- Python Static Analysis Tools
"},{"location":"how-to/misc/debug-software/#method-2-inspect-your-codes-activity","title":"Method 2: Inspect your code's activity","text":""},{"location":"how-to/misc/debug-software/#print-it","title":"Print it!","text":"The most simple approach is to use print
statements (or similar) to print the current line or value of parameters. While sometimes frowned upon, this certainly is one of the most robust ways to see what is happening in your program. However, beware that too much output might slow down your program or actually make your problem disappear in the case of subtle threading/timing issues (sometimes referred to as \"Heisenbugs\").
Standard output vs. error
Classically, Linux/Unix programs can print back to the user's terminal in two ways: standard output and standard errors. By convention, logging should go to stderr. The standard error stream also has the advantage that writing to it has a more direct effect. In contrast to stdout which is usually setup to be (line) buffered (you will only see output after the next newline character), stderr is unbuffered.
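For example, when capturing a (hypothetical) program's output to files, you can keep the two streams apart or merge them:
host:~$ ./my_program > out.log 2> err.log    # stdout and stderr in separate files\nhost:~$ ./my_program > all.log 2>&1          # merge both streams into one file\n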
"},{"location":"how-to/misc/debug-software/#look-at-tophtop","title":"Look at top
/htop
","text":"The tools top
and htop
are useful tools for inspecting the activity on the current computer. The following parameters are useful (and are actually also available as key strokes when they are running).
-c
-- show the programs' command lines -u USER
-- show the processes of the user
You can exit either tool by pressing q
or Ctrl-C
.
Use the man
, Luke!
Besides searching the internet for a unix command, you can also read its manual page by running man TOOL
. If this does not work, try TOOL --help
to see its builtin help function. Also, doing an internet search for \"man tool\" might help.
"},{"location":"how-to/misc/debug-software/#look-at-strace","title":"Look at strace
","text":"The program strace
allows you to intercept the calls of your program to the kernel, which is needed for actions such as accessing the network or the file system. This is not so useful if your program gets stuck in \"user land\", but it is handy for seeing which files your program is accessing.
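A minimal sketch (program name, arguments, and PID are placeholders): the first call traces file-related system calls into a log file, the second attaches to an already running process by its PID.
host:~$ strace -f -e trace=file -o strace.log ./my_program input.txt\nhost:~$ strace -p 12345\n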
Pro-Tip: if you move the selection line of htop
to a process then you can strace the program by pressing s
.
"},{"location":"how-to/misc/debug-software/#look-at-lsof","title":"Look at lsof
","text":"The lsof
program lists all open files with the processes that are accessing them. This is useful for seeing which files your program has opened.
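For example (the PID is a placeholder):
host:~$ lsof -p 12345 | less    # files opened by process 12345\nhost:~$ lsof -u $USER | less    # files opened by all of your processes\n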
You can even build a progress bar with lsof, although that requires sudo
privileges which you might not have on the system that you are using.
Pro-Tip: if you move the selection line of htop
to a process then you can list the open files by pressing l
.
"},{"location":"how-to/misc/debug-software/#more-looking","title":"More looking","text":"There are more ways of inspecting your program, here are some:
- Google Perftools
- Linux
perf
"},{"location":"how-to/misc/debug-software/#interactive-debuggers","title":"Interactive Debuggers","text":"Let us now enter the world of interactive debuggers. Integrated development environment (IDEs) generally consist of an editor, a compiler/interpreter, and an ineractive/visual debugger. Usually, they have a debugger program at their core that can also be used on their command line.
"},{"location":"how-to/misc/debug-software/#old-but-gold-gdb","title":"Old but gold: gdb
","text":"On Unix systems, a widely used debugger is gdb
the GNU debugger. gdb
is a command line program and if you are not used to it, it might be hard to use. However, here are some pointers on how to use it:
The commands in interactive mode include:
quit
or Ctrl-D
to exit the debugger b file.ext:123
set breakpoint in file.ext
on line 123
r
run the program p var_name
print the value of the variable var_name
display var_name
print the value of the variable var_name
every time execution stops l
print the source code around the current line (multiple calls will show the next 10 lines or so, and so on) l 123
print lines around line 123
f
show information about the current frame (that is the current source location) bt
show the backtrace (that is all functions above the current one) n
step to the next line s
step into function calls finish
run the current function until it returns help
to get more help
You can call your program directly with command line arguments using cgdb [cgdb-args] --args path/to/program -- [program-args
.You can also attach to running programs using
cgdb -p PIDonce you have found out the process ID to attach to using
htopor
ps`.
Pro-tip: use cgdb
for an easier to use version that displays the source code in split screen and stores command line histories over sessions.
"},{"location":"how-to/misc/debug-software/#interactive-python-debuggers","title":"Interactive Python Debuggers","text":"You can get a simple REPL (read-execute-print loop) at virtually any position in your program by adding:
import pdb; pdb.set_trace()\n
You will get a prompt at the current position and can issue several commands including:
quit
or Ctrl-D
to exit the debugger p var_name
to print the variable with var_name
f
show information about the current frame (that is the current source location) bt
show the backtrace (that is all functions called above the current one) continue
to continue running help
to get more help
Pro-tip: use import ipdb; ipdb.set_trace()
(after installing the ipdb
package, of course) to get an IPython-based prompt that is much more comfortable to use.
"},{"location":"how-to/misc/debug-software/#pro-tip-version-control-your-code","title":"Pro-Tip: Version control your code!","text":"Here is a free bonus pro-tip: learn how to use version control, e.g., Git. This will allow you to go back to previous versions without problems and see current changes to your source code.
- 10 Free Online Git Courses
- Github: Resources to learn Git
"},{"location":"how-to/misc/debug-software/#pro-tip-write-automated-tests","title":"Pro-Tip: Write automated tests!","text":"Combine the pro tip on using version control (learn Git already!) with this one: learn how to write automated tests. This will allow you to quickly narrow down problematic changes in your version control history.
Again, testing is another topic altogether, so here are just some links to testing frameworks to get you started:
- pytest: testing framework for Python
- testthat: testing framework for R
"},{"location":"how-to/misc/debug-software/#reading-material-on-debuggers","title":"Reading Material on Debuggers","text":"The following web resources can serve as a starting point on how to use debuggers.
- Chapter Debugger from Wikibook: Introduction to Software Engineering
- The Python Debugger
- Debugging with GDB
"},{"location":"how-to/misc/hpc-talk/","title":"Accessing HPC Talk","text":"We provide a user forum using the Discourse software at
- https://hpc-talk.cubi.bihealth.org
"},{"location":"how-to/misc/hpc-talk/#log-in","title":"Log In!","text":"First of all, visit the website for the first time: https://hpc-talk.cubi.bihealth.org
You will then be directed to our Single-Sign-On Page.
Use the appropriate button for your host organisation (MDC / Charité), i.e., the one that your cluster account belongs to.
Then use the usual login credentials of your host organisation.
Clicked wrong organisation?
If you accidentally clicked the wrong institution then you need to clear your browser history up to the point where you clicked (e.g., for the last hour).
- Delete your Chrome browsing history
- How do I delete browsing history in firefox
"},{"location":"how-to/misc/hpc-talk/#first-steps","title":"First Steps","text":"You will be shown the following screen after the first login.
You can proceed with reading the notification or skip it. The site is mostly self-explanatory. Let us point you at a couple of interesting things for first steps.
Here you can setup your preferences
Use the \"New Topic\" button to create a new topic. Set a meaningful title, select a suitable category (we will update the list of categories over time), and write down your question or discussion item. Finally, click \"Create Topic\" to create the new topic.
You will be directed to the page with your new topic.
You can enable email notifications to receive emails if someone answers.
"},{"location":"how-to/misc/hpc-talk/#disabling-browser-notifications","title":"Disabling Browser Notifications","text":"In your settings, you will find an option to disable browser notifications in this browser.
Or you can use the do not disturb button.
"},{"location":"how-to/misc/hpc-talk/#closing-remarks","title":"Closing Remarks","text":"We established the HPC Talk forum as a self-help forum for users. Alas, there is a number of such sites out there already that are populated by more users.
How does HPC Talk fit in?
We think it is most useful for asking questions and discussing points that are directly related to the BIH HPC system.
What alternatives do I have?
For example:
- Stack Overflow for general programming questions, including Python/R programming
- Cross Validated for questions regarding statistics
- Unix & Linux Stack Exchange for discussing all sorts of Linux/Unix questions
- Super User for certain more advanced Unix topics
"},{"location":"how-to/service/file-exchange/","title":"How-To: Use File Exchange","text":"Obtaining File Boxes
At the moment, file boxes are only available to members of core facilities (e.g., genomics, bioinformatics, or metabolomics) for exchanging files for their collaboration partners. Currently, HPC users cannot use the file box mechanism on their own.
BIH HPC IT provides a file exchange server to be used by the BIH core facilities and their users. The server is located in the BIH DMZ in Buch. Users authenticate using their Charite/BIH (user@CHARITE
) or MDC accounts (user@MDC-BERLIN
). File exchange is organized using \"file boxes\", directories created on the server to which selected users are granted access. Access control list maintenance is done with audit-trails (\"Revisionssicherheit\") and the file access itself is also logged to comply with data protection standards.
- Jump to \"From Windows\"
- Jump to \"From Linux\"
- Jump to \"From Mac\"
Access from Charite Network
Access from the Charité network (IP ranges 141.x.x.x
and 10.x.x.x
) must go through the Charité proxy (http://proxy.charite.de:8080
). Depending on the client software that you are using, you might have to configure the proxy.
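Many command line clients (including lftp) read the usual proxy environment variables; if in doubt, consult your client's documentation. A minimal sketch:
host:~$ export http_proxy=http://proxy.charite.de:8080\nhost:~$ export https_proxy=http://proxy.charite.de:8080\n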
"},{"location":"how-to/service/file-exchange/#file-box-management","title":"File Box Management","text":"File boxes are created by the core facilities (e.g., the genomics facilities at Charite and MDC). The facility members also organize the access control. Please talk to your core facility contact on file exchange.
External users must obtain a Charite or MDC account first. Account creation is handled by the core facilities that the external user is a customer of.
"},{"location":"how-to/service/file-exchange/#file-access","title":"File Access","text":"Generally, you will be given a URL to your file box similar to https://file-exchange.bihealth.org/<file-box-id>/
. The files are served over an encrypted connection using WebDAV (which uses HTTPS).
The following describes how to access the files in the box from different platforms.
"},{"location":"how-to/service/file-exchange/#from-linux","title":"From Linux","text":"We describe how to access the files on the command line using the lftp
program. The program is preinstalled on the BIH (and the MDC cluster) and you should be able to just install it with yum install lftp
on CentOS/Red Hat or apt-get install lftp
on Ubuntu/Debian.
When using lftp
, you have to add some configuration first:
# cat >>~/.lftprc <<\"EOF\"\nset ssl:verify-certificate no\nset ftp:ssl-force yes\nEOF\n
In case that you want to access the files using a graphical user interface, search Google for \"WebDAV\" and your operating system or desktop environment. File browsers such as Nautilus and Thunar have built-in WebDAV support.
"},{"location":"how-to/service/file-exchange/#connecting","title":"Connecting","text":"First, log into the machine that has lftp
installed. The login nodes of the BIH cluster do not have it installed but all compute and file transfer nodes have it. Go to the data download location.
host:~$ mkdir -p ~/scratch/download_dir\nhost:~$ cd ~/scratch/download_dir\n
Next, start lftp
. You can open the connection using open -u <user>@<DOMAIN> https://file-exchange.bihealth.org/<file-box-id>/
(NB: there is a trailing slash) where
<user>
is your user name, e.g., holtgrem
, <domain>
is either MDC-BERLIN
or CHARITE
, and <file-box-id>
the file box ID from the URL provided to you.
When prompted, use your normal Charite/MDC password to login.
host:download_dir$ lftp\nlftp :~> open -u holtgrem@CHARITE https://file-exchange.bihealth.org/c62910b3-c1ba-49a5-81a6-a68f1f15aef6\nPassword:\ncd ok, cwd=/c62910b3-c1ba-49a5-81a6-a68f1f15aef6\nlftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6>\n
"},{"location":"how-to/service/file-exchange/#browsing-data","title":"Browsing Data","text":"You can find a full reference of lftp
on the lftp man page. You could also use help COMMAND
on the lftp prompt. For example, to look at the files of the server for a bit...
lftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> ls\ndrwxr-xr-x -- /\ndrwxr-xr-x -- dir\n-rw-r--r-- -- file1\nlftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> find\n./\n./dir/\n./dir/file2\n./file1\n
"},{"location":"how-to/service/file-exchange/#downloading-data","title":"Downloading Data","text":"To download all data use mirror
, e.g. with -P 4
to use four download threads.
lftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> mirror .\nTotal: 2 directories, 3 files, 0 symlinks\nNew: 3 files, 0 symlinks\nlftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> exit\nhost:download_dir$ tree\n.\n\u251c\u2500\u2500 dir\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 file2\n\u251c\u2500\u2500 file1\n\u2514\u2500\u2500 file.txt\n\n1 directory, 3 files\n
Ignoring gnutls_record_recv
errors.
A common error to see is mirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.
. You can just ignore this.
"},{"location":"how-to/service/file-exchange/#uploading-data","title":"Uploading Data","text":"To upload data, you can use mirror -R .
which is essentially the \"reverse\" of the mirror command.
lftp holtgrem@CHARITE@file-exchange.bihealth.org:/c62910b3-c1ba-49a5-81a6-a68f1f15aef6> mirror -R\nmirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.\nmirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.\nmirror: Fatal error: gnutls_record_recv: The TLS connection was non-properly terminated.\nTotal: 2 directories, 3 files, 0 symlinks\nModified: 3 files, 0 symlinks\n4 errors detected\n
"},{"location":"how-to/service/file-exchange/#from-windows","title":"From Windows","text":"We recommend to use WinSCP for file transfer.
- Pre-packaged WinSCP on Charite Workstations. Charite IT has packaged WinSCP and you can install it using Matrix24 Empirum on Windows 10 using these instructions in the Charite intranet.
- Installing WinSCP yourself. You can obtain it from the WinSCP Download Page. A \"portable\" version is available that comes as a ZIP archive that you just have to extract without an installer.
"},{"location":"how-to/service/file-exchange/#connecting_1","title":"Connecting","text":"After starting WinSCP, you will see a window titled Login
. Just paste the URL (e.g., https://file-exchange.bihealth.org/c62910b3-c1ba-49a5-81a6-a68f1f15aef6/
) of the file box into the Host name
entry field. In this case, the fields File protocol
etc. will be filled automatically. Next, enter your user name as user@CHARITE
or user@MDC-BERLIN
(the capitalization of the part behind the @
is important). The window should now look similar to the one below.
Proxy Configuration on Charite Network
If you are on the Charite network then you have to configure the proxy. Otherwise, you have to skip this step.
Click Advanced
and a window titled Advanced Site Settings
will pop up. Here, select Connection / Proxy
in the left side. Select HTTP
for the Proxy type
. Then, enter proxy.charite.de
as the Proxy host name
and set the Port number
to 8080
. The window should nwo look as below. Then, click OK
to apply the proxy settings.
Finally, click Login
. You can now transfer files between the file exchange server and your local computer using drag and drop between WinSCP and your local Windows File Explorer. Alternatively, you can use the two-panel view of WinSCP to transfer files as described here.
"},{"location":"how-to/service/file-exchange/#from-mac","title":"From Mac","text":"For Mac, we you can also use lftp
as described above in From Linux. You can find install instructions here online.
Proxy Configuration on Charite Network
If you are on the Charite network then you must have configured the proxy appropriately. Otherwise, you have to skip this step.
You can find them in your System Preference
in the Network
section, in the Advanced
tab of your network (e.g., WiFi
).
If you want to use a graphical interface then we recommend the usage of Cyberduck. After starting the program, click Open Connection
on the top left, then select WebDAV (HTTPS)
and fill out the form as in the following way. Paste the file box URL into the server field and use your login name (user@CHARITE
or user@MDC-BERLIN
) with your usual password.
If you need to perform access through a graphical user interface on your Mac, please contact hpc-helpdesk@bihealth.org for support.
"},{"location":"how-to/service/file-exchange/#security","title":"Security","text":"The file exchange server has the fail2ban
software installed and configured (Charite, MDC, and BIH IPs are excluded from this).
If you enter your user/password incorrectly more than 5 times within 10 minutes then your machine will be banned for one hour. This means that someone else who appears to the file exchange server with the same IP address can get you blocked. This can happen if you are in the same home or university network with NAT or if you are behind a proxy. In this case you get a \"connection refused\" error; just try again in one hour.
"},{"location":"how-to/software/apptainer/","title":"Using Apptainer (with Docker Images)","text":"Note
Singularity is now Apptainer! While Apptainer provides an singularity
alias for backwards compatibility, it is recommanded to adapt all workflows to use the new binary apptainer
.
Apptainer (https://apptainer.org/) is a popular alternative to docker, because it does not require to run as a privileged user. Apptainer can run Docker images out-of-the-box by converting them to the apptainer image format. The following guide gives a quick dive into using docker images with apptainer.
Build on your workstation, run on the HPC
Building images using Apptainer requires root privileges. We cannot give you these permissions on the BIH HPC. Thus, you will have to build the images on your local workstation (or anywhere where you have root access). You can then run the built images on the BIH HPC.
This is also true for the --writeable
flag. Apparently it needs root permissions which you don't have on the cluster.
"},{"location":"how-to/software/apptainer/#quickstart","title":"Quickstart","text":"Link ~/.apptainer to ~/work/.apptainer
Because you only have a quota of 1 GB in your home directory, you should symlink ~/.apptainer
to ~/work/.apptainer
.
host:~$ mkdir -p ~/work/.apptainer && ln -sr ~/work/.apptainer ~/.apptainer\n
In case you already have a apptainer directory:
host:~$ mv ~/.apptainer ~/work/.apptainer && ln -sr ~/work/.apptainer ~/.apptainer\n
Run a bash in a docker image:
host:~$ apptainer shell docker://godlovedc/lolcow\n
Run a command in a docker image:
host:~$ apptainer exec docker://godlovedc/lolcow echo \"hello, hello!\"\n
Run a bash in a docker image, enable access to the cuda driver (--nv) and mount a path (--bind or -B):
host:~$ apptainer shell --nv --bind /path_on_host/:/path_inside_container/ docker://godlovedc/lolcow\n
"},{"location":"how-to/software/apptainer/#some-caveats-and-notes","title":"Some Caveats and Notes","text":"Caveats
- The default apptainer images format (.sif) is read-only.
- By default apptainer mounts /home/$USER, /tmp, and $PWD in the container.
Notes
- Environment variables can be provided by setting them in the bash and adding the prefix
APPTAINERENV_
: host:~$ APPTAINERENV_HELLO=123 apptainer shell docker://godlovedc/lolcow echo $HELLO\n
- Calling
apptainer shell
or apptainer exec
uses as cwd the host callers cwd not the one set in the Dockerfile. One can change this by setting --pwd
.
"},{"location":"how-to/software/apptainer/#referencingproviding-docker-images","title":"Referencing/Providing Docker Images","text":""},{"location":"how-to/software/apptainer/#option-1-use-docker-images-via-docker-hub","title":"Option 1: Use Docker Images via Docker Hub","text":"The easiest variant to run a docker image available via a docker hub is by specifying its url. This causes apptainer to download the image and convert it to a apptainer image:
host:~$ apptainer run docker://godlovedc/lolcow\n
or to open a shell inside the image
host:~$ apptainer shell docker://godlovedc/lolcow\n
Furthermore, similar to docker, one can pull (and convert) remote image with the following call:
host:~$ apptainer pull docker://godlovedc/lolcow\n
In case your registry requires authentication you can provide it via a prompt by adding the option --docker-login
:
host:~$ apptainer pull --docker-login docker://ilumb/mylolcow\n
or by setting the following environment variables:
host:~$ export APPTAINER_DOCKER_USERNAME=ilumb\nhost:~$ export APPTAINER_DOCKER_PASSWORD=<redacted>\nhost:~$ apptainer pull docker://ilumb/mylolcow\n
More details can be found in the Apptainer documentation.
"},{"location":"how-to/software/apptainer/#option-2-converting-docker-images","title":"Option 2: Converting Docker Images","text":"Another option is to convert your docker image into the Apptainer/Singularity image format. This can be easily done using the docker images provided by docker2singularity.
To convert the docker image docker_image_name
to the apptainer image apptainer_image_name
one can use the following command line. The output image will be located in output_directory_for_images
.
host:~$ docker run -v /var/run/docker.sock:/var/run/docker.sock -v /output_directory_for_images/:/output --privileged -t --rm quay.io/singularity/docker2singularity --name apptainer_image_name docker_image_name\n
The resulting image can then directly be used as image:
host:~$ apptainer exec apptainer_image_name.sif bash\n
"},{"location":"how-to/software/apptainer/#conversion-compatibility","title":"Conversion Compatibility","text":"Here are some tips for making Docker images compatible with Apptainer taken from docker2singulrity:
- Define all environmental variables using the ENV instruction set. Do not rely on
~/.bashrc
, ~/.profile
, etc. - Define an
ENTRYPOINT
instruction set pointing to the command line interface to your pipeline. - Do not define
CMD
- rely only on ENTRYPOINT
. - You can interactively test the software inside the container by overriding the
ENTRYPOINT docker run -i -t --entrypoint /bin/bash bids/example
. - Do not rely on being able to write anywhere other than the home folder and /scratch. Make sure your container runs with the
--read-only --tmpfs /run --tmpfs /tmp parameters
(this emulates the read-only behavior of Apptainer). - Don't rely on having elevated user permissions.
- Don't use the
USER
instruction set.
"},{"location":"how-to/software/cell-ranger/","title":"How-To: Run CellRanger","text":""},{"location":"how-to/software/cell-ranger/#what-is-cell-ranger","title":"what is Cell Ranger?","text":"from the official website: \"Cell Ranger is a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate feature-barcode matrices and perform clustering and gene expression analysis\"
"},{"location":"how-to/software/cell-ranger/#installation","title":"installation","text":"requires registration before download from here
to unpack Cell Ranger, its dependencies and the cellranger
script:
cd /data/cephfs-1/home/users/$USER/work\nmv /path/to/cellranger-3.0.2.tar.gz .\ntar -xzvf cellranger-3.0.2.tar.gz\n
"},{"location":"how-to/software/cell-ranger/#reference-data","title":"reference data","text":"will be provided in /data/cephfs-1/work/projects/cubit/current/static_data/app_support/cellranger
"},{"location":"how-to/software/cell-ranger/#cluster-support-slurm","title":"cluster support SLURM","text":"add a file slurm.template
to /data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template
with the following contents:
#!/usr/bin/env bash\n#\n# Copyright (c) 2016 10x Genomics, Inc. All rights reserved.\n#\n# =============================================================================\n# Setup Instructions\n# =============================================================================\n#\n# 1. Add any other necessary Slurm arguments such as partition (-p) or account\n# (-A). If your system requires a walltime (-t), 24 hours (24:00:00) is\n# sufficient. We recommend you do not remove any arguments below or Martian\n# may not run properly.\n#\n# 2. Change filename of slurm.template.example to slurm.template.\n#\n# =============================================================================\n# Template\n# =============================================================================\n#\n#SBATCH -J __MRO_JOB_NAME__\n#SBATCH --export=ALL\n#SBATCH --nodes=1 --ntasks-per-node=__MRO_THREADS__\n#SBATCH --signal=2\n#SBATCH --no-requeue\n#SBATCH --partition=medium\n#SBATCH --time=24:00:00\n### Alternatively: --ntasks=1 --cpus-per-task=__MRO_THREADS__\n### Consult with your cluster administrators to find the combination that\n### works best for single-node, multi-threaded applications on your system.\n#SBATCH --mem=__MRO_MEM_GB__G\n#SBATCH -o __MRO_STDOUT__\n#SBATCH -e __MRO_STDERR__\n\n__MRO_CMD__\n
note: on newer cellranger version, slurm.template
needs to go to /data/cephfs-1/home/users/$USER/work/cellranger-XX/external/martian/jobmanagers/
"},{"location":"how-to/software/cell-ranger/#demultiplexing","title":"demultiplexing","text":"if that hasn't been done yet, you can use cellranger mkfastq
(details to be added)
"},{"location":"how-to/software/cell-ranger/#run-the-pipeline-count","title":"run the pipeline (count
)","text":"create a script run_cellranger.sh
with these contents (consult the documentation for help:
#!/bin/bash\n\n/data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/cellranger count \\\n --id=sample_id \\\n --transcriptome=/data/cephfs-1/work/projects/cubit/current/static_data/app_support/cellranger/refdata-cellranger-${species}-3.0.0\\\n --fastqs=/path/to/fastqs \\\n --sample=sample_name \\\n --expect-cells=n_cells \\\n --jobmode=slurm \\\n --maxjobs=100 \\\n --jobinterval=1000\n
and then submit the job via
sbatch --ntasks=1 --mem-per-cpu=4G --time=8:00:00 -p medium -o cellranger.log run_cellranger.sh\n
"},{"location":"how-to/software/cell-ranger/#cluster-support-sge-outdated","title":"cluster support SGE (outdated)","text":"add a file sge.template
to /data/cephfs-1/home/users/$USER/work/cellranger-3.0.2/martian-cs/v3.2.0/jobmanagers/sge.template
with the following contents:
# =============================================================================\n# Template\n# =============================================================================\n#\n#$ -N __MRO_JOB_NAME__\n#$ -V\n#$ -pe smp __MRO_THREADS__\n#$ -cwd\n#$ -P medium\n#$ -o __MRO_STDOUT__\n#$ -e __MRO_STDERR__\n#$ -l h_vmem=__MRO_MEM_GB_PER_THREAD__G\n#$ -l h_rt=08:00:00\n\n#$ -m a\n#$ -M user@email.com\n\n__MRO_CMD__\n
and submit the job via
qsub -cwd -V -pe smp 1 -l h_vmem=8G -l h_rt=24:00:00 -P medium -m a -j y run_cellranger.sh\n
"},{"location":"how-to/software/jupyter/","title":"How-To: Run Jupyter","text":"SSH Tunnels Considered Harmful
Please use our Open OnDemand Portal for running Jupyter notebooks!
The information below is still accurate. However, many users find it tricky to get SSH tunnels working correctly. A considerable number of parts is involved and you have to get each step 100% correct. Helpdesk cannot support you in problems with SSH tunnels that are caused by incorrect usage.
"},{"location":"how-to/software/jupyter/#what-is-jupyter","title":"What is Jupyter","text":"Project Jupyter is a networking protocol for interactive computing that allows the user to write and execute code for a high number of different programming languages. The most used client is Jupyter Notebook that can be encountered in various form all over the web. Its basic principle is a document consisting of different cells, each of which contains either code (executed in place) or documentation (written in markdown). This allows one to handily describe the processed workflow.
"},{"location":"how-to/software/jupyter/#setup-and-running-jupyter-on-the-cluster","title":"Setup and running Jupyter on the cluster","text":"Install Jupyter on the cluster (via conda, by creating a custom environment)
hpc-cpu-x:~$ conda create -n jupyter jupyter\nhpc-cpu-x:~$ conda activate jupyter\n
(If you want to work in a language other than python, you can install more Jupyter language kernel, see the kernel list)
Now you can start the Jupyter server session (you may want to do this in a screen
& srun --pty bash -i
session as jupyter keeps running while you are doing computations)
hpc-cpu-x:~$ jupyter notebook --no-browser\n
Check the port number (usually 8888
) in the on output and remember it for later:
[I 23:39:40.860 NotebookApp] The Jupyter Notebook is running at:\n[I 23:39:40.860 NotebookApp] http://localhost:8888/\n
By default, Jupyter will create an access token (a link stated in the output) to protect your notebook against unauthorized access which you have to save and enter in the accessing browser. You can change this to password base authorization via jupyter notebook password
. If you are running multiple server on one or more nodes, one can separate them by changing the port number by adding --port=$PORT
.
"},{"location":"how-to/software/jupyter/#connecting-to-the-running-session","title":"Connecting to the Running Session","text":"This is slightly trickier as we have to create a SSH connection/tunnel with potentially multiple hops in between. The easiest way is probably to configure your .ssh/config
to automatically route your connection via the login node (and possibly MDC jail). This is described in our Advanced SSH config documentation
In short,add these lines to ~/.ssh/config
(replace curly parts):
Host bihcluster\n user {USER_NAME}\n HostName hpc-login-2.cubi.bihealth.org\n\nHost hpc-cpu*\n user {USER_NAME}\n ProxyJump bihcluster\n
For MDC users outside the MDC network:
Host mdcjail\n HostName ssh1.mdc-berlin.de\n User {MDC_USER_NAME}\n\nHost bihcluster\n user {USER_NAME}\n HostName hpc-login-2.cubi.bihealth.org\n\nHost hpc-cpu*\n user {USER_NAME}\n ProxyJump bihcluster\n
Check that this config is working by connecting like this: ssh hpc-cpu-1
. Please note that you cannot use any resources on this node without a valid Slurm session.
Now you setup a tunnel for your running Jupyter session:
workstation:~$ ssh -N -f -L 127.0.0.1:8888:localhost:{PORT} hpc-cpu-x\n
The port of your Jupyter server is usually 8888
. The cluster node srun
has sent you to determines the last argument. You should now be able to connect to your Jupyter server by typing localhost:8888
in your webbrowser (see the note about token and password above).
"},{"location":"how-to/software/jupyter/#losing-connection","title":"Losing connection","text":"It can and will happen that will lose connection, either due to network problems or due to shut-down of your computer. This is not a problem at all and you will not lose data, just reconnect to your session. If your notebooks are also losing connection (you will see a colorful remark in the top right corner), reconnect and click the colorful button. If this does not work, your work is still not lost as all cells that have been executed are automatically saved anyways. Copy all unexecuted cells (those are only saved periodically) and reload the browser page (after reconnecting) with F5
. (you can also open a copy of the notebook in another tab, just be aware that there may be synchronisation problems)
"},{"location":"how-to/software/jupyter/#ending-a-session","title":"Ending a Session","text":"There are two independent steps in ending a session:
Canceling the SSH tunnel
- Identify the running SSH process
hpc-cpu-x:~$ ps aux | grep \"$PORT\"\n
This will give you something like this:
user 54 0.0 0.0 43104 784 ? Ss 15:06 0:00 ssh -N -f -L 127.0.0.1:8888:localhost:8888 hpc-cpu-x\nuser 58 0.0 0.0 41116 1024 tty1 S 15:42 0:00 grep --color=auto 8888\n
from which you need the process ID (here 54
)
- Terminate it the process
hpc-cpu-x:~$ kill -9 $PID\n
Shutdown the Jupyter server
Open the Jupyter session, cancel the process with {Ctrl} + {C} and confirm {y}. Make sure you saved your notebooks beforehand (though auto-save catches most things).
"},{"location":"how-to/software/jupyter/#advanced","title":"Advanced","text":" - List of available Jupyter Kernels for different programming languages
- Jupyterlab is a further development in the Jupyter ecosystem that creates a display similar to RStudio with panels for the current file system and different notebooks in different tabs.
- One can install Jupyter kernels or python packages while running a server or notebook without restrictions
If anyone has figured out, the following might also be interesting (please add):
- create a Jupyter-Hub
- multi-user support
"},{"location":"how-to/software/keras/","title":"How-To: Run Keras (Multi-GPU)","text":"Because the GPU nodes med030[1-4]
has four GPU units we can train a model by using multiple GPUs in parallel. This How-To gives an example with Keras 2.2.4 together and tensorflow. Finally soem hints how you can submit a job on the cluster.
Hint
With tensorflow > 2.0 and newer keras version the multi_gpu_model
is deprecated and you have to use the MirroredStrategy
.
"},{"location":"how-to/software/keras/#keras-code","title":"Keras code","text":"we need to import the multi_gpu_model
model from keras.utils
and have to pass our actual model (maybe sequential Keras model) into it. In general Keras automatically configures the number of available nodes (gpus=None
). This seems not to work on our system. So we have to specify the numer of GPUs, e.g. two with gpus=2
. We put this in a try catch environment that it will also work on CPUs.
from keras.utils import multi_gpu_model\n\ntry: \n model = multi_gpu_model(model, gpus=2) \nexcept:\n pass\n
That's it!
Please read here on how to submit jobs to the GPU nodes.
"},{"location":"how-to/software/keras/#conda-environment","title":"Conda environment","text":"All this was tested with the following conda environment:
name: cuda channels: \n- conda-forge\n- bioconda\n- defaults\ndependencies:\n- keras=2.2.4\n- python=3.6.7\n- tensorboard=1.12.0\n- tensorflow=1.12.0\n- tensorflow-base=1.12.0\n- tensorflow-gpu=1.12.0\n
"},{"location":"how-to/software/matlab/","title":"How-To: Use Matlab","text":"Note
This information is outdated and will soon be removed.
GNU Octave as Matlab alternative
Note that GNU Octave is an Open Source alternative to Matlab. While both packages are not 100% compatible, Octave is an alternative that does not require any license management. Further, you can easily install it yourself using Conda.
Want to use the Matlab GUI?
Make sure you understand X forwarding as outline in this FAQ entry.
You can also use Open OnDemand Portal to run Matlab.
"},{"location":"how-to/software/matlab/#pre-requisites","title":"Pre-requisites","text":"You have to register with hpc-helpdesk@bih-charite.de for requesting access to the Matlab licenses. Afterwards, you can connect to the High-Memory using the license_matlab_r2016b
resource (see below).
"},{"location":"how-to/software/matlab/#how-to-use","title":"How-To Use","text":"BIH has a license of Matlab R2016b for 16 seats and various licensed packages (see below). To display the available licenses:
hpc-login-1:~$ scontrol show lic\nLicenseName=matlab_r2016b\n Total=16 Used=0 Free=16 Remote=no\n
Matlab is installed on all of the compute nodes:
# The following is VITAL so the scheduler allocates a license to your session.\nhpc-login-1:~$ srun -L matlab_r2016b:1 --pty bash -i\nmed0127:~$ scontrol show lic\nLicenseName=matlab_r2016b\n Total=16 Used=1 Free=15 Remote=no\nmed0127:~$ module avail\n----------------- /usr/share/Modules/modulefiles -----------------\ndot module-info null\nmodule-git modules use.own\n\n----------------------- /opt/local/modules -----------------------\ncmake/3.11.0-0 llvm/6.0.0-0 openmpi/3.1.0-0\ngcc/7.2.0-0 matlab/r2016b-0\nmed0127:~$ module load matlab/r2016b-0\nStart matlab without GUI: matlab -nosplash -nodisplay -nojvm\n Start matlab with GUI (requires X forwarding (ssh -X)): matlab\nmed0127:~$ matlab -nosplash -nodisplay -nojvm\n < M A T L A B (R) >\n Copyright 1984-2016 The MathWorks, Inc.\n R2016b (9.1.0.441655) 64-bit (glnxa64)\n September 7, 2016\n\n\nFor online documentation, see http://www.mathworks.com/support\nFor product information, visit www.mathworks.com.\n\n\n Non-Degree Granting Education License -- for use at non-degree granting, nonprofit,\n educational organizations only. Not for government, commercial, or other organizational use.\n\n>> ver\n--------------------------------------------------------------------------------------------\nMATLAB Version: 9.1.0.441655 (R2016b)\nMATLAB License Number: 1108905\nOperating System: Linux 3.10.0-862.3.2.el7.x86_64 #1 SMP Mon May 21 23:36:36 UTC 2018 x86_64\nJava Version: Java is not enabled\n--------------------------------------------------------------------------------------------\nMATLAB Version 9.1 (R2016b)\nBioinformatics Toolbox Version 4.7 (R2016b)\nGlobal Optimization Toolbox Version 3.4.1 (R2016b)\nImage Processing Toolbox Version 9.5 (R2016b)\nOptimization Toolbox Version 7.5 (R2016b)\nParallel Computing Toolbox Version 6.9 (R2016b)\nPartial Differential Equation Toolbox Version 2.3 (R2016b)\nSignal Processing Toolbox Version 7.3 (R2016b)\nSimBiology Version 5.5 (R2016b)\nStatistics and Machine Learning Toolbox Version 11.0 (R2016b)\nWavelet Toolbox Version 4.17 (R2016b)\n>> exit\n
"},{"location":"how-to/software/matlab/#running-matlab-ui","title":"Running MATLAB UI","text":"For starting the Matlab with GUI, make sure that your client is running a X11 server and you connect with X11 forwarding enabled (e.g., ssh -X hpc-login-1.cubi.bihealth.org
from the Linux command line). Then, make sure to use srun -L matlab_r2016b:1 --pty --x11 bash -i
for connecting to a node with X11 forwarding enabled.
client:~$ ssh -X hpc-login-1.cubi.bihealth.org\n[...]\nhpc-login-1:~ $ srun -L matlab_r2016b:1 --pty --x11 bash -i\n[...]\nmed0203:~$ module load matlab/r2016b-0\nStart matlab without GUI: matlab -nosplash -nodisplay -nojvm\n Start matlab with GUI (requires X forwarding (ssh -X)): matlab\nmed0203:~$ matlab\n[UI will start]\n
For forcing starting in text mode can be done (as said after module load
): matlab -nosplash -nodisplay -nojvm
.
Also see this FAQ entry.
"},{"location":"how-to/software/matlab/#see-available-matlab-licenses","title":"See Available Matlab Licenses","text":"You can use scontrol show lic
to see the currently available MATLAB license. E.g., here I am running an interactive shell in which I have requested 1 of the 16 MATLAB licenses, so 15 more remain.
$ scontrol show lic\nLicenseName=matlab_r2016b\n Total=16 Used=1 Free=15 Remote=no\n
"},{"location":"how-to/software/matlab/#a-working-example","title":"A Working Example","text":"Get a checkout of our MATLAB example. Then, look around at the contents of this repository.
hpc-login-1:~$ git clone https://github.com/bihealth/bih-cluster-matlab-example.git\nhpc-login-1:~$ cd bih-cluster-matlab-example\nhpc-login-1:~$ cat job_script.sh\n#!/bin/bash\n\n# Logging goes to directory sge_log\n#SBATCH -o slurm_log/%x-%J.log\n# Keep current environment variables\n#SBATCH --export=ALL\n# Name of the script\n#SBATCH --job-name MATLAB-example\n# Allocate 4GB of RAM per core\n#SBATCH --mem 4G\n# Maximal running time of 2 hours\n#SBATCH --time 02:00:00\n# Allocate one Matlab license\n#SBATCH -L matlab_r2016b:1\n\nmodule load matlab/r2016b-0\n\nmatlab -r example\n$ cat example.m\n% Example Hello World script for Matlab.\n\ndisp('Hello world!')\ndisp('Thinking...')\n\npause(10)\n\ndisp(sprintf('The square root of 2 is = %f', sqrt(2)))\nexit\n
For submitting the script, you can do the following
hpc-login-1:~$ sbatch job_script.sh\n
This will submit a job with one Matlab license requested. If you were to submit 17 of these jobs, then at least one of them would have to wait until all licenses are free.
Matlab License Server
Note that there is a Matlab license server running on the server that will check whether 16 or less Matlab sessions are currently running. If a Matlab session is running but this is not made known to the scheduler via -L matlab_r2016b
then this can lead to scripts crashing as not enough licenses are available. If this happens to you, double-check that you have specified the license requirements correctly and notify hpc-helpdesk@bih-charite.de in case of any problems. We will try to sort out the situation then.
"},{"location":"how-to/software/openmpi/","title":"How-To: Build and Run OpenMPI Program","text":"This article describes how to build an run an OpenMPI program. We will build a simple C program that uses the OpenMPI message passing interface and run it in parallel. You should be able to go from here with other languages and more complex programs. We will use a simple Makefile for building the software.
"},{"location":"how-to/software/openmpi/#loading-openmpi-environment","title":"Loading OpenMPI Environment","text":"First, load the OpenMPI package.
hpc-login-1:~$ srun --pty bash -i\nmed0127:~$ module load openmpi/4.3.0-0\n
Then, check that the installation works
med0127:~$ ompi_info | head\n Package: Open MPI root@med0127 Distribution\n Open MPI: 4.0.3\n Open MPI repo revision: v4.0.3\n Open MPI release date: Mar 03, 2020\n Open RTE: 4.0.3\n Open RTE repo revision: v4.0.3\n Open RTE release date: Mar 03, 2020\n OPAL: 4.0.3\n OPAL repo revision: v4.0.3\n OPAL release date: Mar 03, 2020\n
"},{"location":"how-to/software/openmpi/#building-the-example","title":"Building the example","text":"Next, clone the OpenMPI example project from Gitlab.
med0127:~$ git clone git@github.com:bihealth/bih-cluster-openmpi-example.git\nmed0127:~$ cd bih-cluster-openmpi-example/src\n
Makefile
.PHONY: default clean\n\n# configure compilers\nCC=mpicc\nCXX=mpicxx\n# configure flags\nCCFLAGS += $(shell mpicc --showme:compile)\nLDFLAGS += $(shell mpicc --showme:link)\n\ndefault: openmpi_example\n\nopenmpi_example: openmpi_example.o\n\nclean:\n rm -f openmpi_example.o openmpi_example\n
openmpi_example.c
#include <stdio.h>\n#include <mpi.h>\n\nint main(int argc, char** argv) {\n // Initialize the MPI environment\n MPI_Init(NULL, NULL);\n\n // Get the number of processes\n int world_size;\n MPI_Comm_size(MPI_COMM_WORLD, &world_size);\n\n // Get the rank of the process\n int world_rank;\n MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);\n\n // Get the name of the processor\n char processor_name[MPI_MAX_PROCESSOR_NAME];\n int name_len;\n MPI_Get_processor_name(processor_name, &name_len);\n\n // Print off a hello world message\n printf(\"Hello world from processor %s, rank %d\"\n \" out of %d processors\\n\",\n processor_name, world_rank, world_size);\n\n // Finalize the MPI environment.\n MPI_Finalize();\n\n return 0;\n}\n
run_mpi.sh
#!/bin/bash\n\n# Example job script for (single-threaded) MPI programs.\n\n# Generic arguments\n\n# Job name\n#SBATCH --job-name openmpi_example\n# Maximal running time of 10 min\n#SBATCH --time 00:10:00\n# Allocate 1GB of memory per node\n#SBATCH --mem 1G\n# Write logs to directory \"slurm_log\"\n#SBATCH -o slurm_log/slurm-%x-%J.log\n\n# MPI-specific parameters\n\n# Run 64 tasks (threads/on virtual cores)\n#SBATCH --nodes 64\n\n# Make sure to source the profile.d file (not available on head nodes).\n/etc/profile.d/modules.sh\n\n# Load the OpenMPI environment module to get the runtime environment.\nmodule load openmpi/3.1.0-0\n\n# Launch the program.\nmpirun -np 64 ./openmpi_example\n
The next step is building the software
med0127:~$ make\nmpicc -c -o openmpi_example.o openmpi_example.c\nmpicc -pthread -Wl,-rpath -Wl,/opt/local/openmpi-4.0.3-0/lib -Wl,--enable-new-dtags -L/opt/local/openmpi-4.0.3-0/lib -lmpi openmpi_example.o -o openmpi_example\nmed0127:~$ ls -lh\ntotal 259K\n-rw-rw---- 1 holtgrem_c hpc-ag-cubi 287 Apr 7 23:29 Makefile\n-rwxrwx--- 1 holtgrem_c hpc-ag-cubi 8.5K Apr 8 00:15 openmpi_example\n-rw-rw---- 1 holtgrem_c hpc-ag-cubi 760 Apr 7 23:29 openmpi_example.c\n-rw-rw---- 1 holtgrem_c hpc-ag-cubi 2.1K Apr 8 00:15 openmpi_example.o\n-rwxrwx--- 1 holtgrem_c hpc-ag-cubi 1.3K Apr 7 23:29 run_hybrid.sh\n-rwxrwx--- 1 holtgrem_c hpc-ag-cubi 663 Apr 7 23:35 run_mpi.sh\ndrwxrwx--- 2 holtgrem_c hpc-ag-cubi 4.0K Apr 7 23:29 sge_log\n
The software will run outside of the MPI environment -- but in a single process only, of course.
med0127:~$ ./openmpi_example\nHello world from processor med0127, rank 0 out of 1 processors\n
"},{"location":"how-to/software/openmpi/#running-openmpi-software","title":"Running OpenMPI Software","text":"All of the arguments are already in the run_mpi.sh
script.
med0124:~$ sbatch run_mpi.sh\n
Explanation of the OpenMPI-specific arguments
--ntasks 64
: run 64 processes in the MPI environment.
Let's look at the slurm log file, e.g., in slurm_log/slurm-openmpi_example-3181.log
.
med0124:~$ cat slurm_log/slurm-openmpi_example-*.log\nHello world from processor med0133, rank 6 out of 64 processors\nHello world from processor med0133, rank 25 out of 64 processors\nHello world from processor med0133, rank 1 out of 64 processors\nHello world from processor med0133, rank 2 out of 64 processors\nHello world from processor med0133, rank 3 out of 64 processors\nHello world from processor med0133, rank 7 out of 64 processors\nHello world from processor med0133, rank 9 out of 64 processors\nHello world from processor med0133, rank 12 out of 64 processors\nHello world from processor med0133, rank 13 out of 64 processors\nHello world from processor med0133, rank 15 out of 64 processors\nHello world from processor med0133, rank 16 out of 64 processors\nHello world from processor med0133, rank 17 out of 64 processors\nHello world from processor med0133, rank 18 out of 64 processors\nHello world from processor med0133, rank 23 out of 64 processors\nHello world from processor med0133, rank 24 out of 64 processors\nHello world from processor med0133, rank 26 out of 64 processors\nHello world from processor med0133, rank 27 out of 64 processors\nHello world from processor med0133, rank 31 out of 64 processors\nHello world from processor med0133, rank 0 out of 64 processors\nHello world from processor med0133, rank 4 out of 64 processors\nHello world from processor med0133, rank 5 out of 64 processors\nHello world from processor med0133, rank 8 out of 64 processors\nHello world from processor med0133, rank 10 out of 64 processors\nHello world from processor med0133, rank 11 out of 64 processors\nHello world from processor med0133, rank 14 out of 64 processors\nHello world from processor med0133, rank 19 out of 64 processors\nHello world from processor med0133, rank 20 out of 64 processors\nHello world from processor med0133, rank 21 out of 64 processors\nHello world from processor med0133, rank 22 out of 64 processors\nHello world from processor med0133, rank 28 out of 64 processors\nHello world from processor med0133, rank 29 out of 64 processors\nHello world from processor med0133, rank 30 out of 64 processors\nHello world from processor med0134, rank 32 out of 64 processors\nHello world from processor med0134, rank 33 out of 64 processors\nHello world from processor med0134, rank 34 out of 64 processors\nHello world from processor med0134, rank 38 out of 64 processors\nHello world from processor med0134, rank 39 out of 64 processors\nHello world from processor med0134, rank 42 out of 64 processors\nHello world from processor med0134, rank 44 out of 64 processors\nHello world from processor med0134, rank 45 out of 64 processors\nHello world from processor med0134, rank 46 out of 64 processors\nHello world from processor med0134, rank 53 out of 64 processors\nHello world from processor med0134, rank 54 out of 64 processors\nHello world from processor med0134, rank 55 out of 64 processors\nHello world from processor med0134, rank 60 out of 64 processors\nHello world from processor med0134, rank 62 out of 64 processors\nHello world from processor med0134, rank 35 out of 64 processors\nHello world from processor med0134, rank 36 out of 64 processors\nHello world from processor med0134, rank 37 out of 64 processors\nHello world from processor med0134, rank 40 out of 64 processors\nHello world from processor med0134, rank 41 out of 64 processors\nHello world from processor med0134, rank 43 out of 64 processors\nHello world from processor med0134, rank 47 out of 64 processors\nHello world 
from processor med0134, rank 48 out of 64 processors\nHello world from processor med0134, rank 49 out of 64 processors\nHello world from processor med0134, rank 50 out of 64 processors\nHello world from processor med0134, rank 51 out of 64 processors\nHello world from processor med0134, rank 52 out of 64 processors\nHello world from processor med0134, rank 56 out of 64 processors\nHello world from processor med0134, rank 57 out of 64 processors\nHello world from processor med0134, rank 59 out of 64 processors\nHello world from processor med0134, rank 61 out of 64 processors\nHello world from processor med0134, rank 63 out of 64 processors\nHello world from processor med0134, rank 58 out of 64 processors\n
"},{"location":"how-to/software/openmpi/#running-hybrid-software-mpimultithreading","title":"Running Hybrid Software (MPI+Multithreading)","text":"In some cases, you want to mix multithreading (e.g., via OpenMP) with MPI to run one process with multiple threads that then can communicate via shared memory. Note that OpenMPI will let processes on the same node communicate via shared memory anyway, so this might not be necessary in all cases.
The file run_hybrid.sh
shows how to run an MPI job with 8 processes and 4 threads each.
Note well that memory is allocated on a per-slot (thus per-thread) basis!
run_hybrid.sh
#!/bin/bash\n\n# Example job script for multi-threaded MPI programs, sometimes\n# called \"hybrid\" MPI computing.\n\n# Generic arguments\n\n# Job name\n#SBATCH --job-name openmpi_example\n# Maximal running time of 10 min\n#SBATCH --time 00:10:00\n# Allocate 1GB of memory per node\n#SBATCH --mem 1G\n# Write logs to directory \"slurm_log\"\n#SBATCH -o slurm_log/slurm-%x-%J.log\n\n# MPI-specific parameters\n\n# Run 8 tasks (MPI processes)\n#SBATCH --ntasks 8\n# Allocate 4 CPUs per task (cores/threads)\n#SBATCH --cpus-per-task 4\n\n# Make sure to source the profile.d file (not available on head nodes).\nsource /etc/profile.d/modules.sh\n\n# Load the OpenMPI environment module to get the runtime environment.\nmodule load openmpi/4.0.3-0\n\n# Launch the program.\nmpirun -n 8 ./openmpi_example\n
We changed the following
- run 8 tasks (\"processes\")
- allocate 4 threads each
Let's look at the log output:
# cat slurm_log/slurm-openmpi_example-3193.log\nHello world from processor med0133, rank 1 out of 8 processors\nHello world from processor med0133, rank 3 out of 8 processors\nHello world from processor med0133, rank 2 out of 8 processors\nHello world from processor med0133, rank 6 out of 8 processors\nHello world from processor med0133, rank 0 out of 8 processors\nHello world from processor med0133, rank 4 out of 8 processors\nHello world from processor med0133, rank 5 out of 8 processors\nHello world from processor med0133, rank 7 out of 8 processors\n
Each process can now launch 4 threads (e.g., by defining export OMP_NUM_THREADS=4
before the program call).
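As a small optional sketch: instead of hard-coding the task and thread counts as in run_hybrid.sh above, you can derive them from the standard Slurm environment variables so that the mpirun call and OMP_NUM_THREADS always match the #SBATCH lines:
# inside the job script, after the \"module load openmpi/4.0.3-0\" line\nexport OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}\nmpirun -n ${SLURM_NTASKS} ./openmpi_example\n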
"},{"location":"how-to/software/scientific-software/","title":"How-To: Install Custom Scientific Software","text":"This page gives an end-to-end example how to build and install Gromacs as an example for managing complex scientific software installs in user land. You don't have to learn or understand the specifics of Gromacs. We use it as an example as there are some actual users on the BIH cluster. However, installing it is out of scope of BIH HPC administration.
Gromacs is a good example as it is a sufficiently complex piece of software: quite some configuration is done on the command line and there is no current package for it in the common RPM repositories. However, it is well-documented and, as scientific software goes, easy to install, so there is a lot to be learned.
"},{"location":"how-to/software/scientific-software/#related-documents","title":"Related Documents","text":" - How-To: Build and Run OpenMPI Programs
"},{"location":"how-to/software/scientific-software/#steps-for-installing-scientific-software","title":"Steps for Installing Scientific Software","text":"We will perform the following step:
- Download and extract the source of the software
- Configure the software (i.e., create the actual build system
Makefile
s) - Compile the software
- Install the software
- Create environment module files so the software is easy to use
Many scientific software packages will have more dependencies. If the dependencies are available as CentOS Core or EPEL packages (such as zlib), HPC IT administration can install them. Otherwise, you will have to install them on your own.
Warning
Do not perform the compilation on the login nodes but go to a compute node instead.
"},{"location":"how-to/software/scientific-software/#downloading-and-extracting-software","title":"Downloading and Extracting Software","text":"This is best done in your scratch
directory as we don't have to keep these files around for long. Note that the files in your scratch
directory will automatically be removed after 2 weeks. You can also use your work
directory here.
hpc-login-1:~$ srun --pty bash -i\nmed0127:~$ mkdir $HOME/scratch/gromacs-install\nmed0127:~$ cd $HOME/scratch/gromacs-install\nmed0127:~$ wget http://ftp.gromacs.org/pub/gromacs/gromacs-2018.3.tar.gz\nmed0127:~$ tar xf gromacs-2018.3.tar.gz\nmed0127:~$ ls gromacs-2018.3\nadmin cmake COPYING CTestConfig.cmake INSTALL scripts src\nAUTHORS CMakeLists.txt CPackInit.cmake docs README share tests\n
So far so good!
"},{"location":"how-to/software/scientific-software/#perform-the-configure-step","title":"Perform the Configure Step","text":"This is the most critical step. Most scientific C/C++ software has a build step and allows for, e.g., disabling and enabling features or setting installation paths. Here, you can configure the software depending on your needs and environment. Also, it is the easiest step to mess up.
Gromacs' documentation is actually quite good, but the author had problems following it to the letter. Gromacs recommends creating an MPI and a non-MPI build, but the precise way described there did not work. This installation creates two flavours of Gromacs 2018.3, but in a different way than the Gromacs documentation proposes.
First, here is how to configure the non-MPI flavour. Gromacs wants a modern compiler, so we load gcc
. We will need to note down the precise version we used so later we can load it for running Gromacs with the appropriate libraries. We will install gromacs into $HOME/work/software
, which is appropriate for user-installed software, but it could also go into a group or project directory. Note that we install the software into your work directory as software installations are quite large and might go above your home quota. Also, software installations are usually not precious enough to waste resources on snapshots and backups. Note also that we force Gromacs to use AVX_256
for SIMD support (Intel sandy bridge architecture) to not get unsupported CPU instruction errors.
med0127:~$ module load gcc/7.2.0-0 cmake/3.11.0-0\nmed0127:~$ module list\nCurrently Loaded Modulefiles:\n 1) gcc/7.2.0-0 2) cmake/3.11.0-0\nmed0127:~$ mkdir gromacs-2018.3-build-nompi\nmed0127:~$ cd gromacs-2018.3-build-nompi\nmed0127:~$ cmake ../gromacs-2018.3 \\\n -DGMX_BUILD_OWN_FFTW=ON \\\n -DGMX_MPI=OFF \\\n -DGMX_SIMD=AVX_256 \\\n -DCMAKE_INSTALL_PREFIX=$HOME/work/software/gromacs/2018.3\n
Second, here is how to configure the MPI flavour. Note that we are also enabling the openmpi
module. We will also need the precise version here so we can later load the correct libraries. Note that we install the software into the directory gromacs-mpi
but switch off shared library building as recommended by the Gromacs documentation.
med0127:~$ module load openmpi/4.0.3-0\nmed0127:~$ module list\nCurrently Loaded Modulefiles:\n 1) gcc/7.2.0-0 2) cmake/3.11.0-0 3) openmpi/4.0.3-0\nmed0127:~$ mkdir gromacs-2018.3-build-mpi\nmed0127:~$ cd gromacs-2018.3-build-mpi\nmed0127:~$ cmake ../gromacs-2018.3 \\\n -DGMX_BUILD_OWN_FFTW=ON \\\n -DGMX_MPI=ON \\\n -DGMX_SIMD=AVX_256 \\\n -DCMAKE_INSTALL_PREFIX=$HOME/work/software/gromacs-mpi/2018.3 \\\n -DCMAKE_C_COMPILER=$(which mpicc) \\\n -DCMAKE_CXX_COMPILER=$(which mpicxx) \\\n -DBUILD_SHARED_LIBS=off\n
"},{"location":"how-to/software/scientific-software/#perform-the-build-and-install-steps","title":"Perform the Build and Install Steps","text":"This is simple, using -j 32
allows us to build with 32 threads. If something goes wrong: meh, the \"joys\" of compiling C software.
Getting Support for Building Software
BIH HPC IT cannot provide support for compiling scientific software. Please contact the appropriate mailing lists or forums for your scientific software. You should contact the BIH HPC IT helpdesk only if you are sure that the problem is with the BIH HPC cluster. You should try to resolve the issue on your own and with the developers of the software that you are trying to build/use.
For the no-MPI version:
med0127:~$ cd ../gromacs-2018.3-build-nompi\nmed0127:~$ make -j 32\n[...]\nmed0127:~$ make install\n
For the MPI version:
med0127:~$ cd ../gromacs-2018.3-build-mpi\nmed0127:~$ make -j 32\n[...]\nmed0127:~$ make install\n
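A small aside: if you request a defined number of CPUs for your interactive build session, you can let make pick that number up instead of hard-coding 32. SLURM_CPUS_PER_TASK is set by Slurm inside the allocation; the memory and time values below are only assumptions for a typical build:
hpc-login-1:~$ srun --cpus-per-task=32 --mem=16G --time=04:00:00 --pty bash -i\nmed0127:~$ make -j ${SLURM_CPUS_PER_TASK:-4}\n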
"},{"location":"how-to/software/scientific-software/#create-environment-modules-files","title":"Create Environment Modules Files","text":"For Gromacs 2018.3, the following is appropriate. You should be able to use this as a template for your environment module files:
med0127:~$ mkdir -p $HOME/local/modules/gromacs\nmed0127:~$ cat >$HOME/local/modules/gromacs/2018.3 <<\"EOF\"\n#%Module\nproc ModulesHelp { } {\n puts stderr {\n Gromacs molecular simulation toolkit (non-MPI version)\n\n - http://www.gromacs.org\n }\n}\n\nmodule-whatis {Gromacs molecular simulation toolkit (non-MPI)}\n\nset root /data/cephfs-1/home/users/YOURUSER/work/software/gromacs/2018.3\n\nprereq gcc/7.2.0-0\n\nconflict gromacs\nconflict gromacs-mpi\n\nprepend-path LD_LIBRARY_PATH $root/lib64\nprepend-path LIBRARY_PATH $root/lib64\nprepend-path MANPATH $root/share/man\nprepend-path PATH $root/bin\nsetenv GMXRC $root/bin/GMXRC\nEOF\n
med0127:~$ mkdir -p $HOME/local/modules/gromacs-mpi\nmed0127:~$ cat >$HOME/local/modules/gromacs-mpi/2018.3 <<\"EOF\"\n#%Module\nproc ModulesHelp { } {\n puts stderr {\n Gromacs molecular simulation toolkit (MPI version)\n\n - http://www.gromacs.org\n }\n}\n\nmodule-whatis {Gromacs molecular simulation toolkit (MPI)}\n\nset root /data/cephfs-1/home/users/YOURUSER/work/software/gromacs-mpi/2018.3\n\nprereq openmpi/4.0.3-0\nprereq gcc/7.2.0-0\n\nconflict gromacs\nconflict gromacs-mpi\n\nprepend-path LD_LIBRARY_PATH $root/lib64\nprepend-path LIBRARY_PATH $root/lib64\nprepend-path MANPATH $root/share/man\nprepend-path PATH $root/bin\nsetenv GMXRC $root/bin/GMXRC\nEOF\n
With the next command, make your local module files path known to the environment modules system.
med0127:~$ module use $HOME/local/modules\n
You can verify the result:
med0127:~$ module avail\n\n------------------ /data/cephfs-1/home/users/YOURUSER/local/modules ------------------\ngromacs/2018.3 gromacs-mpi/2018.3\n\n-------------------- /usr/share/Modules/modulefiles --------------------\ndot module-info null\nmodule-git modules use.own\n\n-------------------------- /opt/local/modules --------------------------\ncmake/3.11.0-0 llvm/6.0.0-0 openmpi/3.1.0-0\ngcc/7.2.0-0 matlab/r2016b-0 openmpi/4.0.3-0\n
"},{"location":"how-to/software/scientific-software/#interlude-convenient-module-use","title":"Interlude: Convenient module use
","text":"You can add this to your ~/.bashrc
file to always execute the module use
after login. Note that module
is not available on the login or transfer nodes, the following should work fine:
med0127:~$ cat >>~/.bashrc <<\"EOF\"\ncase \"${HOSTNAME}\" in\n login-*|transfer-*)\n ;;\n *)\n module use $HOME/local/modules\n ;;\nesac\nEOF\n
Note that the paths chosen above are sensible but arbitrary. You can install any software anywhere you have permission to write -- somewhere in your user or group home; on the BIH HPC a project directory often makes the most sense, and no root permissions are required. You can also place the module files anywhere, as long as the module use
line is appropriate.
As a best practice, you could use the following locations (a short sketch of the user-specific layout follows after this list):
- User-specific installation:
$HOME/work/software
as a root to install software to $HOME/work/software/$PKG/$VERSION
for installing a given software package in a given version $HOME/work/software/modules
as the root for modules to install $HOME/work/software/$PKG/$VERSION
for the module file to load the software in a given version $HOME/work/software/modules.sh
as a Bash script to contain the line module use $HOME/work/software/modules
- Group/project specific installation for a shared setup. Don't forget to give the group and yourself read permission only so you don't accidentally damage files after instalation (
chmod ug=rX,o= $GROUP/work/software
, the upper case X
is essential to only set +x
on directories and not files): $GROUP/work/software
as a root to install software to $GROUP/work/software/$PKG/$VERSION
for installing a given software package in a given version $GROUP/work/software/modules
as the root for modules to install $GROUP/work/software/$PKG/$VERSION
for the module file to load the software in a given version $GROUP/work/software/modules.sh
as a Bash script to contain the case
Bash snippet from above but with module use $GROUP/work/software/modules
- This setup allows multiple users to provide software installations and share it with others.
"},{"location":"how-to/software/scientific-software/#going-on-with-gromacs","title":"Going on with Gromacs","text":"Every time you want to use Gromacs, you can now do
med0127:~$ module load gcc/7.2.0-0 gromacs/2018.3\n
or, if you want to have the MPI version:
med0127:~$ module load gcc/7.2.0-0 openmpi/4.0.3-0 gromacs-mpi/2018.3\n
"},{"location":"how-to/software/scientific-software/#launching-gromacs","title":"Launching Gromacs","text":"Something along the lines of the following job script should be appropriate. See How-To: Build Run OpenMPI Programs for more information.
#!/bin/bash\n\n# Example job script for (single-threaded) MPI programs.\n\n# Generic arguments\n\n# Job name\n#SBATCH --job-name gromacs\n# Maximal running time of 10 min\n#SBATCH --time 00:10:00\n# Allocate 1GB of memory per CPU\n#SBATCH --mem 1G\n# Write logs to directory \"slurm_log/<name>-<job id>.log\" (dir must exist)\n#SBATCH --output slurm_log/%x-%J.log\n\n# MPI-specific parameters\n\n# Launch on 8 nodes (== 8 tasks)\n#SBATCH --ntasks 8\n# Allocate 4 CPUs per task (== per node)\n#SBATCH --cpus-per-task 4\n\n# Load the OpenMPI and GCC environment module to get the runtime environment.\nmodule load gcc/4.7.0-0\nmodule load openmpi/4.0.3-0\n\n# Make custom environment modules known. Alternative, you can \"module use\"\n# them in the session you use for submitting the job.\nmodule use $HOME/local/modules\nmodule load gromacs-mpi/2018.3\n\n# Launch the program on 8 nodes and tell Gromacs to use 4 threads for each\n# invocation.\nexport OMP_NUM_THREADS=4\nmpirun -n 8 gmx_mpi mdrun -deffnm npt_1000\n
med0127:~$ mkdir slurm_log\nmed0127:~$ sbatch job_script.sh\nSubmitted batch job 3229\n
"},{"location":"how-to/software/tensorflow/","title":"How-To: Setup TensorFlow","text":"TensorFlow is a package for deep learning with optional support for GPUs. You can find the original TensorFlow installation instructions here.
This article describes how to set up TensorFlow with GPU support using Conda. This how-to assumes that you have just connected to a GPU node via srun --mem=10g --partition=gpu --gres=gpu:tesla:1 --pty bash -i
(for Tesla V100 GPUs, for A400 GPUs use --gres=gpu:a40:1
). Note that you will need to allocate \"enough\" memory, otherwise your python session will be Killed
because of too little memory. You should read the How-To: Connect to GPU Nodes tutorial on an explanation of how to do this.
This tutorial assumes, that conda has been set up as described in [Software Management]((../../best-practice/software-installation-with-conda.md).
"},{"location":"how-to/software/tensorflow/#create-conda-environment","title":"Create conda environment","text":"We recommend that you install mamba first with conda install -y mamba
and use this C++ reimplementation of the conda command
as follows.
$ conda create -y -n python-tf tensorflow-gpu\n$ conda activate python-tf\n
Let us verify that we have Python and TensorFlow installed. You might get different versions you could pin the version on installing with `conda create -y -n python-tf python==3.9.10 tensorflow-gpu==2.6.2
$ python --version\nPython 3.9.10\n$ python -c 'import tensorflow; print(tensorflow.__version__)'\n2.6.2\n
We thus end up with an installation of Python 3.9.10 with tensorflow 2.6.2.
"},{"location":"how-to/software/tensorflow/#run-tensorflow-example","title":"Run TensorFlow Example","text":"Let us now see whether TensorFlow has recognized our GPU correctly.
$ python\n>>> import tensorflow as tf\n>>> print(\"TensorFlow version:\", tf.__version__)\nTensorFlow version: 2.6.2\n>>> print(tf.config.list_physical_devices())\n[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]\n
Yay, we can proceed to run the Quickstart Tutorial.
>>> mnist = tf.keras.datasets.mnist\n>>> (x_train, y_train), (x_test, y_test) = mnist.load_data()\n>>> x_train, x_test = x_train / 255.0, x_test / 255.0\n>>> model = tf.keras.models.Sequential([\n... tf.keras.layers.Flatten(input_shape=(28, 28)),\n... tf.keras.layers.Dense(128, activation='relu'),\n... tf.keras.layers.Dropout(0.2),\n... tf.keras.layers.Dense(10)\n... ])\n>>> predictions = model(x_train[:1]).numpy()\n>>> predictions\narray([[-0.50569224, 0.26386747, 0.43226188, 0.61226094, 0.09630793,\n 0.34400576, 0.9819117 , -0.3693726 , 0.5221357 , 0.3323232 ]],\n dtype=float32)\n>>> tf.nn.softmax(predictions).numpy()\narray([[0.04234391, 0.09141268, 0.10817807, 0.12951255, 0.07731011,\n 0.09903987, 0.18743432, 0.04852816, 0.11835073, 0.09788957]],\n dtype=float32)\n>>> loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n>>> loss_fn(y_train[:1], predictions).numpy()\n2.3122327\n>>> model.compile(optimizer='adam',\n... loss=loss_fn,\n... metrics=['accuracy'])\n>>> model.fit(x_train, y_train, epochs=5)\n2022-03-09 17:53:47.237997: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)\nEpoch 1/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.2918 - accuracy: 0.9151\nEpoch 2/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.1444 - accuracy: 0.9561\nEpoch 3/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.1082 - accuracy: 0.9674\nEpoch 4/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.0898 - accuracy: 0.9720\nEpoch 5/5\n1875/1875 [==============================] - 3s 1ms/step - loss: 0.0773 - accuracy: 0.9756\n<keras.callbacks.History object at 0x154e81360190>\n>>> model.evaluate(x_test, y_test, verbose=2)\n313/313 - 0s - loss: 0.0713 - accuracy: 0.9785\n[0.0713074803352356, 0.9785000085830688]\n>>> probability_model = tf.keras.Sequential([\n... model,\n... tf.keras.layers.Softmax()\n... ])\n>>> probability_model(x_test[:5])\n<tf.Tensor: shape=(5, 10), dtype=float32, numpy=\narray([[1.2339272e-06, 6.5599060e-10, 1.0560590e-06, 5.9356184e-06,\n 5.3691075e-12, 1.4447859e-07, 5.4218874e-13, 9.9996936e-01,\n 1.0347234e-07, 2.2147648e-05],\n [2.9887938e-06, 6.8461006e-05, 9.9991941e-01, 7.2003731e-06,\n 2.9751782e-13, 8.2818183e-08, 1.4307782e-06, 2.3203837e-13,\n 4.7433215e-07, 2.9504194e-14],\n [1.8058477e-06, 9.9928612e-01, 7.8716243e-05, 3.9140195e-06,\n 3.0842333e-05, 9.4537208e-06, 2.2774333e-05, 4.5549971e-04,\n 1.1015874e-04, 6.9138093e-07],\n [9.9978787e-01, 3.0206781e-08, 2.8528208e-05, 8.5581682e-08,\n 1.3851340e-07, 2.3634559e-06, 1.8480707e-05, 1.0153375e-04,\n 1.1583331e-07, 6.0887167e-05],\n [6.4914235e-07, 2.5808356e-08, 1.8225538e-06, 2.3215563e-09,\n 9.9588013e-01, 4.6049720e-08, 3.8903639e-07, 2.9772724e-05,\n 4.3141077e-07, 4.0867776e-03]], dtype=float32)>\n>>> exit()\n
"},{"location":"how-to/software/tensorflow/#writing-tensorflow-slurm-jobs","title":"Writing TensorFlow Slurm Jobs","text":"Writing Slurm jobs using TensorFlow is as easy as creating the following scripts.
tf_script.py
#/usr/bin/env python\n\nimport tensorflow as tf\nprint(\"TensorFlow version:\", tf.__version__)\nprint(tf.config.list_physical_devices())\n\nmnist = tf.keras.datasets.mnist\n\n(x_train, y_train), (x_test, y_test) = mnist.load_data()\nx_train, x_test = x_train / 255.0, x_test / 255.0\n\n\nmodel = tf.keras.models.Sequential([\n tf.keras.layers.Flatten(input_shape=(28, 28)),\n tf.keras.layers.Dense(128, activation='relu'),\n tf.keras.layers.Dropout(0.2),\n tf.keras.layers.Dense(10)\n])\n\npredictions = model(x_train[:1]).numpy()\nprint(predictions)\n\nprint(tf.nn.softmax(predictions).numpy())\n\n# ... and so on ;-)\n
tf_job.sh
#!/usr/bin/bash\n\n#SBATCH --job-name=tf-job\n#SBATCH --mem=10g\n#SBATCH --partition=gpu\n#SBATCH --gres=gpu:tesla:1\n\nsource $HOME/work/miniforge/bin/activate\nconda activate python-tf\n\npython tf_script.py &>tf-out.txt\n
And then calling
$ sbatch tf_job.sh\n
You can find the reuslts in tf-out.txt
after completion.
$ cat tf-out.txt \n2022-03-09 18:05:54.628846: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA\nTo enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n2022-03-09 18:05:56.999848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 30988 MB memory: -> device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:18:00.0, compute capability: 7.0\nTensorFlow version: 2.6.2\n[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]\n[[-0.07757086 0.04676083 0.9420195 -0.59902835 -0.26286742 -0.392514\n 0.3231195 -0.17169198 0.3480805 0.37013203]]\n[[0.07963609 0.09017922 0.22075593 0.04727634 0.06616627 0.05812084\n 0.11888511 0.07248258 0.12188996 0.12460768]]\n
"},{"location":"hpc-tutorial/episode-0/","title":"First Steps: Episode 0","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm?"},{"location":"hpc-tutorial/episode-0/#prerequisites","title":"Prerequisites","text":"This tutorial assumes familiarity with Linux/Unix operating systems. It also assumes that you have already connected to the cluster. We have collected some links to tutorials and manuals on the internet.
"},{"location":"hpc-tutorial/episode-0/#legend","title":"Legend","text":"Before we start with our first steps tutorial, we would like to introduce the following convention that we use throughout the series:
$ Commands are prefixed with a little dollar sign\n
While file paths are highlighted like this: /data/cephfs-1/work/projects/cubit/current
.
"},{"location":"hpc-tutorial/episode-0/#instant-gratification","title":"Instant Gratification","text":"After connecting to the cluster, you are located on a login node. To get to your first compute node, type srun --time 7-00 --mem=8G --cpus-per-task=8 --pty bash -i
which will launch an interactive Bash session on a free remote node running up to 7 days, enabling you to use 8 cores and 8 Gb memory. Typing exit
will you bring back to the login node.
hpc-login-1$ srun -p long --time 7-00 --mem=8G --cpus-per-task=8 --pty bash -i\nhpc-cpu-1$ exit\n$\n
See? That was easy!
"},{"location":"hpc-tutorial/episode-0/#preparation","title":"Preparation","text":"In preparation for our first steps tutorial series, we would like you to install the software for this tutorial. In general the users on the cluster will manage their own software with the help of conda. If you haven't done so so far, please follow the instructions in installing conda first. The only premise is that you are able to log into the cluster. Make also sure that you are logged in to a computation node using srun -p medium --time 1-00 --mem=4G --cpus-per-task=1 --pty bash -i
.
Now we will create a new environment, so as to not interfere with your current or planned software stack, and install into it all the software that we need during the tutorial. Run the following commands:
$ conda create -n first-steps python=3 snakemake bwa delly samtools gatk4\n$ conda activate first-steps\n(first-steps) $\n
"},{"location":"hpc-tutorial/episode-1/","title":"First Steps: Episode 1","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? This is part one of the \"First Steps\" BIH Cluster Tutorial. Here we will build a small pipeline with alignment and variant calling. The premise is that you have the tools installed as described in Episode 0. For this episode, please make sure that you are on a compute node. As a reminder, the command to access a compute node with the required resources is
$ srun --time 7-00 --mem=8G --cpus-per-task=8 --pty bash -i\n
"},{"location":"hpc-tutorial/episode-1/#tutorial-input-files","title":"Tutorial Input Files","text":"We will provide you with some example FASTQ files, but you can use your own if you like. You can find the data here:
/data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz
/data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz
"},{"location":"hpc-tutorial/episode-1/#creating-a-project-directory","title":"Creating a Project Directory","text":"First, you should create a folder where the output of this tutorial will go. It would be good to have it in your work
directory in /data/cephfs-1/home/users/$USER
, because it is faster and there is more space available.
(first-steps) $ mkdir -p /data/cephfs-1/home/users/$USER/work/tutorial/episode1\n(first-steps) $ pushd /data/cephfs-1/home/users/$USER/work/tutorial/episode1\n
Quotas / File System limits
- Note well that you have a quota of 1 GB in your home directory at
/data/cephfs-1/home/users/$USER
. The reason for this is that nightly snapshots and backups are created for this directory which are precious resources. - This limit does not apply to your work directory at
/data/cephfs-1/home/users/$USER/work
. The limits are much higher here but no snapshots or backups are available. - There is no limit on your scratch directory at
/data/cephfs-1/home/users/$USER/scratch
. However, files placed here are automatically removed after 2 weeks. This is only appropriate for files during download or temporary files.
"},{"location":"hpc-tutorial/episode-1/#creating-a-directory-for-temporary-files","title":"Creating a Directory for Temporary Files","text":"In general it is advisable to have a proper temporary directory available. You can create one in your ~/scratch
folder and make it available to the system.
(first-steps) $ export TMPDIR=/data/cephfs-1/home/users/$USER/scratch/tmp\n(first-steps) $ mkdir -p $TMPDIR\n
"},{"location":"hpc-tutorial/episode-1/#using-the-cubit-static-data","title":"Using the Cubit Static Data","text":"The static data is located in /data/cephfs-1/work/projects/cubit/current/static_data
. For our small example, the required reference genome and index can be found at:
/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta
/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta
"},{"location":"hpc-tutorial/episode-1/#aligning-the-reads","title":"Aligning the Reads","text":"Let's align our data:
(first-steps) $ bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n /data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz \\\n| samtools view -b \\\n| samtools sort -O BAM -T $TMPDIR -o aln.bam\n\n(first-steps) $ samtools index aln.bam\n
"},{"location":"hpc-tutorial/episode-1/#perform-structural-variant-calling","title":"Perform Structural Variant Calling","text":"And do the structural variant calling:
(first-steps) $ delly call \\\n -g /data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta \\\n aln.bam\n
Note that delly will not find any variants.
"},{"location":"hpc-tutorial/episode-1/#small-variant-calling-snv-indel","title":"Small Variant Calling (SNV, indel)","text":"And now for the SNP calling (this step will take ~ 20 minutes):
(first-steps) $ gatk HaplotypeCaller \\\n -R /data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta \\\n -I aln.bam \\\n -ploidy 2 \\\n -O test.GATK.vcf\n
"},{"location":"hpc-tutorial/episode-1/#outlook-more-programs-and-static-data","title":"Outlook: More Programs and Static Data","text":"So this is it! We used the tools that we installed previously, accessed the reference data and ran a simple alignment and variant calling pipeline. You can access a list of all static data through this wiki, follow this link to the Static Data. You can also have a peek via:
(first-steps) $ tree -L 3 /data/cephfs-1/work/projects/cubit/current/static_data | less\n
"},{"location":"hpc-tutorial/episode-2/","title":"First Steps: Episode 2","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? Welcome to the second episode of our tutorial series!
Once you are logged in to the cluster, you have the possibility to distribute your jobs to all the nodes that are available. But how can you do this easily? The key command to this magic is sbatch
. This tutorial will show you how you can use this efficiently.
"},{"location":"hpc-tutorial/episode-2/#the-sbatch-command","title":"The sbatch
Command","text":"So what is sbatch
doing for you?
You use the sbatch
command in front of the script you actually want to run. sbatch
then puts your job into the job queue. The job scheduler looks at the current status of the whole system and will assign the first job in the queue to a node that is free in terms of computational load. If all machines are busy, yours will wait. But your job will sooner or later get assigned to a free node.
We strongly recommend using this process for starting your computationally intensive tasks because you will get the best performance for your job and the whole system won't be disturbed by jobs that are locally blocking nodes. Thus, everybody using the cluster benefits.
You may have noticed that you run sbatch
with a script, not with regular commands. The reason is that sbatch
only accepts bash scripts. If you give sbatch
a normal shell command or binary, it won't work. This means that we have to put the command(s) we want to use in a bash script. A skeleton script can be found at /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_job.sh
The content of the file:
#!/bin/bash\n\n# Set a name for the job (-J or --job-name).\n#SBATCH --job-name=tutorial\n\n# Set the file to write the stdout and stderr to (if -e is not set; -o or --output).\n#SBATCH --output=logs/%x-%j.log\n\n# Set the number of cores (-c or --cpus-per-task).\n#SBATCH --cpus-per-task=8\n\n# Force allocation of the two cores on ONE node.\n#SBATCH --nodes=1\n\n# Set the total memory. Units can be given in T|G|M|K.\n#SBATCH --mem=8G\n\n# Optionally, set the partition to be used (-p or --partition).\n#SBATCH --partition=medium\n\n# Set the expected running time of your job (-t or --time).\n# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS\n#SBATCH --time=30:00\n\nexport TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp\nmkdir -p ${TMPDIR}\n
The lines starting with #SBATCH
are actually setting parameters for a sbatch
command, so #SBATCH --job-name=tutorial
is equal to sbatch --job-name=tutorial
. Slurm will create a log file with a file name composed of the job name (%x
) and the job ID (%j
), e.g. logs/tutorial-XXXX.log
. It will not automatically create the logs
directory, we need to do this manually first. Here, we emphasize the importance of the log files! They are the first place to look if anything goes wrong.
To start now with our tutorial, create a new tutorial directory with a log directory, e.g.,
(first-steps) $ mkdir -p /data/cephfs-1/home/users/$USER/work/tutorial/episode2/logs\n
and copy the wrapper script to this directory:
(first-steps) $ pushd /data/cephfs-1/home/users/$USER/work/tutorial/episode2\n(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_job.sh .\n(first-steps) $ chmod u+w submit_job.sh\n
Now open this file and copy the same commands we executed in the last tutorial to this file.
To keep it simple, we will put everything into one script. This is perfectly fine because the alignment and indexing are sequential. But there are two steps that could be run in parallel, namely the variant calling, because they don't depend on each other. We will learn how to do that in a later tutorial. Your file should look something like this:
#!/bin/bash\n\n# Set a name for the job (-J or --job-name).\n#SBATCH --job-name=tutorial\n\n# Set the file to write the stdout and stderr to (if -e is not set; -o or --output).\n#SBATCH --output=logs/%x-%j.log\n\n# Set the number of cores (-c or --cpus-per-task).\n#SBATCH --cpus-per-task=8\n\n# Force allocation of the two cores on ONE node.\n#SBATCH --nodes=1\n\n# Set the total memory. Units can be given in T|G|M|K.\n#SBATCH --mem=8G\n\n# Optionally, set the partition to be used (-p or --partition).\n#SBATCH --partition=medium\n\n# Set the expected running time of your job (-t or --time).\n# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS\n#SBATCH --time=30:00\n\nexport TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp\nmkdir -p ${TMPDIR}\n\nBWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\nREF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\nbwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n $BWAREF \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz \\\n /data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz \\\n| samtools view -b \\\n| samtools sort -O BAM -T $TMPDIR -o aln.bam\n\nsamtools index aln.bam\n\ndelly call -g \\\n $REF \\\n aln.bam\n\ngatk HaplotypeCaller \\\n -R $REF \\\n -I aln.bam \\\n -ploidy 2 \\\n -O test.GATK.vcf\n
Let's run it (make sure that you are in the tutorial/episode2
directory!):
(first-steps) $ sbatch submit_job.sh\n
And wait for the response which will tell you that your job was submitted and which job id number it was assigned. Note that sbatch
only tells you that the job has started, but nothing about finishing. You won't get any response at the terminal when the job finishes. It will take approximately 20 minutes to finish the job.
"},{"location":"hpc-tutorial/episode-2/#monitoring-jobs","title":"Monitoring Jobs","text":"You'll probably want to see how your job is doing. You can get a list of your jobs using:
(first-steps) $ squeue --me\n
Note that logins are also considered as jobs.
Identify your job by the <JOBID>
(1st column) or the name of the script (3rd column). The most likely states you will see (5th column of the table):
PD
pending, waiting to be submitted R
running - disappeared, either because of an error or because it finished
In the 8th column you can see that your job is very likely running on a different machine than the one you are on!
Do not use Slurm and watch
or loops
The watch
command is a useful tool for running commands in a loop every N
seconds. For example, on your workstation you could do watch 'ping -c 3 google.com'
to execute three network pings to Google every two seconds.
\ud83d\udc4e Using watch
or manual loops in a cluster environment can have bad effects when querying Slurm or the shared file system. Both are shared resources and \"expensive\" queries should not be run in loops. For Slurm, this includes running squeue
. The same would be true for running squeue -i
which performs an internal loop.
\ud83d\udc4d Use the Slurm query commands only when you actually need the output. If you run them in an (implict or explicit) loop, then do so only for a short time and don't leave this open in a screen.
Get more information about your jobs by either passing the job id:
(first-steps) $ sstat <JOBID>\n
And of course, watch what the logs are telling you:
(first-steps) $ tail -f logs/tutorial-<JOBID>.log\n
There will be no notification when your job is done, so it is best to watch the squeue --me
command. To watch the sbatch
command there is a linux command watch
that you give a command to execute every few seconds. This is useful for looking for changes in the output of a command. The seconds between two executions can be set with the -n
option. It is best to use -n 60
to minimize unnecessary load on the file system:
(first-steps) $ watch -n 60 squeue --me\n
If for some reason your job is hanging, you can delete your job using scancel
with your job-ID: (first-steps) $ scancel <job-ID>\n
"},{"location":"hpc-tutorial/episode-2/#job-queues","title":"Job Queues","text":"The cluster has a special way of organizing itself and by telling the cluster how long and with which priority you want your jobs to run, you can help it in this. There is a system set up on the cluster where you can enqueue your jobs to so-called partitions. partitions have different prioritites and are allowed for different running times. To get to know what partitions are available, and how to use them properly, we highly encourage you to read the cluster queues wiki page.
"},{"location":"hpc-tutorial/episode-3/","title":"First Steps: Episode 3","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? In this episode we will discuss how we can parallelize steps in a pipeline that are not dependent on each other. In the last episode we saw a case (the variant calling) that could have been potentially parallelized.
We will take care of that today. Please note that we are not going to use the sbatch
command we learned earlier. Thus, this tutorial will run on the same node where you execute the script. We will introduce you to Snakemake, a tool with which we can model dependencies and run things in parallel. In the next tutorial we will learn how to submit the jobs with sbatch
and Snakemake combined.
For those who know make
already, Snakemake will be familiar. You can think of Snakemake being a bunch of dedicated bash scripts that you can make dependent on each other. Snakemake will start the next script when a previous one finishes, and potentially it will run things in parallel if the dependencies allow.
Snakemake can get confusing, especially if the project gets big. This tutorial will only cover the very basics of this powerful tool. For more, we highly recommend digging into the Snakemake documentation:
- https://snakemake.readthedocs.io/en/stable/
- http://slides.com/johanneskoester/deck-1#/
Every Snakemake run requires a Snakefile
file. Create a new folder inside your tutorial folder and copy the skeleton:
(first-steps) $ mkdir -p /data/cephfs-1/home/users/${USER}/work/tutorial/episode3\n(first-steps) $ pushd /data/cephfs-1/home/users/${USER}/work/tutorial/episode3\n(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/Snakefile .\n(first-steps) $ chmod u+w Snakefile\n
Your Snakefile
should look as follows:
rule all:\n input:\n 'snps/test.vcf',\n 'structural_variants/test.vcf'\n\nrule alignment:\n input:\n '/data/cephfs-1/work/projects/cubit/tutorial/input/test_R1.fq.gz',\n '/data/cephfs-1/work/projects/cubit/tutorial/input/test_R2.fq.gz',\n output:\n bam='alignment/test.bam',\n bai='alignment/test.bam.bai',\n shell:\n r\"\"\"\n export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp\n mkdir -p ${{TMPDIR}}\n\n BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n ${{BWAREF}} \\\n {input} \\\n | samtools view -b \\\n | samtools sort -O BAM -T ${{TMPDIR}} -o {output.bam}\n\n samtools index {output.bam}\n \"\"\"\n\nrule structural_variants:\n input:\n 'alignment/test.bam'\n output:\n 'structural_variants/test.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n delly call -o {output} -g ${{REF}} {input}\n \"\"\"\n\nrule snps:\n input:\n 'alignment/test.bam'\n output:\n 'snps/test.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n gatk HaplotypeCaller \\\n -R ${{REF}} \\\n -I {input} \\\n -ploidy 2 \\\n -O {output}\n \"\"\"\n
Let me explain. The content resembles the same steps we took in the previous tutorials. Although every step has its own rule (alignment, snp calling, structural variant calling), we could instead have written everything in one rule. It is up to you to design your rules! Note that the rule names are arbitrary and not mentioned anywhere else in the file.
But there is one primary rule: the rule all
. This is the kickoff rule that makes everything run.
As you might have noticed, every rule has three main parameters: input
, output
and shell
. input
defines the files that are going into the rule, output
those that are produced when executing the rule, and shell
is the bash script that processes input
to produce output
.
Rule all
does not have any output
or shell
, it uses input
to start the chain of rules. Note that the input files of this rule are the output files of rule snps
and structural_variants
. The input of those rules is the output of rule alignment
. This is how Snakemake processes the rules: It looks for rule all
(or a rule that just has input
files) and figures out how it can create the required input files with other rules by looking at their output
files (the input
files of one rule must be the output
files of another rule). In our case it traces the workflow back to rule snps
and structural_variants
as they have the matching output files. They depend in return on the alignment, so the alignment
rule must be executed, and this is the first thing that will be done by Snakemake.
There are also some peculiarities about Snakemake:
- You can name files in
input
or output
as is done in rule alignment
with the output files. - You can access the
input
and output
files in the script by writing {input}
or {output}
. - If they are not named, they will be concatenated, separated by white space
- If they are named, access them with their name, e.g.,
{output.bam}
- Curly braces must be escaped with curly braces, e.g., for bash variables:
${{VAR}}
instead of ${VAR}
but not Snakemake internal variables like {input}
or {output}
- In the rule
structural_variants
we cheat a bit because delly does not produce output files if it can't find variants. - We do this by
touching
(i.e., creating) the required output file. Snakemake has a function for doing so (call touch()
on the filename).
- Intermediate folders in the path to output files are always created if they don't exist.
- Because Snakemake is Python based, you can write your own functions for it to use, e.g. for creating file names automatically.
But Snakemake can do more. It is able to parse the paths of the output files and set wildcards if you want. For this your input (and output) file names have to follow a parsable scheme. In our case they do! Our FASTQ files, our only initial input files, start with test
. The output of the alignment as well as the variant calling is also prefixed test
. We now can modify the Snakemake file accordingly, by exchanging every occurrence of test
in each input
or output
field with {id}
(note that you could also give a different name for your variable). Only the input rule should not be touched, otherwise Snakemake would not know which value this variable should have. Your Snakefile
should look now like this:
rule all:\n input:\n 'snps/test.vcf',\n 'structural_variants/test.vcf'\n\nrule alignment:\n input:\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R1.fq.gz',\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R2.fq.gz',\n output:\n bam='alignment/{id}.bam',\n bai='alignment/{id}.bam.bai',\n shell:\n r\"\"\"\n export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp\n mkdir -p ${{TMPDIR}}\n\n BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n ${{BWAREF}} \\\n {input} \\\n | samtools view -b \\\n | samtools sort -O BAM -T ${{TMPDIR}} -o {output.bam}\n\n samtools index {output.bam}\n \"\"\"\n\nrule structural_variants:\n input:\n 'alignment/{id}.bam'\n output:\n 'structural_variants/{id}.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n delly call -o {output} -g ${{REF}} {input}\n \"\"\"\n\nrule snps:\n input:\n 'alignment/{id}.bam'\n output:\n 'snps/{id}.vcf'\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n gatk HaplotypeCaller \\\n -R ${{REF}} \\\n -I {input} \\\n -ploidy 2 \\\n -O {output}\n \"\"\"\n
Before we finally run this, we can make a dry run. Snakemake will show you what it would do:
(first-steps) $ snakemake -n\n
If everything looks green, you can run it for real. We provide it two cores to allow two single-threaded jobs to be run simultaneously:
(first-steps) $ snakemake -j 2\n
"},{"location":"hpc-tutorial/episode-4/","title":"First Steps: Episode 4","text":"Episode Topic 0 How can I install the tools? 1 How can I use the static data? 2 How can I distribute my jobs on the cluster (Slurm)? 3 How can I organize my jobs with Snakemake? 4 How can I combine Snakemake and Slurm? In the last episodes we learned about distributing a job among the cluster nodes using sbatch
and how to automate and parallelize our pipeline with Snakemake. We are lucky that those two powerful commands can be combined. What is the result? You will have an automated pipeline with Snakemake that uses sbatch
to distribute jobs among the cluster nodes instead of running only the same node.
The best thing is that we can reuse our Snakefile
as it is and just write a wrapper script to call Snakemake. We run the script and the magic will start.
First, create a new folder for this episode:
(first-steps) $ mkdir -p /data/cephfs-1/home/users/${USER}/work/tutorial/episode4/logs\n(first-steps) $ pushd /data/cephfs-1/home/users/${USER}/work/tutorial/episode4\n
And copy the wrapper script to this folder as well as the Snakefile (you can also reuse the one with the adjustments from the previous episode):
(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/submit_snakejob.sh .\n(first-steps) $ cp /data/cephfs-1/work/projects/cubit/tutorial/skeletons/Snakefile .\n(first-steps) $ chmod u+w submit_snakejob.sh Snakefile\n
The Snakefile
is already known to you but let me explain the wrapper script submit_snakejob.sh
:
#!/bin/bash\n\n# Set a name for the job (-J or --job-name).\n#SBATCH --job-name=tutorial\n\n# Set the file to write the stdout and stderr to (if -e is not set; -o or --output).\n#SBATCH --output=logs/%x-%j.log\n\n# Set the number of cores (-c or --cpus-per-task).\n#SBATCH --cpus-per-task=2\n\n# Force allocation of the two cores on ONE node.\n#SBATCH --nodes=1\n\n# Set the total memory. Units can be given in T|G|M|K.\n#SBATCH --mem=1G\n\n# Optionally, set the partition to be used (-p or --partition).\n#SBATCH --partition=medium\n\n# Set the expected running time of your job (-t or --time).\n# Formats are MM:SS, HH:MM:SS, Days-HH, Days-HH:MM, Days-HH:MM:SS\n#SBATCH --time=30:00\n\n\nexport TMPDIR=/data/cephfs-1/home/users/${USER}/scratch/tmp\nexport LOGDIR=logs/${SLURM_JOB_NAME}-${SLURM_JOB_ID}\nmkdir -p $LOGDIR\n\neval \"$($(which conda) shell.bash hook)\"\nconda activate first-steps\n\nset -x\n\nsnakemake --profile=cubi-v1 -j 2 -k -p --restart-times=2\n
In the beginning you see the #SBATCH
that introduces the parameters when you provide this script to sbatch
as described in the second episode. Please make sure that the logs
folder exists before starting the run! We then set and export the TMPDIR
and LOGDIR
variables. Note that LOGDIR
has a subfolder named $SLURM_JOB_NAME-$SLURM_JOB_ID
that will be created for you. Snakemake will store its logfiles for this very Snakemake run in this folder. The next new thing is set -x
. This simply prints to the terminal every command that is executed within the script. This is useful for debugging.
Finally, the Snakemake call takes place. With the --profile
option we define that Snakemake uses the Snakemake profile at /etc/xdg/snakemake/cubi-v1
. The profile will take create appropriate calls to sbatch
and interpret the following settings from your Snakemake rule:
threads
: the number of threads to execute the job on - memory in megabytes or with a suffix of
k
, M
, G
, or T
. You can specify EITHER resources.mem
/resources.mem_mb
: the memory to allocate for the whole job, OR resources.mem_per_thread
: the memory to allocate for each thread.
resources.time
: the running time of the rule, in a syntax supported by Slurm, e.g. HH:MM:SS
or D-HH:MM:SS
resources.partition
: the partition to submit your job into (Slurm will pick a fitting partition for you by default) resources.nodes
: the number of nodes to schedule your job on (defaults to 1
and you will want to keep that value unless you want to use MPI)
The other options to snakemake
have the meaning:
-j 2
: run at most two jobs at the same time -k
: keep going even if a rule execution fails -p
: print the executed shell commands --restart-times=2
: restart failing jobs up to two times
It is now time to update your Snakefile
such that it actually specifies the resources mentioned above:
rule all:\n input:\n 'snps/test.vcf',\n 'structural_variants/test.vcf'\n\nrule alignment:\n input:\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R1.fq.gz',\n '/data/cephfs-1/work/projects/cubit/tutorial/input/{id}_R2.fq.gz',\n output:\n bam='alignment/{id}.bam',\n bai='alignment/{id}.bam.bai',\n threads: 8\n resources:\n mem='8G',\n time='12:00:00',\n shell:\n r\"\"\"\n export TMPDIR=/data/cephfs-1/home/users/${{USER}}/scratch/tmp\n mkdir -p ${{TMPDIR}}\n\n BWAREF=/data/cephfs-1/work/projects/cubit/current/static_data/precomputed/BWA/0.7.17/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n bwa mem -t 8 \\\n -R \"@RG\\tID:FLOWCELL.LANE\\tPL:ILLUMINA\\tLB:test\\tSM:PA01\" \\\n ${{BWAREF}} \\\n {input} \\\n | samtools view -b \\\n | samtools sort -O BAM -T ${{TMPDIR}} -o {output.bam}\n\n samtools index {output.bam}\n \"\"\"\n\nrule structural_variants:\n input:\n 'alignment/{id}.bam'\n output:\n 'structural_variants/{id}.vcf'\n threads: 1\n resources:\n mem='4G',\n time='2-00:00:00',\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n delly call -o {output} -g ${{REF}} {input}\n \"\"\"\n\ndef snps_mem(wildcards, attempt):\n mem = 2 * attempt\n return '%dG' % mem\n\nrule snps:\n input:\n 'alignment/{id}.bam'\n output:\n 'snps/{id}.vcf'\n threads: 1\n resources:\n mem=snps_mem,\n time='04:00:00',\n shell:\n r\"\"\"\n REF=/data/cephfs-1/work/projects/cubit/current/static_data/reference/GRCh37/g1k_phase1/human_g1k_v37.fasta\n\n gatk HaplotypeCaller \\\n -R ${{REF}} \\\n -I {input} \\\n -ploidy 2 \\\n -O {output}\n \"\"\"\n
We thus configure the resource consumption of the rules as follows:
alignment
with 8 threads and up to 8GB of memory in total with a running time of up to 12 hours, structural_variants
with one thread and up to 4GB of memory in with a running time of up to 2 days, snps
with one thread and running up to four hours. Instead of passing a static amount of memory, we pass a resource callable. The attempt
parameter will be passed a value of 1
on the initial invocation. If variant calling with the GATK HaplotypeCaller fails then it will retry and attempt
will have an incremented value on each invocation (2
on the first retry and so on). Thus, we try to do small variant calling with 2, 4, 6, and 8 GB.
Finally, run the script:
(first-steps) $ sbatch submit_snakejob.sh\n
If you watch squeue --me
now, you will see that the jobs are distributed to the system:
(first-steps) $ squeue --me\n
Please refer to the Snakemake documentation for more details on using Snakemake, in particular on how to use the cluster configuration to specify the resource requirements on a per-rule basis.
"},{"location":"misc/external-resources/","title":"External Resources","text":""},{"location":"misc/external-resources/#basic-linux","title":"Basic Linux","text":"The BIH HPC uses CentOS Linux. A basic understanding of Linux is required. Even better, you should already have intermediate to advanced Linux/Unix skills.
BIH HPC IT cannot provide you with basic Unix training. Please ask your home organization (e.g., Charité or MDC) to provide you with basic Linux training.
That said, here are some resources that we find useful:
"},{"location":"misc/external-resources/#internet-tutorials","title":"Internet Tutorials","text":"There is a large number of Linux tutorials online including:
- Ryan's Linux Tutorial
- Digital Ocean Tutorials
- Linux Basics
- Environment Variables
- Using Jupyter Notebooks to manage SLURM jobs
"},{"location":"misc/external-resources/#internet-forums","title":"Internet Forums","text":" - Unix & Linux Stack Exchange
"},{"location":"misc/external-resources/#global-organisation-for-bioinformatics-learning-education-and-training","title":"Global Organisation for Bioinformatics Learning, Education, and Training","text":"GOBLET has a number of Bioinformatics-focused tutorials. This includes
- \"A Critical Guide to Unix\"
"},{"location":"misc/provided-software/","title":"Administration-Provided Software","text":"Some software is provided by HPC Administration based on the criteria that it is:
- system-near or system-level,
- very commonly used.
Currently, this includes:
- GCC v7.2.0
- CMake v3.11.0
- LLVM v6.0.0
- OpenMPI v4.0.3
On the GPU node, this also includes a recent NVIDIA CUDA version.
To see which software is available, use module avail
on a compute node (this will not work on login nodes):
$ module avail\n--------------------- /opt/local/modules ---------------------\ncmake/3.11.0-0 llvm/6.0.0-0\ngcc/7.2.0-0 openmpi/4.0.3-0\n
To load software, use module load
. This will adjust the environment variables accordingly, in particular update PATH
such that the executables are available.
$ which gcc\n/bin/gcc\n$ module load gcc/7.2.0-0\n$ which gcc\n/opt/local/gcc-7.2.0-0/bin/gcc\n
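Besides module load, a few other module commands are commonly useful (standard environment modules commands, shown here as a brief illustration):
$ module list # show the currently loaded modules\n$ module unload gcc/7.2.0-0 # unload a single module\n$ module purge # unload all modules\n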
Problems with executing module
?
See the corresponding FAQ entry in the case that you get a -bash: module: command not found
when calling module
.
"},{"location":"misc/publication-list/","title":"Publication List","text":"The BIH Cluster is a valuable resource. It has been used to support the publications listed below.
- Please add your publications here.
- Acknowledge usage of the cluster in your manuscript as \"Computation has been performed on the HPC for Research cluster of the Berlin Institute of Health\".
"},{"location":"misc/publication-list/#articles-preprints","title":"Articles & Preprints","text":""},{"location":"misc/publication-list/#2024","title":"2024","text":"Hollunder, B., Ostrem, J.L., Sahin, I.A., Rajamani, N., Oxenford, S., Butenko, K., Neudorfer, C., Reinhardt, P., Zvarova, P., Polosan, M., Akram, H., Vissani, M., Zhang, C., Sun, B., Navratil, P., Reich, M.M., Volkmann, J., Yeh, F.-C., Baldermann, J.C., Dembek, T.A., Visser-Vandewalle, V., Alho, E.J.L., Franceschini, P.R., Nanda, P., Finke, C., K\u00fchn, A.A., Dougherty, D.D., Richardson, R.M., Bergman, H., DeLong, M.R., Mazzoni, A., Romito, L.M., Tyagi, H., Zrinzo, L., Joyce, E.M., Chabardes, S., Starr, P.A., Li, N., Horn, A., 2024. Mapping dysfunctional circuits in the frontal cortex using deep brain stimulation. Nat. Neurosci. 1\u201314. doi: 10.1038/s41593-024-01570-1
"},{"location":"misc/publication-list/#2022","title":"2022","text":"Kossen T, Hirzel MA, Madai VI, Boenisch F, Hennemuth A, Hildebrand K, Pokutta S, Sharma K, Hilbert A, Sobesky J, Galinovic I, Khalil AA, Fiebach JB and Frey D. Toward Sharing Brain Images: Differentially Private TOF-MRA Images With Segmentation Labels Using Generative Adversarial Networks. Frontiers in Artificial Intelligence. 5 (2022). issn: 2624-8212. doi: 10.3389/frai.2022.813842
"},{"location":"misc/publication-list/#2021","title":"2021","text":"Li, N., Hollunder, B., Baldermann, J. C., Kibleur, A., Treu, S., Akram, H., Al-Fatly, B., Strange, B. A., Barcia, J. A., Zrinzo, L., Joyce, E. M., Chabardes, S., Visser-Vandewalle, V., Polosan, M., Kuhn, J., K\u00fchn, A. A., & Horn, A. (2021). A Unified Functional Network Target for Deep Brain Stimulation in Obsessive-Compulsive Disorder. Biological Psychiatry. doi: 10.1016/j.biopsych.2021.04.006
Bressem KK, Vahldiek JL, Adams L, Niehues SM, Haibel H, Rodriguez VR, Torgutalp M, Protopopov M, Proft F, Rademacher J, Sieper J, Rudwaleit M, Hamm B, Makowski MR, Hermann KG, Poddubnyy D. Deep learning for detection of radiographic sacroiliitis: achieving expert-level performance. Arthritis Res Ther. 2021 Apr 8;23(1):106. doi: 10.1186/s13075-021-02484-0
Kossen T, Subramaniam P, Madai VI, Hennemuth A, Hildebrand K, Hilbert A, Sobesky J, Livne M, Galinovic I, Khalil AA, Fiebach JB, Frey D. Synthesizing anonymized and labeled TOF-MRA patches for brain vessel segmentation using generative adversarial networks. Computers in Biology and Medicine. 2021 Apr 131,104254. doi: 10.1016/j.compbiomed.2021.104254
Paraskevopoulou S., K\u00e4fer S., Zirkel F., Donath A., Petersen M., Liu S., Zhou X., Drosten C., Misof B., Junglen S. (2021). \"Viromics of extant insect orders unveil the evolution of the flavi-like superfamily.\" Virus Evolution 2021 Mar 30. doi: 10.1093/ve/veab030
Thomas Krannich, W Timothy J White, Sebastian Niehus, Guillaume Holley, Bjarni V Halld\u00f3rsson, Birte Kehr, Population-scale detection of non-reference sequence variants using colored de Bruijn graphs, Bioinformatics, 2021, btab749, doi: 10.1093/bioinformatics/btab749
Julia Markowski, Rieke Kempfer, Alexander Kukalev, Ibai Irastorza-Azcarate, Gesa Loof, Birte Kehr, Ana Pombo, Sven Rahmann, Roland F Schwarz, GAMIBHEAR: whole-genome haplotype reconstruction from Genome Architecture Mapping data, Bioinformatics, Volume 37, Issue 19, 1 October 2021, Pages 3128\u20133135. doi: 10.1093/bioinformatics/btab238
"},{"location":"misc/publication-list/#2020","title":"2020","text":"Kr\u00fctzfeldt LM, Schubach M, Kircher M. The impact of different negative training data on regulatory sequence predictions. PLoS One. 2020 Dec 1;15(12):e0237412. doi: 10.1371/journal.pone.0237412.
Klotz-Noack K, Klinger B, Rivera M, Bublitz N, Uhlitz F, Riemer P, L\u00fcthen M, Sell T, Kasack K, Gastl B, Ispasanie SSS, Simon T, Janssen N, Schwab M, Zuber J, Horst D, Bl\u00fcthgen N, Sch\u00e4fer R, Morkel M, Sers C. SFPQ Depletion Is Synthetically Lethal with BRAFV600E in Colorectal Cancer Cells. Cell Rep. 2020 Sep 22;32(12):108184. doi: 10.1016/j.celrep.2020.108184.
Kleinert, P., Martin, B., & Kircher, M. (2020). \"HemoMIPs\u2014Automated analysis and result reporting pipeline for targeted sequencing data.\" PLOS Computational Biology, 16(6), e1007956. doi: 10.1371/journal.pcbi.1007956
Ehmke, N.; Cusmano-Ozog, K.; Koenig, R.; Holtgrewe, M.; Nur, B.; Mihci, E.; Babcock, H.; Gonzaga-Jauregui, C.; Overton, J. D.; Xiao, J.; et al. Biallelic Variants in KYNU Cause a Multisystemic Syndrome with Hand Hyperphalangism. Bone 2020, 115219. doi: 10.1016/j.bone.2019.115219.
Niehus, S.; J\u00f3nsson, H.; Sch\u00f6nberger, J.; Bj\u00f6rnsson, E.; Beyter, D.; Eggertsson, H.P.; Sulem, P.; Stef\u00e1nsson, K.; Halld\u00f3rsson, B.V.; Kehr, B. PopDel identifies medium-size deletions jointly in tens of thousands of genomes. bioRxiv 2020, 10.1101/740225 doi: 10.1101/740225
Gordon, M. G., Inoue, F., Martin, B., Schubach, M., Agarwal, V., Whalen, S., ... & Kreimer, A. (2020). \"lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements.\" Nature Protocols, 15(8), 2387-2412. doi: 10.1038/s41596-020-0333-5
Paraskevopoulou S., Pirzer F., Goldmann N., Schmid J., Corman V.M., Gottula L.T.,Schroeder S., Rasche A., Muth D., Drexler J.F., Heni A.C., Eibner G.J., Page R.A., Jones T.C., M\u00fcllerM.A., Sommer S., Glebe D., and Drosten C. (2020). \"Mammalian deltavirus without hepadnavirus coinfection in the neotropical rodent Proechimys semispinosus.\" Proceedings of the National Academy of Sciences 2020 Jul 28;117(30):17977-17983. doi: 10.1073/pnas.2006750117.
"},{"location":"misc/publication-list/#2019","title":"2019","text":"Kircher, M., Xiong, C., Martin, B. et al. \"Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution.\" Nat Commun 10, 3583 (2019). doi: 10.1038/s41467-019-11526-w
Stefanovski L, Triebkorn P, Spiegler A, Diaz-Cortes M-A, Solodkin A, Jirsa V, McIntosh RA and Ritter P (2019). \"Linking Molecular Pathways and Large-Scale Computational Modeling to Assess Candidate Disease Mechanisms and Pharmacodynamics in Alzheimer's Disease.\" Front. Comput. Neurosci.. 13:54. doi: 10.3389/fncom.2019.00054
Boeddrich A., Babila J.T., Wiglenda T., Diez L., Jacob M., Nietfeld W., Huska M.R., Haenig C., Groenke N., Buntru A., Blanc E., Meier J.C., Vannoni E., Erck C., Friedrich B., Martens H., Neuendorf N., Schnoegl S., Wolfer DP., Loos M., Beule D., Andrade-Navarro M.A., Wanker E.E. (2019). \"The Anti-amyloid Compound DO1 Decreases Plaque Pathology and Neuroinflammation-Related Expression Changes in 5xFAD Transgenic Mice.\" Cell Chem Biol. 2019 Jan 17;26(1):109-120.e7. doi: 10.1016/j.chembiol.2018.10.013.
Fountain M.D., Oleson, D.S., Rech. M.E., Segebrecht, L., Hunter, J.V., McCarthy, J.M., Lupo, P.J., Holtgrewe, M., Mora, R., Rosenfeld, J.A., Isidor, B., Le Caignec, C., Saenz, M.S., Pedersen, R.C., Morgen, T.M., Pfotenhauer, J.P., Xia, F., Bi, W., Kang, S.-H.L., Patel, A., Krantz, I.D., Raible, S.E., Smith, W.E., Cristian, I., Tori, E., Juusola, J., Millan, F., Wentzensen, I.M., Person, R.E., K\u00fcry, S., B\u00e9zieau, S., Uguen, K., F\u00e9rec, C., Munnich, A., van Haelst, M., Lichtenbelt, K.D., van Gassen, K., Hagelstrom, T., Chawla, A., Perry, D.L., Taft, R.J., Jones, M., Masser-Frye, D., Dyment, D., Venkateswaran, S., Li, C., Escobar, L,.F., Horn, D., Spillmann, R.C., Pe\u00f1a, L., Wierzba, J., Strom, T.M. Parent, I. Kaiser, F.J., Ehmke, N., Schaaf, C.P. (2019). \"Pathogenic variants in USP7 cause a neurodevelopmental disorder with speech delays, altered behavior, and neurologic anomalies.\" Genet. Med. 2019 Jan 25. doi: 10.1038/s41436-019-0433-1
Holtgrewe,M., Messerschmidt,C., Nieminen,M. and Beule,D. (2019) DigestiFlow: from BCL to FASTQ with ease. Bioinformatics, 10.1093/bioinformatics/btz850.
K\u00e4fer S., Paraskevopoulou S., Zirkel F., Wieseke N., Donath A., Petersen M., Jones T.C., Liu S., Zhou X., Middendorf M., Junglen S., Misof B., Drosten C. (2019). \"Re-assessing the diversity of negative strand RNA viruses in insects.\" PLOS Pathogens 2019 Dec 12. doi: 10.1371/journal.ppat.1008224
K\u00fchnisch,J., Herbst,C., Al\u2010Wakeel\u2010Marquard,N., Dartsch,J., Holtgrewe,M., Baban,A., Mearini,G., Hardt,J., Kolokotronis,K., Gerull,B., et al. (2019) Targeted panel sequencing in pediatric primary cardiomyopathy supports a critical role of TNNI3. Clin Genet, 96, 549\u2013559. https://doi.org/10.1111/cge.13645
Marklewitz M., Dutari L.C., Paraskevopoulou S., Page R.A., Loaiza J.R., Junglen S. (2019). \"Diverse novel phleboviruses in sandflies from the Panama Canal area, Central Panama.\" Journal of General Virology 2019 May 3. doi: 10.1099/jgv.0.001260
Quade,A., Thiel,A., Kurth,I., Holtgrewe,M., Elbracht,M., Beule,D., Eggermann,K., Scholl,U.I. and H\u00e4usler,M. (2019) Paroxysmal tonic upgaze: A heterogeneous clinical condition responsive to carbonic anhydrase inhibition. European Journal of Paediatric Neurology, 10.1016/j.ejpn.2019.11.002.
"},{"location":"misc/publication-list/#2018","title":"2018","text":"Blanc, E., Holtgrewe, M., Dhamodaran, A., Messerschmidt, C., Willimsky, G., Blankenstein, T., Beule, D. (2018). \"Identification and Ranking of Recurrent Neo-Epitopes in Cancer\". bioRxiv. 2018/389437, 2018. doi: 10.1101/389437
Brandt, R., Uhlitz, F., Riemer, P., Giesecke, C., Schulze, S., El-Shimy, I.A., Fauler, B., Mielke, T., Mages, N., Herrmann, B.G., Sers, C., Bl\u00fcthgen, N., Morkel, M. (2018). \"Cell type-dependent differential activation of ERK by oncogenic KRAS or BRAF in the mouse intestinal epithelium\". bioRxiv. 2018/340844. doi: 10.1101/340844.
Holtgrewe, M., Knaus, A., Hildebrand, G., Pantel, J.-T., Rodriguesz de los Santos, M., Neveling, K., Goldmann, J., Schubach, M., J\u00e4ger, M., Couterier, M., Mundlos, S., Beule, D., Sperling, K., Krawitz, P. (2018). \"Multisite de novo mutations in human offspring after paternal exposure to ionizing radiation\", Nature Scientific Reports. 2018 Oct 2;8(1):14611. doi: 10.1038/s41598-018-33066-x.
Kircher M., Xiong C., Martin B, Schubach M, Inoue F, Bell R.JA., Costello J.F., Shendure J., Ahituv N. (2018). \"Saturation mutagenesis of disease-associated regulatory elements.\" bioRxiv (2018): 505362. doi: 10.1101/505362
PCAWG Transcriptome Core Group, Calabrese, C., Davidson, N.R., Fonseca1, N.A., He, Y., Kahles, A., Lehmann, K.-V., Liu, F., Shiraishi, Y., Soulette, C.M., Urban, L., Demircio\u011flu, D., Greger, L., Li, S., Liu, D., Perry, M.D., Xiang, L., Zhang, F., Zhang, J., Bailey, P., Erkek, S., Hoadley, K.A., Hou, Y., Kilpinen, H., Korbel, J.O., Marin, M.G., Markowski, J., Nandi11, T., Pan-Hammarstr\u00f6m, Q., Pedamallu, C.S., Siebert, R., Stark, S.G., Su, H., Tan, P., Waszak, S.M., Yung, C., Zhu, S., PCAWG Transcriptome Working Group, Awadalla, P., Creighton, C.J., Meyerson, M., Ouellette, B.F.F., Wu, K., Yang, H., ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Network, Brazma1, A., Brooks, A.N., G\u00f6ke, J., R\u00e4tsch, G., Schwarz, R.F., Stegle, O., Zhang, Z. (2018). \"Genomic basis for RNA alterations revealed by whole-genome analyses of 27 cancer types\". bioRxiv. 2018/183889. doi: 10.1101/183889
Guneykaya D., Ivanov A., Hernandez D.P., Haage V., Wojtas B., Meyer N., Maricos M., Jordan P., Buonfiglioli A., Gielniewski B., Ochocka N., C\u00f6mert, C., Friedrich, C., Artiles, L. S., Kaminska, B., Mertins, P., Beule, D., Kettenmann, H. (2018). \"Transcriptional and translational differences of microglia from male and female brains\", Cell reports. 2018 Sep 4;24(10):2773-83. doi: 10.1016/j.celrep.2018.08.001.
Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. (2018). \"CADD: predicting the deleteriousness of variants throughout the human genome\", Nucleic Acids Res. 2018 Oct 29. doi: 10.1093/nar/gky1016.
Salatzki J., Foryst-Ludwig A., Bentele K., Blumrich A., Smeir E., Ban Z., Brix S., Grune J., Beyhoff N., Klopfleisch R., Dunst S., Surma, M.A., Klose, C., Rothe, M., Heinzel, F.R., Krannich, A., Kershaw, E.E., Beule, D., Schulze, P.C., Marx, N., Kintscher, U. (2018). \"Adipose tissue ATGL modifies the cardiac lipidome in pressure-overload-induced left ventricular failure\", PLoS genetics. 2018 Jan 10;14(1):e1007171. doi: 10.1371/journal.pgen.100717.
Schubach M., Re M., Robinson P.N., Valentini G. (2017) \"Imbalance-aware machine learning for predicting rare and common disease-associated non-coding variants\", Scientific reports 7:1, 2959. doi: 10.1038/s41598-017-03011-5.
Schubert M., Klinge, B., Kl\u00fcnemann M., Sieber A., Uhlitz F., Sauer S., Garnett M., Bl\u00fcthgen N., Saez-Rodriguez J. (2018). \"Perturbation-response genes reveal signaling footprints in cancer gene expression\". Nature Communications. 9: 20, 2018. doi: 10.1038/s41467-017-02391-6
"},{"location":"misc/publication-list/#2017","title":"2017","text":"Euskirchen, P., Bielle, F., Labreche, K., Kloosterman, W.P., Rosenberg, S., Daniau, M., Schmitt, C., Masliah-Planchon, J., Bourdeaut, F., Dehais, C., et al. (2017). Same-day genomic and epigenomic diagnosis of brain tumors using real-time nanopore sequencing. Acta Neuropathol 1\u201313. doi: 10.1007/s00401-017-1743-5
Euskirchen, P., Radke, J., Schmidt, M.S., Heuling, E.S., Kadikowski, E., Maricos, M., Knab, F., Grittner, U., Zerbe, N., Czabanka, M., et al. (2017). Cellular heterogeneity contributes to subtype-specific expression of ZEB1 in human glioblastoma. PLOS ONE 12, e0185376. doi: 10.1371/journal.pone.0185376
Mattei D., Ivanov A., Ferrai C., Jordan P., Guneykaya D., Buonfiglioli A., Schaafsma W., Przanowski P., Deuther-Conrad W., Brust P., Hesse S., Patt, M., Sabri, O., Ross, T.L., Eggen, B.J.L., Boddeke E.W.G.M., Kaminska, B., Beule, D., Pombo, A., Kettenmann, H., Wolf, S.A. (2017). \"Maternal immune activation results in complex microglial transcriptome signature in the adult offspring that is reversed by minocycline treatment.\" Translational psychiatry. 2017 May;7(5):e1120. doi: 10.1038/tp.2017.80.
Mamlouk, S., Childs, L. H., Aust, D., Heim, D., Melching, F., Oliveira, C., Wolf, T., Durek, P., Schumacher, D., Bl\u00e4ker, H., von Winterfeld, M., Gastl, B., M\u00f6hr, K., Menne, A., Zeugner, S., Redmer, T., Lenze, D., Tierling, S., M\u00f6bs, M., Weichert, W., Folprecht, G., Blanc, E., Beule, D., Sch\u00e4fer, R., Morkel, M., Klauschen, F., Leser, U. and Sers, C. (2017). \"DNA copy number changes define spatial patterns of heterogeneity in colorectal cancer\", Nature Communications. 2017; 8, p. 14093. doi: 10.1038/ncomms14093.
Messerschmidt, C., Holtgrewe, M. and Beule, D. (2017). \"HLA-MA: simple yet powerful matching of samples using HLA typing results\". Bioinformatics. 28, pp. 2592\u20132599. doi: 10.1093/bioinformatics/btx132.
Kammertoens, T., Friese, C., Arina, A., Idel, C., Briesemeister, D., Rothe, M., Ivanov, A., Szymborska, A., Patone, G., Kunz, S., Sommermeyer, D., Engels, B., Leisegang, M., Textor, A., Fehling, H. J., Fruttiger, M., Lohoff, M., Herrmann, A., Yu, H., Weichselbaum, R., Uckert, W., H\u00fcbner, N., Gerhardt, H., Beule, D., Schreiber, H. and Blankenstein, T. (2017). \"Tumour ischaemia by interferon-\u03b3 resembles physiological blood vessel regression\". Nature. 545(7652), pp. 98\u2013102. doi: 10.1038/nature22311.
Schulze Heuling, E., Knab, F., Radke, J., Eskilsson, E., Martinez-Ledesma, E., Koch, A., Czabanka, M., Dieterich, C., Verhaak, R.G., Harms, C., et al. (2017). Prognostic Relevance of Tumor Purity and Interaction with MGMT Methylation in Glioblastoma. Mol. Cancer Res. 15, 532\u2013540. doi: 10.1158/1541-7786.MCR-16-0322
Yaakov, G., Lerner, D., Bentele, K., Steinberger, J., Barkai, N., Bigger, J., Maisonneuve, E., Gerdes, K., Lewis, K., Dhar, N., McKinney, J. D., Gefen, O., Balaban, N. Q., Jayaraman, R., Balaban, N. Q., Merrin, J., Chait, R., Kowalik, L., Leibler, S., Balaban, N. Q., Allison, K. R., Brynildsen, M. P., Collins, J. J., Nathan, C., Lewis, K., Glickman, M. S., Sawyers, Knoechel, B., Welch, A. Z., Gibney, P. A., Botstein, D., Koshland, D. E., Levy, S. F., Ziv, N., Siegal, M. L., Stewart-Ornstein, J., Weissman, J. S., El-Samad, H., Gasch, A. P., Weinert, T., Hartwell, L., Weinert, T. A., Hartwell, L. H., Lisby, M., Rothstein, R., Mortensen, U. H., Lisby, M., Mortensen, U. H., Rothstein, R., Domkin, V., Thelander, L., Chabes, A., Hendry, J. A., Tan, G., Ou, J., Boone, C., Brown, G. W., Berry, D. B., Gasch, A. P., Lynch, M., Nishant, K. T., Serero, A., Jubin, C., Loeillet, S., Legoix-Ne, P., Nicolas, A. G., Huh, W. K., Janke, C., Lee, S. E., Blecher-Gonen, R., Martin, M., Cherry, J. M., McKenna, A., DePristo, M. A., Lawrence, M., Obenchain, V., Ye, K., Schulz, M. H., Long, Q., Apweiler, R., Ning, Z., Layer, R. M., Chiang, C., Quinlan, A. R., Hall, I. M., Faust, G. G., Hall, I. M., Boeva, V., Boeva, V., Li, H., Koren, A., Soifer, I. and Barkai, N. (2017). \"Coupling phenotypic persistence to DNA damage increases genetic diversity in severe stress\". Nature Ecology & Evolution. 1(1), pp. 497\u2013500. doi: 10.1038/s41559-016-0016.
Uhlitz, F., Sieber, A., Wyler, E., Fritsche-Guenther, R., Meisig, J., Landthaler, M., Klinger, B., Bl\u00fcthgen, N. (2017). \"An immediate-late gene expression module decodes ERK signal duration\". Molecular Systems Biology. 13: 928, 2017. doi: 10.15252/msb.20177554.
"},{"location":"misc/publication-list/#theses","title":"Theses","text":""},{"location":"misc/publication-list/#2019_1","title":"2019","text":"Schumann F. (2019). \"Establishing a pipeline for stable mutational signature detection and evaluation of variant filter effects\". Freie Universit\u00e4t Berlin. Bachelor Thesis, Bioinformatics.
"},{"location":"misc/publication-list/#2018_1","title":"2018","text":"Borgsm\u00fcller N. (2018). \"Optimization of data processing in GC-MS metabolomics\", Technische Universit\u00e4t Berlin. Master Thesis, Biotechnology.
Kuchenbecker, S.-L. (2018). \"Analysis of Antigen Receptor Repertoires Captured by High Throughput Sequencing\". Freie Universit\u00e4t Universit\u00e4t Berlin. PhD Thesis, Dr. rer. nat. URN:NBN: urn:nbn:de:kobv:188-refubium-22171-8
Schubach M. (2018). \"Learning the Non-Coding Genome\", Freie Universit\u00e4t Universit\u00e4t Berlin. PhD Thesis, Dr. rer. nat. URN:NBN: urn:nbn:de:kobv:188-refubium-23332-7
"},{"location":"misc/publication-list/#posters","title":"Posters","text":""},{"location":"misc/publication-list/#2018_2","title":"2018","text":"Roskosch, S., Hald\u00f3rsson B., Kehr, B. (2018). \"PopDel: Population-Scale Detection of Genomic Deletions\" ECCB 2018. Poster.
White T., Kehr B. (2018). \"Comprehensive extraction of structural variations from long-read DNA sequences\" WABI 2018. Poster.
"},{"location":"misc/publication-list/#2017_1","title":"2017","text":"Schubach M., Re R., Robinson P.N., Valentini G. (2017). \"Variant relevance prediction in extremely imbalanced training sets\" ISMB/ECCB 2017. Poster.
White T., Kehr B. (2017). \"Improving long-read mapping with simple lossy sequence transforms\" ISMB/ECCB 2017. Poster.
"},{"location":"ondemand/interactive/","title":"OnDemand: Interactive Sessions","text":"Interactive sessions allow you to start and manage selected apps. Depending on the app they run as servers or GUIs. Selecting My Interactive Sessions
in the top menu will direct you to the overview of currently running sessions. The left-hand panel provides a short cut to start a new session of one of the provided apps.
Each running interactive session is listed. Each card corresponds to one session. The title of each card provides the name, allocated resources and the current status. Furthermore, detailed information and links are available:
- Host: Provides the name of the node the session is running on. Click on the host name to open a shell to the given cluster node.
- Time remaining: Time until session till terminate.
- Session ID: Click to open the session directory in the interactive file browser (see below).
- Connect to: This will open the app in your browser (opens a new tab).
- Delete: Terminate the session.
Don't hit reload in your apps
Please note that the portal will use the authentication mechanisms of the apps to ensure that nobody except for you can connect to the session. This means that hitting the browsers \"reload\" button in your app will most likely not work.
Just go back to the interactive session list and click on the \"connect\" button.
"},{"location":"ondemand/interactive/#session-directories","title":"Session Directories","text":"The portal software will create a folder ondemand
in your home directory. Inside, it will create session directories for each started interactive job. For technical reasons, these folders have very long names, for example:
$HOME/ondemand/data/sys/dashboard/batch_connect/sys/ood-bih-rstudio-server/output/e40e03b3-11ca-458a-855b-98e6f148c99a/
This follows the pattern:
$HOME/${application name}/output/${job UUID}
The job identifier used is not the Slurm job ID but an identifier internal to OnDemand. Inside this directory you will find log files and a number of scripts that are used to start your job.
If you need to debug any interactive job, start here. Also, the helpdesk will need the path to this folder to help you with interactive jobs.
You can find the name of the latest output folder with the following command:
$ ls -lhtr $HOME/${application name}/output | tail -n 1\n
For example, for RStudio Server:
$ ls -lhtr $HOME/ondemand/data/sys/dashboard/batch_connect/sys/ood-bih-rstudio-server/output | tail -n 1\n
Prevent Home From Filling Up
You should probably move ~/ondemand
to your work volume with the following:
$ mv ~/ondemand ~/work/ondemand\n$ ln -sr ~/work/ondemand ~/ondemand\n
Make sure to delete potential interactive sessions and to logout from the Ondemand Portal first. Otherwise, the ~/ondemand
folder is constantly recreated and the symlink will be just created within this folder as ~/ondemand/ondemand
and thus not be used as intended.
Also, clear out ~/work/ondemand/*
from time to time but take care that you don't remove the directory of any running job.
"},{"location":"ondemand/interactive/#example-1-default-rstudio-session","title":"Example 1: Default RStudio Session","text":"This description of starting an RStudio session is a showcase for starting other interactive apps as well.
To start the session, please go to Interactive Apps
in the top menu bar and select RStudio Server
or click RStudio Server
in the left-hand panel.
Allocate appropriate resources and click Launch
.
An info card for the RStudio Server will be added to My Interactive Sessions
, and during start, it will change its state from Queued
to Starting
to Running
. Depending on the app, resources allocated and current cluster usage, this will take a couple of seconds.
When in the final state (Running
), one can directly connect to the RStudio Server to get an interactive session by clicking Connect to RStudio Server
:
"},{"location":"ondemand/interactive/#example-2-rstudio-session-with-custom-r-installation-from-conda","title":"Example 2: RStudio Session with custom R-installation from conda","text":"To use the OnDemand portal with a specific R installation including a stable set of custom packages you can use a conda enviroment from the cluster as a R source.
For this you may first need to create this conda environment including your R version of choice and all necessary packages. Specific installations of i.e. python from conda can be used similarly in other interactive apps.
- For reproducibility this environment should clearly define all package versions and include dependencies. This is easiest to achieve by first collecting all packages you need into a primary collection (i.e. a yaml file, potentially including a specific R version for r-base if needed) and creating an environment from there. Exporting this environment will generate a file with all used packages and their version numbers, that can be used to recreate the same environment.
- Example code:
Click to expand * Commands: + `conda env create -n R-example -f R-example.yaml` + `conda activate R-example` + `conda env export -f R-fixed-versions.yaml` + `conda env create -n R-fixed-versions -f R-fixed-versions.yaml` * R-example.yaml channels:\n - conda-forge\n - bioconda\n - defaults\ndependencies:\n - r-base\n - r-essentials\n - r-devtools\n - bioconductor-deseq2\n - r-tidyverse\n - r-rmarkdown\n - r-knitr\n - r-dt\n
- R packages only available from github
Some packages (i.e. several single-cell-RNAseq analysis tools) are only available from github and not on Cran/Bioconductor. There are two ways to install such packages into a conda enviroment.
Click to expand 1) Install from inside R \\[easier option, but not pure conda\\] * First setup the conda env, ideally including all dependencies for the desired package from github (and do include r-devtools) * Then within R run `devtools::install_github('owner/repo', dependencies=F, upgrade=F, lib='/path/to/conda/env-name/lib/R/library')` * if you don't have all dependencies already installed you will have to omit dependencies=F and risk a mix of conda & native R installed packages (or just have to redo the conda env). * github_install involves a build process and still needs a bit of memory, so this might crash on the default `srun --pty bash -i` shell 2) Build packages into a local conda channel \\[takes longer, but pure conda\\]\\ This approach is mostly taken from the answers given [here](https://stackoverflow.com/questions/52061664/install-r-package-from-github-using-conda). These steps must be taken _before_ building the final env used with Rstudio * use `conda skeleton cran https://github.com/owner/repo [--git-tag vX.Y]` to generate build files * conda skeleton only works for repositories with a release/version tag. If the package you want to install does not have that, you either need to create a fork and add a such a tag, or find a fork that already did that. Downloading the code directly from github and building the package from that is also possible, but you will the need to manually set up the `meta.yaml` and `build.sh` files that conda skeleton would create. * If there is more than one release tag, do specify which one you want, it may not automatically take the most recent one. * If any r-packages from bioconductor are dependencies, conda will not find them during the build process. You will need to change the respective entries in the `meta.yaml` file created by conda skeleton. I.e. change `r-deseq2` to `bioconductor-deseq2` * Build the package with `conda build --R= [--use-local] r-` * You need to specifying the same R-version used in the final conda env * If the github package has additional dependencies from github, build those first and then add `--use-local` so the build process can find them. * The build process definitely needs more memory than the default `srun --pty bash -i` shell. It also takes quite a bit of time (much longer than installing through devtools::install_github) * Finally add the packages (+versions) you built to the environment definition (i.e. yaml file) and create the (final) conda environment. Don't forget to tell conda to use locally build packages (either supply `--use-local` or add `- local` to the channel list in the yaml file) Starting the Rstudio session via the OnDemand portal works almost as described above (see Example 1). However, you do have to select `miniforge` as R source and provide the path to your miniforge installation and (separated by a colon) the name of the (newly created) conda enviroment you want to use.
Additional notes:
- Updating the conda env, that an already running rstudio instance is using, does work but does requires a restart of the R session to take effect
- If you are starting a new interactive Rstudio session but with a different conda environment than before, Rstudio will still start from the same project as before. In this case the 'old' project likely still contains the previous
.libPaths()
entries and therefore a link to your previous conda installation. Creating a new project cleans .libPaths()
to only the env specified in setting up the Rstudio session.
"},{"location":"ondemand/overview/","title":"The Open OnDemand Portal","text":"Status / Stability
OnDemand Support is currently in beta phase on the BIH HPC. In case of any issues, please send an email to hpc-helpdesk@bih-charite.de.
To allow for better interactive works, BIH HPC administration has setup an Open OnDemand (OOD) portal web server.
You can find the OnDemand Portal for HPC 4 Research at:
- https://hpc-portal.cubi.bihealth.org
"},{"location":"ondemand/overview/#background","title":"Background","text":"OOD allows you to access cluster resources using a web-based graphical interface in addition to traditional SSH connections. You can then connect to jobs running graphical applications either to virtual desktops (such as Matlab) or to web apps (such as Jupyter and RStudio Server).
The following figure illustrates this.
The primary way to the cluster continues to be SSH which has several advantages. By the nature of the cluster being based on Linux servers, it will offer more features through the \"native\" access and through its lower complexity, it will offer higher stability. However, we all like to have the option of a graphical interface, at least from time to time .
The main features are:
- Easy web-based access to Jupyter and RStudio Server on the cluster.
- Generally lower the entry barrier of using the HPC system.
"},{"location":"ondemand/overview/#logging-into-the-portal","title":"Logging into the Portal","text":"The first prerequisite is to have a cluster account already (see Getting Access). Once you have done your first SSH connection to the cluster successfully you can start using the portal. For this you perform the following steps:
- Go to https://hpc-portal.cubi.bihealth.org - you will be redirected to the login page shown below. If you have an account with Charite (ends in
_c
) then please use the \"Charit\u00e9 - Universit\u00e4tmedizin Berlin\" button, for MDC Accounts please use the \"Max Delbr\u00fcck Center Berlin\" button. - Login with your home organization's SSO system. Please note that depending on whether you are accessing the system via the wired network in your home organization or via VPN the SSO might look differently.
Clicked the Wrong Login Button?
If you clicked the wrong button then please clear your cookies to force a logout of the system.
"},{"location":"ondemand/overview/#prepare-ondemand-folder","title":"Prepare OnDemand Folder","text":"The ondemand
folder is automatically created in your home directory, and the OnDemand service searches for this folder in your home directory, i.e. it has to stay there. But as the quota in the home directory is very limited, you can easily hit the hard quota which might prevent you from working on the cluster.
To prevent this, move the ~/ondemand
folder to the ~/work
folder and create a symlink for the now dislocated ~/ondemand
folder:
hpc-login-1:~$ mv ~/ondemand ~/work/ondemand\nhpc-login-1:~$ ln -sr ~/work/ondemand ~/ondemand\n
Important
Make sure to delete potential interactive sessions and to logout from the Ondemand Portal first. Otherwise, the ~/ondemand
folder is constantly recreated and the symlink will be just created within this folder as ~/ondemand/ondemand
and thus not be used as intended.
"},{"location":"ondemand/overview/#portal-dashboard","title":"Portal Dashboard","text":"Problems with Open OnDemand?
First try to log out and login again. Next, try to clear all cookies for the domain hpc-portal.cubi.bihealth.org
. Finally, try the Help > Restart Web Server
link to restart the per-user nginx (PUN) server.
You will then be redirected to the dashboard screen.
Here you have access to the following actions. We will not go into detail of all of them and expect them to be self-explanatory.
Important
Please note that when using the portal then you are acting as your HPC user. Use standard best practice. Consider carefully what you do as you would from the command line (e.g., don't use the portal to browse the web from the cluster).
- Files
- Home Directory - Access a file browser.
- Quotas - Display quota information (only available on HPC 4 Research).
- Jobs
- Active Jobs - List your jobs.
- Job Composer - Start a new job.
- Clusters
- Shell Access - Shell access in your browser.
- Interactive Apps
- Mate and Xfce Desktops - Start virtual desktops on the HPC.
- Matlab - Run a virtual desktop that has Matlab installed.
- MaxQuant - Run a virtual desktop that has MaxQuant installed.
- Jupyter - Run Jupyter on the HPC and easily connect to it from your browser without setting up any SSH tunnels.
- RStudio Server - Run RStudio Server on the HPC and easily connect to it from your browser without setting up any SSH tunnels.
- My Interactive Sessions - See details of your currently running interactive sessions.
- Help
- Contact Support - Links ot the \"Getting Help\" page in this documentation.
- Online Documentation - Links to this documentation.
- Restart Web Server - Try this if the portal acts weird before contacting the helpdesk. OnDemand runs a web server per user, so this does not affect any other user.
- Log Out - Log out of the system.
"},{"location":"ondemand/quotas/","title":"OnDemand: Quota Inspection","text":"Outdated
This document is only valid for the old, third-generation file system and will be removed soon. Quotas of our new CephFS storage are communicated via the HPC Access web portal.
Accessing the quota report by selecting Files
and then Quotas
in the top menu will provide you with a detailed list of all quotas for directories that you are assigned to.
There are two types of quotas: for (a) size of and (b) number of files in a directory.
Every row in the table corresponds to a directory that you have access to. This implies your home directory (fast/users
) as well as the group directory of your lab (fast/groups
) and possible projects (fast/projects
) (if any). Quotas are not directly implied on these directories but on the home
, scratch
and work
subdirectories that each of subdirectory of the beforementioned directories has (for a detailed explanation see Storage and Volumes).
The following list explains the columns of the table:
- path resembles the path to the directory the quota is displayed for. Please note that this is not actually a path but the fileset name the cluster uses internally to handle the associated directory/path. The \"real\" path can be derived by preceding the name with a slash (
/
) and substituting the underscores with a slash in the (users|groups|projects)_
and _(home|scratch|work)
substring. The corresponding path for name fast/users_stolpeo_c_home
would be /fast/users/stolpeo_c/home
. - block usage gives the current size of the directory/fileset. The unit is variable and directly attached to the number.
- block soft limit gives the soft quota for the directory/fileset. Exceeding the soft quota (and staying below the hard quota) will trigger the grace period. The unit is variable and directly attached to the number.
- block hard limit gives the hard quota for the directory/fileset. Exceeding the hard quota is not possible and will prevent you from writing any data to the directory. That might cause trouble even deleting files as logging in and browsing the file system may create data. The unit is variable and directly attached to the number.
- block grace gives the grace period in days when exceeding the soft quota.
- files usage gives the number of files in the directory tree.
- files soft limit gives the soft quota for the allowed number of files in the directory/fileset. Exceeding the soft quota (and staying below the hard quota) will trigger the grace period.
- files hard limit gives the hard quota for the allowed number of files. Exceeding the hard quota is not possible and will prevent you from writing any data to the directory. That might cause trouble even deleting files as logging in and browsing the file system may create data.
- files grace gives the grace period in days when exceeding the soft quota for files.
"},{"location":"overview/architecture/","title":"Cluster Architecture","text":"BIH HPC IT provides acess to high-performance compute (HPC) cluster systems. A cluster system bundles a high number of nodes and in the case of HPC, the focus is on performance (with contrast to high availability clusters).
"},{"location":"overview/architecture/#hpc-4-research","title":"HPC 4 Research","text":""},{"location":"overview/architecture/#cluster-hardware","title":"Cluster Hardware","text":" - approx. 256 nodes (from three generations),
- 4 high-memory nodes (2 nodes with 512 GB RAM, 2 nodes with 1 TB RAM),
- 7 GPU nodes with 4 Tesla GPUs each, 1 GPU node with 10 A40 GPUs, and
- a high-performance Tier 1 parallel CephFS file system with a larger but slower Tier 2 CephFS file system, and
- a legacy parallel GPFS files system.
"},{"location":"overview/architecture/#network-interconnect","title":"Network Interconnect","text":" - Older nodes are interconnected with 2x10GbE/2x40GbE
- Recent nodes are interconnected with 2x25GbE/2x100GbE
"},{"location":"overview/architecture/#cluster-management","title":"Cluster Management","text":"Users don't connect to nodes directly but rather create interactive or batch jobs to be executed by the cluster job scheduler Slurm.
- Interactive jobs open interactive sessions on compute nodes (e.g., R or iPython sessions). These jobs are run directly in the user's terminal.
- Batch jobs consist a job script with execution instructions (a name, resource requirements etc.) These are submitted to the cluster and then assigned to compute hosts by the job scheduler. Users can configure the scheduler to send them an email upon completion. Users can submit many batch jobs at the same time and the scheduler will execute them once the cluster offers sufficient resources.
- Web-based access can be achieved using the OnDemand Portal
"},{"location":"overview/architecture/#head-vs-compute-nodes","title":"Head vs. Compute Nodes","text":"As common with HPC systems, users cannot directly access the compute nodes but rather connect to so-called head nodes. The BIH HPC system provides the following head nodes:
login-1
and login-2
that accept SSH connections and are meant for low intensity, interactive work such as editing files, running screen/tmux sessions, and logging into the compute nodes. Users should run no computational tasks and no large-scale data transfer on these nodes. transfer-1
and transfer-2
also accept SSH connections. Users should run all large-scale data transfer through these nodes.
"},{"location":"overview/architecture/#common-use-case","title":"Common Use Case","text":"After registration and client configurations, users with typically connect to the HPC system through the login nodes:
local:~$ ssh -l jdoe_c hpc-login-1.cubi.bihealth.org\nhpc-login-1:~$\n
Subsequently, they might submit batch jobs to the cluster for execution through the Slurm scheduling system or open interactive sessions:
hpc-login-1:~$ sbatch job_script.sh\nhpc-login-1:~$ srun --pty bash -i\nmed0104:~$\n
"},{"location":"overview/for-the-impatient/","title":"Overview","text":""},{"location":"overview/for-the-impatient/#bih-hpc-4-research","title":"BIH HPC 4 Research","text":"BIH HPC 4 Research is located in the BIH data center in Buch and connected via the BIH research network. Connections can be made from Charite, MDC, and BIH networks. The cluster is open for users with either Charite or MDC accounts after getting access through the gatekeeper proces. The system has been designed to be suitable for the processing of human genetics data from research contexts (and of course data without data privacy concerns such as public and mouse data).
"},{"location":"overview/for-the-impatient/#cluster-hardware-and-scheduling","title":"Cluster Hardware and Scheduling","text":"The cluster consists of the following major components:
- 2 login nodes for users
hpc-login-1
and hpc-login-2
(for interactive sessions only), - 2 nodes for file transfers
hpc-transfer-1
and hpc-transfer-2
, - a scheduling system using Slurm,
- 228 general purpose compute nodes
hpc-cpu-{1..228}
- a few high memory nodes
hpc-mem-{1..5}
, - 7 nodes with 4 Tesla V100 GPUs each (!)
hpc-gpu-{1..7}
and 1 node with 10x A40 GPUs (!) hpc-gpu-8
, - a legacy parallel GPFS file system with 2.1 PB, by DDN mounted at
/fast
, - a next generation high-performance storage system based on Ceph/CephFS
- a tier 2 (slower) storage system based on Ceph/CephFS
This is shown by the following picture:
"},{"location":"overview/for-the-impatient/#differences-between-workstations-and-clusters","title":"Differences Between Workstations and Clusters","text":"The differences include:
- The directly reachable login nodes are not meant for computation! Use
srun
to go to a compute node. - Every time you type
srun
to go to a compute node you might end up on a different host. - Most directories on the nodes are not shared, including
/tmp
. - The
/fast
directory is shared throughout the cluster which contains your home, group home, and project directories. - You will not get
root
or sudo
permissions on the cluster. - You should prefer batch jobs (
sbatch
) over calling programs interactively.
"},{"location":"overview/for-the-impatient/#what-the-cluster-is-and-is-not","title":"What the Cluster Is and Is NOT","text":"NB: the following might sound a bit harsh but is written with everyone's best intentions in mind (we actually like you, our user!) This addresses a lot of suboptimal (yet not dangerous, of course) points we observed in our users.
IT IS
- It is scientific infrastructure just like a lab workbench or miscroscope. It is there to be used for you and your science. We trust you to behave in a collaboratively. We will monitor usage, though, and call out offenders.
- With its ~200 nodes, ~6400 threads and fast parallel I/O, it is a powerful resource for life science high performance computation, originally optimized at bioinformatics sequence processing.
- A place for data move data at the beginning of your project. By definition, every project has an end. Your project data needs to leave the cluster at the end of the project.
- A collaborative resource with central administration managed by BIH HPC IT and supported via hpc-helpdesk@bih-charite.de
IT IS NOT
- A self-administrated workstation or servers.
- You will not get
sudo
. - We will not install software beyond those in broad use and available in CentOS Core or EPEL repositories.
- You can install software in your user/group/project directories, for example using Conda.
- A place to store primary copies of your data. You only get 1 GB of storage in your home for scripts, configuration, and documents.
- A safe place to store data. Only your 1 GB of home is in snapshots and backup. While data is stored on redundant disks, technical or administrative failure might eventually lead to data loss. We do everything humanly possible to prevent this. Despite this, it is your responsibility to keep important files in the snapshot/backup protected home, ideally even in copy (e.g., a git repository) elsewhere. Also, keeping safe copies of primary data files, your published results, and the steps in between reproducible is your responsibility.
- A place to store data indefinitely. The fast CephFS Tier 1 storage is expensive and \"rare\". CephFS Tier 2 is bigger in volume, but still not unlimited. The general workflow is: (1) copy data to cluster, (2) process it, creating intermediate and final results, (3) copy data elsewhere and remove it from the cluster
- Generally suitable for primary software development. The I/O system might get overloaded and saving scripts might take some time. We know of people who do this and it works for them. Your mileage might vary.
"},{"location":"overview/job-scheduler/","title":"Job Scheduler","text":"Once logged into the cluster through the login nodes, the Slurm scheduler needs to be used to submit computing jobs. In Slurm nomenclature, cluster compute nodes are assigned to one or more partitions. Submitted jobs are assigned to nodes according to the partition's configuration.
"},{"location":"overview/job-scheduler/#partitions","title":"Partitions","text":"The BIH HPC has the partitions described below. The cluster focuses on life science applications and not \"classic HPC\" with numerical computations using MPI. Thus, all partitions except for mpi
only allow to reserve resources on one node. This makes the cluster easier to use as users don't have to explicitely specify this limit when submitting their jobs.
"},{"location":"overview/job-scheduler/#standard","title":"standard
","text":"Jobs are submitted to the standard
partition by default. From the, the scheduler will route the jobs to their actual partition using the routing rule set described below. You can override this routing by explicitely assigning a partition (but this is discouraged).
- Jobs requesting a GPU resources are routed to the
gpu
queue. - Else, jobs requesting more than 200 GB of RAM are routed to the
highmem
queue. - Else, jobs are assigned to the partitions
debug
, short
, medium
, and long
long depending on their configured maximal running time. The partitions are evaluated in the order given above and the first fitting partition will be used.
"},{"location":"overview/job-scheduler/#debug","title":"debug
","text":"This partition is for very short jobs that should be executed quickly, e.g., for tests. The job running time is limited to one hour and at most 128 cores can be used per user but the jobs are submitted with highest priority.
- maximum run time: 1 hour
- maximum cores: 128 cores per user
- partition name:
debug
- argument string: maximum run time:
--time 01:00:00
"},{"location":"overview/job-scheduler/#short","title":"short
","text":"This partition is for jobs running only few hours. The priority of short jobs is high and many cores can be used at once to reward users for splitting their jobs into smaller parts.
- maximum run time: 4 hours
- maximum cores: 2000 cores
- partition name:
short
- argument string: maximum run time:
--time 04:00:00
"},{"location":"overview/job-scheduler/#medium","title":"medium
","text":"This partition is for jobs running for multiple days. Users can only allocate the equivalent of 4 nodes.
- maximum run time: 7 days
- maximum cores: 128 cores/slots (4 nodes)
- partition name:
medium
- argument string: maximum run time:
--time 7-00:00:00
"},{"location":"overview/job-scheduler/#long","title":"long
","text":"This partition is for long-running tasks. Only one node can be reserved for so long to discourage really long-running jobs and encourage users for splitting their jobs into smaller parts.
- maximum run time: 14 days
- maximum cores: 32 cores/slots (1 node)
- partition name:
long
- argument string: maximum run time:
--time 14-00:00:00
"},{"location":"overview/job-scheduler/#gpu","title":"gpu
","text":"Jobs requesting GPU resources are automatically assigned to the gpu
partition.
The GPU nodes are only part of the gpu
partition so they are not blocked by normal compute jobs. Maximum run time is relatively high (14 days) to allow for longer training jobs. Contact hpc-helpdesk@bih-charite.de if you have longer running jobs that you really cannot make run any shorter for assistance.
Info
Fair use rules apply. As GPU nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
- maximum run time: 14 days
- partition name:
gpu
- argument string: select
$count
GPUs: -p gpu --gres=gpu:$card:$count
(card=tesla
or card=a40
), maximum run time: --time 14-00:00:00
"},{"location":"overview/job-scheduler/#highmem","title":"highmem
","text":"Jobs requesting more than 200 GB of RAM are automatically routed to the highmem
partition.
The high memory nodes are only part of the highmem
partition so they are not blocked by normal compute jobs. Maximum run time is relatively high (14 days) to allow for longer jobs. Contact hpc-helpdesk@bih-charite.de for assistance if you have longer running jobs that you really cannot make run any shorter.
Info
Fair use rules apply. As high-memory nodes are a limited resource, excessive use by single users is prohibited and can lead to mitigating actions. Be nice and cooperative with other users. Tip: getent passwd USER_NAME
will give you a user's contact details.
- maximum run time: 14 days
- partition name:
highmem
- argument string:
-p highmem
, maximum run time: --time 14-00:00:00
"},{"location":"overview/job-scheduler/#mpi","title":"mpi
","text":"Jobs are not routed automatically to the mpi
partition but you have to explitely request the partition. This is the only partition in which more than one node can be allocated to a job.
You can submit multi-node jobs into the mpi
partition. Maximum run time is relatively high (14 days) to allow for longer jobs. Don't abuse this. Contact hpc-helpdesk@bih-charite.de for assistance if you have longer running jobs that you really cannot make run any shorter.
- maximum run time: 14 days
- partition name:
highmem
- argument string:
-p mpi
, maximum run time: --time 14-00:00:00
"},{"location":"overview/job-scheduler/#critical","title":"critical
","text":"Jobs are not routed into critial
automatically and the partition has to be selected manually.
This partition is for time-critical jobs with deadlines. As long as the cluster is not very busy, requests for critical jobs will be granted most of the time. However, do not use this partition without arranging with hpc-helpdesk as killing jobs will be used as the ultima ratio in case of such policy violations.
- maximum run time: 7 days
- maximum cores: 2000 cores/slots (48 nodes)
- partition name:
critical
- argument string: maximum run time:
--time 7-00:00:00
"},{"location":"overview/monitoring/","title":"Monitoring","text":"We currently provide you only with Ganglia for monitoring the cluster status.
"},{"location":"overview/monitoring/#using-ganglia","title":"Using Ganglia","text":"Go to the following address and login with your home organization (Charite or MDC):
- https://hpc-ganglia.cubi.bihealth.org
Ganglia does not know about Slurm
Ganglia will not show you anything about the Slurm job schedulign system. If a job uses a whole node but uses no CPUs then this will be displayed as unused in Ganglia. However, Slurm would not schedule another job on this node.
You will be show a screen as shown below. This allows you to get a good idea of what is going on on the HPC.
By default you will be shown the cluster usage of the last day. You can quickly switch to report for two or four hours as well, etc.
In the first row of pictures you see the number of total CPUs (actually hardware threads), number of hosts seen as up and down by Ganglia, and cluster load/utilization. You will then see the overall cluster load, memory usage, CPU usage, and network utilization across the selected time period.
Linux load is not intuitive
Note that the technical details behind Linux load is not very interactive. It is incorporating much more than just the CPU usage. You can find a quite comprehensive treatement of Linux Load here.
We are using a fast shared storage system and almost no local storage (except in /tmp
). Also, almost no jobs use MPI or other heavy network communication. Thus, the network utilization is a good measure of the I/O on the cluster.
Below, you can drill down into various metrics and visualize them historically. Just try it out and find your way around, you cannot break anything. Sadly, there is no good documentation of Ganglia online.
"},{"location":"overview/monitoring/#aggregate-gpu-utilization-visualization","title":"Aggregate GPU Utilization Visualization","text":"Ganglia allows you to obtain metrics in several interesting and useful ways. If you click on \"Aggregate Graphs\" then you could enter the following values to get an overview of the live GPU utilization.
- Title:
Aggreate GPU Utilization
- Host Regular expression:
hpc-gpu-.*
- Metric Regular Expressions:
gpu._util
- Graph Type:
Stacked
- Legend Options:
Hide legend
Then click Create Graph
.
If a GPU is fully used, it will contribute 100 points on the vertical axis. See above for an example, and here is a direct link:
- Aggregate GPU Utilization
"},{"location":"overview/storage/","title":"Nodes and Storage Volumes","text":"No mounting on the cluster itself.
For various technical and security-related reasons it is not possible to mount anything on the cluster nodes by users. For mounting the cluster storage on your computer, please read Connecting: SSHFS Mounts.
This document gives an overview of the nodes and volumes on the cluster.
"},{"location":"overview/storage/#cluster-layout","title":"Cluster Layout","text":""},{"location":"overview/storage/#cluster-nodes","title":"Cluster Nodes","text":"The following groups of nodes are available to cluster users. There are a number of nodes that are invisible to non-admin staff, hosting the queue master and monitoring tools and providing backup storage for key critical data, but these are not shown here.
hpc-login-{1,2}
- available as
hpc-login-{1,2}.cubi.bihealth.org
- do not perform any computation on these nodes!
- each process may at most use 1 GB of RAM
med0101..0124,0127
- 25 standard nodes
- Intel Xeon E5-2650 v2 @2.60Ghz, 16 cores x2 threading
- 128 GB RAM
med0133..0164
- 32 standard nodes
- Intel Xeon E5-2667 v4 @3.20GHz, 16 cores x 2 threading
- 192 GB RAM
med0201..0264
- 64 nodes with Infiniband interconnect
- Intel Xeon E5-2650 v2 @2.60GHz, 16 cores x 2 threading
- 128 GB RAM
med0301..0304
- 4 nodes with 4 Tesla V100 GPUs each
med0401..0405
special purpose/high-memory machines - Intel Xeon E5-4650 v2 @2.40GHz, 40 cores x2 threading
med0401
and med0402
- 1 TB RAM
med0403
and med0404
- 500 GB RAM
med0405
- 2x \"Tesla K20Xm\" GPU accelleration cards (cluster resource
gpu
) - access limited to explicit GPU users
med0601..0616
- 16 nodes owned by CUBI
- Intel Xeon E5-2640 v3 @2.60GHz
- 192 GB RAM
med0618..0633
- 16 nodes owned by CUBI
- Intel Xeon E5-2667 v4 @3.20GHz, 16 cores x 2 threading
- 192 GB RAM
med0701..0764
- 64 standard nodes
- Intel Xeon E5-2667 v4 @3.20GHz, 16 cores x 2 threading
- 192 GB RAM
"},{"location":"overview/storage/#cluster-volumes-and-locations","title":"Cluster Volumes and Locations","text":"The cluster has 2.1 PB of legacy fast storage, currently available at /fast
, as well as 1.6 PB of next-generation fast storage, available at /data/cephfs-1
. Additionally 7.4 PB of slower \"Tier 2\" storage is available at /data/cephfs-2
. Storage is provided by a Ceph storage cluster and designed for massively parallel access from an HPC system. In contrast to \"single server\" NFS systems, the system can provide large bandwidth to all cluster nodes in parallel, as long as the data is organized as relatively \"few\" large files rather than many small ones.
Storage is split into three sections:
home
-- small, persistent, and safe storage, e.g., for documents and configuration files (default quota of 1 GB). work
-- larger and persistent storage, e.g., for your large data files (default quota of 1 TB). scratch
-- large and non-persistent storage, e.g., for temporary files; files are automatically deleted after 2 weeks (default quota of 10 TB; deletion not implemented yet).
Each user, group, and project has one or more of these sections, e.g., for users:
/data/cephfs-1/home/users/$NAME
/data/cephfs-1/home/users/$NAME/work
/data/cephfs-1/home/users/$NAME/scratch
See Storage and Volumes: Locations for more information.
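To check that these locations exist for your account and how much space your work folder currently uses, you can list them directly on a login node (a minimal sketch; it assumes your cluster user name is in $USER and that your folders follow the pattern above; du can take a while for large folders):
hpc-login-1:~$ ls -ld /data/cephfs-1/home/users/$USER/{,work,scratch}\nhpc-login-1:~$ du -sh /data/cephfs-1/home/users/$USER/work\n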
"},{"location":"slurm/background/","title":"Introduction to Scheduling","text":"As explained elsewhere in more detail, an HPC cluster consists of multiple computers connected via a network and working together. Multiple users can use the system simultaneously to do their work. This means that the system needs to join multiple computers (nodes) to provide a coherent view of them and the same time partition the system to allow multiple users to work concurrently.
user 1 user 2 ...\n\n .---. .---. .---. .---.\n | J | | J | | J | | J |\n | o | | o | | o | | o | ...\n | b | | b | | b | | b |\n | 1 | | 2 | | 3 | | 4 |\n '---' '---' '---' '---'\n\n.------------------------------------------.\n| Cluster Scheduler |\n'------------------------------------------'\n\n.----------. .------------. .------------.\n| multiple | | separate | | computers |\n'----------' '------------' '------------'\n
"},{"location":"slurm/background/#interlude-partitioning-single-computers","title":"Interlude: Partitioning Single Computers","text":"Overall, this partitioning is not so different from how your workstation or laptop works. Most likely, your computer (or even your smartphone) has multiple processors (or cores). You can run multiple programs on the same computer and the fact that (a) there is more than one core and (b) there is more than one program running is not known to the running programs (unless they explicitly communicate with each other). Different programs can explicitly take advantage of the multiple processor cores. The main difference is that you normally use your computer in an interactive fashion (you perform an action and expect an immediate reaction).
Even with a single processor (and core), your computer manages to run more than one program at the same time. This is done with the so-called time-slicing approach where the operating system lets each program run in turn for a short time (a few milliseconds). A program with a higher priority will get more time slices than one with a lower priority (e.g., your audio player has real-time requirements and you will hear artifacts if it is starved for compute resources). Your operating system protects programs from each other by creating an address space for each. When two programs are running, the value of the memory at any given position in one program is independent of the value in the other program. Your operating system offers explicit functionality for sharing certain memory areas that two programs can use to exchange data efficiently.
Similarly, file permissions with Unix users/groups or Unix/Windows ACLs (access control lists) are used to isolate users from each other. Programs can share data by accessing the same file if they can both access it. There are special files called sockets that allow for network-like inter-process communication but of course two programs on the same computer can also connect (virtually) via the computer network (no data will actually go through a cable).
"},{"location":"slurm/background/#interlude-resource-types","title":"Interlude: Resource Types","text":"As another diversion, let us consider how Unix manages its resources. This is important to understand when requesting resources from the scheduler later on.
First of all, a computer might offer a certain feature such as a specific hardware platform or special network connection. Examples for this on the BIH HPC are specific Intel processor generations such as haswell
or the availability of Infiniband networking. You can request these with so-called constraints; they are not allocated to specific jobs.
Second, there are resources that are allocated to specific jobs. The most important resources here are:
- computing resources (processors/CPUs (central processing units) and cores, details are explained below),
- main memory / RAM,
- special hardware such as GPUs, and
- (wall-clock) time that a job wants to run as an upper bound.
Generally, once a resource has been allocated to one job, it is not available to another. This means that if you allocate more resources to your job than you actually need (overallocation), then those resources are not available to other jobs (whether they are your jobs or those of other users). This will be explained further below.
Licenses are another example of allocatable resources. The BIH HPC has a few Matlab 2016b licenses that users can request. As long as a license is allocated to one job, it is unavailable to another.
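A hedged sketch of how a constraint and several allocatable resources appear together in one request (the feature name is taken from the example above; the license name and job.sh are placeholders and may need to be adapted to the current configuration):
hpc-login-1:~$ sbatch --constraint=haswell --cpus-per-task=4 --mem=8G --time=12:00:00 --licenses=matlab_r2016b:1 job.sh\n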
"},{"location":"slurm/background/#nodes-sockets-processors-cores-threads","title":"Nodes, Sockets, Processors, Cores, Threads","text":"Regarding compute resources, Slurm differentiates between:
- nodes: a compute server,
- sockets: a socket in the compute server that hosts one physical processor,
- processor: a CPU or a CPU core in a multi-core computer (all CPUs in the BIH HPC are multi-core), and
- (hardware) threads: most Intel CPUs feature hardware threads (also known as \"hyperthreading\") where each core appears to be two cores.
In most cases, you will use one compute node only. When using more than one node, you will need to use some form of message passing, e.g., MPI, so processes on different nodes can communicate. On a single node you would mostly use single- or multi-threaded processes, or multiple processes.
Above: Slurm's nomenclature for sockets, processors, cores, and threads (from Slurm Documentation).
Co-locating processes/threads on the same socket has certain implications that are mostly relevant for numerical applications. We will not go into further detail here. Slurm provides many different ways to specify the allocation or \"pinning\" of processes to specific locations. If you need this feature, we trust that you find sufficient explanation in the Slurm documentation.
Usually, you would allocate multiple cores (a term Slurm uses synonymously with processors) on a single node (allocation on a single node is the default).
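For illustration, a typical single-node, multi-core allocation and a matching multi-threaded call could look like this (my_threaded_tool is a placeholder; the thread count should match the number of allocated cores):
hpc-login-1:~$ srun --nodes=1 --ntasks=1 --cpus-per-task=8 --mem=16G --pty bash -i\nmed0201:~$ my_threaded_tool --threads 8\n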
"},{"location":"slurm/background/#how-scheduling-works","title":"How Scheduling Works","text":"Slurm is an acronym for \"Simple Linux Unix Resource Manager\" (note that the word \"scheduler\" does not occur here). Actually, one classically differentiates between the managing of resources and the scheduling of jobs that use them. The resource manager allocates resources according to a user's request for a job and ensures that there are no conflicts. If the required resources are not available, the scheduler puts the user's job into a queue. Later, when then requested resources become available the scheduler assigns them to the job and runs it. In the following, both resource allocation and the running of the job are described as being done by the scheduler.
The interesting case occurs when there are not enough resources available for at least two jobs submitted to the scheduler. The scheduler has to decide how to proceed. Consider the simplified case of only scheduling cores. Each job will request a number of cores. The scheduler will then generate a scheduling plan that might look as follows.
core\n ^\n4 | |---job2---|\n3 | |---job2---|\n2 | |---job2---|\n1 | |--job1--|\n +--------------------------> t time\n 5 1 1 2\n 0 5 0\n
job1
has been allocated one core and job2
has been allocated three cores. When job3
, requesting one core, is submitted at t = 5, it has to wait at least until job1
is finished. If job3
requested two or more cores, it would have to wait at least until job2
also finished.
We can now ask several questions, including the following:
- What if a job runs for less than the allocated time? -- In this case, resources become free and the scheduler will attempt to select the next job(s) to run.
- What if a job runs longer than the allocated time? -- In this case, the scheduler will send an informative Unix signal to the process first. The job will be given a bit more time and if it does not exit it will be forcibly terminated. You will find a note about this at the end of your job log file.
- What if multiple jobs compete for resources? -- The scheduler will prefer certain jobs over others using the Slurm Multifactor Priority Plugin. In practice, small jobs will be preferred over large, users with few used resources in the last month will be favored over heavy consumers, long-waiting jobs will be favored over recently submitted jobs, and many other factors. You can use the sprio utility to inspect these factors in real-time.
- How does the scheduler handle new requests? -- Generally, the scheduler will add new jobs to the waiting queue. The scheduler regularly adjusts its planning by recalculating job priorities. Slurm is configured to perform computationally simple schedule recalculations quite often and larger recalculations more infrequently.
Also see the Slurm Frequently Asked Questions.
Please note that even if all jobs were known at the start of time, scheduling is still a so-called NP-complete problem. Entire computer science journals and books are dedicated only to scheduling. Things get more complex in the case of online scheduling, in which new jobs can appear at any time. In practice, Slurm does a fantastic job with its heuristics but it heavily relies on parameter tuning. HPC administration is constantly working on optimizing the scheduler settings. Note that you can use the --format
option to the squeue
command to request that it shows you information about job scheduling (in particular, see the %S
field, which will show you the expected start time for a job, assuming Slurm has calculated it). See man squeue
for details. If you observe inexplicable behavior, please notify us at hpc-helpdesk@bih-charite.de
.
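For example, the following two commands show the priority factors of your pending jobs and the start time Slurm currently estimates for them (the format string is just one possibility; see man squeue):
hpc-login-1:~$ sprio -u $USER\nhpc-login-1:~$ squeue -u $USER -t PENDING -o \"%.10i %.9P %.8j %.2t %.20S\"\n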
"},{"location":"slurm/background/#slurm-partitions","title":"Slurm Partitions","text":"In Slurm, the nodes of a cluster are split into partitions. Nodes are assigned to one or more partition (see the Job Scheduler section for details). Jobs can also be assigned to one or more partitions and are executed on nodes of the given partition.
In the BIH HPC, partitions are used to stratify jobs of certain running times and to provide different quality of service (e.g., maximal number of CPU cores available to a user for jobs of a certain running time and size). The partitions gpu
and highmem
provide special hardware (the nodes are not assigned to other partitions) and the mpi
partition allows MPI-parallelism and the allocation of jobs to more than one node. The Job Scheduler provides further details.
"},{"location":"slurm/cheat-sheet/","title":"Slurm Cheat Sheet","text":"This page contains assorted Slurm commands and Bash snippets that should be helpful.
man
pages!
$ man sinfo\n$ man scontrol\n$ man squeue\n# etc...\n
interactive sessions
hpc-login-1:~$ srun --pty bash\nmed0740:~$ echo \"Hello World\"\nmed0740:~$ exit\n
batch submission
hpc-login-1:~$ sbatch script.sh\nSubmitted batch job 2\nhpc-login-1:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 27 debug script.s holtgrem R 0:06 1 med0703\n
listing nodes
$ sinfo -N\nNODELIST NODES PARTITION STATE\nmed0740 1 debug* idle\nmed0741 1 debug* down*\nmed0742 1 debug* down*\n\n$ scontrol show nodes\nNodeName=med0740 Arch=x86_64 CoresPerSocket=8\n CPUAlloc=0 CPUTot=32 CPULoad=0.06\n AvailableFeatures=(null)\n[...]\n\n$ scontrol show nodes med0740\nNodeName=med0740 Arch=x86_64 CoresPerSocket=8\n CPUAlloc=0 CPUTot=32 CPULoad=0.06\n AvailableFeatures=(null)\n ActiveFeatures=(null)\n Gres=(null)\n NodeAddr=med0740 NodeHostName=med0740 Version=20.02.0\n OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020\n RealMemory=1 AllocMem=0 FreeMem=174388 Sockets=2 Boards=1\n State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A\n Partitions=debug\n BootTime=2020-03-05T00:54:15 SlurmdStartTime=2020-03-05T16:23:25\n CfgTRES=cpu=32,mem=1M,billing=32\n AllocTRES=\n CapWatts=n/a\n CurrentWatts=0 AveWatts=0\n ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s\n
queue states
$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n
node resources
$ sinfo -o \"%20N %10c %10m %25f %10G \"\n
additional resources such as GPUs
$ sinfo -o \"%N %G\"\n
listing job details
$ scontrol show job 225\nJobId=225 JobName=bash\n UserId=XXX(135001) GroupId=XXX(30069) MCS_label=N/A\n Priority=4294901580 Nice=0 Account=(null) QOS=normal\n JobState=FAILED Reason=NonZeroExitCode Dependency=(null)\n Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=130:0\n RunTime=00:16:27 TimeLimit=14-00:00:00 TimeMin=N/A\n SubmitTime=2020-03-23T11:34:26 EligibleTime=2020-03-23T11:34:26\n AccrueTime=Unknown\n StartTime=2020-03-23T11:34:26 EndTime=2020-03-23T11:50:53 Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-03-23T11:34:26\n Partition=gpu AllocNode:Sid=hpc-login-1:1918\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=med0301\n BatchHost=med0301\n NumNodes=1 NumCPUs=2 NumTasks=0 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=2,node=1,billing=2\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n Command=bash\n WorkDir=XXX\n Power=\n TresPerNode=gpu:tesla:4\n MailUser=(null) MailType=NONE\n
host:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \n 1177 medium bash jweiner_ R 4-21:52:24 1 med0127 \n 1192 medium bash jweiner_ R 4-07:08:40 1 med0127 \n 1209 highmem bash mkuhrin_ R 2-01:07:17 1 med0402 \n 1210 gpu bash hilberta R 1-10:30:34 1 med0304 \n 1213 long bash schubacm R 1-09:42:27 1 med0127 \n 2401 gpu bash ramkem_c R 1-05:14:53 1 med0303 \n 2431 medium ngs_mapp holtgrem R 1-05:01:41 1 med0127 \n 2437 critical snakejob holtgrem R 1-05:01:34 1 med0135 \n 2733 debug bash schubacm R 7:36:42 1 med0127 \n 3029 critical ngs_mapp holtgrem R 5:59:07 1 med0127 \n 3030 critical snakejob holtgrem R 5:56:23 1 med0134 \n 3031 critical snakejob holtgrem R 5:56:23 1 med0137 \n 3032 critical snakejob holtgrem R 5:56:23 1 med0137 \n 3033 critical snakejob holtgrem R 5:56:23 1 med0138 \n 3034 critical snakejob holtgrem R 5:56:23 1 med0138 \n 3035 critical snakejob holtgrem R 5:56:20 1 med0139 \n 3036 critical snakejob holtgrem R 5:56:20 1 med0139 \n 3037 critical snakejob holtgrem R 5:56:20 1 med0140 \n 3038 critical snakejob holtgrem R 5:56:20 1 med0140 \n 3039 critical snakejob holtgrem R 5:56:20 1 med0141 \n 3040 critical snakejob holtgrem R 5:56:20 1 med0141 \n 3041 critical snakejob holtgrem R 5:56:20 1 med0142 \n 3042 critical snakejob holtgrem R 5:56:20 1 med0142 \n 3043 critical snakejob holtgrem R 5:56:20 1 med0143 \n 3044 critical snakejob holtgrem R 5:56:20 1 med0143 \n 3063 long bash schubacm R 4:12:37 1 med0127 \n 3066 long bash schubacm R 4:11:47 1 med0127 \n 3113 medium ngs_mapp holtgrem R 1:52:33 1 med0708 \n 3118 medium snakejob holtgrem R 1:50:38 1 med0133 \n 3119 medium snakejob holtgrem R 1:50:38 1 med0703 \n 3126 medium snakejob holtgrem R 1:50:38 1 med0706 \n 3127 medium snakejob holtgrem R 1:50:38 1 med0144 \n 3128 medium snakejob holtgrem R 1:50:38 1 med0144 \n 3133 medium snakejob holtgrem R 1:50:35 1 med0147 \n 3134 medium snakejob holtgrem R 1:50:35 1 med0147 \n 3135 medium snakejob holtgrem R 1:50:35 1 med0148 \n 3136 medium snakejob holtgrem R 1:50:35 1 med0148 \n 3138 medium snakejob holtgrem R 1:50:35 1 med0104 \n
host:~$ squeue -o \"%.10i %9P %20j %10u %.2t %.10M %.6D %10R %b\"\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(R TRES_PER_NODE\n 1177 medium bash jweiner_m R 4-21:52:22 1 med0127 N/A\n 1192 medium bash jweiner_m R 4-07:08:38 1 med0127 N/A\n 1209 highmem bash mkuhrin_m R 2-01:07:15 1 med0402 N/A\n 1210 gpu bash hilberta_c R 1-10:30:32 1 med0304 gpu:tesla:4\n 1213 long bash schubacm_c R 1-09:42:25 1 med0127 N/A\n 2401 gpu bash ramkem_c R 1-05:14:51 1 med0303 gpu:tesla:1\n 2431 medium ngs_mapping holtgrem_c R 1-05:01:39 1 med0127 N/A\n 2437 critical snakejob.ngs_mapping holtgrem_c R 1-05:01:32 1 med0135 N/A\n 2733 debug bash schubacm_c R 7:36:40 1 med0127 N/A\n 3029 critical ngs_mapping holtgrem_c R 5:59:05 1 med0127 N/A\n 3030 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0134 N/A\n 3031 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0137 N/A\n 3032 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0137 N/A\n 3033 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0138 N/A\n 3034 critical snakejob.ngs_mapping holtgrem_c R 5:56:21 1 med0138 N/A\n 3035 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0139 N/A\n 3036 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0139 N/A\n 3037 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0140 N/A\n 3038 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0140 N/A\n 3039 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0141 N/A\n 3040 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0141 N/A\n 3041 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0142 N/A\n 3042 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0142 N/A\n 3043 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0143 N/A\n 3044 critical snakejob.ngs_mapping holtgrem_c R 5:56:18 1 med0143 N/A\n 3063 long bash schubacm_c R 4:12:35 1 med0127 N/A\n 3066 long bash schubacm_c R 4:11:45 1 med0127 N/A\n 3113 medium ngs_mapping holtgrem_c R 1:52:31 1 med0708 N/A\n 3118 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0133 N/A\n 3119 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0703 N/A\n 3126 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0706 N/A\n 3127 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0144 N/A\n 3128 medium snakejob.ngs_mapping holtgrem_c R 1:50:36 1 med0144 N/A\n 3133 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0147 N/A\n 3134 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0147 N/A\n 3135 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0148 N/A\n 3136 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0148 N/A\n 3138 medium snakejob.ngs_mapping holtgrem_c R 1:50:33 1 med0104 N/A\n
host:~$ sinfo\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST \ndebug* up 8:00:00 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \ndebug* up 8:00:00 8 mix med[0104,0127,0133-0135,0703,0706,0708] \ndebug* up 8:00:00 10 alloc med[0137-0144,0147-0148] \ndebug* up 8:00:00 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \nmedium up 7-00:00:00 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \nmedium up 7-00:00:00 8 mix med[0104,0127,0133-0135,0703,0706,0708] \nmedium up 7-00:00:00 10 alloc med[0137-0144,0147-0148] \nmedium up 7-00:00:00 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \nlong up 28-00:00:0 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \nlong up 28-00:00:0 8 mix med[0104,0127,0133-0135,0703,0706,0708] \nlong up 28-00:00:0 10 alloc med[0137-0144,0147-0148] \nlong up 28-00:00:0 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \ncritical up 7-00:00:00 11 drain med[0707,0709-0710,0740-0742,0744-0745,0749,0752,0755] \ncritical up 7-00:00:00 8 mix med[0104,0127,0133-0135,0703,0706,0708] \ncritical up 7-00:00:00 10 alloc med[0137-0144,0147-0148] \ncritical up 7-00:00:00 103 idle med[0105-0124,0136,0145-0146,0151-0164,0201-0264,0704-0705] \nhighmem up 14-00:00:0 1 mix med0402 \nhighmem up 14-00:00:0 3 idle med[0401,0403-0404] \ngpu up 14-00:00:0 2 mix med[0303-0304] \ngpu up 14-00:00:0 2 idle med[0301-0302] \n
"},{"location":"slurm/commands-sacct/","title":"Slurm Command: sacct
","text":"Perform queries to the Slurm accounting information.
Representative Example
hpc-login-1:~$ sacct -j 1607103\n JobID JobName Partition Account AllocCPUS State ExitCode\n------------ ---------- ---------- ---------- ---------- ---------- --------\n1607103 wgs_sv_an+ medium 1 PENDING 0:0\n
The sacct
command displays information from the Slurm accounting service. The Slurm scheduler only knows about active or completing (very recently active) jobs. The accouting system also knows about currently running jobs so it is the more robust way to query information about jobs. However, not all information is available to the accouting system, so scontrol show job
and squeue
provide more information about current and pending jbos.
Slurm Documentation: sacct
Please also see the official Slurm documentation on sacct.
"},{"location":"slurm/commands-sacct/#important-arguments","title":"Important Arguments","text":"Also see all important arguments of the sbatch
command.
--jobs
-- The job(s) to query for. --format
-- Define attributes to retrieve. --long
-- Get a lot of information from the database, consider to pipe into | less -S
.
"},{"location":"slurm/commands-sacct/#notes","title":"Notes","text":" - If you need to get information about a job regardless of it being in the past, present, or future execution, use
sacct
over scontrol
and squeue
.
"},{"location":"slurm/commands-sattach/","title":"Slurm Command: sattach
","text":"The sattach
command allows you to connect the standard input, output, and error streams to your current terminals ession.
Representative Example
hpc-login-1:~$ sattach 12345.0\n[...output of your job...]\nmed0211:~$ [Ctrl-C]\nhpc-login-1:~$\n
Press Ctrl-C
to detach from the current session. Please note that you will have to give the job ID as well as step step ID. For most cases, simply append \".0\"
to your job ID.
Slurm Documentation: sattach
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-sattach/#important-arguments","title":"Important Arguments","text":" --pty
-- Execute task zero in pseudo terminal. --verbose
-- Increase verbosity of sattach
.
"},{"location":"slurm/commands-sbatch/","title":"Slurm Command: sbatch
","text":"The sbatch
command allows you to put a job into the scheduler's queue to be executed at a later time.
Representative Example
# Execute job.sh in partition medium with 4 threads and 4GB of RAM total for a\n# running time of up to one day.\nhpc-login-1:~$ sbatch --partition=medium --mem=4G --ntasks 4 --time=1-00 job.sh\nSubmitted batch job JOB_ID\n
The command will create a batch job and add it to the queue to be executed at a later point in time.
Slurm Documentation: sbatch
Please also see the official Slurm documentation on sbatch.
"},{"location":"slurm/commands-sbatch/#important-arguments","title":"Important Arguments","text":" --array
-- Submit jobs as array jobs. Also see the section [#array-jobs] below. --nodes
-- The number of nodes to allocate. This is only given here as an important argument as the maximum number of nodes allocatable to any partition but mpi
is set to one (1). This is done as there are few users on the BIH HPC that actually use multi-node parallelism. Rather, most users will use multi-core parallelism and might forget to limit the number of nodes, which causes inefficient allocation of resources. --cpus-per-task
-- This corresponds to the number of CPU cores allocated to each task. --mem
-- The memory to allocate for the job. As you can define minimal and maximal number of tasks/CPUs/cores, you could also specify --mem-per-cpu
and get more flexible scheduling of your job. --gres
-- Generic resource allocation. On the BIH HPC, this is only used for allocating GPUS, e.g., with --gres=gpu:tesla:2
, a user could allocate two NVIDIA Tesla GPUs on the same host (use a40
instead of tesla
for the A40 GPUs). --licenses
-- On the BIH HPC, this is used for the allocation of MATLAB 2016b licenses only. --partition
-- The partition to run in. Also see the Job Scheduler section. --time
-- Specify the running time, see man sbatch
or the official Slurm documentation on srun for supported formats. Please note that the DRMAA API only accepts the hours:minutes
format. --dependency
-- Specify dependencies on other jobs, e.g., using --dependency afterok:JOBID
to only execute if the job with ID JOBID
finished successfully or --dependency after:JOBID
to wait for a job to finish regardless of its termination status. --constraint
-- Require one or more features from your node. On the BIH HPC, the processor generation is defined as a feature on the nodes, e.g., haswell
, or special networking such as infiniband
. You can have a look at /etc/slurm/slurm.conf
on all configured features. --output
-- The path to the output log file (by default joining stdout and stderr, see the man page on --error
on how to redirect stderr separately). A various number of placeholders is available, see the \"filename pattern\" section of man sbatch
or the official Slurm documentation on srun. --mail-type=<type>
-- Send out notifications by email when an event occurs. Use FAIL
to get emails when your job fails. Also see the documentation of sbatch in the Slurm manual. --mail-user=<email>
-- The email address to send to. Must end in @charite.de
, @mdc-berlin.de
, or @bih-charite.de
.
Ensure your --output
directory exists!
In the case that the path to the log/output file does not exist, the job will just fail. scontrol show job ID
will report JobState=FAILED Reason=NonZeroExitCode
. Regrettably, no further information is displayed to you as the user. Always check that the path to the directories in StdErr
and StdOut
exists when checking scontrol show job ID
.
"},{"location":"slurm/commands-sbatch/#other-arguments","title":"Other Arguments","text":" --job-name
"},{"location":"slurm/commands-sbatch/#job-scripts","title":"Job Scripts","text":"Also see the section Slurm Job Scripts on how to embed the sbatch
parameters in #SBATCH
lines.
"},{"location":"slurm/commands-sbatch/#array-jobs","title":"Array Jobs","text":"If you have many (say, more than 10) similar jobs (e.g., when performing a grid search), you can also use array jobs. However, you should also consider whether it would make sense to increase the time of your jobs, e.g, to be at least ~10min.
You can submit array jobs by specifying -a EXPR
or --array EXPR
where EXPR
is a range or a list (of course, you can also add this as an #SBATCH
header in your job script). For example:
hpc-login-1 ~# sbatch -a 1-3 grid_search.sh\nhpc-login-1 ~# sbatch -a 1,2,5-10 grid_search.sh\n
This will submit grid_search.sh
with certain variables set:
SLURM_ARRAY_JOB_ID
-- the ID of the first job SLURM_ARRAY_TASK_ID
-- the index of the job in the array SLURM_ARRAY_TASK_COUNT
-- number of submitted jobs in array SLURM_ARRAY_TASK_MAX
-- highest job array index value SLURM_ARRAY_TASK_MIN
-- lowest job array index value
Using array jobs has several advantages:
- It greatly reduces the load on the Slurm scheduler.
- You do not need to submit in a loop, but rather
- You can use a single command line.
Also see Slurm documentation on job arrays.
For example, if you submit sbatch --array=1-3 grid_search.sh
and slurm responsds with Submitted batch job 36
then the script will be run three times with the following prameters set:
SLURM_JOB_ID=36\nSLURM_ARRAY_JOB_ID=36\nSLURM_ARRAY_TASK_ID=1\nSLURM_ARRAY_TASK_COUNT=3\nSLURM_ARRAY_TASK_MAX=3\nSLURM_ARRAY_TASK_MIN=1\n\nSLURM_JOB_ID=37\nSLURM_ARRAY_JOB_ID=36\nSLURM_ARRAY_TASK_ID=2\nSLURM_ARRAY_TASK_COUNT=3\nSLURM_ARRAY_TASK_MAX=3\nSLURM_ARRAY_TASK_MIN=1\n\nSLURM_JOB_ID=38\nSLURM_ARRAY_JOB_ID=36\nSLURM_ARRAY_TASK_ID=3\nSLURM_ARRAY_TASK_COUNT=3\nSLURM_ARRAY_TASK_MAX=3\nSLURM_ARRAY_TASK_MIN=1\n
"},{"location":"slurm/commands-sbatch/#notes","title":"Notes","text":" - This is the primary entry point for creating batch jobs to be executed at a later point in time.
- As with all jobs allocated by Slurm, interactive sessions executed with
sbatch
are governed by resource allocations, in particular: sbatch
jobs have a maximal running time set, sbatch
jobs have a maximal memory and number of cores set, and - also see
scontrol show job JOBID
.
"},{"location":"slurm/commands-scancel/","title":"Slurm Command: scancel
","text":"Terminate a running Slurm job.
Representative Example
hpc-login-1:~$ scancel 1703828\nhpc-login-1:~$\n
This command allows to terminate one or more running jobs (of course, non-superusers can only terminate their own jobs).
Slurm Documentation: scancel
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-scontrol/","title":"Slurm Command: scontrol
","text":"The scontrol
allows to query detailed information from the scheduler and perform manipulation. Object manipulation is less important for normal users.
Representative Example
hpc-login-1:~$ scontrol show job 1607103\nJobId=1607103 JobName=wgs_sv_annotation\n UserId=holtgrem_c(100131) GroupId=hpc-ag-cubi(5272) MCS_label=N/A\n Priority=748 Nice=0 Account=(null) QOS=normal\n [...]\nhpc-login-1:~$ scontrol show node med02[01-32]\nNodeName=med0201 Arch=x86_64 CoresPerSocket=8\n CPUAlloc=0 CPUTot=32 CPULoad=0.01\n AvailableFeatures=ivybridge,infiniband\n ActiveFeatures=ivybridge,infiniband\n [...]\nhpc-login-1:~$ scontrol show partition medium\nPartitionName=medium\n AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL\n AllocNodes=ALL Default=NO QoS=medium\n DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO\n [...]\n
This command allows to query all information for an object from Slurm, e.g., jobs, nodes, or partitions. The command also accepts ranges of jobs and hosts. It is most useful to get the information of one or a few objects from the scheduler.
Slurm Documentation: scontrol
Please also see the official Slurm documentation on scontrol.
"},{"location":"slurm/commands-scontrol/#important-sub-commands","title":"Important Sub commands","text":" scontrol show job
-- Show details on jobs. scontrol show partition
-- Show details on partitions. scontrol show node
-- Show details on nodes. scontrol help
-- Show help. scontrol
-- Start an interactive scontrol shell / REPL (read-eval-print loop).
"},{"location":"slurm/commands-scontrol/#notes","title":"Notes","text":" scontrol
can only work on jobs that are pending (in the queue), running, or in \"completing' state. - For jobs that have finished, you have to use Slurm's accounting features, e.g., with the
sacct
command.
"},{"location":"slurm/commands-sinfo/","title":"Slurm Command: sinfo
","text":"The sinfo
command allows you to query the current cluster status.
Representative Example
hpc-login-1:~$ sinfo\nPARTITION AVAIL TIMELIMIT NODES STATE NODELIST\n[...]\nmedium up 7-00:00:00 10 drain* med[0101-0103,0125-0126,0128-0132]\nmedium up 7-00:00:00 1 down* med0243\nmedium up 7-00:00:00 31 mix med[0104,0106-0122,0124,0133,0232-0233,0237-0238,0241-0242,0244,0263-0264,0503,0506]\nmedium up 7-00:00:00 5 alloc med[0105,0123,0127,0239-0240]\nmedium up 7-00:00:00 193 idle med[0134-0164,0201-0231,0234-0236,0245-0262,0501-0502,0504-0505,0507-0516,0601-0632,0701-0764]\n[...]\nhpc-login-1:$ sinfo --summarize\nPARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST\ndebug* up 8:00:00 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\nmedium up 7-00:00:00 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\nlong up 28-00:00:0 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\ncritical up 7-00:00:00 25/141/10/176 med[0101-0164,0501-0516,0601-0632,0701-0764]\nhighmem up 14-00:00:0 1/2/1/4 med[0401-0404]\ngpu up 14-00:00:0 3/0/1/4 med[0301-0304]\nmpi up 14-00:00:0 38/191/11/240 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n
This command will summaries the state of nodes by different criteria (e.g., by partition or globally).
Slurm Documentation: sinfo
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-sinfo/#important-arguments","title":"Important Arguments","text":"Also see all important arguments of the sinfo
command.
--summarize
-- Summarize the node state by partition. --nodes
-- Select the nodes to show the status for, e.g., display the status of all GPU nodes with sinfo -n med030[1-4]
.
"},{"location":"slurm/commands-sinfo/#node-states","title":"Node States","text":"The most important node states are:
down
-- node is marked as offline draining
-- node will not accept any more jobs but has jobs running on it drained
-- node will not accept any more jobs and has no jobs running on it, but is not offline yet idle
-- node is ready to run jobs allocated
-- node is fully allocated (e.g., CPU, RAM, or GPU limit has been reached) mixed
-- node is running jobs but there is space for more
"},{"location":"slurm/commands-sinfo/#notes","title":"Notes","text":" - Also see the Slurm Format Strings section.
"},{"location":"slurm/commands-squeue/","title":"Slurm Command: squeue
","text":"The squeue
command allows you to view currently running and pending jobs.
Representative Example
hpc-login-1:~$ squeue\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 1583165 highmem 20200702 usr PD 0:00 1 (DependencyNeverSatisfied)\n 1605901 critical variant_ holtgrem PD 0:00 1 (DependencyNeverSatisfied)\n 1605902 critical variant_ holtgrem PD 0:00 1 (Dependency)\n 1605905 critical variant_ holtgrem PD 0:00 1 (DependencyNeverSatisfied)\n 1605916 critical wgs_sv_c holtgrem PD 0:00 1 (Dependency)\n 1607103 medium wgs_sv_a holtgrem PD 0:00 1 (DependencyNeverSatisfied)\n[...]\n
Slurm Documentation: squeue
Please also see the official Slurm documentation on squeue.
"},{"location":"slurm/commands-squeue/#important-arguments","title":"Important Arguments","text":" --nodelist
-- Only display jobs running on certain nodes (e.g., GPU nodes). --format
-- Define the format to print, see man squeue
for details. See below for a format string that includes the jobid, partition, job name, user name, job status, running time, number of nodes, number of CPU cores, and allocated GPUs.
"},{"location":"slurm/commands-squeue/#notes","title":"Notes","text":"The following aliases in ~/.bashrc
will allow you to print a long and informative squeue
output with sq
, pipe it into less with sql
, get only your jobs (adjust the alias
to your account) using sqme
and pipe that into less with sqmel
.
alias sq='squeue -o \"%.10i %9P %60j %10u %.2t %.10M %.6D %.4C %10R %b\" \"$@\"'\nalias sql='sq \"$@\" | less -S'\nalias sqme='sq -u YOURUSER_c_or_m \"$@\"'\nalias sqmel='sqme \"$@\" | less -S'\n
"},{"location":"slurm/commands-srun/","title":"Slurm Command: srun
","text":"The srun
command allows you to run a command now.
Representative Example
hpc-login-1:~$ srun --pty bash -i\nmed0201:~$\n
The command will perform a resource allocation with the scheduler (and wait until it has allocated the requested resources) first. Most importantly, you can specify the --pty
argument which will connect the current terminal's standard output, error, and input to your current one. This allows you to run interactive jobs such as shells with srun --pty bash -i
.
Slurm Documentation: srun
Please also see the official Slurm documentation on srun.
"},{"location":"slurm/commands-srun/#important-arguments","title":"Important Arguments","text":"Also see all important arguments of the sbatch
command.
--pty
-- Connect current terminal to the job's stdoud/stderr/stdin. --x11
-- Setup X11 forwarding. --immediate
-- Immediately terminate if the resources to run the job are not available, do not wait. --test-only
-- Don't run anything, but only estimate when the job would be scheduled.
"},{"location":"slurm/commands-srun/#notes","title":"Notes","text":" - This is the primary entry point for creating interactive shell sessions on the cluster.
- As with all jobs allocated by Slurm, interactive sessions executed with
srun
are governed by resource allocations, in particular: srun
jobs have a maximal running time set, srun
jobs have a maximal memory and number of cores set, and - also see
scontrol show job JOBID
.
"},{"location":"slurm/format-strings/","title":"Slurm Command Format Strings","text":"In the sections Slurm Quickstart and Slurm Cheat Sheet, we have seen that sinfo
and squeue
allow for the compact display of partition, node, and job information. In contrast, scontrol show job <id>
and scontrol show partition <id>
and scontrol show node <id>
show comprehensive information that quickly gets hard to comprehend for multiple entries.
Now you might ask: is there anything in between? And: yes, there is.
You can tune the output of sinfo
and squeue
using parameters, in particular by providing format strings. All of this is described in the man pages of the commands that you can display with man sinfo
and man squeue
on the cluster.
"},{"location":"slurm/format-strings/#tuning-sinfo-output","title":"Tuning sinfo
Output","text":"Notable arguments of sinfo
are:
-N, --Node
-- uncompress the usual lines and display one line per node and partition. -s, --summarize
-- compress the node state, more compact display. -R, --list-reasons
-- for nodes that are not up, display reason string provided by admin. -o <fmt>, --format=<fmt>
-- use format string for display.
The most interesting argument is -o/--format
. The man page lists the following values that are used when using other arguments. In other words, many of the display modifications could also be applied with -o/--format
.
default \"%#P %.5a %.10l %.6D %.6t %N\"\n--summarize \"%#P %.5a %.10l %.16F %N\"\n--long \"%#P %.5a %.10l %.10s %.4r %.8h %.10g %.6D %.11T %N\"\n--Node \"%#N %.6D %#P %6t\"\n--long --Node \"%#N %.6D %#P %.11T %.4c %.8z %.6m %.8d %.6w %.8f %20E\"\n--list-reasons \"%20E %9u %19H %N\"\n--long --list-reasons\n \"%20E %12U %19H %6t %N\"\n
The best way to learn more about this is to play around with sinfo -o
, starting out with one of the format strings above. Details about the format strings are described in man sinfo
. Some remarks here:
%<num><char>
displays the value represented by <char>
padded with spaces to the right such that a width of <num>
is reached, %.<num><char>
displays the value represented by <char>
padded with spaces to the left such that a width of <num>
is reached, and %#<char>
displays the value represented by <char>
padded with spaces to the max length of the value represented by <char>
(this is a \"virtual\" value, used internally only, you cannot use this and you will have to place an integer here).
For example, to create a grouped display with reasons for being down use:
hpc-login-1:~$ sinfo -o \"%10P %.5a %.10l %.16F %40N %E\"\nPARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST REASON\ndebug* up 8:00:00 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\ndebug* up 8:00:00 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\nmedium up 7-00:00:00 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\nmedium up 7-00:00:00 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\nlong up 28-00:00:0 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\nlong up 28-00:00:0 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\ncritical up 7-00:00:00 0/0/16/16 med[0703-0710,0740-0742,0744-0745,0749,0 bogus node\ncritical up 7-00:00:00 18/98/0/116 med[0104-0124,0127,0133-0148,0151-0164,0 none\nhighmem up 14-00:00:0 0/4/0/4 med[0401-0404] none\ngpu up 14-00:00:0 3/1/0/4 med[0301-0304] none\n
"},{"location":"slurm/format-strings/#tuning-squeue-output","title":"Tuning squeue
Output","text":"The standard squeue output might yield the following
hpc-login-1:~$ squeue | head\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 3149 medium variant_ holtgrem PD 0:00 1 (Dependency)\n 1177 medium bash jweiner_ R 6-03:32:41 1 med0127\n 1192 medium bash jweiner_ R 5-12:48:57 1 med0127\n 1210 gpu bash hilberta R 2-16:10:51 1 med0304\n 1213 long bash schubacm R 2-15:22:44 1 med0127\n 2401 gpu bash ramkem_c R 2-10:55:10 1 med0303\n 3063 long bash schubacm R 1-09:52:54 1 med0127\n 3066 long bash schubacm R 1-09:52:04 1 med0127\n 3147 medium ngs_mapp holtgrem R 1-03:13:42 1 med0148\n
Looking at man squeue
, we learn that the default format strings are:
default \"%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R\"\n-l, --long \"%.18i %.9P %.8j %.8u %.8T %.10M %.9l %.6D %R\"\n-s, --steps \"%.15i %.8j %.9P %.8u %.9M %N\"\n
This looks a bit wasteful. Let's cut down on the padding of the job ID and expand on the job name and remove some right paddings.
hpc-login-1:~$ squeue -o \"%.6i %9P %30j %.10u %.2t %.10M %.6D %R %b\" | head\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 3149 medium variant_calling holtgrem_c PD 0:00 1 (Dependency)\n 1177 medium bash jweiner_m R 6-03:35:55 1 med0127\n 1192 medium bash jweiner_m R 5-12:52:11 1 med0127\n 1210 gpu bash hilberta_c R 2-16:14:05 1 med0304\n 1213 long bash schubacm_c R 2-15:25:58 1 med0127\n 2401 gpu bash ramkem_c R 2-10:58:24 1 med0303\n 3063 long bash schubacm_c R 1-09:56:08 1 med0127\n 3066 long bash schubacm_c R 1-09:55:18 1 med0127\n 3147 medium ngs_mapping holtgrem_c R 1-03:16:56 1 med0148\n
"},{"location":"slurm/format-strings/#displaying-resources","title":"Displaying Resources","text":"Now display how many of our internal projects still exist.
hpc-login-1:~$ squeue -o \"%.6i %9P %30j %.10u %.2t %.10M %.6D %10R %s\" | head\n
The next steps are (TODO):
- setup of certificate for containers
- opening firewall apropriately
- integrate with openmpi documentation
"},{"location":"slurm/job-scripts/","title":"Slurm Job Scripts","text":"This page describes how to create SLURM job scripts.
SLURM job scripts look as follows. On the top you have lines starting with #SBATCH
. These appear as comments to bash scripts. These lines are interpreted by sbatch
in the same way as command line arguments. That is, when later submitting the script with sbatch my-job.sh
you can either have the parameter to the sbatch
call or in the file.
Multi-Node Allocation in Slurm
Classically, jobs on HPC systems are written in a way that they can run on multiple nodes at once, using the network to communicate. Slurm comes from this world and when allocating more than one CPU/core, it might allocate them on different nodes. Please use --nodes=1
to force Slurm to allocate them on a single node.
Creating the Script
host:example$ cat >my-job.sh <<\"EOF\"\n#!/bin/bash\n#\n#SBATCH --job-name=this-is-my-job\n#SBATCH --output=output.txt\n#\n#SBATCH --ntasks=1\n#SBATCH --nodes=1\n#SBATCH --time=10:00\n#SBATCH --mem-per-cpu=100M\n\ndate\n\nhostname\n>&2 echo \"Hello World\"\n\nsleep 1m\n\ndate\nEOF\n
Also see the SLURM Rosetta Stone for more options.
Submit, Look at Queue & Result
host:example$ sbatch script.sh \nSubmitted batch job 315\nhost:example$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \n 315 debug this-is- holtgrem R 0:40 1 med0127 \nhost:example$ sleep 2m\nhost:example$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) \nhost:example$ cat output.txt \nWed Mar 25 13:30:56 CET 2020\nmed0127\nHello World\nWed Mar 25 13:31:56 CET 2020\n
"},{"location":"slurm/memory-allocation/","title":"Memory Allocation","text":"Memory allocation is one of the topics that users find confusing most often. This section first gives some technical background and then explains how to implement this properly with Slurm on the BIH HPC.
"},{"location":"slurm/memory-allocation/#technical-background","title":"Technical Background","text":"Technical Background Summary
- virtual memory is what your programs tells the operating system it wants to use
- resident set size is the amount of memory that your program actually uses
- most memory will be allocated on the heap
Main memory used to be one of the most important topics when programming, as computers had so little. There is the infamous quote \"640KB ought ot be enough for anybody\" wrongly attribute to Bill Gates which refers to the fact that early computers could only address that amount of memory. In MS DOS, one had to use special libraries for a program to use more memory. Today, computers are very fast and memory is plentiful and people can (rightfully) forget about memory allocation ... as long as they don't use \"much\" memory by today's standards.
The Linux operating system differentiates between the following types of memory:
- virtual memory size (vsize), the amount of memory that a process (virtually) allocates,
- resident set size (rss), the amount of memory actually used and currently in the computer's main memory,
- the swap memory usage, the amount of active memory that is not present in main memory but on the computer's disk,
- sometimes, the shared memory is also interesting, and
- it might be interesting to know about heap and stack size.
Note that above we are talking about processes, not Slurm jobs yet. Let us look at this in detail:
Each program uses some kind of memory management. For example, in C the malloc
and free
functions manually allocate and free memory while in Java, R, and Python, memory allocation and release is done automatically using a concept called garbage collection. Each program starts with a certain virtual memory size, that is the amount of memory it can address, say 128MB. When the program allocates memory, the memory allocation mechanism will check whether it has sufficient space left. If not, it will request an increase in virtual memory from the operating system, e.g., to 256MB. If this fails then the program can try to handle the error, e.g., terminate gracefully, but many programs will just panic and stop. Otherwise, the program will get access to more memory and happily continue to run.
However, programs can allocate humonguous amounts of virtual memory and only use a little. Memory is organized in \"pages\" (classically these are 4096 bytes each, but can be larger using so-called \"huge page\" features). The operating system tracks which memory pages are actually used by a process. The total size of these pages is called the resident set size: the amount of memory that is actually currently used by a program. Programs can also mark pages as unused again, thus freeing resident memory and can also decrease their virtual memory.
In some cases it is still interesting to use swap memory. Here, the contents of resident memory are copied to disk by the operating system. This process is completely transparent to the program; the data remains available at the original positions in the virtual memory! However, accessing it will take some time as it must be read back into main memory from the disk. In this way, it was possible for a computer with 4MB of RAM and a disk of 100MB to run programs that used 8MB. Of course, this was only really useable for programs that ran in the background. One could really feel the latency if a graphical program was using swapped memory (you could actually hear the hard drive working). Today, swap storage is normally only relevant when put your computer into hibernation. Given the large main memory on the cluster nodes, their small local hard drives (just used for loading the operating system), and the extreme slowness involved in using swapped memory, the BIH HPC nodes have no swap memory allocated.
Most HPC users will also use shared memory, at least implicitly. Whenever a program uses fork
to create a subprocess (BTW, this is not a thread), the program can chose to \"copy\" its current address space. The second process then has access to the same memory than the parent process in a copy-on-write fashion. This allows, for example, pre-loading a database, and also allows the use of already loaded library code by the child process as well. If the child process writes to the copy-on-write memory of the parent, the relevant memory page will be copied and attributed to the child.
Two or more processes can share the same memory explicitly. This is usually used for inter-process communication but the Bowtie program uses it for sharing the memory of indices. For example, the Python multiprocessing
module will use this, including if you have two MPI processes running on the same host.
Memory is also separated into segments, the most interesting ones are heap and stack memory. For compiled languages, memory can be allocated on either. For C, an int
variable will be allocated on the stack. Every time you call a function, a stack frame is created in memory to hold the local variables and other information for the duration of the function execution. The stack thus grows through function calls made by your program and shrinks when the functions return. The stack size for a process is limited (by ulimit -s
) and a program that goes too deep (e.g., via infinite recursion) will be terminated by the operating system if it exceeds this limit. Again in C, int * ptr = (int *)malloc(10 * sizeof(int));
will allocate memory for one variable (an integer pointer) on the stack and memory for 10 integers on the heap. When the function returns, the ptr
variable on the stack will be freed but to free the array of integers, you'd have to call free(ptr)
. If the memory is not freed then this constitutes a memory leak, but that is another topic.
Other relevant segments are code, where the compiled code lives, and data, where static data such as strings displayed to the user are stored. As a side node, in interpreted languages such as R or Python, the code and data segments will refer to the code and data of Python while the actual program text will be on the heap.
"},{"location":"slurm/memory-allocation/#interlude-memory-in-java","title":"Interlude: Memory in Java","text":"Memory in Java Summary
- set
-XX:MaxHeapSize=<size>
(e.g., <size>=2G
) for your program and only tune the other parameters if needed - also consider the amount of memory that Java needs for heap management in your Slurm allocations
Java's memory management provides for some interesting artifacts. When running simple Java programs, you will never run into this but if you need to use gigabytes of memory in Java then you will have to learn a bit about Java memory management. This is the case when running GATK programs, for example.
As different operating systems handle memory management differently, the Java virtual machine does its own memory management to provide a consistent interface. The following three settings are important in governing memory usage of Java:
-Xmx<size>
/-XX:MaxHeapSize=<size>
-- the maximal Java heap size -Xms<size>
/-XX:InitialHeapSize=<size>
-- the initial Java heap size -Xss<size>
/-XX:ThreadStackSize=<size>
-- maximal stack size available to a Java thread (e.g., the main thread)
Above, <size>
is a memory specification, either in bytes or with a suffix, e.g., 80M
, or 1G
.
On startup, Java does roughly the following:
- Setup the core virtual machine, load libraries, etc. and allocate (vsize) consume (rss) memory on the OS (operating system) heap.
- Setup the Java heap allocate (vsize) and consume (rss) memory on the OS heap. In particular, Java will need to setup data structures for the memory management of each individual object.
- Run the program where Java data and Java threads will lead to memory allocation (vsize) and consumption (rss) of memory.
Memory freed by the Java garbage collector can be re-used by other Java objects (rss remains the same) or be freed in the operating system (rss decreases). The Java VM program itself will also consume memory on the OS stack but that is negligible.
Overall, the Java VM needs to store in main memory:
- The Java VM, program code, Java thread stacks etc. (very little memory).
- The Java heap (potentially a lot of memory).
- The Java heap management data structures (so-called \"off-heap\", but of course on the OS heap) (potentially also considerable memory).
In the BIH HPC context, the following is recommended to:
- Set the Java heap to an appropriate size (use trial-and-error to determine the correct size or look through internet forums).
- Only tune initial heap size in the case of performance issues (unlikely in batch processing).
- Only bump the stack size when problems occur.
- Consider \"off-heap\" memory when performing Slurm allocations.
"},{"location":"slurm/memory-allocation/#memory-allocation-in-slurm","title":"Memory Allocation in Slurm","text":"Memory Allocation in Slurm Summary
- most user will simply use
--mem=<size>
(e.g., <size>=3G
) to allocate memory per node - both interactive
srun
and batch sbatch
jobs are governed by Slurm memory allocation - the sum of all memory of all processes started by your job may not exceed the job reservation.
- please don't over-allocate memory, see \"Memory Accounting in Slurm\" below for details
Our Slurm configuration uses Linux cgroups to enforce a maximum amount of resident memory. You simply specify it using --mem=<size>
in your srun
and sbatch
command.
In the (rare) case that you provide more flexible number of threads (Slurm tasks) or GPUs, you could also look into --mem-per-cpu
and --mem-per-gpu
. The official Slurm sbatch manual is quite helpful, as is man sbatch
on the cluster command line.
Slurm (or rather Linux via cgroups) will track all memory started by all jobs by your process. If each process works independently (e.g., you put the output through a pipe prog1 | prog2
) then the amount of memory consumed will at any given time be the sum of the RSS of both processes at that time. If your program uses fork
, which uses memory in a copy-on-write fashion, the shared memory is of course only counted once. Note that Python's multiprocessing does not use copy on write: its data will be explicitly copied and consume additional memory. Refer to the Scipy/Numpy/Pandas etc. documentation on how to achieve parallelism without copying too much data.
The amount of virtual memory that your program can reserve is only \"virtually\" unlimited (pun not intended). However, in practice, the operating system will not like you allocating more than physically available. If your program attempts to allocate more memory than requested via Slurm, your program will be killed.
This is reported to you in the Slurm job output log as something like:
slurmstepd: error: Detected 1 oom-kill event(s) in step <JOB ID>.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.\n
You can inspect the amount of memory available on each node in total with sinfo --format \"%.10P %.10l %.6D %.6m %N\"
, as shown below.
$ sinfo --format \"%.10P %.10l %.6D %.6m %N\"\n PARTITION TIMELIMIT NODES MEMORY NODELIST\n debug* 8:00:00 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n medium 7-00:00:00 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n long 28-00:00:0 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n critical 7-00:00:00 176 128722 med[0101-0164,0501-0516,0601-0632,0701-0764]\n highmem 14-00:00:0 4 515762 med[0401-0404]\n gpu 14-00:00:0 4 385215 med[0301-0304]\n mpi 14-00:00:0 240 128722 med[0101-0164,0201-0264,0501-0516,0601-0632,0701-0764]\n
"},{"location":"slurm/memory-allocation/#memorycpu-accounting-in-slurm","title":"Memory/CPU Accounting in Slurm","text":"Memory Accounting in Slurm Summary
- you can use Slurm accounting to see memory and CPU usage of your program
- use
sacct -j JOBID --format=JobID,MaxRSS
to display the RSS usage of your program - use
sacct -j JOBID --format=Elapsed,AllocCPUs,TotalCPU
to display information about CPU usage - consider using the helpful script below to compute overallocated memory
While Slurm runs your job, it collects information about the job such as the running time, exit status, and memory usage. This information is available through the scheduling system via the squeue
and scontrol
commands, but only while the job is pending execution, executing, or currently completing. After job completion, the information is only available through the Slurm accounting system.
You can query information about jobs, e.g., using sacct
:
$ sacct -j 1607166\n JobID JobName Partition Account AllocCPUS State ExitCode\n------------ ---------- ---------- ---------- ---------- ---------- --------\n1607166 snakejob.+ critical 16 COMPLETED 0:0\n1607166.bat+ batch 16 COMPLETED 0:0\n1607166.ext+ extern 16 COMPLETED 0:0\n
This shows that the job with ID 1607166
 with a job name starting with snakejob.
has been run in the critical
partition, been allocated 16 cores and had an exit code of 0:0
. For technical reasons, there is a batch
and an extern
sub step. Actually, Slurm makes it possible to run various steps in one batch as documented in the Slurm documentation.
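As a rough sketch (the two scripts are placeholders), every srun call inside a batch script creates its own job step, which then shows up as a separate line in sacct:
#!/bin/bash
#SBATCH --job-name=steps-example
#SBATCH --cpus-per-task=4
#SBATCH --mem=4G
srun ./step-one.sh   # appears as step <jobid>.0 in sacct
srun ./step-two.sh   # appears as step <jobid>.1 in sacct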
The sacct
command has various command-line options that you can read about via man sacct
or in the Slurm documentation. We can use --brief
/-b
to show only a brief summary.
$ sacct -j 1607166 --brief\n JobID State ExitCode\n------------ ---------- --------\n1607166 COMPLETED 0:0\n1607166.bat+ COMPLETED 0:0\n1607166.ext+ COMPLETED 0:0\n
Similarly, you can use --long
to display extended information (see the manual for the displayed columns). Very long report lines can be piped into less -S
for easier display. You can fine-tune the information to display with a format string to --format
:
$ sacct -j 1607166 --format=JobID,ReqMem,MaxRSS,Elapsed,TotalCPU,AllocCPUS\n JobID ReqMem MaxRSS Elapsed TotalCPU AllocCPUS\n------------ --------- ---------- ---------- ---------- ----------\n1607166 60Gn 13:07:31 7-16:21:29 16\n1607166.bat+ 60Gn 4314560K 13:07:31 7-16:21:29 16\n1607166.ext+ 60Gn 0 13:07:31 00:00.001 16\n
From this command, we can read that we allocated 60GB of memory per node (suffix n
, here Gn
 for gigabytes per node) and the maximum RSS is reported as 4.3GB. You can use this information to fine-tune your memory allocations. As a side remark, a suffix c
 indicates memory per core (e.g., that could be 60Gc
).
Further, the program ran for 13 hours and 7 minutes with 16 allocated CPU cores and consumed a total of 7 days, 16 hours, and 21 minutes of CPU time. Thus, a total of 11,061 CPU minutes were spent in 787 minutes wall-clock time. This yields an overall empirical degree of parallelism of about 11061 / 787 = 14, and a parallel efficiency of 14 / 16 = 88%. A detailed discussion of parallel efficiency is beyond the scope of this section.
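To redo this arithmetic by hand (numbers taken from the sacct output above; this only illustrates the computation):
# TotalCPU 7-16:21:29 -> 7*24*60 + 16*60 + 21 = 11061 CPU minutes
# Elapsed  13:07:31   -> 13*60 + 7            =   787 wall-clock minutes
$ echo 'scale=2; 11061 / 787' | bc        # empirical parallelism, ~14.05
$ echo 'scale=4; 11061 / 787 / 16' | bc   # parallel efficiency, ~0.88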
However, you can use the awk
script below to compute the empirical parallelism (EmpPar
) and the parallel efficiency (ParEff
 ). The script also displays the difference between requested and used memory (DiffMEM
). The script can be found here.
$ sacct -j 1607166 --format=JobID,ReqMem,MaxRSS,Elapsed,TotalCPU,AllocCPUS \\\n | awk -f quick-sacct.awk\n JobID ReqMem MaxRSS Elapsed TotalCPU AllocCPUS EmpPar ParEff DiffMEM\n------------ ---------- ---------- ---------- ---------- ---------- --------- -------- --------\n1607166 60Gn 13:07:31 7-16:21:29 16 0.00 0.00 -\n1607166.bat+ 60Gn 4314560K 13:07:31 7-16:21:29 16 14.05 0.88 55.89\n1607166.ext+ 60Gn 0 13:07:31 00:00.001 16 0.00 0.00 -\n
"},{"location":"slurm/overview/","title":"Scheduling Overview","text":"The BIH HPC uses the Slurm scheduling system for resource allocation. This section of the manual attempts to give an overview of what scheduling is and how to use the Slurm scheduler. For more detailed information, you will have to refer to the Slurm website and the Slurm man pages (e.g., by entering man sbatch
or man srun
on the HPC terminal's command line).
For a quick introduction and hands-on examples, please see the manual sections
- Overview, starting with Slurm Quickstart, and
- HPC Tutorial, starting with Episode 0.
Also, make sure that you are aware of our How-To: Debug Software and How-To: Debug Software on HPC Systems guides in the case that something goes wrong.
"},{"location":"slurm/overview/#annotated-contents","title":"Annotated Contents","text":" - Background on Scheduling -- some background on scheduling and the terminology used
- Quickstart -- explains the most important Slurm commands, with examples
- Cheat Sheet -- for quick reference
- Job Scripts -- how to setup job scripts with Slurm
 - Memory Allocation -- memory allocation (one of the most important and most often confusing concepts)
- Introduction to Slurm Commands
srun
-- running parallel jobs now sbatch
-- submission of batch jobs scancel
-- stop/kill jobs sinfo
-- display information about the Slurm cluster squeue
 -- information about pending and running jobs scontrol
-- detailed information (and control) sacct
 -- access Slurm accounting information (pending, running, and past jobs) - Format Strings in Slurm -- format strings allow you to display extended information about Slurm scheduler objects
- Slurm and Snakemake -- how to use Snakemake with Slurm
- X11 Forwarding -- X11 forwarding in Slurm (simple; short)
- Rosetta Stone -- lookup table for SGE <-> Slurm
"},{"location":"slurm/overview/#a-word-on-elsewhere","title":"A Word on \"Elsewhere\"","text":"Many other facilities run Slurm clusters and make their documentation available on the internet. We list some that we found useful below. However, be aware that Slurm is a highly configurable and extensible system. Other sites may have different configurations and plugins enabled than we have (or might even have written custom plugins that are not available at BIH). In any case, it's always useful to look \"\u00fcber den Tellerrand\".
- Quick Start User Guide - the official guide from the Slurm creators.
- Slurm
man
Pages - web versions of Unix manual (man
) pages. - TU Dresden Slurm Compendium - nice documentation from the installation in Dresden. Note that their installation is highly customized, in particular, their partition selection is automated (but is not for us).
- Slurm at CECI - CECI is a HPC consortium from Belgium.
- Slurm at the Arctic University of Norway
- Slurm at Technical University of Denmark - if you want to get an insight in how this looks to administrator.
"},{"location":"slurm/quickstart/","title":"Slurm Quickstart","text":"Create an interactive bash session (srun
will run bash in real-time, --pty
connects its stdout
and stderr
to your current session).
hpc-login-1:~$ srun --pty bash -i\nmed0740:~$ echo \"Hello World\"\nHello World\nmed0740:~$ exit\nhpc-login-1:~$\n
Note you probably want to longer running time for your interactive jobs. This way, your jobs can run for up to 28 days. This will make your job be routed automatically into the long
partition as it is the only one that can fit your job.
hpc-login-1:~$ srun --pty --time 28-00 bash -i\nmed0740:~$\n
Pro-Tip: Using Bash aliases for quick access.
hpc-login-1:~$ alias slogin=\"srun --pty bash -i\"\nhpc-login-1:~$ slogin\nmed0740:~$ exit\nhpc-login-1:~$ cat >>~/.bashrc <<\"EOF\"\n# Useful aliases for logging in via Slurm\nalias slogin=\"srun --pty bash -i\"\nalias slogin-x11=\"srun --pty --x11 bash -i\"\nEOF\n
Create an interactive R session on the cluster (assuming conda is active and the environment my-r
is created, e.g., with conda create -n my-r r
).
hpc-login-1:~$ conda activate my-r\nhpc-login-1:~$ srun --pty R\nR version 3.6.2 (2019-12-12) -- \"Dark and Stormy Night\"\nCopyright (C) 2019 The R Foundation for Statistical Computing\n[...]\nType 'demo()' for some demos, 'help()' for on-line help, or\n'help.start()' for an HTML browser interface to help.\nType 'q()' to quit R.\n\n\n> Sys.info()[\"nodename\"]\n nodename\n\"med0740\"\n> q()\nSave workspace image? [y/n/c]:\nhpc-login-1:~$\n
Create an interactive iPython session on the cluster (assuming conda is active and the environment my-python
is created, e.g., with conda create -n my-python python=3 ipython
).
hpc-login-1:~$ conda activate my-python\nhpc-login-1:~$ srun --pty ipython\nPython 3.8.2 | packaged by conda-forge | (default, Mar 5 2020, 17:11:00)\nType 'copyright', 'credits' or 'license' for more information\nIPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.\n\nIn [1]: import socket; socket.gethostname()\nOut[1]: 'med0740'\n\nIn [2]: exit\nhpc-login-1:~$\n
Allocate 4 cores (default is 1 core), and a total of 4GB of RAM on one node (alternatively use --mem-per-cpu
to set RAM per CPU); sbatch
accepts the same argument.
hpc-login-1:~$ srun --cpus-per-task=4 --nodes=1 --mem=4G --pty bash\nmed0740:~$ export | grep SLURM_CPUS_ON_NODE\n4\nmed0740:~$ your-parallel-script --threads 4\n
Submit an R script to the cluster in batch mode (sbatch
schedules the job for later execution).
hpc-login-1:~$ cat >job-script.sh <<\"EOF\"\n#!/bin/bash\necho \"Hello, I'm running on $(hostname) and it's $(date)\"\nEOF\nhpc-login-1:~$ sbatch job-script.sh\nSubmitted batch job 7\n\n# Some time later:\nhpc-login-1:~$ cat slurm-7.out\nHello, I'm running on med0740 and it's Fri Mar 6 07:36:42 CET 2020\nhpc-login-1:~$\n
"},{"location":"slurm/reservations/","title":"Reservations / Maintenances","text":"Hint
Read this in particular if you want to know why your job does not get scheduled and you see Reason=ReqNodeNotAvail,_Reserved_for_maintenance
in scontrol show job
.
Administration registers maintenances with the Slurm scheduler as so-called reservations. You can see the current reservations with scontrol show reservation
. The following is a scheduled reservation affecting ALL nodes of the cluster.
# scontrol show reservation\nReservationName=root_13 StartTime=2021-09-07T00:00:00 EndTime=2021-09-09T00:00:00 Duration=2-00:00:00\n Nodes=hpc-cpu-[1-36],med[0101-0116,0201-0264,0301-0304,0401-0404,0501-0516,0601-0632,0701-0764]\n NodeCnt=236 CoreCnt=5344 Features=(null) PartitionName=(null)\n Flags=MAINT,IGNORE_JOBS,SPEC_NODES,ALL_NODES TRES=cpu=10176\n Users=root Groups=(null) Accounts=(null) Licenses=(null) State=INACTIVE BurstBuffer=(null) Watts=n/a\n MaxStartDelay=(null)\n
You will also be notified when logging into the login nodes, e.g.,
--\n ***NOTE: 1 scheduled maintenance(s)***\n\n 1: 2021-09-07 00:00:00 to 2021-09-09 00:00:00 ALL nodes\n\nYou jobs do not start because of \"Reserved_for_maintenance\"?\nSlurm jobs will only start if they do not overlap with scheduled reservations.\nMore information:\n\n - https://bihealth.github.io/bih-cluster/slurm/reservations/\n - https://bihealth.github.io/bih-cluster/admin/maintenance/\n--\n
"},{"location":"slurm/reservations/#what-is-the-effect-of-a-reservation","title":"What is the Effect of a Reservation?","text":"Maintenance reservations will block the affected nodes (or even the whole cluster) for jobs. If there is a maintenance in one week then your job must have an end time before the reservation starts. By this, the job gives a guarantee to the scheduler that it will not interfer with the maintenance reservation.
For example, scontrol show job JOBID
might report the following
JobId=4011580 JobName=snakejob\n UserId=USER(UID) GroupId=GROUP(GID) MCS_label=N/A\n Priority=1722 Nice=0 Account=GROUP QOS=normal\n JobState=PENDING Reason=ReqNodeNotAvail,_Reserved_for_maintenance Dependency=(null)\n Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0\n RunTime=00:00:00 TimeLimit=28-00:00:00 TimeMin=N/A\n SubmitTime=2021-08-30T09:01:01 EligibleTime=2021-08-30T09:01:01\n AccrueTime=2021-08-30T09:01:01\n StartTime=2021-09-09T00:00:00 EndTime=2021-10-07T00:00:00 Deadline=N/A\n SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-08-30T10:20:40\n Partition=long AllocNode:Sid=172.16.35.153:5453\n ReqNodeList=(null) ExcNodeList=(null)\n NodeList=(null)\n NumNodes=1-1 NumCPUs=8 NumTasks=8 CPUs/Task=1 ReqB:S:C:T=0:0:*:*\n TRES=cpu=8,mem=4G,node=1,billing=8\n Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*\n MinCPUsNode=1 MinMemoryNode=4G MinTmpDiskNode=0\n Features=(null) DelayBoot=00:00:00\n OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)\n Power=\n NtasksPerTRES:0\n
Look out for the Reason
line:
Reason=ReqNodeNotAvail,_Reserved_for_maintenance\n
This job is scheduled to run up to 4 weeks and has been submitted on 2021-08-30.
Right now the following reservation is active
# scontrol show reservation\nReservationName=root_13 StartTime=2021-09-07T00:00:00 EndTime=2021-09-09T00:00:00 Duration=2-00:00:00\n Nodes=hpc-cpu-[1-36],med[0101-0116,0201-0264,0301-0304,0401-0404,0501-0516,0601-0632,0701-0764]\n NodeCnt=236 CoreCnt=5344 Features=(null) PartitionName=(null)\n Flags=MAINT,IGNORE_JOBS,SPEC_NODES,ALL_NODES TRES=cpu=10176\n Users=root Groups=(null) Accounts=(null) Licenses=(null) State=INACTIVE BurstBuffer=(null) Watts=n/a\n MaxStartDelay=(null)\n
Thus, the scheduler decided to set a StartTime
of the job to 2021-09-09T00:00:00
, which is the end time of the reservation. Effectively, the job is forced to run outside the maintenance reservation.
You can resolve this by using a --time=
parameter to srun
or sbatch
such that the job ends before the maintenance reservation starts.
"},{"location":"slurm/rosetta-stone/","title":"Slurm Rosetta Stone","text":"Rosetta Stone?
The Rosetta Stone is a stone slab that carries the same text in Egyptian hieroglyphs and ancient Greek. This was key for decyphering Egyptian hieroglyphs in the 18th century. Nowadays, the term is often used to label translation tables such as the one below.
The table below shows some SGE commands and their Slurm equivalents.
User Command SGE Slurm remote login qrsh/qlogin
srun --pty bash
run interactively N/A srun --pty program
submit job qsub script.sh
sbatch script.sh
delete job qdel job-id
scancel job-id
job status by job id N/A squeue --job job-id
detailed job status qstat -u '*' -j job-id
sstat job-id
job status of your jobs qstat
squeue --me
job status by user qstat -u user
squeue -u user
hold job qhold job-id
scontrol hold job-id
release job qrls job-id
scontrol release job-id
queue list qconf -sql
scontrol show partitions
node list qhost
sinfo -N
OR scontrol show nodes
cluster status qhost -q
sinfo
show node resources N/A sinfo \"%n %G\"
Job Specification SGE Slurm script directive marker #$
#SBATCH
(run in queue) -q queue
-p queue
allocated nodes N/A -N min[-max]
allocate cores -pe smp count
-n count
limit running time -l h_rt=time
-t days-hh:mm:s
redirectd stdout -o file
-o file
redirect stderr -e file
-e file
combine stdout/stderr -j yes
-o without -e
copy environment -V
--export=ALL\\|NONE\\|variables
email notification -m abe
--mail-type=events
send email to -M email
--mail-user=email
job name -N name
--job-name=name
restart job -r yes|no
--requeue|--no-requeue
working directory -wd path
--workdir
run exclusively -l exclusive
--exclusive
OR --shared
allocate memory -l h_vmem=size
--mem=mem
OR --mem-per-cpu=mem
wait for job -hold_jid jid
--depend state:job
select target host -l hostname=host1\\|host1
--nodelist=nodes
AND/OR --exclude
allocate GPU -l gpu=1
--gres=gpu:tesla:count
or --gres=gpu:a40:count
"},{"location":"slurm/snakemake/","title":"Snakemake with Slurm","text":"This page describes how to use Snakemake with Slurm.
"},{"location":"slurm/snakemake/#prerequisites","title":"Prerequisites","text":" - This assumes that you have Miniforge properly setup with Bioconda.
- Also it assumes that you have already activated the Miniforge base environment with
source miniforge/bin/activate
.
"},{"location":"slurm/snakemake/#environment-setup","title":"Environment Setup","text":"We first create a new environment snakemake-slurm
and activate it. We need the snakemake
package for this.
host:~$ conda create -y -n snakemake-slurm snakemake\n[...]\n#\n# To activate this environment, use\n#\n# $ conda activate snakemake-slurm\n#\n# To deactivate an active environment, use\n#\n# $ conda deactivate\nhost:~$ conda activate snakemake-slurm\n(snakemake-slurm) host:~$\n
"},{"location":"slurm/snakemake/#snakemake-workflow-setup","title":"Snakemake Workflow Setup","text":"We create a workflow and ensure that it works properly with multi-threaded Snakemake (no cluster submission here!)
host:~$ mkdir -p snake-slurm\nhost:~$ cd snake-slurm\nhost:snake-slurm$ cat >Snakefile <<\"EOF\"\nrule default:\n input: \"the-result.txt\"\n\nrule mkresult:\n output: \"the-result.txt\"\n shell: r\"sleep 1m; touch the-result.txt\"\nEOF\nhost:snake-slurm$ snakemake --cores=1\n[...]\nhost:snake-slurm$ ls\nSnakefile the-result.txt\nhost:snake-slurm$ rm the-result.txt\n
"},{"location":"slurm/snakemake/#snakemake-and-slurm","title":"Snakemake and Slurm","text":"You have two options:
- Simply use
snakemake --profile=cubi-v1
and the Snakemake resource configuration as shown below. STRONGLY PREFERRED - Use the
snakemake --cluster='sbatch ...'
command.
Note that we sneaked in a sleep 1m
? In a second terminal session, we can see that the job has been submitted to SLURM indeed.
host:~$ squeue -u holtgrem_c\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 325 debug snakejob holtgrem R 0:47 1 med0127\n
"},{"location":"slurm/snakemake/#threads-resources","title":"Threads & Resources","text":"The cubi-v1
profile (stored in /etc/xdg/snakemake/cubi-v1
on all cluster nodes) supports the following specification in your Snakemake rule:
threads
: the number of threads to execute the job on - memory in a syntax understood by Slurm, EITHER
resources.mem
/resources.mem_mb
: the memory to allocate for the whole job, OR resources.mem_per_thread
: the memory to allocate for each thread.
resources.time
: the running time of the rule, in a syntax supported by Slurm, e.g. HH:MM:SS
or D-HH:MM:SS
resources.partition
: the partition to submit your job into (Slurm will pick a fitting partition for you by default) resources.nodes
: the number of nodes to schedule your job on (defaults to 1
and you will want to keep that value unless you want to use MPI)
You will need Snakemake >=7.0.2 for this.
Here is how to call Snakemake:
# snakemake --profile=cubi-v1 -j1\n
To set rule-specific resources:
rule myrule:\n threads: 1\n resources:\n mem='8G',\n time='04:00:00',\n input: # ...\n output: # ...\n shell: # ...\n
You can combine this with Snakemake resource callables, of course:
def myrule_mem(wildcards, attempt):\n mem = 2 * attempt\n return '%dG' % mem\n\nrule snps:\n threads: 1\n resources:\n mem=myrule_mem,\n time='04:00:00',\n input: # ...\n output: # ...\n shell: # ...\n
"},{"location":"slurm/snakemake/#custom-logging-directory","title":"Custom logging directory","text":"By default, slurm will write log files into the working directory of snakemake, which will look like slurm-$jobid.out
.
To change this behaviour, the environment variable SBATCH_DEFAULTS
can be set to re-route the --output
parameter. If you want to write your files into slurm_logs
with a filename pattern of $name-$jobid
for instance, consider the following snippet for your submission script:
#!/bin/bash\n#\n#SBATCH --job-name=snakemake_main_job\n#SBATCH --ntasks=1\n#SBATCH --nodes=1\n#SBATCH --time=48:10:00\n#SBATCH --mem-per-cpu=300M\n#SBATCH --output=slurm_logs/%x-%j.log\n\nmkdir -p slurm_logs\nexport SBATCH_DEFAULTS=\" --output=slurm_logs/%x-%j.log\"\n\ndate\nsrun snakemake --use-conda -j1 --profile=cubi-v1\ndate\n
The name of the snakemake slurm job will be snakemake_main_job
, the name of the jobs spawned from it will be called after the rule name in the Snakefile.
"},{"location":"slurm/temporary-files/","title":"Slurm and Temporary Files","text":"This section describes how Slurm handles temporary files on the local disk.
Temporary Files Best Practices
See Best Practices: Temporary Files for information how to use temporary files effectively.
"},{"location":"slurm/temporary-files/#slurm-behaviour","title":"Slurm Behaviour","text":"Our Slurm configuration has the following behaviour.
"},{"location":"slurm/temporary-files/#environment-variable-tmpdir","title":"Environment Variable TMPDIR","text":"Slurm itself will by default not change the TMPDIR
environment variable but retain the variable's value from the srun
or sbatch
call.
"},{"location":"slurm/temporary-files/#private-local-tmp-directories","title":"Private Local /tmp
Directories","text":"The only place where users can write data to on local storage of the compute nodes is /tmp
.
Storage is a consumable shared resource as the storage used by one job cannot use another job. It is thus critical that Slurm cleans up after each job such that all space on the local node is available to the next job. This is done using the job_container/tmpfs Slurm plugin.
This plugin creates a so-called Linux namespace for each job and creates a bind mount of /tmp
to a location on the local storage. This mount is only visible to the currently running job and each job, even of the same user, get their own /tmp
. After a job terminates, Slurm will remove the directory and all of its content.
There is a notable exception. If you use ssh
to connect to a node rather than using srun
or sbatch
, you will see the system /tmp
directory and can also write to it. This usage of storage is not tracked and consequently you can circumvent the Slurm quota management. Using /tmp
in this fashion (i.e., outside of Slurm-controlled jobs) is prohibited. If it cannot be helped (e.g., if you need to run some debugging application that needs to create FIFO or socket files) then keep usage of /tmp
outside of Slurm job below 100MB.
"},{"location":"slurm/temporary-files/#tracking-local-storage-localtmp","title":"Tracking Local Storage localtmp
","text":"Enforcing localtmp
Gres
From January 31, we will enforce the allocated storage in /tmp
on the local disk with quotas. Jobs writing to /tmp
beyond the quota in the job allocation will not function properly and probably crash with \"out of disk quota\" messages.
Slurm tracks the available local storage above 100MB on nodes in the localtmp
generic resource (aka Gres). The resource is counted in steps of 1MB, such that a node with 350GB of local storage would look as follows in scontrol show node
:
hpc-login-1 # scontrol show node hpc-cpu-1\nNodeName=hpc-cpu-1 Arch=x86_64 CoresPerSocket=24\n [...]\n Gres=localtmp:350K\n [...]\n CfgTRES=cpu=96,mem=360000M,billing=96,gres/localtmp=358400\n [...]\n
Each job is automaticaly granted 100MB of storage on the local disk which is sufficient for most standard programs. If your job needs more temporary storage then you should either
- use the
$HOME/scratch
volume (see Best Practices: Temporary Files) - specify a
localtmp
generic resource (described here)
You can allocate the resource with --gres=localtmp:SIZE
where SIZE
is given in MB.
hpc-login-1 # srun --gres=localtmp:100k --pty bash -i\nhpc-cpu-1 # scontrol show node hpc-cpu-1\nNodeName=hpc-cpu-1 Arch=x86_64 CoresPerSocket=24\n [...]\n Gres=localtmp:250K\n [...]\n CfgTRES=cpu=96,mem=360000M,billing=96,gres/localtmp=358400\n [...]\n AllocTRES=cpu=92,mem=351G,gres/localtmp=102400\n [...]\n
The first output tells us about the resource configured to be available to user jobs and the last line show us that 100k=102400
MB of local storage are allocated.
You can also see the used resources in the details of your job:
scontrol show job 14848\nJobId=14848 JobName=example.sh\n [...]\n TresPerNode=gres:localtmp:100k\n
"},{"location":"slurm/x11/","title":"Slurm and X11","text":"Make sure to connect to the login node with X11 forwarding.
host:~$ ssh -X -l user_c hpc-login-1.cubi.bihealth.org\n
Once connected to the login node, pass the --x11
flag.
hpc-login-1:~$ srun --pty --x11 xterm\n
"},{"location":"storage/home-quota/","title":"Keeping your home folder clean","text":"We set quite restrictive quotas for user homes, but in exchange you get file system snapshots and mirroring. Your home folder should therefore only be used for scripts, your user config, and other small files. Everything else should be stored in the work
or scratch
subdirectories, which effectively link to your group's shared storage space. This document describes some common pitfalls and how to circumvent them.
Hint
The tilde character (~
) is shorthand for your home directory.
"},{"location":"storage/home-quota/#code-libraries-and-other-big-folders","title":"Code libraries and other big folders","text":"Various programs are used to depositing large folders in a user's home and can quickly use up your allotted storage quota. These include:
- Python:
~/.local/lib/python*
- *conda: Location chosen by the user.
- R:
~/R/x86_64-pc-linux-gnu-library
- HPC portal:
~/ondemand
Please note that directories whose name is starting with a dot are not shown by the normal ls
command, but require the ls -a
flag. You can search your home folder for large directories like so:
$ du -shc ~/.* ~/* --exclude=.. --exclude=.\n
You should move these locations to your work
folder and create symbolic links in their place. Conda installations should be installed in work
from the very beginning as they do not react well to being moved around.
Here is an example for the .local
folder.
$ mv ~/.local ~/work/.local\n$ ln -s ~/work/.local ~/.local\n
"},{"location":"storage/home-quota/#temporary-files","title":"Temporary Files","text":"Another usual culprit is the hidden .cache
directory which contains temporary files. This folder can be moved to the scratch
volume in a similar manner as described above.
$ mv ~/.cache ~/scratch/.cache\n$ ln -s ~/scratch/.cache ~/.cache\n
Important
Files placed in your scratch
directory will be automatically removed after 2 weeks. Do not place any valuable files in there.
"},{"location":"storage/migration-faq/","title":"Data Migration Tips and tricks","text":"Please use hpc-transfer-1
and hpc-transfer-2
for moving large amounts of files. This not only leaves the compute notes available for actual computation, but also has no risk of your jobs being killed by Slurm. You should also use tmux
to not risk connection loss during long running transfers.
"},{"location":"storage/migration-faq/#moving-a-project-folder","title":"Moving a project folder","text":" -
Define source and target location and copy contents. Please replace the parts in curly brackets with your actual folder names. It is important to end paths with a trailing slash (/
) as this is interpreted by sync
as \u201call files in this folder\u201d.
$ SOURCE=/data/gpfs-1/work/projects/{my_project}/\n$ TARGET=/data/cephfs-2/unmirrored/projects/{my-project}/\n$ rsync -ahP --stats --dry-run $SOURCE $TARGET\n
-
Remove the --dry-run
flag to start the actual copying process.
Important
File ownership information will be lost during this process. This is due to non-root users not being allowed to change ownership of arbitrary files. If this is a problem for you, please contact our admins again after completing this step.
-
Perform a second rsync
to check if all files were successfully transferred. Paranoid users might want to add the --checksum
flag to rsync
or use hashdeep
. Please note the flag --remove-source-files
which will do exactly as the name suggests, but leaves empty directories behind.
$ rsync -ahX --stats --remove-source-files --dry-run $SOURCE $TARGET\n
- Again, remove the
--dry-run
flag to start the actual deletion. - Check if all files are gone from the SOURCE folder and remove the empty directories:
$ find $SOURCE -type f | wc -l\n0\n$ rm -r $SOURCE\n
Warning
When defining your SOURCE location, do not use the *
wildcard character. It will not match hidden (dot) files and leave them behind. Its better to use a trailing slash which matches \u201cAll files in this folder\u201d.
"},{"location":"storage/migration-faq/#moving-user-work-folders","title":"Moving user work folders","text":""},{"location":"storage/migration-faq/#work-data","title":"Work data","text":" -
All files within your own work directory can be transferred as follows. Please replace parts in curly braces with your cluster user name.
$ SOURCE=/data/gpfs-1/work/users/{username}/\n$ TARGET=/data/cephfs-1/home/users/{username}/work/\n$ rsync -ahP --stats --dry-run $SOURCE $TARGET\n
Note
The --dry-run
flag lets you check that rsync is working as expected without copying any files. Remove it to start the actual transfer.
-
Perform a second rsync
to check if all files were successfully transferred. Paranoid users might want to add the --checksums
flag or use hashdeep
. Please note the flag --remove-source-files
which will do exactly as the name suggests, but leaves empty directories behind.
$ rsync -ahP --stats --remove-source-files --dry-run $SOURCE $TARGET\n
- Check if all files are gone from the SOURCE folder:
$ find $SOURCE -type f | wc -l\n0\n
"},{"location":"storage/migration-faq/#conda-environments","title":"Conda environments","text":"Conda installations tend not to react well to moving their main folder from its original location. There are numerous ways around this problem which are described here.
A simple solution we can recommend is this:
-
Install a fresh version of conda or mamba in your new work folder. Don't forget to first remove the conda init block in ~/.bashrc
.
$ nano ~/.bashrc\n$ conda init\n$ conda config --set auto_activate_base false\n
-
You can then use your new conda to export your old environments by specifying a full path like so:
$ conda env export -p /fast/work/user/$USER/miniconda/envs/<env_name> -f <env_name>.yaml\n
If you run into errors it might be better to also use the --no-builds
flag. -
Finally re-create your old environments from the yaml files:
$ conda env create -f {environment.yml}\n
"},{"location":"storage/querying-storage/","title":"Querying Storage Quotas","text":"Outdated
This document is only valid for the old, third-generation file system and will be removed soon. Quotas of our new CephFS storage are communicated via the HPC Access web portal.
As described elsewhere, all data in your user, group, and project volumes is subject to quotas. This page quickly shows how to query for the current usage of data volume and file counts for your user, group, and projects.
"},{"location":"storage/querying-storage/#query-for-user-data-and-file-usage","title":"Query for User Data and File Usage","text":"The file /etc/bashrc.gpfs-quota
contains some Bash functions that you can use for querying the quota usage. This file is automatically sourced in all of your Bash sessions.
For querying your user's data and file usage, enter the following command:
# bih-gpfs-quota-user holtgrem_c\n
You will get a report as follows. As soon as usage reaches 90%, the data/file usage will be highlighted in yellow. If you pass 99%, the data/file usage will be highlighted in red.
=================================\nQuota Report for: user holtgrem_c\n=================================\n\n DATA quota GR- FILES quota GR-\nENTITY NAME FSET USED SOFT HARD ACE USED SOFT HARD ACE\n------- ---------- ------- ----- ---- ----- ----- --- ----- ---- ----- ----- ---\nusers holtgrem_c home 103M 10% 1.0G 1.5G - 2.5k 25% 10k 12k -\nusers holtgrem_c work 639G 62% 1.0T 1.1T - 1.0M 52% 2.0M 2.2M -\nusers holtgrem_c scratch 42G 0% 200T 220T - 207k 0.1% 200M 220M -\n[...]\n
"},{"location":"storage/querying-storage/#query-for-group-data-and-file-usage","title":"Query for Group Data and File Usage","text":"# bih-gpfs-report-quota group ag_someag\n=================================\nQuota Report for: group ag_someag\n=================================\n\n DATA quota GR- FILES quota GR-\nENTITY NAME FSET USED SOFT HARD ACE USED SOFT HARD ACE\n------- ---------- ------- ----- ---- ----- ----- --- ----- ---- ----- ----- ---\ngroups ag_someag home 0 0% 1.0G 1.5G - 4 0% 10k 12k -\ngroups ag_someag work 349G 34% 1.0T 1.5T - 302 0% 2.0M 2.2M -\ngroups ag_someag scratch 0 0% 200T 220T - 1 0% 200M 220M -\n\n[...]\n
"},{"location":"storage/querying-storage/#query-for-project-data-and-file-usage","title":"Query for Project Data and File Usage","text":"# bih-gpfs-report-quota project someproj\n==================================\nQuota Report for: project someproj\n==================================\n\n DATA quota GR- FILES quota GR-\nENTITY NAME FSET USED SOFT HARD ACE USED SOFT HARD ACE\n------- ---------- ------- ----- ---- ----- ----- --- ----- ---- ----- ----- ---\ngroups someproj home 0 0% 1.0G 1.5G - 4 0% 10k 12k -\ngroups someproj work 349G 34% 1.0T 1.5T - 302 0% 2.0M 2.2M -\ngroups someproj scratch 0 0% 200T 220T - 1 0% 200M 220M -\n\n[...]\n
"},{"location":"storage/scratch-cleanup/","title":"Automated Cleanup of Scratch","text":"The scratch
space is automatically cleaned up nightly with the following mechanism.
- Daily snapshots of the
scratch
folder are created and retained for 3 days. - Files which were not modified for the last 14 days are removed.
- Erroneously deleted files can be manually retrieved from the snapshots.
Warning
We specifically use the mtime
attribute to determine if files in scratch should be cleaned up. Copying or downloading files to scratch while preserving the original mtime
might lead to unexpected results.
"},{"location":"storage/storage-locations/","title":"Storage and Volumes: Locations","text":"This document describes the forth iteration of the file system structure on the BIH HPC cluster. It was made necessary because the previous file system was no longer supported by the manufacturer and we since switched to distributed Ceph storage.
Important
For now, the old, third-generation file system is still mounted at /fast
. It will be decommissioned soon, please consult this document describing the migration process!
"},{"location":"storage/storage-locations/#organizational-entities","title":"Organizational Entities","text":"There are the following three entities on the cluster:
- Users (real people)
- Groups (Arbeitsgruppen) with one leader and an optional delegate
- Projects with one owner and an optional delegate
Each user, group, and project can have storage folders in different locations.
"},{"location":"storage/storage-locations/#data-types-and-storage-tiers","title":"Data Types and Storage Tiers","text":"Files stored on the HPC fall into one of three categories:
-
Home folders store programs, scripts, and user config i.\u00a0e. long-lived and very important files. Loss of this data requires to redo manual work (like programming).
-
Work folders store data of potentially large size which has a medium life time and is important. Examples are raw sequencing data and intermediate results that are to be kept (e.\u00a0g. sorted and indexed BAM files). Work data requires time-consuming actions to be restored, such as downloading large amounts of data or long-running computation.
-
Scratch folder store temporary files with a short life-time. Examples are temporary files (e.\u00a0g. unsorted BAM files). Scratch data is created to be removed eventually.
Ceph storage comes in two types which differ in their I/O speed, total capacity, and cost. They are called Tier 1 and Tier 2 and sometimes hot storage and warm storage. In the HPC filesystem they are mounted in /data/cephfs-1
and /data/cephfs-2
.
- Tier 1 storage is fast, relatively small, expensive, and optimized for performance.
- Tier 2 storage is slow, big, cheap, and built for keeping large files for longer times.
Storage quotas are imposed in these locations to restrict the maximum size of folders. Amount and utilization of quotas is communicated via the HPC Access web portal.
"},{"location":"storage/storage-locations/#home-directories","title":"Home Directories","text":"Location: /data/cephfs-1/home/
Only users have home directories on Tier 1 storage. This is the starting point when starting a new shell or SSH session. Important config files are stored here as well as analysis scripts and small user files. Home folders have a strict storage quota of 1\u00a0GB.
"},{"location":"storage/storage-locations/#work-directories","title":"Work Directories","text":"Location: /data/cephfs-1/work/
Groups and projects have work directories on Tier 1 storage. User home folders contain a symlink to their respective group's work folder. Files shared within a group/project are stored here as long as they are in active use. Work folders are generally limited to 1\u00a0TB per group. Project work folders are allocated on an individual basis.
"},{"location":"storage/storage-locations/#scratch-space","title":"Scratch Space","text":"Location: /data/cephfs-1/scratch/
Groups and projects have scratch space on Tier 1 storage. User home folders contain a symlink to their respective group's scratch space. Meant for temporary, potentially large data e.\u00a0g. intermediate unsorted or unmasked BAM files, data downloaded from the internet etc. Scratch space is generally limited to 10\u00a0TB per group. Projects are allocated scratch on an individual basis. Files in scratch will be automatically removed 2 weeks after their creation.
"},{"location":"storage/storage-locations/#tier-2-storage","title":"Tier 2 Storage","text":"Location: /data/cephfs-2/
This is where big files go when they are not in active use. Groups are allocated 10 TB of Tier 2 storage by default. File quotas here can be significantly larger as space is much cheaper and more abundant than on Tier 1.
Note
Tier 2 storage is currently not accessible from HPC login nodes.
"},{"location":"storage/storage-locations/#overview","title":"Overview","text":"Tier Function Path Default Quota 1 User home /data/cephfs-1/home/users/<user>
1 GB 1 Group work /data/cephfs-1/work/groups/<group>
1 TB 1 Group scratch /data/cephfs-1/scratch/groups/<group>
10 TB 1 Project work /data/cephfs-1/work/projects/<project>
On request 1 Project scratch /data/cephfs-1/scratch/projects/<project>
On request 2 Group /data/cephfs-2/unmirrored/groups/<group>
10 TB 2 Project /data/cephfs-2/unmirrored/projects/<project>
On request 2 Group /data/cephfs-2/mirrored/groups/<group>
On request 2 Project /data/cephfs-2/mirrored/projects/<project>
On request"},{"location":"storage/storage-locations/#snapshots-and-mirroring","title":"Snapshots and Mirroring","text":"Snapshots are incremental copies of the state of the data at a particular point in time. They provide safety against various \"Ops, did I just delete that?\" scenarios, meaning they can be used to recover lost or damaged files. Depending on the location and Tier, CephFS creates snapshots in different frequencies and retention plans.
Location Path Retention policy Mirrored User homes /data/cephfs-1/home/users/
Hourly for 48 h, daily for 14 d yes Group/project work /data/cephfs-1/work/
Four times a day, daily for 5 d no Group/project scratch /data/cephfs-1/scratch/
Daily for 3 d no Group/project mirrored /data/cephfs-2/mirrored/
Daily for 30 d, weekly for 16 w yes Group/project unmirrored /data/cephfs-2/unmirrored/
Daily for 30 d, weekly for 16 w no Some parts of Tier 1 and Tier 2 snapshots are also mirrored into a separate fire compartment within the data center. This provides an additional layer of security i.\u00a0e. physical damage to the servers.
"},{"location":"storage/storage-locations/#accessing-snapshots","title":"Accessing Snapshots","text":"To access snapshots simply navigate to the .snap/
sub-folder of the respective location. This special folder exists on all levels of the CephFS file hierarchy, so even in your user home directory. Inside you will find one folder per snapshot created and in those a complete replica of the respective folder at the time of snapshot creation.
For example:
/data/cephfs-1/home/.snap/<some_snapshot>/users/<your_user>/
same as: /data/cephfs-1/home/users/<your_user>/.snap/<some_snapshot>
/data/cephfs-1/work/.snap/<some_snapshot>/groups/<your_group>/
/data/cephfs-2/unmirrored/.snap/<some_snapshot>/projects/<your_project>/
Here is a simple example of how to restore a file:
$ cd /data/cephfs-2/unmirrored/groups/cubi/.snap/scheduled-2024-03-11-00_00_00_UTC/\n$ ls -l\nimportant_file.txt\n$ cp important_file.txt /data/cephfs-2/unmirrored/groups/cubi/\n
"},{"location":"storage/storage-locations/#technical-implementation","title":"Technical Implementation","text":""},{"location":"storage/storage-locations/#tier-1","title":"Tier 1","text":" - Fast & expensive
- mounted on
/data/cephfs-1
- Currently 12 Nodes with 10 \u00d7 14 TB NVME SSD each
- 1.68 PB raw storage
- 1.45 PB erasure coded (EC 8:2)
- 1.23 PB usable (85 %, Ceph performance limit)
- For typical CUBI use case 3 to 5 times faster I/O then the old DDN
- Two more nodes in purchasing process
- Hardware costs:
- One node/chunk: 45.000 \u20ac (150 TB)
- ca. 300 \u20ac/TB
"},{"location":"storage/storage-locations/#tier-2","title":"Tier 2","text":" - Slower but more affordable
- mounted on
/data/cephfs-2
- Currently 10 nodes with 52 HDDs slots and SSD cache (~40 HDDs per node with 16\u201318 TB capacity)
- 6.6 PB raw storage
- 5.3 PB erasure coded (EC 8:2)
- 4.5 PB usable (85 %; Ceph performance limit)
- More nodes in purchasing process
- Hardware costs:
- ca. 50 \u20ac per TB, 100 \u20ac mirrored
- small chunk extension possible
"},{"location":"storage/storage-locations/#tier-2-mirror","title":"Tier 2 mirror","text":" - Similar in hardware and size (10 nodes, 6+ PB)
- Stored in separate fire compartment.
"},{"location":"storage/storage-migration/","title":"Migration from old GPFS to new CephFS","text":"Important
We will remove access to /fast
on most cluster nodes following September 30th.
"},{"location":"storage/storage-migration/#what-is-going-to-happen","title":"What is going to happen?","text":"Files on the cluster's main storage /data/gpfs-1
aka. /fast
will move to a new file system. That includes users' home directories, work directories, and work-group directories. Once files have been moved to their new locations, /fast
will be retired.
Simultaneously we will move towards a more unified naming scheme for project and group folder names. From now on, all such folders names shall be in kebab-case. This is Berlin after all. Group folders will also be renamed, removing the \"ag_\" prefix.
Detailed communication about the move will be communicated via the cluster mailinglist and the user forum. For technical help, please consult the Data Migration Tips and tricks.
"},{"location":"storage/storage-migration/#why-is-this-happening","title":"Why is this happening?","text":"/fast
is based on a high performance proprietary hardware (DDN) & file system (GPFS). The company selling it has terminated support which also means buying replacement parts will become increasingly difficult.
"},{"location":"storage/storage-migration/#the-new-storage","title":"The new storage","text":"There are two file systems set up to replace /fast
, named Tier 1 and Tier 2 after their difference in I/O speed:
- Tier 1 is faster than
/fast
ever was, but it only has about 75\u00a0% of its usable capacity. - Tier 2 is not as fast, but much larger, almost 3 times the current usable capacity.
The Hot storage Tier 1 is reserved for files requiring frequent random access, user homes, and scratch. Tier 2 (Warm storage) should be used for everything else. Both file systems are based on the open-source, software-defined Ceph storage platform and differ in the type of drives used. Tier 1 or Cephfs-1 uses NVME SSDs and is optimized for performance, Tier 2 or Cephfs-2 used traditional hard drives and is optimized for cost.
So these are the three terminologies in use right now:
- Cephfs-1 = Tier 1 = Hot storage =
/data/cephfs-1
- Cephfs-2 = Tier 2 = Warm storage =
/data/cephfs-2
More information about CephFS can be found here.
"},{"location":"storage/storage-migration/#new-file-locations","title":"New file locations","text":"Naturally, paths are going to change after files move to their new location. Due to the increase in storage quality options, there will be some more folders to consider.
"},{"location":"storage/storage-migration/#users","title":"Users","text":" - Home on Tier 1:
/data/cephfs-1/home/users/<user>
- Work on Tier 1:
/data/cephfs-1/work/groups/<doe>/users/<user>
- Scratch on Tier 1:
/data/cephfs-1/scratch/groups/<doe>/users/<user>
Important
User work
& scratch
spaces are now part of the user's group folder. This means, groups need to coordinate internally to distribute their allotted quota according to each user's needs.
The implementation is done via symlinks created by default when the user account is moved to its new destination:
~/work -> /data/cephfs-1/work/groups/<group>/users/<user>
~/scratch -> /data/cephfs-1/scratch/groups/<group>/users/<user>
"},{"location":"storage/storage-migration/#groups","title":"Groups","text":" - Work on Tier 1:
/data/cephfs-1/work/groups/<group>
- Scratch on Tier 1:
/data/cephfs-1/scratch/groups/<group>
- Tier 2 storage:
/data/cephfs-2/unmirrored/groups/<group>
- Mirrored space on Tier 2 is available on request.
"},{"location":"storage/storage-migration/#projects","title":"Projects","text":" - Work on Tier 1:
/data/cephfs-1/work/projects/<project>
- Scratch on Tier 1:
/data/cephfs-1/scratch/projects/<project>
- Tier 2 storage is available on request.
"},{"location":"storage/storage-migration/#recommended-practices","title":"Recommended practices","text":""},{"location":"storage/storage-migration/#data-locations","title":"Data locations","text":""},{"location":"storage/storage-migration/#tiers","title":"Tiers","text":" - Tier 1: Designed for many I/O operations. Store files here which are actively used by your compute jobs.
- Tier 2: Big, cheap storage. Fill with files not in active use.
- Tier 2 mirrored: Extra layer of security. Longer term storage of invaluable data.
"},{"location":"storage/storage-migration/#folders","title":"Folders","text":" - Home: Configuration files, templates, generic scripts, & small documents.
- Work: Conda environments, R packages, data actively processed or analyzed.
- Scratch: Non-persistent storage for temporary or intermediate files, caches, etc.
"},{"location":"storage/storage-migration/#project-life-cycle","title":"Project life cycle","text":" - Import raw data on Tier 2 for validation (checksums, \u2026)
- Stage raw data on Tier 1 for QC & processing.
- Save processing results to Tier 2.
- Continue analysis on Tier 1.
- Save analysis results on Tier 2.
- Reports & publications can remain on Tier 2.
- After publication (or the end of the project), files on Tier 1 should be deleted.
"},{"location":"storage/storage-migration/#example-use-cases","title":"Example use cases","text":"Space on Tier 1 is limited. Your colleagues, other cluster users, and admins will be very grateful if you use it only for files you actively need to perform read/write operations on. This means main project storage should probably always be on Tier 2 with workflows to stage subsets of data onto Tier 1 for analysis.
These examples are based on our experience of processing diverse NGS datasets. Your mileage may vary but there is a basic principle that remains true for all projects.
"},{"location":"storage/storage-migration/#dna-sequencing-wes-wgs","title":"DNA sequencing (WES, WGS)","text":"Typical Whole Genome Sequencing data of a human sample at 100x coverage requires about 150 GB of storage, Whole Exome Sequencing files occupy between 6 and 30 GB. These large files require considerable I/O resources for processing, in particular for the mapping step. A prudent workflow for these kind of analysis would therefore be the following:
- For one sample in the cohort, subsample its raw data files (
fastqs
) from the Tier 2 location to Tier 1. seqtk
is your friend! - Test, improve & check your processing scripts on those smaller files.
- Once you are happy with the scripts, copy the complete
fastq
files from Tier 2 to Tier 1. Run the your scripts on the whole dataset, and copy the results (bam
or cram
files) back to Tier 2. - Remove raw data & bam/cram files from Tier 1, unless the downstream processing of mapped files (variant calling, structural variants, ...) can be done immediatly.
Tip
Don't forget to use your scratch
area for transient operations, for example to sort your bam
file after mapping. More information on how to efficiently set up your temporary directory here.
"},{"location":"storage/storage-migration/#bulk-rna-seq","title":"bulk RNA-seq","text":"Analysis of RNA expression datasets are typically a long and iterative process, where the data must remain accessible for a significant period. However, there is usually no need to keep raw data files and mapping results available once the gene & transcripts counts have been generated. The count files are much smaller than the raw data or the mapped data, so they can live longer on Tier 1.
A typical workflow would be:
- Copy your
fastq
files from Tier 2 to Tier 1. - Perform raw data quality control, and store the outcome on Tier 2.
- Get expression levels, for example using
salmon
or STAR
, and store the results on Tier 2. - Import the expression levels into
R
, using tximport
and DESeq2
or featureCounts
& edgeR
, for example. - Save expression levels (
R
objects) and the output of salmon
, STAR
, or any mapper/aligner of your choice to Tier 2. - Remove raw data, bam & count files from Tier 1.
Tip
If using STAR
, don't forget to use your scratch
area for transient operations. More information on how to efficiently set up your temporary directory here
"},{"location":"storage/storage-migration/#scrna-seq","title":"scRNA-seq","text":"The analysis workflow of bulk RNA & single cell dataset is conceptually similar: Large raw files need to be processed once and only the outcome of the processing (gene counts matrices) are required for downstream analysis. Therefore, a typical workflow would be:
- Copy your
fastq
files from Tier 2 to Tier 1. - Perform raw data QC, and store the results on Tier 2.
- Get the count matrix, e.\u00a0g. using
Cell Ranger
or alevin-fry
, perform count matrix QC and store the results on Tier 2. - Remove raw data, bam & count files from Tier 1.
- Downstream analysis with
seurat
, scanpy
, or Loupe Browser
.
"},{"location":"storage/storage-migration/#machine-learning","title":"Machine learning","text":"There is no obvious workflow that covers most used cases for machine learning. However,
- Training might be done on scratch where data access is quick and data size not as constrained as on work space. But files will disappear after 14 days.
- Some models can be updated with new data, without needing to keep the whole dataset on Tier 1.
"},{"location":"storage/storage-migration/#data-migration-process-from-old-fast-to-cephfs","title":"Data migration process from old /fast
to CephFS","text":" - After being contacted by HPC admins, delegates move project folders to Tier 2. Additional Tier 1 storage is granted on request.
- User homes and group folders are moved by HPC admins to Tier 1 and 2 as appropriate. This is done on a group-by-group basis.
- Users move contents of their work directories into the new shared group work space.
Best practices and tools will be provided.
"}]}
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
index 553c73e8..dc6b6346 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,462 +2,462 @@
https://bihealth.github.io/bih-cluster/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/admin/getting-access/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/admin/maintenance/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/admin/policies/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/bashrc-guide/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/env-modules/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/project-structure/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/screen-tmux/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/software-craftmanship/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/software-installation-with-conda/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/best-practice/temp-files/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/connecting-windows/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/connecting/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/connection-problems/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/from-external/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/ssh-basics/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/advanced-ssh/linux/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/advanced-ssh/overview/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/advanced-ssh/windows/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/generate-key/linux/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/generate-key/windows/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/submit-key/charite/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/connecting/submit-key/mdc/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/annotations/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/app-support/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/databases/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/exomes-panels/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/exon-lists/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/index-files/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/cubit/references/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/help/faq/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/help/good-tickets/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/help/helpdesk/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/help/hpc-talk/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/connect/gpu-nodes/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/connect/high-memory/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/misc/contribute/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/misc/debug-at-hpc/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/misc/debug-software/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/misc/hpc-talk/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/service/file-exchange/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/apptainer/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/cell-ranger/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/jupyter/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/keras/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/matlab/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/openmpi/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/scientific-software/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/how-to/software/tensorflow/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/hpc-tutorial/episode-0/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/hpc-tutorial/episode-1/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/hpc-tutorial/episode-2/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/hpc-tutorial/episode-3/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/hpc-tutorial/episode-4/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/misc/external-resources/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/misc/provided-software/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/misc/publication-list/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/ondemand/interactive/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/ondemand/overview/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/ondemand/quotas/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/overview/architecture/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/overview/for-the-impatient/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/overview/job-scheduler/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/overview/monitoring/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/overview/storage/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/background/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/cheat-sheet/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-sacct/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-sattach/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-sbatch/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-scancel/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-scontrol/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-sinfo/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-squeue/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/commands-srun/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/format-strings/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/job-scripts/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/memory-allocation/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/overview/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/quickstart/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/reservations/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/rosetta-stone/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/snakemake/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/temporary-files/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/slurm/x11/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/storage/home-quota/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/storage/migration-faq/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/storage/querying-storage/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/storage/scratch-cleanup/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/storage/storage-locations/
- 2024-09-27
+ 2024-10-07
daily
https://bihealth.github.io/bih-cluster/storage/storage-migration/
- 2024-09-27
+ 2024-10-07
daily
\ No newline at end of file
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index 045e7f69..e6994284 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ
diff --git a/slurm/background/index.html b/slurm/background/index.html
index ff588705..f0589914 100644
--- a/slurm/background/index.html
+++ b/slurm/background/index.html
@@ -3376,7 +3376,7 @@ Slurm Partitions
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/cheat-sheet/index.html b/slurm/cheat-sheet/index.html
index 4151f3ba..720cc583 100644
--- a/slurm/cheat-sheet/index.html
+++ b/slurm/cheat-sheet/index.html
@@ -3314,7 +3314,7 @@ Slurm Cheat Sheet
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-sacct/index.html b/slurm/commands-sacct/index.html
index 09159d45..4fb317ca 100644
--- a/slurm/commands-sacct/index.html
+++ b/slurm/commands-sacct/index.html
@@ -3226,7 +3226,7 @@ Notes&
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-sattach/index.html b/slurm/commands-sattach/index.html
index ccaefe87..ce3dcb8b 100644
--- a/slurm/commands-sattach/index.html
+++ b/slurm/commands-sattach/index.html
@@ -3200,7 +3200,7 @@ Important Arguments
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-sbatch/index.html b/slurm/commands-sbatch/index.html
index 8b06fec5..389627e5 100644
--- a/slurm/commands-sbatch/index.html
+++ b/slurm/commands-sbatch/index.html
@@ -3374,7 +3374,7 @@ Notes&
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-scancel/index.html b/slurm/commands-scancel/index.html
index d87c4a48..e414df6f 100644
--- a/slurm/commands-scancel/index.html
+++ b/slurm/commands-scancel/index.html
@@ -3133,7 +3133,7 @@ Slurm Command: scancel
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-scontrol/index.html b/slurm/commands-scontrol/index.html
index c90db1b0..81b4ae24 100644
--- a/slurm/commands-scontrol/index.html
+++ b/slurm/commands-scontrol/index.html
@@ -3243,7 +3243,7 @@ Notes&
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-sinfo/index.html b/slurm/commands-sinfo/index.html
index eff1fc11..23f31ad4 100644
--- a/slurm/commands-sinfo/index.html
+++ b/slurm/commands-sinfo/index.html
@@ -3263,7 +3263,7 @@ Notes&
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-squeue/index.html b/slurm/commands-squeue/index.html
index 94e70906..34ffc7d2 100644
--- a/slurm/commands-squeue/index.html
+++ b/slurm/commands-squeue/index.html
@@ -3228,7 +3228,7 @@ Notes&
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/commands-srun/index.html b/slurm/commands-srun/index.html
index 7764cb92..a9914353 100644
--- a/slurm/commands-srun/index.html
+++ b/slurm/commands-srun/index.html
@@ -3231,7 +3231,7 @@ Notes&
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/format-strings/index.html b/slurm/format-strings/index.html
index 533d1058..1f9000c4 100644
--- a/slurm/format-strings/index.html
+++ b/slurm/format-strings/index.html
@@ -3316,7 +3316,7 @@ Displaying Resources
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/job-scripts/index.html b/slurm/job-scripts/index.html
index ed16d6ef..a8e9bb71 100644
--- a/slurm/job-scripts/index.html
+++ b/slurm/job-scripts/index.html
@@ -3169,7 +3169,7 @@ Slurm Job Scripts
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/memory-allocation/index.html b/slurm/memory-allocation/index.html
index 8ddc3c8d..19f08154 100644
--- a/slurm/memory-allocation/index.html
+++ b/slurm/memory-allocation/index.html
@@ -3429,7 +3429,7 @@ Memory/CPU Accounting in Slurm
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/overview/index.html b/slurm/overview/index.html
index e982b52b..a95bf76c 100644
--- a/slurm/overview/index.html
+++ b/slurm/overview/index.html
@@ -3238,7 +3238,7 @@ A Word on "Elsewhere"
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/quickstart/index.html b/slurm/quickstart/index.html
index 52d0da40..e9962d60 100644
--- a/slurm/quickstart/index.html
+++ b/slurm/quickstart/index.html
@@ -3192,7 +3192,7 @@ Slurm Quickstart
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/reservations/index.html b/slurm/reservations/index.html
index 8b5caa9b..0af97fdc 100644
--- a/slurm/reservations/index.html
+++ b/slurm/reservations/index.html
@@ -3247,7 +3247,7 @@ What is the Effect of a Reservation
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/rosetta-stone/index.html b/slurm/rosetta-stone/index.html
index f497ea7d..73600152 100644
--- a/slurm/rosetta-stone/index.html
+++ b/slurm/rosetta-stone/index.html
@@ -3313,7 +3313,7 @@ Slurm Rosetta Stone
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/snakemake/index.html b/slurm/snakemake/index.html
index c3dab49c..cd00a983 100644
--- a/slurm/snakemake/index.html
+++ b/slurm/snakemake/index.html
@@ -3377,7 +3377,7 @@ Custom logging directory
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/temporary-files/index.html b/slurm/temporary-files/index.html
index 0643b9f8..1c1a5ccc 100644
--- a/slurm/temporary-files/index.html
+++ b/slurm/temporary-files/index.html
@@ -3304,7 +3304,7 @@ Tracking Local Storage localtmp
- September 27, 2024
+ October 7, 2024
diff --git a/slurm/x11/index.html b/slurm/x11/index.html
index f05b37d1..3717fc2a 100644
--- a/slurm/x11/index.html
+++ b/slurm/x11/index.html
@@ -3125,7 +3125,7 @@ Slurm and X11
- September 27, 2024
+ October 7, 2024
diff --git a/storage/home-quota/index.html b/storage/home-quota/index.html
index 95a6e31b..2a190db7 100644
--- a/storage/home-quota/index.html
+++ b/storage/home-quota/index.html
@@ -3230,7 +3230,7 @@ Temporary Files
- September 27, 2024
+ October 7, 2024
diff --git a/storage/migration-faq/index.html b/storage/migration-faq/index.html
index 97364832..a159ebba 100644
--- a/storage/migration-faq/index.html
+++ b/storage/migration-faq/index.html
@@ -3341,7 +3341,7 @@ Conda environments
- September 27, 2024
+ October 7, 2024
diff --git a/storage/querying-storage/index.html b/storage/querying-storage/index.html
index 706c670f..d3943669 100644
--- a/storage/querying-storage/index.html
+++ b/storage/querying-storage/index.html
@@ -3269,7 +3269,7 @@ Query for Project Data and File U
- September 27, 2024
+ October 7, 2024
diff --git a/storage/scratch-cleanup/index.html b/storage/scratch-cleanup/index.html
index 8c386b28..66ad1a5f 100644
--- a/storage/scratch-cleanup/index.html
+++ b/storage/scratch-cleanup/index.html
@@ -3131,7 +3131,7 @@ Automated Cleanup of Scratch
- September 27, 2024
+ October 7, 2024
diff --git a/storage/storage-locations/index.html b/storage/storage-locations/index.html
index 7ff1c5b4..8741d213 100644
--- a/storage/storage-locations/index.html
+++ b/storage/storage-locations/index.html
@@ -3671,7 +3671,7 @@ Tier 2 mirror
- September 27, 2024
+ October 7, 2024
diff --git a/storage/storage-migration/index.html b/storage/storage-migration/index.html
index 98c8b67c..4240658d 100644
--- a/storage/storage-migration/index.html
+++ b/storage/storage-migration/index.html
@@ -3682,7 +3682,7 @@ Data migration process f
- September 27, 2024
+ October 7, 2024