diff --git a/bih-cluster/docs/help/faq.md b/bih-cluster/docs/help/faq.md
index d555dc5fc..81a4a4569 100644
--- a/bih-cluster/docs/help/faq.md
+++ b/bih-cluster/docs/help/faq.md
@@ -212,49 +212,49 @@ For GPU jobs also see "My GPU jobs don't get scheduled".
 
 ## My GPU jobs don't get scheduled
 
-There are only four GPU machines in the cluster (with four GPUs each, med0301 to med0304).
+There are only four GPU machines in the cluster (with four GPUs each, hpc-gpu-1 to hpc-gpu-4).
 Please inspect first the number of running jobs with GPU resource requests:
 
 ```bash
-res-login-1:~$ squeue -o "%.10i %20j %.2t %.5D %.4C %.10m %.16R %.13b" "$@" | grep med03 | sort -k7,7
-   1902163 ONT-basecalling       R     1    2         8G          med0301   gpu:tesla:2
-   1902167 ONT-basecalling       R     1    2         8G          med0301   gpu:tesla:2
-   1902164 ONT-basecalling       R     1    2         8G          med0302   gpu:tesla:2
-   1902166 ONT-basecalling       R     1    2         8G          med0302   gpu:tesla:2
-   1902162 ONT-basecalling       R     1    2         8G          med0303   gpu:tesla:2
-   1902165 ONT-basecalling       R     1    2         8G          med0303   gpu:tesla:2
-   1785264 bash                  R     1    1         1G          med0304   gpu:tesla:2
+res-login-1:~$ squeue -o "%.10i %20j %.2t %.5D %.4C %.10m %.16R %.13b" "$@" | grep hpc-gpu- | sort -k7,7
+   1902163 ONT-basecalling       R     1    2         8G        hpc-gpu-1   gpu:tesla:2
+   1902167 ONT-basecalling       R     1    2         8G        hpc-gpu-1   gpu:tesla:2
+   1902164 ONT-basecalling       R     1    2         8G        hpc-gpu-2   gpu:tesla:2
+   1902166 ONT-basecalling       R     1    2         8G        hpc-gpu-2   gpu:tesla:2
+   1902162 ONT-basecalling       R     1    2         8G        hpc-gpu-3   gpu:tesla:2
+   1902165 ONT-basecalling       R     1    2         8G        hpc-gpu-3   gpu:tesla:2
+   1785264 bash                  R     1    1         1G        hpc-gpu-4   gpu:tesla:2
 ```
 
-This indicates that there are two free GPUs on med0304.
+This indicates that there are two free GPUs on hpc-gpu-4.
 
 Second, inspect the node states:
 
 ```bash
-res-login-1:~$ sinfo -n med030[1-4]
+res-login-1:~$ sinfo -n hpc-gpu-[1-4]
 PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
 debug*       up    8:00:00      0    n/a
 medium       up 7-00:00:00      0    n/a
 long         up 28-00:00:0      0    n/a
 critical     up 7-00:00:00      0    n/a
 highmem      up 14-00:00:0      0    n/a
-gpu          up 14-00:00:0      1   drng med0304
-gpu          up 14-00:00:0      3    mix med[0301-0303]
+gpu          up 14-00:00:0      1   drng hpc-gpu-4
+gpu          up 14-00:00:0      3    mix hpc-gpu-[1-3]
 mpi          up 14-00:00:0      0    n/a
 ```
 
-This tells you that med0301 to med0303 have jobs running ("mix" indicates that there are free resources, but these are only CPU cores not GPUs).
-med0304 is shown to be in "draining state".
-Let's look what's going on there.
+This tells you that hpc-gpu-1 to hpc-gpu-3 have jobs running ("mix" indicates that there are free resources, but these are only CPU cores, not GPUs).
+hpc-gpu-4 is shown to be in "draining state".
+Let's look at what's going on there.
 
 ```bash hl_lines="10 18"
-res-login-1:~$ scontrol show node med0304
-NodeName=med0304 Arch=x86_64 CoresPerSocket=16
+res-login-1:~$ scontrol show node hpc-gpu-4
+NodeName=hpc-gpu-4 Arch=x86_64 CoresPerSocket=16
    CPUAlloc=2 CPUTot=64 CPULoad=1.44
    AvailableFeatures=skylake
    ActiveFeatures=skylake
    Gres=gpu:tesla:4(S:0-1)
-   NodeAddr=med0304 NodeHostName=med0304 Version=20.02.0
+   NodeAddr=hpc-gpu-4 NodeHostName=hpc-gpu-4 Version=20.02.0
    OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020
    RealMemory=385215 AllocMem=1024 FreeMem=347881 Sockets=2 Boards=1
    State=MIXED+DRAIN ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
@@ -362,18 +362,26 @@ This is done by specifying a format string and using the `%b` field.
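+As a rough sketch, a quick per-node tally of the running GPU jobs can be obtained like this (assuming that all GPU jobs run in the `gpu` partition):
+
+```bash
+# one line per node and allocation (e.g., "gpu:tesla:2"), prefixed by the number of running jobs
+res-login-1:~$ squeue -h -t R -p gpu -o "%R %b" | sort | uniq -c
+```
+
+The full, per-job listing looks as follows: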
 ```bash
 squeue -o "%.10i %9P %20j %10u %.2t %.10M %.6D %10R %b" -p gpu
      JOBID PARTITION NAME                 USER       ST       TIME  NODES NODELIST(R TRES_PER_NODE
-    872571 gpu       bash                 user1       R   15:53:25      1 med0303    gpu:tesla:1
-    862261 gpu       bash                 user2       R 2-16:26:59      1 med0304    gpu:tesla:4
-    860771 gpu       kidney.job           user3       R 2-16:27:12      1 med0302    gpu:tesla:1
-    860772 gpu       kidney.job           user3       R 2-16:27:12      1 med0302    gpu:tesla:1
-    860773 gpu       kidney.job           user3       R 2-16:27:12      1 med0302    gpu:tesla:1
-    860770 gpu       kidney.job           user3       R 4-03:23:08      1 med0301    gpu:tesla:1
-    860766 gpu       kidney.job           user3       R 4-03:23:11      1 med0303    gpu:tesla:1
-    860767 gpu       kidney.job           user3       R 4-03:23:11      1 med0301    gpu:tesla:1
-    860768 gpu       kidney.job           user3       R 4-03:23:11      1 med0301    gpu:tesla:1
+    872571 gpu       bash                 user1       R   15:53:25      1 hpc-gpu-3  gpu:tesla:1
+    862261 gpu       bash                 user2       R 2-16:26:59      1 hpc-gpu-4  gpu:tesla:4
+    860771 gpu       kidney.job           user3       R 2-16:27:12      1 hpc-gpu-2  gpu:tesla:1
+    860772 gpu       kidney.job           user3       R 2-16:27:12      1 hpc-gpu-2  gpu:tesla:1
+    860773 gpu       kidney.job           user3       R 2-16:27:12      1 hpc-gpu-2  gpu:tesla:1
+    860770 gpu       kidney.job           user3       R 4-03:23:08      1 hpc-gpu-1  gpu:tesla:1
+    860766 gpu       kidney.job           user3       R 4-03:23:11      1 hpc-gpu-3  gpu:tesla:1
+    860767 gpu       kidney.job           user3       R 4-03:23:11      1 hpc-gpu-1  gpu:tesla:1
+    860768 gpu       kidney.job           user3       R 4-03:23:11      1 hpc-gpu-1  gpu:tesla:1
 ```
 
-In the example above, user1 has one job with one GPU running on med0303, user2 has one job running with 4 GPUs on med0304 and user3 has 7 jobs in total running of different machines with one GPU each.
+In the example above, user1 has one job with one GPU running on hpc-gpu-3, user2 has one job running with 4 GPUs on hpc-gpu-4, and user3 has 7 jobs in total running on different machines with one GPU each.
 
 ## How can I access graphical user interfaces (such as for Matlab) on the cluster?
 
@@ -579,8 +579,8 @@ export LC_ALL=C
 For this, connect to the node you want to query (via SSH but do not perform any computation via SSH!)
 
 ```bash
-res-login-1:~$ ssh med0301
-med0301:~$ yum list installed 2>/dev/null | grep cuda.x86_64
+res-login-1:~$ ssh hpc-gpu-1
+hpc-gpu-1:~$ yum list installed 2>/dev/null | grep cuda.x86_64
 cuda.x86_64                            10.2.89-1          @local-cuda
 nvidia-driver-latest-dkms-cuda.x86_64  3:440.64.00-1.el7  @local-cuda
 ```
diff --git a/bih-cluster/docs/how-to/connect/gpu-nodes.md b/bih-cluster/docs/how-to/connect/gpu-nodes.md
index 9f4419543..4b0466e72 100644
--- a/bih-cluster/docs/how-to/connect/gpu-nodes.md
+++ b/bih-cluster/docs/how-to/connect/gpu-nodes.md
@@ -1,9 +1,9 @@
 # How-To: Connect to GPU Nodes
 
-The cluster has seven nodes with four Tesla V100 GPUs each: `hpc-gpu-{1..7}`.
+The cluster has seven nodes with four Tesla V100 GPUs each: `hpc-gpu-{1..7}` and one node with 10 A40 GPUs: `hpc-gpu-8`.
 
 Connecting to a node with GPUs is easy.
-You simply request a GPU using the `--gres=gpu:tesla:COUNT` argument to `srun` and `batch`.
+You simply request a GPU using the `--gres=gpu:$CARD:COUNT` (for `CARD=tesla` or `CARD=a40`) argument to `srun` and `sbatch`.
 This will automatically place your job in the `gpu` partition (which is where the GPU nodes live) and allocate a number of `COUNT` GPUs to your job.
 
 !!! note
@@ -80,14 +80,22 @@ Now to the somewhat boring part where we show that CUDA actually works.
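+The example below allocates a Tesla V100.
+As a sketch, the same works on the A40 node by swapping the GRES name (assuming the `a40` GRES introduced above):
+
+```bash
+# request one A40 GPU (and thus hpc-gpu-8) instead of a Tesla V100
+res-login-1:~$ srun --gres=gpu:a40:1 --pty bash
+```
+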
 ```bash
 res-login-1:~$ srun --gres=gpu:tesla:1 --pty bash
-med0301:~$ nvcc --version
+hpc-gpu-1:~$ nvcc --version
 nvcc: NVIDIA (R) Cuda compiler driver
 Copyright (c) 2005-2019 NVIDIA Corporation
 Built on Wed_Oct_23_19:24:38_PDT_2019
 Cuda compilation tools, release 10.2, V10.2.89
-med0301:~$ source ~/miniconda3/bin/activate
-med0301:~$ conda activate gpu-test
-med0301:~$ python -c 'import torch; print(torch.cuda.is_available())'
+hpc-gpu-1:~$ source ~/miniconda3/bin/activate
+hpc-gpu-1:~$ conda activate gpu-test
+hpc-gpu-1:~$ python -c 'import torch; print(torch.cuda.is_available())'
 True
 ```
 
@@ -103,7 +103,7 @@ Use `squeue` to find out about currently queued jobs (the `egrep` only keeps the
 ```bash
 res-login-1:~$ squeue | egrep -iw 'JOBID|gpu'
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
-     33       gpu     bash holtgrem  R       2:26      1 med0301
+     33       gpu     bash holtgrem  R       2:26      1 hpc-gpu-1
 ```
 
 ## Bonus #2: Is the GPU running?
@@ -111,8 +111,8 @@ res-login-1:~$ squeue | egrep -iw 'JOBID|gpu'
 To find out how active the GPU nodes actually are, you can connect to the nodes (without allocating a GPU you can do this even if the node is full) and then use `nvidia-smi`.
 
 ```bash
-res-login-1:~$ ssh med0301 bash
-med0301:~$ nvidia-smi
+res-login-1:~$ ssh hpc-gpu-1 bash
+hpc-gpu-1:~$ nvidia-smi
 Fri Mar  6 11:10:08 2020
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
diff --git a/bih-cluster/docs/how-to/software/tensorflow.md b/bih-cluster/docs/how-to/software/tensorflow.md
index 11eb4bb87..d2f94b73c 100644
--- a/bih-cluster/docs/how-to/software/tensorflow.md
+++ b/bih-cluster/docs/how-to/software/tensorflow.md
@@ -4,7 +4,7 @@
 TensorFlow is a package for deep learning with optional support for GPUs.
 You can find the original TensorFlow installation instructions [here](https://www.tensorflow.org/install).
 
-This how-to assumes that you have just connected to a GPU node via `srun --mem=10g --partition=gpu --gres=gpu:tesla:1 --pty bash -i`.
+This how-to assumes that you have just connected to a GPU node via `srun --mem=10g --partition=gpu --gres=gpu:tesla:1 --pty bash -i` (for Tesla V100 GPUs; for A40 GPUs use `--gres=gpu:a40:1`).
 Note that you will need to allocate "enough" memory, otherwise your python session will be `Killed` because of too little memory.
 You should read the [How-To: Connect to GPU Nodes](../../how-to/connect/gpu-nodes/) tutorial on an explanation of how to do this and to learn how to register for GPU usage.
diff --git a/bih-cluster/docs/overview/architecture.md b/bih-cluster/docs/overview/architecture.md
index 1e47a6558..deaeb4780 100644
--- a/bih-cluster/docs/overview/architecture.md
+++ b/bih-cluster/docs/overview/architecture.md
@@ -9,8 +9,16 @@ A cluster system bundles a high number of nodes and in the case of HPC, the focu
 
 - approx. 256 nodes (from three generations),
 - 4 high-memory nodes (2 nodes with 512 GB RAM, 2 nodes with 1 TB RAM),
-- 7 GPU nodes (with 4 Tesla GPUs each), and
-- a high-perfomance parallel GPFS files system.
+- 7 GPU nodes with 4 Tesla GPUs each and 1 GPU node with 10 A40 GPUs,
+- a high-performance Tier 1 parallel CephFS file system with a larger but slower Tier 2 CephFS file system, and
+- a legacy parallel GPFS file system.
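+
+As a small sketch (assuming you are logged in to one of the login nodes), the node classes and their GPU equipment can be listed via Slurm:
+
+```bash
+# one line per node: name, CPU count, memory in MB, and generic resources (GPUs)
+sinfo -N -o "%N %c %m %G"
+```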
 
 ### Network Interconnect
 
diff --git a/bih-cluster/docs/overview/for-the-impatient.md b/bih-cluster/docs/overview/for-the-impatient.md
index 17a0bd532..37e132458 100644
--- a/bih-cluster/docs/overview/for-the-impatient.md
+++ b/bih-cluster/docs/overview/for-the-impatient.md
@@ -18,7 +18,7 @@ The cluster consists of the following major components:
 
 - a scheduling system using Slurm,
-- 228 general purpose compute nodes `hpc-cpu-{1..228}
+- 228 general purpose compute nodes `hpc-cpu-{1..228}`,
 - a few high memory nodes `hpc-mem-{1..4}`,
-- 7 nodes with 4 Tesla V100 GPUs each (!) `hpc-gpu-{1..7}`,
+- 7 nodes with 4 Tesla V100 GPUs each (!) `hpc-gpu-{1..7}` and 1 node with 10 A40 GPUs (!) `hpc-gpu-8`,
 - a high-performance, parallel GPFS file system with 2.1 PB, by DDN mounted at `/fast`,
 - a tier 2 (slower) storage system based on Ceph/CephFS
diff --git a/bih-cluster/docs/overview/job-scheduler.md b/bih-cluster/docs/overview/job-scheduler.md
index 7c6c2870d..59ed8aaa6 100644
--- a/bih-cluster/docs/overview/job-scheduler.md
+++ b/bih-cluster/docs/overview/job-scheduler.md
@@ -75,7 +75,7 @@ See [Resource Registration: GPU Nodes](../admin/resource-registration.md#gpu-nod
 
 * **maximum run time:** 14 days
 * **partition name:** `gpu`
-* **argument string:** select `$count` GPUs: `-p gpu --gres=gpu:tesla:$count`, maximum run time: `--time 14-00:00:00`
+* **argument string:** select `$count` GPUs: `-p gpu --gres=gpu:$card:$count` (`card=tesla` or `card=a40`), maximum run time: `--time 14-00:00:00`
 
 ### `highmem`
diff --git a/bih-cluster/docs/slurm/commands-sbatch.md b/bih-cluster/docs/slurm/commands-sbatch.md
index bea1e1768..ade6742fd 100644
--- a/bih-cluster/docs/slurm/commands-sbatch.md
+++ b/bih-cluster/docs/slurm/commands-sbatch.md
@@ -34,7 +34,7 @@ The command will create a batch job and add it to the queue to be executed at a
   As you can define minimal and maximal number of tasks/CPUs/cores, you could also specify `--mem-per-cpu` and get more flexible scheduling of your job.
 - `--gres`
   -- Generic resource allocation.
-  On the BIH HPC, this is only used for allocating GPUS, e.g., with `--gres=gpu:tesla:2`, a user could allocate two NVIDIA Tesla GPUs on the same hsot.
+  On the BIH HPC, this is only used for allocating GPUs, e.g., with `--gres=gpu:tesla:2`, a user could allocate two NVIDIA Tesla GPUs on the same host (use `a40` instead of `tesla` for the A40 GPUs).
 - `--licenses`
   -- On the BIH HPC, this is used for the allocation of MATLAB 2016b licenses only.
 - `--partition`
diff --git a/bih-cluster/docs/slurm/rosetta-stone.md b/bih-cluster/docs/slurm/rosetta-stone.md
index 09ea312d7..0ed8294fc 100644
--- a/bih-cluster/docs/slurm/rosetta-stone.md
+++ b/bih-cluster/docs/slurm/rosetta-stone.md
@@ -45,4 +45,12 @@ The table below shows some SGE commands and their Slurm equivalents.
 | allocate memory     | `-l h_vmem=size`           | `--mem=mem` OR `--mem-per-cpu=mem`    |
 | wait for job        | `-hold_jid jid`            | `--depend state:job`                  |
 | select target host  | `-l hostname=host1\|host1` | `--nodelist=nodes` AND/OR `--exclude` |
-| allocate GPU        | `-l gpu=1`                 | `--gres=gpu:tesla:count`              |
+| allocate GPU        | `-l gpu=1`                 | `--gres=gpu:tesla:count` or `--gres=gpu:a40:count` |
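+
+For example, a minimal sketch of translating an SGE GPU job to Slurm (assuming the `gpu` partition and the GRES names used on the BIH HPC, with `job.sh` as a placeholder script):
+
+```bash
+# SGE (legacy):  qsub -l gpu=1 job.sh
+# Slurm:
+sbatch -p gpu --gres=gpu:tesla:1 job.sh
+```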