Usage

Introduction and quick start

This section walks you through a typical session on "Hydra"; for more general information regarding cluster usage, click here.

Logging in on a cluster

In order to log on to a cluster you will need to be "HPC" enabled. This requires your IT-Coordinator to enable the "HPC-flag" on your UZH account in ITIM. When an "HPC-flag" is enabled, we will receive a request for approval. Please create a ticket in our issue tracker to let us know who is requesting access (including the research group you are in), and what you would like to do (very short description of your task that you want to run on the cluster). After being approved, you will need to set your HPC password in the Tivoli Identity Manager.

Now you can log in to the cluster using the SSH command from a terminal command line (with the password you just set):

ssh shortname@cluster.s3it.uzh.ch

Note that the SSH command is already preinstalled on Linux and Mac OS X; Microsoft Windows users might want to use PuTTY.

Select a partition to work in

The "Hydra" cluster has two partitions: once you are logged in, you should first of all select which partition you would like to work in. At the moment, only one partition is available:

  • largemem: the SGI UV-2000 machine: 96 processor core, 4TB of RAM

Selection of a partition is done with the following commands:

module load hydra

# to work in the `largemem` partition, run this instead:
module load cluster/largemem

The module avail command will also show another partition called cluster/scicloud, which is no longer available (since November 2015). Attempts to select the scicloud partition will result in an error.

For more information regarding cluster partitions, click here.

Load

After you have selected a partition, the list of installed software becomes available through the module avail command.

You can then load support for a particular piece of software by issuing the module load command. For example, the following command loads the "R" language interpreter version 3.2.1 (compiled with the GNU compilers, i.e., the "foss" toolchain):

module load R/3.2.1-foss-2015a
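
If you want to double-check what the command did, the usual module-system commands below list the modules that are now loaded and show which R binary is first on your PATH (a quick sanity check, not a required step):

module list     # show the modules currently loaded in this session
which R         # should point at the R installation provided by the module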

The following listing shows the output of the module avail command at the time of this writing (August 2015), with all the software installed on the largemem partition:

$ module avail

------------- /apps/redhatenterpriseserver-7.1-x86_64/modules/all -------------
   Autoconf/2.69-GCC-4.9.2     gompi/2015a                 numactl/2.0.10-GCC-4.9.2
   Automake/1.15-GCC-4.9.2     hwloc/1.10.0-GCC-4.9.2      OpenBLAS/0.2.13-GCC-4.9.2
   FFTW/3.3.4-gompi-2015a      Java/1.8.0_51               OpenMPI/1.8.4-GCC-4.9.2
   foss/2015a                  libtool/2.4.2-GCC-4.9.2     R/3.2.1-foss-2015a
   GCC/4.9.2                   NASM/2.11.06-foss-2015a     ScaLAPACK/2.0.2-gompi-2015a

--------------------------- /apps/etc/modules/stage2 --------------------------
   EasyBuild/2.2.0    Fiji/2015-08-03    Intel/2015.3.187    matlab/R2015a

--------------------------- /apps/etc/modules/stage1 --------------------------
   cluster/largemem (D)    cluster/scicloud

  Where:
   (D):  Default Module

Use `module spider` to find all possible modules. Use `module keyword key1 key2 ...` to search for all possible modules matching any of the "keys".
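
For instance, to look up every available version of a particular package (here R, used only as an illustrative search term), you could run:

module spider R            # list all versions of R known to the module system
module keyword statistics  # search module descriptions for the keyword "statistics"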

Running computational workloads

There are two modes of running computational workloads: interactive and non-interactive.

Interactive sessions are meant for minor testing, development and debugging of the job you will want to run. For actual computing, please submit a non-interactive batch job. This will put your command in a queue and execute it when there are enough resources to comply with the request. 

First, load the correct environment for the specific cluster; taking Hydra as an example:

module load hydra

In order to run a non-interactive batch job, you will need to create a job script. Such a script for Hydra may be written as follows:

Create a file named `test.job`, and add the following content:


#!/bin/sh
#SBATCH --time=0:05:00   # request 5 minutes of run time (see the Caveats section below)
#SBATCH --mem=1G         # request 1 GB of memory (see the Caveats section below)
/bin/hostname
srun -l /bin/hostname
srun -l /bin/pwd

Once the file is saved, run `sbatch test.job` on the command line to submit this job in non-interactive batch mode. You can check the status of your job in the queue by running `squeue`.
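
For instance, a minimal submit-and-check cycle could look like the following; the job ID is whatever `sbatch` reports back, and by default Slurm writes the job's output to a file named slurm-<jobid>.out in the directory you submitted from:

sbatch test.job          # prints "Submitted batch job <jobid>"
squeue -u $USER          # list only your own jobs and their current state
cat slurm-<jobid>.out    # after the job finishes, inspect its output (replace <jobid>)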

With batch jobs, you can perform larger/heavier computations than are allowed with interactive jobs. Of course, since batch jobs are non-interactive, they will abort if they try to stop and wait for user input. Note that only scripts can be executed through the sbatch command. If you need to run a binary command, you will need to wrap it into a shell script. Detailed examples of job scripts can be found here. 

To get a shell on the cluster, you would need to request an "interactive job". (Note: interactive jobs are reserved for testing/debugging/development work.)
You can start an interactive session on Hydra with the following sequence of commands:

module load hydra
srun --pty --time=1:0:0 bash -l

The above `srun` command will request an interactive bash session lasting at most 1 hour (--time=1:0:0).

Note: if you don't set the --time option, you will be kicked out of the interactive session after 1 minute, with a "security warning" message.

Extra information regarding workloads

Interactive sessions

To get a shell on the SGI NUMA "largemem" machine, you would need to request an "interactive job". (Note: interactive jobs are reserved for debugging/development work)

You can start an interactive session with the following sequence of commands:

srun --pty --time=1:0:0 --mem=16g bash -l

The above `srun` command will request an interactive `bash` session lasting at most 1 hour (--time=1:0:0) and using up to 16 GB of RAM (--mem=16g).

You can specify a different command than `bash` to be executed; for instance, the following will start an interactive session in the R statistical system:

module load R/3.2.1-foss-2015a
srun --pty --time=1:0:0 --mem=16g R

Note: if you don't set the --time option, you will be kicked out of the interactive session after 1 minute, with a "security warning" message.

Running computational jobs

For actual computing, please submit a non-interactive batch job. This will put your command in a queue and execute it when there are enough resources to comply with the request. With batch jobs, you can perform larger/heavier computations than are allowed with interactive jobs. Of course, since batch jobs are non-interactive, they will abort if they try to stop and wait for user input.

For example, the following sequence of commands runs a .R file in batch mode, allowing the job to use at most 256GB of memory (--mem=256g) and to run for at most 12 hours (--time=12:0:0):

 module load cluster/largemem
 module load R/3.2.1-foss-2015a
 sbatch --time=12:0:0 --mem=256g my_algo.R

Note that only scripts can be executed through the sbatch command. If you need to run a binary command, you will need to wrap it into a shell script.
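
As a sketch of such a wrapper (the file names run_r.sh and my_algo.R are placeholders, not files that exist on the cluster): the shell script loads the required module and then starts the R interpreter on your script via srun, so that sbatch receives a proper shell script to execute:

#!/bin/bash
#SBATCH --time=12:0:0 --mem=256g
module load R/3.2.1-foss-2015a
srun Rscript my_algo.R    # run the R script non-interactively

You would then submit it with `sbatch run_r.sh`.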

More advanced usage of batch jobs

Hydra can be accessed for computation through the Slurm scheduler.

However, for day-to-day usage and normal interaction you only need to know three basic commands:

  • sbatch - submit a batch script
  • squeue - check the status of jobs on the system
  • scancel - delete one of your jobs from the queue

All of the commands above provide useful usage information when issued with the --help option. More information on these commands can be found here.
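
For example, to remove one of your jobs from the queue you would first look up its job ID with squeue and then pass that ID to scancel (the job ID 1234567 below is only a placeholder):

squeue -u $USER      # find the JOBID of the job you want to cancel
scancel 1234567      # cancel that job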

A detailed Slurm user guide can be found here.

Creating job scripts

Hydra is normally used in batch mode. For this you need to create a script with the commands you want to run. The first line specifies the command interpreter (normally bash). Programs should be run using the "srun" command. The following is a simple example.

#!/bin/bash
srun hostname

Use the sbatch command to submit the script for execution. If the above script was called job.sh, you would use:

sbatch job.sh

Options can be passed to the sbatch command to control job execution. At the very minimum you must specify how long the job will run and how much memory it requires. Both are maximums; if either is exceeded, the job will be killed. For example, to run for ten hours with 20GB of memory you would use:

sbatch --time=10:00:00 --mem=20G job.sh

It is often more convenient to include these options directly in the batch file. This is done with an #SBATCH comment like the following.

#!/bin/bash
#SBATCH --time=10:00:00 --mem=20G
srun hostname

Now you can simply use:

sbatch job.sh

and the options from the batch file will be used. If the same option appears both on the command line and in the batch file, the command-line value is used.
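
For example, assuming job.sh contains the `#SBATCH --time=10:00:00 --mem=20G` line shown above, the following submission would run with a 2-hour time limit while keeping the 20GB memory request from the file:

sbatch --time=2:00:00 job.sh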

Task and CPU selection

The Hydra largemem machine has 96 compute cores. A single job can use any number of these cores. The Slurm batch system distinguishes between "tasks" (or processes) and "CPUs" (commonly called cores). If you are running an OpenMP or pthread code, then you probably want a single process with multiple threads. This means that you want 1 task and (for example) 32 CPUs per task. For an MPI code you would instead want 32 tasks.

This distribution of resources is controlled with the following two options:

--ntasks
--cpus-per-task
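
As an illustration (the 32-way split is just an example, not a recommendation), a multi-threaded job and an MPI job might request their resources like this:

# multi-threaded (OpenMP / pthreads): one process using 32 cores
#SBATCH --ntasks=1 --cpus-per-task=32

# MPI: 32 separate processes, one core each
#SBATCH --ntasks=32 --cpus-per-task=1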

For detailed examples of batch files that use these settings, please go here.

Caveats

Please be aware that the following restrictions apply to the Slurm scheduler.

  • Default memory is set to 1 MB. This will probably cause your job to fail unless you specify, with the #SBATCH --mem-per-cpu directive, the amount of memory (in MB) per core your job needs; e.g. #SBATCH --mem-per-cpu=4096 will reserve 4 GB per core.
  • Default execution time is 1s. This will cause your job to fail after 1 second unless you specify the duration of your job with the #SBATCH --time directive; e.g. #SBATCH --time=01:00:00 will grant 1 hour of computation time to the job. Note that the maximum running time is normally 24 hours; please ask the systems administrators for an extension if you need one. A minimal script that sets both directives is sketched below.
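
Putting the two caveats together, a minimal job script that sets both directives explicitly might start like this (the one-hour / 4 GB values are only examples, and my_program is a placeholder for your own executable):

#!/bin/bash
#SBATCH --time=01:00:00        # wall-clock limit: 1 hour (default would be 1 second)
#SBATCH --mem-per-cpu=4096     # 4 GB of memory per core (default would be 1 MB)
srun ./my_program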