HSUper

HSU

September 7, 2023

Technical Specifications

The HSUper cluster consists of

  • Regular nodes: 571 compute nodes, each equipped with 256 GB RAM and 2 Intel Ice Lake sockets; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with (up to) 36 cores, yielding a total of 72 cores per node
  • Fat memory nodes: 5 compute nodes, each equipped with 1 TB RAM and 2 Intel Ice Lake sockets; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with (up to) 36 cores, yielding a total of 72 cores per node
  • GPU nodes: 5 compute nodes, each equipped with 256 GB RAM, 2 Intel Ice Lake sockets, 2 NVIDIA A100 (40 GB) GPUs and 894 GB of local scratch storage; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with (up to) 36 cores, yielding a total of 72 cores per node

In addition, HSUper provides access to a 1 PB BeeGFS file system as well as a 1 PB Ceph-based file system.
All nodes are connected by a non-blocking NVIDIA InfiniBand HDR100 fabric.

How to Access HSUper

HSUper can be accessed from within the HSU network only. If you are not located at the HSU, you need to connect using the HSU VPN (see below). To be able to access HSUper, you need to be a registered HSUper user. If needed, please apply for HPC access using the corresponding form (available on campus/VPN only, login with RZ credentials).

Access HSUper using Linux / Windows 10+ / MacOS 10+:

Log in using your RZ credentials. Open a terminal / command prompt and enter the following, replacing <rz-name> with your RZ name:

ssh <rz-name>@hsuper-login01.hsu-hh.de

Access HSUper using PuTTY:

You may also download and use “PuTTY” to manage different SSH connections (and more) using a GUI for settings. After opening PuTTY, put hsuper-login01.hsu-hh.de as the “Host Name” and click on “Open”. Enter your credentials afterwards.

Access HSUper using your public SSH key:

Instead of entering your password every time, you may set up login via SSH keys.

  • If you have already created SSH keys, you may add your public key to the ~/.ssh/authorized_keys file on HSUper using your favourite text editor.
  • Connect to HSUper with your private key:
    ssh -i ~/path/to/private/key <rz-name>@hsuper-login01.hsu-hh.de
    • PuTTY: Configuration category “Connection -> SSH -> Auth”. Browse for your private key file.
    • Note: You may omit the -i key parameter if you have only one SSH key stored (under a default name) in your ~/.ssh/ folder, as ssh will then use it automatically.

If you need to create new SSH keys, use the type “Ed25519” as it is currently seen as the fastest and most secure type. If you work from a terminal / command prompt, you may create a key using: ssh-keygen -t ed25519
Follow the instructions and remember where the public and private key are saved. Default values are fine.
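
A minimal sketch of creating a key and installing it on HSUper (assuming the default key location ~/.ssh/id_ed25519; ssh-copy-id is a standard OpenSSH helper that may not be available on every system):

ssh-keygen -t ed25519                                                      # create a new Ed25519 key pair, accept the default path
ssh-copy-id -i ~/.ssh/id_ed25519.pub <rz-name>@hsuper-login01.hsu-hh.de    # append the public key to ~/.ssh/authorized_keys on HSUper
ssh <rz-name>@hsuper-login01.hsu-hh.de                                     # subsequent logins use the key instead of the password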

X11 Forwarding:

If you are on a console using ssh, you may add the “-X” parameter to use X11 forwarding, allowing you to open graphical applications on the HSUper frontend and have them forward their windows to your local computer (which can, however, be very slow). If you use PuTTY, you’ll find the option in the PuTTY configuration category “Connection -> SSH -> X11”. Have a look at PuTTY’s documentation for more information.
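
For example (the -X option enables X11 forwarding; a running X server is required on your local machine):

ssh -X <rz-name>@hsuper-login01.hsu-hh.de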

Using the HSU VPN:

  • Install the OpenVPN client.
  • (First time only: download the HSU OpenVPN configuration file “HSU Open VPN Konfiguration 2022” from the webbox (Rechenzentrum) while on campus.)
  • Start the VPN connection:
    • Linux: see the example sketch below this list.
    • Windows:
      • Start OpenVPN,
      • First time only: Right click on the tray icon, import the VPN configuration file,
      • Right click on the tray icon, click on HSU_VPN_2022 and enter your RZ credentials.
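
On Linux, a typical way to start the connection from a terminal looks like the following sketch (the exact file name of the downloaded configuration is an assumption and may differ):

sudo openvpn --config HSU_VPN_2022.ovpn    # start the VPN in the foreground; enter your RZ credentials when prompted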

Available Software on HSUper

Most software packages are provided as a module and must be loaded. HSUper organizes its modules hierarchically by compiler and MPI implementation. The command module avail shows all modules that are available in the current environment. If you are missing a software package, it might not be available for the currently selected compiler or MPI implementation. module spider <modulename> is useful to find out how a module can be loaded. More information on the module system is provided by module --help and the Lmod documentation.

If you cannot find a package using module spider, check whether you have the USER-SPACK module loaded (see below).
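
A typical workflow for finding and loading software looks like this (module names are examples; the exact names and versions may differ, see module avail):

module avail              # list all modules available in the current environment
module spider openmpi     # find out how (and with which prerequisites) OpenMPI can be loaded
module load gcc openmpi   # load a compiler and a matching MPI implementation
module list               # show the currently loaded modules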

Currently available software:

  • Apptainer (Singularity)
  • slurm
  • Lmod
  • tmux
  • Python3
  • GNU Debugger (gdb)
  • vim
  • tcsh
  • zsh

Modules:

  • OpenMPI
  • Intel® MPI Library (Intel oneAPI MPI)
  • Intel® oneAPI Math Kernel Library (Intel oneAPI MKL)
  • Intel® oneAPI Base Toolkit & HPC Toolkit 
    • Intel oneAPI DPC++/C++ Compiler
    • Intel® C++ Compiler Classic
    • Intel® oneAPI Threading Building Blocks (Intel oneAPI TBB)
  • NVIDIA HPC SDK
    • NVIDIA C, C++, Fortran 77, Fortran 90 compilers
  • NVIDIA CUDA®
  • TensorFlow (TODO: coming soon)
  • R
  • gcc
  • cmake
  • ccmake (part of the cmake module)
  • eigen3
  • bison
  • flex
  • Python3
  • py-pybind
  • OpenCL
  • FFTW
  • LAPACK
  • ScaLAPACK
  • OpenFOAM
  • GNU Octave
  • gnuplot
  • Miniconda3
  • Spack (USER-SPACK)

USER-SPACK

The module “USER-SPACK” creates a folder with the same name in your home directory. It allows you to use spack for installing packages, with all the advantages spack offers. The module integrates your locally installed spack packages directly into the environment module system Lmod, provided they are compiled with a compiler other than the default system compiler. For example, spack install <package> %[email protected] installs <package> using gcc 12.1.0 and creates a module file for you. If you do not specify a compiler (with %), the system compiler is used and no module file is created; in that case you have to rely on spack load.
Furthermore, all globally available software packages are made known to your local user spack installation on the first load of the module, so there is no need to recompile them or make them known to your spack installation yourself.
Hence, you are free to choose between spack and module for loading and unloading installed packages.

Tip: spack find -x shows you only the explicitly (globally) installed packages, without any dependencies. The spack documentation, including many examples and tutorials, can be found online.

If you cannot find a package that should exist globally using module spider, you may unload USER-SPACK or update your local copy of the cluster module files using the following command: spack module lmod refresh --delete-tree --upstream-modules -y

Please note: USER-SPACK is only compatible with the following shells: bash/zsh/sh/csh
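
A minimal usage sketch (the package name and compiler version are only examples):

module load USER-SPACK                                           # set up your personal spack instance in ~/USER-SPACK
spack install fftw %[email protected]                                  # build FFTW with gcc 12.1.0; a module file is generated
spack module lmod refresh --delete-tree --upstream-modules -y    # refresh your local module tree
module load fftw                                                 # alternatively: spack load fftw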

Storage & Quota

Every user has a personal quota on the BeeGFS parallel file system. Once the quota is reached, no further files can be written. The current status is shown at login, listing your personal quota (highlighted) as well as that of your (chair / research) groups.

Groups may have a project folder in the /beegfs/project/ folder.

Projects have an individual quota, independent of the user quota.
Feel free to submit a request using the ticket system if you need access to an existing project folder or need one created.

A /scratch partition is mounted on every compute node and on the login node. You may use it for temporary files. Please note that files created there count towards your personal quota. Remember to clean up after your job has finished and the results have been saved.
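
For example (the per-user subdirectory below is only an illustration, not a fixed convention):

mkdir -p /scratch/$USER/myjob    # create a working directory for temporary files
du -sh /scratch/$USER            # check how much space your temporary files occupy
rm -r /scratch/$USER/myjob       # clean up once the results have been saved elsewhere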

In general it is a good idea to keep your home directory clean and small.

Please note: The parallel file systems do not offer any backup solution. Deleted files are gone forever. Create backups of important files!

Preparing Jobs & Testing

Software should be compiled/installed on the login nodes – manually or using USER-SPACK (see above).

Containers (for Apptainer/Singularity) may be prepared on your local computer (or the login node) and uploaded to the cluster afterwards.

Jobs running on compute nodes should only execute software in an environment that has already been set up. All environment preparation (such as downloading additional software packages and data from the Internet) should be done beforehand.

There is a special dev partition (see below) meant to be used for testing purposes. Feel free to use that partition to test different job settings etc.

How to Submit Jobs to HSUper

HSUper resources are managed via SLURM. Compute tasks are described in job scripts, which need to be written and then scheduled. Exemplary SLURM job scripts are provided further below. They can be scheduled using sbatch, e.g. sbatch helloworld-omp.job.
For details on SLURM, refer to its documentation.
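
Submission and monitoring typically look like this (the job script name refers to the example further below):

sbatch helloworld-omp.job    # submit the job script; SLURM prints the assigned job ID
squeue -u $USER              # show the state of your pending and running jobs
scancel <jobid>              # cancel a job, if necessary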

HSUper provides different nodes (see above) and also different partitions in which compute jobs can be run. In the following, you find a list of available partitions and their restrictions:

  • dev: 1 job per user on 1-2 regular nodes (nodes 1-571); wall-clock limit: 1h. Please use this partition for testing purposes only.
  • small: Jobs on 1-5 regular nodes (nodes 3-571); wall-clock limit: 72h; exclusive node reservation
  • small_shared: Same settings as small but node resources are by default shared.
  • small_fat: Jobs on 1-5 fat memory nodes (nodes 572-576); wall-clock limit: 24h; exclusive node reservation
  • small_gpu: Jobs on 1-5 GPU nodes (gpu nodes 1-5); wall-clock limit: 24h
  • medium: Jobs on 6-256 regular nodes (nodes 3-571); wall-clock limit: 24h; exclusive node reservation
  • large: Jobs on >256 regular nodes (nodes 3-571; available to selected users only!); exclusive node reservation
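
The current partition configuration can also be queried directly from SLURM, e.g.:

sinfo -s                         # summary of all partitions, their time limits and node counts
scontrol show partition small    # detailed limits of a specific partition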

Before executing your software on HSUper, double-check that a time limit is set (it should not differ drastically from the actual run time) and that the job is capable of exploiting the requested resources efficiently:

  • Is your software parallelized?
    If it is parallelized, is it shared-memory parallelized (e.g. with OpenMP, Intel TBB, Cilk)? Then you can use single compute nodes, but typically not multiple of them.
    If it is parallelized, is it distributed-memory parallelized (e.g. with MPI or some PGAS approach such as used in Fortran 2008, UPC++, Chapel)? Then your program can potentially also run on multiple compute nodes at a time.
  • Does your software support execution on graphics cards (GPUs)? If yes, you might want to consider using the GPU nodes of HSUper.
  • Is your application extremely memory-intensive?
    If it requires even more than 256 GB of memory and is not parallelized for distributed-memory systems, you might still be able to execute your software on the fat memory nodes of HSUper. Note: this should be the exception; distributed-memory parallelism is highly recommended for most applications!
  • Does your software only require a subset of the resources of single nodes? Consider using the small_shared partition.

In the following, you find job script examples that execute a simple C++ test program on HSUper. The test program is the following — you may simply copy-paste it into a file helloworld.cpp or download it from the webbox:

#include <iostream>
#ifdef MYMPI
#include <mpi.h>
#endif
#ifdef MYOMP
#include <omp.h>
#endif

int main(int argc, char *argv[]) {
  int rank = 0;
  int size = 1;
  int threads = 1;
#ifdef MYMPI
  // initialize MPI and query the total number of ranks and this process' rank
  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
#endif
#ifdef MYOMP
  // query the number of OpenMP threads available per rank
  threads = omp_get_max_threads();
#endif

  // only the root rank prints, to avoid duplicated output
  if (rank == 0) {
    std::cout << "Hello World on root rank " << rank << std::endl;
    std::cout << "Total number of ranks: " << size << std::endl;
    std::cout << "Number of threads/rank: " << threads << std::endl;
  }
#ifdef MYMPI
  MPI_Finalize();
#endif
  return 0;
}


To run the examples, you may compile the program in different variants:

  • Sequential (not parallel):
    g++ helloworld.cpp -o helloworld_gnu
    -> remark: sequential programs are not well-suited for execution on HSUper! Please make sure to run parallelized programs to efficiently exploit the given HPC hardware! For a respective job script, you could simply rely on the shared-memory example, excluding the use of threads and OpenMP.
  • Shared-memory parallel using OpenMP:
    g++ -DMYOMP -fopenmp helloworld.cpp -lgomp -o helloworld_gnu_omp
    -> remark: this is sufficient to use up to all cores and threads, respectively, of one single compute node! Remember that HSUper is particularly good for massively parallel compute jobs, leveraging many compute nodes at a time (see next bullet)
  • Distributed-memory parallel using MPI:
    mpicxx -DMYMPI helloworld.cpp -lmpi -o helloworld_gnu_mpi
    -> remark: this is sufficient to use all cores and, potentially, several nodes of HSUper. Make sure to load a valid MPI compiler beforehand (see section on software: module load ... with ... representing an existing MPI module; check available modules via module avail)
  • Distributed- and shared-memory parallel using both MPI and OpenMP:
    mpicxx -DMYMPI -DMYOMP -fopenmp helloworld.cpp -lmpi -lgomp -o helloworld_gnu_mpi_omp
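
For a quick functional test before submitting batch jobs, the compiled variants can also be run interactively on the dev partition, for example (a sketch; adjust the requested resources as needed):

module load mpi    # or a specific MPI module, as in the job scripts below
srun --partition=dev --nodes=1 --ntasks=4 --time=00:05:00 ./helloworld_gnu_mpi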

Examples:

  • Single-node shared-memory parallel job using OpenMP:
    The following batch script describes a job using 1 compute node (equipped with 72 cores), on which the OpenMP-parallel program is executed with 13 threads. A 2-minute time limit is set (the program should actually finish within a few seconds). Output is written to the file helloworld-omp-%j.log, where %j corresponds to the job ID.
    Please make sure to adapt the path in the script to your setup before you submit the job. The example assumes that all files are located in the folder helloworld in your home directory. You can also download the file from the webbox.
    #!/bin/bash
    #SBATCH --job-name=helloworld-omp # specifies a user-defined job name
    #SBATCH --nodes=1 # number of compute nodes to be used
    #SBATCH --ntasks=1 # number of MPI processes
    #SBATCH --partition=small # partition (small_shared, small, medium, small_fat, small_gpu)

    # special partitions: large (for selected users only!)
    # job configuration testing partition: dev
    #SBATCH --cpus-per-task=72 # number of cores per process
    #SBATCH --time=00:02:00 # maximum wall clock limit for job execution
    #SBATCH --output=helloworld-omp_%j.log # log file which will contain all output

    # commands to be executed
    cd $HOME/helloworld               #cd /beegfs/home/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld

    export OMP_NUM_THREADS=13
    ./helloworld_gnu_omp


  • Multi-node distributed-memory parallel job using MPI:
    The following batch script describes a job using 3 compute nodes. On every compute node, 36 MPI processes are launched. One core is reserved per MPI process. A 2-minute time limit is set (the program should actually finish within a few seconds). Output is written to the file helloworld-mpi-%j.log, where %j corresponds to the job ID.
    Please make sure to adapt the path in the script to your setup before you submit the job. The example assumes that all files are located in the folder helloworld in your home directory. You may search for MPI implementations using ml spider mpi or ml spider openmpi. You can also download the file from the webbox.
    #!/bin/bash
    #SBATCH --job-name=helloworld-mpi # specifies a user-defined job name
    #SBATCH --nodes=3 # number of compute nodes to be used
    #SBATCH --ntasks-per-node=36 # number of MPI processes per node
    #SBATCH --partition=small # partition (small_shared, small, medium, small_fat, small_gpu)

    # special partitions: large (for selected users only!)
    # job configuration testing partition: dev
    #SBATCH --cpus-per-task=1 # number of cores per process
    #SBATCH --time=00:02:00 # maximum wall clock limit for job execution
    #SBATCH --output=helloworld-mpi_%j.log # log file which will contain all output
    ### some additional information (you can delete those lines)
    echo "#==================================================#"
    echo " num nodes: " $SLURM_JOB_NUM_NODES
    echo " num tasks: " $SLURM_NTASKS
    echo " cpus per task: " $SLURM_CPUS_PER_TASK
    echo " nodes used: " $SLURM_JOB_NODELIST
    echo " job cpus used: " $SLURM_JOB_CPUS_PER_NODE
    echo "#==================================================#"

    # commands to be executed
    # modify the following line to load a specific MPI implementation
    module load mpi
    cd $HOME/helloworld               #cd /beegfs/home/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
    # use the SLURM variable "ntasks" to set the number of MPI processes;
    # here, ntasks is computed from "nodes" and "ntasks-per-node"; alternatively
    # specify, e.g., ntasks directly (instead of ntasks-per-node)
    mpirun -np $SLURM_NTASKS ./helloworld_gnu_mpi


  • Multi-node distributed-memory/shared-memory parallel job using MPI/OpenMP:
    The following batch script describes a job using 3 compute nodes. On every compute node, 2 MPI processes are launched. Each process uses 36 cores. A 2-minute time limit is set (the program should actually finish within a few seconds). Output is written to the file helloworld-mpi-omp-%j.log, where %j corresponds to the job ID.
    Please make sure to adapt the path in the script to your setup before you submit the job. The example assumes that all files are located in the folder helloworld in your home directory. You may search for MPI implementations using ml spider mpi or ml spider openmpi. You can also download the file from the webbox.
    #!/bin/bash
    #SBATCH --job-name=helloworld-mpi-omp # specifies a user-defined job name
    #SBATCH --nodes=3 # number of compute nodes to be used
    #SBATCH --ntasks-per-node=2 # number of MPI processes per node
    #SBATCH --partition=small # partition (small_shared, small, medium, small_fat, small_gpu)
    # special partitions: large (for selected users only!)
    # job configuration testing partition: dev
    #SBATCH --cpus-per-task=36 # number of cores per process
    #SBATCH --time=00:02:00 # maximum wall clock limit for job execution
    #SBATCH --output=helloworld-mpi-omp_%j.log # log file which will contain all output
    ### some additional information (you can delete those lines)
    echo "#==================================================#"
    echo " num nodes: " $SLURM_JOB_NUM_NODES
    echo " num tasks: " $SLURM_NTASKS
    echo " cpus per task: " $SLURM_CPUS_PER_TASK
    echo " nodes used: " $SLURM_JOB_NODELIST
    echo " job cpus used: " $SLURM_JOB_CPUS_PER_NODE
    echo "#==================================================#"
    # commands to be executed
    # modify the following line to load a specific MPI implementation
    module load mpi
    cd $HOME/helloworld               #cd /beegfs/home/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
    # use the SLURM variable "ntasks" to set the number of MPI processes;

    # here, ntasks is computed from "nodes" and "ntasks-per-node"
    export OMP_NUM_THREADS=36
    mpirun -np $SLURM_NTASKS ./helloworld_gnu_mpi_omp
All example files are also available for download in one zip archive.

Useful Environment Variables

  • $HOME – Path to one's home directory
  • $PROJECT – Path to the BeeGFS project directory
  • $SLURM_TMPDIR – Path to the temporary folder of the current SLURM job (see the sketch below this list). On job completion, all files created there are deleted automatically; hence, the user's BeeGFS quota is not affected by temporary files created in this path once the SLURM job completes. To keep files, essential files should be copied away during the job's runtime (copy commands in the SLURM job script).
    • On compute nodes: $SLURM_TMPDIR points to the node's memory. Please be careful not to run out of memory by writing too much data there.
    • On GPU nodes: $SLURM_TMPDIR points to a local SSD with a total of 894 GB. Memory is not affected by writing data to $SLURM_TMPDIR. The storage might be shared if the node is not allocated exclusively.
  • $SCRATCH – Path to the /scratch directory.
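
A sketch of how $SLURM_TMPDIR can be used inside a job script (paths are illustrative):

cd $SLURM_TMPDIR                                    # work in the job's temporary directory
$HOME/helloworld/helloworld_gnu_omp > result.log    # write intermediate and output files here
cp result.log $HOME/helloworld/                     # copy essential results back before the job ends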

Support

The administration / technical support can be reached via the ticket system (login with RZ credentials).

Alternatively, send an email to: [email protected]

Acknowledgement

The HPC-cluster HSUper has been provided by the project hpc.bw, funded by dtec.bw — Digitalization and Technology Research Center of the Bundeswehr. dtec.bw is funded by the European Union – NextGenerationEU.

See below for a suggestion of an acknowledgement statement for publications:

Computational resources (HPC-cluster HSUper) have been provided by the project hpc.bw, funded by dtec.bw — Digitalization and Technology Research Center of the Bundeswehr. dtec.bw is funded by the European Union – NextGenerationEU.