CUSS Compute-Server
Continuing the cooperation with the University of Stuttgart, the University of Ulm installed new compute servers from Sun and IBM in March 2007. The acquisition was funded by the state of Baden-Württemberg and the federal government (HBFG). Performance and available memory were the most important criteria for this purchase.
The compute server consists of four Sun nodes:
- andromeda.rz.uni-ulm.de
- perseus.rz.uni-ulm.de
- pegasus.rz.uni-ulm.de
- cassiopeia.rz.uni-ulm.de
and 36 IBM Opteron nodes:
- alpha[0...8].rz.uni-ulm.de
- beta[0...8].rz.uni-ulm.de
- gamma[0...8].rz.uni-ulm.de
- delta[0...8].rz.uni-ulm.de
For login, please use the four Sun nodes mentioned above or the following Linux servers:
- zeus.rz.uni-ulm.de
- hera.rz.uni-ulm.de
Please do not log in to the Opteron nodes directly.
Hardware
Sun nodes
4 x SunFire 2900
12 x Dual-Core UltraSPARC-IV+, 1.8 GHz, 32 MB L2 Cache
96 GB RAM
Solaris 10
Opteron nodes
36 x IBM x3755
4 x Dual-Core Opteron 2.6 GHz, 2 MB L2 Cache
32 or 64 GB RAM
SuSE Linux 10.2 and Solaris 10_x86
Login servers zeus/hera
2 x SunFire V40z
4 x Dual-Core Opteron 2.2 GHz, 2 MB L2 Cache
16 GB RAM
SuSE Linux 10.2
All compute server nodes are connected by Gigabit Ethernet to the local area networks of the universities of Ulm and Stuttgart (via BelWü).
Introduction and guidelines
Besides Sun software such as compilers or MPI, we also provide a rich set of GNU and open-source tools as well as libraries and standard application software (see the kiz website: Software).
- Resources
The resources that non-batch jobs may consume are limited: currently the CPU time limit is set to 20 minutes and memory is limited to 8 GB. Jobs requiring resources beyond those limits have to be submitted through the batch system Grid Engine.
- Clustertools
SUN Clustertools (MPI) are installed on all nodes. The PATH variable has been extended by /opt/SUNWhpc/bin.
- Compiler
The default compiler on the Solaris servers is the current Sun Studio compiler; on the SuSE servers the GNU compiler is the default. Alternatively, the Sun compilers or the products of the Portland Group can be used. To get good performance on the SPARC platform, we recommend starting with the following compiler options (an example invocation is sketched after this list):
32-bit applications to run on UltraSPARC-IV+ CPUs only:
-fast -xtarget=ultra4plus -xarch=v8plusb
64-bit applications to run on UltraSPARC-IV+ CPUs only:
-fast -xtarget=ultra4plus -xarch=v9b
- Scratch disks
Scratch disks of different sizes are provided: on the x86 platform 500 MB of local disk space are available, on the SPARC systems up to 200 GB. Local scratch on each host is available to every user under /work/<username>. In addition, we offer 3 TB of NFS-exported scratch space accessible via /soft/scratch/<username>. The latter should only be used by applications that neither need heavy I/O performance nor access many small files; sequential reading or writing is no problem.
Be aware: we do not back up these filesystems! In addition, files not accessed during the last 10 days are removed automatically, and we do not guarantee that data will still be available after upgrades or configuration changes. So please do not use these filesystems to store important data!
- Softwaretools
Several software packages need modified PATH, MANPATH and LD_LIBRARY_PATH environment variables, so we decided not to put them into the default search PATH. Please use the options command to get a quick list of the available software packages. A package is activated with the command option <software-name>. Please acknowledge the terms of use after activating the software.
- Login Shell
Because of our heterogeneous computing environment (beyond the SMP cluster), we only support bash as login shell and provide the environment setup for this shell only. csh and tcsh are of course also available, but not supported.
- GNU Software
For the same reason, we rely strongly on GNU software; if in doubt, check the path with the which command. Because the GNU version of the make command may be incompatible with the make scripts of some applications, the command option ungnu can be used to remove the GNU software from the path. Current information is displayed during login: it is strongly recommended to read these notes about updates, prefetching and so on. To get a list of all topics use news -l; single items are shown with news <topicname>.
- FTP
Please do not use ftp or telnet to transfer data to or from the nodes. These protocols are insecure because passwords are transmitted in cleartext, and we will therefore disable them in the near future. Please use ssh or sftp instead.
To transfer large amounts of data to the scratch area, please use scp with scratch.rz.uni-ulm.de:/soft/scratch/<username> as the target (see the example after this list). The ssh daemon on this machine is optimized for WAN access, e.g. from the universities in Stuttgart and Konstanz. Depending on the client link, transfer speeds of up to 50 MB/s can be reached. For transfers from your own PC, the local ssh configuration has to be optimized as well (see Hints for SSH/SCP optimization).
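As a quick illustration of the compiler options above, a 64-bit build with the Sun Studio C compiler on one of the Solaris login nodes could look like the following sketch; the source and output file names (mysim.c, mysim) are placeholders, not files provided by kiz:
cc -fast -xtarget=ultra4plus -xarch=v9b -o mysim mysim.c
For a 32-bit binary, replace -xarch=v9b with -xarch=v8plusb as listed above.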
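Similarly, copying a local directory to the NFS scratch area with scp could be sketched as follows; mydata is a placeholder for your own data directory:
scp -r mydata/ <username>@scratch.rz.uni-ulm.de:/soft/scratch/<username>/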
The Batch system
The batch system Grid Engine provides better overall performance of the nodes as well as advantages for scheduling downtimes and the like. For interactive jobs we additionally provide special slots within the batch system. If you have problems with the batch system, or if your application cannot cope with the batch environment, please contact your local admins; together we will certainly find an appropriate workaround.
- The current Grid Engine configuration enforces the following limits and defaults:
CPU time (serial jobs): 336:0:0 (14 days)
CPU time (MPI jobs only): 72:0:0 (3 days)
max. memory per process: 60 GB
max. jobs per user: 30
max. CPUs per job: 8
- By default the following resources are requested: 1 CPU, 2 GB of memory and 72 hours of CPU time.
- To start a batch job, see the demo batch file batch_demo.sh (a minimal sketch of such a script follows this list).
- Start the job with: qsub your_script.sh
- Additional information concerning the scheduler is available via the qstat command:
qstat: show the state of all jobs
qstat -f: like qstat, plus additional information
qstat -u <username>: show the state of a user's jobs
qstat -j <jobid>: show the extended state of a single job, in particular the reason why it has not started yet
qstat -F <resource>: overview of the current state with respect to one or more resources (allocation, ...)
- The batch system should be told which architecture the application requires. The default is always the architecture of the system from which the job was submitted. Allowed values are "arch=sol-sparc64" and "arch=lx26-amd64". The old notations "solaris64" and "linux86" from the former compute server no longer work.
- You can delete jobs with "qdel <JobID>".
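For orientation, a submission script could be sketched roughly as follows. This is not the content of the provided batch_demo.sh; the job name, output file and application (./my_program) are placeholders, and apart from the architecture request only standard Grid Engine options are used:
#!/bin/bash
#$ -S /bin/bash           # run the job script with bash
#$ -N myjob               # job name (placeholder)
#$ -cwd                   # start in the directory the job was submitted from
#$ -j y                   # merge stdout and stderr
#$ -o myjob.out           # output file (placeholder)
#$ -l arch=lx26-amd64     # request the Linux/Opteron architecture (or arch=sol-sparc64)
./my_program              # replace with your application
Submit the script with qsub and check its state with qstat -j <jobid> as described above.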
Clustertools (MPI)
Submitting batch jobs using MPI is basically the same as submitting 'normal' jobs. Be aware, however, that the integration of the SUN HPC tools (MPI) with Grid Engine is not as tight as we would like it to be.
The process is split into two parts: a wrapper and the compute job.
Common hints:
- Demo batch file: batch_demo.sh
- To request a parallel job with 4 to 8 CPUs, add the following resource requests at the top of the file:
#$ -l cre=true
#$ -pe mpi 4-8
This only works for programs that have been explicitly written with MPI; it will not work with e.g. Gaussian!
- To make the wrapper call the MPI job, passing along the number of CPUs the batch system was able to provide, replace the example application (date) by:
/opt/SUNWhpc/bin/mprun -x sge ./batch_mpi_demo.sh
The example requires that the compute job is located in the same directory from which the job was submitted. Be aware that the local scratch area of a host is not accessible from other nodes. Also specify a range of CPUs whenever possible; this makes the scheduler's life much easier. A sketch of a complete wrapper is shown below.
- Additional information concerning the use of MPI is available at the High Performance Computing Center in Stuttgart.
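To show how the pieces above fit together, a complete wrapper could be sketched as follows. This is only an assumption-based sketch: the two directives and the mprun call are quoted from this page, while the remaining lines (bash as shell, job name) are placeholders:
#!/bin/bash
#$ -S /bin/bash           # run the wrapper with bash
#$ -N mpi_demo            # job name (placeholder)
#$ -cwd                   # start in the submission directory, where the compute job resides
#$ -l cre=true
#$ -pe mpi 4-8            # request 4 to 8 CPUs in the mpi parallel environment
# mprun passes the number of CPUs granted by Grid Engine on to the MPI compute job
/opt/SUNWhpc/bin/mprun -x sge ./batch_mpi_demo.sh
Submit the wrapper with qsub as usual; batch_mpi_demo.sh must be an MPI-capable compute job located in the submission directory.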
Help Desk
Mon - Fri 8 a.m. - 6 p.m.
+49 (0)731/50-30000
helpdesk(at)uni-ulm.de
Help Desk support form
