DARWIN - Delaware
- 1 Getting started on DARWIN
- 2 ACCESS Allocations
- 3 Connecting to DARWIN
- 4 File Systems
- 5 Transferring Files
- 6 Application Development
- 7 Programming Environment
- 8 Running Applications
- 9 Job Accounting on DARWIN
- 10 Monitoring Allocation Usage
- 11 Queues/Partitions
- 12 Scheduling Jobs
- 13 Managing Jobs on DARWIN
- 14 Software Installation on DARWIN
- 15 System Status
Getting started on DARWIN
DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze Delaware research and education, funded by a $1.4 million grant from the National Science Foundation (NSF). The award established the DARWIN computing system as an XSEDE Level 2 Service Provider in Delaware, contributing 20% of DARWIN's resources to XSEDE (Extreme Science and Engineering Discovery Environment), which transitioned to ACCESS (Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support) on September 1, 2022. DARWIN has 105 compute nodes with a total of 6,672 cores, 22 GPUs, 100 TB of memory, and 1.2 PB of disk storage. See compute nodes and storage for complete details on the architecture.
Figure 1. Fish-eye front view of DARWIN in the computing center
Configuration
The DARWIN cluster is being set up to be very similar to the existing Caviness cluster, and will be familiar to those currently using Caviness. However, DARWIN is an NSF-funded HPC resource available via a committee-reviewed allocation request process similar to ACCESS allocations.
An HPC system always has one or more public-facing systems known as login nodes. The login nodes are supplemented by many compute nodes which are connected by a private network. One or more head nodes run programs that manage and facilitate the functioning of the cluster. (In some clusters, the head node functionality is present on the login nodes.) Each compute node typically has several multi-core processors that share memory. Finally, all the nodes share one or more filesystems over a high-speed network.
Figure 2. DARWIN Configuration
Login nodes
Login (head) nodes are the gateway into the cluster and are shared by all cluster users. Their computing environment is a full standard variant of Linux configured for scientific applications. This includes command documentation (man pages), scripting tools, compiler suites, debugging/profiling tools, and application software. In addition, the login nodes have several tools to help you move files between the HPC filesystems and your local machine, other clusters, and web-based services.
Login nodes should be used to set up and submit job workflows and to compile programs. You should generally use compute nodes to run or debug application software or your own executables.
If your work requires highly interactive graphics and animations, these are best done on your local workstation rather than on the cluster. Use the cluster to generate files containing the graphics information, and download them from the HPC system to your local system for visualization.
When you use SSH to connect to darwin.hpc.udel.edu, your computer will choose one of the login (head) nodes at random. The default command line prompt clearly indicates to which login node you have connected: for example, [bjones@login00.darwin ~]$ is shown for account bjones when connected to login node login00.darwin.hpc.udel.edu.
Only use SSH to connect to a specific login node if you have existing processes present on it, for example, if you used the screen or tmux utility to preserve your session after logout.
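For illustration, using the bjones account and the login00 node from the example above, and assuming a tmux session was left running there, reattaching would look like:
$ ssh bjones@login00.darwin.hpc.udel.edu
[bjones@login00.darwin ~]$ tmux attach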
Compute nodes
There are many compute nodes with different configurations. Each node consists of multi-core processors (CPUs), memory, and local disk space. Nodes can have different OS versions or OS configurations, but this document assumes all the compute nodes have the same OS and almost the same configuration. Some nodes may have more cores, more memory, GPUs, or more disk.
All compute nodes are now available and configured for use. Each compute node has 64 cores, so the compute resources available are the following:
Compute Node | Number of Nodes | Node Names | Total Cores | Memory Per Node | Total Memory | Total GPUs |
---|---|---|---|---|---|---|
Standard | 48 | r1n00 - r1n47 | 3,072 | 512 GiB | 24 TiB | |
Large Memory | 32 | r2l00 - r2l31 | 2,048 | 1,024 GiB | 32 TiB | |
Extra-Large Memory | 11 | r2x00 - r2x10 | 704 | 2,048 GiB | 22 TiB | |
nVidia-T4 | 9 | r1t00 - r1t07, r2t08 | 576 | 512 GiB | 4.5 TiB | 9 |
nVidia-V100 | 3 | r2v00 - r2v02 | 144 | 768 GiB | 2.25 TiB | 12 |
AMD-MI50 | 1 | r2m00 | 64 | 512 GiB | 0.5 TiB | 1 |
Extended Memory | 1 | r2e00 | 64 | 1,024 GiB + 2.73 TiB 1) | 3.73 TiB | |
Total | 105 | | 6,672 | | 88.98 TiB | 22 |
The standard Linux installation on the compute nodes is configured to support just the running of your jobs, particularly parallel jobs. For example, there are no man pages on the compute nodes; however, all compute nodes have full development headers and libraries.
Commercial applications, and normally your own programs, will use a layer of abstraction called a programming model. Consult the cluster-specific documentation for advanced techniques to take advantage of the low-level architecture.
Storage
Home filesystem
Each DARWIN user receives a home directory (/home/<uid>) that will remain the same during and after the early access period. This storage has slower access with a limit of 20 GiB. It should be used for personal software installs and shell configuration files.
Lustre high-performance filesystem
Lustre is designed to use parallel I/O techniques to reduce file-access time. The Lustre filesystems in use at UD are composed of many physical disks using RAID technologies to give resilience, data integrity, and parallelism at multiple levels. There is approximately 1.1 PiB of Lustre storage available on DARWIN. It uses high-bandwidth interconnects such as Mellanox HDR100. Lustre should be used for storing input files, supporting data files, work files, and output files associated with computational tasks run on the cluster.
- Each allocation will be assigned workgroup storage in the Lustre directory (/lustre/«workgroup»/).
- Each workgroup storage will have a users directory (/lustre/«workgroup»/users/«uid») for each user of the workgroup, to be used as a personal directory for running jobs and storing larger amounts of data.
- Each workgroup storage will have a software and VALET directory (/lustre/«workgroup»/sw/ and /lustre/«workgroup»/sw/valet) to allow users of the workgroup to install software and create VALET package files that need to be shared by others in the workgroup.
- There will be a quota limit set on the workgroup storage based on the amount of storage approved for your allocation.
While all filesystems on the DARWIN cluster utilize hardware redundancies to protect data, there is no backup or replication and no recovery available for the home or Lustre filesystems.
Local filesystems
Each node has an internal, locally connected disk whose capacity is measured in terabytes. Each compute node on DARWIN has a 1.75 TiB SSD local scratch filesystem disk. Part of the local disk is used for system tasks such as memory management, which might include cache memory and virtual memory. The remainder of the disk is ideal for applications that need a moderate amount of scratch storage for the duration of a job's run. That portion is referred to as the node scratch filesystem.
Each node scratch filesystem disk is only accessible by the node in which it is physically installed. The job scheduling system creates a temporary directory associated with each running job on this filesystem. When your job terminates, the job scheduler automatically erases that directory and its contents.
Software
There will not be a full set of software during early access and testing, but we will be continually installing and updating software. Installation priority will go to compilers, system libraries, and highly utilized software packages. Please DO let us know if there are packages that you would like to use on DARWIN, as that will help us prioritize user needs, but understand that we may not be able to install software requests in a timely manner.
Users will be able to compile and install software packages in their home or workgroup directories. There will be very limited support for helping with user-compiled installs or debugging during early access. Please reference basic software building and management to get started with software installations utilizing VALET (versus Modules), as suggested and used by IT RCI staff on our HPC systems.
Please review the following documents if you are planning to compile and install your own software:
- High Performance Computing (HPC) Tuning Guide for AMD EPYC™ 7002 Series Processors: a guide for getting started tuning AMD 2nd Gen EPYC™ Processor based systems for HPC workloads. This is not an all-inclusive guide, and some items may have similar, but different, names in specific OEM systems (e.g., OEM-specific BIOS settings). Every HPC workload varies in its performance characteristics. While this guide is a good starting point, you are encouraged to perform your own performance testing for additional tuning. This guide also provides suggestions on which items should be the focus of additional, application-specific tuning (November 2020).
- HPC Tuning Guide for AMD EPYC™ Processors: a guide intended for vendors, system integrators, resellers, system managers, and developers who are interested in EPYC system configuration details. There is also a discussion of the AMD EPYC software development environment, and it includes four appendices on how to install and run the HPL, HPCG, DGEMM, and STREAM benchmarks. The results produced are 'good' but are not necessarily exhaustively tested across a variety of compilers with their optimization flags (December 2018).
- AMD EPYC™ 7xx2-series Processors Compiler Options Quick Reference Guide; note that we do not have the AOCC compiler (with Flang - Fortran front-end) installed on DARWIN.
Scheduler
DARWIN uses the Slurm scheduler, like Caviness; Slurm is the most common scheduler among ACCESS resources. Slurm on DARWIN is configured as fair share, with each user given equal shares to access the current HPC resources available on DARWIN.
Queues (Partitions)
Partitions have been created to align with allocation requests based on the different node types. There is no default partition, and you must specify exactly one partition at a time. It is not possible to specify multiple partitions in Slurm to span different node types.
Run Jobs
In order to schedule any job (interactively or batch) on the DARWIN cluster, you must set your workgroup to define your cluster group. Each research group has been assigned a unique workgroup. Each research group should have received this information in a welcome email. For example,
# workgroup -g it_css
will enter the workgroup for it_css. You will know if you are in your workgroup based on the change in your bash prompt. See the following example for user bjones:
[bjones@login00.darwin ~]$ workgroup -g it_css
[(it_css:bjones)@login00.darwin ~]$ printenv USER HOME WORKDIR WORKGROUP WORKDIR_USER
bjones
/home/1201
/lustre/it_css
it_css
/lustre/it_css/users/1201
[(it_css:bjones)@login00.darwin ~]$
Now we can use salloc or sbatch, as long as a partition is specified as well, to submit an interactive or batch job, respectively. See the DARWIN Run Jobs, Schedule Jobs and Managing Jobs wiki pages for more help with Slurm, including how to specify resources and check on the status of your jobs.
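For illustration only, assuming a hypothetical partition name (standard) and job script (myjob.qs), submissions follow this pattern:
$ salloc --partition=standard --nodes=1 --ntasks=4    # interactive job on 4 cores
$ sbatch --partition=standard myjob.qs                # batch job defined by the script myjob.qs
Substitute the partition appropriate for the node type granted in your allocation.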
All resulting executables (created via your own compilation) and other applications (commercial or open-source) should only be run on the compute nodes.
It is a good idea to periodically check in /opt/shared/templates/slurm/ for updated or new templates to use as job scripts to run generic or specific applications designed to provide the best performance on DARWIN.
Help
ACCESS allocations
To report a problem or provide feedback, submit a help desk ticket on the ACCESS Portal and complete the form, selecting darwin.udel.xsede.org as the system and providing your problem details in the description field, to route your question more quickly to the research support team. Provide enough details (including full paths of batch script files, log files, or important input/output files) that our consultants can begin to work on your problem without having to ask you basic initial questions.
Ask or tell the HPC community
hpc-ask is a Google group established to stimulate interactions within UD's broader HPC community and is based on members helping members. This is a great venue to post a question about HPC, start a discussion, or share an upcoming event with the community. Anyone may request membership. Messages are sent as a daily summary to all group members. This list is archived, public, and searchable by anyone.
Publication and Grant Writing Resources
Please refer to the NSF award information when preparing a proposal or requesting allocations on DARWIN. We require all allocation recipients to acknowledge their allocation awards using the following standard text: “This research was supported in part through the use of DARWIN computing system: DARWIN – A Resource for Computational and Data-intensive Research at the University of Delaware and in the Delaware Region, Rudolf Eigenmann, Benjamin E. Bagozzi, Arthi Jayaraman, William Totten, and Cathy H. Wu, University of Delaware, 2021, URL: https://udspace.udel.edu/handle/19716/29071”
ACCESS Allocations
A PI may request allocations on DARWIN via ACCESS. See the ACCESS Allocations page for details on how to do so. If an allocation on DARWIN is granted, the PI may use the ACCESS Allocations portal to add or remove accounts for an active allocation on DARWIN, as long as the person being added has an ACCESS user portal account. If the person doesn't have an ACCESS user portal account, they need to visit the ACCESS User Registration page to create one. The person will need to share their ACCESS user portal account with the PI in order to be added. Please keep in mind it may take up to 10 business days to process an account request on DARWIN for ACCESS users.
Accounts
An ACCESS username will be assigned having the form xsedeu«uid», where «uid» is a unique, 4-digit numerical identifier assigned to you. An email with the subject [darwin-users] New DARWIN ACCESS (XSEDE) account information will be sent to the ACCESS user once their account is ready on DARWIN. Please keep in mind it may take up to 10 business days to process an account request on DARWIN for ACCESS users. Passwords are not set for ACCESS accounts on DARWIN, so you must set a password using the password reset web application at https://idp.hpc.udel.edu/access-password-reset/.
The application starts by directing the client to the CILogon authentication system where the “ACCESS CI (XSEDE)” provider should be selected. If successful (and the client has an account on DARWIN), the application next asks for an email address to which a verification email should be sent; the form is pre-populated with the email address on-record on DARWIN for the client's account. The client has 15 minutes to follow the link in that email message to choose a new password. The form displays information regarding the desired length and qualifications of a valid password. If the new password is acceptable, the client's DARWIN password is set and SSH access via password should become available immediately.
ACCESS users on DARWIN can use the password reset web application to reset a forgotten password, too.
See connecting to DARWIN for more details.
For example,
$ hpc-user-info -a xsedeu1201
full-name = Student Training
last-name = Student Training
home-directory = /home/1201
email-address = bjones@udel.edu
clusters = DARWIN
Command | Function |
---|---|
hpc-user-info -a «username» | Display info about a user |
 | Display complete syntax |
Groups
The allocation groups of which you are a member determine which computing nodes, job queues, and storage resources you may use. Each group has a unique descriptive group name (gname). There are two categories of group names: class and workgroup.
The class category: All users belong to the group named everyone.
The workgroup category: Each workgroup has a unique group name (e.g., xg-tra180011) assigned for each allocation. The PI and users are members of that allocation group (workgroup). To see the usernames of all members of the workgroup, type the hpc-group-info -a allocation_workgroup command.
Use the groups command to see all of your groups; the example below is for user xsedeu1201. The command below will also display the complete information about the workgroup xg-tra180011 and its members.
The output of this command includes the workgroup description (description = PI), along with every member in the workgroup and their account information (Username, Full Name, Email Address).
Connecting to DARWIN
Secure Shell program (SSH)
Use a secure shell program/client (SSH) to connect to the cluster and a secure file transfer program to move files to and from the cluster.
There are many suitable secure clients for Windows, Mac OS X, and UNIX/Linux. We recommend MobaXterm or PuTTY and Xming for Windows users. Macintosh and UNIX/Linux users can use their pre-installed SSH and X11 software. (Newer versions of Mac OS X may not have a current version of X11 installed. See the Apple web site for X11 installation instructions.)
IT strongly recommends that you configure your clients as described in the online X-windows (X11) and SSH documents (Windows / Linux/MacOSX). If you need help generating or uploading your SSH keys, please see the Managing SSH Keys page for ACCESS recommendations on how to do so.
Your HPC home directory has a .ssh directory. Do not manually erase or modify the files that were initially created by the system; they facilitate communication between the login (head) node and the compute nodes. Only use standard ssh commands to add keys to the files in the .ssh directory.
Please refer to the Windows and Mac/Linux related sections for specific details on using the command line on your local computer.
Logging on to DARWIN
You need a DARWIN account to access the login node.
To learn about launching GUI applications on DARWIN, refer to the Schedule Jobs page.
ACCESS users with an allocation award on DARWIN will not be able to login until their password is set by using the password reset web application at https://idp.hpc.udel.edu/access-password-reset/.
The application starts by directing the client to the CILogon authentication system where the “ACCESS CI (XSEDE)” provider should be selected. If successful (and the client has an account on DARWIN), the application next asks for an email address to which a verification email should be sent; the form is pre-populated with the email address on-record on DARWIN for the client's account. The client has 15 minutes to follow the link in that email message to choose a new password. The form displays information regarding the desired length and qualifications of a valid password. If the new password is acceptable, the client's DARWIN password is set and SSH access via password should become available immediately.
ACCESS users on DARWIN can use the password reset web application to reset a forgotten password, too.
Once a password has been set, you may log in to DARWIN using SSH with your ACCESS DARWIN username, xsedeuXXXX, where XXXX is your unique uid. If you need to use X-Windows requiring X11 forwarding (e.g., for a Jupyter Notebook or applications that generate graphical output), enable X11 forwarding in your SSH client. The standard methods documented for adding a public key on DARWIN will only work once a password has been set for your ACCESS DARWIN account using the password reset web application. If you need help setting up SSH, please see the Generating SSH Keys page and/or Uploading Your SSH Key page.
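For example (a sketch based on the hostname and username form described above; substitute your own 4-digit uid for XXXX):
$ ssh xsedeuXXXX@darwin.hpc.udel.edu
$ ssh -Y xsedeuXXXX@darwin.hpc.udel.edu    # with X11 forwarding enabled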
Once you are logged into DARWIN, your account is configured as a member of an allocation workgroup, which determines access to your HPC resources on DARWIN. Setting your allocation workgroup is required in order to submit jobs to the DARWIN cluster. For example, the bjones account is a member of the it_css workgroup. To start a shell in the it_css workgroup, type:
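$ workgroup -g it_css
This is the same workgroup -g command shown in the Run Jobs section above; substitute your own workgroup name for it_css.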
Consult the following pages for detailed instructions for using DARWIN.
File Systems
Home
The 13.5 TiB file system uses 960 GiB enterprise class SSD drives in a triple-parity RAID configuration for high reliability and availability. The file system is accessible to all nodes via IPoIB on the 100 Gbit/s InfiniBand network.
Storage
Each user has 20 GB of disk storage reserved for personal use on the home file system. Users' home directories are in /home (e.g., /home/1005), and the directory name is put in the environment variable $HOME at login.
High-Performance Lustre
Lustre is designed to use parallel I/O techniques to reduce file-access time. The Lustre file systems in use at UD are composed of many physical disks using RAID technologies to give resilience, data integrity, and parallelism at multiple levels. There is approximately 1.1 PiB of Lustre storage available on DARWIN. It uses high-bandwidth interconnects such as Mellanox HDR100. Lustre should be used for storing input files, supporting data files, work files, and output files associated with computational tasks run on the cluster.
Consult All About Lustre for more detailed information.
Workgroup Storage
Allocation workgroup storage is available on a high-performance Lustre-based file system having almost 1.1 PB of usable space. Users should have a basic understanding of the concepts of Lustre to take full advantage of this file system. The default stripe count is set to 1, with single-stripe files distributed across all available OSTs on Lustre. See Lustre Best Practices from NASA.
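As a hedged sketch of standard Lustre tooling (not a DARWIN-specific policy; the stripe count of 4 and the directory name are only illustrations), striping can be inspected or changed per directory:
$ lfs getstripe $WORKDIR_USERS              # show the current striping layout
$ mkdir $WORKDIR_USERS/big
$ lfs setstripe -c 4 $WORKDIR_USERS/big     # new files created here will be spread across 4 OSTs
($WORKDIR_USERS is the personal workgroup directory described below.)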
Each allocation will have at least 1 TiB of shared (workgroup) storage in the /lustre/ directory identified by the «allocation_workgroup» (e.g., /lustre/it_css), accessible by all users in the allocation workgroup; it is referred to as your workgroup directory ($WORKDIR), if the allocation workgroup has been set.
Each user in the allocation workgroup will have a /lustre/«workgroup»/users/«uid» directory to be used as a personal workgroup storage directory for running jobs and storing larger amounts of data, input files, supporting data files, work files, output files, and source code. It can be referred to as $WORKDIR_USERS, if the allocation workgroup has been set.
Each allocation will also have a /lustre/«workgroup»/sw directory to allow users to install software to be shared by the allocation workgroup. It can be referred to as $WORKDIR_SW, if the allocation workgroup has been set. In addition, a /lustre/«workgroup»/sw/valet directory is provided to store VALET package files to be shared by the allocation workgroup.
Please see workgroup for complete details on environment variables.
Note: A full file system inhibits use for everyone, preventing jobs from running.
Local/Node File System
Temporary Storage
Each compute node has its own 2 TB local hard drive, which is needed for time-critical tasks such as managing virtual memory. The system usage of the local disk is kept as small as possible to allow some local disk for your applications running on the node.
Quotas and Usage
To help users maintain awareness of quotas and their usage on the /home file system, the my_quotas command is available to display a list of all quota-controlled file systems on which the user has storage space.
For example, the following shows the amount of storage available and in-use for user bjones in workgroup it_css for their home and workgroup directory.
Home
Each user's home directory has a hard quota limit of 20 GB. To check usage, use the command shown below. The example displays the usage for the home directory (/home/1201) for the account bjones as 7.2 GB used out of 20 GB, which matches the example provided by the my_quotas command above.
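A minimal sketch, assuming the same df utility this page uses for /lustre and /tmp:
$ df -h $HOME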
Workgroup
All of Lustre is available for allocation workgroup storage. To check Lustre usage for all users, use df -h /lustre. The example below shows 25 TB in use out of 954 TB of usable Lustre storage.
To see your allocation workgroup usage, please use the my_quotas command. Again, the following example shows the amount of storage available and in-use for user bjones in allocation workgroup it_css for their home and allocation workgroup directories.
Node
The node temporary storage is mounted on /tmp for all nodes. There is no quota, and if you exceed the physical size of the disk you will get disk failure messages. To check the usage of your disk, use the df -h command on the compute node where your job is running.
We strongly recommend that you refer to the node scratch by using the environment variable $TMPDIR, which is defined by Slurm when using salloc, srun, or sbatch.
For example, the command shown below reports size, used, and available space in M, G, or T units. In this example, the node r1n00 has a 2 TB disk with only 41 MB used, so 1.8 TB is available for your job.
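A minimal sketch (run from within your job on its assigned compute node, since $TMPDIR is only defined inside a Slurm job):
$ df -h $TMPDIR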
There is a physical disk installed on each node that is used for time-critical tasks such as swapping memory. Most of the compute nodes are configured with a 2 TB disk; however, the /tmp file system will never have the total disk, since larger memory nodes need to use more of the disk for swap space.
Recovering Files
While all file systems on the DARWIN cluster utilize hardware redundancies to protect data, there is no backup or replication and no recovery available for the home or Lustre file systems. All backups are the responsibility of the user. DARWIN's systems administrators are not liable for any lost data.
Usage Recommendations
Home directory: Use your home directory to store private files. Application software you use will often store its configuration, history, and cache files in your home directory. Generally, keep this directory free and use it for files needed to configure your environment. For example, add symbolic links in your home directory to point to files in any of the other directories.
Workgroup directory: Use the personal allocation workgroup directory ($WORKDIR_USERS) for running jobs and storing larger amounts of data, input files, supporting data files, work files, output files, and source code, as an extension of your home directory. It is also appropriate to use the software allocation workgroup directory ($WORKDIR_SW) to build applications for everyone in your allocation group, as well as to create a VALET package in $WORKDIR_SW/valet for your fellow researchers to access applications you want to share.
Node scratch directory: Use the node scratch directory for temporary files. The job scheduler software (Slurm) creates a temporary directory in /tmp specifically for each job's temporary files. This is done on each node assigned to the job. When the job is complete, the subdirectory and its contents are deleted. This process automatically frees up the local scratch storage that others may need. Files in node scratch directories are not available to the head node, or other compute nodes.
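As an illustrative sketch only (the partition name, input file, and executable below are hypothetical), a batch script can stage data through the per-job scratch directory like this:
#!/bin/bash
#SBATCH --partition=standard          # hypothetical partition name
#SBATCH --ntasks=1

cp "$WORKDIR_USERS/input.dat" "$TMPDIR/"              # stage input into the per-job scratch directory
cd "$TMPDIR"
"$WORKDIR_USERS/my_program" input.dat > output.dat    # hypothetical executable
cp output.dat "$WORKDIR_USERS/"                       # copy results back; Slurm erases $TMPDIR when the job ends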
Transferring Files
Be careful about modifications you make to your startup files (e.g., .bash*). Commands that produce output, such as VALET or workgroup commands, may cause your file transfer command or application to fail. Log into the cluster with ssh to check what happens during login, modify your startup files to remove any commands which are producing output, and try again. See computing environment startup and logout scripts for help.
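One common, generic bash pattern (a sketch, not a DARWIN-specific requirement) is to guard such commands so they run only in interactive shells:
# in ~/.bashrc
if [[ $- == *i* ]]; then
    # Interactive shells only: commands that print output are safe here,
    # so non-interactive logins used for file transfers stay silent.
    echo "Welcome to DARWIN"    # placeholder for your own output-producing commands
fi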
Common Clients
You can move data to and from the cluster using the following supported clients: