{"type":"doc","content":[{"type":"extension","attrs":{"layout":"default","extensionType":"com.atlassian.confluence.macro.core","extensionKey":"toc","parameters":{"macroParams":{"minLevel":{"value":"1"},"maxLevel":{"value":"1"}},"macroMetadata":{"macroId":{"value":"8a6b9e5800b8d070e16086984910ea9d"},"schemaVersion":{"value":"1"},"title":"Table of Contents"}},"localId":"f1bc0762-510d-4abe-9b2a-faf06e0a557a"}},{"type":"heading","attrs":{"level":1},"content":[{"text":"Technical Summary","type":"text"}]},{"type":"mediaSingle","attrs":{"layout":"center","width":80.0},"content":[{"type":"media","attrs":{"type":"external","url":"https://xsedetoaccess.ccs.uky.edu/XSEDE/images/documents/10308/2578178/Expanse_inner_1200-sm.jpg/56523be6-c991-4c86-a4b5-2ebdf3ddc53c%3Ft=1602454080000"}}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" is a dedicated Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support (","type":"text"},{"text":"ACCESS","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.org/"}}]},{"text":") cluster designed by Dell and SDSC delivering 5.16 peak petaflops, and will offer Composable Systems and Cloud Bursting.","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":"'s standard compute nodes are each powered by two 64-core AMD EPYC 7742 processors and contain 256 GB of DDR4 memory, while each GPU node contains four NVIDIA V100s (32 GB SMX2) connected via NVLINK and dual 20-core Intel Xeon 6248 CPUs. ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" also has four 2 TB large memory nodes.","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" is organized into 13 SDSC Scalable Compute Units (SSCUs), comprising 728 standard nodes, 54 GPU nodes and 4 large-memory nodes. Every ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" node has access to a 12 PB Lustre parallel file system (provided by Aeon Computing) and a 7 PB Ceph Object Store system. ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" uses the Bright Computing HPC Cluster management system and the SLURM workload manager for job scheduling.","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse Portal Login","type":"text","marks":[{"type":"strong"},{"type":"link","attrs":{"href":"https://portal.expanse.sdsc.edu"}}]}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" supports the ","type":"text"},{"text":"ACCESS core software stack","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/682491905"}}]},{"text":", which includes remote login, remote computation, data movement, science workflow support, and science gateway support toolkits.","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" is an NSF-funded system operated by the ","type":"text"},{"text":"San Diego Supercomputer Center","type":"text","marks":[{"type":"link","attrs":{"href":"https://www.sdsc.edu"}}]},{"text":" at ","type":"text"},{"text":"UC San Diego","type":"text","marks":[{"type":"link","attrs":{"href":"https://www.ucsd.edu"}}]},{"text":", and is available through the ACCESS program.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Resource Allocation Policies","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"The maximum allocation for a Principle Investigator on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" is 15M core-hours and 75K GPU hours. Limiting the allocation size means that ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" can support more projects, since the average size of each is smaller.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Science Gateways requesting in the Maximize tier can request up to 30M core-hours.","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Job Scheduling Policies","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"The maximum allowable job size on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" is 4,096 cores – a limit that helps shorten wait times since there are fewer nodes in idle state waiting for large number of nodes to become free.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" supports long-running jobs - run times can be extended to one week. Users requests will be evaluated based on number of jobs and job size. ","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" supports shared-node jobs (more than one job on a single node). Many applications are serial or can only scale to a few cores. Allowing shared nodes improves job throughput, provides higher overall system utilization, and allows more users to run on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":".","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Technical Details","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"b4589ce6-7c0e-4d92-8869-67290cbfe17a"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"SYSTEM COMPONENT","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"CONFIGURATION","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":2,"rowspan":1,"colwidth":[340.0,340.0]},"content":[{"type":"paragraph","content":[{"text":"Compute Nodes","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"CPU Type","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"AMD EPYC 7742","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Nodes","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"728","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Sockets","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Cores/socket","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"64","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Clock speed","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2.25 GHz","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Flop speed","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"4608 GFlop/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Memory capacity","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"256 GB DDR4 DRAM","type":"text"}]}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Local Storage","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"1TB Intel P4510 NVMe PCIe SSD","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Max CPU Memory bandwidth","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"409.5 GB/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":2,"rowspan":1,"colwidth":[340.0,340.0]},"content":[{"type":"paragraph","content":[{"text":"GPU Nodes","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"GPU Type","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"NVIDIA V100 SMX2","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Nodes","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"52","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"GPUs/node","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"CPU Type","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Xeon Gold 6248","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Cores/socket","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"20","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Sockets","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Clock speed","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2.5 GHz","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Flop speed","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"34.4 TFlop/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Memory capacity","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"*384 GB DDR4 DRAM","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Local Storage","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"1.6TB Samsung PM1745b NVMe PCIe SSD","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Max CPU Memory bandwidth","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"281.6 GB/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":2,"rowspan":1,"colwidth":[340.0,340.0]},"content":[{"type":"paragraph","content":[{"text":"Large-Memory","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"CPU Type","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"AMD EPYC 7742","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Nodes","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Sockets","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Cores/socket","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"64","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Clock speed","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2.25 GHz","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Flop speed","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"4608 GFlop/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Memory capacity","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"2 TB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Local Storage","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"3.2 TB (2 X 1.6 TB Samsung PM1745b NVMe PCIe SSD)","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"STREAM Triad bandwidth","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"~310 GB/sec","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":2,"rowspan":1,"colwidth":[340.0,340.0]},"content":[{"type":"paragraph","content":[{"text":"Full System","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total compute nodes","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"728","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total compute cores","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"93,184","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total GPU nodes","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"52","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total V100 GPUs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"208","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Peak performance","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"5.16 PFlop/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total memory","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"247 TB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total memory bandwidth","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"215 TB/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Total flash memory","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"824 TB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":2,"rowspan":1,"colwidth":[340.0,340.0]},"content":[{"type":"paragraph","content":[{"text":"HDR InfiniBand Interconnect","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Topology","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Hybrid Fat-Tree","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Link bandwidth","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"56 Gb/s (bidirectional)","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Peak bisection bandwidth","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"8.5 TB/s","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"MPI latency","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"1.17-1.69 µs","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":2,"rowspan":1,"colwidth":[340.0,340.0]},"content":[{"type":"paragraph","content":[{"text":"DISK I/O Subsystem","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"File Systems","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"NFS, Ceph","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Lustre Storage(performance)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"12 PB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Ceph Storage","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"7 PB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"I/O bandwidth (performance disk)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"140 GB/s, 200K IOPs","type":"text"}]}]}]}]},{"type":"paragraph","content":[{"text":" ","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Systems Software Environment","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"20f51a70-9d91-4877-8e3a-aeb9e72b922a"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"SOFTWARE FUNCTION","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"DESCRIPTION","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Cluster Management","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Bright Cluster Manager","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Operating System","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Rocky Linux","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"File Systems","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Lustre, Ceph","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Scheduler and Resource Manager","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"SLURM","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"ACCESS Software","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"CTSS","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"User Environment","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Lmod","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Compilers","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"AOCC, GCC, Intel, PGI","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Message Passing","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Intel MPI, MVAPICH, Open MPI","type":"text"}]}]}]}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"System Access","type":"text"}]},{"type":"paragraph","content":[{"text":"As an ACCESS computing resource, ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" is accessible to ACCESS users who are given time on the system. To obtain an account, users may submit a proposal through the ","type":"text"},{"text":"ACCESS Allocation Request System","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/682491905"}}]},{"text":" or request a ","type":"text"},{"text":"Trial Account","type":"text","marks":[{"type":"link","attrs":{"href":"mailto:consult@sdsc.edu"}}]},{"text":".","type":"text"}]},{"type":"paragraph","content":[{"text":"Interested parties may contact the ","type":"text"},{"text":"ACCESS Help Desk","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/467109484"}}]},{"text":" ","type":"text","marks":[{"type":"strong"}]},{"text":"for ","type":"text"},{"text":"help with an allocations proposal","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/467109798"}}]},{"text":" on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":".","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Logging in to Expanse","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" supports ACCESS user connections via SSH, and web-based access via the ","type":"text"},{"text":"Expanse User Portal","type":"text","marks":[{"type":"link","attrs":{"href":"https://portal.expanse.sdsc.edu"}}]},{"text":". While CPU and GPU resources are allocated separately, the login nodes are the same. To log in to ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" from the command line, use the hostname: ","type":"text"},{"text":"login.expanse.sdsc.edu","type":"text","marks":[{"type":"code"}]},{"text":".","type":"text"}]},{"type":"paragraph","content":[{"text":"The following are examples of Secure Shell (ssh) commands that may be used to log in to ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":":","type":"text"}]},{"type":"codeBlock","content":[{"text":"ssh @login.expanse.sdsc.edu\nssh -l login.expanse.sdsc.edu\n","type":"text"}]},{"type":"paragraph","content":[{"text":"If you need help setting up SSH, please see the ACCESS ","type":"text"},{"text":"Generating SSH Keys","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/467110032"}}]},{"text":" page and/or ","type":"text"},{"text":"Uploading Your SSH Key","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/939360257"}}]},{"text":" page.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Notes and hints","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"When you log in to ","type":"text"},{"type":"inlineCard","attrs":{"url":"http://expanse.sdsc.edu"}},{"text":" , you will be assigned one of the two login nodes login0[1-2]-expanse.sdsc.edu. These nodes are identical in both architecture and software environment. Users should normally log in through ","type":"text"},{"text":"login.expanse.sdsc.edu","type":"text","marks":[{"type":"link","attrs":{"href":"http://login.expanse.sdsc.edu"}}]},{"text":", but may specify one of the two nodes directly if they see poor performance.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Please feel free to append your public key to your ~/.ssh/authorized_keys file to enable access from authorized hosts without having to enter your password. We accept RSA, ECDSA and ed25519 keys. Make sure you have a strong passphrase on the private key on your local machine.","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"You can use ssh-agent or keychain to avoid repeatedly typing the private key password.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Hosts which connect to SSH more frequently than ten times per minute may get blocked for a short period of time","type":"text"}]}]}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Do not use the login nodes for computationally intensive processes, as hosts for running workflow management tools, as primary data transfer nodes for large or numerous data transfers or as servers providing other services accessible to the Internet. The login nodes are meant for file editing, simple data analysis, and other tasks that use minimal compute resources. All computationally demanding jobs should be submitted and run through the batch queuing system.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Login nodes are not the same as the batch nodes, Users should request an interactive sessions to compile programs.","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"2FA with Google Authenticator (Optional)","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" allows a user to use two-factor authentication (2FA) when using a password to log in. 2FA adds a layer of security to your authentication process. ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" uses Google Authenticator, which is a standards-based implementation.Install Authenticator App","type":"text"}]},{"type":"paragraph","content":[{"text":"Users will first need to Install an authenticator app on their smartphone or other device. Users can use any app that supports importing TOTP 2FA codes with a QR code. (Google Authenticator, DUO Mobile App, LastPass Authenticator App, etc) We suggest using the Google Authenticator app if you do not an athenticator application already installed on your mobile device. ","type":"text"},{"type":"hardBreak"}]},{"type":"paragraph","content":[{"text":"Google Authenticator for Apple IOS","type":"text","marks":[{"type":"link","attrs":{"href":"https://apps.apple.com/us/app/google-authenticator/id388497605"}}]}]},{"type":"paragraph","content":[{"text":"Google Authenticator for Android","type":"text","marks":[{"type":"link","attrs":{"href":"https://authenticator.en.uptodown.com/android"}}]},{"text":" Once the authenticator app has been installed, users will need to enroll and pair the 2FA device with their ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" Account.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"To enroll:","type":"text"}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Log in to ","type":"text"},{"text":"login.expanse.sdsc.edu","type":"text","marks":[{"type":"link","attrs":{"href":"http://login.expanse.sdsc.edu"}}]}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"On the command line load the sdsc module:","type":"text"}]}]}]},{"type":"codeBlock","content":[{"text":">module load sdsc","type":"text"}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Resize your terminal window and/or font size so it can display at least 82 columns by 40 lines","type":"text"}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"On the command line run to command:","type":"text"}]}]}]},{"type":"codeBlock","content":[{"text":">otp-enroll","type":"text"}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Using your smart phone, scan the QR code with your OTP/2FA application","type":"text"}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Confirm the scan by entering the 6-digit code from the OTP/2FA application","type":"text"}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Save your emergency scratch codes, in case you need to log in and don't have access to your mobile. (You can always log in with SSH keys instead of using an emergency code)","type":"text"}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Answer 'y' to the prompt asking if you want to update your .google_authenticator file.","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"At this time 2FA is optional, users may un-enroll at any time.","type":"text"}]},{"type":"paragraph","content":[{"text":"To un-enroll:","type":"text","marks":[{"type":"strong"}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Log in to ","type":"text"},{"text":"login.expanse.sdsc.edu","type":"text","marks":[{"type":"link","attrs":{"href":"http://login.expanse.sdsc.edu"}}]},{"text":".","type":"text"}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Remove the file ~/.google_authenticator","type":"text"}]}]}]},{"type":"orderedList","attrs":{"order":1},"content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Once you have removed the .google_authenticator file from the server side, you can remove the entry on your smart phone or other device","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Expanse User Portal","type":"text"}]},{"type":"paragraph","content":[{"text":"The ","type":"text"},{"text":"Expanse User Portal","type":"text","marks":[{"type":"strong"},{"type":"link","attrs":{"href":"https://portal.expanse.sdsc.edu"}}]},{"text":" provides a quick and easy way for ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" users to log in, transfer and edit files, and submit and monitor jobs. The Portal provides a gateway for launching interactive applications such as MATLAB, RStudio, and an integrated web-based environment for file management and job submission. All ACCESS users with a valid ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" allocation have access via their ACCESS-based credentials.","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Modules","type":"text"}]},{"type":"paragraph","content":[{"text":"Environment Modules provide for dynamic modification of your shell environment. Module commands set, change, or delete environment variables, typically in support of a particular application. They also let the user choose between different versions of the same software or different combinations of related codes.","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" uses Lmod, a Lua-based module system. Users will now need to setup their own environment by loading available modules into the shell environment, including compilers and libraries and the batch scheduler.","type":"text"}]},{"type":"paragraph","content":[{"text":"Users will not see all the available modules when they run the ","type":"text"},{"text":"module available","type":"text","marks":[{"type":"code"}]},{"text":" command without loading a compiler. Users should use the command ","type":"text"},{"text":"module spider","type":"text","marks":[{"type":"code"}]},{"text":" to see if a particular package exists and can be loaded on the system. For additional details, and to identify dependents modules, use the command:","type":"text"}]},{"type":"codeBlock","content":[{"text":"module spider ","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"The module paths are different for the CPU and GPU nodes. Users can enable the paths by loading the following modules:","type":"text"}]},{"type":"codeBlock","content":[{"text":"module load cpu (for cpu nodes)","type":"text"}]},{"type":"paragraph"},{"type":"codeBlock","content":[{"text":"module load gpu (for gpu nodes)","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"Users are requested to ensure that both sets are not loaded at the same time in their build/run environment (use the ","type":"text"},{"text":"module list","type":"text","marks":[{"type":"code"}]},{"text":" command to check in an interactive session).","type":"text"}]},{"type":"paragraph","content":[{"text":"On the GPU nodes, the gnu compiler used for building packages is the default version 8.3.1 from the OS. Hence, no additional ","type":"text"},{"text":"module load","type":"text","marks":[{"type":"code"}]},{"text":" command is required to use them. For example, if one needs OpenMPI built with gnu compilers, the following is sufficient:","type":"text"}]},{"type":"codeBlock","content":[{"text":"module load openmpi","type":"text"}]},{"type":"paragraph"},{"type":"heading","attrs":{"level":2},"content":[{"text":"Useful Modules Commands","type":"text"}]},{"type":"paragraph","content":[{"text":"Here are some common module commands and their descriptions:","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"85be7a66-5f1a-4f8c-b919-c2c65f1e47f4"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"COMMAND","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"DESCRIPTION","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module list","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"List the modules that are currently loaded","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module avail","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"List the modules that are available in environment","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module spider","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"List of the modules and extensions currently available","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module display ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Show the environment variables used by and how they are affected","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module unload ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Remove from the environment","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module load ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Load into the environment","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"module swap ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Replace with in the environment","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Loading and unloading modules","type":"text"}]},{"type":"paragraph","content":[{"text":"Some modules depend on others, so they may be loaded or unloaded as a consequence of another module command. If a model has dependencies, the command ","type":"text"},{"text":"module spider ","type":"text","marks":[{"type":"code"}]},{"text":" will provide additional details.","type":"text"}]},{"type":"paragraph","content":[{"text":"Module: command not found","type":"text","marks":[{"type":"strong"}]}]},{"type":"paragraph","content":[{"text":"The error message ","type":"text"},{"text":"module: command not found","type":"text","marks":[{"type":"em"}]},{"text":" is sometimes encountered when switching from one shell to another or attempting to run the module command from within a shell script or batch job. The reason the ","type":"text"},{"text":"module","type":"text","marks":[{"type":"code"}]},{"text":" command may not be inherited as expected is that it is defined as a function for your login shell. If you encounter this error, execute the following from the command line (interactive shells) or add to your shell script (including SLURM batch scripts):","type":"text"}]},{"type":"codeBlock","content":[{"text":"source /etc/profile.d/modules.sh","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Managing Your User Account","type":"text"}]},{"type":"paragraph","content":[{"text":"The ","type":"text"},{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script provides additional details regarding project availability and usage. The script is located at:","type":"text"}]},{"type":"codeBlock","content":[{"text":"/cm/shared/apps/sdsc/current/bin/expanse-client","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"To use the script you will need to load the 'sdsc' module.","type":"text"}]},{"type":"codeBlock","content":[{"text":"[user@login01 ~]$ module load sdsc","type":"text"}]},{"type":"paragraph","content":[{"text":"To review your available projects on Expanse resource use the 'user' parameter and '-r' to designate a resource. If no resource is designated expanse data will be shown by default.","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"user@login01 ~]$ expanse-client user -r expanse","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":" Resource expanse","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":"╭───┬─────────────┬─────────┬────────────┬──────┬───────────┬─────────────────╮","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":"│ │ NAME │ PROJECT │ TG PROJECT │ USED │ AVAILABLE │ USED BY PROJECT │","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":"├───┼─────────────┼─────────┼────────────┼──────┼───────────┼─────────────────┤","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":"│ 1 │ user │ ddp386 │ │ 0 │ 110000 │ 8318 │","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":"╰───┴─────────────┴─────────┴────────────┴──────┴───────────┴─────────────────╯","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"To see full list of available resources, use the 'resource' command:","type":"text"}]},{"type":"codeBlock","content":[{"text":"[user@login02 ~]$ expanse-client resource\nAvailable resources:\nexpanse\nexpanse_gpu\nexpanse_industry\nexpanse_industry_gpu\n","type":"text"}]},{"type":"paragraph","content":[{"text":"To review project details, use the 'project' parameter followed by an eligible project. (use -p option to report with out formatting):","type":"text"},{"type":"hardBreak"}]},{"type":"paragraph","content":[{"text":"user@login01 ~]$ expanse-client project ddp386 -p","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":" Resource expanse","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" Project ddp386","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" TG Project","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" Total allocation 110000","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" Total spent 8318","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" Expiration November 16, 2022","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":" NAME USED AVAILABLE USED BY PROJECT","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":"-------------------------------------------------","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user 0 110000 8318","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user1 0 110000 8318","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user2 18 110000 8318","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user3 7825 110000 8318","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user4 0 110000 8318","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user5 152 110000 8318","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"For additional help using the expanse-client tool:","type":"text"}]},{"type":"paragraph","content":[{"text":"[user@login01 ~]$ expanse-client -h","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":"Usage:","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" expanse-client [command][flags]","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":"Available Commands:","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" help Help about any command","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" project Get 'project' information","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" resource Get resources","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" user Get 'user' information","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":"Flags:","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -a, --auth authenticate the request ","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -h, --help help for user","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -p, --plain plain no graphics output","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -v, --verbose verbose output","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -r, --resource string Resource to query (default: \"expanse\")","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":"Global Flags:","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -a, --auth authenticate the request","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -p, --plain plain no graphics output","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"},{"text":" -v, --verbose verbose output","type":"text","marks":[{"type":"code"}]}]},{"type":"paragraph","content":[{"text":"Many users will have access to multiple projects (e.g. an allocation for a research project and a separate allocation for classroom or educational use). Users should verify that the correct project is designated for all batch jobs. Awards are granted for a specific purposes and should not be used for other projects. Designate a project by replacing ","type":"text"},{"text":"<< project >>","type":"text","marks":[{"type":"code"}]},{"text":" with a project listed in the SBATCH directive in your job script:","type":"text"}]},{"type":"codeBlock","content":[{"text":" #SBATCH -A << project >>","type":"text"}]},{"type":"paragraph"},{"type":"heading","attrs":{"level":2},"content":[{"text":"Adding Users to a Project","type":"text"}]},{"type":"paragraph","content":[{"text":"Project PIs and co-PIs can add/remove users(accounts) to/from a project. To do this, log in to your ","type":"text"},{"text":"ACCESS portal account","type":"text","marks":[{"type":"link","attrs":{"href":"https://allocations.access-ci.org/login"}}]},{"text":" and go to the","type":"text"},{"text":" Manage Allocations ","type":"text","marks":[{"type":"strong"}]},{"text":"page.","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Job Charging","type":"text"}]},{"type":"paragraph","content":[{"text":"The charge unit for all SDSC machines, including ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":", is the Service Unit (SU). This corresponds to the equivalent use of one compute core utilizing less than or equal to 2G of data for one hour, or 1 GPU using less than 92G of data for 1 hour. Keep in mind that your charges are based on the resources that are tied up by your job and don't necessarily reflect how the resources are used. Charges on jobs submitted to the 'shared' partitions (shared,gpu-shared,debug,gpu-debug,large-shared) are based on either the number of cores or the fraction of the memory requested, whichever is larger. Jobs submitted to the node-exclusive partitions (compute, gpu) will be charged for the all 128 cores, whether the resources are used or not. The minimum charge for any job is 1 SU.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Job Charge Considerations","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"A node-exclusive job that runs on a compute node for one hour will be charged 128 SUs (128 cores x 1 hour)","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"A node-exclusive job that runs on a GPU node for one hour will be charge 4GPU hours (4 GPU x 1 hour)","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"A node-exclusive job that runs on a Large memory node for one hour will be charged 1024 SUs (2TB memory X 1 hour)","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"A serial job in the shared queue that uses less than 2 GB memory and runs for one hour will be charged 1 SU (1 core x 1 hour)","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Each standard compute node has ~256 GB of memory and 128 cores","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Each standard node core will be allocated 1 GB of memory, users should explicitly include the ","type":"text"},{"text":"--mem","type":"text","marks":[{"type":"code"}]},{"text":" directive to request additional memory","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Max. available memory per compute node ","type":"text"},{"text":"--mem = 249325M","type":"text","marks":[{"type":"code"}]}]}]}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Each GPU node has 4 GPUs, ~384GB of memory and 40 cores","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Default resource allocation for 1 GPU = 1 GPU, 1 CPU, and 1G of memory, users will need to explicitly ask for additional resources in their job script.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"For max memory on a GPU node, users should request ","type":"text"},{"text":"--mem = 377393M","type":"text","marks":[{"type":"code"}]}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"A GPU SU is equivalent to 1GPU, <10CPUs, and <92G of memory.","type":"text"}]}]}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Multicore jobs will scale according to resource utilization","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Each large memory node has ~2 TB of memory and 128 cores","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"By default the system will only allocate 1 GB of memory per core, explicitly use the ","type":"text"},{"text":"--mem","type":"text","marks":[{"type":"code"}]},{"text":" directive to request additional memory","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Max. memory per large memory node ","type":"text"},{"text":"--mem = 2055652M","type":"text","marks":[{"type":"code"}]}]}]}]}]}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Compiling Codes","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" CPU nodes have GNU, Intel, and AOCC (AMD) compilers available along with multiple MPI implementations (OpenMPI, MVAPICH2, and IntelMPI). The majority of the applications on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" have been built using gcc/10.2.0 which features AMD Rome specific optimization flags (-march=znver2). Users should evaluate their application for best compiler and library selection. GNU, Intel, and AOCC compilers all have flags to support Advanced Vector Extensions 2 (AVX2). Using AVX2, up to eight floating point operations can be executed per cycle per core, potentially doubling the performance relative to non-AVX2 processors running at the same clock speed. Note that AVX2 support is not enabled by default and compiler flags must be set as described below.","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" GPU nodes have GNU, Intel, and PGI compilers available along with multiple MPI implementations (OpenMPI, IntelMPI, and MVAPICH2). The gcc/10.2.0, Intel, and PGI compilers have specific flags for the Cascade Lake architecture. Users should evaluate their application for best compiler and library selections.","type":"text"}]},{"type":"paragraph","content":[{"text":"Note that the login nodes are not the same as the GPU nodes, therefore all GPU codes must be compiled by requesting an interactive session on the GPU nodes.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Using AMD Compilers","type":"text"}]},{"type":"paragraph","content":[{"text":"The AMD Optimizing C/C++ Compiler (AOCC) is only available on CPU nodes. AMD compilers can be loaded by executing the following commands at the Linux prompt:","type":"text"}]},{"type":"codeBlock","content":[{"text":"module load aocc","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"For more information on the AMD compilers: [flang | clang ] -help","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"a04641f0-ecc8-452d-8d1e-3365c97fbcd2"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":2,"background":"#eeeeee","rowspan":1,"colwidth":[136.0,136.0]},"content":[{"type":"paragraph","content":[{"text":"SERIAL","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"OPENMP","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI+OPENMP","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Fortran","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"flang","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"ifort -mp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90 -fopenmp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"clang","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpiclang","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"icc -lomp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc -fopenmp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C++","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"clang++","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpiclang","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"icpc -lomp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx -fopenmp","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Using the Intel Compilers","type":"text"}]},{"type":"paragraph","content":[{"text":"The Intel compilers and the MVAPICH2 MPI compiler wrappers can be loaded by executing the following commands at the Linux prompt:","type":"text"}]},{"type":"codeBlock","content":[{"text":"module load intel mvapich2","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"For AVX2 support, compile with the ","type":"text"},{"text":"-march=core-avx2","type":"text","marks":[{"type":"code"}]},{"text":" option. Note that this flag alone does not enable aggressive optimization, so compilation with ","type":"text"},{"text":"-O3","type":"text","marks":[{"type":"code"}]},{"text":" is also suggested.","type":"text"}]},{"type":"paragraph","content":[{"text":"Intel MKL libraries are available as part of the \"intel\" modules on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":". Once this module is loaded, the environment variable INTEL_MKLHOME points to the location of the mkl libraries. The ","type":"text"},{"text":"MKL link advisor","type":"text","marks":[{"type":"link","attrs":{"href":"https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor"}}]},{"text":" can be used to ascertain the link line (change the INTEL_MKLHOME aspect appropriately).","type":"text"}]},{"type":"paragraph","content":[{"text":"For example to compile a C program statically linking 64 bit scalapack libraries on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":":","type":"text"}]},{"type":"codeBlock","content":[{"text":"mpicc -o pdpttr.exe pdpttr.c \\\n -I$INTEL_MKLHOME/mkl/include \\\n ${INTEL_MKLHOME}/mkl/lib/intel64/libmkl_scalapack_lp64.a \\\n -Wl,--start-group ${INTEL_MKLHOME}/mkl/lib/intel64/libmkl_intel_lp64.a \\\n ${INTEL_MKLHOME}/mkl/lib/intel64/libmkl_core.a \\\n ${INTEL_MKLHOME}/mkl/lib/intel64/libmkl_sequential.a \\\n -Wl,--end-group ${INTEL_MKLHOME}/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a \\\n -lpthread -lm","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"For more information on the Intel compilers: [ifort | icc | icpc] -help","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"f72b0763-5743-4a5e-94a9-28d658f90270"},"content":[{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph"}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Serial","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"OpenMP","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI+OpenMP","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Fortran","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"ifort","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"ifort -qopenmp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90 -qopenmp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"icc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"icc -qopenmp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc -qopenmp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C++","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"icpc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"icpc -qopenmp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx -qopenmp","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Using the PGI Compilers","type":"text"}]},{"type":"paragraph","content":[{"text":"The PGI compilers are only available on the GPU nodes, and can be loaded by executing the following commands at the Linux prompt","type":"text"}]},{"type":"codeBlock","content":[{"text":"module load pgi","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"Note that the openmpi build is integrated into the PGI install so the above module load provides both PGI and openmpi.","type":"text"}]},{"type":"paragraph","content":[{"text":"For AVX support, compile with ","type":"text"},{"text":"-fast","type":"text","marks":[{"type":"code"}]},{"text":".","type":"text"}]},{"type":"paragraph","content":[{"text":"For more information on the PGI compilers: man [pgf90 | pgcc | pgCC]","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"6ce1cb6f-cf85-4d8c-a571-8060f9ffd706"},"content":[{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph"}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Serial","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"OpenMP","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI+OpenMP","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Fortran","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"pgf90","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"pgf90 -mp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90 -mp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"pgcc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"pgcc -mp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc -mp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C++","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"pgCC","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"pgCC -mp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx -mp","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Using the GNU Compilers","type":"text"}]},{"type":"paragraph","content":[{"text":"The GNU compilers can be loaded by executing the following commands at the Linux prompt:","type":"text"}]},{"type":"codeBlock","content":[{"text":"module load gcc openmpi","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"For AVX support, compile with -march=core-avx2. Note that AVX support is only available in version 4.7 or later, so it is necessary to explicitly load the gnu/4.9.2 module until such time that it becomes the default.","type":"text"}]},{"type":"paragraph","content":[{"text":"For more information on the GNU compilers: ","type":"text"},{"text":"man [gfortran | gcc | g++]","type":"text","marks":[{"type":"code"}]}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"953a0fcc-e1c5-429b-8560-f6a12fc045e7"},"content":[{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph"}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Serial","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"OpenMP","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"MPI+OpenMP","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"Fortran","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"gfortran","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"gfortran -fopenmp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpif90 -fopenmp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"gcc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"gcc -fopenmp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicc -fopenmp","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"C++","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"g++","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"g++ -fopenmp","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[136.0]},"content":[{"type":"paragraph","content":[{"text":"mpicxx -fopenmp","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Notes and Hints","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"The mpif90, mpicc, and mpicxx commands are actually wrappers that call the appropriate serial compilers and load the correct MPI libraries. While the same names are used for the Intel, PGI and GNU compilers, keep in mind that these are completely independent scripts.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"If you use the PGI or GNU compilers or switch between compilers for different applications, make sure that you load the appropriate modules before running your executables.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"When building OpenMP applications and moving between different compilers, one of the most common errors is to use the wrong flag to enable handling of OpenMP directives. Note that Intel, PGI, and GNU compilers use the -qopenmp, -mp, and -fopenmp flags, respectively.","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Explicitly set the optimization level in your makefiles or compilation scripts. Most well written codes can safely use the highest optimization level (-O3), but many compilers set lower default levels (e.g. GNU compilers use the default -O0, which turns off all optimizations).","type":"text"}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Turn off debugging, profiling, and bounds checking when building executables intended for production runs as these can seriously impact performance. These options are all disabled by default. The flag used for bounds checking is compiler dependent, but the debugging (-g) and profiling (-pg) flags tend to be the same for all major compilers.","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Running Jobs on Expanse","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" uses the ","type":"text"},{"text":"Simple Linux Utility for Resource Management (SLURM)","type":"text","marks":[{"type":"strong"}]},{"text":" batch environment. When you run in the batch mode, you submit jobs to be run on the compute nodes using the ","type":"text"},{"text":"sbatch","type":"text","marks":[{"type":"code"}]},{"text":" command as described below. ","type":"text"},{"text":"Remember that computationally intensive jobs should be run only on the compute nodes and not the login nodes.","type":"text","marks":[{"type":"em"}]}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" places limits on the number of jobs queued and running on a per group (allocation) and partition basis. Please note that submitting a large number of jobs (especially very short ones) can impact the overall scheduler response for all users. ","type":"text"},{"text":"If you are anticipating submitting a lot of jobs, please contact the SDSC consulting staff before you submit them. We can work to check if there are bundling options that make your workflow more efficient and reduce the impact on the scheduler.","type":"text","marks":[{"type":"strong"},{"type":"em"}]}]},{"type":"paragraph","content":[{"text":"The limits for each partition are noted in the table below. ","type":"text"},{"text":"Partition limits are subject to change based on Early User Period evaluation.","type":"text","marks":[{"type":"em"}]}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"376be8c0-4117-499c-8e44-0fc86a514797"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"PARTITION NAME","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"MAX","type":"text"},{"type":"hardBreak"},{"text":"WALLTIME","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"MAX","type":"text"},{"type":"hardBreak"},{"text":"NODES/JOB","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"MAX","type":"text"},{"type":"hardBreak"},{"text":"RUNNING","type":"text"},{"type":"hardBreak"},{"text":"JOBS","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"MAX RUNNING","type":"text"},{"type":"hardBreak"},{"text":"+ QUEUED JOBS","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"CHARGE","type":"text"},{"type":"hardBreak"},{"text":"FACTOR","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"NOTES","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"compute","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"32","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"32","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"64","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Exclusive access to regular compute nodes; ","type":"text"},{"text":"limit applies per group","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"ind-compute","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"32","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"32","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"64","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Exclusive access to Industry compute nodes; ","type":"text"},{"text":"limit applies per group","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4096","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4096","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Single-node jobs using fewer than 128 cores","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"ind-shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"32","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"64","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Single-node Industry jobs using fewer than 128 cores","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"gpu","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"8 (32 Tres GPU)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Used for exclusive access to the GPU nodes","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"ind-gpu","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"8 (32 Tres GPU)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Exclusive access to the Industry GPU nodes","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"gpu-shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"24","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"24 (24 Tres GPU)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Single-node job using fewer than 4 GPUs","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"ind-gpu-shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"24","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"24 (24 Tres GPU)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Single-node job using fewer than 4 Industry GPUs","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"large-shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"48 hrs","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"4","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Single-node jobs using large memory up to 2 TB (minimum memory required 256G)","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"debug","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"30 min","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Priority access to shared nodes set aside for testing of jobs with short walltime and limited resources","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"gpu-debug","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"30 min","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"2","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Priority access to gpu-shared nodes set aside for testing of jobs with short walltime and limited resources; ","type":"text"},{"text":"max two gpus per job","type":"text","marks":[{"type":"em"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"preempt","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"7 days","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"32","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph"}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"128","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":".8","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Non-refundable discounted jobs to run on free nodes that can be pre-empted by jobs submitted to any other queue","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"gpu-preempt","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"7 days","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"1","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph"}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"24 (24 Tres GPU)","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":".8","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[97.14]},"content":[{"type":"paragraph","content":[{"text":"Non-refundable discounted jobs to run on unallocated nodes that can be pre-empted by higher priority queues","type":"text"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Requesting interactive resources using srun","type":"text"}]},{"type":"paragraph","content":[{"text":"You can request an interactive session using the ","type":"text"},{"text":"srun","type":"text","marks":[{"type":"code"}]},{"text":" command. The following example will request one regular compute node, 4 cores, in the debug partition for 30 minutes.","type":"text"}]},{"type":"codeBlock","content":[{"text":"srun --partition=debug --pty --account=<> --nodes=1 --ntasks-per-node=4 \\\n --mem=8G -t 00:30:00 --wait=0 --export=ALL /bin/bash","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"The following example will request a GPU node, 10 cores, 1 GPU and 96G in the debug partition for 30 minutes. To ensure the GPU environment is properly loaded, please be sure run both the ","type":"text"},{"text":"module purge","type":"text","marks":[{"type":"code"}]},{"text":" and ","type":"text"},{"text":"module restore","type":"text","marks":[{"type":"code"}]},{"text":" commands.","type":"text"}]},{"type":"codeBlock","content":[{"text":"login01$ srun --partition=gpu-debug --pty --account=<> --ntasks-per-node=10 \\\n --nodes=1 --mem=96G --gpus=1 -t 00:30:00 --wait=0 --export=ALL /bin/bash","type":"text"}]},{"type":"codeBlock","content":[{"text":"srun: job 1336890 queued and waiting for resources\nsrun: job 1336890 has been allocated resources\nexp-7-59$ module purge\nexp-7-59$ module restore","type":"text"}]},{"type":"codeBlock","content":[{"text":"Resetting modules to system default. Resetting $MODULEPATH back to system default.\n All extra directories will be removed from $MODULEPATH.","type":"text"}]},{"type":"paragraph"},{"type":"heading","attrs":{"level":2},"content":[{"text":"Submitting Jobs Using sbatch","type":"text"}]},{"type":"paragraph","content":[{"text":"Jobs can be submitted to the sbatch partitions using the ","type":"text"},{"text":"sbatch","type":"text","marks":[{"type":"code"}]},{"text":" command as follows:","type":"text"}]},{"type":"codeBlock","content":[{"text":" sbatch jobscriptfile","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"where ","type":"text"},{"text":"jobscriptfile","type":"text","marks":[{"type":"code"}]},{"text":" is the name of a UNIX format file containing special statements (corresponding to sbatch options), resource specifications and shell commands. Several example SLURM scripts are given below:","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Basic MPI Job","type":"text"}]},{"type":"codeBlock","content":[{"text":"#!/bin/bash\n#SBATCH --job-name=\"hellompi\"\n#SBATCH --output=\"hellompi.%j.%N.out\"\n#SBATCH --partition=compute\n#SBATCH --nodes=2\n#SBATCH --ntasks-per-node=128\n#SBATCH --mem=249325M \n#SBATCH --account=<>\n#SBATCH --export=ALL\n#SBATCH -t 01:30:00\n\n#This job runs with 2 nodes, 128 cores per node for a total of 256 tasks.\n\nmodule purge\nmodule load cpu\n#Load module file(s) into the shell environment\nmodule load gcc\nmodule load mvapich2\nmodule load slurm\n\nsrun --mpi=pmi2 -n 256 ../hello_mpi","type":"text"}]},{"type":"paragraph"},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse requires users to enter a valid project name; users can list valid project by running the ","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Basic OpenMP Job","type":"text"}]},{"type":"codeBlock","content":[{"text":"#!/bin/bash\n#SBATCH --job-name=\"hello_openmp\"\n#SBATCH --output=\"hello_openmp.%j.%N.out\"\n#SBATCH --partition=compute\n#SBATCH --nodes=1\n#SBATCH --ntasks-per-node=1\n#SBATCH --cpus-per-task=24\n#SBATCH --mem=249325M\n#SBATCH --account=<>\n#SBATCH --export=ALL\n#SBATCH -t 01:30:00\n\nmodule purge \nmodule load cpu\nmodule load slurm\nmodule load gcc\nmodule load openmpi\n\n#SET the number of openmp threads\nexport OMP_NUM_THREADS=24\n\n#Run the job\n./hello_openmp","type":"text"}]},{"type":"paragraph"},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse requires users to enter a valid project name; users can list valid project by running the ","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Hybrid MPI-OpenMP Job","type":"text"}]},{"type":"codeBlock","content":[{"text":"#!/bin/bash\n#SBATCH --job-name=\"hellohybrid\"\n#SBATCH --output=\"hellohybrid.%j.%N.out\"\n#SBATCH --partition=compute\n#SBATCH --nodes=1\n#SBATCH --ntasks-per-node=8\n#SBATCH --cpus-per-task=16\n#SBATCH --mem=249325M\n#SBATCH --account=<>\n#SBATCH --export=ALL\n#SBATCH -t 01:30:00\n\n#This job runs with 2 nodes, 24 cores per node for a total of 48 cores.\n# We use 8 MPI tasks and 6 OpenMP threads per MPI task\n\nmodule purge \nmodule load cpu\nmodule load slurm\nmodule load intel\nmodule load intel-mpi\n\nexport OMP_NUM_THREADS=16\nmpirun -genv I_MPI_PIN_DOMAIN=omp:compact ./hello_hybrid","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse require users to enter a valid project name; users can list valid project by running the ","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Using the Shared Partition","type":"text"}]},{"type":"codeBlock","content":[{"text":"#!/bin/bash\n#SBATCH -p shared\n#SBATCH --nodes=1\n#SBATCH --ntasks-per-node=8\n#SBATCH --mem=40G\n#SBATCH -t 01:00:00\n#SBATCH -J job.8\n#SBATCH -A <>\n#SBATCH -o job.8.%j.%N.out\n#SBATCH -e job.8.%j.%N.err\n#SBATCH --export=ALL\n\nexport SLURM_EXPORT_ENV=ALL\n\nmodule purge\nmodule load cpu\nmodule load gcc\nmodule load mvapich2\nmodule load slurm\n\nsrun -n 8 ../hello_mpi","type":"text"}]},{"type":"paragraph"},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse requires users to enter a valid project name; users can list valid project by running the ","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script.","type":"text"}]},{"type":"paragraph","content":[{"text":"The above script will run using 8 cores and 40 GB of memory. Please note that the performance in the shared partition may vary depending on how sensitive your application is to memory locality and the cores you are assigned by the scheduler. It is possible the 8 cores will span two sockets for example.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Using Large Memory Nodes","type":"text"}]},{"type":"paragraph","content":[{"text":"The large memory nodes can be accessed via the \"large-shared\" partition. Charges are based on either the number of cores or the fraction of the memory requested, whichever is larger. By default the system will only allocate 1 GB of memory per core. If additional memory is required, users should explicitly use the ","type":"text"},{"text":"--mem","type":"text","marks":[{"type":"code"}]},{"text":" directive. ","type":"text"}]},{"type":"paragraph","content":[{"text":"For example, on the \"large-shared\" partition, the following job requesting 128 cores and 2000 GB of memory (about 100% of 2TB of one node's available memory) for 1 hour will be charged 1024 SUs:","type":"text"}]},{"type":"paragraph","content":[{"text":"200/1455(memory) * 64(cores) * 1(duration) ~= 1024","type":"text"}]},{"type":"codeBlock","content":[{"text":"#SBATCH --partition=large-shared\n#SBATCH --nodes=1\n#SBATCH --ntasks-per-node=128\n#SBATCH --cpus-per-task=1\n#SBATCH --mem=2055652M\n\nexport OMP_PROC_BIND='true'","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"While there is not a separate 'large' partition, a job can still explicitly request all of the resources on a large memory node. Please note that there is no premium for using ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":"'s large memory nodes. Users are advised to request the large nodes only if they need the extra memory.","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Using GPU Nodes","type":"text"}]},{"type":"paragraph","content":[{"text":"GPU nodes are allocated as a separate resource. The GPU nodes can be accessed via either the \"gpu\" or the \"gpu-shared\" partitions.","type":"text"}]},{"type":"codeBlock","content":[{"text":"#SBATCH -p gpu","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"or","type":"text"}]},{"type":"codeBlock","content":[{"text":"#SBATCH -p gpu-shared","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"When users request 1 GPU, in gpu-shared partition, by default they will also receive, 1 CPU, and 1G memory. Here is an example AMBER script using the gpu-shared queue.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"GPU job","type":"text"}]},{"type":"codeBlock","content":[{"text":"#!/bin/bash\n#SBATCH --job-name=\"ambergpu\"\n#SBATCH --output=\"ambergpu.%j.%N.out\"\n#SBATCH --partition=gpu\n#SBATCH --nodes=1\n#SBATCH --gpus=4\n#SBATCH --mem=377393M\n#SBATCH --account=<>\n#SBATCH --no-requeue\n#SBATCH -t 01:00:00\n\nmodule purge\nmodule load gpu\nmodule load slurm\nmodule load openmpi\t\t\t\nmodule load amber\npmemd.cuda -O -i mdin.GPU -o mdout.GPU.$SLURM_JOBID -x mdcrd.$SLURM_JOBID \\\n -nf mdinfo.$SLURM_JOBID -1 mdlog.$SLURM_JOBID -p prmtop -c inpcrd","type":"text"}]},{"type":"paragraph"},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse requires users to enter a valid project name; users can list valid project by running the ","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"GPU-shared job","type":"text"}]},{"type":"codeBlock","content":[{"text":"#!/bin/bash\n#SBATCH --job-name=\"ambergpushared\"\n#SBATCH --output=\"ambergpu.%j.%N.out\"\n#SBATCH --partition=gpu\n#SBATCH --nodes=1\n#SBATCH --gpus=2\n#SBATCH --cpus-per-task=1\n#SBATCH --mem=93G\n#SBATCH --account=<>\n#SBATCH --no-requeue\n#SBATCH -t 01:00:00\n\nmodule purge\nmodule load gpu\nmodule load slurm\nmodule load openmpi\t\t\t\nmodule load amber\npmemd.cuda -O -i mdin.GPU -o mdout-OneGPU.$SLURM_JOBID -p prmtop -c inpcrd","type":"text"}]},{"type":"paragraph","content":[{"text":" ","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Expanse requires users to enter a valid project name; users can list valid project by running the ","type":"text"}]}]}]},{"type":"paragraph","content":[{"text":"expanse-client","type":"text","marks":[{"type":"code"}]},{"text":" script.","type":"text"}]},{"type":"paragraph","content":[{"text":" ","type":"text"}]},{"type":"paragraph","content":[{"text":"Users can find application specific example job script on the system in directory ","type":"text"},{"text":"/cm/shared/examples/sdsc/","type":"text","marks":[{"type":"code"}]},{"text":".","type":"text"}]},{"type":"paragraph","content":[{"text":"GPU modes can be controlled for jobs in the \"gpu\" partition. By default, the GPUs are in non-exclusive mode and the persistence mode is 'on'. If a particular \"gpu\" partition job needs exclusive access the following options should be set in your batch script:","type":"text"}]},{"type":"codeBlock","content":[{"text":"#SBATCH --constraint=exclusive","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"To turn persistence off add the following line to your batch script:","type":"text"}]},{"type":"codeBlock","content":[{"text":"#SBATCH --constraint=persistenceoff","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"The charging equation will be:","type":"text"}]},{"type":"paragraph","content":[{"text":"GPU SUs = (Number of GPUs) x (wallclock time)","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"SLURM No-Requeue Option","type":"text"}]},{"type":"paragraph","content":[{"text":"SLURM will requeue jobs if there is a node failure. However, in some cases this might be detrimental if files get overwritten. If users wish to avoid automatic requeue, the following line should be added to their script:","type":"text"}]},{"type":"paragraph","content":[{"text":"#SBATCH --no-requeue","type":"text"}]},{"type":"paragraph","content":[{"text":"The 'requeue' count limit is currently set to 5. The job will be requeued 5 times after which the job will be placed in the REQUEUE_HOLD state and the job must be canceled and resubmitted.","type":"text"}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Example Scripts for Applications","type":"text"}]},{"type":"paragraph","content":[{"text":"SDSC User Services staff have developed sample run scripts for common applications. They are available in the ","type":"text"},{"text":"/cm/shared/examples","type":"text","marks":[{"type":"code"}]},{"text":" directory on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":".","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Job Dependencies","type":"text"}]},{"type":"paragraph","content":[{"text":"There are several scenarios (e.g. splitting long running jobs, workflows) where users may require jobs with dependencies on successful completions of other jobs. In such cases, SLURM's ","type":"text"},{"text":"--dependency","type":"text","marks":[{"type":"code"}]},{"text":" option can be used. The syntax is as follows:","type":"text"}]},{"type":"codeBlock","content":[{"text":"[user@login01-expanse ~]$ sbatch --dependency=afterok:jobid jobscriptfile","type":"text"}]},{"type":"paragraph"},{"type":"heading","attrs":{"level":2},"content":[{"text":"Job Monitoring and Management","type":"text"}]},{"type":"paragraph","content":[{"text":"Users can monitor jobs using the ","type":"text"},{"text":"squeue","type":"text","marks":[{"type":"code"}]},{"text":" command.","type":"text"}]},{"type":"codeBlock","content":[{"text":"[user@expanse ~]$ squeue -u user1\n\n JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n 256556 compute raxml_na user1 R 2:03:57 4 expanse-14-[11-14]\n 256555 compute raxml_na user1 R 2:14:44 4 expanse-02-[06-09]","type":"text"}]},{"type":"paragraph"},{"type":"paragraph","content":[{"text":"In this example, the output lists two jobs that are running in the \"compute\" partition. The jobID, partition name, job names, user names, status, time, number of nodes, and the node list are provided for each job. Some common ","type":"text"},{"text":"squeue","type":"text","marks":[{"type":"code"}]},{"text":" options include:","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"dbf6fa45-391f-4370-8b19-d3b1bd17c2f4"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"OPTION","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"RESULT","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"-i ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Repeatedly report at intervals (in seconds)","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"-ij","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Displays information for specified job(s)","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"-p ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Displays information for specified partitions (queues)","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"-t ","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"Shows jobs in the specified state(s)","type":"text"}]}]}]}]},{"type":"paragraph","content":[{"text":"Users can cancel their own jobs using the ","type":"text"},{"text":"scancel","type":"text","marks":[{"type":"code"}]},{"text":" command as follows:","type":"text"}]},{"type":"codeBlock","content":[{"text":"[user@expanse ~]$ scancel ","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Globus: SDSC Collections, Data Movers and Mount Points","type":"text"}]},{"type":"paragraph","content":[{"text":"All of ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":"'s Lustre filesystems are accessible via the SDSC Expanse specific collections(SDSC HPC - Expanse Lustre ; *SDSC HPC - Projects) . The following table shows the mount points on the data mover nodes (that are the backend for ).","type":"text"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"3a996d8f-ba33-4b44-8a81-eadf62ed89e6"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"MACHINE","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"LOCATION ON MACHINE","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"LOCATION ON GLOBUS/DATA MOVERS","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"Expanse","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"/expanse/lustre/projects","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"/projects/","type":"text","marks":[{"type":"code"}]},{"text":"…","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"Expanse","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"/expanse/lustre/scratch","type":"text","marks":[{"type":"code"}]}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"/scratch/...","type":"text","marks":[{"type":"code"}]}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"*Expanse","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"/expanse/projects","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[226.67]},"content":[{"type":"paragraph","content":[{"text":"/","type":"text"}]}]}]}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Storage Overview","type":"text"}]},{"type":"paragraph","content":[{"text":"Users are responsible for backing up all important data in case of unexpected data loss at SDSC","type":"text"}]},{"type":"paragraph","content":[{"text":"The SDSC Expanse Lustre file system (including ","type":"text"},{"text":"/expanse/lustre/scratch","type":"text","marks":[{"type":"code"}]},{"text":" and ","type":"text"},{"text":"/expanse/lustre/project","type":"text","marks":[{"type":"code"}]},{"text":") ","type":"text"},{"text":"IS NOT","type":"text","marks":[{"type":"strong"}]},{"text":" an archival file system. The SDSC Expanse Lustre file system ","type":"text"},{"text":"IS NOT","type":"text","marks":[{"type":"strong"}]},{"text":" backed up. SDSC will enforce a strict purge policy on the Expanse Lustre filesystem. ","type":"text"},{"text":"Project space will be purged 90 days after allocation expires. Scratch files will be purged 90 days from creation date.","type":"text","marks":[{"type":"strong"}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Local Scratch Disk","type":"text"}]},{"type":"paragraph","content":[{"text":"The compute nodes on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" have access to fast flash storage. There is 1TB of SSD space available for use on each compute node. The latency to the SSDs is several orders of magnitude lower than that for spinning disk (<100 microseconds vs. milliseconds) making them ideal for user-level check pointing and applications that need fast random I/O to large scratch files. Users can access the SSDs only during job execution under the following directories local to each compute node:","type":"text"}]},{"type":"paragraph","content":[{"text":"/scratch/$USER/job_$SLURM_JOB_ID","type":"text","marks":[{"type":"code"}]},{"type":"hardBreak"}]},{"type":"table","attrs":{"layout":"default","width":760.0,"localId":"e5de49e3-a04e-487e-8e53-15c6ac14301f"},"content":[{"type":"tableRow","content":[{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"PARTITION","type":"text"}]}]},{"type":"tableHeader","attrs":{"colspan":1,"background":"#eeeeee","rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"SPACE AVAILABLE","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"compute,shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"1 TB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"gpu, gpu-shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"1.6TB","type":"text"}]}]}]},{"type":"tableRow","content":[{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"large-shared","type":"text"}]}]},{"type":"tableCell","attrs":{"colspan":1,"rowspan":1,"colwidth":[340.0]},"content":[{"type":"paragraph","content":[{"text":"3.2 TB","type":"text"}]}]}]}]},{"type":"paragraph"},{"type":"heading","attrs":{"level":2},"content":[{"text":"Parallel Lustre Filesystems","type":"text"}]},{"type":"paragraph","content":[{"text":"In addition to the local scratch storage, users will have access to global parallel filesystems on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":". Every ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" node has access to a 12 PB Lustre parallel file system (provided by Aeon Computing) and a 7 PB Ceph Object Store system, 140 GB/second performance storage. SDSC limits the number of files that can be stored in the ","type":"text"},{"text":"/lustre/scratch","type":"text","marks":[{"type":"code"}]},{"text":" filesystem to 2 million files per user. Users should contact support for assistance at the ","type":"text"},{"text":"ACCESS Help Desk","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/467109484"}}]},{"text":" if their workflow requires extensive small I/O, to avoid causing system issues associated with load on the metadata server.","type":"text"}]},{"type":"paragraph","content":[{"text":"The two Lustre filesystems available on ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" are:","type":"text"}]},{"type":"bulletList","content":[{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Lustre ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" scratch filesystem: ","type":"text"},{"text":"/expanse/lustre/scratch/$USER/temp_project","type":"text","marks":[{"type":"code"}]}]}]},{"type":"listItem","content":[{"type":"paragraph","content":[{"text":"Lustre NSF projects filesystem: ","type":"text"},{"text":"/expanse/lustre/projects/","type":"text","marks":[{"type":"code"}]}]}]}]},{"type":"heading","attrs":{"level":3},"content":[{"text":"Submitting Jobs Using Lustre","type":"text"}]},{"type":"paragraph","content":[{"text":"Jobs that need to use the Lustre filesystem should explicitly reqeust the feature by including the following line to their script:","type":"text"}]},{"type":"paragraph","content":[{"text":"#SBATCH --constraint=\"lustre\"","type":"text"}]},{"type":"paragraph","content":[{"text":"This constraint can be used in combination with any other constraints you are already using. For example:","type":"text"}]},{"type":"paragraph","content":[{"text":"#SBATCH --constraint=\"lustre&persistenceoff&exclusive\"","type":"text"}]},{"type":"paragraph","content":[{"text":" Jobs submitted without --constraint=\"lustre\" that need the Lustre filesystem will be scheduled on nodes without Lustre and will FAIL.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Home File System","type":"text"}]},{"type":"paragraph","content":[{"text":"After logging in, users are placed in their home directory, /home, also referenced by the environment variable ","type":"text"},{"text":"$HOME","type":"text","marks":[{"type":"code"}]},{"text":". The home directory is limited in space and should be used only for source code storage. User will have access to 100GB in ","type":"text"},{"text":"/home","type":"text","marks":[{"type":"code"}]},{"text":". Jobs should ","type":"text"},{"text":"never","type":"text","marks":[{"type":"strong"}]},{"text":" be run from the home file system, as it is not set up for high performance throughput. Users should keep usage on ","type":"text"},{"text":"$HOME","type":"text","marks":[{"type":"code"}]},{"text":" under 100GB. Backups are currently being stored on a rolling 8-week period. In case of file corruption/data loss, please contact us at ","type":"text"},{"text":"ACCESS Help Desk","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/467109484"}}]},{"text":" to retrieve the requested files.","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Composable Systems","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" also supports Composable Systems, allowing researchers to create a virtual 'tool set' of resources, such as Kubernetes resources, for a specific project and then re-compose it as needed. ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" will also feature direct scheduler integration with the major cloud providers, leveraging high-speed networks to ease data movement to and from the cloud.","type":"text"}]},{"type":"paragraph","content":[{"text":"All Composable System requests must include a brief justification, specifically describing why a Composable System is required for the project.","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Software","type":"text"}]},{"type":"paragraph","content":[{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":" supports a broad application base with installs and modules for commonly used packages in bioinformatics, molecular dynamics, machine learning, quantum chemistry, structural mechanics, and visualization, and will continue to support Singularity-based containerization in ","type":"text"},{"text":"Expanse","type":"text","marks":[{"type":"em"}]},{"text":". Users can search for available software on ACCESS resources with the ","type":"text"},{"text":"ACCESS software search tool","type":"text","marks":[{"type":"link","attrs":{"href":"https://access-ci.atlassian.net/wiki/spaces/ACCESSdocumentation/pages/682491905"}}]},{"text":".","type":"text"}]},{"type":"paragraph","content":[{"text":"Back to top","type":"text","marks":[{"type":"link","attrs":{"href":"#"}}]}]},{"type":"heading","attrs":{"level":1},"content":[{"text":"Publications","type":"text"}]}],"version":1}