ACCESS Pegasus
Logging In / Jupyter
To get started, open https://pegasus.access-ci.org in a web browser and log in with your ACCESS credentials.
Allocation Optional for Tutorial Workflows
Typically, running workflows on ACCESS Pegasus requires users to link their own allocations. However, the initial notebooks in this guide are preconfigured to run on a modest resource bundled with ACCESS Pegasus. As you progress to the more complex sample workflows, you will need to use your own allocation.
Creating Workflows
Looking at examples of solutions that have already been implemented can be very helpful. With that in mind, we have created a collection of sample workflows that can be conveniently explored using our web-based Jupyter notebooks.
The examples can be found in your $HOME directory, under ACCESS-Pegasus-Examples/.
In Jupyter, navigate to the example you are interested in, and step through the notebook. Once the workflow is submitted, you have to add compute resources with HTCondor Annex.
The first few notebooks are set up as a self-guided introduction to Pegasus. The final example is a complete workflow focused on automating the variant calling process, and it was adapted from the Data Carpentry Lesson on Data Wrangling and Processing for Genomics. This particular workflow involves downloading and aligning SRA data to the E. coli REL606 reference genome, and identifying any differences between the reads and the genome. Additionally, it performs variant calling in order to track changes in the population over time.
For a full description of how to create workflows, please see the Pegasus user guide.
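As a quick orientation, a minimal workflow built with the Pegasus Python API might look like the sketch below. The site handle condorpool and the /bin/echo transformation are assumptions for illustration; the notebooks in ACCESS-Pegasus-Examples/ contain complete, tested versions.
#!/usr/bin/env python3
from Pegasus.api import *

# Describe the executable to run (assumes /bin/echo is available on the execution site)
tc = TransformationCatalog()
echo = Transformation("echo", site="condorpool", pfn="/bin/echo", is_stageable=False)
tc.add_transformations(echo)

# A one-job workflow
wf = Workflow("hello-access-pegasus")
wf.add_transformation_catalog(tc)
wf.add_jobs(Job(echo).add_args("Hello from ACCESS Pegasus"))

# Plan and submit to HTCondor
wf.plan(submit=True)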
Job Routing / Resource Provisioning
ACCESS Pegasus enables jobs to flow to a set of different resources, some of which are always available and some of which have to be explicitly provisioned by users when needed. Note that by default a job will try to go anywhere it can - you might have to exclude resources if there are places you do not want your jobs to run.
The following figure shows an overview of the resources, followed by a more detailed discussion of each.
TestPool
The TestPool consists of a small number of cores, available for anyone to use at any time, even without an allocation. These are meant to be used for jobs with quick turnaround time, such as tutorials, development, and debugging.
You can see the state of the TestPool by running:
condor_status -const 'TestPool =?= True'
If you do not want your jobs to run on the TestPool, please add TestPool =!= True to your job requirements.
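For example, following the same property pattern used for the OSPool below, the requirement can be set as an HTCondor profile. This is a sketch: props is the Properties object used in the example notebooks, and the condorpool site handle may differ in your workflow.
# Keep jobs off the TestPool by extending the HTCondor job requirements
props.add_site_profile("condorpool", "condor", "requirements", "TestPool =!= True")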
Cloud
Adding cloud resources, using your own allocation, is done by starting a provided VM image, and injecting a provided token for authentication. The VMs join the pool and start running jobs. When there are no more jobs, the VMs shut themselves down.
More details on how to provide cloud resources
HTCondor Annex
ACCESS HPC Resources can be brought in with the HTCondor Annex tool, by sending pilot jobs (also called glideins) to the clusters. The pilots will run under your ACCESS allocation, and have the following properties:
A pilot can run multiple user jobs - it stays active until no more user jobs are available or until its end of life has been reached, whichever comes first.
A pilot is partitionable - job slots will dynamically be created based on the resource requirements in the user jobs. This means you can fit multiple user jobs on a compute node at the same time.
A pilot will only run jobs for the user who started it.
Annexes can be named, and jobs can be configured to only go to certain named Annexes. By default, the annexes are named with your username.
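For example, to steer jobs to a specific named annex, the annex name can be attached to jobs as an HTCondor job attribute, following the same property pattern as the OSPool examples below. This is only a sketch: the attribute name +TargetAnnexName and the annex name myannex are assumptions, so check the HTCondor Annex documentation linked below for the exact attribute expected by your pool.
# Route jobs only to the annex named "myannex" (attribute name is an assumption)
props.add_site_profile("condorpool", "condor", "+TargetAnnexName", "\"myannex\"")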
More details on how to use the HTCondor Annex
OSPool
The OSPool is always connected to ACCESS Pegasus, but requires jobs to have an OSG project name specified. If you have an ACCESS allocation on OSG, you can use the “TG-NNNNNN” allocation id as project name. Or, if you have an OSG assigned project name, you may use that. You can specify the project name in your workflows like:
props.add_site_profile("condorpool", "condor", "+ProjectName", "\"TG-NNNNNN\"")
Also note that the OSPool uses a different approach to containers. Instead of using Pegasus’ built-in container execution, create non-container jobs, with a property specifying the container to use:
props.add_site_profile("condorpool", "condor", "+SingularityImage", "\"/cvmfs/singularity.opensciencegrid.org/htc/rocky:8\"")
More information about containers on the OSPool can be found in the OSG documentation.
More details on how to use the OSPool
Manual Glideins
This is a great solution for campus clusters, as well as clusters behind firewalls and MFA. The glideins are submitted as regular jobs. Details can be found in the GitHub repository.
Frequently Asked Questions
Can I use Pegasus for my HPC workloads?
HPC jobs require a Pegasus installation at the resource provider. This will be explored later in the ACCESS Pegasus pilot, but please let us know if you have an interest in this and what your requirements are.