Globus Compute on ACCESS

Globus Compute: Fire-and-forget remote computing on ACCESS resources

Globus Compute (previously known as funcX) is a federated Function-as-a-Service (FaaS) platform that drastically simplifies remote execution, enabling users to run Python functions effortlessly on remote computers (including most ACCESS resource providers, as well as laptops, edge devices, and HPC clusters). Globus Compute's managed FaaS platform allows users to outsource the management of remote execution: users submit functions for execution; the cloud service dispatches each function to a remote endpoint and stores results until they are retrieved. Globus Compute provides connectors for ACCESS resource schedulers (e.g., Slurm) and can be configured to provision resources dynamically when needed.

Benefits of Globus Compute to ACCESS researchers

  • Remote Execution: ACCESS researchers can run computations on remote resources, such as HPC clusters and cloud platforms, from their local computing environment (e.g., a Jupyter notebook on their laptop), without needing SSH connections.

  • Fire-and-Forget Execution: Outsource the complexity of running compute tasks to a reliable cloud-hosted service. Globus Compute stores functions for execution, securely dispatches them to remote endpoints, and reliably stores results until retrieved by the user, who can detach from the service after submitting tasks.

  • Python: Orchestrate workloads from within your local Python environment and manage execution in Python environments on remote resources, without needing to learn a new specification language.  

  • Portability: Easily move workloads between remote computing resources by simply changing the endpoint ID, without changing the code.

  • Scalability: Globus Compute's support for dynamic allocation of computing resources based on workload means that researchers can scale their computations as needed, ensuring that their analyses can handle large datasets and complex models, without learning scheduler APIs.

  • Workflows: Globus Compute functions can be combined into workflows using Globus’ cloud-hosted Flows platform. Create sophisticated, multi-step workflows that span computing resources and integrate with an ecosystem of connected services, without learning to install and manage complex workflow software.

Getting started with Globus Compute

Globus Compute implements a hybrid cloud architecture with two user-facing components: the endpoint, a user-managed software agent that must be deployed on a compute resource to make it accessible for function execution; and the Globus Compute SDK, which provides a Python API for registration, execution, and management of functions across endpoints.

One-time Endpoint Deployment

To use Globus Compute on a resource provider, each researcher must first deploy their own Globus Compute endpoint. To do so, log in to the resource provider (e.g., via SSH or Open OnDemand) and install and configure the Globus Compute endpoint. The endpoint can be deployed in any Python environment (system, Conda, or virtual environment). Note: functions will execute in the Python environment in which the endpoint is deployed.

Installation is entirely Python-based using pip. You can install and configure the endpoint as follows:

$ python3 -m pip install globus-compute-endpoint
$ globus-compute-endpoint configure <ENDPOINT_NAME>
$ globus-compute-endpoint start <ENDPOINT_NAME>

The first time you start the endpoint, you will be asked to authenticate. This important step ties your endpoint to your identity and ensures that only you can access your endpoint. Authentication uses Globus Auth, enabling you to authenticate with any external identity (e.g., your campus or ACCESS account). Note: you do not need to use your ACCESS account. Please ensure you use the same identity when installing your endpoint as you will use with the SDK (i.e., when you register, execute, and manage your functions).

Configuring your endpoint for local batch schedulers

Globus Compute endpoints act as gateways to diverse computational resources, including clusters, clouds, supercomputers, and even your laptop. By default, an endpoint executes functions on the node on which it is deployed (i.e., the login node). To make use of ACCESS compute resources that use a batch scheduler, the endpoint should be configured to match the capabilities of the resource on which it is deployed.

After you configure an endpoint, a config.yaml file is created in $HOME/.globus_compute/<ENDPOINT_NAME>/. You can edit this configuration file to match your specific HPC environment (e.g., define allocations, queues, and number of nodes). After updating the configuration file, you must restart the endpoint. Example configurations for common ACCESS resource providers are available in the Globus Compute documentation.
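As an illustrative sketch only, a Slurm-backed config.yaml might look like the fragment below; the partition, account, and block limits are placeholder values, so consult your resource provider's documentation for the settings that apply to your site:

```yaml
engine:
  type: GlobusComputeEngine
  provider:
    type: SlurmProvider
    partition: compute          # placeholder: your site's partition/queue name
    account: my-allocation      # placeholder: your allocation/account ID
    nodes_per_block: 1          # nodes requested per batch job
    init_blocks: 0              # no jobs submitted until tasks arrive
    min_blocks: 0
    max_blocks: 2               # scale out to at most 2 batch jobs
    walltime: 00:30:00
```

After editing the file, apply the changes with `globus-compute-endpoint restart <ENDPOINT_NAME>`.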

See the Globus Compute Endpoints documentation for more details.

If you have any trouble configuring an endpoint, please contact support@globus.org for assistance. 

Running functions

The Globus Compute SDK provides client classes for interacting with Globus Compute. The client abstracts authentication and provides a Python interface for running functions. We suggest using the Globus Compute Executor (a subclass of Python's concurrent.futures.Executor) as it provides a simple interface for running many functions.
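Because the Globus Compute Executor follows the standard concurrent.futures.Executor interface, the submit/result pattern can be sketched locally with the standard library's ThreadPoolExecutor as a stand-in (no endpoint or authentication required); with Globus Compute you would substitute an Executor bound to your endpoint ID:

```python
from concurrent.futures import ThreadPoolExecutor

def double(x):
    return x * 2

# ThreadPoolExecutor stands in here for globus_compute_sdk.Executor;
# both implement the same submit()/Future interface.
with ThreadPoolExecutor() as ex:
    future = ex.submit(double, 7)  # returns a Future immediately
    print(future.result())         # blocks until the result arrives: 14
```

The call to submit() returns right away with a Future; result() is where the program waits, which is what allows fire-and-forget submission of many tasks before collecting any results.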

When you instantiate a Globus Compute client, a one-time authentication process will enable you to authenticate via Globus Auth using any accessible identity. Please ensure you use the same identity you used when deploying your endpoint.

The following example shows how to run a Python function on an endpoint. You can use the tutorial endpoint specified here (running in the cloud) or replace the endpoint ID with that of the endpoint you deployed above.

from globus_compute_sdk import Executor

def double(x):
    return x * 2

tutorial_endpoint_id = '4b116d3c-1703-4f8f-9f6f-39921e5864df'

with Executor(endpoint_id=tutorial_endpoint_id) as gce:
    fut = gce.submit(double, 7)
    print(fut.result())

Please see the Globus documentation for more examples of using the Executor and SDK.
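The same Executor interface scales naturally from one task to many. The fan-out pattern below is sketched with the standard library's ThreadPoolExecutor as a local stand-in so it runs anywhere; pointed at a real endpoint, you would use Executor(endpoint_id=...) from globus_compute_sdk instead, and the rest of the code is unchanged:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(x):
    return x * x

# Submit a batch of tasks, then collect results as each one finishes.
# as_completed() yields futures in completion order, not submission order.
with ThreadPoolExecutor() as ex:
    futures = {ex.submit(square, n): n for n in range(5)}
    for fut in as_completed(futures):
        print(futures[fut], "->", fut.result())
```

With a batch-scheduler-backed endpoint, submitting many tasks like this is what triggers the endpoint's dynamic provisioning described above.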

What’s next