Introduction

Toil runs in various environments, including locally and in the cloud (Amazon Web Services and Google Compute Engine). Toil supports workflows written in two DSLs, CWL and WDL, as well as workflows written directly in Python (see Developing a Python Workflow).

Toil is built in a modular way so that it can be used on many different systems and with different configurations. The three configurable pieces are the

  • Job Store: A file path or URL that can host and centralize all files for a workflow (e.g. a local directory, or an AWS S3 bucket URL).

  • Batch System: Specifies either local single-machine execution or a currently supported HPC environment (lsf, mesos, slurm, torque, htcondor, kubernetes, or grid_engine).

  • Provisioner: For running in the cloud only. This specifies which cloud provider supplies the instances that do the “work” of your workflow.
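The job store and batch system are typically selected on the command line when a workflow is launched. As a sketch (the script name and job store path below are placeholders):

```shell
# Run a Python workflow with a file job store on the local machine;
# single-machine execution is used when no batch system is specified.
python workflow.py file:my-jobstore

# The same workflow dispatched to a Slurm cluster instead:
python workflow.py file:my-jobstore --batchSystem slurm
```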

Job Store

The job store is a storage abstraction which contains all of the information used in a Toil run. This centralizes all of the files used by jobs in the workflow and also the details of the progress of the run. If a workflow crashes or fails, the job store contains all of the information necessary to resume with minimal repetition of work.

Several different job stores are supported, including the file job store and cloud job stores. For information on developing job stores, see Job Store API.

File Job Store

The file job store is for use locally, and keeps the workflow information in a directory on the machine where the workflow is launched. This is the simplest and most convenient job store for testing or for small runs.

For an example that uses the file job store, see Running a basic CWL workflow.
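As a rough illustration, a minimal Python workflow run against a file job store might look like the following sketch (this assumes the toil package is installed; the job store is supplied as a positional command-line argument, e.g. `python workflow.py file:my-jobstore`):

```python
# Minimal Toil Python workflow sketch; requires the `toil` package.
from toil.common import Toil
from toil.job import Job

def hello(job, name):
    # Each job function receives a `job` handle for logging and file access.
    job.log(f"Hello, {name}!")
    return name

if __name__ == "__main__":
    # The default argument parser accepts the job store location as a
    # positional argument.
    parser = Job.Runner.getDefaultArgumentParser()
    options = parser.parse_args()
    with Toil(options) as toil:
        toil.start(Job.wrapJobFn(hello, "world"))
```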

Cloud Job Stores

Toil currently supports the following cloud storage systems as job stores:

  • AWS Job Store: An AWS S3 bucket formatted as “aws:<region>:<bucketname>” where only numbers, letters, and dashes are allowed in the bucket name. Example: aws:us-west-2:my-aws-jobstore-name.

  • Google Job Store: A Google Cloud Storage bucket formatted as “gce:<zone>:<bucketname>” where only numbers, letters, and dashes are allowed in the bucket name. Example: gce:us-west2-a:my-google-jobstore-name.

These use cloud buckets to house all of the files. This is useful if there are several different worker machines all running jobs that need to access the job store.
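With a cloud job store, the same workflow can be pointed at a bucket instead of a local directory. For example (the bucket names below are placeholders):

```shell
# AWS S3 job store in the us-west-2 region:
python workflow.py aws:us-west-2:my-aws-jobstore-name

# Google Cloud Storage job store in the us-west2-a zone:
python workflow.py gce:us-west2-a:my-google-jobstore-name
```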

Batch System

A Toil batch system is either a local single machine (one computer) or a currently supported cluster of computers (lsf, mesos, slurm, torque, htcondor, kubernetes, or grid_engine). These environments manage individual worker nodes under a leader node to process the work required in a workflow. The leader and its workers all coordinate their tasks and files through a centralized job store location.

See Batch System API for a more detailed description of different batch systems, or information on developing batch systems.

Provisioner

The Toil provisioner provides a tool set for running a Toil workflow on a particular cloud platform.

The Toil Cluster Utilities are command line tools used to provision nodes in your desired cloud platform. They allow you to launch nodes, SSH to the leader, and rsync files back and forth.
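A typical session with the cluster utilities might look like the following sketch, assuming the AWS provisioner; the cluster name, zone, key pair, and node type are placeholders:

```shell
# Launch a cluster on AWS:
toil launch-cluster my-cluster --provisioner aws --zone us-west-2a \
    --keyPairName my-key --leaderNodeType t2.medium

# Open an SSH session on the leader node:
toil ssh-cluster --provisioner aws my-cluster

# Copy a workflow script to the leader with rsync:
toil rsync-cluster --provisioner aws my-cluster workflow.py :/root

# Tear the cluster down when finished:
toil destroy-cluster --provisioner aws my-cluster
```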

For detailed instructions for using the provisioner see Running in AWS or Running in Google Compute Engine (GCE).