Toil Cluster Utilities¶
In addition to the generic Toil Utilities, there are several utilities used for starting and managing a Toil cluster using the AWS or GCE provisioners. They are installed via the [aws] or [google] extra. For installation details see Toil Provisioner.
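For example, the AWS extra can be installed with pip:
$ pip install toil[aws]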
The toil cluster subcommands are:
destroy-cluster — For autoscaling. Terminates the specified cluster and associated resources.
launch-cluster — For autoscaling. This is used to launch a toil leader instance with the specified provisioner.
rsync-cluster — For autoscaling. Used to transfer files to a cluster launched with toil launch-cluster.
ssh-cluster — SSHs into the toil appliance container running on the leader of the cluster.
For information on a specific utility, run it with the --help option:
toil launch-cluster --help
The cluster utilities can be used for Running in Google Compute Engine (GCE) and Running in AWS.
Tip
By default, all of the cluster utilities expect to be running on AWS. To run with Google you will need to specify the --provisioner gce option for each utility.
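For example, to shut down a Google cluster (the cluster name here is illustrative):
$ toil destroy-cluster --provisioner gce CLUSTER-NAME-HERE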
Note
Boto must be configured with AWS credentials before using cluster utilities. Running in Google Compute Engine (GCE) contains instructions for setting up Google credentials.
Launch-Cluster Command¶
Running toil launch-cluster starts up a leader for a cluster. Workers can be added to the initial cluster by specifying the -w option. An example would be
$ toil launch-cluster my-cluster \
--leaderNodeType t2.small -z us-west-2a \
--keyPairName your-AWS-key-pair-name \
--nodeTypes m3.large,t2.micro -w 1,4
Options are listed below. These can also be displayed by running
$ toil launch-cluster --help
launch-cluster’s main positional argument is the clusterName. This is simply the name of your cluster. If it does not exist yet, Toil will create it for you.
Launch-Cluster Options
- --help
-h also accepted. Displays this help menu.
- --tempDirRoot TEMPDIRROOT
Path to the temporary directory where all temp files are created; by default the current working directory is used as the base.
- --version
Display version.
- --provisioner CLOUDPROVIDER
-p CLOUDPROVIDER also accepted. The provisioner for cluster auto-scaling. Both AWS and GCE are currently supported.
- --zone ZONE
-z ZONE also accepted. The availability zone of the leader. This parameter can also be set via the TOIL_AWS_ZONE or TOIL_GCE_ZONE environment variables, or by the ec2_region_name parameter in your .boto file if using AWS, or derived from the instance metadata if using this utility on an existing EC2 instance.
- --leaderNodeType LEADERNODETYPE
Non-preemptable node type to use for the cluster leader.
- --keyPairName KEYPAIRNAME
The name of the AWS or ssh key pair to include on the instance.
- --owner OWNER
The owner tag for all instances. If not given, the value in TOIL_OWNER_TAG will be used, or else the value of --keyPairName.
- --boto BOTOPATH
The path to the boto credentials directory. This is transferred to all nodes in order to access the AWS jobStore from non-AWS instances.
- --tag KEYVALUE
KEYVALUE is specified as KEY=VALUE. -t KEY=VALUE also accepted. Tags are added to the AWS cluster for this node and all of its children. Tags are of the form: -t key1=value1 --tag key2=value2. Multiple tags are allowed and each tag needs its own flag (as shown in the example after this list). By default the cluster is tagged with: { "Name": clusterName, "Owner": IAM username }.
- --vpcSubnet VPCSUBNET
VPC subnet ID to launch cluster leader in. Uses default subnet if not specified. This subnet needs to have auto assign IPs turned on.
- --nodeTypes NODETYPES
Comma-separated list of node types to create while launching the leader. The syntax for each node type depends on the provisioner used. For the AWS provisioner this is the name of an EC2 instance type followed by a colon and the price in dollars to bid for a spot instance, for example 'c3.8xlarge:0.42'. Must also provide the --workers argument to specify how many workers of each node type to create.
- --workers WORKERS
-w WORKERS also accepted. Comma-separated list of the number of workers of each node type to launch alongside the leader when the cluster is created. This can be useful when running Toil without autoscaling but with a need for more hardware.
- --leaderStorage LEADERSTORAGE
Specify the size (in gigabytes) of the root volume for the leader instance. This is an EBS volume.
- --nodeStorage NODESTORAGE
Specify the size (in gigabytes) of the root volume for any worker instances created when using the -w flag. This is an EBS volume.
- --nodeStorageOverrides NODESTORAGEOVERRIDES
Comma-separated list of nodeType:nodeStorage pairs used to override the default value from --nodeStorage for the specified nodeType(s). This is useful for heterogeneous jobs where some tasks require much more disk than others (see the example after this list).
- --allowFuse BOOL
Whether to allow FUSE mounts for faster runtimes with Singularity. Note: This will result in the Toil container running as privileged. For Kubernetes, pods will be asked to run as privileged. If this is not allowed, Singularity containers will use sandbox directories instead.
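Putting several of these options together, a launch command might look like the following sketch; the instance types, spot bid, volume sizes, and tag values are illustrative, not recommendations:
$ toil launch-cluster my-cluster \
      --provisioner aws -z us-west-2a \
      --keyPairName your-AWS-key-pair-name \
      --leaderNodeType t2.medium --leaderStorage 100 \
      --nodeTypes c3.8xlarge:0.42,t2.micro -w 2,1 \
      --nodeStorage 50 --nodeStorageOverrides c3.8xlarge:150 \
      -t project=demo --tag owner=me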
Logging Options
- --logOff
Same as --logCritical.
- --logCritical
Turn on logging at level CRITICAL and above. (default is INFO)
- --logError
Turn on logging at level ERROR and above. (default is INFO)
- --logWarning
Turn on logging at level WARNING and above. (default is INFO)
- --logInfo
Turn on logging at level INFO and above. (default is INFO)
- --logDebug
Turn on logging at level DEBUG and above. (default is INFO)
- --logTrace
Turn on logging at level TRACE and above. (default is INFO)
- --logLevel LOGLEVEL
Log at given level (may be either OFF (or CRITICAL), ERROR, WARN (or WARNING), INFO, DEBUG, or TRACE). (default is INFO)
- --logFile LOGFILE
File to write logs to.
- --rotatingLogging
Turn on rotating logging, which prevents log files from getting too big.
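As an illustration, the logging options can be combined with any of the cluster utilities; the log file name below is hypothetical:
$ toil launch-cluster my-cluster \
      --leaderNodeType t2.small -z us-west-2a \
      --keyPairName your-AWS-key-pair-name \
      --logLevel DEBUG --logFile launch.log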
Ssh-Cluster Command¶
Toil provides the ability to ssh into the leader of the cluster. This can be done as follows:
$ toil ssh-cluster CLUSTER-NAME-HERE
This will open a shell on the Toil leader, from which you can start a Running a Workflow with Autoscaling run. Issues with Docker prevent using screen and tmux when sshing into the cluster (the shell doesn't know that it is a TTY, which prevents it from allocating a new screen session). This can be worked around via
$ script
$ screen
Simply running screen within script will get things working properly again.
Finally, you can execute remote commands with the following syntax:
$ toil ssh-cluster CLUSTER-NAME-HERE remoteCommand
It is not advised that you run your Toil workflow using remote execution like this unless a tool like nohup is used to ensure the process does not die if the SSH connection is interrupted.
For an example usage, see Running a Workflow with Autoscaling.
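As a sketch of the nohup approach, the remote command could be wrapped so it survives a dropped connection; the workflow script path and job store locator are hypothetical, and the exact quoting may depend on how ssh-cluster forwards arguments:
$ toil ssh-cluster CLUSTER-NAME-HERE \
      bash -c 'nohup python /root/myWorkflow.py aws:us-west-2:my-jobstore > workflow.log 2>&1 &'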
Rsync-Cluster Command¶
The most frequent use case for the rsync-cluster utility is deploying your workflow code to the Toil leader. Note that the syntax is the same as traditional rsync, with the exception of the hostname before the colon. This is not needed in toil rsync-cluster since the hostname is automatically determined by Toil.
Here is an example of its usage:
$ toil rsync-cluster CLUSTER-NAME-HERE \
~/localFile :/remoteDestination
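The same mechanism works in the other direction; for example, to fetch a file from the leader (the remote path is illustrative):
$ toil rsync-cluster CLUSTER-NAME-HERE \
      :/remoteFile ~/localDestination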
Destroy-Cluster Command¶
The destroy-cluster command is the advised way to get rid of any Toil cluster launched using the Launch-Cluster Command. It ensures that all attached nodes, volumes, security groups, etc. are deleted. If a node or cluster is shut down using Amazon's online portal, residual resources may still be in use in the background. To delete a cluster run
$ toil destroy-cluster CLUSTER-NAME-HERE