toil.provisioners.abstractProvisioner

Attributes

a_short_time

logger

Exceptions

ClusterTypeNotSupportedException

Indicates that a provisioner does not support a given cluster type.

ManagedNodesNotSupportedException

Raised when attempting to add managed nodes (which autoscale up and down by themselves, without the provisioner doing the work) to a provisioner that does not support them.

Classes

Node

Shape

Represents a job or a node's "shape", in terms of the dimensions of memory, cores, disk and wall-time allocation.

AbstractProvisioner

Interface for provisioning worker nodes to use in a Toil cluster.

Functions

applianceSelf([forceDockerAppliance])

Return the fully qualified name of the Docker image to start Toil appliance containers from.

customDockerInitCmd()

Return the custom command set by the TOIL_CUSTOM_DOCKER_INIT_COMMAND environment variable.

customInitCmd()

Return the custom command set by the TOIL_CUSTOM_INIT_COMMAND environment variable.

Module Contents

toil.provisioners.abstractProvisioner.applianceSelf(forceDockerAppliance=False)[source]

Return the fully qualified name of the Docker image to start Toil appliance containers from.

The result is determined by the current version of Toil and three environment variables: TOIL_DOCKER_REGISTRY, TOIL_DOCKER_NAME and TOIL_APPLIANCE_SELF.

TOIL_DOCKER_REGISTRY specifies an account on a publicly hosted Docker registry like Quay or Docker Hub. The default is UCSC’s CGL account on Quay.io where the Toil team publishes the official appliance images.

TOIL_DOCKER_NAME specifies the base name of the image. The default of toil will be adequate in most cases.

TOIL_APPLIANCE_SELF fully qualifies the appliance image, complete with registry, image name and version tag, overriding both TOIL_DOCKER_NAME and TOIL_DOCKER_REGISTRY as well as the version tag of the image. Setting TOIL_APPLIANCE_SELF will not be necessary in most cases.

Parameters:

forceDockerAppliance (bool)

Return type:

str
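The precedence between the three variables can be illustrated with a minimal sketch. This is not Toil's implementation: the function name resolve_appliance_image, the default registry string quay.io/ucsc_cgl and the version tag passed in are assumptions made for illustration.

```python
from typing import Mapping

# Assumed defaults for illustration only; the real defaults live inside Toil.
DEFAULT_REGISTRY = "quay.io/ucsc_cgl"
DEFAULT_NAME = "toil"


def resolve_appliance_image(env: Mapping[str, str], version_tag: str) -> str:
    """Return a fully qualified image name following the documented precedence."""
    # TOIL_APPLIANCE_SELF overrides registry, name and version tag at once.
    override = env.get("TOIL_APPLIANCE_SELF")
    if override:
        return override
    # Otherwise combine registry and name (or their defaults) with the version tag.
    registry = env.get("TOIL_DOCKER_REGISTRY", DEFAULT_REGISTRY)
    name = env.get("TOIL_DOCKER_NAME", DEFAULT_NAME)
    return f"{registry}/{name}:{version_tag}"
```

Passing os.environ as env would apply the same precedence to the current process environment.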

toil.provisioners.abstractProvisioner.customDockerInitCmd()[source]

Return the custom command set by the TOIL_CUSTOM_DOCKER_INIT_COMMAND environment variable.

The custom docker command is run prior to running the workers and/or the primary node’s services.

This can be useful for doing any custom initialization on instances (e.g. authenticating to private docker registries). Any single quotes are escaped and the command cannot contain a set of blacklisted chars (newline or tab).

Returns:

The custom command, or an empty string if the environment variable is not set.

Return type:

str
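The validation and escaping described above might look roughly like the following sketch. The helper name sanitize_init_command is hypothetical, and the exact escaping Toil applies is an assumption here; the sketch uses the common shell idiom of replacing each single quote with '\''. The same rule is described for customInitCmd().

```python
# Characters the documentation says the command must not contain.
FORBIDDEN_CHARS = ("\n", "\t")


def sanitize_init_command(command: str) -> str:
    """Reject forbidden characters, then escape single quotes for the shell."""
    if any(ch in command for ch in FORBIDDEN_CHARS):
        raise ValueError("custom init command must not contain newlines or tabs")
    # Close the quote, emit an escaped quote, reopen the quote: ' becomes '\''
    return command.replace("'", "'\\''")
```

An unset environment variable corresponds to passing an empty string, which comes back unchanged.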

toil.provisioners.abstractProvisioner.customInitCmd()[source]

Return the custom command set by the TOIL_CUSTOM_INIT_COMMAND environment variable.

The custom init command is run prior to running Toil appliance itself in workers and/or the primary node (i.e. this is run one stage before TOIL_CUSTOM_DOCKER_INIT_COMMAND).

This can be useful for doing any custom initialization on instances (e.g. authenticating to private docker registries). Any single quotes are escaped and the command cannot contain a set of blacklisted chars (newline or tab).

Returns:

The custom command, or an empty string if the environment variable is not set.

Return type:

str

exception toil.provisioners.abstractProvisioner.ClusterTypeNotSupportedException(provisioner_class, cluster_type)[source]

Bases: Exception

Indicates that a provisioner does not support a given cluster type.

class toil.provisioners.abstractProvisioner.Node(publicIP, privateIP, name, launchTime, nodeType, preemptible, tags=None, use_private_ip=None)[source]
Parameters:
maxWaitTime
__str__()[source]

Return str(self).

__repr__()[source]

Return repr(self).

__hash__()[source]

Return hash(self).

remainingBillingInterval()[source]

If the node has a launch time, this function returns a floating point value between 0 and 1.0 representing how far we are into the current billing cycle for the given instance. If the return value is .25, we are one quarter into the billing cycle, with three quarters remaining before we will be charged again for that instance.

Assumes a billing cycle of one hour.

Returns:

Float between 0 and 1.0 representing the fraction of pre-paid time left in the cycle.

Return type:

float
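As a worked illustration of the one-hour billing arithmetic, the sketch below computes the fraction of the current cycle that is still unused, following the Returns description. The function name and the explicit now argument are hypothetical and are not part of the Node class.

```python
from datetime import datetime, timedelta, timezone

BILLING_CYCLE_SECONDS = 3600  # one-hour billing cycle, as stated above


def remaining_billing_fraction(launch_time: datetime, now: datetime) -> float:
    """Fraction of the current one-hour billing cycle that is still unused."""
    elapsed = (now - launch_time).total_seconds()
    # Position inside the current cycle, as a fraction between 0 and 1.
    used = (elapsed % BILLING_CYCLE_SECONDS) / BILLING_CYCLE_SECONDS
    return 1.0 - used


launch = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
# Fifteen minutes into the hour, three quarters of the cycle remain.
fraction = remaining_billing_fraction(launch, launch + timedelta(minutes=15))
```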

waitForNode(role, keyName='core')[source]
Parameters:
Return type:

None

copySshKeys(keyName)[source]

Copy authorized_keys file to the core user from the keyName user.

injectFile(fromFile, toFile, role)[source]

Rsync a file to the container with the given role.

extractFile(fromFile, toFile, role)[source]

Rsync a file from the container with the given role.

sshAppliance(*args, **kwargs)[source]
Parameters:
  • args – arguments to execute in the appliance

  • kwargs – tty=bool tells Docker whether or not to create a TTY shell for interactive SSHing; the default value is False. input=string is passed as input to the Popen call.

sshInstance(*args, **kwargs)[source]

Run a command on the instance. Returns the binary output of the command.

coreSSH(*args, **kwargs)[source]

If strict=False, strict host key checking will be temporarily disabled. This is provided as a convenience for internal/automated functions and ought to be set to True whenever feasible, or whenever the user is directly interacting with a resource (e.g. rsync-cluster or ssh-cluster). Assumed to be False by default.

kwargs: input, tty, appliance, collectStdout, sshOptions, strict

Parameters:

input (bytes) – UTF-8 encoded input bytes to send to the command
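To make the effect of strict concrete, the sketch below builds the extra ssh options one would typically pass to disable host key checking. StrictHostKeyChecking and UserKnownHostsFile are standard OpenSSH options; the helper name host_key_options is hypothetical, and whether coreSSH uses exactly these options is an assumption.

```python
from typing import List


def host_key_options(strict: bool) -> List[str]:
    """Return extra ssh command-line options for the given strictness."""
    if strict:
        # Leave ssh's normal host key verification in place.
        return []
    # Skip host key verification and avoid recording the key permanently.
    return ["-oUserKnownHostsFile=/dev/null", "-oStrictHostKeyChecking=no"]
```

The returned list would be placed between the ssh executable and the target host when assembling the command.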

coreRsync(args, applianceName='toil_leader', **kwargs)[source]
Parameters:
  • args (List[str])

  • applianceName (str)

  • kwargs (Any)

Return type:

int

toil.provisioners.abstractProvisioner.a_short_time = 5
toil.provisioners.abstractProvisioner.logger
exception toil.provisioners.abstractProvisioner.ManagedNodesNotSupportedException[source]

Bases: RuntimeError

Raised when attempting to add managed nodes (which autoscale up and down by themselves, without the provisioner doing the work) to a provisioner that does not support them.

Polling with this and try/except is the Right Way to check if managed nodes are available from a provisioner.
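The try/except pattern recommended above can be sketched as follows. To keep the example self-contained it defines a local stand-in for the exception and a hypothetical StaticOnlyProvisioner; real code would import the exception from this module and use an actual provisioner. The fallback to addNodes() is one possible design and is an assumption, not documented Toil behavior.

```python
class ManagedNodesNotSupportedException(RuntimeError):
    """Local stand-in for the exception documented above."""


class StaticOnlyProvisioner:
    """Hypothetical provisioner that only supports statically added nodes."""

    def __init__(self):
        self.static_nodes = 0

    def addManagedNodes(self, nodeTypes, minNodes, maxNodes, preemptible, spotBid=None):
        raise ManagedNodesNotSupportedException("managed nodes are not available")

    def addNodes(self, nodeTypes, numNodes, preemptible, spotBid=None):
        self.static_nodes += numNodes
        return numNodes


def request_workers(provisioner, node_types, count):
    """Prefer managed nodes; fall back to static nodes if unsupported."""
    try:
        provisioner.addManagedNodes(node_types, 0, count, preemptible=False)
        return "managed"
    except ManagedNodesNotSupportedException:
        provisioner.addNodes(node_types, count, preemptible=False)
        return "static"
```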

class toil.provisioners.abstractProvisioner.Shape(wallTime, memory, cores, disk, preemptible)[source]

Represents a job or a node’s “shape”, in terms of the dimensions of memory, cores, disk and wall-time allocation.

The wallTime attribute stores the number of seconds of a node allocation, e.g. 3600 for AWS. FIXME: and for jobs?

The memory and disk attributes store the number of bytes required by a job (or provided by a node) in RAM or on disk (SSD or HDD), respectively.
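The units can be illustrated with a stand-in tuple that has the same field names as the constructor signature. ExampleShape is hypothetical and is not the real Shape class; the numbers are arbitrary.

```python
from collections import namedtuple

# Stand-in with the same field names as the documented constructor.
ExampleShape = namedtuple(
    "ExampleShape", ["wallTime", "memory", "cores", "disk", "preemptible"]
)

GIB = 1024 ** 3  # bytes in one gibibyte

node = ExampleShape(
    wallTime=3600,    # seconds of node allocation
    memory=8 * GIB,   # bytes of RAM
    cores=2,
    disk=50 * GIB,    # bytes of disk
    preemptible=False,
)
```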

Parameters:
__eq__(other)[source]

Return self==value.

Parameters:

other (Any)

Return type:

bool

greater_than(other)[source]
Parameters:

other (Any)

Return type:

bool

__gt__(other)[source]

Return self>value.

Parameters:

other (Any)

Return type:

bool

__repr__()[source]

Return repr(self).

Return type:

str

__str__()[source]

Return str(self).

Return type:

str

__hash__()[source]

Return hash(self).

Return type:

int

class toil.provisioners.abstractProvisioner.AbstractProvisioner(clusterName=None, clusterType='mesos', zone=None, nodeStorage=50, nodeStorageOverrides=None, enable_fuse=False)[source]

Bases: abc.ABC

Interface for provisioning worker nodes to use in a Toil cluster.

Parameters:
  • clusterName (Optional[str])

  • clusterType (Optional[str])

  • zone (Optional[str])

  • nodeStorage (int)

  • nodeStorageOverrides (Optional[List[str]])

  • enable_fuse (bool)

LEADER_HOME_DIR = '/root/'
cloud: str = None
abstract supportedClusterTypes()[source]

Get all the cluster types that this provisioner implementation supports.

Return type:

Set[str]

abstract createClusterSettings()[source]

Initialize class for a new cluster, to be deployed, when running outside the cloud.

abstract readClusterSettings()[source]

Initialize class from an existing cluster. This method assumes that the instance we are running on is the leader.

Implementations must call _setLeaderWorkerAuthentication().

setAutoscaledNodeTypes(nodeTypes)[source]

Set node types, shapes and spot bids for Toil-managed autoscaling.

Parameters:

nodeTypes (List[Tuple[Set[str], Optional[float]]]) – A list of node types, as parsed with parse_node_types.

hasAutoscaledNodeTypes()[source]

Check if node types have been configured on the provisioner (via setAutoscaledNodeTypes).

Returns:

True if node types are configured for autoscaling, and False otherwise.

Return type:

bool

getAutoscaledInstanceShapes()[source]

Get all the node shapes and their named instance types that the Toil autoscaler should manage.

Return type:

Dict[Shape, str]

static retryPredicate(e)[source]

Return true if the exception e should be retried by the cluster scaler. For example, should return true if the exception was due to exceeding an API rate limit. The error will be retried with exponential backoff.

Parameters:

e – exception raised during execution of setNodeCount

Returns:

boolean indicating whether the exception e should be retried
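A retry predicate combined with exponential backoff might look like the sketch below. RateLimitExceeded and call_with_backoff are hypothetical names introduced for illustration; the retry loop Toil's cluster scaler actually uses is not shown in this reference.

```python
import time


class RateLimitExceeded(Exception):
    """Hypothetical error standing in for a cloud API rate-limit failure."""


def retry_predicate(e: Exception) -> bool:
    """Return True only for errors that are worth retrying."""
    return isinstance(e, RateLimitExceeded)


def call_with_backoff(func, predicate, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func, retrying with doubling delays while predicate(e) is true."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception as e:
            # Give up on the last attempt or on errors the predicate rejects.
            if attempt == attempts - 1 or not predicate(e):
                raise
            sleep(base_delay * 2 ** attempt)
```

The sleep argument is injectable so that the delays can be observed or skipped in tests.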

abstract launchCluster(*args, **kwargs)[source]

Initialize a cluster and create a leader node.

Implementations must call _setLeaderWorkerAuthentication() with the leader so that workers can be launched.

Parameters:
  • leaderNodeType – The leader instance.

  • leaderStorage – The amount of disk to allocate to the leader in gigabytes.

  • owner – Tag identifying the owner of the instances.

abstract addNodes(nodeTypes, numNodes, preemptible, spotBid=None)[source]

Add worker nodes to the cluster.

Parameters:
  • numNodes (int) – The number of nodes to add

  • preemptible (bool) – whether or not the nodes will be preemptible

  • spotBid (Optional[float]) – The bid for preemptible nodes if applicable (this can be set in config, also).

  • nodeTypes (Set[str])

Returns:

number of nodes successfully added

Return type:

int

addManagedNodes(nodeTypes, minNodes, maxNodes, preemptible, spotBid=None)[source]

Add a group of managed nodes of the given type, up to the given maximum. The nodes will automatically be launched and terminated depending on cluster load.

Raises ManagedNodesNotSupportedException if the provisioner implementation or cluster configuration can’t have managed nodes.

Parameters:
  • minNodes – The minimum number of nodes to scale to

  • maxNodes – The maximum number of nodes to scale to

  • preemptible – whether or not the nodes will be preemptible

  • spotBid – The bid for preemptible nodes if applicable (this can be set in config, also).

  • nodeTypes (Set[str])

Return type:

None

abstract terminateNodes(nodes)[source]

Terminate the nodes represented by the given Node objects.

Parameters:

nodes (List[toil.provisioners.node.Node]) – list of Node objects

Return type:

None

abstract getLeader()[source]
Returns:

The leader node.

abstract getProvisionedWorkers(instance_type=None, preemptible=None)[source]

Gets all nodes, optionally of the given instance type or preemptibility, from the provisioner. Includes both static and autoscaled nodes.

Parameters:
  • preemptible (Optional[bool]) – Boolean value to restrict to preemptible nodes or non-preemptible nodes

  • instance_type (Optional[str])

Returns:

list of Node objects

Return type:

List[toil.provisioners.node.Node]

abstract getNodeShape(instance_type, preemptible=False)[source]

The shape of a preemptible or non-preemptible node managed by this provisioner. The node shape defines key properties of a machine, such as its number of cores or the time between billing intervals.

Parameters:

instance_type (str) – Instance type name to return the shape of.

Return type:

Shape

abstract destroyCluster()[source]

Terminates all nodes in the specified cluster and cleans up all resources associated with the cluster.

Parameters:

clusterName – identifier of the cluster to terminate.

Return type:

None

class InstanceConfiguration[source]

Allows defining the initial setup for an instance and then turning it into an Ignition configuration for instance user data.

addFile(path, filesystem='root', mode='0755', contents='', append=False)[source]

Make a file on the instance with the given filesystem, mode, and contents.

See the storage.files section: https://github.com/kinvolk/ignition/blob/flatcar-master/doc/configuration-v2_2.md

Parameters:
addUnit(name, enabled=True, contents='')[source]

Make a systemd unit on the instance with the given name (including .service), and content. Units will be enabled by default.

Unit logs can be investigated with:

systemctl status whatever.service

or:

journalctl -xe

Parameters:
addSSHRSAKey(keyData)[source]

Authorize the given bare, encoded RSA key (without “ssh-rsa”).

Parameters:

keyData (str)
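A "bare" key here means only the base64 body of an OpenSSH public key line, without the leading ssh-rsa type and without the trailing comment. A minimal sketch of extracting it follows; the helper name bare_rsa_key is hypothetical.

```python
def bare_rsa_key(public_key_line: str) -> str:
    """Extract the base64 key body from an 'ssh-rsa <key> [comment]' line."""
    parts = public_key_line.split()
    if len(parts) < 2 or parts[0] != "ssh-rsa":
        raise ValueError("expected an OpenSSH public key line starting with 'ssh-rsa'")
    # parts[1] is the encoded key; anything after it is a free-form comment.
    return parts[1]
```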

toIgnitionConfig()[source]

Return an Ignition configuration describing the desired config.

Return type:

str
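For orientation, the sketch below assembles a small Ignition v2.2 document from files, units and SSH keys. It is an illustration of the Ignition format, not Toil's implementation: build_ignition_config is a hypothetical helper, and authorizing keys for a user named core is an assumption.

```python
import json
from urllib.parse import quote


def build_ignition_config(files, units, ssh_keys):
    """Serialize files, systemd units and bare RSA keys as Ignition v2.2 JSON."""
    config = {
        "ignition": {"version": "2.2.0"},
        "storage": {
            "files": [
                {
                    "path": f["path"],
                    "filesystem": f.get("filesystem", "root"),
                    # Ignition expects the mode as a decimal integer.
                    "mode": int(f.get("mode", "0755"), 8),
                    # File contents are embedded as a percent-encoded data URL.
                    "contents": {"source": "data:," + quote(f.get("contents", ""))},
                }
                for f in files
            ]
        },
        "systemd": {
            "units": [
                {
                    "name": u["name"],
                    "enabled": u.get("enabled", True),
                    "contents": u.get("contents", ""),
                }
                for u in units
            ]
        },
        "passwd": {
            "users": [
                # Bare keys get their "ssh-rsa" prefix back here.
                {"name": "core", "sshAuthorizedKeys": ["ssh-rsa " + k for k in ssh_keys]}
            ]
        },
    }
    return json.dumps(config, separators=(",", ":"))
```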

getBaseInstanceConfiguration()[source]

Get the base configuration for both leader and worker instances for all cluster types.

Return type:

InstanceConfiguration

addVolumesService(config)[source]

Add a service to prepare and mount local scratch volumes.

Parameters:

config (InstanceConfiguration)

addNodeExporterService(config)[source]

Add the node exporter service for Prometheus to an instance configuration.

Parameters:

config (InstanceConfiguration)

toil_service_env_options()[source]
Return type:

str

add_toil_service(config, role, keyPath=None, preemptible=False)[source]

Add the Toil leader or worker service to an instance configuration.

Will run Mesos master or agent as appropriate in Mesos clusters. For Kubernetes clusters, will just sleep to provide a place to shell into on the leader, and shouldn’t run on the worker.

Parameters:
  • role (str) – Should be ‘leader’ or ‘worker’. Will not work for ‘worker’ until leader credentials have been collected.

  • keyPath (str) – path on the node to a server-side encryption key that will be added to the node after it starts. The service will wait until the key is present before starting.

  • preemptible (bool) – Whether a worker should identify itself as preemptible or not to the scheduler.

  • config (InstanceConfiguration)

getKubernetesValues(architecture='amd64')[source]

Returns a dict of Kubernetes component versions and paths for formatting into Kubernetes-related templates.

Parameters:

architecture (str)

addKubernetesServices(config, architecture='amd64')[source]

Add services to an instance configuration that install Kubernetes and Kubeadm and set up the Kubelet to run when configured. The same process applies to leaders and workers.

Parameters:
abstract getKubernetesAutoscalerSetupCommands(values)[source]

Return Bash commands that set up the Kubernetes cluster autoscaler for provisioning from the environment supported by this provisioner.

Should only be implemented if Kubernetes clusters are supported.

Parameters:

values (Dict[str, str]) – Contains definitions of cluster variables, like AUTOSCALER_VERSION and CLUSTER_NAME.

Returns:

Bash snippet

Return type:

str
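The general pattern of formatting a values dict into a Bash snippet can be sketched as below. The template is a placeholder that only echoes its inputs; it is not a real autoscaler setup script, and the function name render_setup_commands is hypothetical.

```python
from typing import Dict

# Placeholder template: a real implementation would emit provider-specific commands.
SETUP_TEMPLATE = (
    "# Hypothetical snippet: only echoes the values it was given.\n"
    'echo "Setting up autoscaler {AUTOSCALER_VERSION} for cluster {CLUSTER_NAME}"\n'
)


def render_setup_commands(values: Dict[str, str]) -> str:
    """Fill the Bash template with cluster variables from the values dict."""
    # Keys in values that the template does not mention are ignored.
    return SETUP_TEMPLATE.format(**values)
```

Templates that contain literal Bash braces would need them doubled ({{ and }}) to survive str.format.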

getKubernetesCloudProvider()[source]

Return the Kubernetes cloud provider (for example, ‘aws’), to pass to the kubelets in a Kubernetes cluster provisioned using this provisioner.

Defaults to None if not overridden, in which case no cloud provider integration will be used.

Returns:

Cloud provider name, or None

Return type:

Optional[str]

addKubernetesLeader(config)[source]

Add services to configure as a Kubernetes leader, if Kubernetes is already set to be installed.

Parameters:

config (InstanceConfiguration)

addKubernetesWorker(config, authVars, preemptible=False)[source]

Add services to configure as a Kubernetes worker, if Kubernetes is already set to be installed.

Authenticate back to the leader using the JOIN_TOKEN, JOIN_CERT_HASH, and JOIN_ENDPOINT set in the given authentication data dict.

Parameters:
  • config (InstanceConfiguration) – The configuration to add services to

  • authVars (Dict[str, str]) – Dict with authentication info

  • preemptible (bool) – Whether the worker should be labeled as preemptible or not
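The three authentication values map naturally onto the arguments of kubeadm join. The sketch below shows that mapping; build_join_command is a hypothetical helper, and whether Toil invokes kubeadm with command-line flags or through a configuration file is an assumption left open here.

```python
from typing import Dict, List


def build_join_command(auth_vars: Dict[str, str]) -> List[str]:
    """Build a kubeadm join argument list from the worker's authentication data."""
    return [
        "kubeadm", "join", auth_vars["JOIN_ENDPOINT"],
        "--token", auth_vars["JOIN_TOKEN"],
        "--discovery-token-ca-cert-hash", auth_vars["JOIN_CERT_HASH"],
    ]


# Placeholder values for illustration; real ones are collected from the leader.
example_auth = {
    "JOIN_ENDPOINT": "10.0.0.1:6443",
    "JOIN_TOKEN": "abcdef.0123456789abcdef",
    "JOIN_CERT_HASH": "sha256:" + "0" * 64,
}
command = build_join_command(example_auth)
```

Returning a list rather than a single string avoids shell quoting issues when the command is passed to subprocess.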