toil.batchSystems.kubernetes

Batch system for running Toil workflows on Kubernetes.

Only useful with network-based job stores, like AWSJobStore.

Within non-privileged Kubernetes containers, additional Docker containers cannot yet be launched. That functionality will need to wait for user-mode Docker.

Module Contents

Classes

KubernetesBatchSystem

Adds cleanup support when the last running job leaves a node, for batch systems that can’t provide it using the backing scheduler.

Functions

is_retryable_kubernetes_error(e)

A function that determines whether or not Toil should retry or stop given exceptions thrown by Kubernetes.

Attributes

logger

retryable_kubernetes_errors

KeyValuesList

toil.batchSystems.kubernetes.logger
toil.batchSystems.kubernetes.retryable_kubernetes_errors: List[Type[Exception] | toil.lib.retry.ErrorCondition]
toil.batchSystems.kubernetes.is_retryable_kubernetes_error(e)

A function that determines whether or not Toil should retry or stop given exceptions thrown by Kubernetes.

Parameters:

e (Exception)

Return type:

bool
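As a rough illustration of how a list such as retryable_kubernetes_errors could drive a predicate like this one, the sketch below checks an exception against a list of exception types. The names RETRYABLE_ERRORS and is_retryable are hypothetical stand-ins, the exception types are placeholders, and entries of type toil.lib.retry.ErrorCondition are not modeled here.

```python
from typing import List, Type

# Hypothetical stand-in for retryable_kubernetes_errors; the real contents
# are not listed in this reference.
RETRYABLE_ERRORS: List[Type[Exception]] = [ConnectionError, TimeoutError]


def is_retryable(e: Exception) -> bool:
    """Return True if e is an instance of any type in RETRYABLE_ERRORS."""
    return any(isinstance(e, error_type) for error_type in RETRYABLE_ERRORS)


# ConnectionResetError is a subclass of ConnectionError, so it matches.
retry_on_reset = is_retryable(ConnectionResetError("connection reset"))
# ValueError is not in the list, so the caller should stop instead of retrying.
retry_on_value = is_retryable(ValueError("bad input"))
```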

toil.batchSystems.kubernetes.KeyValuesList
class toil.batchSystems.kubernetes.KubernetesBatchSystem(config, maxCores, maxMemory, maxDisk)

Bases: toil.batchSystems.cleanup_support.BatchSystemCleanupSupport

Adds cleanup support when the last running job leaves a node, for batch systems that can’t provide it using the backing scheduler.

Parameters:
  • config

  • maxCores

  • maxMemory

  • maxDisk

class DecoratorWrapper(to_wrap, decorator)

Class to wrap an object so all its methods are decorated.

Parameters:
  • to_wrap (Any)

  • decorator (Callable[[Callable[P, Any]], Callable[P, Any]])

P
__getattr__(name)

Get a member as if we are actually the wrapped object. If it looks callable, we will decorate it.

Parameters:

name (str)

Return type:

Any
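The wrapping behavior described above can be sketched in a few lines. This is a minimal re-creation of the idea, not the class's actual source: MethodDecoratingWrapper, record_calls, and Greeter are hypothetical names introduced for illustration.

```python
import functools
from typing import Any, Callable, List


class MethodDecoratingWrapper:
    """Hypothetical re-creation of the idea behind DecoratorWrapper."""

    def __init__(
        self,
        to_wrap: Any,
        decorator: Callable[[Callable[..., Any]], Callable[..., Any]],
    ) -> None:
        self._wrapped = to_wrap
        self._decorator = decorator

    def __getattr__(self, name: str) -> Any:
        # __getattr__ is only consulted for names not found on the wrapper
        # itself, so every other lookup is forwarded to the wrapped object.
        attr = getattr(self._wrapped, name)
        if callable(attr):
            return self._decorator(attr)
        return attr


calls: List[str] = []


def record_calls(func: Callable[..., Any]) -> Callable[..., Any]:
    """Example decorator that records the name of each call it sees."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        calls.append(func.__name__)
        return func(*args, **kwargs)

    return wrapper


class Greeter:
    greeting = "hello"

    def greet(self, who: str) -> str:
        return f"{self.greeting}, {who}"


wrapped = MethodDecoratingWrapper(Greeter(), record_calls)
message = wrapped.greet("toil")  # decorated, because greet is callable
```

Non-callable members such as wrapped.greeting pass through undecorated. A wrapper like this is one way to apply a retry decorator to every method of an API client without editing the client.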

class Placement

Internal format for pod placement constraints and preferences.

required_labels: KeyValuesList = []

Labels which are required to be present (with these values).

desired_labels: KeyValuesList = []

Labels which are optional, but preferred to be present (with these values).

prohibited_labels: KeyValuesList = []

Labels which are not allowed to be present (with these values).

tolerated_taints: KeyValuesList = []

Taints which are allowed to be present (with these values).

set_preemptible(preemptible)

Add constraints for a job being preemptible or not.

Preemptible jobs will be able to run on preemptible or non-preemptible nodes, and will prefer preemptible nodes if available.

Non-preemptible jobs will not be allowed to run on nodes that are marked as preemptible.

Understands the labeling scheme used by EKS, and the taint scheme used by GCE. The Toil-managed Kubernetes setup will mimic at least one of these.

Parameters:

preemptible (bool)

Return type:

None
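The preemptibility rules above can be sketched as follows. PlacementSketch is a hypothetical stand-in, the shape of KeyValuesList is assumed to be a list of (key, values) pairs, and PREEMPTIBLE_KEY is a placeholder: the actual EKS label and GCE taint names are not given in this reference.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed shape of KeyValuesList: (key, list of acceptable values) pairs.
KeyValues = List[Tuple[str, List[str]]]

# Placeholder key; not the real EKS label or GCE taint name.
PREEMPTIBLE_KEY = "example.com/preemptible"


@dataclass
class PlacementSketch:
    required_labels: KeyValues = field(default_factory=list)
    desired_labels: KeyValues = field(default_factory=list)
    prohibited_labels: KeyValues = field(default_factory=list)
    tolerated_taints: KeyValues = field(default_factory=list)

    def set_preemptible(self, preemptible: bool) -> None:
        if preemptible:
            # May run anywhere: tolerate the preemptible taint and
            # prefer (but do not require) the preemptible label.
            self.tolerated_taints.append((PREEMPTIBLE_KEY, ["true"]))
            self.desired_labels.append((PREEMPTIBLE_KEY, ["true"]))
        else:
            # Must stay off nodes labeled as preemptible.
            self.prohibited_labels.append((PREEMPTIBLE_KEY, ["true"]))


spot = PlacementSketch()
spot.set_preemptible(True)
stable = PlacementSketch()
stable.set_preemptible(False)
```

The sketch uses default_factory so that each instance gets its own lists; that is a choice made here, not a statement about how Placement itself is implemented.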

apply(pod_spec)

Set affinity and/or tolerations fields on pod_spec, so that it runs on the right kind of nodes for the constraints we represent.

Parameters:

pod_spec (kubernetes.client.V1PodSpec)

Return type:

None
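To show roughly what setting affinity and tolerations involves, the sketch below builds the equivalent fields as plain dictionaries that mirror the Kubernetes pod spec JSON field names. The real method works on kubernetes.client.V1PodSpec objects; build_scheduling_fields, the preference weight of 1, and the example key are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

# Assumed shape of KeyValuesList: (key, list of values) pairs.
KeyValues = List[Tuple[str, List[str]]]


def match_expression(key: str, operator: str, values: List[str]) -> Dict[str, Any]:
    return {"key": key, "operator": operator, "values": values}


def build_scheduling_fields(
    required: KeyValues,
    desired: KeyValues,
    prohibited: KeyValues,
    tolerated: KeyValues,
) -> Dict[str, Any]:
    # Hard constraints: required labels must match, prohibited ones must not.
    hard = [match_expression(k, "In", v) for k, v in required]
    hard += [match_expression(k, "NotIn", v) for k, v in prohibited]
    # Soft constraints: each desired label becomes a weighted preference.
    soft = [
        {"weight": 1, "preference": {"matchExpressions": [match_expression(k, "In", v)]}}
        for k, v in desired
    ]

    node_affinity: Dict[str, Any] = {}
    if hard:
        node_affinity["requiredDuringSchedulingIgnoredDuringExecution"] = {
            "nodeSelectorTerms": [{"matchExpressions": hard}]
        }
    if soft:
        node_affinity["preferredDuringSchedulingIgnoredDuringExecution"] = soft

    fields: Dict[str, Any] = {}
    if node_affinity:
        fields["affinity"] = {"nodeAffinity": node_affinity}
    # One toleration per (key, value) pair; omitting "effect" matches any effect.
    tolerations = [
        {"key": k, "operator": "Equal", "value": value}
        for k, values in tolerated
        for value in values
    ]
    if tolerations:
        fields["tolerations"] = tolerations
    return fields


fields = build_scheduling_fields(
    required=[],
    desired=[("example.com/preemptible", ["true"])],
    prohibited=[],
    tolerated=[("example.com/preemptible", ["true"])],
)
```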

class KubernetesConfig

Bases: Protocol

Type-enforcing protocol for Toil configs that have the extra Kubernetes batch system fields.

TODO: Until mypy lets protocols inherit from non-protocols, we will have to let the fact that this also has to be a Config just be manually enforced.

kubernetes_host_path: str | None
kubernetes_owner: str
kubernetes_service_account: str | None
kubernetes_pod_timeout: float
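A structural protocol with these four fields can be sketched as below. KubernetesFields, ExampleConfig, and describe are hypothetical names, and the default values are invented for the example. Any object with matching attributes satisfies the protocol for a static type checker, with no inheritance needed.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class KubernetesFields(Protocol):
    """Hypothetical protocol mirroring the fields listed above."""

    kubernetes_host_path: Optional[str]
    kubernetes_owner: str
    kubernetes_service_account: Optional[str]
    kubernetes_pod_timeout: float


@dataclass
class ExampleConfig:
    # Illustrative defaults only; not the defaults Toil uses.
    kubernetes_host_path: Optional[str] = None
    kubernetes_owner: str = "example-user"
    kubernetes_service_account: Optional[str] = None
    kubernetes_pod_timeout: float = 120.0


def describe(config: KubernetesFields) -> str:
    # Accepts anything that structurally provides the protocol's fields.
    return f"owner={config.kubernetes_owner}, pod timeout={config.kubernetes_pod_timeout}s"


summary = describe(ExampleConfig())
```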
ItemT
CovItemT
P
R
OptionType
classmethod supportsAutoDeployment()

Whether this batch system supports auto-deployment of the user script itself.

If it does, setUserScript() can be invoked to set the resource object representing the user script.

Note to implementors: If your implementation returns True here, it should also override

Return type:

bool

setUserScript(userScript)

Set the user script for this workflow.

This method must be called before the first job is issued to this batch system, and only if supportsAutoDeployment() returns True; otherwise it will raise an exception.

Parameters:

userScript (toil.resource.Resource) – the resource object representing the user script or module and the modules it depends on.

Return type:

None

issueBatchJob(job_desc, job_environment=None)

Issues a job with the specified command to the batch system and returns a unique jobID.

Parameters:
  • job_desc (toil.job.JobDescription) – the description of the job to issue.

  • job_environment (Optional[Dict[str, str]]) – a collection of job-specific environment variables to be set on the worker.

Returns:

a unique jobID that can be used to reference the newly issued job

Return type:

int

getUpdatedBatchJob(maxWait)

Returns information about a job that has updated its status (i.e. ceased running, either successfully or with an error). Each such job will be returned exactly once.

Does not return info for jobs killed by killBatchJobs, although they may cause None to be returned earlier than maxWait.

Parameters:

maxWait (float) – the number of seconds to block, waiting for a result

Returns:

If a result is available, returns UpdatedBatchJobInfo. Otherwise it returns None. The wallTime field of the result is the number of seconds (a strictly positive float) in wall-clock time the job ran for, or None if this batch system does not support tracking wall time.

Return type:

Optional[toil.batchSystems.abstractBatchSystem.UpdatedBatchJobInfo]
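The issue-then-poll contract described above (unique integer IDs, each finished job reported exactly once, None when nothing arrives within maxWait) can be illustrated with a small in-memory stand-in. InMemoryBatchSketch and UpdateSketch are hypothetical and do not talk to Kubernetes; jobs in this sketch "finish" the moment they are issued.

```python
import queue
from typing import List, NamedTuple, Optional


class UpdateSketch(NamedTuple):
    """Hypothetical stand-in for UpdatedBatchJobInfo."""

    job_id: int
    exit_status: int


class InMemoryBatchSketch:
    def __init__(self) -> None:
        self._next_id = 0
        self._updates: "queue.Queue[UpdateSketch]" = queue.Queue()

    def issue(self, exit_status: int) -> int:
        job_id = self._next_id
        self._next_id += 1
        # Pretend the job finishes immediately with the given status.
        self._updates.put(UpdateSketch(job_id, exit_status))
        return job_id

    def get_updated(self, max_wait: float) -> Optional[UpdateSketch]:
        # Block for up to max_wait seconds; each update is handed out once.
        try:
            return self._updates.get(timeout=max_wait)
        except queue.Empty:
            return None


batch = InMemoryBatchSketch()
issued: List[int] = [batch.issue(0), batch.issue(1)]

finished: List[int] = []
while True:
    update = batch.get_updated(max_wait=0.01)
    if update is None:
        break  # nothing more arrived within max_wait
    finished.append(update.job_id)
```

A real caller would usually keep polling while jobs remain issued, since None only means that no job changed status within the wait.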

shutdown()

Called at the completion of a toil invocation. Should cleanly terminate all worker threads.

Return type:

None

getIssuedBatchJobIDs()

Gets all currently issued jobs.

Returns:

A list of jobs (as jobIDs) currently issued (may be running, or may be waiting to be run). Despite the result being a list, the ordering should not be depended upon.

Return type:

List[int]

getRunningBatchJobIDs()

Gets a map of jobs as jobIDs that are currently running (not just waiting) and how long they have been running, in seconds.

Returns:

dictionary with currently running jobID keys and how many seconds they have been running as the value

Return type:

Dict[int, float]

killBatchJobs(jobIDs)

Kills the given job IDs. After returning, the killed jobs will not appear in the results of getRunningBatchJobIDs, and they will not be returned from getUpdatedBatchJob.

Parameters:

jobIDs (List[int]) – list of IDs of jobs to kill

Return type:

None

classmethod get_default_kubernetes_owner()

Get the default Kubernetes-acceptable username string to tack onto jobs.

Return type:

str
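What "Kubernetes-acceptable" might involve can be sketched as a small sanitizer. This is an assumption-laden illustration, not the method's actual logic: to_kubernetes_name is a hypothetical helper that restricts the string to a conservative character set (lowercase letters, digits, and hyphens) and a 63-character limit, and the "unknown" fallback is invented.

```python
import re


def to_kubernetes_name(raw: str, max_length: int = 63) -> str:
    """Hypothetical sanitizer for a username used in Kubernetes metadata."""
    lowered = raw.lower()
    # Replace anything outside the conservative character set with a hyphen.
    replaced = re.sub(r"[^a-z0-9-]", "-", lowered)
    # Truncate, then drop leading and trailing hyphens.
    trimmed = replaced[:max_length].strip("-")
    # Fall back to a fixed placeholder if nothing usable is left.
    return trimmed or "unknown"


owner = to_kubernetes_name("Jane_Doe@Example")
```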

classmethod add_options(parser)

If this batch system provides any command line options, add them to the given parser.

Parameters:

parser (Union[argparse.ArgumentParser, argparse._ArgumentGroup])

Return type:

None
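For orientation, registering batch-system options on a parser could look like the sketch below. The flag names, destinations, and default values are assumptions derived from the KubernetesConfig field names, not a listing of the options Toil actually defines.

```python
import argparse

parser = argparse.ArgumentParser()
# add_options accepts either a parser or an argument group; a group is used here.
group = parser.add_argument_group("Kubernetes batch system options (sketch)")

# Hypothetical flags mirroring two of the KubernetesConfig fields.
group.add_argument("--kubernetesOwner", dest="kubernetes_owner", default=None)
group.add_argument(
    "--kubernetesPodTimeout",
    dest="kubernetes_pod_timeout",
    type=float,
    default=120.0,  # illustrative default only
)

options = parser.parse_args(["--kubernetesOwner", "example-user"])
```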

classmethod setOptions(setOption)

Process command line or configuration options relevant to this batch system.

Parameters:

setOption (toil.batchSystems.options.OptionSetter) – A function with signature setOption(option_name, parsing_function=None, check_function=None, default=None, env=None) returning nothing, used to update run configuration as a side effect.

Return type:

None
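The setOption callback pattern can be sketched with a closure that copies parsed values into a configuration dictionary. Only the callback signature comes from the description above; make_option_setter, the dictionaries, and the option values are hypothetical, and the env argument is accepted but ignored in this sketch.

```python
from typing import Any, Callable, Dict, Optional


def make_option_setter(
    values: Dict[str, Any], config: Dict[str, Any]
) -> Callable[..., None]:
    """Build a hypothetical setOption that reads values and updates config."""

    def set_option(
        option_name: str,
        parsing_function: Optional[Callable[[Any], Any]] = None,
        check_function: Optional[Callable[[Any], None]] = None,
        default: Any = None,
        env: Any = None,  # accepted for signature compatibility; unused here
    ) -> None:
        raw = values.get(option_name)
        if raw is None:
            value = default
        else:
            value = parsing_function(raw) if parsing_function else raw
        # Let the check function raise if the value is unacceptable.
        if check_function is not None and value is not None:
            check_function(value)
        config[option_name] = value

    return set_option


config: Dict[str, Any] = {}
set_option = make_option_setter({"kubernetes_pod_timeout": "30"}, config)
set_option("kubernetes_pod_timeout", parsing_function=float, default=120.0)
set_option("kubernetes_owner", default="example-user")
```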