Batch System API
Toil uses the batch system interface to abstract over different ways of running
batches of jobs, for example Slurm, GridEngine, Mesos, Parasol, or a single node.
Implementing the
toil.batchSystems.abstractBatchSystem.AbstractBatchSystem
API lets Toil run jobs using a given job management system, e.g. Mesos.
Batch System Environmental Variables
Environment variables let you pass scheduler-specific parameters.
For SLURM:
export TOIL_SLURM_ARGS="-t 1:00:00 -q fatq"
For TORQUE there are two environment variables: one for everything except the resource requirements, and another for the resource requirements (without the -l prefix):
export TOIL_TORQUE_ARGS="-q fatq"
export TOIL_TORQUE_REQS="walltime=1:00:00"
For GridEngine (SGE, UGE), there is an additional environmental variable to define the parallel environment for running multicore jobs:
export TOIL_GRIDENGINE_PE='smp'
export TOIL_GRIDENGINE_ARGS='-q batch.q'
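Each of these variables holds a single string of scheduler options. As a minimal sketch of how such a string breaks into individual arguments, the hypothetical helper `scheduler_args` below splits it with shell-style quoting rules. This illustrates the format only; it is an assumption, not a description of how Toil itself reads the variables.

```python
import os
import shlex


def scheduler_args(var_name, environ=None):
    """Split a scheduler-argument variable into an argv-style list.

    Hypothetical helper for illustration; not part of Toil.
    """
    environ = os.environ if environ is None else environ
    # A missing variable is treated as "no extra arguments".
    return shlex.split(environ.get(var_name, ""))


# Use an explicit mapping so the example does not depend on the real environment.
example_env = {"TOIL_SLURM_ARGS": "-t 1:00:00 -q fatq"}
slurm_args = scheduler_args("TOIL_SLURM_ARGS", example_env)
```

Passing an explicit mapping keeps the example independent of the real process environment; omit it to read from `os.environ`.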
Batch System API
-
class toil.batchSystems.abstractBatchSystem.AbstractBatchSystem
An abstract (as far as Python currently allows) base class to represent the interface the batch system must provide to Toil.
-
classmethod supportsHotDeployment()
Whether this batch system supports hot deployment of the user script itself. If it does, setUserScript() can be invoked to set the resource object representing the user script.
Note to implementors: If your implementation returns True here, it should also override
Return type: bool
-
classmethod supportsWorkerCleanup()
Indicates whether this batch system invokes workerCleanup() after the last job for a particular workflow invocation finishes. Note that the term worker refers to an entire node, not just a worker process. A worker process may run more than one job sequentially, and more than one concurrent worker process may exist on a worker node, for the same workflow. The batch system is said to shut down after the last worker process terminates.
Return type: bool
-
setUserScript(userScript)
Set the user script for this workflow. This method must be called before the first job is issued to this batch system, and only if supportsHotDeployment() returns True; otherwise it will raise an exception.
Parameters: userScript (toil.resource.Resource) – the resource object representing the user script or module and the modules it depends on.
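The contract between supportsHotDeployment() and setUserScript() can be sketched with a toy class. Everything here is a hypothetical stand-in: `ToyBatchSystem` and `HotDeployingBatchSystem` are not Toil classes, and raising `RuntimeError` is an assumption, since the text above only says that an exception is raised.

```python
class ToyBatchSystem:
    """Hypothetical stand-in; not Toil's AbstractBatchSystem."""

    def __init__(self):
        self.userScript = None

    @classmethod
    def supportsHotDeployment(cls):
        # This toy system cannot ship the user script to workers.
        return False

    def setUserScript(self, userScript):
        # Refuse the call unless the class advertises hot deployment.
        if not self.supportsHotDeployment():
            raise RuntimeError("hot deployment is not supported")
        self.userScript = userScript


class HotDeployingBatchSystem(ToyBatchSystem):
    """Toy subclass that opts in to hot deployment."""

    @classmethod
    def supportsHotDeployment(cls):
        return True
```

A caller would check `supportsHotDeployment()` first and call `setUserScript()` only when it returns True, before issuing any job.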
-
issueBatchJob(jobNode)
Issues a job with the specified command to the batch system and returns a unique jobID.
Parameters: jobNode – a toil.job.JobNode
Returns: a unique jobID that can be used to reference the newly issued job
Return type: int
-
killBatchJobs(jobIDs)
Kills the jobs with the given IDs.
Parameters: jobIDs (list[int]) – list of IDs of jobs to kill
-
getIssuedBatchJobIDs()
Gets all currently issued jobs.
Returns: A list of jobs (as jobIDs) currently issued (may be running, or may be waiting to be run). Despite the result being a list, the ordering should not be depended upon.
Return type: list[str]
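The way issueBatchJob(), killBatchJobs() and getIssuedBatchJobIDs() fit together can be sketched with an in-memory toy. `InMemoryBatchSystem` is hypothetical and never runs anything. As simplifying assumptions, it accepts a plain command string rather than a toil.job.JobNode and uses integer IDs throughout.

```python
import itertools


class InMemoryBatchSystem:
    """Hypothetical bookkeeping-only batch system; it never runs anything."""

    def __init__(self):
        self._next_id = itertools.count()
        self._issued = {}  # jobID -> command

    def issueBatchJob(self, command):
        # Hand out a fresh ID and remember the job as issued.
        jobID = next(self._next_id)
        self._issued[jobID] = command
        return jobID

    def killBatchJobs(self, jobIDs):
        # Forget each job; unknown IDs are silently ignored in this sketch.
        for jobID in jobIDs:
            self._issued.pop(jobID, None)

    def getIssuedBatchJobIDs(self):
        # Callers should not rely on the ordering of this list.
        return list(self._issued)
```

Comparing results as sets, rather than lists, respects the note that ordering should not be depended upon.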
-
getRunningBatchJobIDs()
Gets a map of jobs as jobIDs that are currently running (not just waiting) and how long they have been running, in seconds.
Returns: dictionary with currently running jobID keys and how many seconds they have been running as the value
Return type: dict[str,float]
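One way to produce such a map is to record a start time per job and subtract it from the current time on each call. The sketch below assumes a hypothetical `RunningJobTracker` with `markStarted` and `markFinished` hooks; none of these names come from Toil. The clock is injectable so the arithmetic can be checked deterministically.

```python
import time


class RunningJobTracker:
    """Hypothetical helper that tracks how long each running job has run."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = {}  # jobID -> clock reading when the job started

    def markStarted(self, jobID):
        self._started[jobID] = self._clock()

    def markFinished(self, jobID):
        self._started.pop(jobID, None)

    def getRunningBatchJobIDs(self):
        # Elapsed seconds for every job that has started but not finished.
        now = self._clock()
        return {jobID: now - start for jobID, start in self._started.items()}
```

Using a monotonic clock by default avoids negative durations if the system clock is adjusted while jobs are running.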
-
getUpdatedBatchJob(maxWait)
Returns a job that has updated its status.
Parameters: maxWait (float) – the number of seconds to block, waiting for a result
Returns: If a result is available, returns a tuple (jobID, exitValue, wallTime). Otherwise it returns None. wallTime is the number of seconds (a float) in wall-clock time the job ran for, or None if this batch system does not support tracking wall time. Returns None for jobs that were killed.
Return type: tuple(str, int) or None
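The blocking behaviour described for maxWait maps naturally onto a thread-safe queue with a timeout. The sketch below is one possible approach, not Toil's implementation. `UpdatedJobs` and its `report` method are hypothetical names for whatever code observes that a job has finished.

```python
import queue


class UpdatedJobs:
    """Hypothetical holder for finished-job notifications."""

    def __init__(self):
        self._updates = queue.Queue()

    def report(self, jobID, exitValue, wallTime=None):
        # Called by whatever observes that a job has finished.
        self._updates.put((jobID, exitValue, wallTime))

    def getUpdatedBatchJob(self, maxWait):
        # Block for at most maxWait seconds; None means nothing was updated.
        try:
            return self._updates.get(timeout=maxWait)
        except queue.Empty:
            return None
```

Because `queue.Queue` is thread-safe, `report` can be called from a monitoring thread while the caller polls `getUpdatedBatchJob`.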
-
shutdown()
Called at the completion of a Toil invocation. Should cleanly terminate all worker threads.
-
setEnv(name, value=None)
Set an environment variable for the worker process before it is launched. The worker process will typically inherit the environment of the machine it is running on but this method makes it possible to override specific variables in that inherited environment before the worker is launched. Note that this mechanism is different to the one used by the worker internally to set up the environment of a job. A call to this method affects all jobs issued after this method returns.
Note to implementors: This means that you would typically need to copy the variables before enqueuing a job.
If no value is provided it will be looked up from the current environment.
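The note to implementors can be illustrated with a toy class that snapshots the variables each time a job is enqueued, so that later setEnv() calls affect only jobs issued afterwards. `EnvSnapshotBatchSystem` is hypothetical, and raising `RuntimeError` when the variable is absent from the current environment is an assumption, since the text above does not specify that case.

```python
import os


class EnvSnapshotBatchSystem:
    """Hypothetical batch system that records the environment per job."""

    def __init__(self):
        self.environment = {}
        self.queued = []  # list of (command, environment snapshot)

    def setEnv(self, name, value=None):
        if value is None:
            # Fall back to the current process environment.
            try:
                value = os.environ[name]
            except KeyError:
                # Assumption: fail loudly if the variable is not set anywhere.
                raise RuntimeError("%s is not set in the current environment" % name)
        self.environment[name] = value

    def issueBatchJob(self, command):
        # Copy the variables so later setEnv() calls do not alter this job.
        self.queued.append((command, dict(self.environment)))
```

Without the `dict(...)` copy, every queued job would share one mutable mapping, and a later setEnv() call would silently change the environment of jobs that were already enqueued.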
-
classmethod