toil.batchSystems.mesos.executor

Attributes

log

Classes

BatchSystemSupport

Partial implementation of AbstractBatchSystem, support methods.

Expando

A dict whose items can also be read, set, and deleted as attributes.

Resource

Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.

MesosExecutor

Part of Toil's Mesos framework, runs on a Mesos agent. A Toil job is passed to it via the task.data field, and launched via call(toil.command).

Functions

cpu_count()

Get the rounded-up integer number of whole CPUs available.

configure_root_logger()

Set up the root logger with handlers and formatting.

set_log_level(level[, set_logger])

Sets the root logger level to a given string level (like "INFO").

main()

Module Contents

class toil.batchSystems.mesos.executor.BatchSystemSupport(config, maxCores, maxMemory, maxDisk)

Bases: AbstractBatchSystem

Partial implementation of AbstractBatchSystem, support methods.

Parameters:
  • config

  • maxCores

  • maxMemory

  • maxDisk

check_resource_request(requirer)

Check resource request is not greater than that available or allowed.

Parameters:
  • requirer (toil.job.Requirer) – Object whose requirements are being checked

  • job_name (str) – Name of the job being checked, for generating a useful error report.

  • detail (str) – Batch-system-specific message to include in the error.

Raises:

InsufficientSystemResources – raised when a resource is requested in an amount greater than allowed

Return type:

None

setEnv(name, value=None)

Set an environment variable for the worker process before it is launched. The worker process typically inherits the environment of the machine it runs on, but this method makes it possible to override specific variables in that inherited environment before the worker is launched. Note that this mechanism is different from the one the worker uses internally to set up the environment of a job. A call to this method affects all jobs issued after this method returns. Note to implementors: this means that you would typically need to copy the variables before enqueuing a job.

If no value is provided it will be looked up from the current environment.

Parameters:
  • name (str) – the environment variable to be set on the worker.

  • value (Optional[str]) – if given, the environment variable given by name will be set to this value. If None, the variable’s current value will be used as the value on the worker

Raises:

RuntimeError – if value is None and the name cannot be found in the environment

Return type:

None
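
The semantics described for setEnv() can be illustrated with a small stand-alone sketch. The class EnvOverrides and its method names are hypothetical and are not Toil's implementation; the sketch only assumes the behavior stated above: a missing value falls back to the current environment, an unknown name raises RuntimeError, and the variables are copied when a job is enqueued.

```python
import os
from typing import Dict, Optional


class EnvOverrides:
    """Hypothetical stand-in for a batch system that tracks worker environment overrides."""

    def __init__(self) -> None:
        self.environment: Dict[str, str] = {}

    def set_env(self, name: str, value: Optional[str] = None) -> None:
        if value is None:
            # No explicit value: look the variable up in the current environment.
            try:
                value = os.environ[name]
            except KeyError:
                raise RuntimeError(f"{name} does not exist in the current environment")
        self.environment[name] = value

    def issue_job(self) -> Dict[str, str]:
        # Copy the overrides so that later set_env() calls only affect
        # jobs issued after they return.
        return dict(self.environment)
```

A job issued before a later set_env() call keeps the snapshot it was given, which is why the copy in issue_job() matters.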

set_message_bus(message_bus)

Give the batch system an opportunity to connect directly to the message bus, so that it can send informational messages about the jobs it is running to other Toil components.

Parameters:

message_bus (toil.bus.MessageBus)

Return type:

None

get_batch_logs_dir()

Get the directory where the backing batch system should save its logs.

Only really makes sense if the backing batch system actually saves logs to a filesystem; Kubernetes for example does not. Ought to be a directory shared between the leader and the workers, if the backing batch system writes logs onto the worker’s view of the filesystem, like many HPC schedulers do.

Return type:

str

format_std_out_err_path(toil_job_id, cluster_job_id, std)

Format path for batch system standard output/error and other files generated by the batch system itself.

Files will be written to the batch logs directory (--batchLogsDir, defaulting to the Toil work directory) with names containing both the Toil and batch system job IDs, for ease of debugging job failures.

Parameters:
  • toil_job_id (int) – The unique ID that Toil gives a job.

  • cluster_job_id (str) – What the cluster (for example, GridEngine) uses as its internal job ID.

  • std (str) – The provenance of the stream (for example, ‘err’ for ‘stderr’ or ‘out’ for ‘stdout’).

Returns:

Formatted filename; however, if self.config.noStdOutErr is true, returns ‘/dev/null’ or equivalent.

Return type:

str

format_std_out_err_glob(toil_job_id)

Get a glob string that will match all file paths generated by format_std_out_err_path for a job.

Parameters:

toil_job_id (int)

Return type:

str
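
How a path formatter and its matching glob fit together can be sketched as follows. The file name pattern, the function names, and the no_std_out_err flag are assumptions made for illustration; the text above only states that names contain both job IDs and that ‘/dev/null’ or an equivalent is returned when output is suppressed.

```python
import os


def format_path(logs_dir: str, toil_job_id: int, cluster_job_id: str,
                std: str, no_std_out_err: bool = False) -> str:
    """Build a log path containing both the Toil and the cluster job ID."""
    if no_std_out_err:
        # Platform-appropriate null device ('/dev/null' on POSIX).
        return os.devnull
    # Hypothetical naming scheme, chosen only for this sketch.
    file_name = f"toil_{toil_job_id}.{cluster_job_id}.{std}.log"
    return os.path.join(logs_dir, file_name)


def format_glob(logs_dir: str, toil_job_id: int) -> str:
    """Build a glob matching every path format_path() yields for one Toil job."""
    return os.path.join(logs_dir, f"toil_{toil_job_id}.*.log")
```

Keeping the Toil job ID directly before a literal dot means the glob for job 7 does not also match files belonging to job 70.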

static workerCleanup(info)

Cleans up the worker node on batch system shutdown.

Also see supportsWorkerCleanup().

Parameters:

info (WorkerCleanupInfo) – A named tuple consisting of all the relevant information for cleaning up the worker.

Return type:

None

class toil.batchSystems.mesos.executor.Expando(*args, **kwargs)

Bases: dict

Pass initial attributes to the constructor:

>>> o = Expando(foo=42)
>>> o.foo
42

Dynamically create new attributes:

>>> o.bar = 'hi'
>>> o.bar
'hi'

Expando is a dictionary:

>>> isinstance(o,dict)
True
>>> o['foo']
42

Works great with JSON:

>>> import json
>>> s='{"foo":42}'
>>> o = json.loads(s,object_hook=Expando)
>>> o.foo
42
>>> o.bar = 'hi'
>>> o.bar
'hi'

And since Expando is a dict, it serializes back to JSON just fine:

>>> json.dumps(o, sort_keys=True)
'{"bar": "hi", "foo": 42}'

Attributes can be deleted, too:

>>> o = Expando(foo=42)
>>> o.foo
42
>>> del o.foo
>>> o.foo
Traceback (most recent call last):
...
AttributeError: 'Expando' object has no attribute 'foo'
>>> o['foo']
Traceback (most recent call last):
...
KeyError: 'foo'
>>> del o.foo
Traceback (most recent call last):
...
AttributeError: foo

And copied:

>>> o = Expando(foo=42)
>>> p = o.copy()
>>> isinstance(p,Expando)
True
>>> o == p
True
>>> o is p
False

Same with MagicExpando …

>>> o = MagicExpando()
>>> o.foo.bar = 42
>>> p = o.copy()
>>> isinstance(p,MagicExpando)
True
>>> o == p
True
>>> o is p
False

… but the copy is shallow:

>>> o.foo is p.foo
True

copy()

D.copy() -> a shallow copy of D
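
One way such a class could be built is sketched below. AttrDict is a hypothetical name and this is not Toil's source; it simply routes attribute access to dictionary items and makes copy() return an instance of the same class, matching the behavior the doctests above describe.

```python
import json


class AttrDict(dict):
    """Hypothetical dict subclass whose items double as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() and copy() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(
                f"{type(self).__name__!r} object has no attribute {name!r}"
            ) from None

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None

    def copy(self):
        # Shallow copy that preserves the subclass type.
        return type(self)(self)


o = json.loads('{"foo": 42}', object_hook=AttrDict)
o.bar = "hi"
```

Because the object is still a plain dict underneath, json.dumps(o, sort_keys=True) serializes it without any custom encoder.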

toil.batchSystems.mesos.executor.cpu_count()

Get the rounded-up integer number of whole CPUs available.

Counts hyperthreads as CPUs.

Uses the system’s actual CPU count, or the current v1 cgroup’s quota per period, if the quota is set.

Ignores the cgroup’s cpu shares value, because it’s extremely difficult to interpret. See https://github.com/kubernetes/kubernetes/issues/81021.

Caches result for efficiency.

Returns:

Integer count of available CPUs, minimum 1.

Return type:

int
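
The rounding rule can be illustrated with a pure function. The function name and parameters are hypothetical, and combining the quota with the machine total by taking the smaller value is an assumption of this sketch. In cgroup v1 the quota and period are exposed as cpu.cfs_quota_us and cpu.cfs_period_us, and a quota of -1 means no limit is set.

```python
import math


def effective_cpu_count(total_cpus: int, quota_us: int, period_us: int) -> int:
    """Return a whole number of usable CPUs, never less than 1.

    total_cpus: logical CPUs on the machine (hyperthreads count as CPUs).
    quota_us / period_us: cgroup v1 CFS quota and period in microseconds.
    """
    if quota_us > 0 and period_us > 0:
        # A quota of 150000 us per 100000 us period is 1.5 CPUs, rounded up to 2.
        limit = math.ceil(quota_us / period_us)
        # Assumption: the quota can only lower the count, never raise it.
        return max(1, min(total_cpus, limit))
    # No quota set: fall back to the machine's CPU count.
    return max(1, total_cpus)
```

For example, a container limited to 1.5 CPUs on an 8-CPU host would be reported as having 2 CPUs under this sketch.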

class toil.batchSystems.mesos.executor.Resource

Bases: namedtuple('Resource', ('name', 'pathHash', 'url', 'contentHash'))

Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.

Each instance is a namedtuple with the following elements:

The pathHash element contains the MD5 (in hexdigest form) of the path to the resource on the leader node. The path, and therefore its hash, is unique within a job store.

The url element is a “file:” or “http:” URL at which the resource can be obtained.

The contentHash element is an MD5 checksum of the resource, allowing for validation and caching of resources.

If the resource is a regular file, the type attribute will be ‘file’.

If the resource is a directory, the type attribute will be ‘dir’ and the URL will point at a ZIP archive of that directory.
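
The two hash elements can be computed with the standard library as sketched below. The helper names are hypothetical, and encoding the path as UTF-8 before hashing is an assumption of this sketch rather than something the description above specifies.

```python
import hashlib


def path_hash(leader_path: str) -> str:
    """MD5 hexdigest of the resource's path on the leader (identifies the resource)."""
    # Assumption: the path is encoded as UTF-8 before hashing.
    return hashlib.md5(leader_path.encode("utf-8")).hexdigest()


def content_hash(data: bytes) -> str:
    """MD5 hexdigest of the resource's bytes (used for validation and caching)."""
    return hashlib.md5(data).hexdigest()
```

The path hash stays the same when the file's content changes, while the content hash changes, which is what makes it possible to detect a stale cached copy.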

resourceEnvNamePrefix = 'JTRES_'
rootDirPathEnvName
classmethod create(jobStore, leaderPath)

Saves the content of the file or directory at the given path to the given job store and returns a resource object representing that content for the purpose of obtaining it again at a generic, public URL. This method should be invoked on the leader node.

Parameters:
  • jobStore (toil.jobStores.abstractJobStore.AbstractJobStore)

  • leaderPath (str)

Return type:

Resource

refresh(jobStore)
Parameters:

jobStore (toil.jobStores.abstractJobStore.AbstractJobStore)

Return type:

Resource

classmethod prepareSystem()

Prepares this system for the downloading and lookup of resources. This method should only be invoked on a worker node. It is idempotent but not thread-safe.

Return type:

None

classmethod cleanSystem()

Remove all downloaded, localized resources.

Return type:

None

register()

Register this resource for later retrieval via lookup(), possibly in a child process.

Return type:

None

classmethod lookup(leaderPath)

Return a resource object representing a resource created from a file or directory at the given path on the leader.

This method should be invoked on the worker. The given path does not need to refer to an existing file or directory on the worker; it only identifies the resource within an instance of Toil. This method returns None if no resource for the given path exists.

Parameters:

leaderPath (str)

Return type:

Optional[Resource]

download(callback=None)

Download this resource from its URL to a file on the local system.

This method should only be invoked on a worker node after the node has been set up for accessing resources via prepareSystem().

Parameters:

callback (Optional[Callable[[str], None]])

Return type:

None

property localPath: str

Abstract property.

Get the path to the resource on the worker.

The file or directory at the returned path may or may not yet exist. Invoking download() will ensure that it does.

Return type:

str

property localDirPath: str

The path to the directory containing the resource on the worker.

Return type:

str

pickle()
Return type:

str

classmethod unpickle(s)
Parameters:

s (str)

Return type:

Resource

toil.batchSystems.mesos.executor.configure_root_logger()

Set up the root logger with handlers and formatting.

Should be called before any entry point tries to log anything, to ensure consistent formatting.

Return type:

None

toil.batchSystems.mesos.executor.set_log_level(level, set_logger=None)

Sets the root logger level to a given string level (like “INFO”).

Parameters:
  • level (str)

  • set_logger

Return type:

None
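
The two logging helpers can be approximated with the standard logging module. The function names, the format string, and the interpretation of the optional second argument as "a logger to adjust instead of the root logger" are assumptions made for this sketch.

```python
import logging
from typing import Optional


def configure_root(fmt: str = "[%(asctime)s] %(levelname)s %(name)s: %(message)s") -> None:
    """Attach one stream handler with a consistent format to the root logger."""
    root = logging.getLogger()
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(fmt))
    root.addHandler(handler)


def set_level(level: str, logger: Optional[logging.Logger] = None) -> None:
    """Set a level given as a string such as "INFO"."""
    # Assumption: when a logger is passed, it is adjusted instead of the root logger.
    target = logger if logger is not None else logging.getLogger()
    # Logger.setLevel() accepts level names as upper-case strings.
    target.setLevel(level.upper())


configure_root()
set_level("info")
logging.getLogger(__name__).info("executor logging configured")
```

Calling the configuration step before any module logs a message ensures every record passes through the same formatter.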

toil.batchSystems.mesos.executor.log

class toil.batchSystems.mesos.executor.MesosExecutor

Bases: pymesos.Executor

Part of Toil’s Mesos framework, runs on a Mesos agent. A Toil job is passed to it via the task.data field, and launched via call(toil.command).

registered(driver, executorInfo, frameworkInfo, agentInfo)

Invoked once the executor driver has been able to successfully connect with Mesos.

reregistered(driver, agentInfo)

Invoked when the executor re-registers with a restarted agent.

disconnected(driver)

Invoked when the executor becomes “disconnected” from the agent (e.g., the agent is being restarted due to an upgrade).

killTask(driver, taskId)

Kill the parent task process and all of its spawned children.
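
A common POSIX technique for killing a process together with its descendants is to start it in its own process group and signal the whole group. The sketch below shows that technique with hypothetical helper names; it is not Toil's implementation, and the choice of SIGTERM is an assumption.

```python
import os
import signal
import subprocess
import sys


def launch_in_own_group(args):
    """Start a command as the leader of a new session and process group (POSIX only)."""
    return subprocess.Popen(args, start_new_session=True)


def kill_process_tree(proc, sig=signal.SIGTERM):
    """Signal the process and every child that stayed in its process group."""
    try:
        os.killpg(os.getpgid(proc.pid), sig)
    except ProcessLookupError:
        # The process has already exited and been reaped; nothing to do.
        pass


# Usage: start a long-running child, then terminate its whole group.
proc = launch_in_own_group([sys.executable, "-c", "import time; time.sleep(60)"])
kill_process_tree(proc)
exit_code = proc.wait(timeout=10)
```

Children that deliberately move themselves into a different process group would escape this approach, so it covers the usual case rather than every case.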

shutdown(driver)
error(driver, message)

Invoked when a fatal error has occurred with the executor and/or executor driver.

launchTask(driver, task)

Invoked by SchedulerDriver when a Mesos task should be launched by this executor.

frameworkMessage(driver, message)

Invoked when a framework message has arrived for this executor.

toil.batchSystems.mesos.executor.main()