toil.batchSystems.mesos.executor¶
Attributes¶
Classes¶
Partial implementation of AbstractBatchSystem, support methods. |
|
Pass initial attributes to the constructor: |
|
Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked. |
|
Part of Toil's Mesos framework, runs on a Mesos agent. A Toil job is passed to it via the |
Functions¶
Get the rounded-up integer number of whole CPUs available. |
|
Set up the root logger with handlers and formatting. |
|
|
Sets the root logger level to a given string level (like "INFO"). |
|
Module Contents¶
- class toil.batchSystems.mesos.executor.BatchSystemSupport(config, maxCores, maxMemory, maxDisk)¶
Bases:
AbstractBatchSystemPartial implementation of AbstractBatchSystem, support methods.
- Parameters:
config (toil.common.Config)
maxCores (float)
maxMemory (int)
maxDisk (int)
- check_resource_request(requirer)¶
Check resource request is not greater than that available or allowed.
- Parameters:
requirer (toil.job.Requirer) – Object whose requirements are being checked
job_name (str) – Name of the job being checked, for generating a useful error report.
detail (str) – Batch-system-specific message to include in the error.
- Raises:
InsufficientSystemResources – raised when a resource is requested in an amount greater than allowed
- Return type:
None
- setEnv(name, value=None)¶
Set an environment variable for the worker process before it is launched. The worker process will typically inherit the environment of the machine it is running on but this method makes it possible to override specific variables in that inherited environment before the worker is launched. Note that this mechanism is different to the one used by the worker internally to set up the environment of a job. A call to this method affects all jobs issued after this method returns. Note to implementors: This means that you would typically need to copy the variables before enqueuing a job.
If no value is provided it will be looked up from the current environment.
- Parameters:
- Raises:
RuntimeError – if value is None and the name cannot be found in the environment
- Return type:
None
- set_message_bus(message_bus)¶
Give the batch system an opportunity to connect directly to the message bus, so that it can send informational messages about the jobs it is running to other Toil components.
- Parameters:
message_bus (toil.bus.MessageBus)
- Return type:
None
- get_batch_logs_dir()¶
Get the directory where the backing batch system should save its logs.
Only really makes sense if the backing batch system actually saves logs to a filesystem; Kubernetes for example does not. Ought to be a directory shared between the leader and the workers, if the backing batch system writes logs onto the worker’s view of the filesystem, like many HPC schedulers do.
- Return type:
- format_std_out_err_path(toil_job_id, cluster_job_id, std)¶
Format path for batch system standard output/error and other files generated by the batch system itself.
Files will be written to the batch logs directory (–batchLogsDir, defaulting to the Toil work directory) with names containing both the Toil and batch system job IDs, for ease of debugging job failures.
- Param:
int toil_job_id : The unique id that Toil gives a job.
- Param:
cluster_job_id : What the cluster, for example, GridEngine, uses as its internal job id.
- Param:
string std : The provenance of the stream (for example: ‘err’ for ‘stderr’ or ‘out’ for ‘stdout’)
- Return type:
string : Formatted filename; however if self.config.noStdOutErr is true, returns ‘/dev/null’ or equivalent.
- Parameters:
- format_std_out_err_glob(toil_job_id)¶
Get a glob string that will match all file paths generated by format_std_out_err_path for a job.
- static workerCleanup(info)¶
Cleans up the worker node on batch system shutdown.
Also see
supportsWorkerCleanup().- Parameters:
info (WorkerCleanupInfo) – A named tuple consisting of all the relevant information for cleaning up the worker.
- Return type:
None
- class toil.batchSystems.mesos.executor.Expando(*args, **kwargs)¶
Bases:
dictPass initial attributes to the constructor:
>>> o = Expando(foo=42) >>> o.foo 42
Dynamically create new attributes:
>>> o.bar = 'hi' >>> o.bar 'hi'
Expando is a dictionary:
>>> isinstance(o,dict) True >>> o['foo'] 42
Works great with JSON:
>>> import json >>> s='{"foo":42}' >>> o = json.loads(s,object_hook=Expando) >>> o.foo 42 >>> o.bar = 'hi' >>> o.bar 'hi'
And since Expando is a dict, it serializes back to JSON just fine:
>>> json.dumps(o, sort_keys=True) '{"bar": "hi", "foo": 42}'
Attributes can be deleted, too:
>>> o = Expando(foo=42) >>> o.foo 42 >>> del o.foo >>> o.foo Traceback (most recent call last): ... AttributeError: 'Expando' object has no attribute 'foo' >>> o['foo'] Traceback (most recent call last): ... KeyError: 'foo'
>>> del o.foo Traceback (most recent call last): ... AttributeError: foo
And copied:
>>> o = Expando(foo=42) >>> p = o.copy() >>> isinstance(p,Expando) True >>> o == p True >>> o is p False
Same with MagicExpando …
>>> o = MagicExpando() >>> o.foo.bar = 42 >>> p = o.copy() >>> isinstance(p,MagicExpando) True >>> o == p True >>> o is p False
… but the copy is shallow:
>>> o.foo is p.foo True
- copy()¶
D.copy() -> a shallow copy of D
- toil.batchSystems.mesos.executor.cpu_count()¶
Get the rounded-up integer number of whole CPUs available.
Counts hyperthreads as CPUs.
Uses the system’s actual CPU count, or the current v1 cgroup’s quota per period, if the quota is set.
Ignores the cgroup’s cpu shares value, because it’s extremely difficult to interpret. See https://github.com/kubernetes/kubernetes/issues/81021.
Caches result for efficiency.
- Returns:
Integer count of available CPUs, minimum 1.
- Return type:
- class toil.batchSystems.mesos.executor.Resource¶
Bases:
namedtuple('Resource', ('name','pathHash','url','contentHash'))Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.
Each instance is a namedtuple with the following elements:
The pathHash element contains the MD5 (in hexdigest form) of the path to the resource on the leader node. The path, and therefore its hash is unique within a job store.
The url element is a “file:” or “http:” URL at which the resource can be obtained.
The contentHash element is an MD5 checksum of the resource, allowing for validation and caching of resources.
If the resource is a regular file, the type attribute will be ‘file’.
If the resource is a directory, the type attribute will be ‘dir’ and the URL will point at a ZIP archive of that directory.
- resourceEnvNamePrefix = 'JTRES_'¶
- rootDirPathEnvName¶
- classmethod create(jobStore, leaderPath)¶
Saves the content of the file or directory at the given path to the given job store and returns a resource object representing that content for the purpose of obtaining it again at a generic, public URL. This method should be invoked on the leader node.
- Parameters:
leaderPath (str)
- Return type:
- refresh(jobStore)¶
- Parameters:
- Return type:
- classmethod prepareSystem()¶
Prepares this system for the downloading and lookup of resources. This method should only be invoked on a worker node. It is idempotent but not thread-safe.
- Return type:
None
- classmethod cleanSystem()¶
Remove all downloaded, localized resources.
- Return type:
None
- register()¶
Register this resource for later retrieval via lookup(), possibly in a child process.
- Return type:
None
- classmethod lookup(leaderPath)¶
Return a resource object representing a resource created from a file or directory at the given path on the leader.
This method should be invoked on the worker. The given path does not need to refer to an existing file or directory on the worker, it only identifies the resource within an instance of toil. This method returns None if no resource for the given path exists.
- download(callback=None)¶
Download this resource from its URL to a file on the local system.
This method should only be invoked on a worker node after the node was setup for accessing resources via prepareSystem().
- Parameters:
callback (Optional[Callable[[str], None]])
- Return type:
None
- property localPath: str¶
- Abstractmethod:
- Return type:
Get the path to resource on the worker.
The file or directory at the returned path may or may not yet exist. Invoking download() will ensure that it does.
- toil.batchSystems.mesos.executor.configure_root_logger()¶
Set up the root logger with handlers and formatting.
Should be called before any entry point tries to log anything, to ensure consistent formatting.
- Return type:
None
- toil.batchSystems.mesos.executor.set_log_level(level, set_logger=None)¶
Sets the root logger level to a given string level (like “INFO”).
- Parameters:
level (str)
set_logger (Optional[logging.Logger])
- Return type:
None
- toil.batchSystems.mesos.executor.log¶
- class toil.batchSystems.mesos.executor.MesosExecutor¶
Bases:
pymesos.ExecutorPart of Toil’s Mesos framework, runs on a Mesos agent. A Toil job is passed to it via the task.data field, and launched via call(toil.command).
- registered(driver, executorInfo, frameworkInfo, agentInfo)¶
Invoked once the executor driver has been able to successfully connect with Mesos.
- reregistered(driver, agentInfo)¶
Invoked when the executor re-registers with a restarted agent.
- disconnected(driver)¶
Invoked when the executor becomes “disconnected” from the agent (e.g., the agent is being restarted due to an upgrade).
- killTask(driver, taskId)¶
Kill parent task process and all its spawned children
- shutdown(driver)¶
- error(driver, message)¶
Invoked when a fatal error has occurred with the executor and/or executor driver.
- launchTask(driver, task)¶
Invoked by SchedulerDriver when a Mesos task should be launched by this executor
- frameworkMessage(driver, message)¶
Invoked when a framework message has arrived for this executor.
- toil.batchSystems.mesos.executor.main()¶