toil.wdl.wdltoil¶
Attributes¶
Exceptions¶
Common base class for all non-exit exceptions. |
Classes¶
Protocol that is more specific than what file_digest takes as an argument. |
|
Protocol for the features we need from hashlib.file_digest. |
|
WDL size() implementation that avoids downloading files. |
|
Standard library implementation for WDL as run on Toil. |
|
Standard library implementation for workflow scope. |
|
Standard library implementation to use inside a WDL task command evaluation. |
|
Standard library implementation for WDL as run on Toil, with additional |
|
Base job class for all WDL-related jobs. |
|
Job that determines the resources needed to run a WDL job. |
|
Job that runs a WDL task. |
|
Job that evaluates a WDL workflow node. |
|
Job that evaluates a list of WDL workflow nodes, which are in the same |
|
Job that collects the results from WDL workflow nodes and combines their |
|
Represents a graph of WDL WorkflowNodes. |
|
Job that can create more graph for a section of the workflow. |
|
Job that evaluates a scatter in a WDL workflow. Runs the body for each |
|
Job that takes all new bindings created in an array of input environments, |
|
Job that evaluates a conditional in a WDL workflow. |
|
Job that evaluates an entire WDL workflow. |
|
Job which evaluates an outputs section for a workflow. |
|
Job that evaluates an entire WDL workflow, and returns the workflow outputs |
|
Class represents a unit of work in toil. |
|
Job to organize importing files on workers instead of the leader. Responsible for extracting filenames and metadata, |
Functions¶
|
Run code in a context where WDL errors will be reported with pretty formatting. |
|
Create a decorator to report WDL errors with the given task message. |
|
Remove "common leading whitespace" as defined in the WDL 1.1 spec. |
|
Implementation of a MiniWDL read_source function that can use any |
|
Check if two WDL values are equal when taking into account file virtualization. |
|
Combine variable bindings from multiple predecessor tasks into one set for |
|
Log bindings to the console, even if some are still promises. |
|
Get the supertype that can hold values of all the given types. |
|
Iterate over all WDL workflow nodes in the given node, including inputs, |
|
Get the combined workflow_node_dependencies of root and everything under |
|
Parse a WDL disk spec into a disk mount specification. |
|
Encode a Toil file ID and metadata about who wrote it as a URI. |
|
Unpack a URI made by pack_toil_uri to retrieve the FileID and the basename |
|
Copy all Toil metadata from one WDL File to another. |
|
Return a copy of a WDL File with all metadata intact but the value changed. |
|
Return a copy of a WDL File with all metadata intact but the nonexistent flag set to the given value. |
|
Return the nonexistent flag for a file. |
|
Return a copy of a WDL File with all metadata intact but the virtualized_value attribute set to the given value. |
Get the virtualized storage location for a file. |
|
|
If a File has a shared filesystem path, get that path. |
|
Return a copy of the given File associated with the given shared filesystem path. |
|
Given WDL bindings, return a copy where all files have their shared filesystem paths as their values. |
|
Return the cached result of calling this workflow or task, and its key. |
|
Cache the result of calling a workflow or task. |
|
Select a good directory to save files from a task and source directory in. |
|
Evaluate decls with a given bindings environment and standard library. |
|
|
|
Resolve relative-URI files in the given environment and convert the file values to a new value made from a given mapping. |
|
Resolve relative-URI files in the given environment and import all files. |
|
Evaluate an expression when we know the name of it. |
|
Evaluate the expression of a declaration node, or raise an error. |
|
Evaluate a bunch of expressions with names, and make them into a fresh set of bindings. inputs_dict is a mapping of |
|
If the name of the declaration is already defined in the environment, return its value. Otherwise, return the evaluated expression. |
|
Make sure all the File values embedded in the given bindings point to files |
|
Make sure all the File values embedded in the given bindings point to files |
|
Based off of WDL.runtime.task_container.add_paths from miniwdl |
|
Return None if a file doesn't exist, or its path if it does. |
|
Make sure all the File values embedded in the given bindings point to files |
|
Get the paths of all files in the bindings. Doesn't guarantee that |
|
Run all File values embedded in the given bindings through the given |
|
Run all File values' types and values embedded in the given binding's value through the given |
|
Run all File values embedded in the given value through the given |
|
Run through all nested values embedded in the given value and check that the null values are valid. |
|
|
|
A Toil workflow to interpret WDL input files. |
Module Contents¶
- toil.wdl.wdltoil.logger¶
- class toil.wdl.wdltoil.ReadableFileObj¶
Bases:
Protocol
Protocol that is more specific than what file_digest takes as an argument. Also guarantees a read() method.
Would extend the protocol from Typeshed for hashlib but those are only declared for 3.11+.
- class toil.wdl.wdltoil.FileDigester¶
Bases:
Protocol
Protocol for the features we need from hashlib.file_digest.
- __call__(__f, __alg_name)¶
- Parameters:
__f (ReadableFileObj)
__alg_name (str)
- Return type:
hashlib._Hash
- toil.wdl.wdltoil.file_digest: FileDigester¶
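The FileDigester protocol mirrors the call shape of hashlib.file_digest, which the standard library only provides on Python 3.11 and later. The following is a minimal sketch of how a compatible fallback could look; fallback_file_digest and chunk_size are hypothetical names chosen for illustration, and this is not necessarily how the module itself fills in file_digest.

```python
import hashlib
import io


def fallback_file_digest(f, alg_name, chunk_size=1 << 16):
    """Hash a readable binary file object in chunks (hypothetical helper)."""
    hasher = hashlib.new(alg_name)
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
    return hasher


# Prefer the standard library implementation when it exists (Python 3.11+).
file_digest = getattr(hashlib, "file_digest", fallback_file_digest)

digest = file_digest(io.BytesIO(b"hello"), "sha256").hexdigest()
```

Both paths return a hash object, so callers can use hexdigest() or digest() without caring which implementation was selected.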
- toil.wdl.wdltoil.WDLContext¶
- exception toil.wdl.wdltoil.InsufficientMountDiskSpace(mount_targets, desired_bytes, available_bytes)¶
Bases:
Exception
Common base class for all non-exit exceptions.
- toil.wdl.wdltoil.wdl_error_reporter(task, exit=False, log=logger.critical)¶
Run code in a context where WDL errors will be reported with pretty formatting.
- toil.wdl.wdltoil.F¶
- toil.wdl.wdltoil.report_wdl_errors(task, exit=False, log=logger.critical)¶
Create a decorator to report WDL errors with the given task message.
Decorator can then be applied to a function, and if a WDL error happens it will say that it could not {task}.
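The relationship between the context manager and the decorator factory can be sketched as follows. This is an illustrative stand-in only: error_reporter and report_errors are hypothetical names, ValueError stands in for the WDL error types, and the exit parameter is omitted.

```python
import functools
import logging
from contextlib import contextmanager

logger = logging.getLogger("wdl_sketch")


@contextmanager
def error_reporter(task, log=logger.critical):
    """Report failures with the task message, then re-raise (sketch only)."""
    try:
        yield
    except ValueError as e:  # stand-in for the WDL error types
        log("Could not %s: %s", task, e)
        raise


def report_errors(task, log=logger.critical):
    """Build a decorator that runs a function inside error_reporter."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with error_reporter(task, log=log):
                return func(*args, **kwargs)
        return wrapper
    return decorator


messages = []


# Collect messages in a list instead of logging them, to keep the demo quiet.
@report_errors("parse input", log=lambda fmt, *a: messages.append(fmt % a))
def parse(text):
    return int(text)
```

Calling parse("3") returns normally, while parse("x") records a message beginning with "Could not parse input" and then re-raises the original exception.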
- toil.wdl.wdltoil.remove_common_leading_whitespace(expression, tolerate_blanks=True, tolerate_dedents=False, tolerate_all_whitespace=True, debug=False)¶
Remove “common leading whitespace” as defined in the WDL 1.1 spec.
See <https://github.com/openwdl/wdl/blob/main/versions/1.1/SPEC.md#stripping-leading-whitespace>.
Operates on a WDL.Expr.String expression that has already been parsed.
- Parameters:
tolerate_blanks (bool) – If True, don’t allow totally blank lines to zero the common whitespace.
tolerate_dedents (bool) – If True, remove as much of the whitespace on the first indented line as is found on subsequent lines, regardless of whether later lines are out-dented relative to it.
tolerate_all_whitespace (bool) – If True, don’t allow all-whitespace lines to reduce the common whitespace prefix.
debug (bool) – If True, the function will show its work by logging at debug level.
expression (WDL.Expr.String)
- Return type:
WDL.Expr.String
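To illustrate the idea of stripping common leading whitespace, here is a simplified sketch that works on a plain Python string rather than a parsed WDL.Expr.String. The function name is hypothetical, and the sketch always ignores blank and all-whitespace lines when computing the prefix; it does not model the tolerate_dedents or debug options.

```python
import os


def strip_common_leading_whitespace(text):
    """Remove the whitespace prefix shared by all non-blank lines."""
    lines = text.split("\n")
    # Collect the leading whitespace of each line that has real content.
    indents = [
        line[: len(line) - len(line.lstrip(" \t"))]
        for line in lines
        if line.strip()
    ]
    if not indents:
        return text
    # commonprefix compares strings character by character.
    prefix = os.path.commonprefix(indents)
    return "\n".join(
        line[len(prefix):] if line.startswith(prefix) else line.lstrip(" \t")
        for line in lines
    )


command = "    echo a\n\n      echo b\n    echo c"
stripped = strip_common_leading_whitespace(command)
```

Here the blank second line does not reduce the common prefix, so four spaces are removed from every content line and the relative indentation of "echo b" is kept.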
- async toil.wdl.wdltoil.toil_read_source(uri, path, importer)¶
Implementation of a MiniWDL read_source function that can use any filename or URL supported by Toil.
Needs to be async because MiniWDL will await its result.
- toil.wdl.wdltoil.virtualized_equal(value1, value2)¶
Check if two WDL values are equal when taking into account file virtualization.
Treats virtualized and non-virtualized Files referring to the same underlying file as equal.
- Parameters:
value1 (WDL.Value.Base) – WDL value
value2 (WDL.Value.Base) – WDL value
- Returns:
Whether the two values are equal with file virtualization accounted for
- Return type:
- toil.wdl.wdltoil.WDLBindings¶
- toil.wdl.wdltoil.combine_bindings(all_bindings)¶
Combine variable bindings from multiple predecessor tasks into one set for the current task.
- Parameters:
all_bindings (Sequence[WDLBindings])
- Return type:
WDLBindings
- toil.wdl.wdltoil.log_bindings(log_function, message, all_bindings)¶
Log bindings to the console, even if some are still promises.
- Parameters:
log_function (Callable[Ellipsis, None]) – Function (like logger.info) to call to log data
message (str) – Message to log before the bindings
all_bindings (Sequence[toil.job.Promised[WDLBindings]]) – A list of bindings or promises for bindings, to log
- Return type:
None
- toil.wdl.wdltoil.get_supertype(types)¶
Get the supertype that can hold values of all the given types.
- Parameters:
types (Sequence[WDL.Type.Base])
- Return type:
WDL.Type.Base
- toil.wdl.wdltoil.for_each_node(root)¶
Iterate over all WDL workflow nodes in the given node, including inputs, internal nodes of conditionals and scatters, and gather nodes.
- Parameters:
root (WDL.Tree.WorkflowNode)
- Return type:
Iterator[WDL.Tree.WorkflowNode]
- toil.wdl.wdltoil.recursive_dependencies(root)¶
Get the combined workflow_node_dependencies of root and everything under it, which are not on anything in that subtree.
Useful because section nodes can have internal nodes with dependencies not reflected in those of the section node itself.
- toil.wdl.wdltoil.parse_disks(spec, disks_spec)¶
Parse a WDL disk spec into a disk mount specification.
- Parameters:
spec – Disks spec to parse
disks_spec – All disks spec as specified in the WDL file. Only used for better error messages.
- Returns:
Specified mount point (None if omitted or local-disk), number of units, size of unit (ex GB)
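As a rough illustration of the documented return shape (mount point, number of units, unit), the sketch below parses a single whitespace-separated disk spec. It is a simplified assumption of the input grammar, not the module's parser: parse_disk_spec and default_unit are hypothetical, and error reporting using the full disks_spec is left out.

```python
from typing import Optional, Tuple


def parse_disk_spec(spec: str, default_unit: str = "GiB") -> Tuple[Optional[str], float, str]:
    """Split one disk spec into (mount point, number of units, unit)."""
    parts = spec.split()
    mount_point = None
    # A leading non-numeric token is treated as the mount point.
    if parts and not parts[0].replace(".", "", 1).isdigit():
        mount_point = parts.pop(0)
    if not parts:
        raise ValueError(f"No size found in disk spec: {spec!r}")
    count = float(parts.pop(0))
    # Fall back to an assumed default unit when none is given.
    unit = parts.pop(0) if parts else default_unit
    if mount_point == "local-disk":
        # The documented return value uses None for local-disk.
        mount_point = None
    return mount_point, count, unit
```

For example, parse_disk_spec("/mnt/data 20 GB") yields the mount point "/mnt/data" with 20 units of "GB", while "local-disk 5 GiB" yields None as the mount point.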
- toil.wdl.wdltoil.pack_toil_uri(file_id, task_path, dir_id, file_basename)¶
Encode a Toil file ID and metadata about who wrote it as a URI.
The URI will start with the scheme in TOIL_URI_SCHEME.
- Parameters:
file_id (toil.fileStores.FileID)
task_path (str)
dir_id (uuid.UUID)
file_basename (str)
- Return type:
- toil.wdl.wdltoil.unpack_toil_uri(toil_uri)¶
Unpack a URI made by pack_toil_uri to retrieve the FileID and the basename (no path prefix) that the file is supposed to have.
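The pack/unpack pair is a round trip: several pieces of metadata are encoded into one URI string and later recovered. A minimal sketch of that pattern follows, under stated assumptions: EXAMPLE_SCHEME is a made-up scheme standing in for TOIL_URI_SCHEME, the file ID is a plain string rather than a toil.fileStores.FileID, and the actual encoding used by the module may differ.

```python
import uuid
from urllib.parse import quote, unquote

EXAMPLE_SCHEME = "examplefile:"  # hypothetical; stands in for TOIL_URI_SCHEME


def pack_uri(file_id: str, task_path: str, dir_id: uuid.UUID, file_basename: str) -> str:
    """Encode the four fields into a single URI string."""
    parts = [file_id, task_path, str(dir_id), file_basename]
    # Percent-encode every part, including "/", so "/" can separate them.
    return EXAMPLE_SCHEME + "/".join(quote(part, safe="") for part in parts)


def unpack_uri(uri: str):
    """Recover the four fields from a URI made by pack_uri."""
    if not uri.startswith(EXAMPLE_SCHEME):
        raise ValueError(f"Not a packed URI: {uri}")
    fields = uri[len(EXAMPLE_SCHEME):].split("/")
    file_id, task_path, dir_id, basename = (unquote(f) for f in fields)
    return file_id, task_path, uuid.UUID(dir_id), basename


packed = pack_uri("files/abc 123", "wf.task", uuid.UUID(int=1), "reads.fq")
unpacked = unpack_uri(packed)
```

Encoding each field separately keeps the separator unambiguous, so file IDs or task paths that contain slashes or spaces survive the round trip.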
- toil.wdl.wdltoil.SHARED_PATH_ATTR = '_shared_fs_path'¶
- toil.wdl.wdltoil.clone_metadata(old_file, new_file)¶
Copy all Toil metadata from one WDL File to another.
- Parameters:
old_file (WDL.Value.File)
new_file (WDL.Value.File)
- Return type:
None
- toil.wdl.wdltoil.set_file_value(file, new_value)¶
Return a copy of a WDL File with all metadata intact but the value changed.
- Parameters:
file (WDL.Value.File)
new_value (str)
- Return type:
WDL.Value.File
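The set_file_* helpers in this module all follow a copy-then-modify pattern: the original File is left untouched and a copy carrying the same metadata is returned. A minimal sketch of that pattern, using a stand-in class instead of WDL.Value.File (FakeFile and with_value are hypothetical names):

```python
import copy


class FakeFile:
    """Stand-in for WDL.Value.File with one extra metadata attribute."""

    def __init__(self, value):
        self.value = value
        self.nonexistent = False


def with_value(file, new_value):
    """Return a shallow copy with a new value; the original is untouched."""
    new_file = copy.copy(file)
    new_file.value = new_value
    return new_file


original = FakeFile("input.txt")
original.nonexistent = True
renamed = with_value(original, "/data/input.txt")
```

The copy keeps the nonexistent flag set on the original, while the original keeps its old value.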
- toil.wdl.wdltoil.set_file_nonexistent(file, nonexistent)¶
Return a copy of a WDL File with all metadata intact but the nonexistent flag set to the given value.
- Parameters:
file (WDL.Value.File)
nonexistent (bool)
- Return type:
WDL.Value.File
- toil.wdl.wdltoil.get_file_nonexistent(file)¶
Return the nonexistent flag for a file.
- Parameters:
file (WDL.Value.File)
- Return type:
- toil.wdl.wdltoil.set_file_virtualized_value(file, virtualized_value)¶
Return a copy of a WDL File with all metadata intact but the virtualized_value attribute set to the given value.
- Parameters:
file (WDL.Value.File)
virtualized_value (str)
- Return type:
WDL.Value.File
- toil.wdl.wdltoil.get_file_virtualized_value(file)¶
Get the virtualized storage location for a file.
- Parameters:
file (WDL.Value.File)
- Return type:
Optional[str]
If a File has a shared filesystem path, get that path.
This will be the path the File was initially imported from, or the path that it has in the call cache.
- Parameters:
file (WDL.Value.File)
- Return type:
Optional[str]
Return a copy of the given File associated with the given shared filesystem path.
This should be the path it was initially imported from, or the path that it has in the call cache.
- Parameters:
file (WDL.Value.File)
path (str)
- Return type:
WDL.Value.File
Given WDL bindings, return a copy where all files have their shared filesystem paths as their values.
- Parameters:
bindings (WDL.Env.Bindings[WDL.Value.Base])
- Return type:
WDL.Env.Bindings[WDL.Value.Base]
- toil.wdl.wdltoil.poll_execution_cache(node, bindings)¶
Return the cached result of calling this workflow or task, and its key.
Returns None and the key if the cache has no result for us.
Deals in un-namespaced bindings.
- toil.wdl.wdltoil.fill_execution_cache(cache_key, output_bindings, file_store, wdl_options, miniwdl_logger=None, miniwdl_config=None)¶
Cache the result of calling a workflow or task.
Deals in un-namespaced bindings.
- Returns:
possibly modified bindings to continue on with, that may reference the cache.
- Parameters:
cache_key (str)
output_bindings (WDLBindings)
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
wdl_options (WDLContext)
miniwdl_logger (Optional[logging.Logger])
miniwdl_config (Optional[WDL.runtime.config.Loader])
- Return type:
WDLBindings
- toil.wdl.wdltoil.DirectoryNamingStateDict¶
- toil.wdl.wdltoil.choose_human_readable_directory(root_dir, source_task_path, parent_id, state)¶
Select a good directory to save files from a task and source directory in.
The directories involved may not exist.
- Parameters:
root_dir (str) – Directory that the path will be under
source_task_path (str) – The dotted WDL name of whatever generated the file. We assume this is an acceptable filename component.
parent_id (str) – UUID of the directory that the file came from. All files with the same parent ID will be placed as sibling files in a shared parent directory.
state (DirectoryNamingStateDict) – A state dict that must be passed to repeated calls.
- Return type:
- toil.wdl.wdltoil.evaluate_decls_to_bindings(decls, all_bindings, standard_library, include_previous=False, drop_missing_files=False)¶
Evaluate decls with a given bindings environment and standard library.
Creates a new bindings object that only contains the bindings from the given decls. Guarantees that each decl in decls can access the variables defined by the previous ones.
- Parameters:
decls (list[WDL.Tree.Decl]) – Decls to evaluate
all_bindings (WDL.Env.Bindings[WDL.Value.Base]) – Environment to use when evaluating decls
standard_library (ToilWDLStdLibBase) – Standard library
include_previous (bool) – Whether to include the existing environment in the new returned environment. This will be false for outputs where only defined decls should be included
drop_missing_files (bool) – Whether to coerce nonexistent files to null. The coerced elements will be checked that the transformation is valid. Currently should only be enabled in output sections, see https://github.com/openwdl/wdl/issues/673#issuecomment-2248828116
- Returns:
New bindings object
- Return type:
WDL.Env.Bindings[WDL.Value.Base]
- class toil.wdl.wdltoil.NonDownloadingSize¶
Bases:
WDL.StdLib._Size
WDL size() implementation that avoids downloading files.
MiniWDL’s default size() implementation downloads the whole file to get its size. We want to be able to get file sizes from code running on the leader, where there may not be space to download the whole file. So we override the fancy class that implements it so that we can handle sizes for FileIDs using the FileID’s stored size info.
- toil.wdl.wdltoil.extract_workflow_inputs(environment)¶
- toil.wdl.wdltoil.convert_files(environment, file_to_id, file_to_data, task_path)¶
Resolve relative-URI files in the given environment and convert the file values to a new value made from a given mapping.
Will return bindings with file values set to their corresponding relative-URI.
- Parameters:
environment (WDLBindings) – Bindings to evaluate on
file_to_id (Dict[str, toil.fileStores.FileID])
file_to_data (Dict[str, toil.job.FileMetadata])
task_path (str)
- Returns:
new bindings object
- Return type:
WDLBindings
- toil.wdl.wdltoil.convert_remote_files(environment, file_source, task_path, search_paths=None, import_remote_files=True, execution_dir=None)¶
Resolve relative-URI files in the given environment and import all files.
Returns an environment where each File’s value is set to the URI it was found at, its virtualized value is set to what it was loaded into the filestore as (if applicable), and its shared filesystem path is set if it came from the local filesystem.
- Parameters:
environment (WDLBindings) – Bindings to evaluate on
file_source (toil.jobStores.abstractJobStore.AbstractJobStore) – Context to search for files with
task_path (str) – Dotted WDL name of the user-level code doing the importing (probably the workflow name).
search_paths (Optional[list[str]]) – If set, try resolving input location relative to the URLs or directories in this list.
import_remote_files (bool) – If set, import files from remote locations. Else leave them as URI references.
execution_dir (Optional[str])
- Return type:
WDLBindings
- class toil.wdl.wdltoil.ToilWDLStdLibBase(file_store, wdl_options, share_files_with=None)¶
Bases:
WDL.StdLib.Base
Standard library implementation for WDL as run on Toil.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
wdl_options (WDLContext)
share_files_with (ToilWDLStdLibBase | None)
- size¶
- get_local_paths()¶
Get all the local paths of files devirtualized (or virtualized) through the stdlib.
- static devirtualize_to(filename, dest_dir, file_source, state, wdl_options, devirtualized_to_virtualized=None, virtualized_to_devirtualized=None, export=None)¶
Download or export a WDL virtualized filename/URL to the given directory.
The destination directory must already exist. No other devirtualize_to call may be writing to it, including the case of another workflow writing the same task to the same place in the call cache at the same time.
Makes sure sibling files stay siblings and files with the same name don’t clobber each other. Called from within this class for tasks, and statically at the end of the workflow for outputs.
Returns the local path to the file. If the file is already a local path, or if it already has an entry in virtualized_to_devirtualized, that path will be re-used instead of creating a new copy in dest_dir.
The input filename could already be devirtualized. In this case, the filename should not be added to the cache.
- Parameters:
state (DirectoryNamingStateDict) – State dict which must be shared among successive calls into a dest_dir.
wdl_options (WDLContext) – WDL options to carry through.
export (bool | None) – Always create exported copies of files rather than views that a FileStore might clean up.
filename (str)
dest_dir (str)
file_source (toil.fileStores.abstractFileStore.AbstractFileStore | toil.common.Toil)
- Return type:
- class toil.wdl.wdltoil.ToilWDLStdLibWorkflow(*args, **kwargs)¶
Bases:
ToilWDLStdLibBase
Standard library implementation for workflow scope.
Handles deduplicating files generated by write_* calls at workflow scope with copies already in the call cache, so that tasks that depend on them can also be fulfilled from the cache.
- Parameters:
args (Any)
kwargs (Any)
- class toil.wdl.wdltoil.ToilWDLStdLibTaskCommand(file_store, container, wdl_options)¶
Bases:
ToilWDLStdLibBase
Standard library implementation to use inside a WDL task command evaluation.
Expects all the filenames in variable bindings to be container-side paths; these are the “virtualized” filenames, while the “devirtualized” filenames are host-side paths.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
container (WDL.runtime.task_container.TaskContainer)
wdl_options (WDLContext)
- container¶
- class toil.wdl.wdltoil.ToilWDLStdLibTaskOutputs(file_store, stdout_path, stderr_path, file_to_mountpoint, wdl_options, share_files_with=None)¶
Bases:
ToilWDLStdLibBase, WDL.StdLib.TaskOutputs
Standard library implementation for WDL as run on Toil, with additional functions only allowed in task output sections.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
stdout_path (str)
stderr_path (str)
wdl_options (WDLContext)
share_files_with (ToilWDLStdLibBase | None)
- toil.wdl.wdltoil.evaluate_named_expression(context, name, expected_type, expression, environment, stdlib)¶
Evaluate an expression when we know the name of it.
- Parameters:
context (WDL.Error.SourceNode | WDL.Error.SourcePosition)
name (str)
expected_type (WDL.Type.Base | None)
expression (WDL.Expr.Base | None)
environment (WDLBindings)
stdlib (WDL.StdLib.Base)
- Return type:
WDL.Value.Base
- toil.wdl.wdltoil.evaluate_decl(node, environment, stdlib)¶
Evaluate the expression of a declaration node, or raise an error.
- Parameters:
node (WDL.Tree.Decl)
environment (WDLBindings)
stdlib (WDL.StdLib.Base)
- Return type:
WDL.Value.Base
- toil.wdl.wdltoil.evaluate_call_inputs(context, expressions, environment, stdlib, inputs_dict=None)¶
Evaluate a bunch of expressions with names, and make them into a fresh set of bindings. inputs_dict is a mapping of variable names to their expected type for the input decls in a task.
- toil.wdl.wdltoil.evaluate_defaultable_decl(node, environment, stdlib)¶
If the name of the declaration is already defined in the environment, return its value. Otherwise, return the evaluated expression.
- Parameters:
node (WDL.Tree.Decl)
environment (WDLBindings)
stdlib (WDL.StdLib.Base)
- Return type:
WDL.Value.Base
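The "defaultable" behaviour described above is a lookup with a lazily evaluated fallback. A minimal sketch using a plain dict as the environment and a callable as the default expression (both are simplifications; evaluate_defaultable is a hypothetical name):

```python
def evaluate_defaultable(name, default_expression, environment):
    """Return the bound value if present, else evaluate the default."""
    if name in environment:
        # A caller-supplied value always wins over the default expression.
        return environment[name]
    return default_expression(environment)


env = {"threads": 8}
threads = evaluate_defaultable("threads", lambda e: 1, env)
memory = evaluate_defaultable("memory", lambda e: e["threads"] * 2, env)
```

Because the default is only evaluated when needed, it can safely refer to other names in the environment.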
- toil.wdl.wdltoil.devirtualize_files(environment, stdlib)¶
Make sure all the File values embedded in the given bindings point to files that are actually available to command line commands. The same virtual file always maps to the same devirtualized filename, even with duplicates.
- Parameters:
environment (WDLBindings)
stdlib (ToilWDLStdLibBase)
- Return type:
WDLBindings
- toil.wdl.wdltoil.virtualize_files(environment, stdlib, enforce_existence=True)¶
Make sure all the File values embedded in the given bindings point to files that are usable from other machines.
- Parameters:
environment (WDLBindings)
stdlib (ToilWDLStdLibBase)
enforce_existence (bool)
- Return type:
WDLBindings
- toil.wdl.wdltoil.add_paths(task_container, host_paths)¶
Based off of WDL.runtime.task_container.add_paths from miniwdl.
Maps the host path to the container paths.
- Parameters:
task_container (WDL.runtime.task_container.TaskContainer)
host_paths (Iterable[str])
- Return type:
None
- toil.wdl.wdltoil.drop_if_missing(file, standard_library)¶
Return None if a file doesn’t exist, or its path if it does.
filename represents a URI or file name belonging to a WDL value of type value_type. work_dir represents the current working directory of the job and is where all relative paths will be interpreted from
- Parameters:
file (WDL.Value.File)
standard_library (ToilWDLStdLibBase)
- Return type:
WDL.Value.File | None
- toil.wdl.wdltoil.drop_missing_files(environment, standard_library)¶
Make sure all the File values embedded in the given bindings point to files that exist, or are null.
Files must not be virtualized.
- Parameters:
environment (WDLBindings)
standard_library (ToilWDLStdLibBase)
- Return type:
WDLBindings
- toil.wdl.wdltoil.get_file_paths_in_bindings(environment)¶
Get the paths of all files in the bindings. Doesn’t guarantee that duplicates are removed.
TODO: Duplicative with WDL.runtime.task._fspaths, except that is internal and supports Directory objects.
- toil.wdl.wdltoil.map_over_files_in_bindings(environment, transform)¶
Run all File values embedded in the given bindings through the given transformation function.
The transformation function must not mutate the original File.
TODO: Replace with WDL.Value.rewrite_env_paths or WDL.Value.rewrite_files
- Parameters:
environment (WDLBindings)
transform (Callable[[WDL.Value.File], WDL.Value.File | None])
- Return type:
WDLBindings
- toil.wdl.wdltoil.map_over_files_in_binding(binding, transform)¶
Run all File values’ types and values embedded in the given binding’s value through the given transformation function.
The transformation function must not mutate the original File.
- Parameters:
binding (WDL.Env.Binding[WDL.Value.Base])
transform (Callable[[WDL.Value.File], WDL.Value.File | None])
- Return type:
WDL.Env.Binding[WDL.Value.Base]
- toil.wdl.wdltoil.map_over_typed_files_in_value(value, transform)¶
Run all File values embedded in the given value through the given transformation function.
The transformation function must not mutate the original File.
If the transform returns None, the file value is changed to Null.
The transform has access to the type information for the value, so it knows if it may return None, depending on if the value is optional or not.
The transform is allowed to return None only if the mapping result won’t actually be used, to allow for scans. So error checking needs to be part of the transform itself.
- Parameters:
value (WDL.Value.Base)
transform (Callable[[WDL.Value.File], WDL.Value.File | None])
- Return type:
WDL.Value.Base
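The mapping described above is a recursive walk that applies a transform to every File it finds and replaces the File with null when the transform returns None. The sketch below shows the shape of that walk on plain Python lists and dicts with a stand-in file class; FakeFile, map_files and drop_tmp are hypothetical, and the type-aware checks of the real function are not modelled.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class FakeFile:
    """Immutable stand-in for WDL.Value.File."""
    value: str


def map_files(value: Any, transform: Callable[[FakeFile], Optional[FakeFile]]) -> Any:
    """Rebuild value with every FakeFile passed through transform."""
    if isinstance(value, FakeFile):
        # A None result plays the role of a WDL null here.
        return transform(value)
    if isinstance(value, list):
        return [map_files(v, transform) for v in value]
    if isinstance(value, dict):
        return {k: map_files(v, transform) for k, v in value.items()}
    return value


def drop_tmp(file: FakeFile) -> Optional[FakeFile]:
    """Example transform: null out files under /tmp, keep the rest."""
    return None if file.value.startswith("/tmp/") else file


nested = {"reads": [FakeFile("/data/a.fq"), FakeFile("/tmp/b.fq")], "n": 2}
mapped = map_files(nested, drop_tmp)
```

The walk builds new containers rather than editing in place, which matches the requirement that the transform must not mutate the original File.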
- toil.wdl.wdltoil.ensure_null_files_are_nullable(value, original_value, expected_type)¶
Run through all nested values embedded in the given value and check that the null values are valid.
If a null value is found that does not have a valid corresponding expected_type, raise an error.
(This is currently only used to check that null values arising from File coercion are in locations with a nullable File? type. If this is to be used elsewhere, the error message should be changed to describe the appropriate types and not just talk about files.)
For example: if one of the nested values is null but the equivalent nested expected_type is not optional, a FileNotFoundError will be raised.
- Parameters:
value (WDL.Value.Base) – WDL base value to check. This is the WDL value that has been transformed and has the null elements
original_value (WDL.Value.Base) – The original WDL base value prior to the transformation. Only used for error messages
expected_type (WDL.Type.Base) – The WDL type of the value
- Return type:
None
- class toil.wdl.wdltoil.WDLBaseJob(wdl_options, **kwargs)¶
Bases:
toil.job.Job
Base job class for all WDL-related jobs.
Responsible for post-processing returned bindings, to do things like add in null values for things not defined in a section. Post-processing operations can be added onto any job before it is saved, and will be applied as long as the job’s run method calls postprocess().
Also responsible for remembering the Toil WDL configuration keys and values.
- Parameters:
wdl_options (WDLContext)
kwargs (Any)
- run(file_store)¶
Run a WDL-related job.
Remember to decorate non-trivial overrides with report_wdl_errors().
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
Any
- then_underlay(underlay)¶
Apply an underlay of backup bindings to the result.
- Parameters:
underlay (toil.job.Promised[WDLBindings])
- Return type:
None
- then_remove(remove)¶
Remove the given bindings from the result.
- Parameters:
remove (toil.job.Promised[WDLBindings])
- Return type:
None
- then_namespace(namespace)¶
Put the result bindings into a namespace.
- Parameters:
namespace (str)
- Return type:
None
- then_overlay(overlay)¶
Overlay the given bindings on top of the (possibly namespaced) result.
- Parameters:
overlay (toil.job.Promised[WDLBindings])
- Return type:
None
- postprocess(bindings)¶
Apply queued changes to bindings.
Should be applied by subclasses’ run() implementations to their return values.
- Parameters:
bindings (WDLBindings)
- Return type:
WDLBindings
- defer_postprocessing(other)¶
Give our postprocessing steps to a different job.
Use this when you are returning a promise for bindings, on the job that issues the promise.
- Parameters:
other (WDLBaseJob)
- Return type:
None
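The then_* methods queue up changes that postprocess() later applies in order. The sketch below illustrates that queueing pattern with plain dicts standing in for WDLBindings; PostprocessingSketch is a hypothetical class, the dotted namespace prefix is an assumption, and promises and defer_postprocessing are not modelled.

```python
class PostprocessingSketch:
    """Queue binding transformations and apply them later, in order."""

    def __init__(self):
        self._steps = []

    def then_underlay(self, underlay):
        self._steps.append(("underlay", underlay))

    def then_remove(self, remove):
        self._steps.append(("remove", remove))

    def then_namespace(self, namespace):
        self._steps.append(("namespace", namespace))

    def then_overlay(self, overlay):
        self._steps.append(("overlay", overlay))

    def postprocess(self, bindings):
        result = dict(bindings)
        for action, argument in self._steps:
            if action == "underlay":
                # Backup values only fill in names that are missing.
                result = {**argument, **result}
            elif action == "remove":
                result = {k: v for k, v in result.items() if k not in argument}
            elif action == "namespace":
                result = {f"{argument}.{k}": v for k, v in result.items()}
            elif action == "overlay":
                # Overlay values replace existing ones.
                result = {**result, **argument}
        return result


job = PostprocessingSketch()
job.then_underlay({"x": 0, "y": None})
job.then_namespace("call")
final = job.postprocess({"x": 1})
```

Here the underlay supplies a null for y without overriding x, and the namespace step then prefixes both names.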
- class toil.wdl.wdltoil.WDLTaskWrapperJob(task, prev_node_results, task_id, wdl_options, **kwargs)¶
Bases:
WDLBaseJob
Job that determines the resources needed to run a WDL job.
Responsible for evaluating the input declarations for unspecified inputs, evaluating the runtime section, and scheduling or chaining to the real WDL job.
All bindings are in terms of task-internal names.
- Parameters:
- run(file_store)¶
Evaluate inputs and runtime and schedule the task.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLTaskJob(task, task_internal_bindings, runtime_bindings, task_id, mount_spec, wdl_options, cache_key=None, **kwargs)¶
Bases:
WDLBaseJob
Job that runs a WDL task.
Responsible for re-evaluating input declarations for unspecified inputs, evaluating the runtime section, re-scheduling if resources are not available, running any command, and evaluating the outputs.
All bindings are in terms of task-internal names.
- Parameters:
- INJECTED_MESSAGE_DIR = '.toil_wdl_runtime'¶
- add_injections(command_string, task_container)¶
Inject extra Bash code from the Toil WDL runtime into the command for the container.
Currently doesn’t implement the MiniWDL plugin system, but does add resource usage monitoring to Docker containers.
- handle_injection_messages(outputs_library)¶
Handle any data received from injected runtime code in the container.
- Parameters:
outputs_library (ToilWDLStdLibTaskOutputs)
- Return type:
None
- handle_message_file(file_path)¶
Handle a message file received from in-container injected code.
Takes the host-side path of the file.
- Parameters:
file_path (str)
- Return type:
None
- can_mount_proc()¶
Determine if --containall will work for Singularity.
On Kubernetes, this will result in an “operation not permitted” error, so if Kubernetes is detected, return False. See: https://github.com/apptainer/singularity/issues/5857
- Returns:
bool
- Return type:
- ensure_mount_point(file_store, mount_spec)¶
Ensure the mount point sources are available.
Will check if the mount point source has the requested amount of space available.
Note: We are depending on Toil’s job scheduling backend to error when the sum of multiple mount points’ disk requests is greater than the total available. For example, if a task has two mount points requesting 100 GB each but there is only 100 GB available, the df check may pass, but Toil should fail to schedule the jobs internally.
- Parameters:
mount_spec (dict[str | None, int]) – Mount specification from the disks attribute in the WDL task. Is a dict where key is the mount point target and value is the size
file_store (toil.fileStores.abstractFileStore.AbstractFileStore) – File store to create a tmp directory for the mount point source
- Returns:
Dict mapping mount point target to mount point source
- Return type:
- run(file_store)¶
Actually run the task.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLWorkflowNodeJob(node, prev_node_results, wdl_options, **kwargs)¶
Bases:
WDLBaseJob
Job that evaluates a WDL workflow node.
- Parameters:
node (WDL.Tree.WorkflowNode)
prev_node_results (Sequence[toil.job.Promised[WDLBindings]])
wdl_options (WDLContext)
kwargs (Any)
- run(file_store)¶
Actually execute the workflow node.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLWorkflowNodeListJob(nodes, prev_node_results, wdl_options, **kwargs)¶
Bases:
WDLBaseJob
Job that evaluates a list of WDL workflow nodes, which are in the same scope and in a topological dependency order, and which do not call out to any other workflows or tasks or sections.
- Parameters:
nodes (list[WDL.Tree.WorkflowNode])
prev_node_results (Sequence[toil.job.Promised[WDLBindings]])
wdl_options (WDLContext)
kwargs (Any)
- run(file_store)¶
Actually execute the workflow nodes.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLCombineBindingsJob(prev_node_results, **kwargs)¶
Bases:
WDLBaseJob
Job that collects the results from WDL workflow nodes and combines their environment changes.
- Parameters:
prev_node_results (Sequence[toil.job.Promised[WDLBindings]])
kwargs (Any)
- run(file_store)¶
Aggregate incoming results.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
WDLBindings
- class toil.wdl.wdltoil.WDLWorkflowGraph(nodes)¶
Represents a graph of WDL WorkflowNodes.
Operates at a certain level of instantiation (i.e. sub-sections are represented by single nodes).
Assumes all relevant nodes are provided; dependencies outside the provided nodes are assumed to be satisfied already.
- Parameters:
nodes (Sequence[WDL.Tree.WorkflowNode])
- real_id(node_id)¶
Map multiple IDs for what we consider the same node to one ID.
This elides/resolves gathers.
- is_decl(node_id)¶
Return True if a node represents a WDL declaration, and False otherwise.
- get_dependencies(node_id)¶
Get all the nodes that a node depends on, recursively (into the node if it has a body) but not transitively.
Produces dependencies after resolving gathers and internal-to-section dependencies, on nodes that are also in this graph.
- get_transitive_dependencies(node_id)¶
Get all the nodes that a node depends on, transitively.
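The difference between direct and transitive dependencies can be sketched with a plain dict. This is an assumption-laden illustration: the `direct` mapping and the `transitive_dependencies` helper are hypothetical and do not use the WDLWorkflowGraph class.

```python
def transitive_dependencies(node_id, direct):
    """Collect every node reachable by following direct dependencies."""
    seen = set()
    stack = list(direct.get(node_id, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            # Follow this dependency's own dependencies as well.
            stack.extend(direct.get(dep, ()))
    return seen


# Hypothetical node IDs: "c" depends on "b", which depends on "a".
direct = {"c": {"b"}, "b": {"a"}, "a": set()}

direct_of_c = direct["c"]
transitive_of_c = transitive_dependencies("c", direct)
```

In this sketch the direct dependencies of "c" contain only "b", while the transitive set also includes "a".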
- topological_order()¶
Get a topological order of the nodes, based on their dependencies.
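As a rough illustration of what a topological order over dependencies means, the sketch below uses the standard library's graphlib module on a hypothetical dependency mapping. It does not show how WDLWorkflowGraph computes its order; the node IDs are invented for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical node IDs; each key maps to the set of nodes it depends on.
dependencies = {
    "decl_ref": set(),
    "call_align": {"decl_ref"},
    "call_sort": {"call_align"},
}

# static_order() yields each node only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

Every node appears after everything it depends on, so running the nodes in this order never references a result that has not been produced yet.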
- class toil.wdl.wdltoil.WDLSectionJob(wdl_options, **kwargs)¶
Bases:
WDLBaseJob
Job that can create more graph for a section of the workflow.
- Parameters:
wdl_options (WDLContext)
kwargs (Any)
- static coalesce_nodes(order, section_graph)¶
Given a topological order of WDL workflow node IDs, produce a list of lists of IDs, still in topological order, where each list of IDs can be run under a single Toil job.
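One way to picture this grouping is the sketch below. It assumes that consecutive "simple" nodes (for example, declarations) can share a job while any other node gets a job of its own; that criterion, the `is_simple` predicate, and the `coalesce` helper are assumptions for illustration and may differ from what coalesce_nodes actually does.

```python
def coalesce(order, is_simple):
    """Group consecutive simple node IDs together, preserving the given order."""
    groups = []
    for node_id in order:
        # Extend the last group only if it holds simple nodes and this one is simple too.
        if groups and is_simple(node_id) and is_simple(groups[-1][-1]):
            groups[-1].append(node_id)
        else:
            groups.append([node_id])
    return groups


# Hypothetical IDs in topological order; declarations are treated as simple.
order = ["decl_a", "decl_b", "call_x", "decl_c"]
groups = coalesce(order, lambda node_id: node_id.startswith("decl"))
```

Because groups are built from consecutive entries, flattening the result gives back the original topological order.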
- create_subgraph(nodes, gather_nodes, environment, local_environment=None, subscript=None)¶
Make a Toil job to evaluate a subgraph inside a workflow or workflow section.
- Returns:
a child Job that will return the aggregated environment after running all the things in the section.
- Parameters:
gather_nodes (Sequence[WDL.Tree.Gather]) – Names exposed by these will always be defined with something, even if the code that defines them does not actually run.
environment (WDLBindings) – Bindings in this environment will be used to evaluate the subgraph and will be passed through.
local_environment (WDLBindings | None) – Bindings in this environment will be used to evaluate the subgraph but will go out of scope at the end of the section.
subscript (int | None) – If the subgraph is being evaluated multiple times, this should be a disambiguating integer for logging.
nodes (Sequence[WDL.Tree.WorkflowNode])
- Return type:
- make_gather_bindings(gathers, undefined)¶
Given a collection of Gathers, create bindings from every identifier gathered, to the given “undefined” placeholder (which would be Null for a single execution of the body, or an empty array for a completely unexecuted scatter).
These bindings can be overlaid with bindings from the actual execution, so that references to names defined in unexecuted code get a proper default undefined value, and not a KeyError at runtime.
The information to do this comes from MiniWDL’s “gathers” system: <https://miniwdl.readthedocs.io/en/latest/WDL.html#WDL.Tree.WorkflowSection.gathers>
TODO: This approach will scale as O(n^2) when run on n nested conditionals, because generating these bindings for the outer conditional will visit all the bindings from the inner ones.
- Parameters:
gathers (Sequence[WDL.Tree.Gather])
undefined (WDL.Value.Base)
- Return type:
WDLBindings
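The overlay idea described for make_gather_bindings can be sketched with plain dicts standing in for WDLBindings. The helper names are hypothetical, and Python's None and an empty list stand in for the WDL Null value and empty array.

```python
def make_defaults(names, undefined):
    """Bind every gathered name to the given placeholder value."""
    return {name: undefined for name in names}


def overlay(defaults, actual):
    """Bindings from an actual execution take priority over the placeholders."""
    merged = dict(defaults)
    merged.update(actual)
    return merged


gathered = ["x", "y"]

# Conditional whose body did not run: every name falls back to the placeholder.
skipped = overlay(make_defaults(gathered, None), {})

# Conditional whose body ran: real values replace the placeholders.
taken = overlay(make_defaults(gathered, None), {"x": 1, "y": 2})

# Completely unexecuted scatter: every name falls back to an empty array.
empty_scatter = overlay(make_defaults(gathered, []), {})
```

Looking up "x" in `skipped` then yields the placeholder rather than raising a KeyError.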
- class toil.wdl.wdltoil.WDLScatterJob(scatter, prev_node_results, wdl_options, **kwargs)¶
Bases:
WDLSectionJob
Job that evaluates a scatter in a WDL workflow. Runs the body for each value in an array, and makes arrays of the new bindings created in each instance of the body. If an instance of the body doesn’t create a binding, it gets a null value in the corresponding array.
- Parameters:
scatter (WDL.Tree.Scatter)
prev_node_results (Sequence[toil.job.Promised[WDLBindings]])
wdl_options (WDLContext)
kwargs (Any)
- run(file_store)¶
Run the scatter.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLArrayBindingsJob(input_bindings, base_bindings, **kwargs)¶
Bases:
WDLBaseJob
Job that takes all new bindings created in an array of input environments, relative to a base environment, and produces bindings where each new binding name is bound to an array of the values in all the input environments.
Useful for producing the results of a scatter.
- Parameters:
input_bindings (Sequence[toil.job.Promised[WDLBindings]])
base_bindings (WDLBindings)
kwargs (Any)
- run(file_store)¶
Actually produce the array-ified bindings now that promised values are available.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
WDLBindings
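The array-ification described above can be sketched with plain dicts in place of WDLBindings. This is an illustration under assumptions: the `arrayify` helper is hypothetical, and None stands in for the WDL null value.

```python
def arrayify(input_envs, base_env):
    """Bind each name that is new relative to base_env to a list with one entry per input."""
    new_names = []
    for env in input_envs:
        for name in env:
            if name not in base_env and name not in new_names:
                new_names.append(name)
    # An environment that lacks a new name contributes None at its position.
    return {name: [env.get(name) for env in input_envs] for name in new_names}


base = {"samples": ["s1", "s2", "s3"]}
per_iteration = [
    {"samples": base["samples"], "out": "a"},
    {"samples": base["samples"]},  # this iteration created no new binding
    {"samples": base["samples"], "out": "c"},
]

result = arrayify(per_iteration, base)
```

Only "out" is new relative to the base environment, so the result holds one array for it, with a null in the position of the iteration that did not define it.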
- class toil.wdl.wdltoil.WDLConditionalJob(conditional, prev_node_results, wdl_options, **kwargs)¶
Bases:
WDLSectionJob
Job that evaluates a conditional in a WDL workflow.
- Parameters:
conditional (WDL.Tree.Conditional)
prev_node_results (Sequence[toil.job.Promised[WDLBindings]])
wdl_options (WDLContext)
kwargs (Any)
- run(file_store)¶
Run the conditional.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLWorkflowJob(workflow, prev_node_results, workflow_id, wdl_options, **kwargs)¶
Bases:
WDLSectionJob
Job that evaluates an entire WDL workflow.
- Parameters:
- run(file_store)¶
Run the workflow. Return the result of the workflow.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLOutputsJob(workflow, bindings, wdl_options, cache_key=None, **kwargs)¶
Bases:
WDLBaseJob
Job which evaluates an outputs section for a workflow.
Returns an environment with just the outputs bound, in no namespace.
- Parameters:
workflow (WDL.Tree.Workflow)
bindings (toil.job.Promised[WDLBindings])
wdl_options (WDLContext)
cache_key (str | None)
kwargs (Any)
- run(file_store)¶
Make bindings for the outputs.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
WDLBindings
- class toil.wdl.wdltoil.WDLStartJob(target, inputs, wdl_options, **kwargs)¶
Bases:
WDLSectionJob
Job that evaluates an entire WDL workflow, and returns the workflow outputs namespaced with the workflow name. Inputs may or may not be namespaced with the workflow name; both forms are accepted.
- Parameters:
target (WDL.Tree.Workflow | WDL.Tree.Task)
inputs (toil.job.Promised[WDLBindings])
wdl_options (WDLContext)
kwargs (Any)
- run(file_store)¶
Actually build the subgraph.
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
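Accepting inputs both with and without the workflow-name prefix, and returning namespaced outputs, can be pictured with the sketch below. It uses plain dicts with dotted names; the helper names and the workflow name are hypothetical, and this is not how WDLStartJob is implemented.

```python
def strip_namespace(inputs, workflow_name):
    """Accept both 'wf.name' and 'name' keys, returning un-namespaced keys."""
    prefix = workflow_name + "."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in inputs.items()
    }


def add_namespace(outputs, workflow_name):
    """Prefix every output name with the workflow name."""
    return {f"{workflow_name}.{key}": value for key, value in outputs.items()}


# Hypothetical workflow name and inputs; one key is namespaced, one is not.
normalized = strip_namespace({"align.reads": "r.fq", "threads": 4}, "align")
namespaced_outputs = add_namespace({"bam": "out.bam"}, "align")
```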
- class toil.wdl.wdltoil.WDLInstallImportsJob(task_path, inputs, import_data, **kwargs)¶
Bases:
toil.job.Job
Class represents a unit of work in toil.
- Parameters:
task_path (str)
inputs (WDLBindings)
import_data (toil.job.Promised[Tuple[Dict[str, toil.fileStores.FileID], Dict[str, toil.job.FileMetadata]]])
kwargs (Any)
- run(file_store)¶
Convert the filenames in the workflow inputs into URIs.
- Returns:
Promise of transformed workflow inputs
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- class toil.wdl.wdltoil.WDLImportWrapper(target, inputs, wdl_options, inputs_search_path, import_remote_files, import_workers_threshold, import_workers_disk, **kwargs)¶
Bases:
WDLSectionJob
Job to organize importing files on workers instead of the leader. Responsible for extracting filenames and metadata, calling ImportsJob, applying imports to input bindings, and scheduling the start workflow job.
This class is only used when runImportsOnWorkers is enabled.
- Parameters:
- run(file_store)¶
Run a WDL-related job.
Remember to decorate non-trivial overrides with report_wdl_errors().
- Parameters:
file_store (toil.fileStores.abstractFileStore.AbstractFileStore)
- Return type:
toil.job.Promised[WDLBindings]
- toil.wdl.wdltoil.make_root_job(target, inputs, inputs_search_path, toil, wdl_options, options)¶
- Parameters:
target (WDL.Tree.Workflow | WDL.Tree.Task)
inputs (WDLBindings)
toil (toil.common.Toil)
wdl_options (WDLContext)
options (configargparse.Namespace)
- Return type:
- toil.wdl.wdltoil.main()¶
A Toil workflow to interpret WDL input files.
- Return type:
None