toil.wdl.wdltoil

Attributes

logger

F

WDLBindings

TOIL_URI_SCHEME

Classes

NonDownloadingSize

WDL size() implementation that avoids downloading files.

ToilWDLStdLibBase

Standard library implementation for WDL as run on Toil.

ToilWDLStdLibTaskCommand

Standard library implementation to use inside a WDL task command evaluation.

ToilWDLStdLibTaskOutputs

Standard library implementation for WDL as run on Toil, with additional functions only allowed in task output sections.

WDLBaseJob

Base job class for all WDL-related jobs.

WDLTaskWrapperJob

Job that determines the resources needed to run a WDL job.

WDLTaskJob

Job that runs a WDL task.

WDLWorkflowNodeJob

Job that evaluates a WDL workflow node.

WDLWorkflowNodeListJob

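Job that evaluates a list of WDL workflow nodes, which are in the same scope and in a topological dependency order, and which do not call out to any other workflows or tasks or sections.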

WDLCombineBindingsJob

Job that collects the results from WDL workflow nodes and combines their environment changes.

WDLWorkflowGraph

Represents a graph of WDL WorkflowNodes.

WDLSectionJob

Job that can create more graph for a section of the workflow.

WDLScatterJob

Job that evaluates a scatter in a WDL workflow.

WDLArrayBindingsJob

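Job that takes all new bindings created in an array of input environments, relative to a base environment, and produces bindings where each new binding name is bound to an array of the values in all the input environments.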

WDLConditionalJob

Job that evaluates a conditional in a WDL workflow.

WDLWorkflowJob

Job that evaluates an entire WDL workflow.

WDLOutputsJob

Job which evaluates an outputs section (such as for a workflow).

WDLRootJob

Job that evaluates an entire WDL workflow, and returns the workflow outputs namespaced with the workflow name.

Functions

wdl_error_reporter(task[, exit, log])

Run code in a context where WDL errors will be reported with pretty formatting.

report_wdl_errors(task[, exit, log])

Create a decorator to report WDL errors with the given task message.

remove_common_leading_whitespace(expression[, ...])

Remove "common leading whitespace" as defined in the WDL 1.1 spec.

potential_absolute_uris(uri, path[, importer])

Get potential absolute URIs to check for an imported file.

toil_read_source(uri, path, importer)

Implementation of a MiniWDL read_source function that can use any filename or URL supported by Toil.

combine_bindings(all_bindings)

Combine variable bindings from multiple predecessor tasks into one set for the current task.

log_bindings(log_function, message, all_bindings)

Log bindings to the console, even if some are still promises.

get_supertype(types)

Get the supertype that can hold values of all the given types.

for_each_node(root)

Iterate over all WDL workflow nodes in the given node, including inputs, internal nodes of conditionals and scatters, and gather nodes.

recursive_dependencies(root)

Get the combined workflow_node_dependencies of root and everything under it, excluding dependencies on nodes within that subtree.

pack_toil_uri(file_id, dir_id, file_basename)

Encode a Toil file ID and its source path in a URI that starts with the scheme in TOIL_URI_SCHEME.

unpack_toil_uri(toil_uri)

Unpack a URI made by pack_toil_uri to retrieve the FileID and the basename (no path prefix) that the file is supposed to have.

evaluate_output_decls(output_decls, all_bindings, ...)

Evaluate output decls with a given bindings environment and standard library.

is_url(filename[, schemes])

Decide if a filename is a known kind of URL

evaluate_named_expression(context, name, ...)

Evaluate an expression when we know the name of it.

evaluate_decl(node, environment, stdlib)

Evaluate the expression of a declaration node, or raise an error.

evaluate_call_inputs(context, expressions, ...[, ...])

Evaluate a bunch of expressions with names, and make them into a fresh set of bindings.

evaluate_defaultable_decl(node, environment, stdlib)

If the name of the declaration is already defined in the environment, return its value. Otherwise, return the evaluated expression.

devirtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are actually available to command line commands.

virtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are usable from other machines.

add_paths(task_container, host_paths)

Based off of WDL.runtime.task_container.add_paths from miniwdl

import_files(environment, toil[, path, skip_remote])

Make sure all File values embedded in the given bindings are imported, using the given Toil object.

drop_missing_files(environment[, ...])

Make sure all the File values embedded in the given bindings point to files that exist, or are null.

get_file_paths_in_bindings(environment)

Get the paths of all files in the bindings. Doesn't guarantee that duplicates are removed.

map_over_typed_files_in_bindings(environment, transform)

Run all File values embedded in the given bindings through the given transformation function.

map_over_files_in_bindings(bindings, transform)

Run all File values' types and values embedded in the given bindings through the given transformation function.

map_over_typed_files_in_binding(binding, transform)

Run all File values' types and values embedded in the given binding's value through the given transformation function.

map_over_typed_files_in_value(value, transform)

Run all File values embedded in the given value through the given transformation function.

monkeypatch_coerce(standard_library)

Monkeypatch miniwdl's WDL.Value.Base.coerce() function to virtualize files when they are represented as Strings.

main()

A Toil workflow to interpret WDL input files.

Module Contents

toil.wdl.wdltoil.logger
toil.wdl.wdltoil.wdl_error_reporter(task, exit=False, log=logger.critical)

Run code in a context where WDL errors will be reported with pretty formatting.

Parameters:
  • task (str)

  • exit (bool)

  • log (Callable[[str], None])

Return type:

Generator[None, None, None]

toil.wdl.wdltoil.F
toil.wdl.wdltoil.report_wdl_errors(task, exit=False, log=logger.critical)

Create a decorator to report WDL errors with the given task message.

The decorator can then be applied to a function; if a WDL error happens, the report will say that it could not {task}.

Parameters:
  • task (str)

  • exit (bool)

  • log (Callable[[str], None])

Return type:

Callable[[F], F]
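
The relationship between the context manager and the decorator factory described above can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: DemoWDLError, demo_error_reporter, and demo_report_errors are hypothetical names, the "Could not {task}" message format is inferred from the description, and re-raising the error when exit is False is an assumption.

```python
import contextlib
import functools
import logging
from typing import Any, Callable, Generator, TypeVar, cast

logger = logging.getLogger(__name__)

F = TypeVar("F", bound=Callable[..., Any])


class DemoWDLError(Exception):
    """Hypothetical stand-in for the WDL error types raised by MiniWDL."""


@contextlib.contextmanager
def demo_error_reporter(
    task: str, exit: bool = False, log: Callable[[str], None] = logger.critical
) -> Generator[None, None, None]:
    """Report a DemoWDLError from the wrapped block as 'Could not {task}'."""
    try:
        yield
    except DemoWDLError as e:
        log(f"Could not {task}: {e}")
        if exit:
            # Assumption: exit=True ends the process after reporting.
            raise SystemExit(1) from e
        # Assumption: otherwise the error keeps propagating.
        raise


def demo_report_errors(
    task: str, exit: bool = False, log: Callable[[str], None] = logger.critical
) -> Callable[[F], F]:
    """Build a decorator that runs the function inside demo_error_reporter."""

    def decorator(func: F) -> F:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            with demo_error_reporter(task, exit=exit, log=log):
                return func(*args, **kwargs)

        return cast(F, wrapper)

    return decorator
```

Writing the decorator in terms of the context manager keeps the reporting logic in one place, so both entry points behave the same way.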

toil.wdl.wdltoil.remove_common_leading_whitespace(expression, tolerate_blanks=True, tolerate_dedents=False, tolerate_all_whitespace=True, debug=False)

Remove “common leading whitespace” as defined in the WDL 1.1 spec.

See <https://github.com/openwdl/wdl/blob/main/versions/1.1/SPEC.md#stripping-leading-whitespace>.

Operates on a WDL.Expr.String expression that has already been parsed.

Parameters:
  • tolerate_blanks (bool) – If True, don’t allow totally blank lines to zero the common whitespace.

  • tolerate_dedents (bool) – If True, remove as much of the whitespace on the first indented line as is found on subsequent lines, regardless of whether later lines are out-dented relative to it.

  • tolerate_all_whitespace (bool) – If True, don’t allow all-whitespace lines to reduce the common whitespace prefix.

  • debug (bool) – If True, the function will show its work by logging at debug level.

  • expression (WDL.Expr.String)

Return type:

WDL.Expr.String
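
The basic idea of stripping a shared indent can be sketched on a plain Python string. This is a simplified illustration under stated assumptions: the real function operates on a parsed WDL.Expr.String and has several tolerance flags, whereas the hypothetical strip_common_leading_whitespace below works on str and always ignores empty and all-whitespace lines when measuring the prefix.

```python
import os


def strip_common_leading_whitespace(text: str) -> str:
    """Remove the leading whitespace shared by all non-blank lines of text."""
    lines = text.split("\n")
    # Empty and all-whitespace lines do not constrain the shared prefix.
    content_lines = [line for line in lines if line.strip() != ""]
    if not content_lines:
        return text

    def indent_of(line: str) -> str:
        # The run of spaces and tabs at the start of the line.
        return line[: len(line) - len(line.lstrip(" \t"))]

    # commonprefix compares character by character, which is what we want.
    prefix = os.path.commonprefix([indent_of(line) for line in content_lines])
    return "\n".join(
        # Whitespace-only lines shorter than the prefix collapse to empty.
        line[len(prefix):] if line.startswith(prefix) else line.lstrip(" \t")
        for line in lines
    )
```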

toil.wdl.wdltoil.potential_absolute_uris(uri, path, importer=None)

Get potential absolute URIs to check for an imported file.

Given a URI or bare path, yield in turn all the URIs, with schemes, where we should actually try to find it, given that we want to search under/against the given paths or URIs, the current directory, and the given importing WDL document if any.

Parameters:
  • uri (str)

  • path (List[str])

  • importer (Optional[WDL.Tree.Document])

Return type:

Iterator[str]

async toil.wdl.wdltoil.toil_read_source(uri, path, importer)

Implementation of a MiniWDL read_source function that can use any filename or URL supported by Toil.

Needs to be async because MiniWDL will await its result.

Parameters:
  • uri (str)

  • path (List[str])

  • importer (Optional[WDL.Tree.Document])

Return type:

WDL.ReadSourceResult

toil.wdl.wdltoil.WDLBindings
toil.wdl.wdltoil.combine_bindings(all_bindings)

Combine variable bindings from multiple predecessor tasks into one set for the current task.

Parameters:

all_bindings (Sequence[WDLBindings])

Return type:

WDLBindings

toil.wdl.wdltoil.log_bindings(log_function, message, all_bindings)

Log bindings to the console, even if some are still promises.

Parameters:
  • log_function (Callable[Ellipsis, None]) – Function (like logger.info) to call to log data

  • message (str) – Message to log before the bindings

  • all_bindings (Sequence[toil.job.Promised[WDLBindings]]) – A list of bindings or promises for bindings, to log

Return type:

None

toil.wdl.wdltoil.get_supertype(types)

Get the supertype that can hold values of all the given types.

Parameters:

types (Sequence[Optional[WDL.Type.Base]])

Return type:

WDL.Type.Base

toil.wdl.wdltoil.for_each_node(root)

Iterate over all WDL workflow nodes in the given node, including inputs, internal nodes of conditionals and scatters, and gather nodes.

Parameters:

root (WDL.Tree.WorkflowNode)

Return type:

Iterator[WDL.Tree.WorkflowNode]

toil.wdl.wdltoil.recursive_dependencies(root)

Get the combined workflow_node_dependencies of root and everything under it, excluding dependencies on nodes within that subtree.

Useful because section nodes can have internal nodes with dependencies not reflected in those of the section node itself.

Parameters:

root (WDL.Tree.WorkflowNode)

Return type:

Set[str]
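
The subtree walk and the "external dependencies only" rule can be sketched with a small stand-in tree. DemoNode and both functions are hypothetical; the real functions operate on WDL.Tree.WorkflowNode objects from MiniWDL, whose structure is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set


@dataclass
class DemoNode:
    """Hypothetical stand-in for a workflow node with an optional body."""

    node_id: str
    dependencies: Set[str] = field(default_factory=set)
    body: List["DemoNode"] = field(default_factory=list)


def demo_for_each_node(root: DemoNode) -> Iterator[DemoNode]:
    """Yield root and then every node nested under it, depth first."""
    yield root
    for child in root.body:
        yield from demo_for_each_node(child)


def demo_recursive_dependencies(root: DemoNode) -> Set[str]:
    """Collect dependencies of the subtree that point outside the subtree."""
    needed: Set[str] = set()
    provided: Set[str] = set()
    for node in demo_for_each_node(root):
        provided.add(node.node_id)
        needed |= node.dependencies
    return needed - provided
```

In this sketch a section whose body contains a call depending on an outer declaration reports that declaration, even though the section node itself does not list it.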

toil.wdl.wdltoil.TOIL_URI_SCHEME = 'toilfile:'
toil.wdl.wdltoil.pack_toil_uri(file_id, dir_id, file_basename)

Encode a Toil file ID and its source path in a URI that starts with the scheme in TOIL_URI_SCHEME.

Parameters:
Return type:

str

toil.wdl.wdltoil.unpack_toil_uri(toil_uri)

Unpack a URI made by pack_toil_uri to retrieve the FileID and the basename (no path prefix) that the file is supposed to have.

Parameters:

toil_uri (str)

Return type:

Tuple[toil.fileStores.FileID, str, str]
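
A round trip between three components and a single URI can be sketched as follows. This is only an illustration of the packing idea: the exact URI layout used by Toil is not stated on this page, plain strings stand in for FileID, and demo_pack and demo_unpack are hypothetical names. Percent-encoding each component, slashes included, is an assumption that keeps the separator unambiguous.

```python
from typing import Tuple
from urllib.parse import quote, unquote

DEMO_SCHEME = "toilfile:"  # Same value as TOIL_URI_SCHEME above.


def demo_pack(file_id: str, dir_id: str, file_basename: str) -> str:
    """Join three percent-encoded components after the scheme."""
    parts = [quote(part, safe="") for part in (file_id, dir_id, file_basename)]
    return DEMO_SCHEME + "/".join(parts)


def demo_unpack(uri: str) -> Tuple[str, str, str]:
    """Reverse demo_pack, rejecting URIs it could not have produced."""
    if not uri.startswith(DEMO_SCHEME):
        raise ValueError(f"Not a {DEMO_SCHEME} URI: {uri}")
    parts = uri[len(DEMO_SCHEME):].split("/")
    if len(parts) != 3:
        raise ValueError(f"Expected 3 components, found {len(parts)}: {uri}")
    file_id, dir_id, file_basename = (unquote(part) for part in parts)
    return file_id, dir_id, file_basename
```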

toil.wdl.wdltoil.evaluate_output_decls(output_decls, all_bindings, standard_library)

Evaluate output decls with a given bindings environment and standard library.

Creates a new bindings object that only contains the bindings from the given decls. Guarantees that each decl in output_decls can access the variables defined by the previous ones.

Returns:

New bindings object with only the output_decls

Parameters:
  • output_decls (List[WDL.Tree.Decl]) – Decls to evaluate

  • all_bindings (WDL.Env.Bindings[WDL.Value.Base]) – Environment to use when evaluating decls

  • standard_library (WDL.StdLib.Base) – Standard library

Return type:

WDL.Env.Bindings[WDL.Value.Base]
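
The guarantee that each decl can see the ones before it, while the result holds only the output decls, can be sketched with dictionaries. This is a hypothetical illustration: plain dicts stand in for WDL.Env.Bindings, and each decl is modeled as a name paired with a function of the environment rather than a WDL.Tree.Decl.

```python
from typing import Callable, Dict, List, Tuple

Env = Dict[str, object]
# Hypothetical decl model: (name, expression evaluated against an environment).
DemoDecl = Tuple[str, Callable[[Env], object]]


def demo_evaluate_output_decls(output_decls: List[DemoDecl], all_bindings: Env) -> Env:
    """Evaluate decls in order; return only the newly defined outputs."""
    working = dict(all_bindings)  # Grows as outputs are defined.
    outputs: Env = {}
    for name, expression in output_decls:
        value = expression(working)
        working[name] = value  # Later decls can refer to this one.
        outputs[name] = value  # The result excludes the input bindings.
    return outputs
```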

class toil.wdl.wdltoil.NonDownloadingSize

Bases: WDL.StdLib._Size

WDL size() implementation that avoids downloading files.

MiniWDL’s default size() implementation downloads the whole file to get its size. We want to be able to get file sizes from code running on the leader, where there may not be space to download the whole file. So we override the fancy class that implements it so that we can handle sizes for FileIDs using the FileID’s stored size info.

toil.wdl.wdltoil.is_url(filename, schemes=['http:', 'https:', 's3:', 'gs:', TOIL_URI_SCHEME])

Decide if a filename is a known kind of URL

Parameters:
  • filename (str)

  • schemes (List[str])

Return type:

bool
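
A prefix check of this kind can be sketched in a few lines. demo_is_url is a hypothetical name, and matching with str.startswith is an assumption about how the schemes are compared; the scheme strings are the defaults shown in the signature above.

```python
from typing import Sequence

# Default schemes from the signature above, with TOIL_URI_SCHEME spelled out.
DEMO_SCHEMES = ("http:", "https:", "s3:", "gs:", "toilfile:")


def demo_is_url(filename: str, schemes: Sequence[str] = DEMO_SCHEMES) -> bool:
    """Return True if filename starts with one of the known schemes."""
    return any(filename.startswith(scheme) for scheme in schemes)
```

The sketch uses a tuple for the default so that the default value cannot be mutated between calls.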

class toil.wdl.wdltoil.ToilWDLStdLibBase(file_store, execution_dir=None)

Bases: WDL.StdLib.Base

Standard library implementation for WDL as run on Toil.

Parameters:
share_files(other)

Share caches for devirtualizing and virtualizing files with another instance.

Files devirtualized by one instance can be re-virtualized back to their original virtualized filenames by the other.

Parameters:

other (ToilWDLStdLibBase)

Return type:

None

static devirtualize_to(filename, dest_dir, file_source, execution_dir, devirtualized_to_virtualized=None, virtualized_to_devirtualized=None)

Download or export a WDL virtualized filename/URL to the given directory.

The destination directory must already exist.

Makes sure sibling files stay siblings and files with the same name don’t clobber each other. Called from within this class for tasks, and statically at the end of the workflow for outputs.

Returns the local path to the file. If it already had a local path elsewhere, it might not actually be put in dest_dir.

The input filename could already be devirtualized. In this case, the filename should not be added to the cache.

Parameters:
Return type:

str

class toil.wdl.wdltoil.ToilWDLStdLibTaskCommand(file_store, container)

Bases: ToilWDLStdLibBase

Standard library implementation to use inside a WDL task command evaluation.

Expects all the filenames in variable bindings to be container-side paths; these are the “virtualized” filenames, while the “devirtualized” filenames are host-side paths.

Parameters:
class toil.wdl.wdltoil.ToilWDLStdLibTaskOutputs(file_store, stdout_path, stderr_path, file_to_mountpoint, current_directory_override=None)

Bases: ToilWDLStdLibBase, WDL.StdLib.TaskOutputs

Standard library implementation for WDL as run on Toil, with additional functions only allowed in task output sections.

Parameters:
stdout_used()

Return True if the standard output was read by the WDL.

Return type:

bool

stderr_used()

Return True if the standard error was read by the WDL.

Return type:

bool

toil.wdl.wdltoil.evaluate_named_expression(context, name, expected_type, expression, environment, stdlib)

Evaluate an expression when we know the name of it.

Parameters:
  • context (Union[WDL.Error.SourceNode, WDL.Error.SourcePosition])

  • name (str)

  • expected_type (Optional[WDL.Type.Base])

  • expression (Optional[WDL.Expr.Base])

  • environment (WDLBindings)

  • stdlib (WDL.StdLib.Base)

Return type:

WDL.Value.Base

toil.wdl.wdltoil.evaluate_decl(node, environment, stdlib)

Evaluate the expression of a declaration node, or raise an error.

Parameters:
  • node (WDL.Tree.Decl)

  • environment (WDLBindings)

  • stdlib (WDL.StdLib.Base)

Return type:

WDL.Value.Base

toil.wdl.wdltoil.evaluate_call_inputs(context, expressions, environment, stdlib, inputs_dict=None)

Evaluate a bunch of expressions with names, and make them into a fresh set of bindings. inputs_dict is a mapping of variable names to their expected type for the input decls in a task.

Parameters:
  • context (Union[WDL.Error.SourceNode, WDL.Error.SourcePosition])

  • expressions (Dict[str, WDL.Expr.Base])

  • environment (WDLBindings)

  • stdlib (WDL.StdLib.Base)

  • inputs_dict (Optional[Dict[str, WDL.Type.Base]])

Return type:

WDLBindings

toil.wdl.wdltoil.evaluate_defaultable_decl(node, environment, stdlib)

If the name of the declaration is already defined in the environment, return its value. Otherwise, return the evaluated expression.

Parameters:
  • node (WDL.Tree.Decl)

  • environment (WDLBindings)

  • stdlib (WDL.StdLib.Base)

Return type:

WDL.Value.Base

toil.wdl.wdltoil.devirtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are actually available to command line commands. The same virtual file always maps to the same devirtualized filename, even with duplicates.

Parameters:
  • environment (WDLBindings)

  • stdlib (WDL.StdLib.Base)

Return type:

WDLBindings

toil.wdl.wdltoil.virtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are usable from other machines.

Parameters:
  • environment (WDLBindings)

  • stdlib (WDL.StdLib.Base)

Return type:

WDLBindings

toil.wdl.wdltoil.add_paths(task_container, host_paths)

Based off of WDL.runtime.task_container.add_paths from miniwdl. Maps the host paths to the container paths.

Parameters:
  • task_container (WDL.runtime.task_container.TaskContainer)

  • host_paths (Iterable[str])

Return type:

None

toil.wdl.wdltoil.import_files(environment, toil, path=None, skip_remote=False)

Make sure all File values embedded in the given bindings are imported, using the given Toil object.

Parameters:
  • path (Optional[List[str]]) – If set, try resolving input location relative to the URLs or directories in this list.

  • skip_remote (bool) – If set, don’t try to import files from remote locations. Leave them as URIs.

  • environment (WDLBindings)

  • toil (toil.common.Toil)

Return type:

WDLBindings

toil.wdl.wdltoil.drop_missing_files(environment, current_directory_override=None)

Make sure all the File values embedded in the given bindings point to files that exist, or are null.

Files must not be virtualized.

Parameters:
  • environment (WDLBindings)

  • current_directory_override (Optional[str])

Return type:

WDLBindings

toil.wdl.wdltoil.get_file_paths_in_bindings(environment)

Get the paths of all files in the bindings. Doesn’t guarantee that duplicates are removed.

TODO: Duplicative with WDL.runtime.task._fspaths, except that is internal and supports Directory objects.

Parameters:

environment (WDLBindings)

Return type:

List[str]

toil.wdl.wdltoil.map_over_typed_files_in_bindings(environment, transform)

Run all File values embedded in the given bindings through the given transformation function.

TODO: Replace with WDL.Value.rewrite_env_paths or WDL.Value.rewrite_files

Parameters:
  • environment (WDLBindings)

  • transform (Callable[[WDL.Type.Base, str], Optional[str]])

Return type:

WDLBindings

toil.wdl.wdltoil.map_over_files_in_bindings(bindings, transform)

Run all File values’ types and values embedded in the given bindings through the given transformation function.

TODO: Replace with WDL.Value.rewrite_env_paths or WDL.Value.rewrite_files

Parameters:
  • bindings (WDLBindings)

  • transform (Callable[[str], Optional[str]])

Return type:

WDLBindings

toil.wdl.wdltoil.map_over_typed_files_in_binding(binding, transform)

Run all File values’ types and values embedded in the given binding’s value through the given transformation function.

Parameters:
  • binding (WDL.Env.Binding[WDL.Value.Base])

  • transform (Callable[[WDL.Type.Base, str], Optional[str]])

Return type:

WDL.Env.Binding[WDL.Value.Base]

toil.wdl.wdltoil.map_over_typed_files_in_value(value, transform)

Run all File values embedded in the given value through the given transformation function.

If the transform returns None, the file value is changed to Null.

The transform has access to the type information for the value, so it knows if it may return None, depending on if the value is optional or not.

The transform is allowed to return None only if the mapping result won’t actually be used, to allow for scans. So error checking needs to be part of the transform itself.

Parameters:
  • value (WDL.Value.Base)

  • transform (Callable[[WDL.Type.Base, str], Optional[str]])

Return type:

WDL.Value.Base
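
The recursive mapping and the "None becomes Null" rule can be sketched over nested Python containers. Everything here is a hypothetical stand-in: DemoFile replaces a WDL File value, an optional flag replaces the WDL.Type.Base argument, and lists and dicts replace WDL compound values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class DemoFile:
    """Hypothetical stand-in for a WDL File value and its type's optionality."""

    path: str
    optional: bool = False


# Hypothetical transform shape: (is the type optional?, path) -> new path or None.
DemoTransform = Callable[[bool, str], Optional[str]]


def demo_map_files(value: Any, transform: DemoTransform) -> Any:
    """Apply transform to every DemoFile nested anywhere inside value."""
    if isinstance(value, DemoFile):
        new_path = transform(value.optional, value.path)
        # A None result turns the file into a null value.
        return None if new_path is None else DemoFile(new_path, value.optional)
    if isinstance(value, list):
        return [demo_map_files(item, transform) for item in value]
    if isinstance(value, dict):
        return {key: demo_map_files(item, transform) for key, item in value.items()}
    # Non-file leaves pass through untouched.
    return value
```

Because the transform receives the optionality, it can decide for itself whether returning None is acceptable or whether it should raise, which matches the note above that error checking belongs in the transform.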

class toil.wdl.wdltoil.WDLBaseJob(wdl_options=None, **kwargs)

Bases: toil.job.Job

Base job class for all WDL-related jobs.

Responsible for post-processing returned bindings, to do things like add in null values for things not defined in a section. Post-processing operations can be added onto any job before it is saved, and will be applied as long as the job’s run method calls postprocess().

Also responsible for remembering the Toil WDL configuration keys and values.

Parameters:
  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Run a WDL-related job.

Remember to decorate non-trivial overrides with report_wdl_errors().

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

Any

then_underlay(underlay)

Apply an underlay of backup bindings to the result.

Parameters:

underlay (toil.job.Promised[WDLBindings])

Return type:

None

then_remove(remove)

Remove the given bindings from the result.

Parameters:

remove (toil.job.Promised[WDLBindings])

Return type:

None

then_namespace(namespace)

Put the result bindings into a namespace.

Parameters:

namespace (str)

Return type:

None

then_overlay(overlay)

Overlay the given bindings on top of the (possibly namespaced) result.

Parameters:

overlay (toil.job.Promised[WDLBindings])

Return type:

None

postprocess(bindings)

Apply queued changes to bindings.

Should be applied by subclasses’ run() implementations to their return values.

Parameters:

bindings (WDLBindings)

Return type:

WDLBindings

defer_postprocessing(other)

Give our postprocessing steps to a different job.

Use this when you are returning a promise for bindings, on the job that issues the promise.

Parameters:

other (WDLBaseJob)

Return type:

None
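
The queued post-processing described for WDLBaseJob can be sketched as a list of pending steps applied in order. This is a hypothetical model, not the class's implementation: dicts stand in for WDLBindings, promises are not modeled, the dotted namespace format is an assumption, and applying steps in the order they were queued is also an assumption.

```python
from typing import Callable, Dict, List

Bindings = Dict[str, object]


class DemoPostprocessor:
    """Hypothetical model of a job that queues changes to its result."""

    def __init__(self) -> None:
        self._steps: List[Callable[[Bindings], Bindings]] = []

    def then_underlay(self, underlay: Bindings) -> None:
        # Backup bindings: values in the result win over the underlay.
        self._steps.append(lambda b: {**underlay, **b})

    def then_remove(self, remove: Bindings) -> None:
        self._steps.append(lambda b: {k: v for k, v in b.items() if k not in remove})

    def then_namespace(self, namespace: str) -> None:
        self._steps.append(lambda b: {f"{namespace}.{k}": v for k, v in b.items()})

    def then_overlay(self, overlay: Bindings) -> None:
        # Overlay values win over the (possibly namespaced) result.
        self._steps.append(lambda b: {**b, **overlay})

    def postprocess(self, bindings: Bindings) -> Bindings:
        """Apply every queued step, in the order it was queued."""
        for step in self._steps:
            bindings = step(bindings)
        return bindings

    def defer_postprocessing(self, other: "DemoPostprocessor") -> None:
        """Hand the queued steps to another job and forget them here."""
        other._steps.extend(self._steps)
        self._steps = []
```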

class toil.wdl.wdltoil.WDLTaskWrapperJob(task, prev_node_results, task_id, namespace, task_path, **kwargs)

Bases: WDLBaseJob

Job that determines the resources needed to run a WDL job.

Responsible for evaluating the input declarations for unspecified inputs, evaluating the runtime section, and scheduling or chaining to the real WDL job.

All bindings are in terms of task-internal names.

Parameters:
  • task (WDL.Tree.Task)

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • task_id (List[str])

  • namespace (str)

  • task_path (str)

  • kwargs (Any)

run(file_store)

Evaluate inputs and runtime and schedule the task.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLTaskJob(task, task_internal_bindings, runtime_bindings, task_id, namespace, task_path, **kwargs)

Bases: WDLBaseJob

Job that runs a WDL task.

Responsible for re-evaluating input declarations for unspecified inputs, evaluating the runtime section, re-scheduling if resources are not available, running any command, and evaluating the outputs.

All bindings are in terms of task-internal names.

Parameters:
  • task (WDL.Tree.Task)

  • task_internal_bindings (toil.job.Promised[WDLBindings])

  • runtime_bindings (toil.job.Promised[WDLBindings])

  • task_id (List[str])

  • namespace (str)

  • task_path (str)

  • kwargs (Any)

INJECTED_MESSAGE_DIR = '.toil_wdl_runtime'
add_injections(command_string, task_container)

Inject extra Bash code from the Toil WDL runtime into the command for the container.

Currently doesn’t implement the MiniWDL plugin system, but does add resource usage monitoring to Docker containers.

Parameters:
  • command_string (str)

  • task_container (WDL.runtime.task_container.TaskContainer)

Return type:

str

handle_injection_messages(outputs_library)

Handle any data received from injected runtime code in the container.

Parameters:

outputs_library (ToilWDLStdLibTaskOutputs)

Return type:

None

handle_message_file(file_path)

Handle a message file received from in-container injected code.

Takes the host-side path of the file.

Parameters:

file_path (str)

Return type:

None

can_fake_root()

Determine if --fakeroot is likely to work for Singularity.

Return type:

bool

can_mount_proc()

Determine if --containall will work for Singularity.

On Kubernetes, this will result in an “operation not permitted” error. See: https://github.com/apptainer/singularity/issues/5857

So if Kubernetes is detected, return False.

Return type:

bool

run(file_store)

Actually run the task.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLWorkflowNodeJob(node, prev_node_results, namespace, task_path, wdl_options=None, **kwargs)

Bases: WDLBaseJob

Job that evaluates a WDL workflow node.

Parameters:
  • node (WDL.Tree.WorkflowNode)

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • namespace (str)

  • task_path (str)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Actually execute the workflow node.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLWorkflowNodeListJob(nodes, prev_node_results, namespace, wdl_options=None, **kwargs)

Bases: WDLBaseJob

Job that evaluates a list of WDL workflow nodes, which are in the same scope and in a topological dependency order, and which do not call out to any other workflows or tasks or sections.

Parameters:
  • nodes (List[WDL.Tree.WorkflowNode])

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • namespace (str)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Actually execute the workflow nodes.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLCombineBindingsJob(prev_node_results, **kwargs)

Bases: WDLBaseJob

Job that collects the results from WDL workflow nodes and combines their environment changes.

Parameters:
  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • kwargs (Any)

run(file_store)

Aggregate incoming results.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

WDLBindings

class toil.wdl.wdltoil.WDLWorkflowGraph(nodes)

Represents a graph of WDL WorkflowNodes.

Operates at a certain level of instantiation (i.e. sub-sections are represented by single nodes).

Assumes all relevant nodes are provided; dependencies outside the provided nodes are assumed to be satisfied already.

Parameters:

nodes (Sequence[WDL.Tree.WorkflowNode])

real_id(node_id)

Map multiple IDs for what we consider the same node to one ID.

This elides/resolves gathers.

Parameters:

node_id (str)

Return type:

str

is_decl(node_id)

Return True if a node represents a WDL declaration, and False otherwise.

Parameters:

node_id (str)

Return type:

bool

get(node_id)

Get a node by ID.

Parameters:

node_id (str)

Return type:

WDL.Tree.WorkflowNode

get_dependencies(node_id)

Get all the nodes that a node depends on, recursively (into the node if it has a body) but not transitively.

Produces dependencies after resolving gathers and internal-to-section dependencies, on nodes that are also in this graph.

Parameters:

node_id (str)

Return type:

Set[str]

get_transitive_dependencies(node_id)

Get all the nodes that a node depends on, transitively.

Parameters:

node_id (str)

Return type:

Set[str]

topological_order()

Get a topological order of the nodes, based on their dependencies.

Return type:

List[str]

leaves()

Get all the workflow node IDs that have no dependents in the graph.

Return type:

List[str]
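
The graph queries listed for this class can be sketched over a plain mapping from node ID to dependency IDs. DemoGraph is hypothetical and leaves out gather resolution and section bodies; it uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later), which is not necessarily what the real class uses.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


class DemoGraph:
    """Hypothetical dependency graph keyed by node ID."""

    def __init__(self, dependencies: Dict[str, Set[str]]) -> None:
        # Dependencies on nodes outside the graph are treated as satisfied.
        self._deps = {
            node: {dep for dep in deps if dep in dependencies}
            for node, deps in dependencies.items()
        }

    def get_dependencies(self, node_id: str) -> Set[str]:
        return set(self._deps[node_id])

    def get_transitive_dependencies(self, node_id: str) -> Set[str]:
        seen: Set[str] = set()
        stack = list(self._deps[node_id])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._deps[current])
        return seen

    def topological_order(self) -> List[str]:
        # TopologicalSorter takes node -> predecessors, so dependencies come first.
        return list(TopologicalSorter(self._deps).static_order())

    def leaves(self) -> List[str]:
        # Leaves are nodes that nothing else in the graph depends on.
        depended_on: Set[str] = set().union(*self._deps.values())
        return [node for node in self._deps if node not in depended_on]
```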

class toil.wdl.wdltoil.WDLSectionJob(namespace, task_path, wdl_options=None, **kwargs)

Bases: WDLBaseJob

Job that can create more graph for a section of the workflow.

Parameters:
  • namespace (str)

  • task_path (str)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

static coalesce_nodes(order, section_graph)

Given a topological order of WDL workflow node IDs, produce a list of lists of IDs, still in topological order, where each list of IDs can be run under a single Toil job.

Parameters:
Return type:

List[List[str]]

create_subgraph(nodes, gather_nodes, environment, local_environment=None, subscript=None)

Make a Toil job to evaluate a subgraph inside a workflow or workflow section.

Returns:

a child Job that will return the aggregated environment after running all the things in the section.

Parameters:
  • gather_nodes (Sequence[WDL.Tree.Gather]) – Names exposed by these will always be defined with something, even if the code that defines them does not actually run.

  • environment (WDLBindings) – Bindings in this environment will be used to evaluate the subgraph and will be passed through.

  • local_environment (Optional[WDLBindings]) – Bindings in this environment will be used to evaluate the subgraph but will go out of scope at the end of the section.

  • subscript (Optional[int]) – If the subgraph is being evaluated multiple times, this should be a disambiguating integer for logging.

  • nodes (Sequence[WDL.Tree.WorkflowNode])

Return type:

WDLBaseJob

make_gather_bindings(gathers, undefined)

Given a collection of Gathers, create bindings from every identifier gathered, to the given “undefined” placeholder (which would be Null for a single execution of the body, or an empty array for a completely unexecuted scatter).

These bindings can be overlaid with bindings from the actual execution, so that references to names defined in unexecuted code get a proper default undefined value, and not a KeyError at runtime.

The information to do this comes from MiniWDL’s “gathers” system: <https://miniwdl.readthedocs.io/en/latest/WDL.html#WDL.Tree.WorkflowSection.gathers>

TODO: This approach will scale O(n^2) when run on n nested conditionals, because generating these bindings for the outer conditional will visit all the bindings from the inner ones.

Parameters:
  • gathers (Sequence[WDL.Tree.Gather])

  • undefined (WDL.Value.Base)

Return type:

WDLBindings
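
The placeholder-then-overlay idea behind make_gather_bindings can be sketched with dictionaries. This is a hypothetical illustration: names are passed in directly instead of being read from WDL.Tree.Gather nodes, dicts stand in for WDLBindings, and Python None and an empty list stand in for the WDL Null and empty-array values.

```python
import copy
from typing import Dict, Iterable

Bindings = Dict[str, object]


def demo_gather_defaults(gathered_names: Iterable[str], undefined: object) -> Bindings:
    """Bind every gathered name to its own copy of the placeholder."""
    return {name: copy.copy(undefined) for name in gathered_names}


def demo_overlay(defaults: Bindings, actual: Bindings) -> Bindings:
    """Let bindings from the actual execution replace the placeholders."""
    return {**defaults, **actual}
```

With this arrangement, looking up a name defined only inside a skipped conditional yields the placeholder instead of raising KeyError.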

class toil.wdl.wdltoil.WDLScatterJob(scatter, prev_node_results, namespace, task_path, wdl_options=None, **kwargs)

Bases: WDLSectionJob

Job that evaluates a scatter in a WDL workflow. Runs the body for each value in an array, and makes arrays of the new bindings created in each instance of the body. If an instance of the body doesn’t create a binding, it gets a null value in the corresponding array.

Parameters:
  • scatter (WDL.Tree.Scatter)

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • namespace (str)

  • task_path (str)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Run the scatter.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLArrayBindingsJob(input_bindings, base_bindings, **kwargs)

Bases: WDLBaseJob

Job that takes all new bindings created in an array of input environments, relative to a base environment, and produces bindings where each new binding name is bound to an array of the values in all the input environments.

Useful for producing the results of a scatter.

Parameters:
  • input_bindings (Sequence[toil.job.Promised[WDLBindings]])

  • base_bindings (WDLBindings)

  • kwargs (Any)

run(file_store)

Actually produce the array-ified bindings now that promised values are available.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

WDLBindings
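
Turning per-iteration environments into arrays can be sketched as follows. demo_arrayify is a hypothetical function over plain dicts; it fills in None where an iteration did not create a binding, following the scatter description above, and promise resolution is not modeled.

```python
from typing import Dict, List, Sequence

Bindings = Dict[str, object]


def demo_arrayify(input_bindings: Sequence[Bindings], base_bindings: Bindings) -> Bindings:
    """Bind each new name to the list of its values across all environments."""
    new_names: List[str] = []
    for env in input_bindings:
        for name in env:
            # Only names absent from the base environment count as new.
            if name not in base_bindings and name not in new_names:
                new_names.append(name)
    # dict.get supplies None when an iteration did not define the name.
    return {name: [env.get(name) for env in input_bindings] for name in new_names}
```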

class toil.wdl.wdltoil.WDLConditionalJob(conditional, prev_node_results, namespace, task_path, wdl_options=None, **kwargs)

Bases: WDLSectionJob

Job that evaluates a conditional in a WDL workflow.

Parameters:
  • conditional (WDL.Tree.Conditional)

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • namespace (str)

  • task_path (str)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Run the conditional.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLWorkflowJob(workflow, prev_node_results, workflow_id, namespace, task_path, wdl_options=None, **kwargs)

Bases: WDLSectionJob

Job that evaluates an entire WDL workflow.

Parameters:
  • workflow (WDL.Tree.Workflow)

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]])

  • workflow_id (List[str])

  • namespace (str)

  • task_path (str)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Run the workflow. Return the result of the workflow.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLOutputsJob(workflow, bindings, wdl_options=None, **kwargs)

Bases: WDLBaseJob

Job which evaluates an outputs section (such as for a workflow).

Returns an environment with just the outputs bound, in no namespace.

Parameters:
  • workflow (WDL.Tree.Workflow)

  • bindings (toil.job.Promised[WDLBindings])

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Make bindings for the outputs.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

WDLBindings

class toil.wdl.wdltoil.WDLRootJob(target, inputs, wdl_options=None, **kwargs)

Bases: WDLSectionJob

Job that evaluates an entire WDL workflow, and returns the workflow outputs namespaced with the workflow name. Inputs may or may not be namespaced with the workflow name; both forms are accepted.

Parameters:
  • target (Union[WDL.Tree.Workflow, WDL.Tree.Task])

  • inputs (WDLBindings)

  • wdl_options (Optional[Dict[str, str]])

  • kwargs (Any)

run(file_store)

Actually build the subgraph.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

toil.job.Promised[WDLBindings]

toil.wdl.wdltoil.monkeypatch_coerce(standard_library)

Monkeypatch miniwdl’s WDL.Value.Base.coerce() function to virtualize files when they are represented as Strings. Calls _virtualize_filename from a given standard library object.

Parameters:

standard_library (ToilWDLStdLibBase) – A standard library object

Return type:

Generator[None, None, None]

toil.wdl.wdltoil.main()

A Toil workflow to interpret WDL input files.

Return type:

None