toil.wdl.wdltoil

Module Contents

Classes

NonDownloadingSize

WDL size() implementation that avoids downloading files.

ToilWDLStdLibBase

Standard library implementation for WDL as run on Toil.

ToilWDLStdLibTaskOutputs

Standard library implementation for WDL as run on Toil, with additional functions only allowed in task output sections.

WDLBaseJob

Base job class for all WDL-related jobs.

WDLTaskJob

Job that runs a WDL task.

WDLWorkflowNodeJob

Job that evaluates a WDL workflow node.

WDLCombineBindingsJob

Job that collects the results from WDL workflow nodes and combines their environment changes.

WDLNamespaceBindingsJob

Job that puts a set of bindings into a namespace.

WDLSectionJob

Job that can create more graph for a section of the workflow.

WDLScatterJob

Job that evaluates a scatter in a WDL workflow. Runs the body for each value in an array, and makes arrays of the new bindings created in each instance of the body.

WDLArrayBindingsJob

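Job that takes all new bindings created in an array of input environments, relative to a base environment, and produces bindings where each new binding name is bound to an array of the values in all the input environments.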

WDLConditionalJob

Job that evaluates a conditional in a WDL workflow.

WDLWorkflowJob

Job that evaluates an entire WDL workflow.

WDLOutputsJob

Job which evaluates an outputs section (such as for a workflow).

WDLRootJob

Job that evaluates an entire WDL workflow, and returns the workflow outputs namespaced with the workflow name.

Functions

potential_absolute_uris(uri, path[, importer])

Get potential absolute URIs to check for an imported file.

toil_read_source(uri, path, importer)

Implementation of a MiniWDL read_source function that can use any filename or URL supported by Toil.

combine_bindings(all_bindings)

Combine variable bindings from multiple predecessor tasks into one set for the current task.

log_bindings(log_function, message, all_bindings)

Log bindings to the console, even if some are still promises.

get_supertype(types)

Get the supertype that can hold values of all the given types.

for_each_node(root)

Iterate over all WDL workflow nodes in the given node, including inputs, internal nodes of conditionals and scatters, and gather nodes.

recursive_dependencies(root)

Get the combined workflow_node_dependencies of root and everything under it, excluding dependencies on anything within that subtree.

pack_toil_uri(file_id, file_basename)

Encode a Toil file ID and its source path in a URI that starts with the scheme in TOIL_URI_SCHEME.

unpack_toil_uri(toil_uri)

Unpack a URI made by pack_toil_uri to retrieve the FileID and the basename (no path prefix) that the file is supposed to have.

evaluate_named_expression(context, name, ...)

Evaluate an expression when we know the name of it.

evaluate_decl(node, environment, stdlib)

Evaluate the expression of a declaration node, or raise an error.

evaluate_call_inputs(context, expressions, ...)

Evaluate a bunch of expressions with names, and make them into a fresh set of bindings.

evaluate_defaultable_decl(node, environment, stdlib)

If the name of the declaration is already defined in the environment, return its value. Otherwise, return the evaluated expression.

devirtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are actually available to command line commands.

virtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are usable from other machines.

import_files(environment, toil[, path])

Make sure all File values embedded in the given bindings are imported, using the given Toil object.

drop_missing_files(environment[, ...])

Make sure all the File values embedded in the given bindings point to files that exist, or are null.

get_file_paths_in_bindings(environment)

Get the paths of all files in the bindings. Doesn't guarantee that duplicates are removed.

map_over_typed_files_in_bindings(environment, transform)

Run all File values embedded in the given bindings through the given transformation function.

map_over_files_in_bindings(bindings, transform)

Run all File values' types and values embedded in the given bindings through the given transformation function.

map_over_typed_files_in_binding(binding, transform)

Run all File values' types and values embedded in the given binding's value through the given transformation function.

map_over_typed_files_in_value(value, transform)

Run all File values embedded in the given value through the given transformation function.

main()

A Toil workflow to interpret WDL input files.

Attributes

logger

WDLBindings

TOIL_URI_SCHEME

toil.wdl.wdltoil.logger

toil.wdl.wdltoil.potential_absolute_uris(uri, path, importer=None)

Get potential absolute URIs to check for an imported file.

Given a URI or bare path, yield in turn all the URIs, with schemes, where we should actually try to find it, given that we want to search under/against the given paths or URIs, the current directory, and the given importing WDL document if any.

Parameters
  • uri (str) –

  • path (List[str]) –

  • importer (Optional[WDL.Tree.Document]) –

Return type

Iterator[str]
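
The search order described above can be illustrated with a small standalone generator. This is a minimal sketch and not Toil's implementation: `candidate_uris`, its `importer_uri` string parameter (standing in for the importing WDL document), and the exact ordering of candidates are all assumptions made for illustration.

```python
import os
from typing import Iterator, List, Optional
from urllib.parse import urljoin, urlsplit


def candidate_uris(uri: str, search_path: List[str],
                   importer_uri: Optional[str] = None) -> Iterator[str]:
    """Hypothetical helper: yield absolute URIs where `uri` might live."""
    if urlsplit(uri).scheme:
        # Already absolute, so there is only one place to look.
        yield uri
        return
    bases = []
    # Search the given paths first, then the current directory.
    for directory in list(search_path) + [os.getcwd()]:
        if not urlsplit(directory).scheme:
            # Treat scheme-less entries as local directories.
            directory = "file://" + os.path.abspath(directory)
        if not directory.endswith("/"):
            directory += "/"
        bases.append(directory)
    if importer_uri is not None:
        # Resolve relative to the document doing the import.
        bases.append(importer_uri)
    seen = set()
    for base in bases:
        candidate = urljoin(base, uri)
        if candidate not in seen:
            seen.add(candidate)
            yield candidate


uris = list(candidate_uris("tasks.wdl", ["/data/wdl"],
                           "https://example.com/wf/main.wdl"))
```

Yielding lazily lets a caller stop at the first candidate that actually exists, without computing the rest.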

async toil.wdl.wdltoil.toil_read_source(uri, path, importer)

Implementation of a MiniWDL read_source function that can use any filename or URL supported by Toil.

Needs to be async because MiniWDL will await its result.

Parameters
  • uri (str) –

  • path (List[str]) –

  • importer (Optional[WDL.Tree.Document]) –

Return type

WDL.ReadSourceResult
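
To show why the function has to be a coroutine, here is a minimal sketch of an awaitable source reader. It only handles local files, and `SourceResult` and `read_source_sketch` are hypothetical names; `SourceResult` merely imitates a result carrying the source text and the absolute path it was read from.

```python
import asyncio
import os
import tempfile
from typing import List, NamedTuple, Optional


class SourceResult(NamedTuple):
    """Hypothetical stand-in for WDL.ReadSourceResult."""
    source_text: str
    abspath: str


async def read_source_sketch(uri: str, path: List[str],
                             importer: Optional[object]) -> SourceResult:
    """Return the first readable file found under the search directories."""
    tried = []
    for directory in list(path) + [os.getcwd()]:
        candidate = os.path.abspath(os.path.join(directory, uri))
        tried.append(candidate)
        if os.path.isfile(candidate):
            with open(candidate) as handle:
                return SourceResult(handle.read(), candidate)
    raise FileNotFoundError(f"Could not find {uri}; tried {tried}")


# The caller awaits the coroutine, so a plain function would not work here.
with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, "hello.wdl"), "w") as handle:
        handle.write("version 1.0\n")
    result = asyncio.run(read_source_sketch("hello.wdl", [workdir], None))
```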

toil.wdl.wdltoil.WDLBindings

toil.wdl.wdltoil.combine_bindings(all_bindings)

Combine variable bindings from multiple predecessor tasks into one set for the current task.

Parameters

all_bindings (Sequence[WDLBindings]) –

Return type

WDLBindings
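
As a rough illustration of merging several predecessor environments, the sketch below uses plain dictionaries in place of WDLBindings. The name `combine_bindings_sketch` is hypothetical, and treating conflicting values as an error is an assumption of this sketch rather than documented Toil behaviour.

```python
from typing import Any, Dict, Sequence

Bindings = Dict[str, Any]  # hypothetical stand-in for WDLBindings


def combine_bindings_sketch(all_bindings: Sequence[Bindings]) -> Bindings:
    """Merge several name-to-value mappings into one."""
    merged: Bindings = {}
    for bindings in all_bindings:
        for name, value in bindings.items():
            # Assumption: the same name may arrive from several
            # predecessors, but only with the same value.
            if name in merged and merged[name] != value:
                raise ValueError(f"Conflicting values for {name}")
            merged[name] = value
    return merged


combined = combine_bindings_sketch([{"a": 1}, {"b": 2}, {"a": 1}])
```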

toil.wdl.wdltoil.log_bindings(log_function, message, all_bindings)

Log bindings to the console, even if some are still promises.

Parameters
  • log_function (Callable[Ellipsis, None]) – Function (like logger.info) to call to log data

  • message (str) – Message to log before the bindings

  • all_bindings (Sequence[toil.job.Promised[WDLBindings]]) – A list of bindings or promises for bindings, to log

Return type

None

toil.wdl.wdltoil.get_supertype(types)

Get the supertype that can hold values of all the given types.

Parameters

types (Sequence[Optional[WDL.Type.Base]]) –

Return type

WDL.Type.Base

toil.wdl.wdltoil.for_each_node(root)

Iterate over all WDL workflow nodes in the given node, including inputs, internal nodes of conditionals and scatters, and gather nodes.

Parameters

root (WDL.Tree.WorkflowNode) –

Return type

Iterator[WDL.Tree.WorkflowNode]

toil.wdl.wdltoil.recursive_dependencies(root)

Get the combined workflow_node_dependencies of root and everything under it, excluding dependencies on anything within that subtree.

Useful because section nodes can have internal nodes with dependencies not reflected in those of the section node itself.

Parameters

root (WDL.Tree.WorkflowNode) –

Return type

Set[str]
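
The traversal and the dependency calculation can be sketched on a toy tree. The `Node` class below is a hypothetical stand-in that only borrows the `workflow_node_id` and `workflow_node_dependencies` attribute names; the `body` list and both function names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set


@dataclass
class Node:
    """Hypothetical stand-in for a WDL.Tree.WorkflowNode."""
    workflow_node_id: str
    workflow_node_dependencies: Set[str] = field(default_factory=set)
    body: List["Node"] = field(default_factory=list)


def for_each_node_sketch(root: Node) -> Iterator[Node]:
    """Yield the root and then every node nested inside it."""
    yield root
    for child in root.body:
        yield from for_each_node_sketch(child)


def recursive_dependencies_sketch(root: Node) -> Set[str]:
    """Collect dependencies that point outside the subtree."""
    needed: Set[str] = set()
    provided: Set[str] = set()
    for node in for_each_node_sketch(root):
        needed |= node.workflow_node_dependencies
        provided.add(node.workflow_node_id)
    # Anything satisfied inside the subtree is not an external dependency.
    return needed - provided


scatter = Node("scatter-0", {"decl-xs"}, [
    Node("call-A", {"decl-y"}),
    Node("call-B", {"call-A"}),
])
external = recursive_dependencies_sketch(scatter)
```

Here `call-B` depends on `call-A`, which lives in the same section, so only `decl-xs` and `decl-y` are reported.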

toil.wdl.wdltoil.TOIL_URI_SCHEME = 'toilfile:'

toil.wdl.wdltoil.pack_toil_uri(file_id, file_basename)

Encode a Toil file ID and its source path in a URI that starts with the scheme in TOIL_URI_SCHEME.

Parameters
  • file_id (toil.fileStores.FileID) –

  • file_basename (str) –

Return type

str

toil.wdl.wdltoil.unpack_toil_uri(toil_uri)

Unpack a URI made by pack_toil_uri to retrieve the FileID and the basename (no path prefix) that the file is supposed to have.

Parameters

toil_uri (str) –

Return type

Tuple[toil.fileStores.FileID, str]
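
A round trip through the packed form might look like the following sketch. It uses a plain string where Toil uses a FileID, and the percent-encoded `id/basename` layout is an assumption chosen for illustration, not necessarily the layout Toil produces. The function names are hypothetical.

```python
from typing import Tuple
from urllib.parse import quote, unquote

URI_SCHEME = "toilfile:"  # same value as TOIL_URI_SCHEME above


def pack_uri_sketch(file_id: str, file_basename: str) -> str:
    """Encode an ID and a basename after the scheme prefix."""
    # Percent-encode both parts so "/" can safely separate them.
    return (f"{URI_SCHEME}{quote(file_id, safe='')}"
            f"/{quote(file_basename, safe='')}")


def unpack_uri_sketch(uri: str) -> Tuple[str, str]:
    """Reverse pack_uri_sketch, rejecting URIs with another scheme."""
    if not uri.startswith(URI_SCHEME):
        raise ValueError(f"Not a {URI_SCHEME} URI: {uri}")
    packed_id, packed_name = uri[len(URI_SCHEME):].split("/", 1)
    return unquote(packed_id), unquote(packed_name)


packed = pack_uri_sketch("files/for-job/abc 123", "reads 1.fastq")
unpacked = unpack_uri_sketch(packed)
```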

class toil.wdl.wdltoil.NonDownloadingSize

Bases: WDL.StdLib._Size

WDL size() implementation that avoids downloading files.

MiniWDL’s default size() implementation downloads the whole file to get its size. We want to be able to get file sizes from code running on the leader, where there may not be space to download the whole file. So we override the fancy class that implements it so that we can handle sizes for FileIDs using the FileID’s stored size info.
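
The idea of answering size() from stored metadata can be sketched as below. `StoredFile` is a hypothetical stand-in for a file ID that remembers its size, and `size_sketch` is not part of Toil or MiniWDL.

```python
import os
from dataclasses import dataclass
from typing import Union


@dataclass
class StoredFile:
    """Hypothetical stand-in for a file ID that records its own size."""
    name: str
    size: int


def size_sketch(file: Union[StoredFile, str]) -> int:
    """Return a size in bytes without fetching stored files."""
    if isinstance(file, StoredFile):
        # Use the recorded size; nothing is downloaded.
        return file.size
    # Plain local paths can be measured directly.
    return os.path.getsize(file)


stored_size = size_sketch(StoredFile("genome.fa", 3_100_000_000))
```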

class toil.wdl.wdltoil.ToilWDLStdLibBase(file_store)

Bases: WDL.StdLib.Base

Standard library implementation for WDL as run on Toil.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

class toil.wdl.wdltoil.ToilWDLStdLibTaskOutputs(file_store, stdout_path, stderr_path, current_directory_override=None)

Bases: ToilWDLStdLibBase, WDL.StdLib.TaskOutputs

Standard library implementation for WDL as run on Toil, with additional functions only allowed in task output sections.

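Parameters
  • file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

  • stdout_path (str) –

  • stderr_path (str) –

  • current_directory_override (Optional[str]) –

toil.wdl.wdltoil.evaluate_named_expression(context, name, expected_type, expression, environment, stdlib)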

Evaluate an expression when we know the name of it.

Parameters
  • context (Union[WDL.Error.SourceNode, WDL.Error.SourcePosition]) –

  • name (str) –

  • expected_type (Optional[WDL.Type.Base]) –

  • expression (Optional[WDL.Expr.Base]) –

  • environment (WDLBindings) –

  • stdlib (WDL.StdLib.Base) –

Return type

WDL.Value.Base

toil.wdl.wdltoil.evaluate_decl(node, environment, stdlib)

Evaluate the expression of a declaration node, or raise an error.

Parameters
  • node (WDL.Tree.Decl) –

  • environment (WDLBindings) –

  • stdlib (WDL.StdLib.Base) –

Return type

WDL.Value.Base

toil.wdl.wdltoil.evaluate_call_inputs(context, expressions, environment, stdlib)

Evaluate a bunch of expressions with names, and make them into a fresh set of bindings.

Parameters
  • context (Union[WDL.Error.SourceNode, WDL.Error.SourcePosition]) –

  • expressions (Dict[str, WDL.Expr.Base]) –

  • environment (WDLBindings) –

  • stdlib (WDL.StdLib.Base) –

Return type

WDLBindings

toil.wdl.wdltoil.evaluate_defaultable_decl(node, environment, stdlib)

If the name of the declaration is already defined in the environment, return its value. Otherwise, return the evaluated expression.

Parameters
  • node (WDL.Tree.Decl) –

  • environment (WDLBindings) –

  • stdlib (WDL.StdLib.Base) –

Return type

WDL.Value.Base
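
The "use the existing value, otherwise evaluate the default" rule is easy to show with dictionaries. This is a minimal sketch: `evaluate_defaultable_sketch` and the callable `evaluate_default` are hypothetical and stand in for evaluating the declaration's WDL expression.

```python
from typing import Any, Callable, Dict


def evaluate_defaultable_sketch(
    name: str,
    environment: Dict[str, Any],
    evaluate_default: Callable[[Dict[str, Any]], Any],
) -> Any:
    """Prefer a value already bound to `name` over the default."""
    if name in environment:
        # The caller supplied this input, so the default is not evaluated.
        return environment[name]
    return evaluate_default(environment)


env = {"threads": 8}
threads = evaluate_defaultable_sketch("threads", env, lambda e: 4)
memory = evaluate_defaultable_sketch("memory", env, lambda e: e["threads"] * 2)
```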

toil.wdl.wdltoil.devirtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are actually available to command line commands.

Parameters
  • environment (WDLBindings) –

  • stdlib (WDL.StdLib.Base) –

Return type

WDLBindings

toil.wdl.wdltoil.virtualize_files(environment, stdlib)

Make sure all the File values embedded in the given bindings point to files that are usable from other machines.

Parameters
  • environment (WDLBindings) –

  • stdlib (WDL.StdLib.Base) –

Return type

WDLBindings

toil.wdl.wdltoil.import_files(environment, toil, path=None)

Make sure all File values embedded in the given bindings are imported, using the given Toil object.

Parameters
  • path (Optional[List[str]]) – If set, try resolving input location relative to the URLs or directories in this list.

  • environment (WDLBindings) –

  • toil (toil.common.Toil) –

Return type

WDLBindings

toil.wdl.wdltoil.drop_missing_files(environment, current_directory_override=None)

Make sure all the File values embedded in the given bindings point to files that exist, or are null.

Files must not be virtualized.

Parameters
  • environment (WDLBindings) –

  • current_directory_override (Optional[str]) –

Return type

WDLBindings
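
A simplified version of this check, using a dictionary of optional file names instead of WDLBindings, could look like the sketch below. Resolving relative names against `current_directory_override` is an assumption about what that parameter is for, and `drop_missing_sketch` is a hypothetical name.

```python
import os
import tempfile
from typing import Dict, Optional


def drop_missing_sketch(
    files: Dict[str, Optional[str]],
    current_directory_override: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Replace names of files that do not exist with None."""
    base = current_directory_override or os.getcwd()
    result: Dict[str, Optional[str]] = {}
    for name, filename in files.items():
        # Absolute names ignore `base`; relative ones resolve against it.
        if filename is not None and not os.path.exists(os.path.join(base, filename)):
            filename = None
        result[name] = filename
    return result


with tempfile.TemporaryDirectory() as workdir:
    open(os.path.join(workdir, "out.txt"), "w").close()
    checked = drop_missing_sketch(
        {"present": "out.txt", "absent": "nope.txt", "unset": None},
        workdir,
    )
```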

toil.wdl.wdltoil.get_file_paths_in_bindings(environment)

Get the paths of all files in the bindings. Doesn’t guarantee that duplicates are removed.

TODO: Duplicative with WDL.runtime.task._fspaths, except that is internal and supports Directory objects.

Parameters

environment (WDLBindings) –

Return type

List[str]

toil.wdl.wdltoil.map_over_typed_files_in_bindings(environment, transform)

Run all File values embedded in the given bindings through the given transformation function.

TODO: Replace with WDL.Value.rewrite_env_paths or WDL.Value.rewrite_files

Parameters
  • environment (WDLBindings) –

  • transform (Callable[[WDL.Type.Base, str], Optional[str]]) –

Return type

WDLBindings

toil.wdl.wdltoil.map_over_files_in_bindings(bindings, transform)

Run all File values’ types and values embedded in the given bindings through the given transformation function.

TODO: Replace with WDL.Value.rewrite_env_paths or WDL.Value.rewrite_files

Parameters
  • bindings (WDLBindings) –

  • transform (Callable[[str], Optional[str]]) –

Return type

WDLBindings

toil.wdl.wdltoil.map_over_typed_files_in_binding(binding, transform)

Run all File values’ types and values embedded in the given binding’s value through the given transformation function.

Parameters
  • binding (WDL.Env.Binding[WDL.Value.Base]) –

  • transform (Callable[[WDL.Type.Base, str], Optional[str]]) –

Return type

WDL.Env.Binding[WDL.Value.Base]

toil.wdl.wdltoil.map_over_typed_files_in_value(value, transform)

Run all File values embedded in the given value through the given transformation function.

If the transform returns None, the file value is changed to Null.

The transform has access to the type information for the value, so it knows if it may return None, depending on if the value is optional or not.

The transform is allowed to return None only if the mapping result won’t actually be used, to allow for scans. So error checking needs to be part of the transform itself.

Parameters
  • value (WDL.Value.Base) –

  • transform (Callable[[WDL.Type.Base, str], Optional[str]]) –

Return type

WDL.Value.Base
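
The contract described above (the transform sees whether a file is optional, does its own error checking, and may return None to null the file out) can be sketched on plain Python containers. `FileValue`, `map_files_sketch`, and `to_remote` are hypothetical names, and nested lists and dicts stand in for WDL compound values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FileValue:
    """Hypothetical stand-in for a WDL File value and its type."""
    path: str
    optional: bool = False


def map_files_sketch(value: Any,
                     transform: Callable[[FileValue], Optional[str]]) -> Any:
    """Rewrite every FileValue nested anywhere inside `value`."""
    if isinstance(value, FileValue):
        new_path = transform(value)
        # A None result turns the file into a null value.
        return None if new_path is None else FileValue(new_path, value.optional)
    if isinstance(value, list):
        return [map_files_sketch(item, transform) for item in value]
    if isinstance(value, dict):
        return {key: map_files_sketch(item, transform)
                for key, item in value.items()}
    return value


def to_remote(file: FileValue) -> Optional[str]:
    """Example transform that checks errors itself."""
    if file.path.startswith("missing"):
        if not file.optional:
            raise FileNotFoundError(file.path)
        return None
    return "toilfile:" + file.path


mapped = map_files_sketch(
    {"reads": [FileValue("a.fq"), FileValue("missing.fq", optional=True)], "n": 3},
    to_remote,
)
```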

class toil.wdl.wdltoil.WDLBaseJob(**kwargs)

Bases: toil.job.Job

Base job class for all WDL-related jobs.

Parameters

kwargs (Any) –

run(file_store)

Run a WDL-related job.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

Any

class toil.wdl.wdltoil.WDLTaskJob(task, prev_node_results, task_id, namespace, **kwargs)

Bases: WDLBaseJob

Job that runs a WDL task.

Responsible for evaluating the input declarations for unspecified inputs, evaluating the runtime section, re-scheduling if resources are not available, running any command, and evaluating the outputs.

All bindings are in terms of task-internal names.

Parameters
  • task (WDL.Tree.Task) –

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • task_id (List[str]) –

  • namespace (str) –

  • kwargs (Any) –

can_fake_root()

Determine if --fakeroot is likely to work for Singularity.

Return type

bool

run(file_store)

Actually run the task.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLWorkflowNodeJob(node, prev_node_results, namespace, **kwargs)

Bases: WDLBaseJob

Job that evaluates a WDL workflow node.

Parameters
  • node (WDL.Tree.WorkflowNode) –

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • namespace (str) –

  • kwargs (Any) –

run(file_store)

Actually execute the workflow node.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLCombineBindingsJob(prev_node_results, underlay=None, remove=None, **kwargs)

Bases: WDLBaseJob

Job that collects the results from WDL workflow nodes and combines their environment changes.

Parameters
  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • underlay (Optional[toil.job.Promised[WDLBindings]]) –

  • remove (Optional[toil.job.Promised[WDLBindings]]) –

  • kwargs (Any) –

run(file_store)

Aggregate incoming results.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

WDLBindings

class toil.wdl.wdltoil.WDLNamespaceBindingsJob(namespace, prev_node_results, **kwargs)

Bases: WDLBaseJob

Job that puts a set of bindings into a namespace.

Parameters
  • namespace (str) –

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • kwargs (Any) –

run(file_store)

Apply the namespace.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

WDLBindings

class toil.wdl.wdltoil.WDLSectionJob(namespace, **kwargs)

Bases: WDLBaseJob

Job that can create more graph for a section of the workflow.

Parameters
  • namespace (str) –

  • kwargs (Any) –

create_subgraph(nodes, gather_nodes, environment, local_environment=None)

Make a Toil job to evaluate a subgraph inside a workflow or workflow section.

Returns

A child Job that will return the aggregated environment after running everything in the section.

Parameters
  • gather_nodes (Sequence[WDL.Tree.Gather]) – Names exposed by these will always be defined with something, even if the code that defines them does not actually run.

  • environment (WDLBindings) – Bindings in this environment will be used to evaluate the subgraph and will be passed through.

  • local_environment (Optional[WDLBindings]) – Bindings in this environment will be used to evaluate the subgraph but will go out of scope at the end of the section.

  • nodes (Sequence[WDL.Tree.WorkflowNode]) –

Return type

toil.job.Job

make_gather_bindings(gathers, undefined)

Given a collection of Gathers, create bindings from every identifier gathered, to the given “undefined” placeholder (which would be Null for a single execution of the body, or an empty array for a completely unexecuted scatter).

These bindings can be overlaid with bindings from the actual execution, so that references to names defined in unexecuted code get a proper default undefined value, and not a KeyError at runtime.

The information to do this comes from MiniWDL’s “gathers” system: <https://miniwdl.readthedocs.io/en/latest/WDL.html#WDL.Tree.WorkflowSection.gathers>

TODO: This approach will scale O(n^2) when run on n nested conditionals, because generating these bindings for the outer conditional will visit all the bindings from the inner ones.

Parameters
  • gathers (Sequence[WDL.Tree.Gather]) –

  • undefined (WDL.Value.Base) –

Return type

WDLBindings
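
The overlay idea can be shown with dictionaries: build placeholder bindings for every gathered name, then lay the real results on top. This is a minimal sketch in which `make_gather_bindings_sketch` is a hypothetical name and a list of plain names stands in for the Gather nodes.

```python
from typing import Any, Dict, Sequence


def make_gather_bindings_sketch(gathered_names: Sequence[str],
                                undefined: Any) -> Dict[str, Any]:
    """Bind every gathered name to the same placeholder value."""
    return {name: undefined for name in gathered_names}


# A conditional whose body did not define "y": None is the placeholder.
defaults = make_gather_bindings_sketch(["x", "y"], None)
actual = {"x": 5}
# Real results win; names from unexecuted code keep the placeholder.
overlaid = {**defaults, **actual}

# A scatter over an empty array: every name becomes an empty array.
unexecuted_scatter = make_gather_bindings_sketch(["x", "y"], [])
```

Looking up `overlaid["y"]` then gives the placeholder instead of raising a KeyError.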

class toil.wdl.wdltoil.WDLScatterJob(scatter, prev_node_results, namespace, **kwargs)

Bases: WDLSectionJob

Job that evaluates a scatter in a WDL workflow. Runs the body for each value in an array, and makes arrays of the new bindings created in each instance of the body. If an instance of the body doesn’t create a binding, it gets a null value in the corresponding array.

Parameters
  • scatter (WDL.Tree.Scatter) –

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • namespace (str) –

  • kwargs (Any) –

run(file_store)

Run the scatter.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLArrayBindingsJob(input_bindings, base_bindings, **kwargs)

Bases: WDLBaseJob

Job that takes all new bindings created in an array of input environments, relative to a base environment, and produces bindings where each new binding name is bound to an array of the values in all the input environments.

Useful for producing the results of a scatter.

Parameters
  • input_bindings (Sequence[toil.job.Promised[WDLBindings]]) –

  • base_bindings (WDLBindings) –

  • kwargs (Any) –

run(file_store)

Actually produce the array-ified bindings now that promised values are available.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

WDLBindings
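
The array-building step can be sketched with dictionaries standing in for the input and base environments. `arrayify_sketch` is a hypothetical name, and filling gaps with None follows the scatter description above (an iteration that does not create a binding contributes a null).

```python
from typing import Any, Dict, List, Sequence


def arrayify_sketch(input_bindings: Sequence[Dict[str, Any]],
                    base_bindings: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Turn per-iteration bindings into one array per new name."""
    new_names: List[str] = []
    for env in input_bindings:
        for name in env:
            # Only names absent from the base environment are new.
            if name not in base_bindings and name not in new_names:
                new_names.append(name)
    # One slot per iteration, in order; missing names become None.
    return {name: [env.get(name) for env in input_bindings]
            for name in new_names}


base = {"xs": [1, 2]}
per_iteration = [
    {"xs": [1, 2], "sq": 1},
    {"xs": [1, 2], "sq": 4, "big": True},
]
arrays = arrayify_sketch(per_iteration, base)
```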

class toil.wdl.wdltoil.WDLConditionalJob(conditional, prev_node_results, namespace, **kwargs)

Bases: WDLSectionJob

Job that evaluates a conditional in a WDL workflow.

Parameters
  • conditional (WDL.Tree.Conditional) –

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • namespace (str) –

  • kwargs (Any) –

run(file_store)

Run the conditional.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLWorkflowJob(workflow, prev_node_results, workflow_id, namespace, **kwargs)

Bases: WDLSectionJob

Job that evaluates an entire WDL workflow.

Parameters
  • workflow (WDL.Tree.Workflow) –

  • prev_node_results (Sequence[toil.job.Promised[WDLBindings]]) –

  • workflow_id (List[str]) –

  • namespace (str) –

  • kwargs (Any) –

run(file_store)

Run the workflow. Return the result of the workflow.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

toil.job.Promised[WDLBindings]

class toil.wdl.wdltoil.WDLOutputsJob(outputs, bindings, **kwargs)

Bases: WDLBaseJob

Job which evaluates an outputs section (such as for a workflow).

Returns an environment with just the outputs bound, in no namespace.

Parameters
  • outputs (List[WDL.Tree.Decl]) –

  • bindings (toil.job.Promised[WDLBindings]) –

  • kwargs (Any) –

run(file_store)

Make bindings for the outputs.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

WDLBindings

class toil.wdl.wdltoil.WDLRootJob(workflow, inputs, **kwargs)

Bases: WDLSectionJob

Job that evaluates an entire WDL workflow, and returns the workflow outputs namespaced with the workflow name. Inputs may or may not be namespaced with the workflow name; both forms are accepted.

Parameters
  • workflow (WDL.Tree.Workflow) –

  • inputs (WDLBindings) –

  • kwargs (Any) –

run(file_store)

Actually build the subgraph.

Parameters

file_store (toil.fileStores.abstractFileStore.AbstractFileStore) –

Return type

toil.job.Promised[WDLBindings]

toil.wdl.wdltoil.main()

A Toil workflow to interpret WDL input files.

Return type

None