toil.cwl.cwltoil

Implemented support for Common Workflow Language (CWL) for Toil.

Module Contents

Classes

UnresolvedDict

Tag to indicate a dict contains promises that must be resolved.

SkipNull

Internal sentinel object.

Conditional

Object holding conditional expression until we are ready to evaluate it.

ResolveSource

Apply linkMerge and pickValue operators to values coming into a port.

StepValueFrom

A workflow step input which has a valueFrom expression attached to it.

DefaultWithSource

A workflow step input that has both a source and a default value.

JustAValue

A simple value masquerading as a 'resolve'-able object.

ToilPathMapper

Keeps track of files in a Toil way.

ToilSingleJobExecutor

A SingleJobExecutor that does not assume it is at the top level of the workflow.

ToilTool

Mixin to hook Toil into a cwltool tool type.

ToilCommandLineTool

Subclass the cwltool command line tool to provide the custom ToilPathMapper.

ToilExpressionTool

Subclass the cwltool expression tool to provide the custom ToilPathMapper.

ToilFsAccess

Custom filesystem access class which handles toil filestore references.

CWLNamedJob

Base class for all CWL jobs that do user work, to give them useful names.

ResolveIndirect

Helper Job.

CWLJobWrapper

Wrap a CWL job that uses dynamic resources requirement.

CWLJob

Execute a CWL tool using cwltool.executors.SingleJobExecutor.

CWLScatter

Implement workflow scatter step.

CWLGather

Follows on to a scatter Job.

SelfJob

Fake job object to facilitate implementation of CWLWorkflow.run().

CWLWorkflow

Toil Job to convert a CWL workflow graph into a Toil job graph.

Functions

cwltoil_was_removed()

Complain about deprecated entrypoint.

filter_skip_null(name, value)

Recursively filter out SkipNull objects from 'value'.

ensure_no_collisions(directory[, dir_description])

Make sure no items in the given CWL Directory have the same name.

resolve_dict_w_promises(dict_w_promises[, file_store])

Resolve a dictionary of promises and evaluate expressions to produce the actual values.

simplify_list(maybe_list)

Turn a length one list loaded by cwltool into a scalar.

toil_make_tool(toolpath_object, loadingContext)

Emit custom ToilCommandLineTools.

check_directory_dict_invariants(contents)

Make sure a directory structure dict makes sense. Throws an error otherwise.

decode_directory(dir_path)

Decode a directory from a "toildir:" path to a directory (or a file in it).

encode_directory(contents)

Encode a directory structure dict into a "toildir:" URI.

toil_get_file(file_store, index, existing, uri[, ...])

Set up the given file or directory from the Toil jobstore at a file URI where it can be accessed locally.

write_file(writeFunc, index, existing, file_uri)

Write a file into the Toil jobstore.

path_to_loc(obj)

Make a path into a location.

import_files(import_function, fs_access, fileindex, ...)

Prepare all files and directories.

upload_directory(directory_metadata, directory_contents)

Upload a Directory object.

upload_file(uploadfunc, fileindex, existing, file_metadata)

Update a file object so that the file will be accessible from another machine.

writeGlobalFileWrapper(file_store, fileuri)

Wrap writeGlobalFile to accept file:// URIs.

remove_empty_listings(rec)

toilStageFiles(toil, cwljob, outdir[, destBucket, ...])

Copy input files out of the global file store and update location and path.

get_container_engine(runtime_context)

makeJob(tool, jobobj, runtime_context, parent_name, ...)

Create the correct Toil Job object for the CWL tool.

remove_pickle_problems(obj)

Remove doc_loader from objects, because it does not pickle correctly and causes Toil errors.

visitSteps(cmdline_tool, op)

Iterate over a CWL Process object, running the op on each tool description CWL object.

rm_unprocessed_secondary_files(job_params)

filtered_secondary_files(unfiltered_secondary_files)

Remove unprocessed secondary files.

scan_for_unsupported_requirements(tool[, ...])

Scan the given CWL tool for any unsupported optional features.

determine_load_listing(tool)

Determine the directory.listing feature in CWL.

generate_default_job_store(batch_system_name, ...)

Choose a default job store appropriate to the requested batch system and provisioner, and installed modules.

get_options(args)

Parse given args and properly add non-Toil arguments into the cwljob of the Namespace.

main([args, stdout])

Run the main loop for toil-cwl-runner.

find_default_container(args, builder)

Find the default container by consulting a Toil.options object.

Attributes

logger

DEFAULT_TMPDIR

DEFAULT_TMPDIR_PREFIX

MISSING_FILE

DirectoryContents

ProcessType

usage_message

toil.cwl.cwltoil.logger
toil.cwl.cwltoil.DEFAULT_TMPDIR
toil.cwl.cwltoil.DEFAULT_TMPDIR_PREFIX
toil.cwl.cwltoil.cwltoil_was_removed()

Complain about deprecated entrypoint.

Return type:

None

class toil.cwl.cwltoil.UnresolvedDict

Bases: Dict[Any, Any]

Tag to indicate a dict contains promises that must be resolved.

class toil.cwl.cwltoil.SkipNull

Internal sentinel object.

Indicates a null value produced by each port of a skipped conditional step. The CWL 1.2 specification calls for treating this exactly the same as a null value.

toil.cwl.cwltoil.filter_skip_null(name, value)

Recursively filter out SkipNull objects from ‘value’.

Parameters:
  • name (str) – Name of port producing this value. Only used when we find an unhandled null from a conditional step and we print out a warning. The name allows the user to better localize which step/port was responsible for the unhandled null.

  • value (Any) – port output value object

Return type:

Any
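
A minimal sketch of the recursive filtering idea, not the module's implementation. It assumes that a filtered sentinel becomes None (in line with the CWL 1.2 rule that a skipped output is treated as null) and leaves out the warning that the name parameter is used for. SkipNullSketch and filter_skip_null_sketch are hypothetical stand-ins.

```python
from typing import Any


class SkipNullSketch:
    """Hypothetical stand-in for the SkipNull sentinel."""


def filter_skip_null_sketch(value: Any) -> Any:
    """Recursively replace sentinel objects with None in nested lists and dicts."""
    if isinstance(value, SkipNullSketch):
        # Assumption: a skipped output is treated like a plain null.
        return None
    if isinstance(value, list):
        return [filter_skip_null_sketch(item) for item in value]
    if isinstance(value, dict):
        return {key: filter_skip_null_sketch(item) for key, item in value.items()}
    # Anything else passes through unchanged.
    return value


nested = {"out": [1, SkipNullSketch(), {"inner": SkipNullSketch()}]}
cleaned = filter_skip_null_sketch(nested)
```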

toil.cwl.cwltoil.ensure_no_collisions(directory, dir_description=None)

Make sure no items in the given CWL Directory have the same name.

If any do, raise a WorkflowException about a “File staging conflict”.

Does not recurse into subdirectories.

Parameters:
  • directory (cwltool.utils.DirectoryType)

  • dir_description (Optional[str])

Return type:

None
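
A rough sketch of a non-recursive collision check, assuming the Directory is a plain dict with a "listing" of items that each carry a "basename". StagingConflictError is a hypothetical stand-in for cwltool's WorkflowException, and the message wording is illustrative only.

```python
from typing import Any, Dict, Optional, Set


class StagingConflictError(Exception):
    """Hypothetical stand-in for cwltool's WorkflowException."""


def ensure_no_collisions_sketch(
    directory: Dict[str, Any], dir_description: Optional[str] = None
) -> None:
    """Raise if two direct children of the directory share a basename."""
    if dir_description is None:
        dir_description = "directory"
    seen: Set[str] = set()
    # Only the top-level listing is inspected; subdirectories are not visited.
    for item in directory.get("listing", []):
        name = item.get("basename")
        if name in seen:
            raise StagingConflictError(
                f"File staging conflict: duplicate entries for {name!r} in {dir_description}"
            )
        seen.add(name)


ok_dir = {"class": "Directory", "listing": [{"basename": "a.txt"}, {"basename": "b.txt"}]}
bad_dir = {"class": "Directory", "listing": [{"basename": "a.txt"}, {"basename": "a.txt"}]}
ensure_no_collisions_sketch(ok_dir)
```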

class toil.cwl.cwltoil.Conditional(expression=None, outputs=None, requirements=None, container_engine='docker')

Object holding conditional expression until we are ready to evaluate it.

Evaluation occurs at the moment the enclosing step is ready to run.

Parameters:
  • expression (Optional[str])

  • outputs (Union[Dict[str, cwltool.utils.CWLOutputType], None])

  • requirements (Optional[List[cwltool.utils.CWLObjectType]])

  • container_engine (str)

is_false(job)

Determine if expression evaluates to False given completed step inputs.

Parameters:

job (cwltool.utils.CWLObjectType) – job output object

Returns:

bool

Return type:

bool

skipped_outputs()

Generate a dict of SkipNull objects corresponding to the output structure.

Return type:

Dict[str, SkipNull]

class toil.cwl.cwltoil.ResolveSource(name, input, source_key, promises)

Apply linkMerge and pickValue operators to values coming into a port.

Parameters:
promise_tuples: List[Tuple[str, toil.job.Promise]] | Tuple[str, toil.job.Promise]
__repr__()

Allow for debug printing.

Return type:

str

resolve()

First apply linkMerge then pickValue if either present.

Return type:

Any

link_merge(values)

Apply linkMerge operator to values object.

Parameters:

values (cwltool.utils.CWLObjectType) – result of step

Return type:

Union[List[cwltool.utils.CWLOutputType], cwltool.utils.CWLOutputType]

pick_value(values)

Apply pickValue operator to values object.

Parameters:

values (Union[List[Union[str, SkipNull]], Any]) – Intended to be a list, but other types will be returned without modification.

Returns:

Return type:

Any
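
The linkMerge and pickValue operators are defined by the CWL 1.2 workflow specification. The sketch below illustrates their semantics on plain Python lists, with None standing in for a null or skipped value. It is not the ResolveSource implementation, and the function names are hypothetical.

```python
from typing import Any, List


def link_merge_sketch(values: List[Any], method: str) -> List[Any]:
    """Combine values from several sources according to a linkMerge method."""
    if method == "merge_nested":
        # One entry per source, kept as-is.
        return list(values)
    if method == "merge_flattened":
        # Arrays are concatenated; non-array values are appended as single items.
        flat: List[Any] = []
        for value in values:
            if isinstance(value, list):
                flat.extend(value)
            else:
                flat.append(value)
        return flat
    raise ValueError(f"Unsupported linkMerge method: {method}")


def pick_value_sketch(values: Any, method: str) -> Any:
    """Select from a list according to a pickValue method."""
    if not isinstance(values, list):
        # Non-list values are returned without modification.
        return values
    non_null = [value for value in values if value is not None]
    if method == "first_non_null":
        if not non_null:
            raise ValueError("first_non_null requires at least one non-null value")
        return non_null[0]
    if method == "the_only_non_null":
        if len(non_null) != 1:
            raise ValueError("the_only_non_null requires exactly one non-null value")
        return non_null[0]
    if method == "all_non_null":
        return non_null
    raise ValueError(f"Unsupported pickValue method: {method}")


merged = link_merge_sketch([[1, 2], 3], "merge_flattened")
first = pick_value_sketch([None, "a", "b"], "first_non_null")
```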

class toil.cwl.cwltoil.StepValueFrom(expr, source, req, container_engine)

A workflow step input which has a valueFrom expression attached to it.

The valueFrom expression will be evaluated to produce the actual input object for the step.

Parameters:
  • expr (str)

  • source (Any)

  • req (List[cwltool.utils.CWLObjectType])

  • container_engine (str)

__repr__()

Allow for debug printing.

Return type:

str

eval_prep(step_inputs, file_store)

Resolve the contents of any file in a set of inputs.

The inputs must be associated with the StepValueFrom object’s self.source.

Called when loadContents is specified.

Parameters:
Return type:

None

resolve()

Resolve the promise in the valueFrom expression’s context.

Returns:

object that will serve as expression context

Return type:

Any

do_eval(inputs)

Evaluate the valueFrom expression with the given input object.

Parameters:

inputs (cwltool.utils.CWLObjectType)

Returns:

object

Return type:

Any

class toil.cwl.cwltoil.DefaultWithSource(default, source)

A workflow step input that has both a source and a default value.

Parameters:
  • default (Any)

  • source (Any)

__repr__()

Allow for debug printing.

Return type:

str

resolve()

Determine the final input value when the time is right.

(when the source can be resolved)

Returns:

dict

Return type:

Any

class toil.cwl.cwltoil.JustAValue(val)

A simple value masquerading as a ‘resolve’-able object.

Parameters:

val (Any)

__repr__()

Allow for debug printing.

Return type:

str

resolve()

Return the value.

Return type:

Any
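
StepValueFrom, DefaultWithSource and JustAValue all expose a resolve() method. As an illustration of that shared shape, the sketch below uses hypothetical stand-in classes and assumes that a default is used only when the source resolves to None. The real classes may behave differently.

```python
from typing import Any, Optional


class ConstantSketch:
    """Hypothetical wrapper that makes a plain value resolve-able."""

    def __init__(self, val: Any) -> None:
        self.val = val

    def resolve(self) -> Any:
        return self.val


class DefaultWithSourceSketch:
    """Hypothetical input holding both a default and a resolve-able source."""

    def __init__(self, default: Any, source: Optional[ConstantSketch]) -> None:
        self.default = default
        self.source = source

    def resolve(self) -> Any:
        # Assumption: prefer the source value and fall back to the default on null.
        if self.source is not None:
            result = self.source.resolve()
            if result is not None:
                return result
        return self.default


from_source = DefaultWithSourceSketch("fallback", ConstantSketch("upstream")).resolve()
from_default = DefaultWithSourceSketch("fallback", ConstantSketch(None)).resolve()
```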

toil.cwl.cwltoil.resolve_dict_w_promises(dict_w_promises, file_store=None)

Resolve a dictionary of promises and evaluate expressions to produce the actual values.

Parameters:
Returns:

dictionary of actual values

Return type:

cwltool.utils.CWLObjectType

toil.cwl.cwltoil.simplify_list(maybe_list)

Turn a length one list loaded by cwltool into a scalar.

Anything else is passed as-is, by reference.

Parameters:

maybe_list (Any)

Return type:

Any
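
A small sketch of the described behaviour, assuming an ordinary Python list as the input container. simplify_list_sketch is a hypothetical name.

```python
from typing import Any


def simplify_list_sketch(maybe_list: Any) -> Any:
    """Unwrap a single-item list; return anything else unchanged."""
    if isinstance(maybe_list, list) and len(maybe_list) == 1:
        return maybe_list[0]
    # Passed through by reference, not copied.
    return maybe_list


single = simplify_list_sketch(["only"])
pair = ["a", "b"]
unchanged = simplify_list_sketch(pair)
```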

class toil.cwl.cwltoil.ToilPathMapper(referenced_files, basedir, stagedir, separateDirs=True, get_file=None, stage_listing=False, streaming_allowed=True)

Bases: cwltool.pathmapper.PathMapper

Keeps track of files in a Toil way.

Maps between the symbolic identifier of a file (the Toil FileID), its local path on the host (the value returned by readGlobalFile) and the location of the file inside the software container.

Parameters:
  • referenced_files (List[cwltool.utils.CWLObjectType])

  • basedir (str)

  • stagedir (str)

  • separateDirs (bool)

  • get_file (Union[Any, None])

  • stage_listing (bool)

  • streaming_allowed (bool)

visit(obj, stagedir, basedir, copy=False, staged=False)

Iterate over a CWL object, resolving File and Directory path references.

This is called on each File or Directory CWL object. The Files and Directories all have “location” fields. For the Files, these are from upload_file(), and for the Directories, these are from upload_directory() or cwltool internally. With upload_directory(), they and their children will be assigned locations based on listing the Directories using ToilFsAccess. With cwltool, locations will be set as absolute paths.

Parameters:
  • obj (cwltool.utils.CWLObjectType) – The CWL File or Directory to process

  • stagedir (str) – The base path for target paths to be generated under, except when a File or Directory has an overriding parent directory in dirname

  • basedir (str) – The directory from which relative paths should be resolved; used as the base directory for the StdFsAccess that generated the listing being processed.

  • copy (bool) – If set, use writable types for Files and Directories.

  • staged (bool) – Starts as True at the top of the recursion. Set to False when entering a directory that we can actually download, so we don’t stage files and subdirectories separately from the directory as a whole. Controls the staged flag on generated mappings, and therefore whether files and directories are actually placed at their mapped-to target locations. If stage_listing is True, we will leave this True throughout and stage everything.

Return type:

None

Produces one MapperEnt for every unique location for a File or Directory. These MapperEnt objects are instructions to cwltool’s stage_files function: https://github.com/common-workflow-language/cwltool/blob/a3e3a5720f7b0131fa4f9c0b3f73b62a347278a6/cwltool/process.py#L254

The MapperEnt has fields:

resolved: An absolute local path anywhere on the filesystem where the file/directory can be found, or the contents of a file to populate it with if type is CreateWritableFile or CreateFile. Or, a URI understood by the StdFsAccess in use (for example, toilfile:).

target: An absolute path under stagedir that the file or directory will then be placed at by cwltool. Except if a File or Directory has a dirname field, giving its parent path, that is used instead.

type: One of:

File: cwltool will copy or link the file from resolved to target, if possible.

CreateFile: cwltool will create the file at target, treating resolved as the contents.

WritableFile: cwltool will copy the file from resolved to target, making it writable.

CreateWritableFile: cwltool will create the file at target, treating resolved as the contents, and make it writable.

Directory: cwltool will copy or link the directory from resolved to target, if possible. Otherwise, cwltool will make the directory at target if resolved starts with “_:”. Otherwise it will do nothing.

WritableDirectory: cwltool will copy the directory from resolved to target, if possible. Otherwise, cwltool will make the directory at target if resolved starts with “_:”. Otherwise it will do nothing.

staged: if set to False, cwltool will not make or copy anything for this entry

class toil.cwl.cwltoil.ToilSingleJobExecutor

Bases: cwltool.executors.SingleJobExecutor

A SingleJobExecutor that does not assume it is at the top level of the workflow.

We need this because otherwise every job thinks it is top level and tries to discover secondary files, which may exist when they haven’t actually been passed at the top level and thus aren’t supposed to be visible.

run_jobs(process, job_order_object, logger, runtime_context)

run_jobs from SingleJobExecutor, but not in a top level runtime context.

Parameters:
Return type:

None

class toil.cwl.cwltoil.ToilTool(*args, **kwargs)

Mixin to hook Toil into a cwltool tool type.

Parameters:
  • args (Any)

  • kwargs (Any)

connect_toil_job(job)

Attach the Toil tool to the Toil job that is executing it. This allows it to use the Toil job to stop at certain points if debugging flags are set.

Parameters:

job (toil.job.Job)

Return type:

None

make_path_mapper(reffiles, stagedir, runtimeContext, separateDirs)

Create the appropriate PathMapper for the situation.

Parameters:
Return type:

cwltool.pathmapper.PathMapper

__str__()

Return string representation of this tool type.

Return type:

str

class toil.cwl.cwltoil.ToilCommandLineTool(*args, **kwargs)

Bases: ToilTool, cwltool.command_line_tool.CommandLineTool

Subclass the cwltool command line tool to provide the custom ToilPathMapper.

Parameters:
  • args (Any)

  • kwargs (Any)

class toil.cwl.cwltoil.ToilExpressionTool(*args, **kwargs)

Bases: ToilTool, cwltool.command_line_tool.ExpressionTool

Subclass the cwltool expression tool to provide the custom ToilPathMapper.

Parameters:
  • args (Any)

  • kwargs (Any)

toil.cwl.cwltoil.toil_make_tool(toolpath_object, loadingContext)

Emit custom ToilCommandLineTools.

This factory function is meant to be passed to cwltool.load_tool().

Parameters:
Return type:

cwltool.process.Process

toil.cwl.cwltoil.MISSING_FILE = 'missing://'
toil.cwl.cwltoil.DirectoryContents
toil.cwl.cwltoil.check_directory_dict_invariants(contents)

Make sure a directory structure dict makes sense. Throws an error otherwise.

Currently just checks to make sure no empty-string keys exist.

Parameters:

contents (DirectoryContents)

Return type:

None
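
A sketch of the empty-key check on a nested directory structure dict. It assumes the check descends into subdirectory dicts and raises a RuntimeError. Both are assumptions, as is the function name.

```python
from typing import Any, Dict


def check_directory_dict_sketch(contents: Dict[str, Any]) -> None:
    """Reject directory structure dicts that contain an empty-string name."""
    for name, item in contents.items():
        if name == "":
            raise RuntimeError("Found nameless entry in directory structure")
        if isinstance(item, dict):
            # A nested dict is a subdirectory; check it the same way.
            check_directory_dict_sketch(item)


good = {"a.txt": "toilfile:abc", "sub": {"b.txt": "toilfile:def"}}
bad = {"sub": {"": "toilfile:def"}}
check_directory_dict_sketch(good)
```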

toil.cwl.cwltoil.decode_directory(dir_path)

Decode a directory from a “toildir:” path to a directory (or a file in it).

Returns the decoded directory dict, the remaining part of the path (which may be None), and the deduplication key string that uniquely identifies the directory.

Parameters:

dir_path (str)

Return type:

Tuple[DirectoryContents, Optional[str], str]

toil.cwl.cwltoil.encode_directory(contents)

Encode a directory structure dict into a “toildir:” URI.

Takes the directory dict, which is a dict from name to URI for a file or dict for a subdirectory.

Parameters:

contents (DirectoryContents)

Return type:

str
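
One way such a round trip could work is sketched below. It assumes the directory dict is serialized as JSON and URL-safe base64 behind the "toildir:" prefix, and that the encoded string doubles as the deduplication key. The actual encoding used by the module may differ, and both function names are hypothetical.

```python
import base64
import json
from typing import Any, Dict, Optional, Tuple

PREFIX = "toildir:"


def encode_directory_sketch(contents: Dict[str, Any]) -> str:
    """Pack a directory structure dict into a single URI string."""
    payload = json.dumps(contents).encode("utf-8")
    return PREFIX + base64.urlsafe_b64encode(payload).decode("utf-8")


def decode_directory_sketch(dir_path: str) -> Tuple[Dict[str, Any], Optional[str], str]:
    """Return (contents, remaining sub-path or None, deduplication key)."""
    if not dir_path.startswith(PREFIX):
        raise ValueError(f"Not a {PREFIX} path: {dir_path}")
    # URL-safe base64 never contains '/', so the first '/' starts the sub-path.
    encoded, _, remainder = dir_path[len(PREFIX):].partition("/")
    contents = json.loads(base64.urlsafe_b64decode(encoded.encode("utf-8")).decode("utf-8"))
    return contents, (remainder or None), encoded


structure = {"a.txt": "toilfile:abc", "sub": {"b.txt": "toilfile:def"}}
uri = encode_directory_sketch(structure)
decoded, subpath, key = decode_directory_sketch(uri + "/sub/b.txt")
```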

class toil.cwl.cwltoil.ToilFsAccess(basedir, file_store=None)

Bases: cwltool.stdfsaccess.StdFsAccess

Custom filesystem access class which handles toil filestore references.

Normal file paths will be resolved relative to basedir, but ‘toilfile:’ and ‘toildir:’ URIs will be fulfilled from the Toil file store.

Also supports URLs supported by Toil job store implementations.

Parameters:
glob(pattern)
Parameters:

pattern (str)

Return type:

List[str]

open(fn, mode)
Parameters:
Return type:

IO[Any]

exists(path)

Test for file existence.

Parameters:

path (str)

Return type:

bool

size(path)
Parameters:

path (str)

Return type:

int

isfile(fn)
Parameters:

fn (str)

Return type:

bool

isdir(fn)
Parameters:

fn (str)

Return type:

bool

listdir(fn)
Parameters:

fn (str)

Return type:

List[str]

join(path, *paths)
Parameters:
Return type:

str

realpath(fn)
Parameters:

fn (str)

Return type:

str

toil.cwl.cwltoil.toil_get_file(file_store, index, existing, uri, streamable=False, streaming_allowed=True, pipe_threads=None)

Set up the given file or directory from the Toil jobstore at a file URI where it can be accessed locally.

Run as part of the tool setup, inside jobs on the workers. Also used as part of reorganizing files to get them uploaded at the end of a tool.

Parameters:
  • file_store (toil.fileStores.abstractFileStore.AbstractFileStore) – The Toil file store to download from.

  • index (Dict[str, str]) – Maps from downloaded file path back to input Toil URI.

  • existing (Dict[str, str]) – Maps from URI to downloaded file path.

  • uri (str) – The URI for the file to download.

  • streamable (bool) – If the file has the ‘streamable’ flag set

  • streaming_allowed (bool) – If streaming is allowed

  • pipe_threads (Optional[List[Tuple[threading.Thread, int]]]) – List of threads responsible for streaming the data and open file descriptors corresponding to those files. The caller is responsible for closing the file descriptors (to break the pipes) and joining the threads.

Return type:

str

toil.cwl.cwltoil.write_file(writeFunc, index, existing, file_uri)

Write a file into the Toil jobstore.

‘existing’ is a set of files retrieved as inputs from toil_get_file. This ensures they are mapped back as the same name if passed through.

Returns a toil uri path to the object.

Parameters:
Return type:

str

toil.cwl.cwltoil.path_to_loc(obj)

Make a path into a location.

(If a CWL object has a “path” and not a “location”)

Parameters:

obj (cwltool.utils.CWLObjectType)

Return type:

None
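
A sketch of the in-place rewrite, assuming the CWL object is a plain dict and that the "path" key is dropped once it has been copied to "location". path_to_loc_sketch is a hypothetical name.

```python
from typing import Any, Dict


def path_to_loc_sketch(obj: Dict[str, Any]) -> None:
    """Move 'path' to 'location' when no location is set; modifies obj in place."""
    if "location" not in obj and "path" in obj:
        obj["location"] = obj["path"]
        del obj["path"]


only_path = {"class": "File", "path": "/data/reads.fq"}
path_to_loc_sketch(only_path)

has_both = {"class": "File", "path": "/tmp/x", "location": "file:///tmp/x"}
path_to_loc_sketch(has_both)
```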

toil.cwl.cwltoil.import_files(import_function, fs_access, fileindex, existing, cwl_object, mark_broken=False, skip_remote=False, bypass_file_store=False, log_level=logging.DEBUG)

Prepare all files and directories.

Will be executed from the leader or worker in the context of the given CWL tool, order, or output object to be used on the workers. Make sure their sizes are set and import all the files.

Recurses inside directories using the fs_access to find files to upload and subdirectory structure to encode, even if their listings are not set or not recursive.

Preserves any listing fields.

If a file cannot be found (like if it is an optional secondary file that doesn’t exist), fails, unless mark_broken is set, in which case it applies a sentinel location.

Also does some miscellaneous normalization.

Parameters:
  • import_function (Callable[[str], toil.fileStores.FileID]) – The function used to upload a URI and get a Toil FileID for it.

  • fs_access (cwltool.stdfsaccess.StdFsAccess) – the CWL FS access object we use to access the filesystem to find files to import. Needs to support the URI schemes used.

  • fileindex (Dict[str, str]) – Forward map to fill in from file URI to Toil storage location, used by write_file to deduplicate writes.

  • existing (Dict[str, str]) – Reverse map to fill in from Toil storage location to file URI. Not read from.

  • cwl_object (Optional[cwltool.utils.CWLObjectType]) – CWL tool (or workflow order) we are importing files for

  • mark_broken (bool) – If True, when files can’t be imported because they e.g. don’t exist, set their locations to MISSING_FILE rather than failing with an error.

  • skip_remote (bool) – If True, leave remote URIs in place instead of importing files.

  • bypass_file_store (bool) – If True, leave file:// URIs in place instead of importing files and directories.

  • log_level (int) – Log imported files at the given level.

Return type:

None

toil.cwl.cwltoil.upload_directory(directory_metadata, directory_contents, mark_broken=False)

Upload a Directory object.

Ignores the listing (which may not be recursive and isn’t safe or efficient to touch), and instead uses directory_contents, which is a recursive dict structure from filename to file URI or subdirectory contents dict.

Makes sure the directory actually exists, and rewrites its location to be something we can use on another machine.

If mark_broken is set, ignores missing directories and replaces them with directories containing the given (possibly empty) contents.

We can’t rely on the directory’s listing as visible to the next tool as a complete recursive description of the files we will need to present to the tool, since some tools require it to be cleared or single-level but still expect to see its contents in the filesystem.

Parameters:
  • directory_metadata (cwltool.utils.CWLObjectType)

  • directory_contents (DirectoryContents)

  • mark_broken (bool)

Return type:

None

toil.cwl.cwltoil.upload_file(uploadfunc, fileindex, existing, file_metadata, mark_broken=False, skip_remote=False)

Update a file object so that the file will be accessible from another machine.

Uploads local files to the Toil file store, and sets their location to a reference to the toil file store.

If a file doesn’t exist, fails with an error, unless mark_broken is set, in which case the missing file is given a special sentinel location.

Unless skip_remote is set, downloads remote files into the file store and sets their locations to references into the file store as well.

Parameters:
Return type:

None

toil.cwl.cwltoil.writeGlobalFileWrapper(file_store, fileuri)

Wrap writeGlobalFile to accept file:// URIs.

Parameters:
Return type:

toil.fileStores.FileID

toil.cwl.cwltoil.remove_empty_listings(rec)
Parameters:

rec (cwltool.utils.CWLObjectType)

Return type:

None

class toil.cwl.cwltoil.CWLNamedJob(cores=1, memory='1GiB', disk='1MiB', accelerators=None, preemptible=None, tool_id=None, parent_name=None, subjob_name=None, local=None)

Bases: toil.job.Job

Base class for all CWL jobs that do user work, to give them useful names.

Parameters:
class toil.cwl.cwltoil.ResolveIndirect(cwljob, parent_name=None)

Bases: CWLNamedJob

Helper Job.

Accepts an unresolved dict (containing promises) and produces a dictionary of actual values.

Parameters:
  • cwljob (toil.job.Promised[cwltool.utils.CWLObjectType])

  • parent_name (Optional[str])

run(file_store)

Evaluate the promises and return their values.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

cwltool.utils.CWLObjectType

toil.cwl.cwltoil.toilStageFiles(toil, cwljob, outdir, destBucket=None, log_level=logging.DEBUG)

Copy input files out of the global file store and update location and path.

Parameters:
  • destBucket (Union[str, None]) – If set, export to this base URL instead of to the local filesystem.

  • log_level (int) – Log each file transferred at the given level.

  • toil (toil.common.Toil)

  • cwljob (Union[cwltool.utils.CWLObjectType, List[cwltool.utils.CWLObjectType]])

  • outdir (str)

Return type:

None

class toil.cwl.cwltoil.CWLJobWrapper(tool, cwljob, runtime_context, parent_name, conditional=None)

Bases: CWLNamedJob

Wrap a CWL job that uses dynamic resources requirement.

When executed, this creates a new child job which has the correct resource requirement set.

Parameters:
run(file_store)

Create a child job with the correct resource requirements set.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

Any

class toil.cwl.cwltoil.CWLJob(tool, cwljob, runtime_context, parent_name=None, conditional=None)

Bases: CWLNamedJob

Execute a CWL tool using cwltool.executors.SingleJobExecutor.

Parameters:
required_env_vars(cwljob)

Yield environment variables from EnvVarRequirement.

Parameters:

cwljob (Any)

Return type:

Iterator[Tuple[str, str]]

populate_env_vars(cwljob)

Prepare environment variables necessary at runtime for the job.

Env vars specified in the CWL “requirements” section should already be loaded in self.cwltool.requirements. However, those specified with “EnvVarRequirement” take precedence and are only populated here. This method therefore returns a dictionary of all evaluated “EnvVarRequirement” env vars, and also checks self.cwltool.requirements for any env vars with the same name, replacing their values with those found in “EnvVarRequirement”.

Parameters:

cwljob (cwltool.utils.CWLObjectType)

Return type:

Dict[str, str]

run(file_store)

Execute the CWL document.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

Any

toil.cwl.cwltoil.get_container_engine(runtime_context)
Parameters:

runtime_context (cwltool.context.RuntimeContext)

Return type:

str

toil.cwl.cwltoil.makeJob(tool, jobobj, runtime_context, parent_name, conditional)

Create the correct Toil Job object for the CWL tool.

Types: workflow, job, or job wrapper for dynamic resource requirements.

Returns:

“wfjob, followOn” if the input tool is a workflow, and “job, job” otherwise

Parameters:
Return type:

Union[Tuple[CWLWorkflow, ResolveIndirect], Tuple[CWLJob, CWLJob], Tuple[CWLJobWrapper, CWLJobWrapper]]

class toil.cwl.cwltoil.CWLScatter(step, cwljob, runtime_context, parent_name, conditional)

Bases: toil.job.Job

Implement workflow scatter step.

When run, this creates a child job for each parameterization of the scatter.

Parameters:
flat_crossproduct_scatter(joborder, scatter_keys, outputs, postScatterEval)

Cartesian product of the inputs, then flattened.

Parameters:
  • joborder (cwltool.utils.CWLObjectType)

  • scatter_keys (List[str])

  • outputs (List[toil.job.Promised[cwltool.utils.CWLObjectType]])

  • postScatterEval (Callable[[cwltool.utils.CWLObjectType], cwltool.utils.CWLObjectType])

Return type:

None

nested_crossproduct_scatter(joborder, scatter_keys, postScatterEval)

Cartesian product of the inputs.

Parameters:
  • joborder (cwltool.utils.CWLObjectType)

  • scatter_keys (List[str])

  • postScatterEval (Callable[[cwltool.utils.CWLObjectType], cwltool.utils.CWLObjectType])

Return type:

List[toil.job.Promised[cwltool.utils.CWLObjectType]]
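
The two cross-product scatter methods differ only in the shape of their result. The sketch below shows that difference on plain dicts. It returns parameterized job orders instead of creating Toil child jobs, and omits the postScatterEval step. The function names are hypothetical.

```python
import itertools
from typing import Any, Dict, List


def flat_crossproduct_sketch(joborder: Dict[str, Any], scatter_keys: List[str]) -> List[Dict[str, Any]]:
    """Return one flat list with a job order per combination of scattered values."""
    jobs = []
    for combo in itertools.product(*(joborder[key] for key in scatter_keys)):
        job = dict(joborder)
        job.update(zip(scatter_keys, combo))
        jobs.append(job)
    return jobs


def nested_crossproduct_sketch(joborder: Dict[str, Any], scatter_keys: List[str]) -> Any:
    """Return nested lists, one nesting level per scatter key."""
    if not scatter_keys:
        return dict(joborder)
    first, rest = scatter_keys[0], scatter_keys[1:]
    result = []
    for value in joborder[first]:
        job = dict(joborder)
        job[first] = value
        # The remaining keys are expanded one level deeper.
        result.append(nested_crossproduct_sketch(job, rest))
    return result


order = {"a": [1, 2], "b": ["x", "y"], "c": 0}
flat = flat_crossproduct_sketch(order, ["a", "b"])
nested = nested_crossproduct_sketch(order, ["a", "b"])
```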

run(file_store)

Generate the follow on scatter jobs.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

List[toil.job.Promised[cwltool.utils.CWLObjectType]]

class toil.cwl.cwltoil.CWLGather(step, outputs)

Bases: toil.job.Job

Follows on to a scatter Job.

This gathers the outputs of each job in the scatter into an array for each output parameter.

Parameters:
static extract(obj, k)

Extract the given key from the obj.

If the object is a list, extract it from all members of the list.

Parameters:
  • obj (Union[cwltool.utils.CWLObjectType, List[cwltool.utils.CWLObjectType]])

  • k (str)

Return type:

Union[cwltool.utils.CWLOutputType, List[cwltool.utils.CWLObjectType]]
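
A sketch of key extraction over dicts and (possibly nested) lists. It assumes a missing key yields None and that any other input type yields an empty list. extract_sketch is a hypothetical name.

```python
from typing import Any


def extract_sketch(obj: Any, k: str) -> Any:
    """Pull key k out of a dict, or out of every member of a list."""
    if isinstance(obj, dict):
        return obj.get(k)
    if isinstance(obj, list):
        # Recurse so nested scatter results keep their shape.
        return [extract_sketch(item, k) for item in obj]
    return []


scatter_results = [{"out": 1}, [{"out": 2}, {"out": 3}]]
gathered = extract_sketch(scatter_results, "out")
```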

run(file_store)

Gather all the outputs of the scatter.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

Dict[str, Any]

class toil.cwl.cwltoil.SelfJob(j, v)

Bases: toil.job.Job

Fake job object to facilitate implementation of CWLWorkflow.run().

Parameters:
rv(*path)

Return our properties dictionary.

Parameters:

path (Any)

Return type:

Any

addChild(c)

Add a child to our workflow.

Parameters:

c (toil.job.Job)

Return type:

Any

hasChild(c)

Check if the given child is in our workflow.

Parameters:

c (toil.job.Job)

Return type:

Any

toil.cwl.cwltoil.ProcessType
toil.cwl.cwltoil.remove_pickle_problems(obj)

Remove doc_loader from objects, because it does not pickle correctly and causes Toil errors.

Parameters:

obj (ProcessType)

Return type:

ProcessType

class toil.cwl.cwltoil.CWLWorkflow(cwlwf, cwljob, runtime_context, parent_name=None, conditional=None)

Bases: CWLNamedJob

Toil Job to convert a CWL workflow graph into a Toil job graph.

The Toil job graph will include the appropriate dependencies.

Parameters:
run(file_store)

Convert a CWL Workflow graph into a Toil job graph.

Always runs on the leader, because the batch system knows to schedule it as a local job.

Parameters:

file_store (toil.fileStores.abstractFileStore.AbstractFileStore)

Return type:

Union[UnresolvedDict, Dict[str, SkipNull]]

toil.cwl.cwltoil.visitSteps(cmdline_tool, op)

Iterate over a CWL Process object, running the op on each tool description CWL object.

Parameters:
Return type:

None

toil.cwl.cwltoil.rm_unprocessed_secondary_files(job_params)
Parameters:

job_params (Any)

Return type:

None

toil.cwl.cwltoil.filtered_secondary_files(unfiltered_secondary_files)

Remove unprocessed secondary files.

Interpolated strings and optional inputs in secondary files were added to CWL in version 1.1.

The CWL libraries we call do successfully resolve the interpolated strings, but add the resolved fields to the list of unresolved fields so we remove them here after the fact.

We keep secondary files with anything other than MISSING_FILE as their location. The ‘required’ logic seems to be handled deeper in cwltool.builder.Builder(), and correctly determines which files should be imported. Therefore we remove the files here and if this file is SUPPOSED to exist, it will still give the appropriate file does not exist error, but just a bit further down the track.

Parameters:

unfiltered_secondary_files (cwltool.utils.CWLObjectType)

Return type:

List[cwltool.utils.CWLObjectType]
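
A sketch of the filtering rule stated above: keep secondary files whose location is anything other than MISSING_FILE. It assumes a File dict with a "secondaryFiles" list and an exact match on the sentinel. The real function may match differently, and the function name is hypothetical.

```python
from typing import Any, Dict, List

MISSING_FILE = "missing://"  # value documented for toil.cwl.cwltoil.MISSING_FILE


def filtered_secondary_files_sketch(file_obj: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Drop secondary files that were marked with the missing-file sentinel."""
    return [
        secondary
        for secondary in file_obj.get("secondaryFiles", [])
        if secondary.get("location") != MISSING_FILE
    ]


primary = {
    "class": "File",
    "location": "toilfile:main",
    "secondaryFiles": [
        {"class": "File", "location": "toilfile:index"},
        {"class": "File", "location": MISSING_FILE},
    ],
}
kept = filtered_secondary_files_sketch(primary)
```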

toil.cwl.cwltoil.scan_for_unsupported_requirements(tool, bypass_file_store=False)

Scan the given CWL tool for any unsupported optional features.

If it has them, raise an informative UnsupportedRequirement.

Parameters:
  • tool (cwltool.process.Process) – The CWL tool to check for unsupported requirements.

  • bypass_file_store (bool) – True if the Toil file store is not being used to transport files between nodes, and raw origin node file:// URIs are exposed to tools instead.

Return type:

None

toil.cwl.cwltoil.determine_load_listing(tool)

Determine the directory.listing feature in CWL.

In CWL, any input directory can have a DIRECTORY_NAME.listing (where DIRECTORY_NAME is any variable name) set to one of the following three options:

  1. no_listing: DIRECTORY_NAME.listing will be undefined. e.g.

    inputs.DIRECTORY_NAME.listing == unspecified

  2. shallow_listing: DIRECTORY_NAME.listing will return a list one level deep of DIRECTORY_NAME’s contents. e.g.

    inputs.DIRECTORY_NAME.listing == [items in directory]

    inputs.DIRECTORY_NAME.listing[0].listing == undefined

    inputs.DIRECTORY_NAME.listing.length == # of items in directory

  3. deep_listing: DIRECTORY_NAME.listing will return a list of the entire contents of DIRECTORY_NAME. e.g.

    inputs.DIRECTORY_NAME.listing == [items in directory]

    inputs.DIRECTORY_NAME.listing[0].listing == [items in subdirectory if it exists and is the first item listed]

    inputs.DIRECTORY_NAME.listing.length == # of items in directory

See https://www.commonwl.org/v1.1/CommandLineTool.html#LoadListingRequirement and https://www.commonwl.org/v1.1/CommandLineTool.html#LoadListingEnum

DIRECTORY_NAME.listing should be determined first from loadListing. If that is not specified, it is determined from LoadListingRequirement. If neither is specified, it defaults to “no_listing”.

Parameters:

tool (cwltool.process.Process) – ToilCommandLineTool

Returns:

One of ‘no_listing’, ‘shallow_listing’, or ‘deep_listing’.

Return type:

typing_extensions.Literal[no_listing, shallow_listing, deep_listing]
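
The precedence described above can be sketched on a plain dict. The layout used here (a top-level "loadListing" field and a "requirements" list of dicts with a "class" key) is an assumption for illustration, not how a cwltool Process object is actually queried, and the function name is hypothetical.

```python
from typing import Any, Dict

VALID_LISTINGS = ("no_listing", "shallow_listing", "deep_listing")


def determine_load_listing_sketch(tool_doc: Dict[str, Any]) -> str:
    """Pick the listing mode: explicit field, then requirement, then default."""
    listing = tool_doc.get("loadListing")
    if listing is None:
        for requirement in tool_doc.get("requirements", []):
            if requirement.get("class") == "LoadListingRequirement":
                listing = requirement.get("loadListing")
                break
    if listing is None:
        listing = "no_listing"
    if listing not in VALID_LISTINGS:
        raise ValueError(f"Unknown loadListing value: {listing}")
    return listing


requirement_only = {
    "requirements": [{"class": "LoadListingRequirement", "loadListing": "deep_listing"}]
}
both = dict(requirement_only, loadListing="shallow_listing")
```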

exception toil.cwl.cwltoil.NoAvailableJobStoreException

Bases: Exception

Indicates that no job store name is available.

toil.cwl.cwltoil.generate_default_job_store(batch_system_name, provisioner_name, local_directory)

Choose a default job store appropriate to the requested batch system and provisioner, and installed modules. Raises an error if no good default is available and the user must choose manually.

Parameters:
  • batch_system_name (Optional[str]) – Registry name of the batch system the user has requested, if any. If no name has been requested, should be None.

  • provisioner_name (Optional[str]) – Name of the provisioner the user has requested, if any. Recognized provisioners include ‘aws’ and ‘gce’. None indicates that no provisioner is in use.

  • local_directory (str) – Path to a nonexistent local directory suitable for use as a file job store.

Returns:

Job store specifier for a usable job store.

Return type:

str

toil.cwl.cwltoil.usage_message
toil.cwl.cwltoil.get_options(args)

Parse given args and properly add non-Toil arguments into the cwljob of the Namespace.

Parameters:

args (List[str]) – List of args from command line

Returns:

options namespace

Return type:

configargparse.Namespace

toil.cwl.cwltoil.main(args=None, stdout=sys.stdout)

Run the main loop for toil-cwl-runner.

Parameters:
  • args (Optional[List[str]])

  • stdout (TextIO)

Return type:

int

toil.cwl.cwltoil.find_default_container(args, builder)

Find the default container by consulting a Toil.options object.

Parameters:
Return type:

Optional[str]