toil.wdl.wdl_functions

Module Contents

Classes

WDLJSONEncoder

Extended JSONEncoder to support WDL-specific JSON encoding.

Functions

generate_docker_bashscript_file(temp_dir, docker_dir, ...)

Creates a bashscript to inject into a docker container for the job.

process_single_infile(wdl_file, fileStore)

process_infile(f, fileStore)

Takes any input and imports the WDLFile into the fileStore.

sub(input_str, pattern, replace)

Given 3 String parameters input, pattern, replace, this function will replace any occurrence matching pattern in input by replace.

defined(i)

process_single_outfile(wdl_file, fileStore, workDir, ...)

process_outfile(f, fileStore, workDir, outDir)

abspath_single_file(f, cwd)

abspath_file(f, cwd)

read_single_file(f, tempDir, fileStore[, docker])

read_file(f, tempDir, fileStore[, docker])

process_and_read_file(f, tempDir, fileStore[, docker])

generate_stdout_file(output, tempDir, fileStore[, stderr])

Create a stdout (or stderr) file from a string or bytes object.

parse_memory(memory)

Parses a string representing memory and returns an integer number of bytes.

parse_cores(cores)

parse_disk(disk)

is_number(s)

size([f, unit, fileStore])

Given a File and a String (optional), returns the size of the file in Bytes or in the unit specified by the second argument.

select_first(values)

combine_dicts(dict1, dict2)

basename(path[, suffix])

https://software.broadinstitute.org/wdl/documentation/article?id=10554

heredoc_wdl(template[, dictionary, indent])

floor(i)

Converts a Float value into an Int by rounding down to the next lower integer.

ceil(i)

Converts a Float value into an Int by rounding up to the next higher integer.

read_lines(path)

Given a file-like object (String, File) as a parameter, this will read each line as a string and return an Array[String] representation of the lines in the file.

read_tsv(path[, delimiter])

Take a tsv filepath and return an array; e.g. [[],[],[]].

read_csv(path)

Take a csv filepath and return an array; e.g. [[],[],[]].

read_json(path)

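The read_json() function takes one parameter, which is a file-like object (String, File), and returns a data type which matches the data structure in the JSON file.

read_map(path)

Given a file-like object (String, File) as a parameter, this will read each line from a file and expect it to hold two tab-separated columns.

read_int(path)

The read_int() function takes a file path which is expected to contain 1 line with 1 integer on it.

read_string(path)

The read_string() function takes a file path which is expected to contain 1 line with 1 string on it.

read_float(path)

The read_float() function takes a file path which is expected to contain 1 line with 1 floating point number on it.

read_boolean(path)

The read_boolean() function takes a file path which is expected to contain 1 line with 1 Boolean value (either “true” or “false”) on it.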

write_lines(in_lines[, temp_dir, file_store])

Given something that's compatible with Array[String], this writes each element to its own line in a file.

write_tsv(in_tsv[, delimiter, temp_dir, file_store])

Given something that's compatible with Array[Array[String]], this writes a TSV file of the data structure.

write_json(in_json[, indent, separators, temp_dir, ...])

Given something with any type, this writes the JSON equivalent to a file.

write_map(in_map[, temp_dir, file_store])

Given something that's compatible with Map[String, String], this writes a TSV file of the data structure.

wdl_range(num)

Given an integer argument, the range function creates an array of integers of length equal to the given argument.

transpose(in_array)

Given a two dimensional array argument, the transpose function transposes the two dimensional array according to the standard matrix transpose rules.

length(in_array)

Given an Array, the length function returns the number of elements in the Array as an Int.

wdl_zip(left, right)

Return the dot product of the two arrays. If the arrays have different lengths it is an error.

cross(left, right)

Return the cross product of the two arrays. Array[Y][1] appears before Array[X][1] in the output.

as_pairs(in_map)

Given a Map, the as_pairs function returns an Array containing each element in the form of a Pair.

as_map(in_array)

Given an Array consisting of Pairs, the as_map function returns a Map in which the left elements of the Pairs are the keys and the right elements the values.

keys(in_map)

Given a Map, the keys function returns an Array consisting of the keys in the Map.

collect_by_key(in_array)

Given an Array consisting of Pairs, the collect_by_key function returns a Map in which the left elements of the Pairs are the keys and each value is an Array of the right elements that share that key.

flatten(in_array)

Given an array of arrays, the flatten function concatenates all the member arrays in order of appearance to give the result.

Attributes

logger

toil.wdl.wdl_functions.logger
exception toil.wdl.wdl_functions.WDLRuntimeError(message)[source]

Bases: Exception

WDL-related run-time error.

class toil.wdl.wdl_functions.WDLJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: json.JSONEncoder

Extended JSONEncoder to support WDL-specific JSON encoding.

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
toil.wdl.wdl_functions.generate_docker_bashscript_file(temp_dir, docker_dir, globs, cmd, job_name)[source]

Creates a bashscript to inject into a docker container for the job.

This script wraps the given job command(s) in a bash script, hard links the outputs, and produces an “rc” file containing the exit code. All of this is done to parallel the Broad’s Cromwell engine, which is the native WDL runner. Since Cromwell writes and then runs a bash script for every command, this function does the same.

Parameters
  • temp_dir – The current directory outside of docker to deposit the bashscript into, which will be the bind mount that docker loads files from into its own containerized filesystem. This is usually the tempDir created by this individual job using ‘tempDir = job.fileStore.getLocalTempDir()’.

  • docker_dir – The working directory inside of the docker container which is bind mounted to ‘temp_dir’. By default this is ‘data’.

  • globs – A list of expected output files to retrieve as glob patterns that will be returned as hard links to the current working directory.

  • cmd – A bash command to be written into the bash script and run.

  • job_name – The job’s name, only used to write in a file name identifying the script as written for that job. Will be used to call the script later.

Returns

Nothing, but it writes and deposits a bash script in temp_dir intended to be run inside of a docker container for this job.

toil.wdl.wdl_functions.process_single_infile(wdl_file, fileStore)[source]
Parameters
Return type

toil.wdl.wdl_types.WDLFile

toil.wdl.wdl_functions.process_infile(f, fileStore)[source]

Takes any input and imports the WDLFile into the fileStore.

This returns the input with all WDLFile instances imported into the fileStore. Toil does not preserve a file’s original name upon import, so the WDLFile also keeps track of it.

Parameters
toil.wdl.wdl_functions.sub(input_str, pattern, replace)[source]

Given 3 String parameters input, pattern, replace, this function will replace any occurrence matching pattern in input by replace. pattern is expected to be a regular expression. Details of regex evaluation will depend on the execution engine running the WDL.

WDL syntax: String sub(String, String, String)

Parameters
  • input_str (str) –

  • pattern (str) –

  • replace (str) –

Return type

str
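
As a rough illustration, the behaviour described above can be approximated with Python's `re.sub`. The helper name `sub_sketch` is hypothetical and this is not necessarily how Toil implements it; as noted above, regex details depend on the execution engine.

```python
import re


def sub_sketch(input_str: str, pattern: str, replace: str) -> str:
    """Replace every match of the regular expression `pattern` in `input_str`."""
    return re.sub(pattern, replace, input_str)


sentence = "I like chocolate when it's late"
print(sub_sketch(sentence, "like", "love"))
print(sub_sketch(sentence, "late$", "early"))
```

Because the pattern is a regular expression, `late$` only matches the final word and leaves "chocolate" untouched.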

toil.wdl.wdl_functions.defined(i)[source]
toil.wdl.wdl_functions.process_single_outfile(wdl_file, fileStore, workDir, outDir)[source]
Parameters

wdl_file (toil.wdl.wdl_types.WDLFile) –

Return type

toil.wdl.wdl_types.WDLFile

toil.wdl.wdl_functions.process_outfile(f, fileStore, workDir, outDir)[source]
toil.wdl.wdl_functions.abspath_single_file(f, cwd)[source]
Parameters
Return type

toil.wdl.wdl_types.WDLFile

toil.wdl.wdl_functions.abspath_file(f, cwd)[source]
Parameters
  • f (Any) –

  • cwd (str) –

toil.wdl.wdl_functions.read_single_file(f, tempDir, fileStore, docker=False)[source]
Parameters

f (toil.wdl.wdl_types.WDLFile) –

Return type

str

toil.wdl.wdl_functions.read_file(f, tempDir, fileStore, docker=False)[source]
Parameters
toil.wdl.wdl_functions.process_and_read_file(f, tempDir, fileStore, docker=False)[source]
toil.wdl.wdl_functions.generate_stdout_file(output, tempDir, fileStore, stderr=False)[source]

Create a stdout (or stderr) file from a string or bytes object.

Parameters
  • output (str|bytes) – A str or bytes object that holds the stdout/stderr text.

  • tempDir (str) – The directory to write the stdout file.

  • fileStore – A fileStore object.

  • stderr (bool) – If True, a stderr instead of a stdout file is generated.

Returns

The file path to the generated file.

toil.wdl.wdl_functions.parse_memory(memory)[source]

Parses a string representing memory and returns an integer number of bytes.

Parameters

memory

Returns

toil.wdl.wdl_functions.parse_cores(cores)[source]
toil.wdl.wdl_functions.parse_disk(disk)[source]
toil.wdl.wdl_functions.is_number(s)[source]
toil.wdl.wdl_functions.size(f=None, unit='B', fileStore=None)[source]

Given a File and a String (optional), returns the size of the file in Bytes or in the unit specified by the second argument.

Supported units are KiloByte (“K”, “KB”), MegaByte (“M”, “MB”), GigaByte (“G”, “GB”), TeraByte (“T”, “TB”) (powers of 1000) as well as their binary version (https://en.wikipedia.org/wiki/Binary_prefix) “Ki” (“KiB”), “Mi” (“MiB”), “Gi” (“GiB”), “Ti” (“TiB”) (powers of 1024). Default unit is Bytes (“B”).

WDL syntax: Float size(File, [String])

Varieties:
  • Float size(File?, [String])

  • Float size(Array[File], [String])

  • Float size(Array[File?], [String])

Parameters
Return type

float
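
The unit handling described above can be sketched as a simple lookup table. The names `UNIT_FACTORS` and `convert_size` are hypothetical, and the sketch converts an already-known byte count rather than inspecting a file or a fileStore.

```python
# Hypothetical table built from the units listed above:
# decimal units are powers of 1000, binary units are powers of 1024.
UNIT_FACTORS = {"B": 1}
for power, prefix in enumerate(["K", "M", "G", "T"], start=1):
    UNIT_FACTORS[prefix] = UNIT_FACTORS[prefix + "B"] = 1000 ** power
    UNIT_FACTORS[prefix + "i"] = UNIT_FACTORS[prefix + "iB"] = 1024 ** power


def convert_size(num_bytes: int, unit: str = "B") -> float:
    """Express a byte count in the requested unit."""
    if unit not in UNIT_FACTORS:
        raise ValueError(f"Unsupported unit: {unit!r}")
    return num_bytes / UNIT_FACTORS[unit]


print(convert_size(2048, "KiB"))
print(convert_size(1500, "KB"))
```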

toil.wdl.wdl_functions.select_first(values)[source]
toil.wdl.wdl_functions.combine_dicts(dict1, dict2)[source]
toil.wdl.wdl_functions.basename(path, suffix=None)[source]

https://software.broadinstitute.org/wdl/documentation/article?id=10554

toil.wdl.wdl_functions.heredoc_wdl(template, dictionary={}, indent='')[source]
toil.wdl.wdl_functions.floor(i)[source]

Converts a Float value into an Int by rounding down to the next lower integer.

Parameters

i (Union[int, float]) –

Return type

int

toil.wdl.wdl_functions.ceil(i)[source]

Converts a Float value into an Int by rounding up to the next higher integer.

Parameters

i (Union[int, float]) –

Return type

int
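
A minimal sketch of the rounding behaviour of floor and ceil, assuming they behave like Python's `math.floor` and `math.ceil`. The `_sketch` names are hypothetical. The difference from simple truncation shows up with negative inputs.

```python
import math


def floor_sketch(i):
    """Round down to the next lower integer."""
    return math.floor(i)


def ceil_sketch(i):
    """Round up to the next higher integer."""
    return math.ceil(i)


for value in (2.7, -2.5):
    print(value, floor_sketch(value), ceil_sketch(value))
```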

toil.wdl.wdl_functions.read_lines(path)[source]

Given a file-like object (String, File) as a parameter, this will read each line as a string and return an Array[String] representation of the lines in the file.

WDL syntax: Array[String] read_lines(String|File)

Parameters

path (str) –

Return type

List[str]

toil.wdl.wdl_functions.read_tsv(path, delimiter='\t')[source]

Take a tsv filepath and return an array; e.g. [[],[],[]].

For example, a file containing (columns separated by tab characters):

    1 2 3
    4 5 6
    7 8 9

would return the array: [['1','2','3'], ['4','5','6'], ['7','8','9']]

WDL syntax: Array[Array[String]] read_tsv(String|File)

Parameters
  • path (str) –

  • delimiter (str) –

Return type

List[List[str]]
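
One way to get this behaviour is Python's `csv` module with a configurable delimiter. This is a sketch under that assumption, not necessarily Toil's code; `read_tsv_sketch` and `write_temp_text` are hypothetical helpers.

```python
import csv
import os
import tempfile


def read_tsv_sketch(path, delimiter="\t"):
    """Return the rows of a delimited text file as a list of lists of strings."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter=delimiter)]


def write_temp_text(text):
    """Hypothetical helper: write text to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".tsv")
    with os.fdopen(fd, "w", newline="") as handle:
        handle.write(text)
    return path


example_path = write_temp_text("1\t2\t3\n4\t5\t6\n7\t8\t9\n")
rows = read_tsv_sketch(example_path)
os.remove(example_path)
print(rows)
```

Passing `delimiter=","` to the same function would cover the CSV case.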

toil.wdl.wdl_functions.read_csv(path)[source]

Take a csv filepath and return an array; e.g. [[],[],[]].

For example, a file containing:

    1,2,3
    4,5,6
    7,8,9

would return the array: [['1','2','3'], ['4','5','6'], ['7','8','9']]

Parameters

path (str) –

Return type

List[List[str]]

toil.wdl.wdl_functions.read_json(path)[source]

The read_json() function takes one parameter, which is a file-like object (String, File), and returns a data type which matches the data structure in the JSON file. See https://github.com/openwdl/wdl/blob/main/versions/development/SPEC.md#mixed-read_jsonstringfile

WDL syntax: mixed read_json(String|File)

Parameters

path (str) –

Return type

Any

toil.wdl.wdl_functions.read_map(path)[source]

Given a file-like object (String, File) as a parameter, this will read each line from a file and expect the line to have the format col1\tcol2, that is, two columns separated by a tab character. In other words, the file-like object must be a two-column TSV file.

WDL syntax: Map[String, String] read_map(String|File)

Parameters

path (str) –

Return type

Dict[str, str]
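
The two-column requirement can be sketched as follows. `read_map_sketch` and `write_temp_text` are hypothetical names. Skipping blank lines and raising `ValueError` on malformed lines are assumptions of this sketch, not documented behaviour.

```python
import os
import tempfile


def read_map_sketch(path):
    """Read a two-column TSV file into a dict mapping column 1 to column 2."""
    result = {}
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # Assumption: blank lines are ignored.
            # Unpacking fails with ValueError unless there are exactly two columns.
            key, value = line.split("\t")
            result[key] = value
    return result


def write_temp_text(text):
    """Hypothetical helper: write text to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".tsv")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


example_path = write_temp_text("key1\tvalue1\nkey2\tvalue2\n")
mapping = read_map_sketch(example_path)
os.remove(example_path)
print(mapping)
```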

toil.wdl.wdl_functions.read_int(path)[source]

The read_int() function takes a file path which is expected to contain 1 line with 1 integer on it. This function returns that integer.

WDL syntax: Int read_int(String|File)

Parameters

path (Union[str, toil.wdl.wdl_types.WDLFile]) –

Return type

int

toil.wdl.wdl_functions.read_string(path)[source]

The read_string() function takes a file path which is expected to contain 1 line with 1 string on it. This function returns that string.

WDL syntax: String read_string(String|File)

Parameters

path (Union[str, toil.wdl.wdl_types.WDLFile]) –

Return type

str

toil.wdl.wdl_functions.read_float(path)[source]

The read_float() function takes a file path which is expected to contain 1 line with 1 floating point number on it. This function returns that float.

WDL syntax: Float read_float(String|File)

Parameters

path (Union[str, toil.wdl.wdl_types.WDLFile]) –

Return type

float

toil.wdl.wdl_functions.read_boolean(path)[source]

The read_boolean() function takes a file path which is expected to contain 1 line with 1 Boolean value (either “true” or “false”) on it. This function returns that Boolean value.

WDL syntax: Boolean read_boolean(String|File)

Parameters

path (Union[str, toil.wdl.wdl_types.WDLFile]) –

Return type

bool
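
The parsing step can be sketched on the text of the line itself. `parse_boolean_line` is a hypothetical helper. Accepting any letter case and rejecting every other value are assumptions of this sketch.

```python
def parse_boolean_line(text: str) -> bool:
    """Convert the single line of a read_boolean-style file to a bool."""
    value = text.strip().lower()  # Assumption: case and surrounding whitespace are ignored.
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"Expected 'true' or 'false', got {text!r}")


print(parse_boolean_line("true\n"))
print(parse_boolean_line("false\n"))
```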

toil.wdl.wdl_functions.write_lines(in_lines, temp_dir=None, file_store=None)[source]

Given something that’s compatible with Array[String], this writes each element to its own line in a file, with newline (\n) characters as line separators.

WDL syntax: File write_lines(Array[String])

Parameters
Return type

str
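
A minimal sketch of the write step, assuming the output goes to a fresh temporary file and every element, including the last, is followed by a newline. `write_lines_sketch` is a hypothetical name, and the `file_store` argument of the real function is left out.

```python
import os
import tempfile


def write_lines_sketch(in_lines, temp_dir=None):
    """Write each element on its own line and return the path of the new file."""
    fd, path = tempfile.mkstemp(suffix=".tmp", dir=temp_dir)
    with os.fdopen(fd, "w") as handle:
        for line in in_lines:
            handle.write(f"{line}\n")
    return path


out_path = write_lines_sketch(["first", "second"])
with open(out_path) as handle:
    contents = handle.read()
os.remove(out_path)
print(repr(contents))
```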

toil.wdl.wdl_functions.write_tsv(in_tsv, delimiter='\t', temp_dir=None, file_store=None)[source]

Given something that’s compatible with Array[Array[String]], this writes a TSV file of the data structure.

WDL syntax: File write_tsv(Array[Array[String]])

Parameters
Return type

str

toil.wdl.wdl_functions.write_json(in_json, indent=None, separators=(',', ':'), temp_dir=None, file_store=None)[source]

Given something with any type, this writes the JSON equivalent to a file. See the table in the definition of read_json(): https://github.com/openwdl/wdl/blob/main/versions/development/SPEC.md#mixed-read_jsonstringfile

WDL syntax: File write_json(mixed)

Parameters
Return type

str
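
The default separators=(',', ':') in the signature produce compact JSON without spaces after commas or colons. The sketch below shows that effect with the standard `json` module. `to_compact_json` is a hypothetical helper that returns a string instead of writing a file, and it does not involve WDLJSONEncoder.

```python
import json


def to_compact_json(value, indent=None, separators=(",", ":")):
    """Serialize a value using the same defaults as the signature above."""
    return json.dumps(value, indent=indent, separators=separators)


print(to_compact_json({"a": [1, 2], "b": True}))
```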

toil.wdl.wdl_functions.write_map(in_map, temp_dir=None, file_store=None)[source]

Given something that’s compatible with Map[String, String], this writes a TSV file of the data structure.

WDL syntax: File write_map(Map[String, String])

Parameters
Return type

str

toil.wdl.wdl_functions.wdl_range(num)[source]

Given an integer argument, the range function creates an array of integers of length equal to the given argument.

WDL syntax: Array[Int] range(Int)

Parameters

num (int) –

Return type

List[int]
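
In the WDL specification, range counts up from zero, which matches Python's built-in `range`. A sketch under that assumption, with the hypothetical name `range_sketch`:

```python
def range_sketch(num: int) -> list:
    """Return the integers 0, 1, ..., num - 1 as a list."""
    return list(range(num))


print(range_sketch(3))
```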

toil.wdl.wdl_functions.transpose(in_array)[source]

Given a two dimensional array argument, the transpose function transposes the two dimensional array according to the standard matrix transpose rules.

WDL syntax: Array[Array[X]] transpose(Array[Array[X]])

Parameters

in_array (List[List[Any]]) –

Return type

List[List[Any]]
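
The standard matrix transpose can be sketched with `zip`. This assumes a rectangular input, in which every row has the same length. `transpose_sketch` is a hypothetical name.

```python
def transpose_sketch(in_array):
    """Turn rows into columns; row i, column j becomes row j, column i."""
    return [list(column) for column in zip(*in_array)]


print(transpose_sketch([[0, 1, 2], [3, 4, 5]]))
```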

toil.wdl.wdl_functions.length(in_array)[source]

Given an Array, the length function returns the number of elements in the Array as an Int.

Parameters

in_array (List[Any]) –

Return type

int

toil.wdl.wdl_functions.wdl_zip(left, right)[source]

Return the dot product of the two arrays, pairing the elements that share the same index. It is an error if the arrays have different lengths.

WDL syntax: Array[Pair[X,Y]] zip(Array[X], Array[Y])

Parameters
  • left (List[Any]) –

  • right (List[Any]) –

Return type

List[toil.wdl.wdl_types.WDLPair]
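
A sketch of the pairing and the length check, using plain tuples as a stand-in for WDLPair. `zip_sketch` is a hypothetical name, and `ValueError` is an assumed exception type.

```python
def zip_sketch(left, right):
    """Pair elements by index, refusing arrays of different lengths."""
    if len(left) != len(right):
        raise ValueError("zip requires arrays of equal length")
    return list(zip(left, right))


print(zip_sketch([1, 2, 3], ["a", "b", "c"]))
```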

toil.wdl.wdl_functions.cross(left, right)[source]

Return the cross product of the two arrays. Array[Y][1] appears before Array[X][1] in the output.

WDL syntax: Array[Pair[X,Y]] cross(Array[X], Array[Y])

Parameters
  • left (List[Any]) –

  • right (List[Any]) –

Return type

List[toil.wdl.wdl_types.WDLPair]
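
The ordering statement above suggests that the right array varies fastest. Under that assumption, the result matches `itertools.product`, sketched here with tuples standing in for WDLPair and the hypothetical name `cross_sketch`.

```python
from itertools import product


def cross_sketch(left, right):
    """Pair every element of left with every element of right."""
    return list(product(left, right))


print(cross_sketch([1, 2], ["a", "b"]))
```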

toil.wdl.wdl_functions.as_pairs(in_map)[source]

Given a Map, the as_pairs function returns an Array containing each element in the form of a Pair. The key will be the left element of the Pair and the value the right element. The order of the Pairs in the resulting Array is the same as the order of the key/value pairs in the Map.

WDL syntax: Array[Pair[X,Y]] as_pairs(Map[X,Y])

Parameters

in_map (dict) –

Return type

List[toil.wdl.wdl_types.WDLPair]

toil.wdl.wdl_functions.as_map(in_array)[source]

Given an Array consisting of Pairs, the as_map function returns a Map in which the left elements of the Pairs are the keys and the right elements the values.

WDL syntax: Map[X,Y] as_map(Array[Pair[X,Y]])

Parameters

in_array (List[toil.wdl.wdl_types.WDLPair]) –

Return type

dict
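
as_pairs and as_map are inverses of each other, which the following sketch illustrates with tuples standing in for WDLPair. The `_sketch` names are hypothetical. How duplicate keys are handled is not stated above; in this sketch the last Pair silently wins.

```python
def as_pairs_sketch(in_map):
    """Return (key, value) tuples in the map's insertion order."""
    return list(in_map.items())


def as_map_sketch(in_array):
    """Build a dict whose keys are the left elements and values the right elements."""
    return dict(in_array)


pairs = as_pairs_sketch({"a": 1, "b": 2})
print(pairs)
print(as_map_sketch(pairs))
```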

toil.wdl.wdl_functions.keys(in_map)[source]

Given a Map, the keys function returns an Array consisting of the keys in the Map. The order of the keys in the resulting Array is the same as the order of the Pairs in the Map.

WDL syntax: Array[X] keys(Map[X,Y])

Parameters

in_map (dict) –

Return type

list

toil.wdl.wdl_functions.collect_by_key(in_array)[source]

Given an Array consisting of Pairs, the collect_by_key function returns a Map in which the left elements of the Pairs are the keys and each value is an Array of the right elements that share that key.

WDL syntax: Map[X,Array[Y]] collect_by_key(Array[Pair[X,Y]])

Parameters

in_array (List[toil.wdl.wdl_types.WDLPair]) –

Return type

dict
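
The return type Map[X,Array[Y]] implies that right elements sharing a key are grouped into one array. A sketch of that grouping, with tuples standing in for WDLPair and the hypothetical name `collect_by_key_sketch`:

```python
def collect_by_key_sketch(in_array):
    """Group the right elements of the pairs under their left element."""
    result = {}
    for left, right in in_array:
        result.setdefault(left, []).append(right)
    return result


print(collect_by_key_sketch([("a", 1), ("b", 2), ("a", 3)]))
```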

toil.wdl.wdl_functions.flatten(in_array)[source]

Given an array of arrays, the flatten function concatenates all the member arrays in order of appearance to give the result. It does not deduplicate the elements.

WDL syntax: Array[X] flatten(Array[Array[X]])

Parameters

in_array (List[list]) –

Return type

list
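
A minimal sketch of this one-level concatenation. `flatten_sketch` is a hypothetical name. Duplicates are kept, and only the outermost level of nesting is removed.

```python
def flatten_sketch(in_array):
    """Concatenate the member arrays in order, without deduplicating."""
    return [element for member in in_array for element in member]


print(flatten_sketch([[1, 2], [], [2, 3]]))
```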