toil.wdl.wdl_synthesis
Module Contents
Classes
SynthesizeWDL takes the "workflows_dictionary" and "tasks_dictionary" produced by wdl_analysis.py and uses them to write a native Python script for use with Toil.
Attributes
- toil.wdl.wdl_synthesis.logger
- class toil.wdl.wdl_synthesis.SynthesizeWDL(version, tasks_dictionary, workflows_dictionary, output_directory, json_dict, docker_user, jobstore=None, destBucket=None)
SynthesizeWDL takes the “workflows_dictionary” and “tasks_dictionary” produced by wdl_analysis.py and uses them to write a native Python script for use with Toil.
A WDL “workflow” section roughly corresponds to the Python “main()” function, where functions are wrapped as Toil “jobs”, their output dependencies are specified, and they are called.
A WDL “task” section corresponds to a unique Python function, which will be wrapped as a Toil “job” and defined outside of the “main()” function that calls it.
Generally this handles breaking sections into their corresponding Toil counterparts.
For example: write the imports, then write all functions defining jobs (which have subsections like: write header, define variables, read “File” types into the jobstore, docker call, etc.), then write the main and all of its subsections.
- Parameters
- write_main()
Writes out a large string representing the main section of the compiled Toil Python script.
Currently looks at and writes 5 sections:
1. JSON variables (includes importing and preparing files as tuples).
2. TSV variables (includes importing and preparing files as tuples).
3. CSV variables (includes importing and preparing files as tuples).
4. Wrapping each WDL “task” function as a Toil job.
5. Listing out children and encapsulated jobs by priority, then starting job0.
The result should create the variable declarations necessary for function calls, map file paths appropriately and store them in the Toil fileStore so that they persist from job to job, create job wrappers for Toil, and finally write out and run the jobs in order of priority using the addChild and encapsulate commands provided by Toil.
- Returns
A string containing the main() definition for the Toil script.
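The wrapping and ordering steps can be sketched as a small string generator. This is a minimal illustration, not the module's implementation: `write_main_sketch` and its `(priority, function_name)` input are hypothetical, variable handling is omitted, and the jobs are chained linearly for simplicity. The emitted text relies on Toil's `Job.Runner.getDefaultOptions`, `Job.wrapJobFn`, `addChild`, and `Toil.start`; it is only syntax-checked here, not run.

```python
def write_main_sketch(jobs):
    """Return the text of a main() that wraps and chains jobs.

    `jobs` is a hypothetical list of (priority, function_name) pairs;
    the real class reads this information from its dictionaries.
    """
    ordered = sorted(jobs)  # lowest priority number comes first
    lines = [
        "def main():",
        "    options = Job.Runner.getDefaultOptions('./toilWorkflowRun')",
        "    with Toil(options) as toil:",
    ]
    # Wrap each task function as a Toil job.
    for index, (_, name) in enumerate(ordered):
        lines.append(f"        job{index} = Job.wrapJobFn({name})")
    # Chain the jobs so that each one runs after its predecessor.
    for index in range(1, len(ordered)):
        lines.append(f"        job{index - 1}.addChild(job{index})")
    lines.append("        toil.start(job0)")
    return "\n".join(lines) + "\n"


main_text = write_main_sketch([(1, "sort_reads"), (0, "align_reads")])
compile(main_text, "<generated>", "exec")  # syntax check only
print(main_text)
```

Generating the text and compiling it separately keeps the sketch runnable without Toil installed.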
- write_main_jobwrappers()
Writes out ‘jobs’ as wrapped Toil objects in preparation for calling.
- Returns
A string containing the job wrapper statements.
- write_main_destbucket()
Writes out a loop for exporting outputs to a cloud bucket.
- Returns
A string containing the export loop.
- write_functions()
Writes out a Python function for each WDL “task” object.
- Returns
A string containing the bodies of the job function definitions.
- write_scatterfunction(job, scattername)
Writes out a Python function for each WDL “scatter” object.
- write_function(job)
Writes out a Python function for each WDL “task” object.
Each Python function is a unit of work written out as a string in preparation for being written out to a file. In WDL, each “job” is called a “task”. Each WDL task is written out in multiple steps:
1: Header and inputs (e.g. ‘def mapping(self, input1, input2)’).
2: Log the job name (e.g. ‘job.fileStore.logToMaster('initialize_jobs')’).
3: Create a temp dir (e.g. ‘tempDir = fileStore.getLocalTempDir()’).
4: Import filenames and use readGlobalFile() to get files from the jobStore.
5: Reformat commandline variables (like converting to ' '.join(files)).
6: Commandline call using subprocess.Popen().
7: Write the section returning the outputs. Also logs stats.
- Returns
A string containing the body of the job definition for the Toil script.
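The steps above can be sketched as a generator that appends one piece of source text per step. This is a simplified, hypothetical illustration: `write_function_sketch` is not part of the module, it assumes every input is a file ID, and it names the first parameter `job` so that the `job.fileStore` calls in the body resolve. The generated text is syntax-checked but not executed.

```python
def write_function_sketch(name, inputs, command):
    """Build the source text of one job function, step by step."""
    params = ", ".join(["job"] + inputs)
    lines = [f"def {name}({params}):"]                             # step 1
    lines.append(f"    job.fileStore.logToMaster({name!r})")       # step 2
    lines.append("    tempDir = job.fileStore.getLocalTempDir()")  # step 3
    for var in inputs:                                             # step 4
        lines.append(f"    {var}_path = job.fileStore.readGlobalFile({var})")
    paths = ", ".join(f"{var}_path" for var in inputs)
    lines.append(f"    cmd = ' '.join([{command!r}, {paths}])")    # step 5
    lines.append(                                                  # step 6
        "    proc = subprocess.Popen(cmd, shell=True, cwd=tempDir, "
        "stdout=subprocess.PIPE)"
    )
    lines.append("    stdout, _ = proc.communicate()")
    lines.append("    return {'stdout': stdout}")                  # step 7
    return "\n".join(lines) + "\n"


function_text = write_function_sketch("mapping", ["input1", "input2"], "cat")
compile(function_text, "<generated>", "exec")  # syntax check only
print(function_text)
```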
- write_function_header(job)
Writes the header that starts each function. For example, this function can write and return:
‘def write_function_header(self, job, job_declaration_array):’
- Parameters
job – A list such that: (job priority #, job ID #, Job Skeleton Name, Job Alias)
job_declaration_array – A list of all inputs that job requires.
- Returns
A string containing the function header.
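A minimal sketch of building such a header from the documented `job` tuple and input list follows. The helper name and the choice of the job alias (the fourth tuple element) as the function name are assumptions made for illustration.

```python
def write_function_header_sketch(job, job_declaration_array):
    """Return a 'def ...:' line for one job.

    `job` follows the documented layout:
    (job priority #, job ID #, Job Skeleton Name, Job Alias).
    """
    job_alias = job[3]  # assumption: the alias names the function
    params = ["self"] + list(job_declaration_array)
    return f"def {job_alias}({', '.join(params)}):\n"


header = write_function_header_sketch(
    (0, 0, "mapping", "mapping"), ["input1", "input2"]
)
```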
- needs_file_import(var_type)
Check if the given type contains a File type. A return value of True means that the value with this type has files to import.
- Parameters
var_type (toil.wdl.wdl_types.WDLType) –
- Return type
bool
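Checking whether a type "contains" a File is naturally recursive for compound types. The sketch below uses small stand-in classes (`ToyFile`, `ToyArray`, `ToyPair`, all hypothetical) rather than the real `toil.wdl.wdl_types` classes, whose attributes are not shown here.

```python
from dataclasses import dataclass


class ToyType:
    """Stand-in for a WDL type; not the real WDLType."""


class ToyString(ToyType):
    pass


class ToyFile(ToyType):
    pass


@dataclass
class ToyArray(ToyType):
    element: ToyType


@dataclass
class ToyPair(ToyType):
    left: ToyType
    right: ToyType


def needs_file_import_sketch(var_type):
    """Return True if var_type is, or contains, a file type."""
    if isinstance(var_type, ToyFile):
        return True
    if isinstance(var_type, ToyArray):
        return needs_file_import_sketch(var_type.element)
    if isinstance(var_type, ToyPair):
        return (needs_file_import_sketch(var_type.left)
                or needs_file_import_sketch(var_type.right))
    return False  # primitive types carry no files
```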
- write_declaration_type(var_type)
Return a string that preserves the construction of the given WDL type so it can be passed into the compiled script.
- Parameters
var_type (toil.wdl.wdl_types.WDLType) –
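One way to preserve a type's construction is to emit a constructor expression that rebuilds it when evaluated. The sketch below illustrates the idea with hypothetical stand-in classes; the real method's output format may differ.

```python
class ToyString:
    pass


class ToyFile:
    pass


class ToyArray:
    def __init__(self, element):
        self.element = element


def write_declaration_type_sketch(var_type):
    """Return source text that reconstructs var_type when evaluated."""
    name = type(var_type).__name__
    if isinstance(var_type, ToyArray):
        inner = write_declaration_type_sketch(var_type.element)
        return f"{name}({inner})"
    return f"{name}()"


type_text = write_declaration_type_sketch(ToyArray(ToyFile()))
rebuilt = eval(type_text)  # round trip: the text rebuilds the same structure
```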
- write_function_bashscriptline(job)
Writes a function to create a bash script for injection into the Docker container.
- Parameters
job_task_reference – The job referenced in WDL’s Task section.
job_alias – The actual job name to be written.
- Returns
A string containing this function.
- write_function_dockercall(job)
Writes a string containing the apiDockerCall() that will run the job.
- Parameters
job_task_reference – The name of the job calling docker.
docker_image – The corresponding name of the docker image. e.g. “ubuntu:latest”
- Returns
A string containing the apiDockerCall() that will run the job.
- write_function_cmdline(job)
Write a series of commandline variables to be concatenated together eventually and either called with subprocess.Popen() or with apiDockerCall() if a docker image is called for.
- Parameters
job – A list such that: (job priority #, job ID #, Job Skeleton Name, Job Alias)
- Returns
A string representing this.
- write_function_subprocesspopen()
Write a subprocess.Popen() call for this function and write it out as a string.
- Parameters
job – A list such that: (job priority #, job ID #, Job Skeleton Name, Job Alias)
- Returns
A string representing this.
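For orientation, the kind of subprocess.Popen() call that such generated text might contain looks roughly like the following at runtime. The command here is a hypothetical placeholder; a compiled script would build it from the WDL task's command section.

```python
import subprocess
import sys

# Hypothetical command standing in for a WDL task's command line.
cmd = [sys.executable, "-c", "print('hello from the task')"]

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()  # wait for exit and collect output

if proc.returncode != 0:
    raise RuntimeError(f"command failed: {stderr.decode()}")

result = stdout.decode().strip()
```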
- write_function_outputreturn(job, docker=False)
Find the output values that this function needs and write them out as a string.
- Parameters
job – A list such that: (job priority #, job ID #, Job Skeleton Name, Job Alias)
job_task_reference – The name of the job to look up values for.
- Returns
A string representing this.
- write_python_file(module_section, fn_section, main_section, output_file)
Writes the three given strings to output_file.
- Parameters
module_section – A string of ‘import modules’.
fn_section – A string of python ‘def functions()’.
main_section – A string declaring toil options and main’s header.
job_section – A string importing files into Toil and declaring jobs.
output_file – The file to write the compiled toil script to.
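A minimal sketch of this final step, assuming the sections are simply written in order (imports, functions, main); the helper name and the sample section contents are hypothetical.

```python
import os
import tempfile


def write_python_file_sketch(module_section, fn_section, main_section, output_file):
    """Write the three script sections to output_file, in order."""
    with open(output_file, "w") as handle:
        handle.write(module_section)
        handle.write(fn_section)
        handle.write(main_section)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "compiled.py")
    write_python_file_sketch(
        "import os\n",
        "def f():\n    return 1\n",
        "print(f())\n",
        path,
    )
    with open(path) as handle:
        written = handle.read()
```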