toil.resource

Module Contents

Classes

Resource

Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.

FileResource

A resource read from a file on the leader.

DirectoryResource

A resource read from a directory on the leader.

VirtualEnvResource

A resource read from a virtualenv on the leader.

ModuleDescriptor

A path to a Python module decomposed into a namedtuple of three elements

Attributes

logger

toil.resource.logger
class toil.resource.Resource

Bases: namedtuple('Resource', ('name', 'pathHash', 'url', 'contentHash'))

Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.

Each instance is a namedtuple with the following elements:

The pathHash element contains the MD5 (in hexdigest form) of the path to the resource on the leader node. The path, and therefore its hash, is unique within a job store.

The url element is a “file:” or “http:” URL at which the resource can be obtained.

The contentHash element is an MD5 checksum of the resource, allowing for validation and caching of resources.

If the resource is a regular file, the type attribute will be ‘file’.

If the resource is a directory, the type attribute will be ‘dir’ and the URL will point at a ZIP archive of that directory.
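
The two hash elements described above can be illustrated with a minimal sketch. It uses a stand-in namedtuple rather than the real toil.resource.Resource, and the leader path, file content, name, and URL are all hypothetical values chosen for illustration.

```python
import hashlib
from collections import namedtuple

# Stand-in for illustration only; the real class is toil.resource.Resource.
ExampleResource = namedtuple(
    "ExampleResource", ("name", "pathHash", "url", "contentHash")
)


def md5_hexdigest(data: bytes) -> str:
    """Return the MD5 of the given bytes in hexdigest form."""
    return hashlib.md5(data).hexdigest()


leader_path = "/home/user/workflow/helper.py"  # hypothetical path on the leader
content = b"print('hello')\n"                  # hypothetical file content

resource = ExampleResource(
    name="helper.py",                          # hypothetical name
    pathHash=md5_hexdigest(leader_path.encode("utf-8")),
    url="file:/tmp/jobstore/helper.py",        # hypothetical URL
    contentHash=md5_hexdigest(content),
)
```

Because pathHash depends only on the path and contentHash only on the content, a changed file at the same path keeps its pathHash but gets a new contentHash, which is what makes validation and caching possible.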

abstract property localPath: str

Get the path to the resource on the worker.

The file or directory at the returned path may or may not yet exist. Invoking download() will ensure that it does.

Return type:

str

property localDirPath: str

The path to the directory containing the resource on the worker.

Return type:

str

resourceEnvNamePrefix = 'JTRES_'
rootDirPathEnvName
classmethod create(jobStore, leaderPath)

Saves the content of the file or directory at the given path to the given job store and returns a resource object representing that content for the purpose of obtaining it again at a generic, public URL. This method should be invoked on the leader node.

Parameters:

jobStore (toil.jobStores.abstractJobStore.AbstractJobStore)

leaderPath (str)

Return type:

Resource

refresh(jobStore)
Parameters:

jobStore (toil.jobStores.abstractJobStore.AbstractJobStore)

Return type:

Resource

classmethod prepareSystem()

Prepares this system for the downloading and lookup of resources. This method should only be invoked on a worker node. It is idempotent but not thread-safe.

Return type:

None

classmethod cleanSystem()

Remove all downloaded, localized resources.

Return type:

None
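
What "idempotent but not thread-safe" can mean for prepareSystem(), together with the cleanup done by cleanSystem(), is sketched below. This is not Toil's implementation: the environment variable name and the helper functions are hypothetical, and the sketch assumes that the prepared state is a root directory recorded in the environment.

```python
import os
import shutil
import tempfile

ROOT_ENV_NAME = "EXAMPLE_RES_ROOT"  # hypothetical name, not Toil's actual variable


def prepare_system() -> str:
    """Create the root directory once; later calls reuse it (idempotent)."""
    # The check-then-set below is not atomic, so two threads calling this
    # at the same time could each create a directory (not thread-safe).
    if ROOT_ENV_NAME not in os.environ:
        os.environ[ROOT_ENV_NAME] = tempfile.mkdtemp()
    return os.environ[ROOT_ENV_NAME]


def clean_system() -> None:
    """Remove the root directory and forget where it was."""
    root = os.environ.pop(ROOT_ENV_NAME, None)
    if root is not None:
        shutil.rmtree(root, ignore_errors=True)


first = prepare_system()
second = prepare_system()  # reuses the directory created by the first call
clean_system()
```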

register()

Register this resource for later retrieval via lookup(), possibly in a child process.

Return type:

None

classmethod lookup(leaderPath)

Return a resource object representing a resource created from a file or directory at the given path on the leader.

This method should be invoked on the worker. The given path does not need to refer to an existing file or directory on the worker; it only identifies the resource within an instance of Toil. This method returns None if no resource for the given path exists.

Parameters:

leaderPath (str)

Return type:

Optional[Resource]
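
One way register() and lookup() could cooperate across a child process is through environment variables, which child processes inherit. The sketch below assumes that mechanism; it reuses the documented 'JTRES_' prefix but is otherwise hypothetical, including the stand-in namedtuple, the helper names, and the use of JSON in place of pickle()/unpickle().

```python
import hashlib
import json
import os
from collections import namedtuple
from typing import Optional

ENV_PREFIX = "JTRES_"  # value of resourceEnvNamePrefix documented above

# Stand-in for illustration only; the real class is toil.resource.Resource.
ExampleResource = namedtuple(
    "ExampleResource", ("name", "pathHash", "url", "contentHash")
)


def path_hash(leader_path: str) -> str:
    """MD5 hexdigest of the leader path, used as the lookup key."""
    return hashlib.md5(leader_path.encode("utf-8")).hexdigest()


def register(resource: ExampleResource) -> None:
    # Assumption: the serialized resource is stored under prefix + pathHash.
    # JSON stands in for the class's own pickle()/unpickle() methods.
    os.environ[ENV_PREFIX + resource.pathHash] = json.dumps(resource._asdict())


def lookup(leader_path: str) -> Optional[ExampleResource]:
    serialized = os.environ.get(ENV_PREFIX + path_hash(leader_path))
    if serialized is None:
        return None  # no resource was registered for this path
    return ExampleResource(**json.loads(serialized))


res = ExampleResource(
    "helper.py", path_hash("/leader/helper.py"), "file:/tmp/helper.py", "0" * 32
)
register(res)
found = lookup("/leader/helper.py")
missing = lookup("/leader/unknown.py")
```

Note that the path passed to lookup() is never opened on the worker; it only serves as a key.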

download(callback=None)

Download this resource from its URL to a file on the local system.

This method should only be invoked on a worker node after the node has been set up for accessing resources via prepareSystem().

Parameters:

callback (Optional[Callable[[str], None]])

Return type:

None
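
The role of contentHash during a download can be sketched as follows. This is a simplified illustration rather than Toil's implementation: download_verified and its parameters are hypothetical, and passing the local path to the callback is an assumption based only on the documented Callable[[str], None] type.

```python
import hashlib
import os
import tempfile
import urllib.request
from pathlib import Path
from typing import Callable, Optional


def download_verified(
    url: str,
    content_hash: str,
    dest: str,
    callback: Optional[Callable[[str], None]] = None,
) -> None:
    """Fetch url, check its MD5 against content_hash, then write it to dest."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    if hashlib.md5(data).hexdigest() != content_hash:
        raise ValueError("content hash mismatch for " + url)
    with open(dest, "wb") as f:
        f.write(data)
    if callback is not None:
        callback(dest)  # assumption: the callback receives the local path


# Demonstration with a "file:" URL pointing at a temporary file.
work_dir = tempfile.mkdtemp()
source = Path(work_dir) / "source.txt"
source.write_bytes(b"payload")
dest = os.path.join(work_dir, "copy.txt")
seen = []
download_verified(
    source.as_uri(), hashlib.md5(b"payload").hexdigest(), dest, seen.append
)
```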

pickle()
Return type:

str

classmethod unpickle(s)
Parameters:

s (str)

Return type:

Resource

class toil.resource.FileResource

Bases: Resource

A resource read from a file on the leader.

property localPath: str

Get the path to the resource on the worker.

The file or directory at the returned path may or may not yet exist. Invoking download() will ensure that it does.

Return type:

str

class toil.resource.DirectoryResource

Bases: Resource

A resource read from a directory on the leader.

The URL will point to a ZIP archive of the directory. All files in that directory (and any subdirectories) will be included. The directory may be a package but it does not need to be.
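
Archiving a directory together with all of its subdirectories, as described above, might look roughly like the following. The helper zip_directory and the file names are hypothetical and only illustrate the idea with the standard zipfile module; Toil's own archiving code may differ.

```python
import os
import tempfile
import zipfile


def zip_directory(dir_path: str, zip_path: str) -> None:
    """Write every file under dir_path, recursively, into a ZIP archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(dir_path):
            for file_name in files:
                full_path = os.path.join(root, file_name)
                # Store paths relative to the directory being archived.
                archive.write(full_path, os.path.relpath(full_path, dir_path))


# Build a small directory tree; it does not have to be a Python package.
src_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(src_dir, "sub"))
with open(os.path.join(src_dir, "a.py"), "w") as f:
    f.write("x = 1\n")
with open(os.path.join(src_dir, "sub", "b.txt"), "w") as f:
    f.write("data\n")

# Keep the archive outside the source tree so it does not include itself.
zip_path = os.path.join(tempfile.mkdtemp(), "resource.zip")
zip_directory(src_dir, zip_path)
with zipfile.ZipFile(zip_path) as archive:
    names = sorted(archive.namelist())
```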

property localPath: str

Get the path to the resource on the worker.

The file or directory at the returned path may or may not yet exist. Invoking download() will ensure that it does.

Return type:

str

class toil.resource.VirtualEnvResource

Bases: DirectoryResource

A resource read from a virtualenv on the leader.

All modules and packages found in the virtualenv’s site-packages directory will be included.

class toil.resource.ModuleDescriptor

Bases: namedtuple('ModuleDescriptor', ('dirPath', 'name', 'fromVirtualEnv'))

A path to a Python module decomposed into a namedtuple of three elements:

  • dirPath, the path to the directory that should be added to sys.path before importing the module,

  • name, the fully qualified name of the module with leading package names separated by dots, and

  • fromVirtualEnv, a flag indicating whether the module is located in a virtualenv.

>>> import toil.resource
>>> from toil.resource import ModuleDescriptor
>>> ModuleDescriptor.forModule('toil.resource') # doctest: +ELLIPSIS
ModuleDescriptor(dirPath='/.../src', name='toil.resource', fromVirtualEnv=False)
>>> import subprocess, sys, tempfile, os
>>> dirPath = tempfile.mkdtemp()
>>> path = os.path.join( dirPath, 'foo.py' )
>>> with open(path,'w') as f:
...     _ = f.write('from toil.resource import ModuleDescriptor\n'
...                 'print(ModuleDescriptor.forModule(__name__))')
>>> subprocess.check_output([ sys.executable, path ]) # doctest: +ELLIPSIS
b"ModuleDescriptor(dirPath='...', name='foo', fromVirtualEnv=False)\n"
>>> from shutil import rmtree
>>> rmtree( dirPath )

Now test a collision. ‘collections’ is part of the standard library in Python 2 and 3.

>>> dirPath = tempfile.mkdtemp()
>>> path = os.path.join( dirPath, 'collections.py' )
>>> with open(path,'w') as f:
...     _ = f.write('from toil.resource import ModuleDescriptor\n'
...                 'ModuleDescriptor.forModule(__name__)')

This should fail and return exit status 1 due to the collision with the built-in module:

>>> subprocess.call([ sys.executable, path ])
1

Clean up:

>>> rmtree( dirPath )

property belongsToToil: bool

True if this module is part of the Toil distribution

Return type:

bool

dirPath: str
name: str
classmethod forModule(name)

Return an instance of this class representing the module of the given name.

If the given module name is “__main__”, it will be translated to the actual file name of the top-level script without the .py or .pyc extension. This method assumes that the module with the specified name has already been loaded.

Parameters:

name (str)

Return type:

ModuleDescriptor
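
The decomposition that forModule() performs can be pictured with a small stand-alone sketch. The function decompose below is hypothetical and works purely on strings; it assumes POSIX-style paths and does not inspect loaded modules the way the real method does.

```python
import os


def decompose(file_path: str, module_name: str):
    """Split a module's file path into (dirPath, name)."""
    base, _ext = os.path.splitext(file_path)  # drop .py or .pyc
    if module_name == "__main__":
        # Translate "__main__" to the script's file name without extension.
        module_name = os.path.basename(base)
    if os.path.basename(base) == "__init__":
        # A package is represented by its directory, not by __init__.py.
        base = os.path.dirname(base)
    dir_path = base
    # Walk up one directory per component of the qualified name.
    for _ in module_name.split("."):
        dir_path = os.path.dirname(dir_path)
    return dir_path, module_name


plain = decompose("/opt/src/toil/resource.py", "toil.resource")
script = decompose("/work/script.py", "__main__")
package = decompose("/opt/src/toil/__init__.py", "toil")
```

In each case the first element is the directory that would need to be on sys.path for an import of the second element to succeed.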

saveAsResourceTo(jobStore)

Store the file containing this module, or even the Python package directory hierarchy containing that file, as a resource to the given job store and return the corresponding resource object. Should only be called on a leader node.

Parameters:

jobStore (toil.jobStores.abstractJobStore.AbstractJobStore)

Return type:

Resource

localize()

Check if this module was saved as a resource.

If it was, return a new module descriptor that points to a local copy of that resource. Should only be called on a worker node. On the leader, this method returns this module descriptor, i.e. self.

Return type:

ModuleDescriptor

globalize()

Reverse the effect of localize().

Return type:

ModuleDescriptor

toCommand()
Return type:

Sequence[str]

classmethod fromCommand(command)
Parameters:

command (Sequence[str])

Return type:

ModuleDescriptor

makeLoadable()
Return type:

ModuleDescriptor

load()
Return type:

Optional[types.ModuleType]
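
Since dirPath is described as the directory to add to sys.path before importing the module, making a module loadable and then loading it can be sketched with the standard importlib machinery. The helper load_module and the module name used here are hypothetical; this is an illustration of the idea, not the code behind makeLoadable() or load().

```python
import importlib
import os
import sys
import tempfile


def load_module(dir_path: str, name: str):
    """Put dir_path on sys.path if needed, then import the named module."""
    if dir_path not in sys.path:
        sys.path.append(dir_path)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(name)


# Create a throwaway module in a temporary directory and load it.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "example_mod_for_sketch.py"), "w") as f:
    f.write("ANSWER = 42\n")

module = load_module(demo_dir, "example_mod_for_sketch")
```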

exception toil.resource.ResourceException

Bases: Exception

Common base class for all non-exit exceptions.