toil.resource
Attributes
logger
Exceptions
Common base class for all non-exit exceptions.
Classes
Resource | Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.
FileResource | A resource read from a file on the leader.
DirectoryResource | A resource read from a directory on the leader.
VirtualEnvResource | A resource read from a virtualenv on the leader.
ModuleDescriptor | A path to a Python module decomposed into a namedtuple of three elements.
Module Contents
- toil.resource.logger
- class toil.resource.Resource
Bases: namedtuple('Resource', ('name', 'pathHash', 'url', 'contentHash'))
Represents a file or directory that will be deployed to each node before any jobs in the user script are invoked.
Each instance is a namedtuple with the following elements:
The pathHash element contains the MD5 (in hexdigest form) of the path to the resource on the leader node. The path, and therefore its hash, is unique within a job store.
The url element is a “file:” or “http:” URL at which the resource can be obtained.
The contentHash element is an MD5 checksum of the resource, allowing for validation and caching of resources.
If the resource is a regular file, the type attribute will be ‘file’.
If the resource is a directory, the type attribute will be ‘dir’ and the URL will point at a ZIP archive of that directory.
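The field layout and the two MD5 values described above can be illustrated with a small stand-in. This is only a sketch: `ResourceSketch`, `md5_hex`, `content_matches`, the example path, the example URL, and the choice of UTF-8 for encoding the path are all assumptions made for illustration and are not part of toil.resource.

```python
import hashlib
from collections import namedtuple

# Stand-in with the documented field names; this is NOT toil.resource.Resource.
ResourceSketch = namedtuple("ResourceSketch", ("name", "pathHash", "url", "contentHash"))


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical values chosen for illustration only.
leader_path = "/home/user/workflow/helpers.py"
content = b"def helper():\n    return 42\n"

resource = ResourceSketch(
    name="helpers.py",
    pathHash=md5_hex(leader_path.encode("utf-8")),  # assumption: path hashed as UTF-8
    url="file:///tmp/jobstore/helpers.py",  # hypothetical URL
    contentHash=md5_hex(content),
)


def content_matches(res: ResourceSketch, downloaded: bytes) -> bool:
    """Validate downloaded bytes against the recorded content checksum."""
    return md5_hex(downloaded) == res.contentHash
```

A worker could use a check like `content_matches` to decide whether a cached copy is still valid or must be fetched again.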
- resourceEnvNamePrefix = 'JTRES_'
- rootDirPathEnvName
- classmethod create(jobStore, leaderPath)
Saves the content of the file or directory at the given path to the given job store and returns a resource object representing that content for the purpose of obtaining it again at a generic, public URL. This method should be invoked on the leader node.
- Parameters:
leaderPath (str)
- Return type:
- classmethod prepareSystem()
Prepares this system for the downloading and lookup of resources. This method should only be invoked on a worker node. It is idempotent but not thread-safe.
- Return type:
None
- register()
Register this resource for later retrieval via lookup(), possibly in a child process.
- Return type:
None
- classmethod lookup(leaderPath)
Return a resource object representing a resource created from a file or directory at the given path on the leader.
This method should be invoked on the worker. The given path does not need to refer to an existing file or directory on the worker; it only identifies the resource within an instance of Toil. This method returns None if no resource for the given path exists.
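The text does not say how a registered resource is stored between register() and lookup(). One plausible mechanism, suggested by the resourceEnvNamePrefix constant and the remark about child processes, is an environment variable keyed by the path hash. The sketch below assumes exactly that; `register_sketch`, `lookup_sketch`, `path_hash`, the JSON encoding, and the example paths are hypothetical and may differ from what Toil actually does.

```python
import hashlib
import json
import os
from typing import Optional

ENV_PREFIX = "JTRES_"  # the documented value of resourceEnvNamePrefix


def path_hash(leader_path: str) -> str:
    """MD5 hexdigest of the leader path (assumption: encoded as UTF-8)."""
    return hashlib.md5(leader_path.encode("utf-8")).hexdigest()


def register_sketch(leader_path: str, url: str) -> None:
    """Record a resource in the environment, which child processes inherit."""
    # Assumption: the entry is keyed by prefix + path hash and stored as JSON.
    entry = {"pathHash": path_hash(leader_path), "url": url}
    os.environ[ENV_PREFIX + path_hash(leader_path)] = json.dumps(entry)


def lookup_sketch(leader_path: str) -> Optional[dict]:
    """Return the registered entry for leader_path, or None if there is none."""
    raw = os.environ.get(ENV_PREFIX + path_hash(leader_path))
    return None if raw is None else json.loads(raw)


# The path only identifies the resource; it need not exist on this machine.
register_sketch("/leader/only/script.py", "file:///tmp/jobstore/script.py")
found = lookup_sketch("/leader/only/script.py")
missing = lookup_sketch("/leader/never/registered.py")
```

Keying on the hash of the leader path is what lets a worker look up a resource by a path that exists only on the leader.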
- download(callback=None)
Download this resource from its URL to a file on the local system.
This method should only be invoked on a worker node after the node has been set up for accessing resources via prepareSystem().
- Parameters:
callback (Optional[Callable[[str], None]])
- Return type:
None
- property localPath: str
- Abstractmethod:
- Return type:
Get the path to the resource on the worker.
The file or directory at the returned path may or may not yet exist. Invoking download() will ensure that it does.
- class toil.resource.FileResource
Bases: Resource
A resource read from a file on the leader.
- class toil.resource.DirectoryResource
Bases: Resource
A resource read from a directory on the leader.
The URL will point to a ZIP archive of the directory. All files in that directory (and any subdirectories) will be included. The directory may be a package but it does not need to be.
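Archiving a directory together with all of its subdirectories can be sketched with the standard library alone. This is an illustration of the idea, not Toil's implementation; `zip_directory` and the file names used in the demonstration are hypothetical.

```python
import os
import tempfile
import zipfile


def zip_directory(dir_path: str, archive_path: str) -> None:
    """Write every file under dir_path, recursively, into a ZIP archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(dir_path):
            for file_name in files:
                full_path = os.path.join(root, file_name)
                # Store paths relative to the directory being archived.
                archive.write(full_path, os.path.relpath(full_path, dir_path))


with tempfile.TemporaryDirectory() as work_dir:
    # Build a tiny directory tree; it is laid out like a package here,
    # although the directory does not have to be one.
    src = os.path.join(work_dir, "pkg")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("__init__.py", os.path.join("sub", "mod.py")):
        with open(os.path.join(src, rel), "w") as f:
            f.write("# placeholder\n")

    archive_path = os.path.join(work_dir, "pkg.zip")
    zip_directory(src, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = sorted(archive.namelist())
```

The archive is written outside the source directory so that the walk does not pick up the archive itself.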
- class toil.resource.VirtualEnvResource
Bases: DirectoryResource
A resource read from a virtualenv on the leader.
All modules and packages found in the virtualenv’s site-packages directory will be included.
- class toil.resource.ModuleDescriptor
Bases: namedtuple('ModuleDescriptor', ('dirPath', 'name', 'fromVirtualEnv'))
A path to a Python module decomposed into a namedtuple of three elements:
- dirPath, the path to the directory that should be added to sys.path before importing the module,
- name, the fully qualified name of the module with leading package names separated by dots, and
- fromVirtualEnv (False in both examples below).
>>> import toil.resource
>>> from toil.resource import ModuleDescriptor
>>> ModuleDescriptor.forModule('toil.resource')
ModuleDescriptor(dirPath='/.../src', name='toil.resource', fromVirtualEnv=False)

>>> import subprocess, sys, tempfile, os
>>> dirPath = tempfile.mkdtemp()
>>> path = os.path.join(dirPath, 'foo.py')
>>> with open(path, 'w') as f:
...     _ = f.write('from toil.resource import ModuleDescriptor\n'
...                 'print(ModuleDescriptor.forModule(__name__))')
>>> subprocess.check_output([sys.executable, path])
b"ModuleDescriptor(dirPath='...', name='foo', fromVirtualEnv=False)\n"

>>> from shutil import rmtree
>>> rmtree(dirPath)

Now test a collision. 'collections' is part of the standard library in Python 2 and 3.
>>> dirPath = tempfile.mkdtemp()
>>> path = os.path.join(dirPath, 'collections.py')
>>> with open(path, 'w') as f:
...     _ = f.write('from toil.resource import ModuleDescriptor\n'
...                 'ModuleDescriptor.forModule(__name__)')

This should fail and return exit status 1 due to the collision with the built-in module:
>>> subprocess.call([sys.executable, path])
1

Clean up:
>>> rmtree(dirPath)
- classmethod forModule(name)
Return an instance of this class representing the module of the given name.
If the given module name is “__main__”, it will be translated to the actual file name of the top-level script without the .py or .pyc extension. This method assumes that the module with the specified name has already been loaded.
- Parameters:
name (str)
- Return type:
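The translation of "__main__" into a file-based module name described for forModule() can be sketched as follows. `module_name_from_script` is a hypothetical helper written for illustration; the actual logic inside forModule() may differ.

```python
import os


def module_name_from_script(script_path: str) -> str:
    """Derive a module name from a script path by dropping .py or .pyc."""
    base = os.path.basename(script_path)
    name, ext = os.path.splitext(base)
    # Only strip the extensions named in the description above.
    return name if ext in (".py", ".pyc") else base


name_from_py = module_name_from_script("/work/dir/foo.py")
name_from_pyc = module_name_from_script("/work/dir/foo.pyc")
name_without_ext = module_name_from_script("/work/dir/run")
```

In a running script, the path would typically come from `sys.modules["__main__"].__file__`, which is consistent with the requirement that the module has already been loaded.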
- saveAsResourceTo(jobStore)
Store the file containing this module (or even the Python package directory hierarchy containing that file) as a resource to the given job store and return the corresponding resource object. Should only be called on a leader node.
- Parameters:
- Return type:
- localize()
Check if this module was saved as a resource.
If it was, return a new module descriptor that points to a local copy of that resource. Should only be called on a worker node. On the leader, this method returns this resource, i.e. self.
- Return type:
- load()
- Return type:
Optional[types.ModuleType]