toil.lib.aws.s3¶
Attributes¶
logger
AWS_MAX_MULTIPART_COUNT
AWS_MAX_CHUNK_SIZE
AWS_MIN_CHUNK_SIZE
DEFAULT_AWS_CHUNK_SIZE
Exceptions¶
AWSKeyNotFoundError | Common base class for all non-exit exceptions.
AWSKeyAlreadyExistsError | Common base class for all non-exit exceptions.
AWSBadEncryptionKeyError | Common base class for all non-exit exceptions.
Classes¶
MultiPartPipe | An object-oriented wrapper for os.pipe. Clients should subclass it and implement readFrom().
Functions¶
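create_s3_bucket(s3_resource, bucket_name, region, tags=None, public=True) | Create an AWS S3 bucket, using the given Boto3 S3 session, with the given name, in the given region.
delete_s3_bucket(s3_resource, bucket_name, quiet=True) | Delete the bucket with 'bucket_name'.
bucket_exists(s3_resource, bucket)
head_s3_object(bucket, key, header, region=None) | Attempt to HEAD an S3 object and return its response.
list_multipart_uploads(bucket, region, prefix, max_uploads=1)
copy_s3_to_s3(s3_resource, src_bucket, src_key, dst_bucket, dst_key, extra_args=None)
copy_local_to_s3(s3_resource, local_file_path, dst_bucket, dst_key, extra_args=None)
copy_s3_to_local(s3_resource, local_file_path, src_bucket, src_key, extra_args=None)
list_s3_items(s3_resource, bucket, prefix, startafter=None)
upload_to_s3(readable, s3_resource, bucket, key, extra_args=None) | Upload a readable object to S3, using multipart uploading if applicable.
download_stream(s3_resource, bucket, key, checksum_to_verify=None, extra_args=None, encoding=None, errors=None) | Context manager that gives out a download stream to download data.
download_fileobject(s3_resource, bucket, key, fileobj, extra_args=None)
s3_key_exists(s3_resource, bucket, key, check=False, extra_args=None) | Return True if the S3 object exists, and False if not. Will error if encryption args are incorrect.
get_s3_object(s3_resource, bucket, key, extra_args=None)
put_s3_object(s3_resource, bucket, key, body, extra_args=None)
generate_presigned_url(s3_resource, bucket, key_name, expiration)
create_public_url(s3_resource, bucket, key)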
Module Contents¶
- toil.lib.aws.s3.logger¶
- toil.lib.aws.s3.AWS_MAX_MULTIPART_COUNT = 10000¶
- toil.lib.aws.s3.AWS_MAX_CHUNK_SIZE = 5000000000000¶
- toil.lib.aws.s3.AWS_MIN_CHUNK_SIZE = 5000000¶
- toil.lib.aws.s3.DEFAULT_AWS_CHUNK_SIZE = 134217728¶
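The four size constants above bound how an object can be split into multipart parts. The following is a minimal sketch of how a part size might be chosen so that an object fits within the part-count limit. The helpers choose_part_size and part_count are hypothetical and are not part of this module; only the constant values are taken from the listing above.

```python
# Values copied from the module attributes listed above.
AWS_MAX_MULTIPART_COUNT = 10000
AWS_MAX_CHUNK_SIZE = 5000000000000
AWS_MIN_CHUNK_SIZE = 5000000
DEFAULT_AWS_CHUNK_SIZE = 134217728  # 128 * 1024 ** 2


def choose_part_size(total_size: int, preferred: int = DEFAULT_AWS_CHUNK_SIZE) -> int:
    """Hypothetical helper: pick a part size that respects the limits above."""
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    # Smallest part size that fits the object into the maximum number of parts
    # (integer ceiling division, to avoid float rounding on large sizes).
    needed = -(-total_size // AWS_MAX_MULTIPART_COUNT)
    part_size = max(preferred, needed, AWS_MIN_CHUNK_SIZE)
    if part_size > AWS_MAX_CHUNK_SIZE:
        raise ValueError("object is too large for a multipart upload")
    return part_size


def part_count(total_size: int, part_size: int) -> int:
    """Hypothetical helper: number of parts needed, with at least one part."""
    return max(1, -(-total_size // part_size))
```

With the default chunk size, an object larger than 134217728 * 10000 bytes would need a larger part size to stay within AWS_MAX_MULTIPART_COUNT parts.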
- exception toil.lib.aws.s3.AWSKeyNotFoundError¶
Bases: Exception
Common base class for all non-exit exceptions.
- exception toil.lib.aws.s3.AWSKeyAlreadyExistsError¶
Bases: Exception
Common base class for all non-exit exceptions.
- exception toil.lib.aws.s3.AWSBadEncryptionKeyError¶
Bases: Exception
Common base class for all non-exit exceptions.
- toil.lib.aws.s3.create_s3_bucket(s3_resource, bucket_name, region, tags=None, public=True)¶
Create an AWS S3 bucket, using the given Boto3 S3 session, with the given name, in the given region.
Supports the us-east-1 region, where bucket creation is special.
ALL S3 bucket creation should use this function.
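The us-east-1 special case mentioned above comes from the S3 API itself: a CreateBucket request for us-east-1 must not carry a LocationConstraint, while other regions need one. The sketch below shows one way the keyword arguments for boto3's create_bucket call could be assembled. The helper create_bucket_kwargs is hypothetical, and this is not necessarily how create_s3_bucket is implemented.

```python
from typing import Any


def create_bucket_kwargs(bucket_name: str, region: str) -> dict[str, Any]:
    """Hypothetical helper: build keyword arguments for a boto3 create_bucket call."""
    kwargs: dict[str, Any] = {"Bucket": bucket_name}
    if region != "us-east-1":
        # Every region other than us-east-1 is named explicitly.
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs
```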
- toil.lib.aws.s3.delete_s3_bucket(s3_resource, bucket_name, quiet=True)¶
Delete the bucket with ‘bucket_name’.
- Note: ‘quiet’ is False when used for a clean up utility script (contrib/admin/cleanup_aws_resources.py)
that prints progress rather than logging. Logging should be used for all other internal Toil usage.
- toil.lib.aws.s3.bucket_exists(s3_resource, bucket)¶
- toil.lib.aws.s3.head_s3_object(bucket, key, header, region=None)¶
Attempt to HEAD an s3 object and return its response.
- Parameters:
bucket (str) – AWS bucket name
key (str) – AWS Key name for the s3 object
header (dict[str, Any]) – Headers to include (mostly for encryption). See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/head_object.html
region (str | None) – Region that we want to look for the bucket in
- Return type:
mypy_boto3_s3.type_defs.HeadObjectOutputTypeDef
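Since the header argument is described as mostly used for encryption, here is a sketch of what such a dictionary might look like for server-side encryption with a customer-provided key (SSE-C). SSECustomerAlgorithm and SSECustomerKey are parameters of boto3's head_object; the helper sse_c_header is hypothetical, and whether Toil builds its headers this way is an assumption.

```python
from typing import Any


def sse_c_header(key: bytes) -> dict[str, Any]:
    """Hypothetical helper: SSE-C arguments for an S3 HEAD request."""
    if len(key) != 32:
        raise ValueError("SSE-C with AES256 expects a 32-byte key")
    # boto3 documents that SSECustomerKeyMD5 is populated automatically
    # when it is not provided, so it is left out here.
    return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": key}
```

Going by the signature above, the result would presumably be passed as head_s3_object(bucket, key, header=sse_c_header(my_key)), where my_key is a placeholder.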
- toil.lib.aws.s3.list_multipart_uploads(bucket, region, prefix, max_uploads=1)¶
- toil.lib.aws.s3.copy_s3_to_s3(s3_resource, src_bucket, src_key, dst_bucket, dst_key, extra_args=None)¶
- toil.lib.aws.s3.copy_local_to_s3(s3_resource, local_file_path, dst_bucket, dst_key, extra_args=None)¶
- toil.lib.aws.s3.copy_s3_to_local(s3_resource, local_file_path, src_bucket, src_key, extra_args=None)¶
- class toil.lib.aws.s3.MultiPartPipe(part_size, s3_resource, bucket_name, file_id, encryption_args, encoding=None, errors=None)¶
Bases: toil.lib.pipes.WritablePipe
An object-oriented wrapper for os.pipe. Clients should subclass it, implement readFrom() to consume the readable end of the pipe, then instantiate the class as a context manager to get the writable end. See the example below.
>>> import sys, shutil, codecs
>>> from toil.lib.pipes import WritablePipe
>>> class MyPipe(WritablePipe):
...     def readFrom(self, readable):
...         shutil.copyfileobj(codecs.getreader('utf-8')(readable), sys.stdout)
>>> with MyPipe() as writable:
...     _ = writable.write('Hello, world!\n'.encode('utf-8'))
Hello, world!
Each instance of this class creates a thread and invokes the readFrom method in that thread. The thread will be join()ed upon normal exit from the context manager, i.e. the body of the with statement. If an exception occurs, the thread will not be joined but a well-behaved readFrom() implementation will terminate shortly thereafter due to the pipe having been closed.
Now, exceptions in the reader thread will be reraised in the main thread:
>>> class MyPipe(WritablePipe):
...     def readFrom(self, readable):
...         raise RuntimeError('Hello, world!')
>>> with MyPipe() as writable:
...     pass
Traceback (most recent call last):
...
RuntimeError: Hello, world!
More complicated, less illustrative tests:
Same as above, but proving that handles are closed:
>>> import os
>>> x = os.dup(0); os.close(x)
>>> class MyPipe(WritablePipe):
...     def readFrom(self, readable):
...         raise RuntimeError('Hello, world!')
>>> with MyPipe() as writable:
...     pass
Traceback (most recent call last):
...
RuntimeError: Hello, world!
>>> y = os.dup(0); os.close(y); x == y
True
Exceptions in the body of the with statement aren’t masked, and handles are closed:
>>> x = os.dup(0); os.close(x)
>>> class MyPipe(WritablePipe):
...     def readFrom(self, readable):
...         pass
>>> with MyPipe() as writable:
...     raise RuntimeError('Hello, world!')
Traceback (most recent call last):
...
RuntimeError: Hello, world!
>>> y = os.dup(0); os.close(y); x == y
True
- Parameters:
- encoding = None¶
- errors = None¶
- part_size¶
- s3_client¶
- bucket_name¶
- file_id¶
- encryption_args¶
- readFrom(readable)¶
Implement this method to read data from the pipe. This method should support both binary and text mode output.
- Parameters:
readable (file) – the file object representing the readable end of the pipe. Do not explicitly invoke the close() method of the object; that will be done automatically.
- Return type:
None
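To make the pipe-plus-reader-thread pattern behind MultiPartPipe concrete, the following is a simplified, standard-library-only sketch. CollectingPipe is a hypothetical stand-in, not the Toil class: it always joins the reader thread and does not re-raise reader exceptions, which the class documented above is said to handle differently.

```python
import os
import threading


class CollectingPipe:
    """Hypothetical stand-in: a reader thread drains the read end of an os.pipe."""

    def __init__(self) -> None:
        self.data = b""

    def readFrom(self, readable) -> None:
        # Consume everything written to the pipe until the writer closes it.
        self.data = readable.read()

    def _reader(self, read_fd: int) -> None:
        # The file object is closed automatically when the with block exits.
        with os.fdopen(read_fd, "rb") as readable:
            self.readFrom(readable)

    def __enter__(self):
        read_fd, write_fd = os.pipe()
        self._thread = threading.Thread(target=self._reader, args=(read_fd,))
        self._thread.start()
        self._writable = os.fdopen(write_fd, "wb")
        return self._writable

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Closing the writable end flushes it and signals EOF to the reader.
        self._writable.close()
        self._thread.join()
        return False  # never swallow exceptions from the with body


pipe = CollectingPipe()
with pipe as writable:
    writable.write(b"Hello, world!\n")
```

A subclass in the spirit of MultiPartPipe would replace the body of readFrom() with code that forwards the data to S3 in parts instead of collecting it in memory.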
- toil.lib.aws.s3.list_s3_items(s3_resource, bucket, prefix, startafter=None)¶
- Parameters:
- Return type:
collections.abc.Iterator[mypy_boto3_s3.type_defs.ObjectTypeDef]
- toil.lib.aws.s3.upload_to_s3(readable, s3_resource, bucket, key, extra_args=None)¶
Upload a readable object to s3, using multipart uploading if applicable.
- Parameters:
readable (IO[Any]) – a readable stream or a local file path to upload to S3
s3_resource (mypy_boto3_s3.S3ServiceResource) – boto3 resource
bucket (str) – name of the bucket to upload to
key (str) – the name of the file to upload to
extra_args (dict) – HTTP headers to use when uploading, generally used for encryption purposes
partSize (int) – max size of each part in the multipart upload, in bytes
- Returns:
version of the newly uploaded file
- Return type:
None
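A multipart upload consumes a readable object one part at a time. The sketch below shows only that splitting step, using a hypothetical helper iter_parts. It makes no claim about how upload_to_s3 decides between a single request and a multipart upload, and it performs no network calls.

```python
import io
from typing import IO, Iterator


def iter_parts(readable: IO[bytes], part_size: int) -> Iterator[tuple[int, bytes]]:
    """Hypothetical helper: yield (part_number, data) pairs from a stream."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_number = 1  # S3 part numbers start at 1
    while True:
        data = readable.read(part_size)
        if not data:
            return
        yield part_number, data
        part_number += 1


# Ten bytes split into parts of at most four bytes each.
parts = list(iter_parts(io.BytesIO(b"0123456789"), part_size=4))
```

Only the last part may be shorter than part_size; every earlier part is exactly part_size bytes when reading from an in-memory or regular file stream.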
- toil.lib.aws.s3.download_stream(s3_resource, bucket, key, checksum_to_verify=None, extra_args=None, encoding=None, errors=None)¶
Context manager that gives out a download stream to download data.
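The checksum_to_verify argument suggests that downloaded data can be checked against an expected digest. The following sketch shows the general idea of verifying a checksum while streaming. The helper verified_chunks, the choice of SHA-1, and the plain hex digest format are all assumptions for illustration; the checksum format that download_stream actually expects is not described here.

```python
import hashlib
import io
from typing import IO, Iterator


def verified_chunks(
    stream: IO[bytes], expected_sha1_hex: str, chunk_size: int = 65536
) -> Iterator[bytes]:
    """Hypothetical helper: yield chunks, then fail if the digest does not match."""
    hasher = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        yield chunk
    # The mismatch is only detectable once the whole stream has been read.
    if hasher.hexdigest() != expected_sha1_hex:
        raise ValueError("checksum mismatch")


payload = b"Hello, world!\n"
expected = hashlib.sha1(payload).hexdigest()
received = b"".join(verified_chunks(io.BytesIO(payload), expected, chunk_size=4))
```

Because the check happens at the end of the stream, a consumer written this way should treat the data as untrusted until iteration finishes without an error.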
- toil.lib.aws.s3.download_fileobject(s3_resource, bucket, key, fileobj, extra_args=None)¶
- Parameters:
s3_resource (mypy_boto3_s3.S3ServiceResource)
bucket (mypy_boto3_s3.service_resource.Bucket)
key (str)
fileobj (io.BytesIO)
extra_args (dict[Any, Any] | None)
- Return type:
None
- toil.lib.aws.s3.s3_key_exists(s3_resource, bucket, key, check=False, extra_args=None)¶
Return True if the S3 object exists, and False if not. Will error if encryption args are incorrect.
- toil.lib.aws.s3.get_s3_object(s3_resource, bucket, key, extra_args=None)¶
- toil.lib.aws.s3.put_s3_object(s3_resource, bucket, key, body, extra_args=None)¶
- toil.lib.aws.s3.generate_presigned_url(s3_resource, bucket, key_name, expiration)¶
- toil.lib.aws.s3.create_public_url(s3_resource, bucket, key)¶