toil.jobStores.aws.utils¶
Attributes¶
- AWSServerErrors
- DEFAULT_DELAYS
- DEFAULT_TIMEOUT
- logger
- DIAL_SPECIFIC_REGION_CONFIG
Exceptions¶
- ServerSideCopyProhibitedError: Raised when AWS refuses to perform a server-side copy between S3 keys, and insists that you pay to download and upload the data yourself instead.
Classes¶
- SDBHelper: A mixin with methods for storing limited amounts of binary data in an SDB item.
Functions¶
- connection_error: Return True if an error represents a failure to make a network connection.
- get_bucket_region: Get the AWS region name associated with the given S3 bucket.
- get_error_code: Get the error code name from a Boto 2 or 3 error, or compatible types.
- get_error_message: Get the error message string from a Boto 2 or 3 error, or compatible types.
- get_error_status: Get the HTTP status code from a compatible source.
- old_retry: Deprecated.
- retry: Retry a function if it fails with any Exception defined in "errors".
- fileSizeAndTime
- uploadFromPath: Uploads a file to s3, using multipart uploading if applicable.
- uploadFile: Upload a readable object to s3, using multipart uploading if applicable.
- copyKeyMultipart: Copies a key from a source key to a destination key in multiple parts.
- monkeyPatchSdbConnection
- no_such_sdb_domain
- retryable_ssl_error
- retryable_sdb_errors
- retry_sdb
Module Contents¶
- toil.jobStores.aws.utils.AWSServerErrors¶
- toil.jobStores.aws.utils.connection_error(e)¶
Return True if an error represents a failure to make a network connection.
- toil.jobStores.aws.utils.get_bucket_region(bucket_name, endpoint_url=None, only_strategies=None)¶
Get the AWS region name associated with the given S3 bucket.
Takes an optional S3 API URL override.
- toil.jobStores.aws.utils.DEFAULT_DELAYS = (0, 1, 1, 4, 16, 64)¶
- toil.jobStores.aws.utils.DEFAULT_TIMEOUT = 300¶
- toil.jobStores.aws.utils.get_error_code(e)¶
Get the error code name from a Boto 2 or 3 error, or compatible types.
Returns an empty string for other errors.
- toil.jobStores.aws.utils.get_error_message(e)¶
Get the error message string from a Boto 2 or 3 error, or compatible types.
Note that error message conditions also check more than this; this function does not fall back to the traceback for incompatible types.
- toil.jobStores.aws.utils.get_error_status(e)¶
Get the HTTP status code from a compatible source, such as a Boto 2 or 3 error, kubernetes.client.rest.ApiException, http.client.HTTPException, urllib3.exceptions.HTTPError, requests.exceptions.HTTPError, urllib.error.HTTPError, or a compatible type.
Returns 0 for other errors.
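As a rough illustration of the duck-typed lookup described above, the sketch below probes an exception for an HTTP status in a few common places. It is not Toil's implementation: the name `sketch_error_status` and the order in which attributes are probed are assumptions made for this example.

```python
from typing import Any


def sketch_error_status(e: Any) -> int:
    """Best-effort HTTP status lookup; hypothetical helper, not Toil's code."""
    # Boto 3 (botocore) errors carry the status in a nested response dict.
    response = getattr(e, "response", None)
    if isinstance(response, dict):
        status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
        if isinstance(status, int):
            return status
    # requests-style errors expose a response object with status_code.
    status = getattr(response, "status_code", None)
    if isinstance(status, int):
        return status
    # Several other HTTP error types expose .status or .code directly.
    for name in ("status", "code"):
        status = getattr(e, name, None)
        if isinstance(status, int):
            return status
    # Anything else is treated as "no HTTP status available".
    return 0
```

Returning 0 rather than raising keeps the helper safe to call from retry predicates, which is the behaviour the documentation describes for unrecognized errors.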
- toil.jobStores.aws.utils.old_retry(delays=DEFAULT_DELAYS, timeout=DEFAULT_TIMEOUT, predicate=lambda e: ...)¶
Deprecated.
Retry an operation while the failure matches a given predicate and until a given timeout expires, waiting a given amount of time between attempts. This function is a generator that yields context managers. See the doctests below for example usage.
- Parameters:
delays (Iterable[float]) – an iterable yielding the time in seconds to wait before each retried attempt; the last element of the iterable will be repeated.
timeout (float) – an overall timeout that should not be exceeded for all attempts together. This is a best-effort mechanism only and it won’t abort an ongoing attempt, even if the timeout expires during that attempt.
predicate (Callable[[Exception],bool]) – a unary callable returning True if another attempt should be made to recover from the given exception. The default value for this parameter will prevent any retries!
- Returns:
a generator yielding context managers, one per attempt
- Return type:
Iterator
Retry for a limited amount of time:
>>> true = lambda _:True
>>> false = lambda _:False
>>> i = 0
>>> for attempt in old_retry( delays=[0], timeout=.1, predicate=true ):
...     with attempt:
...         i += 1
...         raise RuntimeError('foo')
Traceback (most recent call last):
...
RuntimeError: foo
>>> i > 1
True
If timeout is 0, do exactly one attempt:
>>> i = 0
>>> for attempt in old_retry( timeout=0 ):
...     with attempt:
...         i += 1
...         raise RuntimeError( 'foo' )
Traceback (most recent call last):
...
RuntimeError: foo
>>> i
1
Don’t retry on success:
>>> i = 0
>>> for attempt in old_retry( delays=[0], timeout=.1, predicate=true ):
...     with attempt:
...         i += 1
>>> i
1
Don’t retry unless the predicate returns True:
>>> i = 0
>>> for attempt in old_retry( delays=[0], timeout=.1, predicate=false):
...     with attempt:
...         i += 1
...         raise RuntimeError( 'foo' )
Traceback (most recent call last):
...
RuntimeError: foo
>>> i
1
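The "generator that yields context managers" pattern can be hard to picture from the doctests alone. The following is a minimal, self-contained sketch of how such a generator might be written. It is not the Toil source: `sketch_retry` is a hypothetical stand-in, and details such as when the delay is slept are assumptions.

```python
import time
from contextlib import contextmanager


def sketch_retry(delays=(0, 1, 1, 4, 16, 64), timeout=300, predicate=lambda e: False):
    """Yield one context manager per attempt; hypothetical stand-in for old_retry."""
    deadline = time.time() + timeout
    delays = iter(delays)
    delay = next(delays, 0)
    done = False  # set once an attempt finishes without raising

    @contextmanager
    def attempt():
        nonlocal done
        try:
            yield
        except Exception as e:
            # Swallow the error only if retrying is allowed and time remains.
            if not (predicate(e) and time.time() + delay < deadline):
                raise
            time.sleep(delay)
        else:
            done = True

    while not done:
        yield attempt()
        # Advance to the next delay, repeating the last one when exhausted.
        delay = next(delays, delay)


# Usage: the body of the with block is retried until it stops raising.
calls = 0
for attempt in sketch_retry(delays=[0], timeout=5,
                            predicate=lambda e: isinstance(e, ConnectionError)):
    with attempt:
        calls += 1
        if calls < 3:
            raise ConnectionError("transient")
```

The context manager decides whether to suppress the exception, and the generator stops yielding once an attempt succeeds, which is why the caller needs both the `for` loop and the `with` block.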
- toil.jobStores.aws.utils.retry(intervals=None, infinite_retries=False, errors=None, log_message=None, prepare=None)¶
Retry a function if it fails with any Exception defined in “errors”.
Does so every x seconds, where x is defined by a list of numbers (ints or floats) in “intervals”. Also accepts ErrorCondition events for more detailed retry attempts.
- Parameters:
intervals (Optional[List]) – A list of delays, in seconds, to wait between successive retries before giving up. Defaults to the following exponential back-off before failing: 1s, 1s, 2s, 4s, 8s, 16s.
infinite_retries (bool) – If this is True, reset the intervals when they run out. Defaults to: False.
errors (Optional[Sequence[Union[ErrorCondition, Type[Exception]]]]) –
A list of exceptions OR ErrorCondition objects to catch and retry on. ErrorCondition objects describe more detailed error event conditions than a plain error. An ErrorCondition specifies:
- Exception (required)
- Error codes that must match to be retried (optional; defaults to not checking)
- A string that must be in the error message to be retried (optional; defaults to not checking)
- A bool that can be set to False to always error on this condition.
If not specified, this will default to a generic Exception.
log_message (Optional[Tuple[Callable, str]]) – Optional tuple of (“log/print function()”, “message string”) that will precede each attempt.
prepare (Optional[List[Callable]]) – Optional list of functions to call, with the function’s arguments, between retries, to reset state.
- Returns:
The result of the wrapped function; if the retries are exhausted, the exception is raised instead.
- Return type:
Callable[[Callable[Ellipsis, RT]], Callable[Ellipsis, RT]]
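To make the decorator's shape concrete, here is a simplified, self-contained sketch of a retry decorator driven by a list of intervals. It is not Toil's `retry`: the name `sketch_retry_decorator` is hypothetical, only plain exception types are supported (no ErrorCondition, `infinite_retries`, `log_message` or `prepare`), and the choice to make one final attempt after the intervals run out is an assumption.

```python
import functools
import time
from typing import Callable, List, Optional, Sequence, Type, TypeVar

RT = TypeVar("RT")


def sketch_retry_decorator(
    intervals: Optional[Sequence[float]] = None,
    errors: Optional[Sequence[Type[Exception]]] = None,
) -> Callable[[Callable[..., RT]], Callable[..., RT]]:
    """Retry the wrapped function on the given exception types (hypothetical)."""
    # Defaults mirror the documented back-off and the generic Exception fallback.
    waits = list(intervals) if intervals is not None else [1, 1, 2, 4, 8, 16]
    caught = tuple(errors) if errors else (Exception,)

    def decorate(func: Callable[..., RT]) -> Callable[..., RT]:
        @functools.wraps(func)
        def wrapper(*args, **kwargs) -> RT:
            for wait in waits:
                try:
                    return func(*args, **kwargs)
                except caught:
                    time.sleep(wait)
            # Intervals are exhausted: make one final attempt and let it raise.
            return func(*args, **kwargs)

        return wrapper

    return decorate


# Usage: fail twice with a retryable error, then succeed on the third call.
attempts: List[int] = []


@sketch_retry_decorator(intervals=[0, 0], errors=[KeyError])
def flaky() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise KeyError("not yet")
    return "ok"


result = flaky()
```

Errors that are not listed in `errors` propagate immediately, so the decorator only masks failures the caller has explicitly declared to be transient.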
- toil.jobStores.aws.utils.logger¶
- toil.jobStores.aws.utils.DIAL_SPECIFIC_REGION_CONFIG¶
- class toil.jobStores.aws.utils.SDBHelper¶
A mixin with methods for storing limited amounts of binary data in an SDB item
>>> import os
>>> H=SDBHelper
>>> H.presenceIndicator()
u'numChunks'
>>> H.binaryToAttributes(None)['numChunks']
0
>>> H.attributesToBinary({u'numChunks': 0})
(None, 0)
>>> H.binaryToAttributes(b'')
{u'000': b'VQ==', u'numChunks': 1}
>>> H.attributesToBinary({u'numChunks': 1, u'000': b'VQ=='})
(b'', 1)
Good pseudo-random data is very likely smaller than its bzip2ed form. Subtract 1 for the type character, i.e. ‘C’ or ‘U’, with which the string is prefixed. We should get one full chunk:
>>> s = os.urandom(H.maxRawValueSize-1)
>>> d = H.binaryToAttributes(s)
>>> len(d), len(d['000'])
(2, 1024)
>>> H.attributesToBinary(d) == (s, 1)
True
One byte more and we should overflow four bytes into the second chunk: two bytes for base64-encoding the additional character and two bytes for base64-padding to the next quartet.
>>> s += s[0:1]
>>> d = H.binaryToAttributes(s)
>>> len(d), len(d['000']), len(d['001'])
(3, 1024, 4)
>>> H.attributesToBinary(d) == (s, 2)
True
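The chunking behaviour exercised by these doctests can be approximated with the standalone sketch below. It is an illustration written for this page rather than the SDBHelper source: the function names are hypothetical, and the rule for choosing between the compressed and uncompressed form is an assumption inferred from the ‘C’/‘U’ prefix described above.

```python
import base64
import bz2
from typing import Dict, Optional, Tuple, Union

MAX_VALUE_SIZE = 1024  # documented maxValueSize: longest attribute value


def sketch_binary_to_attributes(binary: Optional[bytes]) -> Dict[str, Union[bytes, int]]:
    """Split a bytestring into base64 chunks keyed '000', '001', ... (sketch)."""
    if binary is None:
        return {"numChunks": 0}
    # Keep whichever form is smaller, tagged 'C' (compressed) or 'U' (uncompressed).
    compressed = bz2.compress(binary)
    tagged = b"C" + compressed if len(compressed) < len(binary) else b"U" + binary
    encoded = base64.b64encode(tagged)
    attributes: Dict[str, Union[bytes, int]] = {
        f"{index:03d}": encoded[start:start + MAX_VALUE_SIZE]
        for index, start in enumerate(range(0, len(encoded), MAX_VALUE_SIZE))
    }
    attributes["numChunks"] = len(attributes)
    return attributes


def sketch_attributes_to_binary(attributes) -> Tuple[Optional[bytes], int]:
    """Reassemble the chunks and undo the tagging; returns (binary, numChunks)."""
    num_chunks = int(attributes["numChunks"])
    if num_chunks == 0:
        return None, 0
    encoded = b"".join(attributes[f"{i:03d}"] for i in range(num_chunks))
    tagged = base64.b64decode(encoded)
    tag, payload = tagged[:1], tagged[1:]
    binary = bz2.decompress(payload) if tag == b"C" else payload
    return binary, num_chunks
```

Because base64 turns every 3 raw bytes into 4 characters, a 1024-character value holds 768 raw bytes including the one-byte tag, which is why the doctest subtracts 1 from maxRawValueSize to fill exactly one chunk.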
- maxAttributesPerItem = 256¶
- maxValueSize = 1024¶
- maxRawValueSize¶
- classmethod maxBinarySize(extraReservedChunks=0)¶
- classmethod binaryToAttributes(binary)¶
Turn a bytestring, or None, into SimpleDB attributes.
- classmethod attributeDictToList(attributes)¶
Convert the attribute dict (e.g. from binaryToAttributes) into a list of attribute typed dicts to be compatible with boto3 argument syntax.
- Parameters:
attributes (Dict[str, str]) – attribute in object form
- Returns:
list of attributes in typed dict form
- Return type:
List[AttributeTypeDef]
- classmethod attributeListToDict(attributes)¶
Convert the boto3 representation of attributes (a list of attribute typed dicts) back to a dictionary of name, value pairs.
- Parameters:
attributes (List[AttributeTypeDef]) – attribute in typed dict form
- Returns:
attribute in dict form
- Return type:
Dict[str, str]
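The two conversions above are inverses of each other. A minimal sketch, assuming the boto3 SimpleDB convention of representing each attribute as a dict with "Name" and "Value" keys, could look like this; the function names are hypothetical and the real methods may handle additional fields.

```python
from typing import Dict, List


def sketch_dict_to_list(attributes: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn {'name': 'value'} into [{'Name': 'name', 'Value': 'value'}]."""
    return [{"Name": name, "Value": value} for name, value in attributes.items()]


def sketch_list_to_dict(attributes: List[Dict[str, str]]) -> Dict[str, str]:
    """Inverse of sketch_dict_to_list; later duplicates overwrite earlier ones."""
    return {attribute["Name"]: attribute["Value"] for attribute in attributes}


# Usage: a round trip preserves the original mapping.
original = {"000": "VQ==", "numChunks": "1"}
as_list = sketch_dict_to_list(original)
round_tripped = sketch_list_to_dict(as_list)
```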
- classmethod get_attributes_from_item(item, keys)¶
- classmethod presenceIndicator()¶
The key that is guaranteed to be present in the return value of binaryToAttributes(). Assuming that binaryToAttributes() is used with SDB’s PutAttributes, the return value of this method could be used to detect the presence/absence of an item in SDB.
- toil.jobStores.aws.utils.fileSizeAndTime(localFilePath)¶
- toil.jobStores.aws.utils.uploadFromPath(localFilePath, resource, bucketName, fileID, headerArgs=None, partSize=50 << 20)¶
Uploads a file to s3, using multipart uploading if applicable
- Parameters:
localFilePath (str) – Path of the file to upload to s3
resource (S3.Resource) – boto3 resource
bucketName (str) – name of the bucket to upload to
fileID (str) – the name of the file to upload to
headerArgs (dict) – http headers to use when uploading - generally used for encryption purposes
partSize (int) – max size of each part in the multipart upload, in bytes
- Returns:
version of the newly uploaded file
- toil.jobStores.aws.utils.uploadFile(readable, resource, bucketName, fileID, headerArgs=None, partSize=50 << 20)¶
Upload a readable object to s3, using multipart uploading if applicable.
- Parameters:
readable – a readable stream or a file path to upload to s3
resource (S3.Resource) – boto3 resource
bucketName (str) – name of the bucket to upload to
fileID (str) – the name of the file to upload to
headerArgs (dict) – http headers to use when uploading - generally used for encryption purposes
partSize (int) – max size of each part in the multipart upload, in bytes
- Returns:
version of the newly uploaded file
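The default `partSize` of `50 << 20` is 52,428,800 bytes (50 MiB). As a small worked example of what that implies for a multipart upload, the sketch below computes how many parts a payload of a given size would be split into. The helper name is hypothetical and the handling of empty payloads is an assumption; it does not describe how Toil or boto3 actually schedule parts.

```python
DEFAULT_PART_SIZE = 50 << 20  # documented default: 52,428,800 bytes (50 MiB)


def sketch_part_count(file_size: int, part_size: int = DEFAULT_PART_SIZE) -> int:
    """Number of parts needed to cover file_size bytes (hypothetical helper)."""
    if file_size <= 0:
        return 1  # assumption: an empty object still takes one upload request
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-file_size // part_size)


one_gib_parts = sketch_part_count(1 << 30)
```

With the default part size, a 1 GiB file needs 21 parts: twenty full 50 MiB parts plus a final 24 MiB remainder.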
- exception toil.jobStores.aws.utils.ServerSideCopyProhibitedError¶
Bases:
RuntimeError
Raised when AWS refuses to perform a server-side copy between S3 keys, and insists that you pay to download and upload the data yourself instead.
- toil.jobStores.aws.utils.copyKeyMultipart(resource, srcBucketName, srcKeyName, srcKeyVersion, dstBucketName, dstKeyName, sseAlgorithm=None, sseKey=None, copySourceSseAlgorithm=None, copySourceSseKey=None)¶
Copies a key from a source key to a destination key in multiple parts. Note that if the destination key exists it will be overwritten implicitly, and if it does not exist a new key will be created. If the destination bucket does not exist an error will be raised.
This function will always do a fast, server-side copy, at least until/unless <https://github.com/boto/boto3/issues/3270> is fixed. In some situations, a fast, server-side copy is not actually possible. For example, when residing in an AWS VPC with an S3 VPC Endpoint configured, copying from a bucket in another region to a bucket in your own region cannot be performed server-side. This is because the VPC Endpoint S3 API servers refuse to perform server-side copies between regions, the source region’s API servers refuse to initiate the copy and refer you to the destination bucket’s region’s API servers, and the VPC routing tables are configured to redirect all access to the current region’s S3 API servers to the S3 Endpoint API servers instead.
If a fast server-side copy is not actually possible, a ServerSideCopyProhibitedError will be raised.
- Parameters:
resource (mypy_boto3_s3.S3ServiceResource) – boto3 resource
srcBucketName (str) – The name of the bucket to be copied from.
srcKeyName (str) – The name of the key to be copied from.
srcKeyVersion (str) – The version of the key to be copied from.
dstBucketName (str) – The name of the destination bucket for the copy.
dstKeyName (str) – The name of the destination key that will be created or overwritten.
sseAlgorithm (str) – Server-side encryption algorithm for the destination.
sseKey (str) – Server-side encryption key for the destination.
copySourceSseAlgorithm (str) – Server-side encryption algorithm for the source.
copySourceSseKey (str) – Server-side encryption key for the source.
- Return type:
- Returns:
The version of the copied file (or None if versioning is not enabled for dstBucket).
- toil.jobStores.aws.utils.monkeyPatchSdbConnection(sdb)¶
- toil.jobStores.aws.utils.no_such_sdb_domain(e)¶
- toil.jobStores.aws.utils.retryable_ssl_error(e)¶
- toil.jobStores.aws.utils.retryable_sdb_errors(e)¶
- toil.jobStores.aws.utils.retry_sdb(delays=DEFAULT_DELAYS, timeout=DEFAULT_TIMEOUT, predicate=retryable_sdb_errors)¶