Datastore: _Rendezvous of RPC that terminated with StatusCode.UNAVAILABLE #2583

@Bogdanp

We see this fairly often on commit with google-cloud-datastore version 0.20. I believe these errors should either be retried automatically by the library with exponential backoff (according to this) or be surfaced as a more specific error so user code can deal with them (preferably one exception for each of the cases listed in that doc).

_Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, {"created":"@1476898717.596308747","description":"Secure read failed","file":"src/core/lib/security/transport/secure_endpoint.c","file_line":157,"grpc_status":14,"referenced_errors":[{"created":"@1476898717.596257572","description":"EOF","file":"src/core/lib/iomgr/tcp_posix.c","file_line":235}]})>
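To illustrate the second option (one exception per case), here is a rough sketch of what such a mapping could look like. Every class and function name in it is hypothetical and not part of google-cloud-python. It covers only the four status codes that the workaround further down retries, and it assumes nothing more than that the raised error exposes a `code()` method returning an enum member with a `name`, as `grpc.StatusCode` members do.

```python
class DatastoreRPCError(Exception):
    """Hypothetical base class; not part of google-cloud-python."""

    def __init__(self, cause):
        super(DatastoreRPCError, self).__init__(str(cause))
        self.cause = cause  # the original low-level RPC error


class Unavailable(DatastoreRPCError):
    """The service was unreachable (status code UNAVAILABLE)."""


class Aborted(DatastoreRPCError):
    """The operation was aborted (status code ABORTED)."""


class DeadlineExceeded(DatastoreRPCError):
    """The deadline expired (status code DEADLINE_EXCEEDED)."""


class InternalError(DatastoreRPCError):
    """The server reported an internal error (status code INTERNAL)."""


# Keyed by status-code name so this sketch does not need to import grpc.
_ERRORS_BY_CODE_NAME = {
    "UNAVAILABLE": Unavailable,
    "ABORTED": Aborted,
    "DEADLINE_EXCEEDED": DeadlineExceeded,
    "INTERNAL": InternalError,
}


def to_specific_error(rpc_error):
    """Wrap rpc_error in a specific exception, or return it unchanged.

    Assumes rpc_error.code() returns an enum member with a .name attribute.
    """
    code = rpc_error.code()
    error_class = _ERRORS_BY_CODE_NAME.get(getattr(code, "name", None))
    if error_class is None:
        return rpc_error
    return error_class(rpc_error)
```

User code could then write `except Unavailable:` instead of inspecting `e.code()` by hand.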

Edit:

Here's our current (somewhat tested) workaround for this issue in our internal ORM:

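import functools
import time

import google.cloud.exceptions
import grpc
from google.cloud import datastore

# Note: `bread` and `RetriesExceeded` are not defined in this snippet
# (presumably internal to the ORM).

# The maximum number of retries that should be done per Datastore
# error code.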
_MAX_RETRIES_BY_CODE = {
    grpc.StatusCode.INTERNAL: 1,
    grpc.StatusCode.ABORTED: 5,  # Only retried for non-transactional commits
    grpc.StatusCode.UNAVAILABLE: 5,
    grpc.StatusCode.DEADLINE_EXCEEDED: 5,
}


def _handle_errors(f, transactional=False):
    @functools.wraps(f)
    def handler(*args, **kwargs):
        retries = 0
        while True:
            try:
                return f(*args, **kwargs)
            # TODO: Replace w/ concrete error types once/if they are
            # added to gcloud.  See: google-cloud-python/issues/2583
            except google.cloud.exceptions._Rendezvous as e:
                code = e.code()
                max_retries = _MAX_RETRIES_BY_CODE.get(code)
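                if max_retries is None or (transactional and code == grpc.StatusCode.ABORTED):
                    raise

                # `retries` counts the retries already performed, so stop
                # as soon as the limit has been reached.
                if retries >= max_retries:
                    raise RetriesExceeded(e)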

                backoff = min(0.0625 * 2 ** retries, 1.0)
                bread.get_logger().debug("Sleeping for %r before retrying failed request...", backoff)

                retries += 1
                time.sleep(backoff)

    return handler


class Client(datastore.Client):
    def __init__(self, *args, **kwargs):
        super(Client, self).__init__(*args, **kwargs)

        self.delete_multi = _handle_errors(self.delete_multi)
        self.get_multi = _handle_errors(self.get_multi)
        self.put_multi = _handle_errors(self.put_multi)

    def transaction(self, *args, **kwargs):
        transaction = super(Client, self).transaction(*args, **kwargs)
        transaction.commit = _handle_errors(transaction.commit, transactional=True)
        return transaction

    def query(self, *args, **kwargs):
        query = super(Client, self).query(*args, **kwargs)
        query.fetch = _handle_errors(query.fetch)
        return query
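For reference, the delays produced by the backoff expression in the workaround (`min(0.0625 * 2 ** retries, 1.0)`) can be listed with a small helper. `backoff_schedule` is a hypothetical name used only for this illustration. Its defaults mirror the constants above.

```python
def backoff_schedule(num_retries, base=0.0625, cap=1.0):
    """Return the sleep time in seconds before each of the first num_retries retries."""
    return [min(base * 2 ** attempt, cap) for attempt in range(num_retries)]


schedule = backoff_schedule(5)
print(schedule)       # [0.0625, 0.125, 0.25, 0.5, 1.0]
print(sum(schedule))  # 1.9375
```

So the first five retries of a call add just under two seconds of sleep in total, and every retry after that would wait the capped 1.0 second.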

Labels: api: datastore, grpc, priority: p2
