Skip to content

Conversation

@tseaver
Copy link
Contributor

@tseaver tseaver commented Jul 24, 2017

Uses #3615 as a base. ca441f7 is the only change from that PR.

@tseaver tseaver added api: spanner Issues related to the Spanner API. testing labels Jul 24, 2017
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 24, 2017
@tseaver tseaver added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jul 24, 2017
@tseaver
Copy link
Contributor Author

tseaver commented Jul 24, 2017

This test can deadlock: I got interrupted while debugging and blew my stack. :(

@tseaver
Copy link
Contributor Author

tseaver commented Jul 25, 2017

@dhermes, @bjwatson, @lukesneeringer OK, I have the test passing, but to do so I have to straddle the fact that the GAPIC / gRPC bits sometimes raise a _Rendezvous w/ status code ABORTED, and sometimes a GaxError with such a _Rendezvous as their cause. I think the APIs should raise one or the other, not a mix-and-match.

Specifics:

  • ABORTED returned from the server during a Read API call is raised as a _Rendezvous.
    ABORTEDreturned from the server during aCommitAPI call is raised as aGaxError`.

My hack to get the read-abort bit passing:

@@ -312,7 +334,12 @@ def _delay_until_retry(exc, deadline):
     :type deadline: float
     :param deadline: maximum timestamp to continue retrying the transaction.
     """
-    if exc_to_code(exc.cause) != StatusCode.ABORTED:
+    if isinstance(exc, GrpcRendezvous):
+        cause = exc
+    else:
+        cause = exc.cause
+
+    if exc_to_code(cause) != StatusCode.ABORTED:
         raise
 
     now = time.time()
@@ -320,7 +347,7 @@ def _delay_until_retry(exc, deadline):
     if now >= deadline:
         raise
 
-    delay = _get_retry_delay(exc)
+    delay = _get_retry_delay(cause)
     if delay is not None:
 
         if now + delay > deadline:
@@ -330,7 +357,7 @@ def _delay_until_retry(exc, deadline):
 # pylint: enable=misplaced-bare-raise
 
 
-def _get_retry_delay(exc):
+def _get_retry_delay(cause):
     """Helper for :func:`_delay_until_retry`.
 
     :type exc: :class:`google.gax.errors.GaxError`
@@ -339,7 +366,7 @@ def _get_retry_delay(exc):
     :rtype: float
     :returns: seconds to wait before retrying the transaction.
     """
-    metadata = dict(exc.cause.trailing_metadata())
+    metadata = dict(cause.trailing_metadata())
     retry_info_pb = metadata.get('google.rpc.retryinfo-bin')
     if retry_info_pb is not None:
         retry_info = RetryInfo()

@tseaver
Copy link
Contributor Author

tseaver commented Jul 25, 2017

Turns out my issue is the same as #3562.

@tseaver tseaver force-pushed the spanner-systest-txn_read_abort branch from ca441f7 to f916edc Compare July 28, 2017 19:40
@tseaver tseaver added the status: blocked Resolving the issue is dependent on other work. label Jul 28, 2017
@tseaver
Copy link
Contributor Author

tseaver commented Jul 28, 2017

I've rebased, fixed the deadlock / hang, and added a separate commit (f916edc) with my workaround for #3562 so that I can verify the tests pass on Circle. I propose waiting for a real fix and then removing that commit.

@tseaver
Copy link
Contributor Author

tseaver commented Jul 28, 2017

The test failure is for coverage of branches added in the to-be-backed-out commit (f916edc).

@theacodes
Copy link
Contributor

@tseaver once #3738 is in, you can use google.api.core.exceptions.from_grpc_exception to map the grpc.RpcError to a Google API exception. You shouldn't need to catch _Rendezvous directly, instead, catch grpc.RpcError or grpc.Call (if you need the metadata directly).

@bjwatson bjwatson added release blocking Required feature/issue must be fixed prior to next release. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed status: blocked Resolving the issue is dependent on other work. labels Aug 7, 2017
@bjwatson
Copy link

bjwatson commented Aug 8, 2017

@tseaver I removed the blocked label since it looks like @jonparrott unblocked this.

@tseaver
Copy link
Contributor Author

tseaver commented Aug 8, 2017

@jonparrott Note that for the ABORT error, I'm not propagating the error, but using it to trigger a retry (I do need access to the trailing metadata to pick out the retry interval).

@bjwatson Am I still supposed to be catching GaxError as well as grpc.Call here?

@theacodes
Copy link
Contributor

@tseaver yes, in this case catch GaxError and grpc.Call.

@tseaver
Copy link
Contributor Author

tseaver commented Aug 8, 2017

@jonparrott Why are we not fixing the streaming iterator stuff to return only one kind of error?

@theacodes
Copy link
Contributor

@tseaver we are, but it's O(weeks) away, and changing this PR to catch grpc.Call instead of GrpcRendezvous is a fine compromise for now- in the near term, it'll unblock spanner. Once google.api.core.grpc exists and is used by gapic, it will not raise this error which will mean a small change will need to be made here (simplifying this to just catch a google.api.core.exception subclass).

@tseaver
Copy link
Contributor Author

tseaver commented Aug 9, 2017

@jonparrott Note that one cannot catch grpc.Call (it doesn't derive from BaseException).

Given that the change is intended to be temporary, I will merge with a # pragma: NO COVER on the branch for if isinstance(exc, GrpcRendezvous), and add an issue to remove that once the underling normalization fix has rolled out.

@tseaver tseaver force-pushed the spanner-systest-txn_read_abort branch from f916edc to cf48b5e Compare August 9, 2017 22:16
@tseaver tseaver merged commit 1fcc1a4 into master Aug 10, 2017
@tseaver tseaver deleted the spanner-systest-txn_read_abort branch August 10, 2017 01:19
landrito pushed a commit to landrito/google-cloud-python that referenced this pull request Aug 21, 2017
landrito pushed a commit to landrito/google-cloud-python that referenced this pull request Aug 22, 2017
landrito pushed a commit to landrito/google-cloud-python that referenced this pull request Aug 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the Spanner API. cla: yes This human has signed the Contributor License Agreement. do not merge Indicates a pull request not ready for merge, due to either quality or timing. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. release blocking Required feature/issue must be fixed prior to next release. testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants