Storage retries don't work as expected #2098

@Dima1224

Description

I have been getting 503 errors from GCS while downloading content. I expected the storage library to retry these requests, but it does not. After some debugging, I found the cause: the GoogleJsonResponseException has no details set (getDetails() returns null), and in that case the StorageException it is converted to is marked as not retryable.

Here's the snippet from BaseHttpServiceException that causes the problem:

  private static ExceptionData makeExceptionData(IOException exception, boolean idempotent,
      Set<BaseServiceException.Error> retryableErrors) {
    int code = UNKNOWN_CODE;
    String reason = null;
    String location = null;
    String debugInfo = null;
    Boolean retryable = null;
    if (exception instanceof HttpResponseException) {
      if (exception instanceof GoogleJsonResponseException) {
        GoogleJsonError jsonError = ((GoogleJsonResponseException) exception).getDetails();
        if (jsonError != null) {
          BaseServiceException.Error error = new BaseServiceException.Error(jsonError.getCode(),
              reason(jsonError));
          code = error.getCode();
          reason = error.getReason();
          retryable = error.isRetryable(idempotent, retryableErrors);
          if (reason != null) {
            GoogleJsonError.ErrorInfo errorInfo = jsonError.getErrors().get(0);
            location = errorInfo.getLocation();
            debugInfo = (String) errorInfo.get("debugInfo");
          }
        } else {
          code = ((GoogleJsonResponseException) exception).getStatusCode();
        }
      } else {
        // In cases where an exception is an instance of HttpResponseException but not
        // an instance of GoogleJsonResponseException, check the status code to determine whether it's retryable
        code = ((HttpResponseException) exception).getStatusCode();
        retryable = BaseServiceException.isRetryable(code, null, idempotent, retryableErrors);
      }
    }
    return ExceptionData.newBuilder()
        .setMessage(message(exception))
        .setCause(exception)
        .setRetryable(MoreObjects
            .firstNonNull(retryable, BaseServiceException.isRetryable(idempotent, exception)))
        .setCode(code)
        .setReason(reason)
        .setLocation(location)
        .setDebugInfo(debugInfo)
        .build();
  }

In this snippet, retryableErrors is consulted only if getDetails() returns a non-null result, or if the exception is an HttpResponseException that is not a GoogleJsonResponseException.
In my case, the error is a GoogleJsonResponseException with a 503 status code but no details (I'm not sure whether that is a bug somewhere upstream). The status code is copied into code, but retryable stays null, so BaseServiceException.isRetryable(idempotent, exception) is called to decide whether the exception is retryable, and it returns false.
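
The difference between the reported behavior and one possible fix can be shown with a minimal, self-contained sketch. It does not use the library's classes: RetryDecisionSketch, JsonError, RETRYABLE_CODES, currentBehavior and proposedBehavior are all hypothetical names, the set of retryable codes is an assumption, and the check is a simplification of whatever matching the library really does. The idea is only that the HTTP status code could be consulted when the details are missing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Simplified model of the retry decision; not the library's actual code. */
public class RetryDecisionSketch {

  /** Assumed set of retryable HTTP status codes (hypothetical, for illustration). */
  public static final Set<Integer> RETRYABLE_CODES =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList(500, 502, 503, 504)));

  /** Hypothetical stand-in for the parsed JSON error details. */
  public static final class JsonError {
    private final int code;

    public JsonError(int code) {
      this.code = code;
    }

    public int getCode() {
      return code;
    }
  }

  /** Models the behavior described in this report: null details means no retry. */
  public static boolean currentBehavior(int statusCode, JsonError details, boolean idempotent) {
    Boolean retryable = null;
    if (details != null) {
      retryable = idempotent && RETRYABLE_CODES.contains(details.getCode());
    }
    // With null details, retryable stays null and the status code is ignored.
    // Per this report, the fallback check then returns false for this exception.
    return retryable != null ? retryable : false;
  }

  /** One possible fix: fall back to the HTTP status code when details are missing. */
  public static boolean proposedBehavior(int statusCode, JsonError details, boolean idempotent) {
    int code = details != null ? details.getCode() : statusCode;
    return idempotent && RETRYABLE_CODES.contains(code);
  }

  public static void main(String[] args) {
    // A 503 on an idempotent read, with no details attached to the exception.
    System.out.println(currentBehavior(503, null, true));  // prints "false"
    System.out.println(proposedBehavior(503, null, true)); // prints "true"
  }
}
```

In the quoted makeExceptionData method, the equivalent change would presumably go in the else branch that currently only assigns code from getStatusCode(), mirroring what the non-JSON HttpResponseException branch already does.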

I'm using v1.0.2 of the library.

Here's the stack trace:

Exception in thread "main" com.google.cloud.storage.StorageException: 503 Service Unavailable
Backend Error
at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:189)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.read(HttpStorageRpc.java:515)
at com.google.cloud.storage.BlobReadChannel$1.call(BlobReadChannel.java:127)
at com.google.cloud.storage.BlobReadChannel$1.call(BlobReadChannel.java:124)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:93)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:49)
at com.google.cloud.storage.BlobReadChannel.read(BlobReadChannel.java:124)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.ZipInputStream.read(ZipInputStream.java:193)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.Reader.read(Reader.java:100)
at java.util.Scanner.readInput(Scanner.java:854)
at java.util.Scanner.findWithinHorizon(Scanner.java:1733)
at java.util.Scanner.hasNextLine(Scanner.java:1550)
at Random.main(Random.java:118)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 503 Service Unavailable
Backend Error
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeMedia(AbstractGoogleClientRequest.java:380)
at com.google.api.services.storage.Storage$Objects$Get.executeMedia(Storage.java:6107)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.read(HttpStorageRpc.java:494)
... 22 more

Labels

api: storage (Issues related to the Cloud Storage API.)
priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
