Skip to content

Conversation

@olavloite
Copy link
Collaborator

@olavloite olavloite commented Dec 12, 2025

  • Send a RequestID to Spanner for each request
  • Make sure that the attempt number of the RequestID is increased if the RPC is retried.
  • Include the RequestID in every error that is thrown due to an error that is returned by Spanner.

This change removes most of the manual handling of RequestIDs, and instead lets the SpannerRpc implementation and a couple of interceptors handle RequestIDs.

@olavloite olavloite requested review from a team as code owners December 12, 2025 14:32
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: spanner Issues related to the googleapis/java-spanner API. labels Dec 12, 2025
@gemini-code-assist
Copy link

Summary of Changes

Hello @olavloite, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the observability and debugging experience for the Spanner client library by integrating a comprehensive RequestID tracking system. This system ensures that every API call to Spanner carries a unique identifier, which is then propagated through retries and included in any resulting error messages. This change allows developers to more easily trace the lifecycle of individual requests and diagnose issues within distributed Spanner operations.

Highlights

  • Request ID Propagation: Introduced a mechanism to send a unique RequestID with every Spanner API request, enhancing traceability and debugging capabilities.
  • Retry Attempt Tracking: The RequestID now includes an attempt number that is automatically incremented when an RPC is retried, providing clearer insight into retry behavior.
  • Error Reporting: All SpannerException instances now include the relevant RequestID in their details, making it easier to correlate errors with specific requests.
  • Refactored Request ID Handling: The internal handling of RequestID generation and propagation has been refactored, moving RequestIdCreator responsibility to the SpannerRpc layer and simplifying exception constructors.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and well-executed refactoring to automate the handling of RequestID for Spanner requests. The changes move the responsibility of creating, propagating, and incrementing request IDs from the application logic into a new gRPC interceptor layer. This simplifies the higher-level code, reduces boilerplate, and makes the request ID handling more robust and centralized.

The key changes include:

  • Introducing RequestIdCreatorImpl for generating request IDs.
  • Adding RequestIdInterceptor to inject the request ID into gRPC headers and increment attempt numbers on retries.
  • Modifying SpannerErrorInterceptor to propagate the request ID in error trailers, allowing exceptions to automatically capture it.
  • Removing manual request ID management from numerous classes, including SpannerException subclasses, AbstractReadContext, TransactionRunnerImpl, and SessionImpl.
  • Adding a comprehensive new test class, RequestIdMockServerTest, which thoroughly validates the new mechanism.

The code quality is high, and the changes are consistent and logical. This is an excellent improvement to the client library's architecture.


if (operationName == null) {
GrpcCallContext context = newCallContext(null, instanceName, initialRequest, method);
GrpcCallContext context = newAdminCallContext(instanceName, initialRequest, method);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't directly related, but a small cleanup to prevent each of these methods to invoke newCallContext(null, ...), and instead use a more descriptive method.

@Override
public long executePartitionedUpdate(Statement stmt, UpdateOption... options) {
return createMultiplexedSessionTransaction(/* singleUse= */ true)
return createMultiplexedSessionTransaction(/* singleUse= */ false)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PartitionedDML is not a single-use transaction. It invokes the BeginTransaction RPC and the ExecuteStreamingSql RPC. These two calls should use the same channel.

defaultInterceptorList.add(
new LoggingInterceptor(Logger.getLogger(GapicSpannerRpc.class.getName()), Level.FINER));
defaultInterceptorList.add(new HeaderInterceptor(new SpannerRpcMetrics(openTelemetry)));
defaultInterceptorList.add(new RequestIdInterceptor());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a generic interceptor that adds the RequestID header from a CallOption.

return;
}
try {
if (headers.containsKey(XGoogSpannerRequestId.REQUEST_ID_HEADER_KEY)) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds the original RequestID in the error if Spanner returns a non-OK status code for an RPC. This ensures that all errors that originate from Spanner contain a RequestID, and those that do not originate from Spanner do not contain a RequestID.

ReadRequest request,
ResultStreamConsumer consumer,
@Nullable Map<Option, ?> options,
XGoogSpannerRequestId requestId,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The streaming RPCs (StreamingRead, ExecuteStreamingSql) that handle retries of UNAVAILABLE errors manually must pass in a RequestID. This ensures that the same RequestID is used for both the original attempt and any retries that happen. This again ensures that the attempt number is increased for retries.

context = withRequestId(context, options);
int requestIdChannel = convertToRequestIdChannelNumber(affinity);
if (requestId == null) {
requestId = requestIdCreator.nextRequestId(requestIdChannel);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generate a RequestID automatically here for all unary RPCs. We can safely do that, as retries are handled internally by Gax. This means that the same RequestID is automatically used during a retry, and the RequestIdInterceptor can just increase the attempt number every time that the RPC is invoked.

XGoogSpannerRequestId.VERSION,
XGoogSpannerRequestId.RAND_PROCESS_ID,
this.nthClientId,
this.nthChannelId < 0 ? "x" : String.valueOf(this.nthChannelId),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ChannelId < 0 is used in tests to indicate 'A channel ID is expected, but it is unknown which it will be'

InProcessServerBuilder.forName(uniqueName)
// We need to use a real executor for timeouts to occur.
.scheduledExecutorService(new ScheduledThreadPoolExecutor(1))
NettyServerBuilder.forAddress(address)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test must use a standard (TCP-based) channel provider, otherwise the interceptors are not added to the client.

- Send a RequestID to Spanner for each request
- Make sure that the attempt number of the RequestID is increased
  if the RPC is retried.
- Include the RequestID in every error that is thrown due to an error
  that is returned by Spanner.
@rahul2393 rahul2393 merged commit afd7d6b into main Dec 15, 2025
42 checks passed
@rahul2393 rahul2393 deleted the request-id branch December 15, 2025 06:57
blakeli0 pushed a commit that referenced this pull request Dec 17, 2025
* chore: add release-please config for protobuf-4.x (#4249)

* fix: Refine connecitivity metrics to capture RPCs with no response he… (#4252)

* fix: Refine connecitivity metrics to capture RPCs with no response headers

* test fix

* fix: retry as PDML dit not retry Resource limit exceeded (#4258)

Most transactions that exceed the mutation limit for an atomic transaction will fail with the error "The transaction contains too many mutations.". However, it is also possible that the transaction fails with the more generic error message "Transaction resource limits exceeded". This error did not trigger a retry of the statement using a PDML transaction.

Fixes #4253

* deps: update dependency com.google.cloud:sdk-platform-java-config to v3.54.2 (#4261)

* deps: update googleapis/sdk-platform-java action to v2.64.2 (#4262)

* feat: include RequestID in requests and errors (#4263)

- Send a RequestID to Spanner for each request
- Make sure that the attempt number of the RequestID is increased
  if the RPC is retried.
- Include the RequestID in every error that is thrown due to an error
  that is returned by Spanner.

* feat: make grpc-gcp default enabled (#4239)

This PR enables the gRPC-GCP channel pool extension by default for Cloud Spanner Java client.

**What's Changing for Customers**

**Before this change**

- gRPC-GCP extension was disabled by default
- Default number of channels: 4
- Channel pooling was handled by GAX

**After this change**

- gRPC-GCP extension is enabled by default
- Default number of channels: 8
- Channel pooling is handled by gRPC-GCP extension

**Benefits of gRPC-GCP**

- **Improved resilience:** When a network connection fails on a particular channel, operations can be automatically retried on a different gRPC channel
- **Better channel management:** gRPC-GCP provides more sophisticated channel affinity and load balancing

**How to Disable gRPC-GCP (Switch Back to GAX Channel Pool)**

If you need to disable gRPC-GCP and use the previous GAX channel pooling behavior, use the `disableGrpcGcpExtension()` method:
```
SpannerOptions options = SpannerOptions.newBuilder()
    .setProjectId("my-project")
    .disableGrpcGcpExtension()
    .build();
```

When disabled, the default number of channels reverts to 4 (the previous default).

**When You Might Want to Disable gRPC-GCP**

- **Maintaining previous behavior:** If you want to keep the exact same behavior as before this change (GAX channel pool with 4 default channels).
- **Troubleshooting**: If you experience any unexpected behavior, disabling gRPC-GCP can help isolate whether the issue is related to the channel pooling mechanism.

* chore(main): release 6.104.1-SNAPSHOT (#4248)

:robot: I have created a release *beep* *boop*
---


### Updating meta-information for bleeding-edge SNAPSHOT release.

---
This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).

* feat: add support of dynamic channel pooling (#4265)

* feat: add support of dynamic channel pooling

* set initial pool size to 0 make dynamic channel pool work

* chore: generate libraries at Tue Dec 16 06:00:50 UTC 2025

* bump grpc-gcp-java version

* fix test

* incorporate suggestions

* chore: generate libraries at Tue Dec 16 09:56:45 UTC 2025

* make dynamic channel pool default disabled

* add verifySameChannelId back and fix partitionDML test

* support setting GcpChannelPoolOptions directly

---------

Co-authored-by: cloud-java-bot <[email protected]>

* chore(main): release 6.105.0 (#4267)

* chore(main): release 6.105.0

* chore: generate libraries at Tue Dec 16 17:46:24 UTC 2025

---------

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
Co-authored-by: cloud-java-bot <[email protected]>

* chore(main): release 6.105.1-SNAPSHOT (#4268)

:robot: I have created a release *beep* *boop*
---


### Updating meta-information for bleeding-edge SNAPSHOT release.

---
This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).

* chore: update manifest

---------

Co-authored-by: surbhigarg92 <[email protected]>
Co-authored-by: Knut Olav Løite <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: rahul2393 <[email protected]>
Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
Co-authored-by: cloud-java-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the googleapis/java-spanner API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants