Skip to content

Conversation

@jadewang-db
Copy link
Contributor

Description

This PR implements retry-after behavior for the Spark ADBC driver when receiving 503 responses with Retry-After headers. This is particularly useful for Databricks clusters that may return 503 responses when a cluster is starting up or experiencing temporary unavailability.

Changes

  • Added new configuration parameters:
    • adbc.spark.temporarily_unavailable_retry (default: 1 - enabled)
    • adbc.spark.temporarily_unavailable_retry_timeout (default: 900 seconds)
  • Created a RetryHttpHandler class that wraps the existing HttpClientHandler to handle 503 responses
  • Modified SparkHttpConnection to use the new retry handler
  • Added comprehensive unit tests for the retry behavior

Implementation Details

When a 503 response with a Retry-After header is received:

  1. The handler will wait for the number of seconds specified in the header
  2. It will then retry the request
  3. If another 503 response is received, it will continue retrying
  4. If the total retry time exceeds the configured timeout, it will fail with an appropriate error message

Testing

Added unit tests to verify:

  • Retry behavior for 503 responses with Retry-After headers
  • Timeout behavior when retry time exceeds the configured limit
  • Handling of invalid or missing Retry-After headers
  • Disabling retry behavior via configuration
  • Parameter validationv

@github-actions github-actions bot added this to the ADBC Libraries 18 milestone Mar 31, 2025
Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change! I've left feedback, some more optional than others.

@davidhcoe
Copy link
Contributor

Can we hold off submitting this until a decision is made on #2672?

@jadewang-db jadewang-db force-pushed the handle-retry-after branch 2 times, most recently from 6906845 to 0b4526b Compare April 8, 2025 23:38
Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change! Please make sure the linter doesn't complain about whitespace issues and that the retry flag is properly handled.

@CurtHagenlocher
Copy link
Contributor

There are some merge conflicts, so this will probably need to be rebased.

Update DatabricksConnection.cs

Update RetryHttpHandlerTest.cs

Add retry after handling in ADBC Spark driver

fix pre commit check failures

fix build error

address PR comments

fix linter

lint

Update DatabricksConnection.cs

address comments
Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@CurtHagenlocher CurtHagenlocher merged commit 30626e5 into apache:main Apr 22, 2025
6 checks passed
colin-rogers-dbt pushed a commit to dbt-labs/arrow-adbc that referenced this pull request Jun 10, 2025
…C driver (apache#2664)

## Description

This PR implements retry-after behavior for the Spark ADBC driver when
receiving 503 responses with Retry-After headers. This is particularly
useful for Databricks clusters that may return 503 responses when a
cluster is starting up or experiencing temporary unavailability.

## Changes

- Added new configuration parameters:
  - `adbc.spark.temporarily_unavailable_retry` (default: 1 - enabled)
- `adbc.spark.temporarily_unavailable_retry_timeout` (default: 900
seconds)
- Created a `RetryHttpHandler` class that wraps the existing
`HttpClientHandler` to handle 503 responses
- Modified `SparkHttpConnection` to use the new retry handler
- Added comprehensive unit tests for the retry behavior

## Implementation Details

When a 503 response with a Retry-After header is received:
1. The handler will wait for the number of seconds specified in the
header
2. It will then retry the request
3. If another 503 response is received, it will continue retrying
4. If the total retry time exceeds the configured timeout, it will fail
with an appropriate error message

## Testing

Added unit tests to verify:
- Retry behavior for 503 responses with Retry-After headers
- Timeout behavior when retry time exceeds the configured limit
- Handling of invalid or missing Retry-After headers
- Disabling retry behavior via configuration
- Parameter validationv
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants