feat(services/gcs): add configuration aliases for better Arrow object_store compatibility #6526

Merged
Xuanwo merged 3 commits into apache:main from jackye1995:gcs-configs on Aug 25, 2025
@jackye1995 (Contributor) commented Aug 23, 2025

Summary

This PR adds serde aliases to GcsConfig fields to support multiple configuration key naming conventions, enabling seamless migration from Arrow object_store and other file system libraries. Users can now use either OpenDAL's native field names or Google-prefixed alternatives.

Changes

Added #[serde(alias = "...")] attributes to five GCS configuration fields, using the same key names that Apache Arrow's object_store library accepts.

Example Usage

// Original OpenDAL style (still works)
{
  "bucket": "my-bucket",
  "service_account": "/path/to/sa.json",
  "credential": "service-account-key-json"
}

// Arrow object_store style with Google-prefixed keys (now supported)
{
  "google_bucket": "my-bucket",
  "google_service_account": "/path/to/sa.json",
  "google_service_account_key": "service-account-key-json"
}

Added Aliases

| OpenDAL Field | Added Aliases | Arrow Reference |
| --- | --- | --- |
| bucket | google_bucket, google_bucket_name, bucket_name | L156-L160 |
| service_account | google_service_account, google_service_account_path, service_account_path | L136-L142 |
| credential | google_service_account_key, service_account_key | L146-L149 |
| credential_path | google_application_credentials | L164-L165 |
| allow_anonymous | google_skip_signature, skip_signature | L167-L168 |
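
To make the mapping above concrete, here is a minimal, standard-library-only sketch of what alias resolution amounts to. It is not OpenDAL's implementation: the PR declares these aliases with #[serde(alias = "...")] attributes on GcsConfig, and serde does the lookup during deserialization. The names GCS_ALIASES, canonical_key, and normalize are hypothetical and exist only for illustration. Rejecting a field that is set twice under different names is an assumption of this sketch, not a behavior stated in the PR.

```rust
use std::collections::HashMap;

/// Alias table copied from the PR description: (OpenDAL field, accepted aliases).
/// In the PR itself this mapping is declared with serde alias attributes.
const GCS_ALIASES: &[(&str, &[&str])] = &[
    ("bucket", &["google_bucket", "google_bucket_name", "bucket_name"]),
    (
        "service_account",
        &[
            "google_service_account",
            "google_service_account_path",
            "service_account_path",
        ],
    ),
    ("credential", &["google_service_account_key", "service_account_key"]),
    ("credential_path", &["google_application_credentials"]),
    ("allow_anonymous", &["google_skip_signature", "skip_signature"]),
];

/// Hypothetical helper: returns the OpenDAL field name for a key from the
/// table, or None if the key is neither a listed field nor a listed alias.
fn canonical_key(key: &str) -> Option<&'static str> {
    for (canonical, aliases) in GCS_ALIASES {
        if *canonical == key || aliases.contains(&key) {
            return Some(*canonical);
        }
    }
    None
}

/// Hypothetical helper: rewrites recognized keys to their OpenDAL names and
/// passes unknown keys through unchanged.
/// Assumption: setting one field under two different names is an error.
fn normalize(options: &HashMap<String, String>) -> Result<HashMap<String, String>, String> {
    let mut out = HashMap::new();
    for (key, value) in options {
        let name = canonical_key(key).unwrap_or(key.as_str()).to_string();
        if out.insert(name.clone(), value.clone()).is_some() {
            return Err(format!("field `{}` was set more than once", name));
        }
    }
    Ok(out)
}
```

Under these assumptions, a map holding google_bucket and google_service_account_key normalizes to one holding bucket and credential, while a map that sets both bucket and bucket_name is rejected.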

Missing Configurations (Future Work)

These Arrow object_store configurations are not yet available in OpenDAL and could be added in future PRs:

HTTP Client Configuration

| Arrow Configuration | Aliases | Description |
| --- | --- | --- |
| AllowHttp | google_allow_http, allow_http | Allow non-TLS, i.e. non-HTTPS connections |
| AllowInvalidCertificates | google_allow_invalid_certificates, allow_invalid_certificates | Skip certificate validation on HTTPS connections |
| ConnectTimeout | google_connect_timeout, connect_timeout | Timeout for only the connect phase of a client |
| DefaultContentType | google_default_content_type, default_content_type | Default CONTENT_TYPE for uploads |
| Http1Only | google_http1_only, http1_only | Only use HTTP/1 connections |
| Http2KeepAliveInterval | google_http2_keep_alive_interval, http2_keep_alive_interval | Interval for HTTP2 Ping frames to keep connection alive |
| Http2KeepAliveTimeout | google_http2_keep_alive_timeout, http2_keep_alive_timeout | Timeout for receiving keep-alive ping acknowledgement |
| Http2KeepAliveWhileIdle | google_http2_keep_alive_while_idle, http2_keep_alive_while_idle | Enable HTTP2 keep alive pings for idle connections |
| Http2MaxFrameSize | google_http2_max_frame_size, http2_max_frame_size | Sets the maximum frame size for HTTP2 |
| Http2Only | google_http2_only, http2_only | Only use HTTP/2 connections |
| PoolIdleTimeout | google_pool_idle_timeout, pool_idle_timeout | The pool max idle timeout |
| PoolMaxIdlePerHost | google_pool_max_idle_per_host, pool_max_idle_per_host | Maximum number of idle connections per host |
| ProxyUrl | google_proxy_url, proxy_url | HTTP proxy to use for requests |
| ProxyCaCertificate | google_proxy_ca_certificate, proxy_ca_certificate | PEM-formatted CA certificate for proxy connections |
| ProxyExcludes | google_proxy_excludes, proxy_excludes | List of hosts that bypass proxy |
| RandomizeAddresses | google_randomize_addresses, randomize_addresses | Randomize DNS resolution order for load balancing |
| Timeout | google_timeout, timeout | Request timeout (connection to response completion) |
| UserAgent | google_user_agent, user_agent | User-Agent header to be used by this client |

Notes on Missing Configurations

  • HTTP Client Options: OpenDAL's GCS service doesn't currently expose low-level HTTP client configurations. These could be valuable for users who need fine-grained control over connection behavior, timeouts, and proxy settings.
  • Proxy Support: While OpenDAL may have global proxy support, GCS-specific proxy configuration aliases are not available.
  • HTTP Version Control: Arrow object_store allows forcing HTTP/1 or HTTP/2, which could be useful for compatibility with different GCS endpoints.

Testing

  • ✅ Added comprehensive unit tests covering all aliases
  • ✅ All existing GCS tests pass
  • ✅ Code compiles without warnings
  • ✅ Clippy passes without issues

Backward Compatibility

This change is fully backward compatible. All existing configurations continue to work exactly as before, with the new aliases providing additional flexibility.

Similar to #6524

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@jackye1995 requested a review from Xuanwo as a code owner on August 23, 2025 04:08
@dosubot added the labels size:M (This PR changes 30-99 lines, ignoring generated files.) and releases-note/feat (The PR implements a new feature or has a title that begins with "feat") on Aug 23, 2025
jackye1995 and others added 2 commits August 22, 2025 21:14
- Split large test_config_aliases into 5 focused test functions
- Each test function covers aliases for a specific config field
- Added missing google_service_account_path alias test
- Improved test organization and readability

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@Xuanwo (Member) left a comment:

Thank you for working on this!

@Xuanwo merged commit 8347d2f into apache:main on Aug 25, 2025
86 checks passed
@dosubot added the label lgtm (This PR has been approved by a maintainer) on Aug 25, 2025