@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Sep 15, 2025

What changes were proposed in this pull request?

This PR aims to add a new profile, huaweicloud-provided.

Why are the changes needed?

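Since Apache Spark 4.0.0, Apache Spark modules have been moving from the OkHttp implementation to the Vert.x implementation via the following.

- apache/spark#49159
- SPARK-37687 Cleanup direct usage of OkHttpClient

Like the Apache Hadoop community, we are moving further away from the okhttp transitive dependencies pulled in by the hadoop-huaweicloud dependency.

- HADOOP-18503 Upgrade Huawei OBS client to 3.22.3.1
- HADOOP-18890 Remove okhttp usage

This PR allows users to exclude the bundled huaweicloud dependency and its transitive dependencies, and to add their own versions instead. Technically, the scope of the following dependencies is changed to provided. As a result, they are removed from the Spark distribution.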

-esdk-obs-java/3.20.4.2//esdk-obs-java-3.20.4.2.jar
-hadoop-huaweicloud/3.4.2//hadoop-huaweicloud-3.4.2.jar
-java-xmlbuilder/1.2//java-xmlbuilder-1.2.jar
-okhttp/3.12.12//okhttp-3.12.12.jar
-okio/1.17.6//okio-1.17.6.jar

Does this PR introduce any user-facing change?

No, this is a new profile which is disabled by default.

How was this patch tested?

Checked manually as follows.

$ mvn dependency:tree -Phadoop-cloud | grep okhttp
[INFO] +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile
[INFO] |  +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile

$ mvn dependency:tree -Phadoop-cloud -Phuaweicloud-provided | grep okhttp
[INFO] +- com.squareup.okhttp3:okhttp:jar:3.12.12:provided

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the BUILD label Sep 15, 2025
@dongjoon-hyun
Member Author

Thank you, @HyukjinKwon . Merged to master.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-53590 branch September 15, 2025 22:48
dongjoon-hyun added a commit to apache/spark-kubernetes-operator that referenced this pull request Sep 19, 2025
…c8.kubernetes.client.http.Interceptor`

### What changes were proposed in this pull request?

Like the main Apache Spark repository, this PR aims to make `Apache Spark K8s Operator` independent of the `OkHttp3` library for long-term maintainability.
- apache/spark#49159
  - [SPARK-50493 Migrate kubernetes-client from `6.x` to `7.x`](https://issues.apache.org/jira/browse/SPARK-50493)
  - [SPARK-37687 Cleanup direct usage of OkHttpClient](https://issues.apache.org/jira/browse/SPARK-37687)
- apache/spark#52346

Technically, this PR achieves that goal through the following changes.
- SPARK-53647 Use `io.fabric8.kubernetes.client.http.Interceptor` instead of `okhttp3.Interceptor`
- SPARK-53648 Use `VertxHttpClientFactory` instead of `OkHttpClientFactory`

### Why are the changes needed?

Currently, `Apache Spark K8s Operator` has a hard compile-time dependency on the `OkHttp3` library, as shown in the following code.

https://github.com/apache/spark-kubernetes-operator/blob/a04c2bb9aeee5856681f796129e2f698a38e6ac1/spark-operator/src/main/java/org/apache/spark/k8s/operator/client/KubernetesClientFactory.java#L38-L41

As of `Fabric8` v7.0.0, we should avoid `OkHttp3` because the `Fabric8` community has moved away from it, as follows.

- `VertxHttpClientFactory` is the default HTTP client factory now.
    - fabric8io/kubernetes-client#6470

- `io.fabric8.kubernetes.client.http.Interceptor` is `fabric8`'s own interceptor layer, which we should use to stay independent of the underlying HTTP client factories. We should depend on it instead of exposing `okhttp3.Interceptor`.

### Does this PR introduce _any_ user-facing change?

Yes, but Apache Spark K8s Operator versions are still 0.x releases.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #327 from dongjoon-hyun/TODO_METRICS.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025
### What changes were proposed in this pull request?

This PR aims to add a new profile, `huaweicloud-provided`.

### Why are the changes needed?

Since Apache Spark 4.0.0, Apache Spark modules have been moving from the `OkHttp` implementation to the `Vert.x` implementation via the following.
- apache#49159
- [SPARK-37687 Cleanup direct usage of OkHttpClient](https://issues.apache.org/jira/browse/SPARK-37687)

Like the Apache Hadoop community, we are moving further away from the `okhttp` transitive dependencies pulled in by the `hadoop-huaweicloud` dependency.
- [HADOOP-18503](https://issues.apache.org/jira/browse/HADOOP-18503) Upgrade Huawei OBS client to 3.22.3.1
- [HADOOP-18890](https://issues.apache.org/jira/browse/HADOOP-18890) Remove okhttp usage

This PR allows users to exclude the bundled `huaweicloud` dependency and its transitive dependencies, and to add their own versions instead. Technically, the scope of the following dependencies is changed to `provided`. As a result, they are removed from the Spark distribution.
```
-esdk-obs-java/3.20.4.2//esdk-obs-java-3.20.4.2.jar
-hadoop-huaweicloud/3.4.2//hadoop-huaweicloud-3.4.2.jar
-java-xmlbuilder/1.2//java-xmlbuilder-1.2.jar
-okhttp/3.12.12//okhttp-3.12.12.jar
-okio/1.17.6//okio-1.17.6.jar
```
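
As a rough illustration, the removal could be checked against a dependency manifest that uses the same `name/version//file.jar` layout as the lines above. This is only a sketch: `bundled_huaweicloud_jars` is a hypothetical helper name that is not part of Spark, and the manifest path in the usage comment is an assumption about where such a file lives.

```shell
# Reads a dependency manifest on stdin (one `name/version//file.jar` entry per
# line) and prints every entry that belongs to one of the five artifacts this
# PR moves to `provided` scope. Prints nothing when none of them is bundled.
# NOTE: `bundled_huaweicloud_jars` is a hypothetical helper, not a Spark tool.
bundled_huaweicloud_jars() {
  # The trailing `/` keeps similarly named artifacts from matching.
  grep -E '^(esdk-obs-java|hadoop-huaweicloud|java-xmlbuilder|okhttp|okio)/' || true
}

# Usage (the manifest path is an assumption):
#   bundled_huaweicloud_jars < dev/deps/spark-deps-hadoop-3-hive-2.3
```

Empty output would suggest that none of the five jars is listed in the manifest being checked.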

### Does this PR introduce _any_ user-facing change?

No, this is a new profile which is disabled by default.

### How was this patch tested?

Checked manually as follows.

```
$ mvn dependency:tree -Phadoop-cloud | grep okhttp
[INFO] +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile
[INFO] |  +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile

$ mvn dependency:tree -Phadoop-cloud -Phuaweicloud-provided | grep okhttp
[INFO] +- com.squareup.okhttp3:okhttp:jar:3.12.12:provided
```
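
The manual check above could also be scripted. The following is a minimal sketch, not part of this PR: `non_provided_okhttp` is a hypothetical helper name, and it assumes the scope is the last colon-separated field of each `mvn dependency:tree` line, as in the output shown above.

```shell
# Reads `mvn dependency:tree` output on stdin and prints every okhttp3/okio
# line whose scope is not `provided`. Returns 1 if any such line is found,
# 0 otherwise.
# NOTE: `non_provided_okhttp` is a hypothetical helper, not a Spark tool.
non_provided_okhttp() {
  local leaked
  # Keep only okhttp3/okio coordinates, then drop those in `provided` scope.
  leaked=$(grep -E 'com\.squareup\.(okhttp3|okio):' \
    | grep -Ev ':provided([[:space:]]|$)' || true)
  if [ -n "$leaked" ]; then
    printf '%s\n' "$leaked"
    return 1
  fi
  return 0
}

# Usage (run from a Spark source checkout):
#   mvn dependency:tree -Phadoop-cloud -Phuaweicloud-provided | non_provided_okhttp
```

A non-zero exit status would indicate that some okhttp or okio artifact is still resolved in a scope other than `provided`.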

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#52346 from dongjoon-hyun/SPARK-53590.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>