Commit 4d47ac4

yaooqinn authored and cloud-fan committed
[SPARK-33705][SQL][TEST] Fix HiveThriftHttpServerSuite flakiness
### What changes were proposed in this pull request?

This PR fixes the following flaky tests: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/132345/testReport/

```
org.apache.spark.sql.hive.thriftserver.HiveThriftHttpServerSuite.JDBC query execution
org.apache.spark.sql.hive.thriftserver.HiveThriftHttpServerSuite.Checks Hive version
org.apache.spark.sql.hive.thriftserver.HiveThriftHttpServerSuite.SPARK-24829 Checks cast as float
```

The root cause is a jar conflict: `NewCookie.isHttpOnly` is not defined in the conflicting `jsr311-api.jar`.

The transitive artifact `jsr311-api.jar` of `hadoop-client` is excluded on the Maven side. See https://issues.apache.org/jira/browse/SPARK-27179.

The Jenkins PR builder and GitHub Actions use SBT as the build tool. First, sbt does not follow the exclusion rule from Maven, so I could see `jsr311-api.jar` from the Maven cache being added to the classpath directly. **This seems to be a bug in the `sbt-pom-reader` plugin, but I'm not sure.**

Then I added an `ExclusionRule` for the `hive-thriftserver` module on the SBT side and saw `jsr311-api.jar` disappear from that classpath, but the CI jobs still failed with the same error.

I added a trace log in ThriftHttpServlet:

```
ERROR ThriftHttpServlet: !!!!!!!!! Suspect???????? ---> file:/home/jenkins/workspace/SparkPullRequestBuilder/assembly/target/scala-2.12/jars/jsr311-api-1.1.1.jar
```

The log showed that the assembly phase copied the jar to `assembly/target/scala-2.12/jars/`, which is also added to the classpath.

With the help of the SBT `dependencyTree` tool, I found `jsr311-api` again as a transitive dependency of `jersey-core` from the `yarn` module, with `test` scope. So **this seems to be another bug on the SBT side, in the `sbt-assembly` plugin:** it copied a test-scope transitive artifact to the assembly output.

In this PR, I defined some rules in SparkBuild.scala to bypass the potential bugs on the SBT side. First, exclude `jsr311` from the whole project, then add it back separately to the YARN module for SBT.

Additionally, the HiveThriftServerSuites were refactored to reduce flakiness, although that is not related to the bugs I have found so far.

### Why are the changes needed?

To fix the flaky tests listed above.

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

Passing Jenkins and GitHub Actions.

Closes #30643 from yaooqinn/HiveThriftHttpServerSuite.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
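The trace log described in the commit message reports which jar a suspect class was loaded from. The patch's actual logging code is not shown on this page, so the following is only a minimal sketch of one way to obtain that information on the JVM. The `JarLocator` object and its method names are hypothetical; `getProtectionDomain`, `getCodeSource` and `getLocation` are standard JDK calls.

```scala
// Hypothetical helper, not part of the commit: reports where a class was loaded from.
object JarLocator {

  /** Location (jar or directory URL) a class was loaded from.
    * Returns None when the JVM exposes no code source, e.g. for bootstrap classes.
    */
  def locationOf(cls: Class[_]): Option[java.net.URL] =
    for {
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)
      url    <- Option(source.getLocation)
    } yield url

  /** Human-readable line suitable for a temporary trace log. */
  def describe(cls: Class[_]): String = {
    val where = locationOf(cls).map(_.toString).getOrElse("<bootstrap or unknown>")
    s"${cls.getName} loaded from $where"
  }
}
```

In a servlet one might log `JarLocator.describe(classOf[javax.ws.rs.core.NewCookie])` to see whether the class comes from `jsr311-api` or from a JAX-RS 2 jar, assuming that class is on the classpath.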
1 parent b135db3 commit 4d47ac4

12 files changed: +156 -170 lines changed

LICENSE-binary

Lines changed: 1 addition & 1 deletion
@@ -521,7 +521,6 @@ Common Development and Distribution License (CDDL) 1.1
 ------------------------------------------------------

 javax.el:javax.el-api https://javaee.github.io/uel-ri/
-javax.servlet:javax.servlet-api https://javaee.github.io/servlet-spec/
 javax.servlet.jsp:jsp-api
 javax.transaction:jta http://www.oracle.com/technetwork/java/index.html
 javax.xml.bind:jaxb-api https://github.com/javaee/jaxb-v2
@@ -553,6 +552,7 @@ Eclipse Public License (EPL) 2.0
 --------------------------------

 jakarta.annotation:jakarta-annotation-api https://projects.eclipse.org/projects/ee4j.ca
+jakarta.servlet:jakarta.servlet-api https://projects.eclipse.org/projects/ee4j.servlet
 jakarta.ws.rs:jakarta.ws.rs-api https://github.com/eclipse-ee4j/jaxrs-api
 org.glassfish.hk2.external:jakarta.inject

core/pom.xml

Lines changed: 3 additions & 3 deletions
@@ -161,9 +161,9 @@
       <scope>compile</scope>
     </dependency>
     <dependency>
-      <groupId>javax.servlet</groupId>
-      <artifactId>javax.servlet-api</artifactId>
-      <version>${javaxservlet.version}</version>
+      <groupId>jakarta.servlet</groupId>
+      <artifactId>jakarta.servlet-api</artifactId>
+      <version>${jakartaservlet.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.commons</groupId>

dev/deps/spark-deps-hadoop-2.7-hive-2.3

Lines changed: 1 addition & 1 deletion
@@ -119,14 +119,14 @@ jackson-xc/1.9.13//jackson-xc-1.9.13.jar
 jakarta.activation-api/1.2.1//jakarta.activation-api-1.2.1.jar
 jakarta.annotation-api/1.3.5//jakarta.annotation-api-1.3.5.jar
 jakarta.inject/2.6.1//jakarta.inject-2.6.1.jar
+jakarta.servlet-api/4.0.3//jakarta.servlet-api-4.0.3.jar
 jakarta.validation-api/2.0.2//jakarta.validation-api-2.0.2.jar
 jakarta.ws.rs-api/2.1.6//jakarta.ws.rs-api-2.1.6.jar
 jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar
 janino/3.0.16//janino-3.0.16.jar
 javassist/3.25.0-GA//javassist-3.25.0-GA.jar
 javax.inject/1//javax.inject-1.jar
 javax.jdo/3.2.0-m3//javax.jdo-3.2.0-m3.jar
-javax.servlet-api/3.1.0//javax.servlet-api-3.1.0.jar
 javolution/5.5.1//javolution-5.5.1.jar
 jaxb-api/2.2.2//jaxb-api-2.2.2.jar
 jaxb-runtime/2.3.2//jaxb-runtime-2.3.2.jar

dev/deps/spark-deps-hadoop-3.2-hive-2.3

Lines changed: 1 addition & 0 deletions
@@ -118,6 +118,7 @@ jackson-module-scala_2.12/2.11.4//jackson-module-scala_2.12-2.11.4.jar
 jakarta.activation-api/1.2.1//jakarta.activation-api-1.2.1.jar
 jakarta.annotation-api/1.3.5//jakarta.annotation-api-1.3.5.jar
 jakarta.inject/2.6.1//jakarta.inject-2.6.1.jar
+jakarta.servlet-api/4.0.3//jakarta.servlet-api-4.0.3.jar
 jakarta.validation-api/2.0.2//jakarta.validation-api-2.0.2.jar
 jakarta.ws.rs-api/2.1.6//jakarta.ws.rs-api-2.1.6.jar
 jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar

pom.xml

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@
     <parquet.version>1.10.1</parquet.version>
     <orc.version>1.6.6</orc.version>
     <jetty.version>9.4.28.v20200408</jetty.version>
-    <javaxservlet.version>3.1.0</javaxservlet.version>
+    <jakartaservlet.version>4.0.3</jakartaservlet.version>
     <chill.version>0.9.5</chill.version>
     <ivy.version>2.4.0</ivy.version>
     <oro.version>2.0.8</oro.version>

project/SparkBuild.scala

Lines changed: 26 additions & 1 deletion
@@ -395,6 +395,8 @@ object SparkBuild extends PomBuild {

   enable(KubernetesIntegrationTests.settings)(kubernetesIntegrationTests)

+  enable(YARN.settings)(yarn)
+
   /**
    * Adds the ability to run the spark shell directly from SBT without building an assembly
    * jar.
@@ -654,7 +656,21 @@ object DependencyOverrides {
  */
 object ExcludedDependencies {
   lazy val settings = Seq(
-    libraryDependencies ~= { libs => libs.filterNot(_.name == "groovy-all") }
+    libraryDependencies ~= { libs => libs.filterNot(_.name == "groovy-all") },
+    // SPARK-33705: Due to sbt compiler issues, it brings exclusions defined in maven pom back to
+    // the classpath directly and assemble test scope artifacts to assembly/target/scala-xx/jars,
+    // which is also will be added to the classpath of some unit tests that will build a subprocess
+    // to run `spark-submit`, e.g. HiveThriftServer2Test.
+    //
+    // These artifacts are for the jersey-1 API but Spark use jersey-2 ones, so it cause test
+    // flakiness w/ jar conflicts issues.
+    //
+    // Also jersey-1 is only used by yarn module(see resource-managers/yarn/pom.xml) for testing
+    // purpose only. Here we exclude them from the whole project scope and add them w/ yarn only.
+    excludeDependencies ++= Seq(
+      ExclusionRule(organization = "com.sun.jersey"),
+      ExclusionRule("javax.servlet", "javax.servlet-api"),
+      ExclusionRule("javax.ws.rs", "jsr311-api"))
   )
 }

@@ -758,6 +774,15 @@ object Hive {
   )
 }

+object YARN {
+  lazy val settings = Seq(
+    excludeDependencies --= Seq(
+      ExclusionRule(organization = "com.sun.jersey"),
+      ExclusionRule("javax.servlet", "javax.servlet-api"),
+      ExclusionRule("javax.ws.rs", "jsr311-api"))
+  )
+}
+
 object Assembly {
   import sbtassembly.AssemblyUtils._
   import sbtassembly.AssemblyPlugin.autoImport._
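
The SparkBuild.scala hunks above append three exclusion rules project-wide with `++=` and then remove the same three rules in the YARN module with `--=`. The net effect can be illustrated with a small plain-Scala model that has no sbt dependency. The `Rule` case class and the `ExclusionModel` object are hypothetical stand-ins, not sbt types, and treating `--=` as a sequence difference is an assumption made for illustration; sbt's own implementation may differ in detail.

```scala
// Hypothetical stand-in for sbt's ExclusionRule; "*" means "any artifact name".
final case class Rule(organization: String, name: String = "*")

object ExclusionModel {
  // The three rules that appear in both the `++=` and the `--=` hunk.
  val jersey1Rules: Seq[Rule] = Seq(
    Rule("com.sun.jersey"),
    Rule("javax.servlet", "javax.servlet-api"),
    Rule("javax.ws.rs", "jsr311-api"))

  // Mirrors `excludeDependencies ++= Seq(...)` applied to every module.
  def projectWide(existing: Seq[Rule]): Seq[Rule] = existing ++ jersey1Rules

  // Mirrors `excludeDependencies --= Seq(...)` in the YARN module
  // (modelled here as a sequence difference, which is an assumption).
  def yarnOnly(inherited: Seq[Rule]): Seq[Rule] = inherited.diff(jersey1Rules)

  // True when some rule matches the given organization and artifact name.
  def isExcluded(rules: Seq[Rule], org: String, name: String): Boolean =
    rules.exists(r => r.organization == org && (r.name == "*" || r.name == name))
}
```

Under this model, a module that only receives `projectWide` excludes `jsr311-api`, while the YARN module, which additionally applies `yarnOnly`, does not. This matches the intent stated in the code comment: exclude the jersey-1 artifacts everywhere and add them back for YARN only.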

resource-managers/yarn/pom.xml

Lines changed: 0 additions & 7 deletions
@@ -88,13 +88,6 @@
       <artifactId>hadoop-client</artifactId>
     </dependency>

-    <dependency>
-      <groupId>jakarta.servlet</groupId>
-      <artifactId>jakarta.servlet-api</artifactId>
-      <version>4.0.3</version>
-      <scope>test</scope>
-    </dependency>
-
     <!-- Explicit listing of transitive deps that are shaded. Otherwise, odd compiler crashes. -->
     <dependency>
       <groupId>com.google.guava</groupId>
