
Conversation

@srowen
Member

@srowen srowen commented Mar 3, 2015

So, I think it would be a step too far to tell people they have to run Spark with a JDK instead of a JRE. It's uncommon to put a JDK in production. So this runtime script needs to cope with not having jar available.

I've added explicit checks to the two cases where it's used, but yes it's only the second datanucleus case that matters.

At least there's an error message now, but, how about just adding the jars anyway, if they're present, in this case?

CC @JoshRosen
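A minimal sketch of the kind of guard being described, i.e. locating a usable `jar` command while tolerating a JRE-only install (the function name and fallback order here are illustrative assumptions, not the actual script's logic):

```shell
#!/bin/sh
# Hypothetical sketch: find a usable `jar` command, coping with a
# JRE-only installation where none exists. Not the real script logic.
find_jar_cmd() {
  if [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/bin/jar" ]; then
    # JAVA_HOME points at a JDK: use its bundled jar tool.
    echo "$JAVA_HOME/bin/jar"
  elif command -v jar >/dev/null 2>&1; then
    # Fall back to whatever jar is on the PATH.
    command -v jar
  else
    # No jar available; caller must skip any archive inspection.
    echo ""
  fi
}

JAR_CMD=$(find_jar_cmd)
if [ -z "$JAR_CMD" ]; then
  echo "WARN: 'jar' not found (JRE-only install?); skipping assembly inspection" >&2
fi
```

The point is that the script degrades to a warning rather than failing outright when only a JRE is present.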

@SparkQA

SparkQA commented Mar 3, 2015

Test build #28232 has started for PR 4873 at commit 471be87.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 3, 2015

Test build #28232 has finished for PR 4873 at commit 471be87.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28232/

Contributor


Hmm... not sure about this one. The thrift server probably won't run, in many cases, if the datanucleus jars are not added to its classpath. Neither will apps that use HiveContext with a local metastore, for example. Perhaps this should be an error instead? Or maybe try to be smart and look at $JAVA_HOME/../bin/jar too?

(BTW, shameless plug: SPARK-4924 fixes this. But it's not in 1.3.)
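The `$JAVA_HOME/../bin/jar` idea works because a JDK ships its JRE in a `jre/` subdirectory, so when JAVA_HOME points at that bundled JRE, the JDK's own `bin/` is one level up. A hedged sketch of that probe (illustrative names, not the actual patch):

```shell
#!/bin/sh
# Sketch: if JAVA_HOME points at the JRE bundled inside a JDK
# ($JDK_HOME/jre), the JDK's bin/ directory is one level up.
probe_parent_jar() {
  candidate="$JAVA_HOME/../bin/jar"
  if [ -x "$candidate" ]; then
    echo "$candidate"
  fi
}
```

This only helps for JRE-inside-JDK layouts; a standalone JRE install still has no `jar` anywhere nearby.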

Member Author


Yeah, this upgrades silence to a warning; upgrading it to an error, however, means that a JDK is suddenly required to run the thrift server et al., which does not seem like something that should be required (production environments usually disallow JDKs). However, yes, my question is: why not just always add these JARs if there's doubt and they're present? They could be added in the else block, or, why bother inspecting what is built into the assembly anyway?
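The "just always add them if present" approach could look roughly like this (the directory, glob, and function name are assumptions for illustration, not the real classpath script):

```shell
#!/bin/sh
# Sketch: append any datanucleus jars found under SPARK_HOME to the
# classpath unconditionally, so no `jar`-based assembly inspection
# is needed. Paths are illustrative assumptions.
add_datanucleus_jars() {
  datanucleus_dir="$SPARK_HOME/lib_managed/jars"
  for j in "$datanucleus_dir"/datanucleus-*.jar; do
    # If the glob matched nothing, skip the literal pattern.
    [ -e "$j" ] || continue
    CLASSPATH="$CLASSPATH:$j"
  done
  echo "$CLASSPATH"
}
```

If the jars are absent nothing is added, and if they are present but unused they sit harmlessly on the classpath, which is the trade-off being debated here.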

Contributor


No complaints from me about always adding the jars. We could get rid of that annoying message ("Assembly was built with hive") in the process.

@srowen
Member Author

srowen commented Mar 9, 2015

Sorry to bug @pwendell again but I think you may also be familiar with this script. I went to the extreme and removed the check for Hive jars entirely. Datanucleus goes on the classpath if it exists, full stop. This also resolves the JAR issue. But is there a reason that's a bad idea? Like, if I didn't build with Hive, but Datanucleus is lying around, does that cause a problem?

@SparkQA

SparkQA commented Mar 9, 2015

Test build #28396 has started for PR 4873 at commit 18b53a0.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 9, 2015

Test build #28396 has finished for PR 4873 at commit 18b53a0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28396/

@srowen
Member Author

srowen commented Mar 11, 2015

Ah right, this is obsoleted for master by SPARK-4924. I'll reopen for branch-1.3 and possible back-porting.

@srowen srowen closed this Mar 11, 2015
@srowen srowen deleted the SPARK-4044 branch March 11, 2015 13:38
asfgit pushed a commit that referenced this pull request Mar 13, 2015
…o JRE instead of JDK

Don't use JAR_CMD unless present in archive check. Add datanucleus always if present, to avoid needing a check involving JAR_CMD.

Follow up to #4873 for branch 1.3.

Author: Sean Owen <[email protected]>

Closes #4981 from srowen/SPARK-4044.2 and squashes the following commits:

3aafc76 [Sean Owen] Don't use JAR_CMD unless present in archive check. Add datanucleus always if present, to avoid needing a check involving JAR_CMD
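The first half of that commit message ("Don't use JAR_CMD unless present in archive check") might be sketched as follows; the function name, grep pattern, and `tf` invocation are illustrative assumptions rather than the committed code:

```shell
#!/bin/sh
# Sketch: inspect the assembly for Hive classes only when a `jar`
# command was actually found; otherwise skip the check entirely
# rather than failing on a JRE-only install. Names are illustrative.
assembly_has_hive() {
  assembly_jar="$1"
  if [ -n "$JAR_CMD" ] && "$JAR_CMD" tf "$assembly_jar" | grep -q "org/apache/hadoop/hive"; then
    return 0
  fi
  return 1
}
```

Combined with adding datanucleus unconditionally, this removes the last hard dependency on having a JDK at runtime.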