SPARK-4044 [CORE] Thriftserver fails to start when JAVA_HOME points to JRE instead of JDK #4873
Conversation
Test build #28232 has started for PR 4873 at commit

Test build #28232 has finished for PR 4873 at commit

Test PASSed.
bin/compute-classpath.sh
Hmm... not sure about this one. The thrift server probably won't run, in many cases, if the datanucleus jars are not added to its classpath, and neither will apps that use HiveContext with a local metastore, for example. Perhaps this should be an error instead? Or maybe try to be smart and look at $JAVA_HOME/../bin/jar too?
(BTW, shameless plug: SPARK-4924 fixes this. But it's not in 1.3.)
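For illustration only, a fallback along the lines suggested above might look like the sketch below; the variable names, paths, and structure are assumptions, not the actual compute-classpath.sh code.

```bash
# Sketch, assuming JAR_CMD is the variable used to invoke `jar`:
# prefer $JAVA_HOME/bin/jar, fall back to the JDK one level above a
# nested JRE-style JAVA_HOME, then to whatever `jar` is on the PATH.
if [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/bin/jar" ]; then
  JAR_CMD="$JAVA_HOME/bin/jar"
elif [ -n "$JAVA_HOME" ] && [ -x "$JAVA_HOME/../bin/jar" ]; then
  # JAVA_HOME points at the jre/ directory inside a JDK; bin/jar sits one level up
  JAR_CMD="$JAVA_HOME/../bin/jar"
elif command -v jar >/dev/null 2>&1; then
  JAR_CMD="jar"
else
  JAR_CMD=""   # no JDK available; callers must skip any archive inspection
fi
```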
Yeah, this upgrades silence to a warning; upgrading it to an error, however, means that a JDK is suddenly required to run the thrift server et al., which does not seem like something that should be required (production environments usually disallow JDKs). However, yes, my question is: why not just always add these JARs if they're present and there's any doubt? They could be added in the else block, or, for that matter, why bother inspecting what is built into the assembly anyway?
No complaints from me about always adding the jars. We could get rid of that annoying message ("Assembly was built with hive") in the process.
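As a rough sketch of what "always adding the jars" could look like (the lib_managed/jars location, SPARK_HOME, and CLASSPATH variable are assumptions for illustration, not necessarily the script's actual layout):

```bash
# Sketch: append any datanucleus-*.jar found under lib_managed/jars to the
# classpath whenever they exist, instead of first inspecting the assembly
# with `jar` to decide whether it was built with Hive.
datanucleus_dir="$SPARK_HOME/lib_managed/jars"
if [ -d "$datanucleus_dir" ]; then
  datanucleus_jars="$(find "$datanucleus_dir" -name 'datanucleus-*.jar' 2>/dev/null | tr '\n' ':')"
  datanucleus_jars="${datanucleus_jars%:}"   # drop the trailing separator
  if [ -n "$datanucleus_jars" ]; then
    CLASSPATH="$CLASSPATH:$datanucleus_jars"
  fi
fi
```

Whether stray datanucleus jars cause any problem when the assembly was not built with Hive is exactly the question raised further down.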
…e datanucleus if present
Sorry to bug @pwendell again, but I think you may also be familiar with this script. I went to the extreme and removed the check for Hive jars entirely: Datanucleus goes on the classpath if it exists, full stop. This also resolves the JAR issue. But is there a reason that's a bad idea? For instance, if I didn't build with Hive, but Datanucleus is lying around, does that cause a problem?
Test build #28396 has started for PR 4873 at commit

Test build #28396 has finished for PR 4873 at commit

Test PASSed.
Ah right, this is obsoleted for
…o JRE instead of JDK

Don't use JAR_CMD unless present in archive check. Add datanucleus always if present, to avoid needing a check involving JAR_CMD. Follow up to #4873 for branch 1.3.

Author: Sean Owen <[email protected]>

Closes #4981 from srowen/SPARK-4044.2 and squashes the following commits:

3aafc76 [Sean Owen] Don't use JAR_CMD unless present in archive check. Add datanucleus always if present, to avoid needing a check involving JAR_CMD
So, I think it would be a step too far to tell people they have to run Spark with a JDK instead of a JRE. It's uncommon to put a JDK in production. So this runtime script needs to cope with not having `jar` available. I've added explicit checks to the two cases where it's used, but yes, it's only the second datanucleus case that matters.

At least there's an error message now, but how about just adding the jars anyway, if they're present, in this case?
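A minimal sketch of that kind of guard, assuming JAR_CMD and ASSEMBLY_JAR are the names the script uses (illustrative only, not the merged change itself):

```bash
# Sketch: only inspect the assembly when a usable `jar` command exists;
# on a plain JRE, emit a warning instead of failing outright.
if [ -n "$JAR_CMD" ] && command -v "$JAR_CMD" >/dev/null 2>&1; then
  if "$JAR_CMD" -tf "$ASSEMBLY_JAR" org/apache/hadoop/hive/ql/exec 2>/dev/null | grep -q .; then
    echo "Assembly was built with hive" 1>&2
  fi
else
  echo "Warning: no 'jar' command found (JAVA_HOME may point to a JRE)," \
       "so the assembly cannot be inspected" 1>&2
fi
```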
CC @JoshRosen