-
Notifications
You must be signed in to change notification settings - Fork 6.7k
Jenkins failing on Scala GPU #13080
Description
There is a failure occurring on the Jenkins in the Scala GPU task. The failure is occurring when running "make scalapkg" to build the core module.
One sample error ending is:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.0:test (default-test) on project mxnet-core_2.11: There are test failures.
[ERROR]
[ERROR] Please refer to /work/mxnet/scala-package/core/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date]-jvmRun[N].dump, [date].dumpstream and [date]-jvmRun[N].dumpstream.
[ERROR] The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /work/mxnet/scala-package/core && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -jar /work/mxnet/scala-package/core/target/surefire/surefirebooter1191780993849458203.jar /work/mxnet/scala-package/core/target/surefire 2018-11-01T15-57-55_076-jvmRun1 surefire2207349951215643963tmp surefire_05921002131138985800tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /work/mxnet/scala-package/core && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -jar /work/mxnet/scala-package/core/target/surefire/surefirebooter1191780993849458203.jar /work/mxnet/scala-package/core/target/surefire 2018-11-01T15-57-55_076-jvmRun1 surefire2207349951215643963tmp surefire_05921002131138985800tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:671)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:533)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:278)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:244)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1194)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1022)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:868)
[ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
[ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
[ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
[ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
[ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
[ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :mxnet-core_2.11
make: *** [scalapkg] Error 1
Makefile:606: recipe for target 'scalapkg' failed
build.py: 2018-11-01 15:57:55,920 Waiting for status of container a9ca51005111 for 600 s.
build.py: 2018-11-01 15:57:56,101 Container exit status: {'Error': None, 'StatusCode': 2}
build.py: 2018-11-01 15:57:56,101 Stopping container: a9ca51005111
build.py: 2018-11-01 15:57:56,103 Removing container: a9ca51005111
build.py: 2018-11-01 15:57:56,230 Execution of ['/work/runtime_functions.sh', 'integrationtest_ubuntu_gpu_scala'] failed with status: 2
script returned exit code 2
We have identified a number of Jenkins runs that produced this:
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-13077/1/pipeline
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-13071/1/pipeline
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-13052/2/pipeline
The problem has also been observed on the dev Jenkins:
http://jenkins.mxnet-ci-dev.amazon-ml.com/blue/organizations/jenkins/restricted-publish-artifacts/detail/automate-maven/61/pipeline