
Conversation

ash211 commented Jul 15, 2017

Two commits had merge conflicts:

  • Python Bindings for launching PySpark Jobs from the JVM
  • Submission client redesign to use a step-based builder pattern

Hong Zhiguo and others added 6 commits July 14, 2017 17:12
The conf property spark.kubernetes.shuffle.namespace is used to
specify the namespace of shuffle pods.

In normal cases, only one "shuffle daemonset" is deployed and
shared by all spark pods.

The spark driver should be able to list and watch shuffle pods
in the namespace specified by user.

Note: by default, the Spark driver pod doesn't have authority to
list and watch shuffle pods in another namespace. Some action
is needed to grant it that authority. For example, the ABAC
policy below works.

```
{
  "apiVersion": "abac.authorization.kubernetes.io/v1beta1",
  "kind": "Policy",
  "spec": {
    "group": "system:serviceaccounts",
    "namespace": "SHUFFLE_NAMESPACE",
    "resource": "pods",
    "readonly": true
  }
}
```
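For reference, pointing the driver at that namespace is just a conf entry; a minimal sketch, assuming the shared shuffle daemonset was deployed to a namespace called `spark-shuffle` (a placeholder name):

```scala
import org.apache.spark.SparkConf

// "spark-shuffle" is a placeholder; use whatever namespace the shared
// shuffle daemonset was actually deployed to.
val conf = new SparkConf()
  .set("spark.kubernetes.shuffle.namespace", "spark-shuffle")
```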
(cherry picked from commit a6291c6)
This commit tries to solve issue #359 by allowing the `spark.executor.cores` configuration key to take fractional values, e.g., 0.5 or 1.5. The value is used to specify the cpu request when creating the executor pods, which is allowed to be fractional by Kubernetes. When the value is passed to the executor process through the environment variable `SPARK_EXECUTOR_CORES`, the value is rounded up to the nearest integer as required by the `CoarseGrainedExecutorBackend`.
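As a quick illustration of the rounding described above (the helper below is illustrative, not the code in this change):

```scala
// The pod's cpu request can use the fractional value directly, but
// SPARK_EXECUTOR_CORES handed to CoarseGrainedExecutorBackend must be a
// whole number of cores, so the value is rounded up.
def executorCoresEnv(sparkExecutorCores: Double): Int =
  math.ceil(sparkExecutorCores).toInt

assert(executorCoresEnv(0.5) == 1)  // cpu request "0.5", SPARK_EXECUTOR_CORES=1
assert(executorCoresEnv(1.5) == 2)  // cpu request "1.5", SPARK_EXECUTOR_CORES=2
```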

Signed-off-by: Yinan Li <[email protected]>
(cherry picked from commit 6f6cfd6)
* Adding PySpark Submit functionality. Launching Python from JVM

* Addressing Scala idioms related to PR 351

* Removing extends Logging which was necessary for LogInfo

* Refactored code to leverage the ContainerLocalizedFileResolver

* Modified Unit tests so that they would pass

* Modified Unit Test input to pass Unit Tests

* Setup working environment for integration tests for PySpark

* Comment out Python thread logic until Jenkins has Python

* Modifying PythonExec to pass on Jenkins

* Modifying python exec

* Added unit tests to ClientV2 and refactored to include pyspark submission resources

* Modified unit test check

* Scalastyle

* PR 348 file conflicts

* Refactored unit tests and styles

* further Scala styling and logic

* Modified unit tests to be more specific towards Class in question

* Removed space delimiting for methods

* Submission client redesign to use a step-based builder pattern.

This change overhauls the underlying architecture of the submission
client, but it is intended to entirely preserve existing behavior of
Spark applications. Therefore users will find this to be an invisible
change.

The philosophy behind this design is to reconsider the breakdown of the
submission process. It operates off the abstraction of "submission
steps", which are transformation functions that take the previous state
of the driver and return the new state of the driver. The driver's state
includes its Spark configurations and the Kubernetes resources that will
be used to deploy it.

Such a refactor moves away from a features-first API design, which
considers different containers to serve a set of features. The previous
design, for example, had a container files resolver API object that
returned different resolutions of the dependencies added by the user.
However, it was up to the main Client to know how to intelligently
invoke all of those APIs. Therefore the API surface area of the file
resolver became untenably large, and it was not intuitive how it was
to be used or extended.

This design changes the encapsulation layout; every module is now
responsible for changing the driver specification directly. An
orchestrator builds the correct chain of steps and hands it to the
client, which then calls it verbatim. The main client then makes any
final modifications that put the different pieces of the driver
together, particularly to attach the driver container itself to the pod
and to apply the Spark configuration as command-line arguments.

* Don't add the init-container step if all URIs are local.

* Python arguments patch + tests + docs

* Revert "Python arguments patch + tests + docs"

This reverts commit 4533df2.

* Revert "Don't add the init-container step if all URIs are local."

This reverts commit e103225.

* Revert "Submission client redesign to use a step-based builder pattern."

This reverts commit 5499f6d.

* style changes

* space for styling

(cherry picked from commit befcf0a)

 Conflicts:
	README.md
	core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
* Submission client redesign to use a step-based builder pattern.

This change overhauls the underlying architecture of the submission
client, but it is intended to entirely preserve existing behavior of
Spark applications. Therefore users will find this to be an invisible
change.

The philosophy behind this design is to reconsider the breakdown of the
submission process. It operates off the abstraction of "submission
steps", which are transformation functions that take the previous state
of the driver and return the new state of the driver. The driver's state
includes its Spark configurations and the Kubernetes resources that will
be used to deploy it.

Such a refactor moves away from a features-first API design, which
considers different containers to serve a set of features. The previous
design, for example, had a container files resolver API object that
returned different resolutions of the dependencies added by the user.
However, it was up to the main Client to know how to intelligently
invoke all of those APIs. Therefore the API surface area of the file
resolver became untenably large, and it was not intuitive how it was
to be used or extended.

This design changes the encapsulation layout; every module is now
responsible for changing the driver specification directly. An
orchestrator builds the correct chain of steps and hands it to the
client, which then calls it verbatim. The main client then makes any
final modifications that put the different pieces of the driver
together, particularly to attach the driver container itself to the pod
and to apply the Spark configuration as command-line arguments.
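
A minimal sketch of the shape this describes, with illustrative names rather than the exact classes introduced here:

```scala
// The driver's state: its Spark conf plus the Kubernetes resources that will
// be used to deploy it (kept as Any here to avoid pulling in the fabric8 model).
case class DriverSpec(sparkConf: Map[String, String], resources: Seq[Any])

// A submission step is a transformation from the previous driver state to the next.
trait SubmissionStep {
  def configureDriver(spec: DriverSpec): DriverSpec
}

// The orchestrator picks the steps; the client just runs the chain verbatim
// before attaching the driver container and the final Spark conf arguments.
def runSteps(initial: DriverSpec, steps: Seq[SubmissionStep]): DriverSpec =
  steps.foldLeft(initial) { (spec, step) => step.configureDriver(spec) }
```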

* Add a unit test for BaseSubmissionStep.

* Add unit test for kubernetes credentials mounting.

* Add unit test for InitContainerBootstrapStep.

* unit tests for initContainer

* Add a unit test for DependencyResolutionStep.

* further modifications to InitContainer unit tests

* Use of resolver in PythonStep and unit tests for PythonStep

* refactoring of init unit tests and pythonstep resolver logic

* Add unit test for KubernetesSubmissionStepsOrchestrator.

* refactoring and addition of secret trustStore+Cert checks in a SubmissionStepSuite

* added SparkPodInitContainerBootstrapSuite

* Added InitContainerResourceStagingServerSecretPluginSuite

* style in Unit tests

* extremely minor style fix in variable naming

* Address comments.

* Rename class for consistency.

* Attempt to make spacing consistent.

Multi-line methods should have four-space indentation for arguments that
aren't on the same line as the method call itself... but this is
difficult to do consistently given how IDEs handle Scala multi-line indentation
in most cases.
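
For example, the parameter style being aimed at looks roughly like this (illustrative method, not code from this change):

```scala
object NamingUtils {
  // Wrapped parameters get four-space indentation relative to the `def`,
  // so they stand apart from the two-space-indented method body.
  def buildResourceName(
      kubernetesAppId: String,
      resourceNamePrefix: String): String = {
    s"$resourceNamePrefix-$kubernetesAppId"
  }
}
```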

(cherry picked from commit 0f4368f)

 Conflicts:
	resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala
ash211 (Author) commented Jul 15, 2017

@mccheah looks like we don't pass scalastyle:

```
[INFO] --- scala-maven-plugin:3.2.2:doc-jar (attach-scaladocs) @ spark-kubernetes_2.11 ---
/home/ubuntu/spark/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/kubernetes/submit/submitsteps/DriverKubernetesCredentialsStep.scala:211: warning: implicit conversion method augmentSparkConf should be enabled
by making the implicit value scala.language.implicitConversions visible.
This can be achieved by adding the import clause 'import scala.language.implicitConversions'
or by setting the compiler option -language:implicitConversions.
See the Scaladoc for value scala.language.implicitConversions for a discussion
why the feature should be explicitly enabled.
  private implicit def augmentSparkConf(sparkConf: SparkConf): OptionSettableSparkConf = {
                       ^
model contains 30 documentable templates
one warning found
```
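
For reference, the fix the compiler suggests is just the language-feature import next to the conversion; a rough sketch with placeholder class contents:

```scala
import scala.language.implicitConversions

import org.apache.spark.SparkConf

object SparkConfImplicits {
  // Placeholder stand-in for the wrapper defined in DriverKubernetesCredentialsStep.scala.
  class OptionSettableSparkConf(sparkConf: SparkConf) {
    def setOption(key: String, value: Option[String]): SparkConf = {
      value.foreach(sparkConf.set(key, _))
      sparkConf
    }
  }

  // With scala.language.implicitConversions in scope, this no longer warns.
  implicit def augmentSparkConf(sparkConf: SparkConf): OptionSettableSparkConf = {
    new OptionSettableSparkConf(sparkConf)
  }
}
```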

mccheah commented Jul 17, 2017

Requires apache-spark-on-k8s#374

Otherwise we can get a Scalastyle error when building from SBT.

(cherry picked from commit 7deaaa3)
ash211 (Author) commented Jul 17, 2017

@mccheah now seeing this:

```
error file=/home/ubuntu/spark/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala message=import.ordering.wrongOrderInGroup.message line=39 column=0
Saving to outputFile=/home/ubuntu/spark/resource-managers/kubernetes/core/target/scalastyle-output.xml
Processed 52 file(s)
```

ash211 (Author) commented Jul 17, 2017

More locally (on branch-2.1-kubernetes):

```
[error] /Volumes/git/spark/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/InitContainerResourceStagingServerSecretPluginSuite.scala:20:0: import.ordering.wrongOrderInGroup.message
[error] /Volumes/git/spark/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/InitContainerResourceStagingServerSecretPluginSuite.scala:21:0: import.ordering.missingEmptyLine.message
[error] /Volumes/git/spark/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/InitContainerResourceStagingServerSecretPluginSuite.scala:23:0: import.ordering.wrongGroup.message
[error] /Volumes/git/spark/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/SparkPodInitContainerBootstrapSuite.scala:20:0: import.ordering.wrongOrderInGroup.message
[error] /Volumes/git/spark/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/SparkPodInitContainerBootstrapSuite.scala:21:0: import.ordering.missingEmptyLine.message
[error] /Volumes/git/spark/resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/kubernetes/SparkPodInitContainerBootstrapSuite.scala:23:0: import.ordering.wrongGroup.message
```
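
For context, these are all import-grouping violations; as far as I understand the checker, it wants java/javax, then scala, then third-party, then org.apache.spark, sorted within each group and separated by a blank line. Roughly (illustrative imports, not the ones in these test files):

```scala
// Groups in the order the checker wants: java/javax, scala, third-party,
// org.apache.spark -- alphabetized within each group and separated by a
// single blank line.
import java.io.File

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.api.model.Pod
import org.scalatest.BeforeAndAfter

import org.apache.spark.SparkFunSuite
```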

ash211 and others added 4 commits July 18, 2017 23:19
Test with ./dev/scalastyle
(cherry picked from commit 8751a9a)
* Retry binding server to random port in the resource staging server test (see the sketch after these commit bullets).

* Break if successful start

* Start server in try block.

* Fix scalastyle

* More rigorous cleanup logic. Increment port numbers.

* Move around more exception logic.

* More exception refactoring.

* Remove whitespace

* Fix test

* Rename variable
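
A rough sketch of the retry shape these commits describe, using hypothetical names (`StartableServer`, `startOnFreePort`); the real test drives the resource staging server itself:

```scala
import scala.util.control.NonFatal

object PortRetry {
  // Hypothetical handle for whatever server the test starts.
  trait StartableServer {
    def start(): Unit
    def stop(): Unit
  }

  // Try successive ports: start the server in a try block, break out on the
  // first successful start, clean up and move to the next port on failure.
  def startOnFreePort(
      basePort: Int,
      attempts: Int,
      newServer: Int => StartableServer): StartableServer = {
    var lastError: Throwable = null
    for (port <- basePort until (basePort + attempts)) {
      val server = newServer(port)
      try {
        server.start()
        return server
      } catch {
        case NonFatal(e) =>
          lastError = e
          try server.stop() catch { case NonFatal(_) => } // best-effort cleanup before retrying
      }
    }
    throw new IllegalStateException(s"Could not bind a port after $attempts attempts", lastError)
  }
}
```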
ash211 merged commit a4f9ced into master Jul 19, 2017
ash211 deleted the resync-kube branch July 19, 2017 20:12
ash211 (Author) commented Jul 19, 2017

Oops, squashed instead of merged -- will just have to remember that next time we resync from upstream.
