[SPARK-22648] [K8S] Spark on Kubernetes - Documentation #19946
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -81,6 +81,7 @@ options for deployment: | |
| * [Standalone Deploy Mode](spark-standalone.html): simplest way to deploy Spark on a private cluster | ||
| * [Apache Mesos](running-on-mesos.html) | ||
| * [Hadoop YARN](running-on-yarn.html) | ||
| * [Kubernetes](running-on-kubernetes.html) | ||
|
|
||
| # Where to Go from Here | ||
|
|
||
|
|
@@ -112,7 +113,7 @@ options for deployment: | |
| * [Mesos](running-on-mesos.html): deploy a private cluster using | ||
| [Apache Mesos](http://mesos.apache.org) | ||
| * [YARN](running-on-yarn.html): deploy Spark on top of Hadoop NextGen (YARN) | ||
| * [Kubernetes (experimental)](https://github.com/apache-spark-on-k8s/spark): deploy Spark on top of Kubernetes | ||
| * [Kubernetes (experimental)](running-on-kubernetes.html): deploy Spark on top of Kubernetes | ||
|
|
||
| **Other Documents:** | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,7 +18,8 @@ Spark application's configuration (driver, executors, and the AM when running in | |
|
|
||
| There are two deploy modes that can be used to launch Spark applications on YARN. In `cluster` mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In `client` mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN. | ||
|
|
||
| Unlike [Spark standalone](spark-standalone.html) and [Mesos](running-on-mesos.html) modes, in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn`. | ||
| Unlike [Spark standalone](spark-standalone.html), [Mesos](running-on-mesos.html) and [Kubernetes](running-on-kubernetes.html) modes, | ||
| in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn`. | ||
|
|
||
| To launch a Spark application in `cluster` mode: | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -127,6 +127,16 @@ export HADOOP_CONF_DIR=XXX | |
| http://path/to/examples.jar \ | ||
| 1000 | ||
|
|
||
| # Run on a Kubernetes cluster in cluster deploy mode | ||
| ./bin/spark-submit \ | ||
| --class org.apache.spark.examples.SparkPi \ | ||
| --master k8s://xx.yy.zz.ww:443 \ | ||
| --deploy-mode cluster \ | ||
| --executor-memory 20G \ | ||
| --num-executors 50 \ | ||
| http://path/to/examples.jar \ | ||
| 1000 | ||
|
|
||
| {% endhighlight %} | ||
|
|
||
| # Master URLs | ||
|
|
@@ -155,6 +165,12 @@ The master URL passed to Spark can be in one of the following formats: | |
| <code>client</code> or <code>cluster</code> mode depending on the value of <code>--deploy-mode</code>. | ||
| The cluster location will be found based on the <code>HADOOP_CONF_DIR</code> or <code>YARN_CONF_DIR</code> variable. | ||
| </td></tr> | ||
| <tr><td> <code>k8s://HOST:PORT</code> </td><td> Connect to a <a href="running-on-kubernetes.html"> Kubernetes </a> cluster in | ||
| <code>cluster</code> mode. Client mode is currently unsupported and will be supported in future releases. | ||
| The <code>HOST</code> and <code>PORT</code> refer to the [Kubernetes API Server](https://kubernetes.io/docs/reference/generated/kube-apiserver/). | ||
| It connects using TLS by default. In order to force it to use an unsecured connection, you can use | ||
| <code>k8s://http://HOST:PORT</code>. | ||
| </td></tr> | ||
| </table> | ||
|
|
||
|
|
||
|
|
||
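The master URL rule added in the table above (TLS by default, `k8s://http://HOST:PORT` to force an unsecured connection) can be sketched as a small resolver. This is only an illustration of the documented behaviour, not Spark's own parsing code: the function name `resolve_api_server_url` is hypothetical, and accepting an explicit `https://` after the prefix is an assumption that the diff does not state.

```python
from urllib.parse import urlsplit

K8S_PREFIX = "k8s://"


def resolve_api_server_url(master: str) -> str:
    """Map a `k8s://` master URL to the API server URL it points at.

    Hypothetical helper for illustration; not part of Spark.
    """
    if not master.startswith(K8S_PREFIX):
        raise ValueError(f"not a Kubernetes master URL: {master!r}")

    rest = master[len(K8S_PREFIX):]

    # No explicit scheme after the prefix: the docs say TLS is the default.
    if "://" not in rest:
        rest = "https://" + rest

    parts = urlsplit(rest)
    # Assumption: only http (documented) and https (assumed) are meaningful.
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme in master URL: {master!r}")
    if not parts.hostname:
        raise ValueError(f"missing host in master URL: {master!r}")

    return rest


if __name__ == "__main__":
    print(resolve_api_server_url("k8s://xx.yy.zz.ww:443"))
    print(resolve_api_server_url("k8s://http://xx.yy.zz.ww:8080"))
```

With the placeholder address from the `spark-submit` example in the diff, the first call yields an `https://` URL and the second keeps the explicit `http://` scheme.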
Should we use k8s? I keep bringing this up because I can never spell Kubernetes properly.
It's `k8s://` in the URL scheme we use and the package names are also `k8s` - so, users should never have to type the name. This is one of the last few holdouts. I'd say that it's consistent here, with the use of other cluster manager names in full in their maven projects. I can change it here if you feel strongly about it.
I actually second @rxin, I do get the spelling wrong at times :-)
Are you both also referring to the config options etc, which are still `spark.kubernetes.*`, or just the maven build target? If it's everything, it would be a fairly large change, doable certainly - but confirming how far the rename should go.
Ping @mridulm, @rxin - can you please confirm the scope of the renaming you were referring to here? Is it just the maven target? Changing all the config options etc would be a considerably large change at this point. Also, a point that was brought up today was - while k8s is common shorthand, it's not universal as is the full name.
I've filed https://issues.apache.org/jira/browse/SPARK-22853 to discuss this and unblock this PR. We should be able to reach consensus by release time. :)
Yeah, I don't think you need to block this PR with this.