Addressed docs comments
liyinan926 committed Dec 9, 2017
commit caf22060f600b3b382e2e98b7ee5f0aacc165f2d
6 changes: 4 additions & 2 deletions docs/configuration.md
@@ -163,7 +163,8 @@ of the most common options to set are:
<td>
The amount of off-heap memory (in megabytes) to be allocated per driver in cluster mode. This is
memory that accounts for things like VM overheads, interned strings, other native overheads, etc.
This tends to grow with the container size (typically 6-10%).
This tends to grow with the container size (typically 6-10%). This option is supported in Yarn
Contributor

YARN. I'd also simplify:

"This option is currently supported on YARN and Kubernetes."

Same below.

Contributor Author

Done.

mode and Kubernetes mode.
</td>
</tr>
<tr>
@@ -179,7 +180,8 @@ of the most common options to set are:
<td>
The amount of off-heap memory (in megabytes) to be allocated per executor. This is memory that
accounts for things like VM overheads, interned strings, other native overheads, etc. This tends
to grow with the executor size (typically 6-10%).
to grow with the executor size (typically 6-10%). This option is supported in Yarn mode and
Kubernetes mode.
</td>
</tr>
<tr>
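As a side note on the two options documented above (not part of the diff): a minimal sketch of how an application might set these overhead values, assuming the unified property names spark.driver.memoryOverhead and spark.executor.memoryOverhead, which are not visible in the hunk above and so are an assumption here. Values are in megabytes, typically 6-10% of the corresponding container size.

import org.apache.spark.SparkConf

// Sketch only: the property names below are assumed, not taken from this diff.
// Overhead is off-heap memory in MiB, typically 6-10% of the container/executor size.
val conf = new SparkConf()
  .setAppName("memory-overhead-example")
  .set("spark.driver.memoryOverhead", "1024")   // assumed key; ~10% of a 10g driver container
  .set("spark.executor.memoryOverhead", "512")  // assumed key; ~10% of a 5g executor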
@@ -131,15 +131,15 @@ private[spark] object Config extends Logging {
.checkValue(interval => interval > 0, s"Logging interval must be a positive time value.")
.createWithDefaultString("1s")

private[spark] val JARS_DOWNLOAD_LOCATION =
val JARS_DOWNLOAD_LOCATION =
ConfigBuilder("spark.kubernetes.mountDependencies.jarsDownloadDir")
.doc("Location to download jars to in the driver and executors. When using" +
" spark-submit, this directory must be empty and will be mounted as an empty directory" +
" volume on the driver and executor pod.")
.stringConf
.createWithDefault("/var/spark-data/spark-jars")
Member

Does it make sense to have a default?
Perhaps a comment or doc string could explain why this value is the default?

Contributor Author

This is somewhat of an implementation detail, and we expect that users normally won't need to set or even know about it, so having a default makes sense.

Contributor

@vanzin Dec 4, 2017

The doc string says "download jars to". Is it guaranteed that this directory will be writable? Generally only root can write to things in "/var" by default, and I assume you're not running things as root even if it's inside containers.

Contributor Author

/cc @foxish and @mccheah.

Contributor

@mridulm Dec 5, 2017

To add to @vanzin's query: do we have a default user we expect to run the container as (or not run as)? Or does it not matter either way, and is it up to the user?

Also, it would be great if we documented assumptions like this in the k8s documentation: which paths should be available, which should be writable, which ports should be accessible, etc. (perhaps this is already in the docs and I have yet to find it).

Contributor

We do need this directory to be writable and we can definitely document it. We are running things as root in the reference images within the container, and we can instead create and use a more restricted user - but that would imply an assumption on our end. @ash211, can you comment on your setup?

K8s also offers a way to change that on the fly: using a PodSecurityPolicy, one can change the runAsUser config to run as a particular user. So one can create a user "spark" on their nodes and have the pod security policy enforce that all containers run as that user, i.e. our defaulting to root doesn't prevent customization in that regard.

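To illustrate the runAsUser idea mentioned above (a sketch, not something this PR adds): a pod-level security context can pin the container user. The example below uses the fabric8 Kubernetes client that the Spark K8s backend builds pods with; the pod name, image, and UID 185 are hypothetical, and a cluster-wide PodSecurityPolicy would enforce the same constraint at the policy level rather than per pod.

import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}

// Sketch: run the driver container as a non-root user (UID 185 is illustrative).
// A cluster-wide PodSecurityPolicy could enforce the same runAsUser constraint
// for every pod instead of setting it per pod like this.
val pod: Pod = new PodBuilder()
  .withNewMetadata()
    .withName("spark-driver-example")        // hypothetical pod name
  .endMetadata()
  .withNewSpec()
    .withNewSecurityContext()
      .withRunAsUser(185L)                   // hypothetical non-root UID
    .endSecurityContext()
    .addNewContainer()
      .withName("spark-kubernetes-driver")
      .withImage("spark-driver:latest")      // hypothetical image
    .endContainer()
  .endSpec()
  .build()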

private[spark] val FILES_DOWNLOAD_LOCATION =
val FILES_DOWNLOAD_LOCATION =
ConfigBuilder("spark.kubernetes.mountDependencies.filesDownloadDir")
.doc("Location to download files to in the driver and executors. When using" +
" spark-submit, this directory must be empty and will be mounted as an empty directory" +
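On the writability question discussed above, one possible workaround (a sketch, not something this PR prescribes) is to point both download directories at a location the container user can definitely write to, using the two properties defined in this file; the /tmp paths below are illustrative, not defaults.

import org.apache.spark.SparkConf

// Sketch: override the dependency staging directories so that a non-root
// container user can write to them. The /tmp paths are illustrative.
val conf = new SparkConf()
  .set("spark.kubernetes.mountDependencies.jarsDownloadDir", "/tmp/spark-jars")
  .set("spark.kubernetes.mountDependencies.filesDownloadDir", "/tmp/spark-files")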