@@ -64,22 +64,22 @@ This is a useful place to check to make sure that your properties have been set
6464that only values explicitly specified through either `spark-defaults.conf` or SparkConf will
6565appear. For all other configuration properties, you can assume the default value is used.
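
As a quick illustration of that behavior, here is a minimal sketch (the app name, master URL, and fallback value are placeholders, not recommendations):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("ConfDemo")   // placeholder app name
  .setMaster("local[2]")    // placeholder master URL
  .set("spark.eventLog.enabled", "true")

// getAll returns only the values set explicitly, here or in spark-defaults.conf:
conf.getAll.foreach { case (k, v) => println(s"$k = $v") }

// For any unset key, supply the documented default yourself at the call site:
println(conf.get("spark.reducer.maxMbInFlight", "48"))
```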
6666
67- ## All Configuration Properties
67+ ## Available Properties
6868
6969Most of the properties that control internal settings have reasonable default values. Some
7070of the most common options to set are:
7171
7272<table class =" table " >
7373<tr ><th >Property Name</th ><th >Default</th ><th >Meaning</th ></tr >
7474<tr >
75- <td ><strong >< code >spark.app.name</code ></ strong ></td >
75+ <td ><code >spark.app.name</code ></td >
7676 <td >(none)</td >
7777 <td >
7878 The name of your application. This will appear in the UI and in log data.
7979 </td >
8080</tr >
8181<tr >
82- <td ><strong >< code >spark.master</code ></ strong ></td >
82+ <td ><code >spark.master</code ></td >
8383 <td >(none)</td >
8484 <td >
8585 The cluster manager to connect to. See the list of
@@ -244,15 +244,6 @@ Apart from these, the following properties are also available, and may be useful
244244 reduce the number of disk seeks and system calls made in creating intermediate shuffle files.
245245 </td >
246246</tr >
247- <tr >
248- <td ><code >spark.storage.memoryMapThreshold</code ></td >
249- <td >8192</td >
250- <td >
251- Size of a block, in bytes, above which Spark memory maps when reading a block from disk.
252- This prevents Spark from memory mapping very small blocks. In general, memory
253- mapping has high overhead for blocks close to or below the page size of the operating system.
254- </td >
255- </tr >
256247<tr >
257248 <td ><code >spark.reducer.maxMbInFlight</code ></td >
258249 <td >48</td >
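
A hedged sketch of setting the property just named (the value 96 is purely illustrative, not a tuning recommendation):

```scala
import org.apache.spark.SparkConf

// Raise the per-reducer fetch window above the 48 MB default; illustrative only.
val conf = new SparkConf().set("spark.reducer.maxMbInFlight", "96")

// Equivalently, one line in conf/spark-defaults.conf:
//   spark.reducer.maxMbInFlight   96
```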
@@ -292,7 +283,7 @@ Apart from these, the following properties are also available, and may be useful
292283 <td ><code >spark.eventLog.enabled</code ></td >
293284 <td >false</td >
294285 <td >
295- Whether to log spark events, useful for reconstructing the Web UI after the application has
286+ Whether to log Spark events, useful for reconstructing the Web UI after the application has
296287 finished.
297288 </td >
298289</tr >
@@ -307,7 +298,7 @@ Apart from these, the following properties are also available, and may be useful
307298 <td ><code >spark.eventLog.dir</code ></td >
308299 <td >file:///tmp/spark-events</td >
309300 <td >
310- Base directory in which spark events are logged, if <code>spark.eventLog.enabled</code> is true.
301+ Base directory in which Spark events are logged, if <code>spark.eventLog.enabled</code> is true.
311302 Within this base directory, Spark creates a sub-directory for each application, and logs the
312303 events specific to the application in this directory.
313304 </td >
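
Putting the two event-log properties together, a minimal sketch (app name and master URL are placeholders; the directory is the documented default):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("EventLogDemo") // placeholder
  .setMaster("local[2]")      // placeholder
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.dir", "file:///tmp/spark-events")

val sc = new SparkContext(conf)
// ... run jobs; Spark logs this application's events in its own sub-directory ...
sc.stop()
```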
@@ -457,13 +448,33 @@ Apart from these, the following properties are also available, and may be useful
457448 directories on Tachyon file system.
458449 </td >
459450</tr >
451+ <tr >
452+ <td ><code >spark.storage.memoryMapThreshold</code ></td >
453+ <td >8192</td >
454+ <td >
455+ Size of a block, in bytes, above which Spark memory maps when reading a block from disk.
456+ This prevents Spark from memory mapping very small blocks. In general, memory
457+ mapping has high overhead for blocks close to or below the page size of the operating system.
458+ </td >
459+ </tr >
460460<tr >
461461 <td ><code >spark.tachyonStore.url</code ></td >
462462 <td >tachyon://localhost:19998</td >
463463 <td >
464464 The URL of the underlying Tachyon file system in the TachyonStore.
465465 </td >
466466</tr >
467+ <tr >
468+ <td ><code >spark.cleaner.ttl</code ></td >
469+ <td >(infinite)</td >
470+ <td >
471+ Duration (seconds) for which Spark remembers any metadata (stages generated, tasks
472+ generated, etc.). Periodic cleanups ensure that metadata older than this duration is
473+ forgotten. This is useful when running Spark for many hours or days (for example, when
474+ running 24/7 in the case of Spark Streaming applications). Note that any RDD that persists in
475+ memory for more than this duration will be cleared as well.
476+ </td >
477+ </tr >
467478</table >
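
For `spark.cleaner.ttl` above, a sketch of the typical long-running-app usage (3600 is an illustrative value, not a recommendation):

```scala
import org.apache.spark.SparkConf

// Forget metadata older than one hour; RDDs persisted in memory longer than
// the TTL are cleared as well, so pick a value longer than any RDD you keep.
val conf = new SparkConf().set("spark.cleaner.ttl", "3600")
```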
468479
469480#### Networking
@@ -539,7 +550,7 @@ Apart from these, the following properties are also available, and may be useful
539550 `spark.akka.failure-detector.threshold` if you need to. The only positive use case for the
540551 failure detector is that a sensitive one can help evict rogue executors quickly. However,
541552 this is usually not the case, as GC pauses and network lags are expected in a
542- real spark cluster. Apart from that enabling this leads to a lot of exchanges of heart beats
553+ real Spark cluster. Apart from that, enabling this leads to a lot of heartbeat exchanges
543554 between nodes, which can flood the network.
544555 </td >
545556</tr >
@@ -677,16 +688,16 @@ Apart from these, the following properties are also available, and may be useful
677688 <td ><code >spark.authenticate</code ></td >
678689 <td >false</td >
679690 <td >
680- Whether spark authenticates its internal connections. See
681- <code>spark.authenticate.secret</code> if not running on Yarn .
691+ Whether Spark authenticates its internal connections. See
692+ <code>spark.authenticate.secret</code> if not running on YARN .
682693 </td >
683694</tr >
684695<tr >
685696 <td ><code >spark.authenticate.secret</code ></td >
686697 <td >None</td >
687698 <td >
688699 Set the secret key used for Spark to authenticate between components. This needs to be set if
689- not running on Yarn and authentication is enabled.
700+ not running on YARN and authentication is enabled.
690701 </td >
691702</tr >
692703<tr >
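
As a sketch of the non-YARN case described above ("changeme" is a placeholder secret):

```scala
import org.apache.spark.SparkConf

// Off YARN, authentication needs both the flag and a shared secret,
// and every component must agree on the same secret.
val conf = new SparkConf()
  .set("spark.authenticate", "true")
  .set("spark.authenticate.secret", "changeme") // placeholder secret
```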
@@ -702,7 +713,8 @@ Apart from these, the following properties are also available, and may be useful
702713 <td >None</td >
703714 <td >
704715 Comma separated list of filter class names to apply to the Spark web ui. The filter should be a
705- standard javax servlet Filter. Parameters to each filter can also be specified by setting a
716+ standard <a href="http://docs.oracle.com/javaee/6/api/javax/servlet/Filter.html">
717+ javax servlet Filter</a>. Parameters to each filter can also be specified by setting a
706718 java system property of spark.<class name of filter>.params='param1=value1,param2=value2'
707719 (e.g. -Dspark.ui.filters=com.test.filter1
708720 -Dspark.com.test.filter1.params='param1=foo,param2=testing')
@@ -712,7 +724,7 @@ Apart from these, the following properties are also available, and may be useful
712724 <td ><code >spark.ui.acls.enable</code ></td >
713725 <td >false</td >
714726 <td >
715- Whether spark web ui acls should are enabled. If enabled, this checks to see if the user has
727+ Whether Spark web ui acls are enabled. If enabled, this checks to see if the user has
716728 access permissions to view the web ui. See <code>spark.ui.view.acls</code> for more details.
717729 Also note this requires the user to be known; if the user comes across as null, no checks
718730 are done. Filters can be used to authenticate and set the user.
@@ -722,7 +734,7 @@ Apart from these, the following properties are also available, and may be useful
722734 <td ><code >spark.ui.view.acls</code ></td >
723735 <td >Empty</td >
724736 <td >
725- Comma separated list of users that have view access to the spark web ui. By default only the
737+ Comma separated list of users that have view access to the Spark web ui. By default only the
726738 user that started the Spark job has view access.
727739 </td >
728740</tr >
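
Combining the two UI ACL properties, a minimal sketch (the user names are placeholders; a filter must still authenticate and set the user):

```scala
import org.apache.spark.SparkConf

// Only the listed users (plus the user who started the job) may view the web UI.
val conf = new SparkConf()
  .set("spark.ui.acls.enable", "true")
  .set("spark.ui.view.acls", "alice,bob") // placeholder user names
```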
@@ -731,17 +743,6 @@ Apart from these, the following properties are also available, and may be useful
731743#### Spark Streaming
732744<table class =" table " >
733745<tr ><th >Property Name</th ><th >Default</th ><th >Meaning</th ></tr >
734- <tr >
735- <td ><code >spark.cleaner.ttl</code ></td >
736- <td >(infinite)</td >
737- <td >
738- Duration (seconds) of how long Spark will remember any metadata (stages generated, tasks
739- generated, etc.). Periodic cleanups will ensure that metadata older than this duration will be
740- forgotten. This is useful for running Spark for many hours / days (for example, running 24/7 in
741- case of Spark Streaming applications). Note that any RDD that persists in memory for more than
742- this duration will be cleared as well.
743- </td >
744- </tr >
745746<tr >
746747 <td ><code >spark.streaming.blockInterval</code ></td >
747748 <td >200</td >