[SPARK-40675][DOCS] Supplement undocumented spark configurations in configuration.md
#38131
Conversation
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 195 to 203 in 2248316
private[spark] val EVENT_LOG_GC_METRICS_YOUNG_GENERATION_GARBAGE_COLLECTORS =
  ConfigBuilder("spark.eventLog.gcMetrics.youngGenerationGarbageCollectors")
    .doc("Names of supported young generation garbage collector. A name usually is " +
      " the return of GarbageCollectorMXBean.getName. The built-in young generation garbage " +
      s"collectors are ${GarbageCollectionMetrics.YOUNG_GENERATION_BUILTIN_GARBAGE_COLLECTORS}")
    .version("3.0.0")
    .stringConf
    .toSequence
    .createWithDefault(GarbageCollectionMetrics.YOUNG_GENERATION_BUILTIN_GARBAGE_COLLECTORS)
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 205 to 213 in 2248316
private[spark] val EVENT_LOG_GC_METRICS_OLD_GENERATION_GARBAGE_COLLECTORS =
  ConfigBuilder("spark.eventLog.gcMetrics.oldGenerationGarbageCollectors")
    .doc("Names of supported old generation garbage collector. A name usually is " +
      "the return of GarbageCollectorMXBean.getName. The built-in old generation garbage " +
      s"collectors are ${GarbageCollectionMetrics.OLD_GENERATION_BUILTIN_GARBAGE_COLLECTORS}")
    .version("3.0.0")
    .stringConf
    .toSequence
    .createWithDefault(GarbageCollectionMetrics.OLD_GENERATION_BUILTIN_GARBAGE_COLLECTORS)
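For context, a minimal sketch (not part of this PR) of how these two lists could be overridden when building a SparkConf; the collector names are illustrative and should match whatever GarbageCollectorMXBean.getName returns on the target JVM:

import org.apache.spark.SparkConf

// Hypothetical override of the event-log GC metrics collector lists.
// "G1 Young Generation" / "G1 Old Generation" are example names reported by
// the G1 collector's GarbageCollectorMXBean; adjust for other collectors.
val conf = new SparkConf()
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.gcMetrics.youngGenerationGarbageCollectors", "G1 Young Generation")
  .set("spark.eventLog.gcMetrics.oldGenerationGarbageCollectors", "G1 Old Generation")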
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2206 to 2211 in 2248316
private[spark] val EXECUTOR_ALLOW_SPARK_CONTEXT =
  ConfigBuilder("spark.executor.allowSparkContext")
    .doc("If set to true, SparkContext can be created in executors.")
    .version("3.0.1")
    .booleanConf
    .createWithDefault(false)
BTW, I guess we don't want to expose this, @dcoliversun. It does no good for users.
OK. Would it be better to mark it as an internal configuration? If so, I will open a new PR to address it.
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2133 to 2144 in 2248316
private[spark] val DECOMMISSION_ENABLED =
  ConfigBuilder("spark.decommission.enabled")
    .doc("When decommission enabled, Spark will try its best to shutdown the executor " +
      s"gracefully. Spark will try to migrate all the RDD blocks (controlled by " +
      s"${STORAGE_DECOMMISSION_RDD_BLOCKS_ENABLED.key}) and shuffle blocks (controlled by " +
      s"${STORAGE_DECOMMISSION_SHUFFLE_BLOCKS_ENABLED.key}) from the decommissioning " +
      s"executor to a remote executor when ${STORAGE_DECOMMISSION_ENABLED.key} is enabled. " +
      s"With decommission enabled, Spark will also decommission an executor instead of " +
      s"killing when ${DYN_ALLOCATION_ENABLED.key} enabled.")
    .version("3.1.0")
    .booleanConf
    .createWithDefault(false)
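As an aside, a hedged sketch of how the keys referenced in that doc string might be combined; the storage-side key names are assumptions taken from the constants referenced above and should be checked against the final configuration table:

import org.apache.spark.SparkConf

// Sketch: enable graceful decommissioning plus RDD and shuffle block migration.
val conf = new SparkConf()
  .set("spark.decommission.enabled", "true")
  .set("spark.storage.decommission.enabled", "true")
  .set("spark.storage.decommission.rddBlocks.enabled", "true")
  .set("spark.storage.decommission.shuffleBlocks.enabled", "true")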
Oh, I didn't realize that this is still undocumented. Thanks.
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2146 to 2156 in 2248316
private[spark] val EXECUTOR_DECOMMISSION_KILL_INTERVAL =
  ConfigBuilder("spark.executor.decommission.killInterval")
    .doc("Duration after which a decommissioned executor will be killed forcefully " +
      "*by an outside* (e.g. non-spark) service. " +
      "This config is useful for cloud environments where we know in advance when " +
      "an executor is going to go down after decommissioning signal i.e. around 2 mins " +
      "in aws spot nodes, 1/2 hrs in spot block nodes etc. This config is currently " +
      "used to decide what tasks running on decommission executors to speculate.")
    .version("3.1.0")
    .timeConf(TimeUnit.SECONDS)
    .createOptional
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2158 to 2165 in 2248316
private[spark] val EXECUTOR_DECOMMISSION_FORCE_KILL_TIMEOUT =
  ConfigBuilder("spark.executor.decommission.forceKillTimeout")
    .doc("Duration after which a Spark will force a decommissioning executor to exit." +
      " this should be set to a high value in most situations as low values will prevent " +
      " block migrations from having enough time to complete.")
    .version("3.2.0")
    .timeConf(TimeUnit.SECONDS)
    .createOptional
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2167 to 2172 in 2248316
private[spark] val EXECUTOR_DECOMMISSION_SIGNAL =
  ConfigBuilder("spark.executor.decommission.signal")
    .doc("The signal that used to trigger the executor to start decommission.")
    .version("3.2.0")
    .stringConf
    .createWithDefaultString("PWR")
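A hedged example of tuning the three executor-side decommission settings above together; the durations are illustrative for an environment where nodes are reclaimed roughly two minutes after the decommission signal, and PWR simply repeats the default:

import org.apache.spark.SparkConf

// Illustrative values only; pick durations to match the actual cloud environment.
val conf = new SparkConf()
  .set("spark.executor.decommission.killInterval", "120s")     // expected delay before the external service kills the node
  .set("spark.executor.decommission.forceKillTimeout", "110s") // force exit after this long; keep it high enough for block migrations
  .set("spark.executor.decommission.signal", "PWR")            // default trigger signal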
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 512 to 517 in 2248316
private[spark] val STORAGE_DECOMMISSION_FALLBACK_STORAGE_CLEANUP =
  ConfigBuilder("spark.storage.decommission.fallbackStorage.cleanUp")
    .doc("If true, Spark cleans up its fallback storage data during shutting down.")
    .version("3.2.0")
    .booleanConf
    .createWithDefault(false)
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 519 to 528 in 2248316
private[spark] val STORAGE_DECOMMISSION_SHUFFLE_MAX_DISK_SIZE =
  ConfigBuilder("spark.storage.decommission.shuffleBlocks.maxDiskSize")
    .doc("Maximum disk space to use to store shuffle blocks before rejecting remote " +
      "shuffle blocks. Rejecting remote shuffle blocks means that an executor will not receive " +
      "any shuffle migrations, and if there are no other executors available for migration " +
      "then shuffle blocks will be lost unless " +
      s"${STORAGE_DECOMMISSION_FALLBACK_STORAGE_PATH.key} is configured.")
    .version("3.2.0")
    .bytesConf(ByteUnit.BYTE)
    .createOptional
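For illustration, the two fallback-storage settings above usually accompany a fallback path; the path key and the bucket below are assumptions made for this sketch, not something added by this PR:

import org.apache.spark.SparkConf

// Sketch: cap the local disk used for migrated shuffle blocks, fall back to
// remote storage beyond that, and clean the fallback data up on shutdown.
val conf = new SparkConf()
  .set("spark.storage.decommission.shuffleBlocks.maxDiskSize", "10g")
  .set("spark.storage.decommission.fallbackStorage.path", "s3a://my-bucket/spark-fallback/") // assumed key and location
  .set("spark.storage.decommission.fallbackStorage.cleanUp", "true")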
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2197 to 2204 in 2248316
private[spark] val STANDALONE_SUBMIT_WAIT_APP_COMPLETION =
  ConfigBuilder("spark.standalone.submit.waitAppCompletion")
    .doc("In standalone cluster mode, controls whether the client waits to exit until the " +
      "application completes. If set to true, the client process will stay alive polling " +
      "the driver's status. Otherwise, the client process will exit after submission.")
    .version("3.1.0")
    .booleanConf
    .createWithDefault(false)
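As a small, hedged illustration of the submit-side setting above (in standalone cluster mode it would typically be passed to spark-submit rather than set in application code):

import org.apache.spark.SparkConf

// Keep the submitting client alive, polling the driver's status, until the app finishes.
val conf = new SparkConf()
  .set("spark.standalone.submit.waitAppCompletion", "true")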
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2301 to 2308 in 2248316
private[spark] val SHUFFLE_NUM_PUSH_THREADS =
  ConfigBuilder("spark.shuffle.push.numPushThreads")
    .doc("Specify the number of threads in the block pusher pool. These threads assist " +
      "in creating connections and pushing blocks to remote external shuffle services. By" +
      " default, the threadpool size is equal to the number of spark executor cores.")
    .version("3.2.0")
    .intConf
    .createOptional
docs/configuration.md (Outdated)
spark/core/src/main/scala/org/apache/spark/internal/config/package.scala
Lines 2330 to 2338 in 2248316
private[spark] val PUSH_BASED_SHUFFLE_MERGE_FINALIZE_THREADS =
  ConfigBuilder("spark.shuffle.push.merge.finalizeThreads")
    .doc("Number of threads used by driver to finalize shuffle merge. Since it could" +
      " potentially take seconds for a large shuffle to finalize, having multiple threads helps" +
      " driver to handle concurrent shuffle merge finalize requests when push-based" +
      " shuffle is enabled.")
    .version("3.3.0")
    .intConf
    .createWithDefault(8)
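A hedged sketch combining the two push-based shuffle thread-pool settings above; spark.shuffle.push.enabled is assumed to be the feature flag, and the pool sizes are purely illustrative:

import org.apache.spark.SparkConf

// Illustrative sizing for push-based shuffle thread pools.
val conf = new SparkConf()
  .set("spark.shuffle.push.enabled", "true")            // assumed enabling flag for push-based shuffle
  .set("spark.shuffle.push.numPushThreads", "8")        // executor-side block pusher pool
  .set("spark.shuffle.push.merge.finalizeThreads", "8") // driver-side merge finalization threads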
cc @HyukjinKwon @dongjoon-hyun
dongjoon-hyun left a comment
Thank you for working on this, @dcoliversun.
Can one of the admins verify this patch?
HyukjinKwon left a comment
Took a quick look, LGTM. Thanks for working on this.
Thanks for your help @srowen @HyukjinKwon @dongjoon-hyun
What changes were proposed in this pull request?
This PR aims to supplement configuration.md with the Spark configurations defined in org.apache.spark.internal.config that are currently missing from it.
Why are the changes needed?
It helps users look up configurations in the documentation instead of having to read the code.
Does this PR introduce any user-facing change?
Yes, more configurations are now covered in the documentation.
How was this patch tested?
Pass the GitHub Actions.