Add a note about jobs running in FIFO order in the default pool #20881
Changes from 1 commit
@@ -215,6 +215,9 @@ pool), but inside each pool, jobs run in FIFO order. For example, if you create
means that each user will get an equal share of the cluster, and that each user's queries will run in
order instead of later queries taking resources from that user's earlier ones.

If jobs are not explicitely set to use a given pool, they end up in the default pool. This means that even if
`spark.scheduler.mode` is set to `FAIR` those jobs will be ran in `FIFO` order (within the default pool).

Contributor
This is not actually correct. There is no reason why you can't define a default pool that uses FAIR scheduling.
Author
I assume you mean that the second sentence is incorrect? I drew that conclusion from empirical observations and from spark/core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala, lines 109 to 117 in 992447f.
However, I might very well be missing something?
Contributor
You seem to be missing a few somethings: 1) You can define your own default pool that does FAIR scheduling within that pool, so blanket statements about "the" default pool are dangerous; 2) So, item 2) effectively means that if you just want one scheduling pool that does fair scheduling among its schedulable entities, then you need to set
Author
Cool, thanks @markhamstra. I think I grasp what's going on now. Some form of your comment would be a useful addition to the documentation; rationale being that there seems to be a (common?) misunderstanding about how to schedule jobs in a
## Configuring Pool Properties

Specific pools' properties can also be modified through a configuration file. Each pool supports three
Hi @Alexis-D, there are a few minor typos here:
'explicitely' -> 'explicitly'
'ran' -> 'run'
right my bad -- I updated the PR
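To make the review point about the default pool concrete, here is a minimal sketch of an allocation file that redefines the pool named `default` with FAIR scheduling. It uses only the Python standard library to generate the XML. The helper name `build_allocation_xml`, the weight and minShare values, and the file path in the comments are illustrative assumptions, not something taken from this PR.

```python
import xml.etree.ElementTree as ET


def build_allocation_xml(pools):
    """Return the text of a fair scheduler allocation file.

    `pools` maps a pool name to a (scheduling_mode, weight, min_share) tuple.
    This helper is hypothetical; it only assembles the XML layout that the
    Spark job scheduling documentation describes for pool properties.
    """
    root = ET.Element("allocations")
    for name, (mode, weight, min_share) in pools.items():
        pool = ET.SubElement(root, "pool", name=name)
        ET.SubElement(pool, "schedulingMode").text = mode
        ET.SubElement(pool, "weight").text = str(weight)
        ET.SubElement(pool, "minShare").text = str(min_share)
    return ET.tostring(root, encoding="unicode")


# Redefine the pool named "default" so jobs inside it are scheduled fairly.
# The weight (1) and minShare (0) values are assumptions for illustration.
ALLOCATION_XML = build_allocation_xml({"default": ("FAIR", 1, 0)})

# Assumed wiring, not executed here (the path is a placeholder):
#   spark.scheduler.mode            = FAIR
#   spark.scheduler.allocation.file = /path/to/fairscheduler.xml
```

With a file like this in place, jobs that are not assigned to a pool should still land in `default`, but that pool would no longer be limited to FIFO ordering.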