[SPARK-11307] Reduce memory consumption of OutputCommitCoordinator #9274
```diff
@@ -947,7 +947,13 @@ class DAGScheduler(
     // serializable. If tasks are not serializable, a SparkListenerStageCompleted event
     // will be posted, which should always come after a corresponding SparkListenerStageSubmitted
     // event.
-    outputCommitCoordinator.stageStart(stage.id)
+    stage match {
+      case s: ShuffleMapStage =>
+        outputCommitCoordinator.stageStart(stage = s.id, maxPartitionId = s.numPartitions - 1)
+      case s: ResultStage =>
+        outputCommitCoordinator.stageStart(
+          stage = s.id, maxPartitionId = s.rdd.partitions.length - 1)
+    }
     val taskIdToLocations: Map[Int, Seq[TaskLocation]] = try {
       stage match {
         case s: ShuffleMapStage =>
```

**Author (Contributor):** This result stage case is trickier: for the cases where the OutputCommitCoordinator actually gets invoked, I think it's generally the case that all partitions are being computed, but I guess it's hypothetically possible that a result stage could write results for only one of the RDD's partitions. In this case, I think the partition ids can be larger than
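The hunk above passes a `maxPartitionId` to `stageStart` so the coordinator can pre-size its per-stage state by partition count. A minimal sketch of the idea (this is illustrative only, not Spark's actual `OutputCommitCoordinator`; the class and constant names here are assumptions): a fixed-size array indexed by partition id bounds per-stage memory at O(numPartitions), instead of letting a per-request map grow.

```scala
// Illustrative sketch only -- not Spark's real OutputCommitCoordinator.
// One array slot per partition records which task attempt, if any, is
// authorized to commit that partition's output.
class CommitCoordinatorSketch {
  private val NO_AUTHORIZED_COMMITTER = -1
  // stage id -> array indexed by partition id, holding the authorized attempt
  private val stageStates = scala.collection.mutable.Map[Int, Array[Int]]()

  def stageStart(stage: Int, maxPartitionId: Int): Unit = {
    // Pre-size the state: partition ids range over 0..maxPartitionId.
    stageStates(stage) = Array.fill(maxPartitionId + 1)(NO_AUTHORIZED_COMMITTER)
  }

  def canCommit(stage: Int, partition: Int, attempt: Int): Boolean = {
    stageStates.get(stage) match {
      case Some(attempts) if attempts(partition) == NO_AUTHORIZED_COMMITTER =>
        attempts(partition) = attempt  // first attempt to ask wins
        true
      case Some(attempts) =>
        attempts(partition) == attempt // only the recorded attempt may commit
      case None =>
        false                          // unknown stage: deny
    }
  }

  def stageEnd(stage: Int): Unit = stageStates.remove(stage)
}
```

Sizing the array from `maxPartitionId` is why the patch computes `s.numPartitions - 1` for shuffle map stages and `s.rdd.partitions.length - 1` for result stages.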
**Reviewer:** A shuffle map stage's maximum partition id is determined by the number of partitions in the RDD being computed.
**Reviewer:** As I was reviewing this, I was wondering if a `ShuffleMapStage` could have a different maximum partitionId if it was from a skipped stage. I'm now convinced it cannot, but it might be a bit clearer if we change the constructor to not even take a `numTasks` argument, since it should always be `rdd.partitions.length`? Not necessary for this change, but just a thought while you are touching this.

Also, isn't the output commit coordinator irrelevant for `ShuffleMapStage`s anyway? If not, then I think there might be another bug there for skipped stages. Since it indexes by stageId, you can have two different stages that really represent the exact same shuffle, so you could have two different tasks authorized to commit that are handling the same stage. (Which wouldn't be a problem introduced by this change, but I just thought it was worth mentioning.)
**Author:** Yeah, it should be irrelevant for `ShuffleMapStage`s. I was just being overly conservative here.
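The skipped-stage hazard raised above can be sketched as follows (the stage and attempt ids here are hypothetical, and this is a toy model of authorization state, not Spark's code): if authorization is keyed by stage id alone, two distinct stages that recompute the same shuffle are tracked independently, so each can authorize its own committer for the same partition.

```scala
import scala.collection.mutable

// Toy model of the concern: authorization keyed by (stageId, partition)
// has no notion of two stage ids sharing one underlying shuffle.
val authorized = mutable.Map[(Int, Int), Int]() // (stageId, partition) -> attempt

def canCommit(stageId: Int, partition: Int, attempt: Int): Boolean =
  authorized.getOrElseUpdate((stageId, partition), attempt) == attempt

// Hypothetically, stage 4 and stage 9 both recompute the same shuffle:
val firstWins  = canCommit(stageId = 4, partition = 0, attempt = 10)
val secondWins = canCommit(stageId = 9, partition = 0, attempt = 11)
// Both calls succeed: nothing ties the two stage ids back to the shared
// shuffle, so two different tasks end up authorized for the same output.
```

As the author notes, this is moot in practice if the coordinator is never consulted for shuffle map stages, which is why the concern was flagged as pre-existing rather than introduced by this change.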