
Conversation

@JoshRosen
Contributor

OutputCommitCoordinator uses a map in a place where an array would suffice, increasing its memory consumption for result stages with millions of tasks.

This patch replaces that map with an array. The only tricky part is reasoning about the range of possible array indices in order to guarantee that we never index out of bounds.
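To illustrate the idea (this is a simplified sketch, not Spark's actual `OutputCommitCoordinator` code; the class and method names below are hypothetical): a hash map keyed by partition id allocates one boxed entry object per partition, while an array sized to the stage's known partition count stores the same "authorized committer" state flat, with no per-entry objects. The array version is safe as long as every task's partition id is in `0 until numPartitions`.

```scala
// Hypothetical sketch of tracking the authorized commit attempt per partition.
object CommitState {
  val NO_AUTHORIZED_COMMITTER: Int = -1

  // Map-based version: one entry object per partition that ever asks to commit.
  class MapBased {
    private val authorized = scala.collection.mutable.Map.empty[Int, Int]
    def canCommit(partition: Int, attempt: Int): Boolean =
      authorized.get(partition) match {
        case Some(winner) => winner == attempt   // only the first authorized attempt may commit
        case None =>
          authorized(partition) = attempt        // first request wins
          true
      }
  }

  // Array-based version: valid indices are 0 until numPartitions, so as long as
  // partition ids are bounded by the stage's partition count, no index can be
  // out of bounds. No per-entry allocation; just one flat Int array.
  class ArrayBased(numPartitions: Int) {
    private val authorized = Array.fill(numPartitions)(NO_AUTHORIZED_COMMITTER)
    def canCommit(partition: Int, attempt: Int): Boolean =
      if (authorized(partition) == NO_AUTHORIZED_COMMITTER) {
        authorized(partition) = attempt
        true
      } else {
        authorized(partition) == attempt
      }
  }
}
```

For a result stage with millions of partitions, the array needs only one `Int` slot per partition, which is where the GC-pressure savings mentioned below come from.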

Contributor Author


A shuffle map stage's maximum partition id is determined by the number of partitions in the RDD being computed.

Contributor


As I was reviewing this, I wondered whether a ShuffleMapStage could have a different maximum partitionId if it came from a skipped stage. I'm now convinced it cannot, but it might be a bit clearer if we changed the constructor to not even take a numTasks argument, since it should always be rdd.partitions.length. Not necessary for this change, but just a thought while you're touching this code.

Also -- isn't the output commit coordinator irrelevant for ShuffleMapStages anyway? If not, then I think there might be another bug there for skipped stages: since it indexes by stageId, you can have two different stages that really represent the exact same shuffle, so you could have two different tasks authorized to commit for the same stage. (That wouldn't be a problem introduced by this change, but I thought it was worth mentioning.)

Contributor Author


Yeah, it should be irrelevant for ShuffleMapStages. I was just being overly conservative here.

@JoshRosen
Contributor Author

/cc @kayousterhout @markhamstra, this seems like a potentially easy win for reducing driver memory consumption when performing a write that outputs millions of partitions. This isn't necessarily a huge amount of memory savings, but it's a substantial reduction in the number of map entry objects created, which could have GC benefits.

@SparkQA

SparkQA commented Oct 26, 2015

Test build #44327 has finished for PR 9274 at commit 9dc210e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Contributor

squito commented Oct 27, 2015

One small question; overall LGTM. I'm not very familiar with the speculative execution code, though, so I'd appreciate an expert opinion there.

@davies
Contributor

davies commented Nov 4, 2015

LGTM

@SparkQA

SparkQA commented Nov 5, 2015

Test build #45056 has finished for PR 9274 at commit 5085aa8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor

davies commented Nov 5, 2015

Merged into master, thanks!

@asfgit asfgit closed this in d0b5633 Nov 5, 2015
@JoshRosen JoshRosen deleted the SPARK-11307 branch November 5, 2015 18:27
