[SPARK-4666] Improve YarnAllocator's parsing of "memory overhead" param #3525
Conversation
Can one of the admins verify this patch?
possibly relevant to @sryza
Hey @ryan-williams, mind filing a JIRA for this?
In Spark 1.2, the memory overhead defaults to 7% of the executor memory. Have you noticed that you need larger than this fraction? In the change that added that fraction, there was some concern about having two different params (a constant overhead and a fraction) to control the same value.
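For concreteness, a rough sketch of the kind of default being discussed (a fraction of executor memory with an absolute floor); the object and constant names here are illustrative, not a quote of Spark's source:

```scala
// Illustrative only: the overhead defaults to a fraction of executor memory,
// but never less than an absolute floor, unless the user sets it explicitly.
object OverheadDefault {
  val OverheadFraction = 0.07 // the hard-coded 7% under discussion
  val OverheadMinMb    = 384  // absolute lower bound, in MB

  def defaultOverheadMb(executorMemoryMb: Int): Int =
    math.max((OverheadFraction * executorMemoryMb).toInt, OverheadMinMb)
}
```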
Sure, will file a JIRA. To answer what I think is the spirit of your question: whether a Spark user wants more or less than 7% seems like something that should be configurable. I have definitely wanted to set this value closer to 10% of the executor memory, or higher, to rule it out as the cause when debugging some of my own Spark jobs. I can't swear that I've seen YARN kill my executors for exceeding the 7%, but I've definitely wanted to configure it. I can see that it's unusual to have two params that both affect this value, but it seems like the right tradeoff: the fraction more closely models what we want, but by an accident of history we have this other value and we don't want to break backwards-compatibility. Everyone I've heard talk about how to think about the overhead (which is mostly you, admittedly) talks about it in terms of a fraction of the executor memory. If you think the best course is to deprecate the absolute-amount-of-memory param as well, I'm open to that, but I don't think the status quo (1. a hard-coded value for the logical param we care about, and 2. a param that sets the value we want indirectly) makes much sense.
There are actually two JIRAs I'll file here:
Filed SPARK-4665 and SPARK-4666, noted the former in the PR title, and put a note about both in the opening PR comment. lmk if there's some other syntax for addressing two JIRAs that is warranted.
@ryan-williams your reasoning makes sense to me. @andrewor14 @tgravescs what do you think about deprecating memoryOverhead and adding memoryOverheadFraction?
Perhaps I've missed it, but I haven't heard a lot of cases for either way. Do you have examples or use cases? I'd be open to changing it but want more reasoning behind it. I've found putting in the value rather than a % easier in some cases that weren't small/straightforward jobs.
@arahuja recently had YARN killing executors until he bumped memory-overhead to 4-5GB, on executor memory of 20-30GB, so the hard-coded 7% was not enough for him. In general, this fraction should be configurable; maybe some people want <7% too! 7% is not special, afaik. @sryza has given me the impression that the overhead tends to grow proportionally to the executor memory, meaning that allowing people to configure the fraction makes as much or more sense than having them do some division and tweak their cmd-line flag for every job in order to specify an absolute amount of memory.
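To put numbers on that report (figures from the comment above): 0.07 × 20 GB ≈ 1.4 GB and 0.07 × 30 GB ≈ 2.1 GB, while the overhead that actually worked was 4–5 GB, i.e. roughly 13–25% of executor memory.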
Basically I would like some hard data or user input before changing this, since changing/adding configs can just lead to more confusion. I have seen a few cases where a % wasn't ideal. I've seen others where 7% worked just fine. Overall I'd like to understand whether % covers the majority of cases or whether, say, it is only useful for ETL but breaks down for ML or GraphX. We had this discussion briefly before and left it at the value vs. %, and I haven't really seen much more data to change it, but perhaps others have?
So it turns out that it was actually Nishkam Ravi who made b4fb7b8, which added this 7%. Excerpts from that commit that are live in comments and Spark documentation today:
There is a hard-coded 7% sitting in the codebase that is "load-bearing" (it dominates the other lower bound of 384MB above around 5GB of executor memory), with comments that say sometimes a person may want more than 7%. I understand that in general a proliferation of configuration parameters can be confusing, but IMHO it is more confusing that there are already two logical configuration parameters here, with various reasonable values for them discussed in the documentation, yet one of them is not actually configurable. Also, re: user input, I am reporting, as a user, that I have had to bump this value.
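For reference, the crossover point where the 7% fraction overtakes the 384 MB floor: 0.07 × m ≥ 384 MB once m ≥ 384 / 0.07 ≈ 5486 MB, i.e. roughly 5.4 GB of executor memory, which matches the "above around 5GB" figure above.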
FWIW I think the logic behind the existing parameters is: 7% should be enough for anyone, and if it isn't, they know it and have a clear idea of just how much they want the overhead to be. Or: if 7% is consistently not enough, it should be higher, like 10%. I'd leave the existing param unless it's obviously problematic. But that's separate from the question here of whether lots of these strings could be specified in mega- or gigabytes. Is that still something that can move forward? The refactoring of all that seems helpful in any event.
Not a huge deal, but I think the converse is kind of true. In my experience, the right percentage is totally unclear to everyone in all situations, so pretty much always needs to be set experimentally. If I settle on a percentage that works for my job (or all my jobs), it would be nice for me not to need to reset this param every time I tweak spark.executor.memory.
OK maybe there's enough momentum for ... two flags? one of which overrides the other? But maybe we should also bump up the default a bit too.
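If it did end up as two flags, a rough sketch of the "one overrides the other" resolution (the fraction key and helper name are hypothetical; only spark.yarn.executor.memoryOverhead exists today):

```scala
import org.apache.spark.SparkConf

object OverheadResolution {
  // Hypothetical resolution order: an explicitly set absolute overhead wins;
  // otherwise fall back to a fraction of executor memory, floored at 384 MB.
  def resolveOverheadMb(conf: SparkConf, executorMemoryMb: Int): Int =
    conf.getOption("spark.yarn.executor.memoryOverhead").map(_.toInt).getOrElse {
      val fraction = conf.getDouble("spark.yarn.executor.memoryOverheadFraction", 0.07)
      math.max((fraction * executorMemoryMb).toInt, 384)
    }
}
```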
I'm still not on board with having 2 different configs. You then have to explain to the user what they are, which one takes precedence, etc. I can see cases where the % is useful, but then again you can argue the other way too. If my job needs 4G overhead, I may want to increase the executor memory without changing the overhead amount; in that case I then have to go mess with the % to get it back to what I want. Either way, I think the user has to experiment to get the best number. I haven't seen any reason to bump up the default at this point, but if others have more real-world data let's consider it. How many people are complaining or having real issues with this? The main reason I'm against adding this is that I consider it an API change, and we would then have to support 2 configs until we can get rid of the deprecated one. It's just more dev overhead, testing, and potential user confusion. If it's enough of a benefit to the user to change it then I would be OK with it, but would rather deprecate the existing one in favor of the %. To me the raw number is more flexible than the %, but I see the % as being easier in some situations.
Earlier I thought both configs could coexist, but after @tgravescs's comments I think the absolute overhead is more necessary than an overheadFraction: sometimes executor memory is very large and we just want to bump the overhead without it growing in proportion, and in that case a memoryOverheadFraction would produce a very large, inappropriate value.
Thanks for resurrecting this, @srowen. I think there are 3 potentially separable changes here, in rough order of increasing controversy:
If people prefer, I can make a separate PR with 1∪2, or individual PRs for each of 1 and 2. Re: 3, I am sympathetic to @tgravescs's arguments, but worry that we are conflating [wishing Spark's configuration surface area was smaller] with [forcing users to build custom Sparks to get at real, existing niches of said configuration surface area (in this case, to change a hard-coded fraction)]. However, I don't mind much one way or another at this point, so I'm fine dropping 3 if that's what consensus here prefers.
(until I hear otherwise I am pulling/rebasing 1∪2, roughly the SPARK-4666 bits, into their own PR)
Without clear consensus at this point, and given 3 is really a question of how you like to override a default in the same way you override others, I'd do 1+2 for the moment.
5b36139 to 766e7f2
Saw this while compiling:
[WARNING] /Users/ryan/c/spark/streaming/src/main/scala/org/apache/spark/streaming/util/WriteAheadLogManager.scala:156: postfix operator second should be enabled
by making the implicit value scala.language.postfixOps visible.
This can be achieved by adding the import clause 'import scala.language.postfixOps'
or by setting the compiler option -language:postfixOps.
See the Scala docs for value scala.language.postfixOps for a discussion
why the feature should be explicitly enabled.
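For reference, the fix the warning points at is just the extra import; a minimal example (the Duration postfix call below is only for illustration):

```scala
import scala.concurrent.duration._
import scala.language.postfixOps // enables postfix operator syntax, silencing the warning

object PostfixExample {
  val timeout = 5 seconds // postfix call: no dot, no parentheses
}
```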
Yes that's the right thing, if you think that the use of postfix ops is justified.
766e7f2 to 8959717
OK, this is ready to go again. Changes:
Let me know what you think.
Hm, why is this property special-cased here? The methods in this class are generally generic.
I weighed two options:
- have a method in Utils that takes a SparkConf as a parameter and thinly wraps a method call on said SparkConf
- make the aforementioned wrapper a method on SparkConf that delegates to another method on SparkConf

...and felt like the latter was better/cleaner. My feeling was that a kitchen-sink / generically-named Utils class that wraps methods for SparkConf (and possibly other classes?) to maintain an illusion of simplicity in the SparkConf API is not helping code clarity.
Of course, this is subjective and I'm open to putting it back in Utils, lmk.
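To make the trade-off concrete, a rough sketch of the second option above: a named accessor that delegates to the generic SparkConf getters (the method name, default, and the memoryStringToBytes placeholder are illustrative, not the actual patch):

```scala
import org.apache.spark.SparkConf

object SparkConfAccessors {
  // Placeholder for whatever size-string parser the real code uses.
  def memoryStringToBytes(s: String): Long = ???

  // One named method delegating to the generic getters, so both call-sites
  // share a single definition of the config key and its default.
  def getMaxResultSize(conf: SparkConf): Long =
    memoryStringToBytes(conf.get("spark.driver.maxResultSize", "1g"))
}
```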
NB: that was an answer to why this property is special-cased here, as opposed to over in Utils. You may be more interested in the question of why it's special-cased at all, but that seems reasonable to me given the couple magic strings and its greater-than-1 call-sites (namely: 2).
I would have picked Utils I suppose. Or is there somewhere less generic to put this that covers the 2 call sites? Other opinions?
Option 3 could be: put such methods in a SparkConfUtils object that would be less prone to kitchen-sink-iness.
I'm impartial; I'll let you make the call between these three.
I'd like another set of eyes on the change from @pwendell or @JoshRosen . The reason I hesitate before merging is the large number of small changes and merge conflict potential. Although I do definitely think the variable names are improved by the suffix.
For this particular issue, maybe expose just getMemory from this class, and inline the two simple calls to it that currently use getMaxResultSize? Writing getMemory("spark.driver.maxResultSize", "1g", outputScale = 'b') in two places isn't bad versus constructing a new home for it.
hm, I'd vote we put it back in Utils over un-factoring those two calls that are doing the same thing as one another
Previously it assumed a unitless number represented raw bytes, but I want to use it for a config variable that previously defaulted to # of megabytes and not break backwards-compatibility.
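A rough sketch of the backwards-compatible behaviour described above, in which a unitless number is interpreted via a caller-supplied default unit rather than always as raw bytes (the function name and signature are mine, not the patch's):

```scala
object MemoryStrings {
  // Hypothetical helper: parse strings like "512", "512m", "2g" into megabytes.
  // A bare number is interpreted using `defaultUnit`, so existing configs that
  // meant megabytes keep working while byte-oriented callers can pass 'b'.
  def parseMemoryToMb(s: String, defaultUnit: Char = 'm'): Long = {
    val lower = s.trim.toLowerCase
    val (digits, unit) =
      if (lower.last.isDigit) (lower, defaultUnit) else (lower.init, lower.last)
    unit match {
      case 'b' => digits.toLong / (1024L * 1024L)
      case 'k' => digits.toLong / 1024L
      case 'm' => digits.toLong
      case 'g' => digits.toLong * 1024L
      case 't' => digits.toLong * 1024L * 1024L
      case _   => throw new IllegalArgumentException(s"Unknown memory unit in: $s")
    }
  }
}
```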
8959717 to 2ebb55a
OK, I think it is back to you @srowen
(thanks for the review!)
@srowen this one ready to go in?
@hammer @ryan-williams I'd prefer to hear from @JoshRosen or @pwendell before merging. The two key questions are:
I think that this PR's motivation may have been addressed by #5574. @ilganeli @srowen @ryan-williams, can you confirm whether there's still an issue here and resolve this PR / JIRAs as appropriate?
I think this would have to be reworked in light of SPARK-5932. I don't think that resolved this particular suggestion. Proceed that way or close this PR?
@srowen @JoshRosen I think this should be refactored to use the updates from #5574, but I don't think #5574 resolves this on its own because of the need to handle the min/max allocation - my 2c.
@ryan-williams would you mind opening a new PR to reflect the changes in #5574? Either that or just update this one, though the conflicts may be non-trivial since this has gotten stale. Up to you.
I've lost state on this; closing. Thanks all.
Fixes SPARK-4665 and SPARK-4666.