Closed
24 commits
3e1cc29
rename: executorMemory -> executorMemoryMB
ryan-williams Feb 10, 2015
cba802c
rename: amMemory -> amMemoryMB
ryan-williams Feb 10, 2015
0a717d7
rename: driverMemory -> driverMemoryMB
ryan-williams Feb 10, 2015
6635945
rename: DEFAULT_MEMORY -> DEFAULT_MEMORY_MB
ryan-williams Feb 10, 2015
96dfec0
rename: amMemoryOverhead -> amMemoryOverheadMB
ryan-williams Feb 10, 2015
5a3e4b8
rename: executorMemoryOverhead -> executorMemoryOverheadMB
ryan-williams Feb 10, 2015
63086eb
rename: amMemoryOverhead -> amMemoryOverheadMB
ryan-williams Feb 10, 2015
f54b0ce
rename: executorMemoryOverhead -> executorMemoryOverheadMB
ryan-williams Feb 10, 2015
fa3d69f
rename: MEMORY_OVERHEAD_MIN -> MEMORY_OVERHEAD_MIN_MB
ryan-williams Feb 10, 2015
f265d15
fix deprecation warning
ryan-williams Feb 9, 2015
c29da1d
rename: executorMemory -> executorMemoryMB
ryan-williams Feb 10, 2015
23a77be
rename: executorMemory -> executorMemoryMB
ryan-williams Feb 10, 2015
5057bd3
rename: memoryOverhead -> memoryOverheadMB
ryan-williams Feb 10, 2015
14bd3d5
rename: memory -> memoryMB
ryan-williams Feb 10, 2015
6e69b08
rename: mem -> memMB
ryan-williams Feb 10, 2015
48c5115
memoryStringToMb can have default scale specified
ryan-williams Nov 13, 2014
dc03bf2
move getMaxResultSize from Utils to SparkConf
ryan-williams Feb 10, 2015
40ac6ce
privatize amMemoryOverheadConf
ryan-williams Feb 10, 2015
dd9be85
refactor memory-size order-of-magnitude constants
ryan-williams Feb 10, 2015
bb66b22
add memory-string-parsing helpers to Utils
ryan-williams Feb 10, 2015
960b525
add `getMemory`, `getMB` helpers to SparkConf
ryan-williams Feb 10, 2015
e038867
use SparkConf.getMB helper in Yarn memory parsing
ryan-williams Feb 10, 2015
2ebb55a
update docs about YARN memory overhead parameters
ryan-williams Feb 10, 2015
50f0f52
code review feedback
ryan-williams Feb 10, 2015
add memory-string-parsing helpers to Utils
ryan-williams committed Feb 10, 2015
commit bb66b222745a85477f32ce03bb72f2e452e5a670
79 changes: 56 additions & 23 deletions core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -986,34 +986,67 @@ private[spark] object Utils extends Logging {
private val MB = 1L << 20
private val KB = 1L << 10

private val scaleCharToFactor: Map[Char, Long] = Map(
'b' -> 1L,
'k' -> KB,
'm' -> MB,
'g' -> GB,
't' -> TB
)

/**
* Convert a Java memory parameter passed to -Xmx (such as 300m or 1g) to a number of megabytes.
*/
def memoryStringToMb(str: String): Int = memoryStringToMb(str, 'b')
def memoryStringToMb(str: String, defaultScale: Char): Int = {
* Convert a Java memory parameter passed to -Xmx (such as "300m" or "1g") to a number of
* megabytes (or other byte-scale denominations as specified by @outputScaleChar).
*
* For @defaultInputScaleChar and @outputScaleChar, valid values are: 'b' (bytes), 'k'
* (kilobytes), 'm' (megabytes), 'g' (gigabytes), and 't' (terabytes).
*
* @param str String to parse an amount of memory out of
* @param defaultInputScaleChar if no "scale" is provided on the end of @str (i.e. @str is a
* plain numeric value), assume this scale (default: 'b' for
* 'bytes')
* @param outputScaleChar express the output in this scale, i.e. number of bytes, kilobytes,
* megabytes, or gigabytes.
*/
def parseMemoryString(
str: String,
defaultInputScaleChar: Char = 'b',
outputScaleChar: Char = 'm'): Long = {

val lower = str.toLowerCase
val lastChar = lower(lower.length - 1)
val scale =
if (lastChar.isDigit)
defaultScale
else
lastChar

if (scale == 'k') {
(lower.substring(0, lower.length-1).toLong / 1024).toInt
} else if (scale == 'm') {
lower.substring(0, lower.length-1).toInt
} else if (scale == 'g') {
lower.substring(0, lower.length-1).toInt * 1024
} else if (scale == 't') {
lower.substring(0, lower.length-1).toInt * 1024 * 1024
} else if (scale == 'b') {// no suffix, so it's just a number in bytes
(lower.toLong / 1024 / 1024).toInt
} else {
throw new IllegalArgumentException("Invalid memory string: %s".format(str))
}
val (num, inputScaleChar) =
if (lastChar.isDigit) {
(lower.toLong, defaultInputScaleChar)
} else {
(lower.substring(0, lower.length - 1).toLong, lastChar)
}

(for {
Member

Why is a for construction used here? Just to handle the invalid in/out scale param? My personal taste would be to just check that the Option[Long] exists directly and do away with it. I can't tell how much that's just me versus how the kids talk in Scala these days. A looping construct surprised me, as there is no loop.

Contributor Author

> Why is a for construction used here? just to handle the invalid in / out scale param?

for comprehensions are commonly used when mapping over 2 or more monads to avoid arguably-ugly .flatMap-chaining; the syntax just removes some boilerplate, e.g.:

    scaleCharToFactor.get(inputScaleChar).flatMap(inputScale => 
      scaleCharToFactor.get(outputScaleChar).map(outputScale =>
        inputScale * num / outputScale
      )
    ).getOrElse(
        // throw
      )

vs.

    (for {
      inputScale <- scaleCharToFactor.get(inputScaleChar)
      outputScale <- scaleCharToFactor.get(outputScaleChar)
    } yield {
      inputScale * num / outputScale
    }).getOrElse(
        // throw
      )

(I collapsed the scale wrapper line in the latter for apples-to-apples brevity, and can do that in the PR as well).

So, it's not a "looping construct" so much as a "mapping" one, commonly used on Options, Lists, and even things like twitter Futures (search for "for {").

However, it pays off more as the number of chained maps increases, especially when there are 3 or more, so I'm not that tied to it here.

Of course, handling such situations using a match is also possible:

    (scaleCharToFactor.get(inputScaleChar), scaleCharToFactor.get(outputScaleChar)) match {
      case (Some(inputScale), Some(outputScale)) =>
        inputScale * num / outputScale
      case _ =>
        // throw
    }

I prefer all of these to, say, the following straw-man that doesn't take advantage of any of the nice things that come from using Options:

    if (scaleCharToFactor.get(inputScaleChar).isDefined && 
        scaleCharToFactor.get(outputScaleChar).isDefined)
      scaleCharToFactor.get(inputScaleChar).get * num / scaleCharToFactor.get(outputScaleChar).get
    else
      // throw

but I'll defer to you on the approach you like best.
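
To make the comparison above concrete, here is a minimal, self-contained sketch of three of the variants side by side. It is illustrative only: the object name and the `viaFor` / `viaFlatMap` / `viaMatch` helpers are hypothetical, the `'g'` and `'t'` factors are assumed to be `1L << 30` and `1L << 40` (the `GB` and `TB` definitions fall outside the visible hunk), and each variant returns an `Option[Long]` rather than throwing so the results can be compared directly.

```scala
object ScaleConversionSketch {
  // Assumed to mirror scaleCharToFactor in the diff; the 'g' and 't' values are assumptions.
  val scaleCharToFactor: Map[Char, Long] = Map(
    'b' -> 1L,
    'k' -> (1L << 10),
    'm' -> (1L << 20),
    'g' -> (1L << 30),
    't' -> (1L << 40)
  )

  // for comprehension over two Options: yields None if either lookup fails.
  def viaFor(num: Long, in: Char, out: Char): Option[Long] =
    for {
      inputScale <- scaleCharToFactor.get(in)
      outputScale <- scaleCharToFactor.get(out)
    } yield inputScale * num / outputScale

  // The explicit flatMap/map chain that the for comprehension stands in for.
  def viaFlatMap(num: Long, in: Char, out: Char): Option[Long] =
    scaleCharToFactor.get(in).flatMap { inputScale =>
      scaleCharToFactor.get(out).map { outputScale =>
        inputScale * num / outputScale
      }
    }

  // Pattern match on the pair of lookups.
  def viaMatch(num: Long, in: Char, out: Char): Option[Long] =
    (scaleCharToFactor.get(in), scaleCharToFactor.get(out)) match {
      case (Some(inputScale), Some(outputScale)) => Some(inputScale * num / outputScale)
      case _ => None
    }
}
```

All three return the same value for the same arguments; for instance, converting 2 gigabytes to megabytes gives `Some(2048)` in each case, and an unknown scale character gives `None`.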

Member

Yeah I get what it does. I was comparing in my mind to...

    val inputScale = scaleCharToFactor.get(inputScaleChar)
    val outputScale = scaleCharToFactor.get(outputScaleChar)
    require(inputScale.isDefined)
    require(outputScale.isDefined)
    inputScale.get * num / outputScale.get

Here's another instance where I wouldn't mind hearing another opinion, as it's a good, more general style question.
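
For reference, a runnable sketch of the `require`-based alternative described in this comment might look like the following. The object and method names are hypothetical, the factor values for `'g'` and `'t'` are assumed, and the failure messages are additions for illustration (the snippet in the comment calls `require` without a message). `require` throws an `IllegalArgumentException` when its condition is false, which is the same exception type the PR throws in its `getOrElse` branch.

```scala
object RequireSketch {
  // Assumed to mirror scaleCharToFactor in the diff; the 'g' and 't' values are assumptions.
  val scaleCharToFactor: Map[Char, Long] = Map(
    'b' -> 1L,
    'k' -> (1L << 10),
    'm' -> (1L << 20),
    'g' -> (1L << 30),
    't' -> (1L << 40)
  )

  // Look both factors up first, validate with require, then compute.
  def convert(num: Long, inputScaleChar: Char, outputScaleChar: Char): Long = {
    val inputScale = scaleCharToFactor.get(inputScaleChar)
    val outputScale = scaleCharToFactor.get(outputScaleChar)
    // The messages are illustrative additions, not part of the original comment.
    require(inputScale.isDefined, s"Invalid input scale: $inputScaleChar")
    require(outputScale.isDefined, s"Invalid output scale: $outputScaleChar")
    inputScale.get * num / outputScale.get
  }
}
```

One practical difference from the `Option`-chaining variants is that the failure message can say which of the two scale characters was invalid.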

inputScale <- scaleCharToFactor.get(inputScaleChar)
outputScale <- scaleCharToFactor.get(outputScaleChar)
scale = inputScale * num / outputScale
} yield {
scale
}).getOrElse(
throw new IllegalArgumentException(
"Invalid memory string or scale: %s, %s, %s".format(
str,
defaultInputScaleChar,
outputScaleChar
)
)
)
}

/**
* Wrapper for @parseMemoryString taking default arguments and returning an int, which is safe
* since we are converting to a number of megabytes.
*/
def memoryStringToMb(str: String): Int = memoryStringToMb(str, defaultInputScale = 'b')
def memoryStringToMb(str: String, defaultInputScale: Char = 'b'): Int =
parseMemoryString(str, defaultInputScale, 'm').toInt
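
As a rough illustration of the behaviour documented above, the following standalone sketch reimplements the parsing logic outside of `Utils`. It is not the PR's code: the object and method names are hypothetical, the `'g'` and `'t'` factors are assumed, and the `trim` call and empty-string check are additions that the diff does not contain.

```scala
object ParseMemorySketch {
  // Assumed byte factors per scale character; the 'g' and 't' values are assumptions.
  private val factors: Map[Char, Long] = Map(
    'b' -> 1L,
    'k' -> (1L << 10),
    'm' -> (1L << 20),
    'g' -> (1L << 30),
    't' -> (1L << 40)
  )

  def parse(
      str: String,
      defaultInputScaleChar: Char = 'b',
      outputScaleChar: Char = 'm'): Long = {
    val lower = str.trim.toLowerCase
    // Addition for this sketch: reject empty input before indexing into it.
    require(lower.nonEmpty, "Empty memory string")
    val lastChar = lower.last
    // A trailing digit means no suffix was given, so fall back to the default scale.
    val (num, inputScaleChar) =
      if (lastChar.isDigit) (lower.toLong, defaultInputScaleChar)
      else (lower.init.toLong, lastChar)
    (for {
      inputScale <- factors.get(inputScaleChar)
      outputScale <- factors.get(outputScaleChar)
    } yield inputScale * num / outputScale).getOrElse(
      throw new IllegalArgumentException(s"Invalid memory string or scale: $str")
    )
  }
}
```

With these assumptions, `ParseMemorySketch.parse("300m")` gives 300, `ParseMemorySketch.parse("1g")` gives 1024, and a bare `"1048576"` is read as bytes and gives 1. Passing `defaultInputScaleChar = 'm'` makes a bare `"512"` mean 512 megabytes.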

/**
* Convert a quantity in bytes to a human-readable string such as "4.0 MB".
*/