Commits (60)
59bf9e1
[SPARK-5931] Updated Utils and JavaUtils classes to add helper method…
Mar 27, 2015
404f8c3
Updated usage of spark.core.connection.ack.wait.timeout
Mar 27, 2015
7db6d2a
Updated usage of spark.akka.timeout
Mar 27, 2015
4933fda
Updated usage of spark.storage.blockManagerSlaveTimeout
Mar 27, 2015
c9f5cad
Updated spark.shuffle.io.retryWait
Mar 27, 2015
21ef3dd
updated spark.shuffle.sasl.timeout
Mar 27, 2015
064ebd6
Updated usage of spark.cleaner.ttl
Mar 27, 2015
7320c87
updated spark.akka.heartbeat.interval
Mar 27, 2015
272c215
Updated spark.locality.wait
Mar 27, 2015
3352d34
Updated spark.scheduler.maxRegisteredResourcesWaitingTime
Mar 27, 2015
3f1cfc8
Updated spark.scheduler.revive.interval
Mar 27, 2015
6d1518e
Updated spark.speculation.interval
Mar 27, 2015
2fcc91c
Updated spark.dynamicAllocation.executorIdleTimeout
Mar 27, 2015
5181597
Updated spark.dynamicAllocation.schedulerBacklogTimeout
Mar 27, 2015
c6a0095
Updated spark.core.connection.auth.wait.timeout
Mar 27, 2015
cde9bff
Updated spark.streaming.blockInterval
Mar 27, 2015
42477aa
Updated configuration doc with note on specifying time properties
Mar 27, 2015
9a29d8d
Fixed misuse of time in streaming context test
Mar 27, 2015
34f87c2
Update Utils.scala
ilganeli Mar 28, 2015
8f741e1
Update JavaUtils.java
ilganeli Mar 28, 2015
9e2547c
Reverting doc changes
Mar 30, 2015
499bdf0
Merge branch 'SPARK-5931' of github.com:ilganeli/spark into SPARK-5931
Mar 30, 2015
5232a36
[SPARK-5931] Changed default behavior of time string conversion.
Mar 30, 2015
3a12dd8
Updated host receiver
Mar 30, 2015
68f4e93
Updated more files to clean up usage of default time strings
Mar 30, 2015
70ac213
Fixed remaining usages to be consistent. Updated Java-side time conve…
Mar 30, 2015
647b5ac
Updated time conversion to use map iterator instead of if fall-through
Mar 31, 2015
1c0c07c
Updated Java code to add day, minutes, and hours
Mar 31, 2015
8613631
Whitespace
Mar 31, 2015
bac9edf
More whitespace
Mar 31, 2015
1858197
Fixed bug where all time was being converted to us instead of the app…
Mar 31, 2015
3b126e1
Fixed conversion to US from seconds
Mar 31, 2015
39164f9
[SPARK-5931] Updated Java conversion to be similar to scala conversio…
Mar 31, 2015
b2fc965
Replaced getOrDefault since it's not present in this version of Java
Mar 31, 2015
dd0a680
Updated scala code to call into java
Mar 31, 2015
bf779b0
Special handling of overlapping suffixes for Java
Apr 1, 2015
76cfa27
[SPARK-5931] Minor nit fixes
Apr 1, 2015
5193d5f
Resolved merge conflicts
Apr 6, 2015
6387772
Updated suffix handling to handle overlap of units more gracefully
Apr 6, 2015
19c31af
Added cleaner computation of time conversions in tests
Apr 6, 2015
ff40bfe
Updated tests to fix small bugs
Apr 6, 2015
28187bf
Convert straight to seconds
Apr 6, 2015
1465390
Nit
Apr 6, 2015
cbf41db
Got rid of thrown exceptions
Apr 7, 2015
d4efd26
Added time conversion for yarn.scheduler.heartbeat.interval-ms
Apr 8, 2015
4e48679
Fixed priority order and mixed up conversions in a couple spots
Apr 8, 2015
1a1122c
Formatting fixes and added m for use as minute formatter
Apr 8, 2015
cbd2ca6
Formatting error
Apr 8, 2015
6f651a8
Now using regexes to simplify code in parseTimeString. Introduces get…
Apr 8, 2015
7d19cdd
Added fix for possible NPE
Apr 8, 2015
dc7bd08
Fixed error in exception handling
Apr 8, 2015
69fedcc
Added test for zero
Apr 8, 2015
8927e66
Fixed handling of -1
Apr 9, 2015
642a06d
Fixed logic for invalid suffixes and added matching test
Apr 9, 2015
25d3f52
Minor nit fixes
Apr 11, 2015
bc04e05
Minor fixes and doc updates
Apr 13, 2015
951ca2d
Made the most recent round of changes
Apr 13, 2015
f5fafcd
Doc updates
Apr 13, 2015
de3bff9
Fixing style errors
Apr 13, 2015
4526c81
Update configuration.md
Apr 13, 2015
Commit 6f651a8ab68c5ceb00a22ce5b68eff5bc51e631b
Now using regexes to simplify code in parseTimeString. Introduces getTimeAsSec and getTimeAsMs methods in SparkConf. Updated documentation
Ilya Ganelin committed Apr 8, 2015
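For context, a rough sketch of what a regex-driven time parser along these lines can look like. This is illustrative only, written against java.util.concurrent.TimeUnit; the object name, the suffix table, and the error message are assumptions, not the PR's exact Utils/JavaUtils code:

import java.util.concurrent.TimeUnit

// Sketch only: map each recognized suffix to a TimeUnit.
object TimeParse {
  private val suffixes = Map(
    "us" -> TimeUnit.MICROSECONDS,
    "ms" -> TimeUnit.MILLISECONDS,
    "s" -> TimeUnit.SECONDS,
    "m" -> TimeUnit.MINUTES,
    "min" -> TimeUnit.MINUTES,
    "h" -> TimeUnit.HOURS,
    "d" -> TimeUnit.DAYS)

  private val timePattern = "(-?[0-9]+)([a-z]+)?".r

  // A bare number ("120") falls back to defaultUnit, which is what keeps
  // legacy unit-less config values working; anything else is rejected.
  def parseTimeString(str: String, targetUnit: TimeUnit, defaultUnit: TimeUnit): Long =
    str.trim.toLowerCase match {
      case timePattern(value, null) => targetUnit.convert(value.toLong, defaultUnit)
      case timePattern(value, suffix) if suffixes.contains(suffix) =>
        targetUnit.convert(value.toLong, suffixes(suffix))
      case _ => throw new NumberFormatException(s"Failed to parse time string: $str")
    }
}

Because a regex used as a Scala extractor must match the whole string, inputs like "600s This breaks" fall through to the NumberFormatException case, which is the behavior the tests below expect.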
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
@@ -80,16 +80,16 @@ private[spark] class ExecutorAllocationManager(
     Integer.MAX_VALUE)
 
   // How long there must be backlogged tasks for before an addition is triggered (seconds)
-  private val schedulerBacklogTimeoutS = Utils.timeStringAsSec(conf.get(
-    "spark.dynamicAllocation.schedulerBacklogTimeout", "5s"))
+  private val schedulerBacklogTimeoutS = conf.getTimeAsSec(
+    "spark.dynamicAllocation.schedulerBacklogTimeout", "5s")
 
   // Same as above, but used only after `schedulerBacklogTimeoutS` is exceeded
-  private val sustainedSchedulerBacklogTimeoutS = Utils.timeStringAsSec(conf.get(
-    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout", s"${schedulerBacklogTimeoutS}s"))
+  private val sustainedSchedulerBacklogTimeoutS = conf.getTimeAsSec(
+    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout", s"${schedulerBacklogTimeoutS}s")
 
   // How long an executor must be idle for before it is removed (seconds)
-  private val executorIdleTimeoutS = Utils.timeStringAsSec(conf.get(
-    "spark.dynamicAllocation.executorIdleTimeout", "600s"))
+  private val executorIdleTimeoutS = conf.getTimeAsSec(
+    "spark.dynamicAllocation.executorIdleTimeout", "600s")
 
   // During testing, the methods to actually kill and add executors are mocked out
   private val testing = conf.getBoolean("spark.dynamicAllocation.testing", false)
16 changes: 8 additions & 8 deletions core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
@@ -62,17 +62,17 @@ private[spark] class HeartbeatReceiver(sc: SparkContext)
 
   // "spark.network.timeout" uses "seconds", while `spark.storage.blockManagerSlaveTimeoutMs` uses
   // "milliseconds"
-  private val slaveTimeoutMs = Utils.timeStringAsMs(
-    sc.conf.get("spark.storage.blockManagerSlaveTimeoutMs", "120s"))
-  private val executorTimeoutMs = Utils.timeStringAsSec(sc.conf.get("spark.network.timeout",
-    s"${slaveTimeoutMs}ms")) * 1000
+  private val slaveTimeoutMs =
+    sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs", "120s")
+  private val executorTimeoutMs =
+    sc.conf.getTimeAsSec("spark.network.timeout", s"${slaveTimeoutMs}ms") * 1000
Member: Why not just getTimeAsMs here without the * 1000?

Author: Sean, it can't, because the unit is typically specified in seconds; if no unit is provided it will be interpreted as seconds by this code. Changing to the Ms method would break backwards compatibility.

Member: Right, that's why. Ignore all the similar comments below then.
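(A minimal sketch of the compatibility concern, with a hypothetical unit-less value:

conf.set("spark.network.timeout", "120")    // legacy value, no suffix
conf.getTimeAsSec("spark.network.timeout")  // 120, still read as 120 seconds
conf.getTimeAsMs("spark.network.timeout")   // would read the same value as 120 ms

hence getTimeAsSec(...) * 1000 rather than getTimeAsMs.)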


// "spark.network.timeoutInterval" uses "seconds", while
// "spark.storage.blockManagerTimeoutIntervalMs" uses "milliseconds"
private val timeoutIntervalMs = Utils.timeStringAsMs(
sc.conf.get("spark.storage.blockManagerTimeoutIntervalMs", "60s"))
private val checkTimeoutIntervalMs = Utils.timeStringAsSec(
sc.conf.get("spark.network.timeoutInterval", s"${timeoutIntervalMs}ms")) * 1000
private val timeoutIntervalMs =
sc.conf.getTimeAsMs("spark.storage.blockManagerTimeoutIntervalMs", "60s")
private val checkTimeoutIntervalMs =
sc.conf.getTimeAsSec("spark.network.timeoutInterval", s"${timeoutIntervalMs}ms") * 1000
Member: Same, can go straight to ms?

Author: Same, it can't, since the default unit is s.

Author: Sean, it can't, because then the default unit is assumed to be ms when it's really seconds.



private var timeoutCheckingTask: ScheduledFuture[_] = null

29 changes: 29 additions & 0 deletions core/src/main/scala/org/apache/spark/SparkConf.scala
@@ -174,6 +174,35 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
     getOption(key).getOrElse(defaultValue)
   }
 
+  /** Get a time parameter as seconds; throws a NoSuchElementException if it's not set. If no
+   * suffix is provided then seconds are assumed.
+   */
Contributor: Style: in Spark we use javadocs, not scaladocs. See other methods in this file for reference.

Member: What's the difference in this context? This works as javadoc too. The formatting could be more like:

/**
 * Docs...
 * @throws
 */

but should still render OK either way?

+  def getTimeAsSec(key: String): Long = {
+    Utils.timeStringAsSec(getOption(key).getOrElse(throw new NoSuchElementException(key)))
Member: I think you can just call get(key) and rely on it to perform exactly this logic? (Am I missing some separate discussion we already had on this?)
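(That is, roughly def getTimeAsSec(key: String): Long = Utils.timeStringAsSec(get(key)), since get(key) already throws NoSuchElementException for a missing key. A sketch of the suggestion, not code from this PR.)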

+  }
+
+  /** Get a time parameter as seconds, falling back to a default if not set. If no
+   * suffix is provided then seconds are assumed.
+   */
+  def getTimeAsSec(key: String, defaultValue: String): Long = {
+    Utils.timeStringAsSec(getOption(key).getOrElse(defaultValue))
Member: Same question here, can this be get(key, defaultValue) inside?

+  }
+
+  /** Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set. If no
+   * suffix is provided then milliseconds are assumed.
+   */
+  def getTimeAsMs(key: String): Long = {
+    Utils.timeStringAsMs(getOption(key).getOrElse(throw new NoSuchElementException(key)))
+  }
+
+  /** Get a time parameter as milliseconds, falling back to a default if not set. If no
+   * suffix is provided then milliseconds are assumed.
+   */
+  def getTimeAsMs(key: String, defaultValue: String): Long = {
+    Utils.timeStringAsMs(getOption(key).getOrElse(defaultValue))
+  }
+
 
   /** Get a parameter as an Option */
   def getOption(key: String): Option[String] = {
     Option(settings.get(key))
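Taken together, the new accessors make call sites one-liners. A usage sketch (the keys are real Spark settings touched by this PR; the values are hypothetical):

val conf = new SparkConf()

// With a suffix the unit is explicit:
conf.set("spark.executor.heartbeatInterval", "10s")
conf.getTimeAsMs("spark.executor.heartbeatInterval")  // 10000

// Without one, each method assumes its own unit, preserving legacy values:
conf.set("spark.locality.wait", "3000")
conf.getTimeAsMs("spark.locality.wait")               // 3000 (ms assumed)
conf.set("spark.network.timeout", "120")
conf.getTimeAsSec("spark.network.timeout")            // 120 (seconds assumed)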
core/src/main/scala/org/apache/spark/executor/Executor.scala
@@ -436,7 +436,7 @@
    * This thread stops running when the executor is stopped.
    */
   private def startDriverHeartbeater(): Unit = {
-    val intervalMs = Utils.timeStringAsMs(conf.get("spark.executor.heartbeatInterval", "10s"))
+    val intervalMs = conf.getTimeAsMs("spark.executor.heartbeatInterval", "10s")
     val thread = new Thread() {
       override def run() {
         // Sleep a random interval so the heartbeats don't end up in sync
core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
@@ -82,8 +82,8 @@ private[nio] class ConnectionManager(
     new HashedWheelTimer(Utils.namedThreadFactory("AckTimeoutMonitor"))
 
   private val ackTimeout =
-    Utils.timeStringAsSec(conf.get("spark.core.connection.ack.wait.timeout",
-      conf.get("spark.network.timeout", "120s")))
+    conf.getTimeAsSec("spark.core.connection.ack.wait.timeout",
+      conf.get("spark.network.timeout", "120s"))
 
   // Get the thread counts from the Spark Configuration.
   //
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
@@ -62,11 +62,10 @@ private[spark] class TaskSchedulerImpl(
   val conf = sc.conf
 
   // How often to check for speculative tasks
-  val SPECULATION_INTERVAL_MS =
-    Utils.timeStringAsMs(conf.get("spark.speculation.interval", "100ms"))
+  val SPECULATION_INTERVAL_MS = conf.getTimeAsMs("spark.speculation.interval", "100ms")
 
   // Threshold above which we warn user initial TaskSet may be starved
-  val STARVATION_TIMEOUT_MS = Utils.timeStringAsMs(conf.get("spark.starvation.timeout", "15s"))
+  val STARVATION_TIMEOUT_MS = conf.getTimeAsMs("spark.starvation.timeout", "15s")
 
   // CPUs to request per task
   val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
@@ -855,7 +855,7 @@ private[spark] class TaskSetManager(
       case TaskLocality.RACK_LOCAL => "spark.locality.wait.rack"
       case _ => ""
Contributor: This should return 0L instead, to match the old behavior (probably a corner case).

}
Contributor: Just an idea, maybe we can rewrite this as:

val defaultWait = conf.get("spark.locality.wait", "3s")
val localityWaitKey =
  key match {
    case TaskLocality.PROCESS_LOCAL => "spark.locality.wait.process"
    case TaskLocality.NODE_LOCAL => "spark.locality.wait.node"
    case TaskLocality.RACK_LOCAL => "spark.locality.wait.rack"
  }
Utils.timeStringAsMs(conf.get(localityWaitKey, defaultWait))

Looks nicer IMO, less duplicate code.

-    Utils.timeStringAsMs(conf.get(localityWaitKey, defaultWait))
+    conf.getTimeAsMs(localityWaitKey, defaultWait)
   }

/**
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -53,7 +53,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
   // Submit tasks after maxRegisteredWaitingTime milliseconds
   // if minRegisteredRatio has not yet been reached
   val maxRegisteredWaitingTimeMs =
-    Utils.timeStringAsMs(conf.get("spark.scheduler.maxRegisteredResourcesWaitingTime", "30s"))
+    conf.getTimeAsMs("spark.scheduler.maxRegisteredResourcesWaitingTime", "30s")
   val createTime = System.currentTimeMillis()
 
   private val executorDataMap = new HashMap[String, ExecutorData]
@@ -77,8 +77,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
 
   override def onStart() {
     // Periodically revive offers to allow delay scheduling to work
-    val reviveIntervalMs = Utils.timeStringAsMs(
-      conf.get("spark.scheduler.revive.interval", "1s"))
+    val reviveIntervalMs = conf.getTimeAsMs("spark.scheduler.revive.interval", "1s")
 
     reviveThread.scheduleAtFixedRate(new Runnable {
       override def run(): Unit = Utils.tryLogNonFatalError {
11 changes: 5 additions & 6 deletions core/src/main/scala/org/apache/spark/util/AkkaUtils.scala
@@ -20,7 +20,6 @@ package org.apache.spark.util
 import scala.collection.JavaConversions.mapAsJavaMap
 import scala.concurrent.Await
 import scala.concurrent.duration.{Duration, FiniteDuration}
-import scala.util.Try
 
 import akka.actor.{ActorRef, ActorSystem, ExtendedActorSystem}
 import akka.pattern.ask
@@ -66,8 +65,8 @@ private[spark] object AkkaUtils extends Logging {
 
     val akkaThreads = conf.getInt("spark.akka.threads", 4)
     val akkaBatchSize = conf.getInt("spark.akka.batchSize", 15)
-    val akkaTimeoutS = Utils.timeStringAsSec(conf.get("spark.akka.timeout",
-      conf.get("spark.network.timeout", "120s")))
+    val akkaTimeoutS = conf.getTimeAsSec("spark.akka.timeout",
+      conf.get("spark.network.timeout", "120s"))
     val akkaFrameSize = maxFrameSizeBytes(conf)
     val akkaLogLifecycleEvents = conf.getBoolean("spark.akka.logLifecycleEvents", false)
     val lifecycleEvents = if (akkaLogLifecycleEvents) "on" else "off"
@@ -79,10 +78,10 @@
 
     val logAkkaConfig = if (conf.getBoolean("spark.akka.logAkkaConfig", false)) "on" else "off"
 
-    val akkaHeartBeatPausesS = Utils.timeStringAsSec(conf.get("spark.akka.heartbeat.pauses",
-      "6000s"))
+    val akkaHeartBeatPausesS = conf.getTimeAsSec("spark.akka.heartbeat.pauses",
+      "6000s")
Contributor: Can you move this to the line above? I think it fits comfortably.

     val akkaHeartBeatIntervalS =
-      Utils.timeStringAsSec(conf.get("spark.akka.heartbeat.interval", "1000s"))
+      conf.getTimeAsSec("spark.akka.heartbeat.interval", "1000s")

val secretKey = securityManager.getSecretKey()
val isAuthOn = securityManager.isAuthenticationEnabled()
core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala
@@ -76,7 +76,7 @@ private[spark] object MetadataCleanerType extends Enumeration {
 // initialization of StreamingContext. It's okay for users trying to configure stuff themselves.
 private[spark] object MetadataCleaner {
   def getDelaySeconds(conf: SparkConf): Int = {
-    Utils.timeStringAsSec(conf.get("spark.cleaner.ttl", "-1")).toInt
+    conf.getTimeAsSec("spark.cleaner.ttl", "-1").toInt
   }
 
   def getDelaySeconds(
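Note the -1 here: spark.cleaner.ttl uses a negative value as a "disabled" sentinel, so the parser must accept negative bare numbers; per the "Fixed handling of -1" commit above, Utils.timeStringAsSec("-1") is expected to come back as -1 unchanged (a reading of the commit history, not a quote from the test suite).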
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -613,8 +613,8 @@ private[spark] object Utils extends Logging {
     }
     Utils.setupSecureURLConnection(uc, securityMgr)
 
-    val timeoutMs = Utils.timeStringAsSec(
-      conf.get("spark.files.fetchTimeout", "60s")).toInt * 1000
+    val timeoutMs =
+      conf.getTimeAsSec("spark.files.fetchTimeout", "60s").toInt * 1000
Member: Same, can go straight to MS?

Author: Can't, to maintain backwards compatibility for values with no units.

uc.setConnectTimeout(timeoutMs)
uc.setReadTimeout(timeoutMs)
uc.connect()
core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
@@ -151,7 +151,7 @@ class TaskSetManagerSuite extends FunSuite with LocalSparkContext with Logging {
 
   private val conf = new SparkConf
 
-  val LOCALITY_WAIT_MS = Utils.timeStringAsMs(conf.get("spark.locality.wait", "3s"))
+  val LOCALITY_WAIT_MS = conf.getTimeAsMs("spark.locality.wait", "3s")
   val MAX_TASK_FAILURES = 4
 
   override def beforeEach() {
25 changes: 25 additions & 0 deletions core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
@@ -17,6 +17,7 @@
 
 package org.apache.spark.util
 
+import java.lang.NumberFormatException
Member: Trivial: you don't need to import anything from java.lang.

import java.util.concurrent.TimeUnit

import scala.util.Random
@@ -62,6 +63,30 @@ class UtilsSuite extends FunSuite with ResetSystemProperties {
     assert(Utils.timeStringAsUs("1min") === TimeUnit.MINUTES.toMicros(1))
     assert(Utils.timeStringAsUs("1h") === TimeUnit.HOURS.toMicros(1))
     assert(Utils.timeStringAsUs("1d") === TimeUnit.DAYS.toMicros(1))
+
+    // Test invalid strings
+    try {
+      Utils.timeStringAsMs("This breaks 600s")
+      assert(false) // We should never reach this
+    } catch {
+      case e: NumberFormatException => assert(true)
+    }
Contributor: There's this really cool thing called intercept[NumberFormatException]!

Author: That is actually super cool. Thanks for pointing that out.


+    // Test invalid strings
+    try {
+      Utils.timeStringAsMs("600s This breaks")
+      assert(false) // We should never reach this
+    } catch {
+      case e: NumberFormatException => assert(true)
+    }
+
+    // Test invalid strings
+    try {
+      Utils.timeStringAsMs("This 123s breaks")
+      assert(false) // We should never reach this
+    } catch {
+      case e: NumberFormatException => assert(true)
+    }
   }
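Following the intercept suggestion above, the three try/catch blocks could collapse to something like this (a sketch using ScalaTest's intercept, not the code as merged):

intercept[NumberFormatException] { Utils.timeStringAsMs("This breaks 600s") }
intercept[NumberFormatException] { Utils.timeStringAsMs("600s This breaks") }
intercept[NumberFormatException] { Utils.timeStringAsMs("This 123s breaks") }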

test("bytesToString") {