Merged
Changes from 1 commit
Commits (17)
4062cda
Preparing Spark release v1.6.0-rc4
pwendell Dec 22, 2015
5b19e7c
Preparing development version 1.6.0-SNAPSHOT
pwendell Dec 22, 2015
309ef35
[MINOR] Fix typos in JavaStreamingContext
zsxwing Dec 22, 2015
0f905d7
[SPARK-11823][SQL] Fix flaky JDBC cancellation test in HiveThriftBina…
JoshRosen Dec 22, 2015
94fb5e8
[SPARK-12487][STREAMING][DOCUMENT] Add docs for Kafka message handler
zsxwing Dec 22, 2015
942c057
[SPARK-12429][STREAMING][DOC] Add Accumulator and Broadcast example f…
zsxwing Dec 23, 2015
c6c9bf9
[SPARK-12477][SQL] - Tungsten projection fails for null values in arr…
Dec 23, 2015
5987b16
[SPARK-12499][BUILD] don't force MAVEN_OPTS
abridgett Dec 24, 2015
b49856a
[SPARK-12411][CORE] Decrease executor heartbeat timeout to match heartbeat interval
nongli Dec 19, 2015
4dd8712
[SPARK-12502][BUILD][PYTHON] Script /dev/run-tests fails when IBM Jav…
kiszk Dec 24, 2015
865dd8b
[SPARK-12010][SQL] Spark JDBC requires support for column-name-free I…
CK50 Dec 24, 2015
b8da77e
[SPARK-12520] [PYSPARK] Correct Descriptions and Add Use Cases in Equ…
gatorsmile Dec 28, 2015
1fbcb6e
[SPARK-12517] add default RDD name for one created via sc.textFile
wyaron Dec 28, 2015
7c7d76f
[SPARK-12424][ML] The implementation of ParamMap#filter is wrong.
sarutak Dec 28, 2015
a9c52d4
[SPARK-12222][CORE] Deserialize RoaringBitmap using Kryo serializer t…
adrian-wang Dec 28, 2015
fd20248
[SPARK-12489][CORE][SQL][MLIB] Fix minor issues found by FindBugs
zsxwing Dec 28, 2015
d545dfe
Merge branch 'branch-1.6' of github.com:apache/spark into csd-1.6
markhamstra Dec 29, 2015
[SPARK-12411][CORE] Decrease executor heartbeat timeout to match heartbeat interval

Previously, the RPC timeout was the default network timeout, which is the same value
the driver uses to determine dead executors. This meant that with a network issue,
the executor would be declared dead after a single failed heartbeat attempt. There is a
separate config for the heartbeat interval, which is a better value to use for the
heartbeat RPC timeout. With this change, the executor will make multiple heartbeat
attempts even when there are RPC issues.

Author: Nong Li <[email protected]>

Closes apache#10365 from nongli/spark-12411.
nongli authored and Andrew Or committed Dec 24, 2015
commit b49856ae5983aca8ed7df2f478fc5f399ec34ce8
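
For context, a minimal sketch of why the new timeout allows multiple attempts. This is illustrative only: the two config keys and SparkConf.getTimeAsSeconds are real Spark APIs, but the arithmetic and variable names below are assumptions, not code from the patch.

import org.apache.spark.SparkConf

// Under the defaults, the driver declares an executor dead after
// spark.network.timeout (120s) with no heartbeat, while each heartbeat RPC
// now gives up after spark.executor.heartbeatInterval (10s).
val conf = new SparkConf()
val networkTimeoutSec = conf.getTimeAsSeconds("spark.network.timeout", "120s")
val heartbeatIntervalSec = conf.getTimeAsSeconds("spark.executor.heartbeatInterval", "10s")

// Roughly this many heartbeat attempts fit into the dead-executor window
// (120 / 10 = 12 with the defaults); before the patch, one failed attempt
// consumed the whole window.
val attemptsBeforeDeclaredDead = networkTimeoutSec / heartbeatIntervalSec
println(s"~$attemptsBeforeDeclaredDead heartbeat attempts before the driver gives up")
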
core/src/main/scala/org/apache/spark/executor/Executor.scala (4 changes: 3 additions & 1 deletion)

@@ -30,6 +30,7 @@ import scala.util.control.NonFatal
 import org.apache.spark._
 import org.apache.spark.deploy.SparkHadoopUtil
 import org.apache.spark.memory.TaskMemoryManager
+import org.apache.spark.rpc.RpcTimeout
 import org.apache.spark.scheduler.{DirectTaskResult, IndirectTaskResult, Task}
 import org.apache.spark.shuffle.FetchFailedException
 import org.apache.spark.storage.{StorageLevel, TaskResultBlockId}
@@ -445,7 +446,8 @@ private[spark] class Executor(

     val message = Heartbeat(executorId, tasksMetrics.toArray, env.blockManager.blockManagerId)
     try {
-      val response = heartbeatReceiverRef.askWithRetry[HeartbeatResponse](message)
+      val response = heartbeatReceiverRef.askWithRetry[HeartbeatResponse](
+        message, RpcTimeout(conf, "spark.executor.heartbeatInterval", "10s"))
       if (response.reregisterBlockManager) {
         logInfo("Told to re-register on heartbeat")
         env.blockManager.reregister()
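
If the defaults do not fit a deployment, the two settings can be tuned together. A hedged usage sketch (both config keys are real Spark settings; the values shown simply restate the defaults the patch assumes):

import org.apache.spark.SparkConf

val tuned = new SparkConf()
  // How long the driver waits without a heartbeat before marking an executor dead.
  .set("spark.network.timeout", "120s")
  // How often heartbeats are sent, and (after this patch) how long each
  // heartbeat RPC waits before timing out and retrying.
  .set("spark.executor.heartbeatInterval", "10s")
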