Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,9 @@ package object config {
private[spark] val EVENT_LOG_OVERWRITE =
ConfigBuilder("spark.eventLog.overwrite").booleanConf.createWithDefault(false)

private[spark] val EVENT_LOG_CALLSITE_FORM =
ConfigBuilder("spark.eventLog.callsite").stringConf.createWithDefault("short")
Copy link
Member

@gatorsmile gatorsmile Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short is defined? Where is the test case? Why this is not documented?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether we should introduce a conf here. cc @rxin

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah this should have been documented, that is a good point. I should have looked more carefully at the configs. The other eventLog configs aren't internal either.

I agree this could also be a boolean, or simply handle anything but "short" or "long" as "short"


private[spark] val EXECUTOR_CLASS_PATH =
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_CLASSPATH).stringConf.createOptional

Expand Down
10 changes: 9 additions & 1 deletion core/src/main/scala/org/apache/spark/storage/RDDInfo.scala
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,9 @@

package org.apache.spark.storage

import org.apache.spark.SparkEnv
import org.apache.spark.annotation.DeveloperApi
import org.apache.spark.internal.config._
import org.apache.spark.rdd.{RDD, RDDOperationScope}
import org.apache.spark.util.Utils

Expand Down Expand Up @@ -53,10 +55,16 @@ class RDDInfo(
}

private[spark] object RDDInfo {
private val callsiteForm = SparkEnv.get.conf.get(EVENT_LOG_CALLSITE_FORM)

def fromRdd(rdd: RDD[_]): RDDInfo = {
val rddName = Option(rdd.name).getOrElse(Utils.getFormattedClassName(rdd))
val parentIds = rdd.dependencies.map(_.rdd.id)
val callSite = callsiteForm match {
case "short" => rdd.creationSite.shortForm
case "long" => rdd.creationSite.longForm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the users input is neither short nor long, we will get an exception, right?

cc @jiangxb1987 @cloud-fan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, usually we will define an enum and verify the given config value is valid or not.

For this particular case, I think we can just define a boolean flag: spark.eventLog.callsite.longForm.enabled.

}
new RDDInfo(rdd.id, rddName, rdd.partitions.length,
rdd.getStorageLevel, parentIds, rdd.creationSite.shortForm, rdd.scope)
rdd.getStorageLevel, parentIds, callSite, rdd.scope)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds a general issue. Why we only apply it in RDDInfo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gatorsmile As far as I am aware, RDDInfo is the only place the call site is included in the event log.

}
}