-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-4145] Web UI job pages #3009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
2568a6c
4487dcb
bfce2b9
4b206fb
45343b8
a475ea1
56701fa
1cf4987
85e9c85
4d58e55
4846ce4
b7bf30e
8a2351b
3d0a007
1145c60
d62ea7b
79793cd
5884f91
8ab6c28
8955f4c
e2f2c43
171b53c
f2a15da
5eb39dc
d69c775
7d10b97
67080ba
eebdc2c
034aa8d
0b77e3e
1f45d44
61c265a
2bbf41a
6f17f3f
ff804cd
b89c258
f00c851
eb05e90
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,144 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one or more | ||
| * contributor license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright ownership. | ||
| * The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.spark.ui.jobs | ||
|
|
||
| import scala.xml.{Node, NodeSeq} | ||
|
|
||
| import javax.servlet.http.HttpServletRequest | ||
|
|
||
| import org.apache.spark.ui.{WebUIPage, UIUtils} | ||
| import org.apache.spark.ui.jobs.UIData.JobUIData | ||
|
|
||
|
|
||
| /** Page showing list of all ongoing and recently finished jobs */ | ||
| private[ui] class AllJobsPage(parent: JobsTab) extends WebUIPage("") { | ||
| private val sc = parent.sc | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will parent.sc be set when this is created? If so, I think it would be better to have an Option[Long] that you create here describing the start time, and then use to optionally show the elapsed time later (just because it makes it more clear how this is used).
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good idea, and good dependency injection in general. |
||
| private val listener = parent.listener | ||
|
|
||
| private def getSubmissionTime(job: JobUIData): Option[Long] = { | ||
| for ( | ||
| firstStageId <- job.stageIds.headOption; | ||
| firstStageInfo <- listener.stageIdToInfo.get(firstStageId); | ||
| submitTime <- firstStageInfo.submissionTime | ||
| ) yield submitTime | ||
| } | ||
|
|
||
| private def jobsTable(jobs: Seq[JobUIData]): Seq[Node] = { | ||
| val columns: Seq[Node] = { | ||
| <th>Job Id (Job Group)</th> | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This notion of "Job Group" always confused me -- what's the difference between this and Job Id? Do we need to mention both names in the UI? Do you use "job group" here because that's what we call the id you can pass in to kill the job?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure how Job Group is being used in all cases now, or whether it even works particularly well at all, but the concept of a Job Group could be useful when the "job" from the user's point of view is actually composed of multiple Spark jobs. That can be the case when you want to do something like sorting an RDD without falling into the nastiness of embedded, eager RDD actions to generate a RangePartitioner. Instead, you'd queue up multiple jobs in a Job Group with later jobs depending on the results of earlier jobs in the group. If the user decides that the "job" should be killed, then all of the jobs in the Job Group should be canceled.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OH I see -- I didn't realize that when there's a job group, it will be shown in parentheses ("1 (4)", for example). I thought the "(job group)" was indicating that the job Id was the same as the job group Id. This all makes sense now -- thanks!
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Job groups are perhaps an obscure feature, but they can be useful in environments where multiple users interact with a shared SparkContext (such as a job server). Do you think there's a clearer way to label this that's less confusing? Job groups might be an uncommon feature that most users aren't aware of / don't need.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One idea I had was to just label the column as Job ID if none of the jobs On Tue, Nov 11, 2014 at 2:15 PM, Josh Rosen [email protected]
|
||
| <th>Description</th> | ||
| <th>Submitted</th> | ||
| <th>Duration</th> | ||
| <th>Tasks: Succeeded/Total</th> | ||
| } | ||
|
|
||
| def makeRow(job: JobUIData): Seq[Node] = { | ||
| val lastStageInfo = job.stageIds.lastOption.flatMap(listener.stageIdToInfo.get) | ||
| val lastStageData = lastStageInfo.flatMap { s => | ||
| listener.stageIdToData.get((s.stageId, s.attemptId)) | ||
| } | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you move the code to compute the last stage's name up here? It was hard for me to figure out why you were grabbing the last stage here. |
||
| val duration: Option[Long] = { | ||
| job.startTime.map { start => | ||
| val end = job.endTime.getOrElse(System.currentTimeMillis()) | ||
| end - start | ||
| } | ||
| } | ||
| val formattedDuration = duration.map(d => UIUtils.formatDuration(d)).getOrElse("Unknown") | ||
| val formattedSubmissionTime = job.startTime.map(UIUtils.formatDate).getOrElse("Unknown") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I realize we use in
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In other places, we have "Unknown stage name", etc; I'm not sure that this is a huge win (it would be beneficial if we decided to localize, though, but we're not doing that here). |
||
| val detailUrl = | ||
| "%s/jobs/job?id=%s".format(UIUtils.prependBaseUri(parent.basePath), job.jobId) | ||
|
|
||
| <tr> | ||
| <td sorttable_customkey={job.jobId.toString}> | ||
| {job.jobId} {job.jobGroup.map(id => s"($id)").getOrElse("")} | ||
| </td> | ||
| <td> | ||
| <div><em>{lastStageData.flatMap(_.description).getOrElse("")}</em></div> | ||
| <a href={detailUrl}>{lastStageInfo.map(_.name).getOrElse("(Unknown Stage Name)")}</a> | ||
| </td> | ||
| <td sorttable_customkey={job.startTime.getOrElse(-1).toString}> | ||
| {formattedSubmissionTime} | ||
| </td> | ||
| <td sorttable_customkey={duration.getOrElse(-1).toString}>{formattedDuration}</td> | ||
| <td class="progress-cell"> | ||
| {UIUtils.makeProgressBar(job.numActiveTasks, job.numCompletedTasks, | ||
| job.numFailedTasks, job.numTasks)} | ||
| </td> | ||
| </tr> | ||
| } | ||
|
|
||
| <table class="table table-bordered table-striped table-condensed sortable"> | ||
| <thead>{columns}</thead> | ||
| <tbody> | ||
| {jobs.map(makeRow)} | ||
| </tbody> | ||
| </table> | ||
| } | ||
|
|
||
| def render(request: HttpServletRequest): Seq[Node] = { | ||
| listener.synchronized { | ||
| val activeJobs = listener.activeJobs.values.toSeq | ||
| val completedJobs = listener.completedJobs.reverse.toSeq | ||
| val failedJobs = listener.failedJobs.reverse.toSeq | ||
| val now = System.currentTimeMillis | ||
|
|
||
| val activeJobsTable = | ||
| jobsTable(activeJobs.sortBy(getSubmissionTime(_).getOrElse(-1L)).reverse) | ||
| val completedJobsTable = | ||
| jobsTable(completedJobs.sortBy(getSubmissionTime(_).getOrElse(-1L)).reverse) | ||
| val failedJobsTable = | ||
| jobsTable(failedJobs.sortBy(getSubmissionTime(_).getOrElse(-1L)).reverse) | ||
|
|
||
| val summary: NodeSeq = | ||
| <div> | ||
| <ul class="unstyled"> | ||
| {if (sc.isDefined) { | ||
| // Total duration is not meaningful unless the UI is live | ||
| <li> | ||
| <strong>Total Duration: </strong> | ||
| {UIUtils.formatDuration(now - sc.get.startTime)} | ||
| </li> | ||
| }} | ||
| <li> | ||
| <strong>Scheduling Mode: </strong> | ||
| {listener.schedulingMode.map(_.toString).getOrElse("Unknown")} | ||
| </li> | ||
| <li> | ||
| <a href="#active"><strong>Active Jobs:</strong></a> | ||
| {activeJobs.size} | ||
| </li> | ||
| <li> | ||
| <a href="#completed"><strong>Completed Jobs:</strong></a> | ||
| {completedJobs.size} | ||
| </li> | ||
| <li> | ||
| <a href="#failed"><strong>Failed Jobs:</strong></a> | ||
| {failedJobs.size} | ||
| </li> | ||
| </ul> | ||
| </div> | ||
|
|
||
| val content = summary ++ | ||
| <h4 id="active">Active Jobs ({activeJobs.size})</h4> ++ activeJobsTable ++ | ||
| <h4 id="completed">Completed Jobs ({completedJobs.size})</h4> ++ completedJobsTable ++ | ||
| <h4 id ="failed">Failed Jobs ({failedJobs.size})</h4> ++ failedJobsTable | ||
|
|
||
| UIUtils.headerSparkPage("Spark Jobs", content, parent) | ||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,7 +25,7 @@ import org.apache.spark.scheduler.Schedulable | |
| import org.apache.spark.ui.{WebUIPage, UIUtils} | ||
|
|
||
| /** Page showing list of all ongoing and recently finished stages and pools */ | ||
| private[ui] class JobProgressPage(parent: JobProgressTab) extends WebUIPage("") { | ||
| private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage("") { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This naming change is great |
||
| private val sc = parent.sc | ||
| private val listener = parent.listener | ||
| private def isFairScheduler = parent.isFairScheduler | ||
|
|
@@ -41,11 +41,13 @@ private[ui] class JobProgressPage(parent: JobProgressTab) extends WebUIPage("") | |
|
|
||
| val activeStagesTable = | ||
| new StageTableBase(activeStages.sortBy(_.submissionTime).reverse, | ||
| parent, parent.killEnabled) | ||
| parent.basePath, parent.listener, parent.killEnabled) | ||
| val completedStagesTable = | ||
| new StageTableBase(completedStages.sortBy(_.submissionTime).reverse, parent) | ||
| new StageTableBase(completedStages.sortBy(_.submissionTime).reverse, parent.basePath, | ||
| parent.listener, parent.killEnabled) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't killEnabled always be false here (so don't pass it in, in which case it defaults to false)? Since killing a completed / failed stage has no meaning.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a good catch. It turns out that this was a pretty serious mistake, since the
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh awesome!! On Tue, Nov 11, 2014 at 3:16 PM, Josh Rosen [email protected]
|
||
| val failedStagesTable = | ||
| new FailedStageTable(failedStages.sortBy(_.submissionTime).reverse, parent) | ||
| new FailedStageTable(failedStages.sortBy(_.submissionTime).reverse, parent.basePath, | ||
| parent.listener, parent.killEnabled) | ||
|
|
||
| // For now, pool information is only accessible in live UIs | ||
| val pools = sc.map(_.getAllPools).getOrElse(Seq.empty[Schedulable]) | ||
|
|
@@ -93,7 +95,7 @@ private[ui] class JobProgressPage(parent: JobProgressTab) extends WebUIPage("") | |
| <h4 id ="failed">Failed Stages ({numFailedStages})</h4> ++ | ||
| failedStagesTable.toNodeSeq | ||
|
|
||
| UIUtils.headerSparkPage("Spark Stages", content, parent) | ||
| UIUtils.headerSparkPage("Spark Stages (for all jobs)", content, parent) | ||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one or more | ||
| * contributor license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright ownership. | ||
| * The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.spark.ui.jobs | ||
|
|
||
| import scala.xml.{NodeSeq, Node} | ||
|
|
||
| import javax.servlet.http.HttpServletRequest | ||
|
|
||
| import org.apache.spark.scheduler.StageInfo | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: fix import ordering |
||
| import org.apache.spark.ui.{UIUtils, WebUIPage} | ||
|
|
||
| /** Page showing statistics and stage list for a given job */ | ||
| private[ui] class JobPage(parent: JobsTab) extends WebUIPage("job") { | ||
| private val listener = parent.listener | ||
| private val sc = parent.sc | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this ever used?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes; it's used to compute the "Total duration" field if the UI is live.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks like that's the case for AllJobsPage.scala, but I couldn't find any uses of this, even via IntelliJ. |
||
|
|
||
| def render(request: HttpServletRequest): Seq[Node] = { | ||
| listener.synchronized { | ||
| val jobId = request.getParameter("id").toInt | ||
| val jobDataOption = listener.jobIdToData.get(jobId) | ||
| if (jobDataOption.isEmpty) { | ||
| val content = | ||
| <div> | ||
| <p>No information to display for job {jobId}</p> | ||
| </div> | ||
| return UIUtils.headerSparkPage( | ||
| s"Details for Job $jobId", content, parent) | ||
| } | ||
| val jobData = jobDataOption.get | ||
| val stages = jobData.stageIds.map { stageId => | ||
| // This could be empty if the JobProgressListener hasn't received information about the | ||
| // stage or if the stage information has been garbage collected | ||
| listener.stageIdToInfo.getOrElse(stageId, | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When will this be empty? Is this if a job has started but some of the stages have not yet started? A comment here would be helpful.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It could be empty if we haven't received information about the stage or if the stage information has been garbage collected. Right now, stages and jobs are GC'd separately in the web UI's JobProgressListener. We could revisit this, though. |
||
| new StageInfo(stageId, 0, "Unknown", 0, Seq.empty, "Unknown")) | ||
| } | ||
|
|
||
| val (activeStages, completedOrFailedStages) = stages.partition(_.completionTime.isDefined) | ||
| val (failedStages, completedStages) = | ||
| completedOrFailedStages.partition(_.failureReason.isDefined) | ||
|
|
||
| val activeStagesTable = | ||
| new StageTableBase(activeStages.sortBy(_.submissionTime).reverse, | ||
| parent.basePath, parent.listener, parent.killEnabled) | ||
| val completedStagesTable = | ||
| new StageTableBase(completedStages.sortBy(_.submissionTime).reverse, parent.basePath, | ||
| parent.listener, parent.killEnabled) | ||
| val failedStagesTable = | ||
| new FailedStageTable(failedStages.sortBy(_.submissionTime).reverse, parent.basePath, | ||
| parent.listener, parent.killEnabled) | ||
|
|
||
| val summary: NodeSeq = | ||
| <div> | ||
| <ul class="unstyled"> | ||
| { | ||
| if (jobData.jobGroup.isDefined) { | ||
| <li> | ||
| <strong>Job Group:</strong> | ||
| {jobData.jobGroup.get} | ||
| </li> | ||
| } else Seq.empty | ||
| } | ||
| <li> | ||
| <a href="#active"><strong>Active Stages:</strong></a> | ||
| {activeStages.size} | ||
| </li> | ||
| <li> | ||
| <a href="#completed"><strong>Completed Stages:</strong></a> | ||
| {completedStages.size} | ||
| </li> | ||
| <li> | ||
| <a href="#failed"><strong>Failed Stages:</strong></a> | ||
| {failedStages.size} | ||
| </li> | ||
| </ul> | ||
| </div> | ||
|
|
||
| val content = summary ++ | ||
| <h4 id="active">Active Stages ({activeStages.size})</h4> ++ | ||
| activeStagesTable.toNodeSeq ++ | ||
| <h4 id="completed">Completed Stages ({completedStages.size})</h4> ++ | ||
| completedStagesTable.toNodeSeq ++ | ||
| <h4 id ="failed">Failed Stages ({failedStages.size})</h4> ++ | ||
| failedStagesTable.toNodeSeq | ||
| UIUtils.headerSparkPage(s"Details for Job $jobId", content, parent) | ||
| } | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to move this to before the initialize() call -- as is, kill is always disabled, because killEnabled is false when initialize is called, so StagesTab will be initialized with killEnabled set to false (I noticed this when I was playing around with this, because the kill button was nowhere to be found).