Skip to content

Conversation

@rxin
Copy link
Contributor

@rxin rxin commented Nov 21, 2018

What changes were proposed in this pull request?

This patch defines an internal Spark interface for reporting shuffle metrics and uses that in shuffle reader. Before this patch, shuffle metrics is tied to a specific implementation (using a thread local temporary data structure and accumulators). After this patch, callers that define their own shuffle RDDs can create a custom metrics implementation.

With this patch, we would be able to create a better metrics for the SQL layer, e.g. reporting shuffle metrics in the SQL UI, for each exchange operator.

Note that I'm separating read side and write side implementations, as they are very different, to simplify code review. Write side change is at #23106

How was this patch tested?

No behavior change expected, as it is a straightforward refactoring. Updated all existing test cases.

@rxin rxin changed the title [SPARK-26140] Pull TempShuffleReadMetrics creation out of shuffle reader [SPARK-26140] Enable passing in a custom shuffle metrics reporter into shuffle reader Nov 21, 2018
* shuffle dependency, and all temporary metrics will be merged into the [[ShuffleReadMetrics]] at
* last.
*/
private[spark] class TempShuffleReadMetrics {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was moved to TempShuffleReadMetrics

@rxin rxin changed the title [SPARK-26140] Enable passing in a custom shuffle metrics reporter into shuffle reader [SPARK-26140] Enable custom metrics implementation in shuffle reader Nov 21, 2018
@rxin
Copy link
Contributor Author

rxin commented Nov 21, 2018

cc @jiangxb1987 @squito @zsxwing

@SparkQA
Copy link

SparkQA commented Nov 21, 2018

Test build #99128 has finished for PR 23105 at commit da253b5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 21, 2018

Test build #99131 has finished for PR 23105 at commit 1a5ce24.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 21, 2018

Test build #99129 has finished for PR 23105 at commit 29ef763.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

package org.apache.spark.shuffle

/**
* An interface for reporting shuffle information, for each shuffle. This interface assumes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for each shuffle -> for each reducer of a shuffle?

* All methods have additional Spark visibility modifier to allow public, concrete implementations
* that still have these methods marked as private[spark].
*/
private[spark] trait ShuffleReadMetricsReporter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we plan to use this interface later on? It's not used in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xuanyuanking just submitted a PR on how to use it :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#23128 :)

endPartition: Int,
context: TaskContext): ShuffleReader[K, C]
context: TaskContext,
metrics: ShuffleMetricsReporter): ShuffleReader[K, C]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, we should pass a read metrics reporter here, as this method is getReader which is called by the reducers to read shuffle files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a read metrics here actually. In the write PR this is renamed ShuffleReadMetricsReporter.

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks! Merged to master.

@asfgit asfgit closed this in de84899 Nov 23, 2018
asfgit pushed a commit that referenced this pull request Nov 27, 2018
## What changes were proposed in this pull request?
In #23105, due to working on two parallel PRs at once, I made the mistake of committing the copy of the PR that used the name ShuffleMetricsReporter for the interface, rather than the appropriate one ShuffleReadMetricsReporter. This patch fixes that.

## How was this patch tested?
This should be fine as long as compilation passes.

Closes #23147 from rxin/ShuffleReadMetricsReporter.

Authored-by: Reynold Xin <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?
This patch defines an internal Spark interface for reporting shuffle metrics and uses that in shuffle reader. Before this patch, shuffle metrics is tied to a specific implementation (using a thread local temporary data structure and accumulators). After this patch, callers that define their own shuffle RDDs can create a custom metrics implementation.

With this patch, we would be able to create a better metrics for the SQL layer, e.g. reporting shuffle metrics in the SQL UI, for each exchange operator.

Note that I'm separating read side and write side implementations, as they are very different, to simplify code review. Write side change is at apache#23106

## How was this patch tested?
No behavior change expected, as it is a straightforward refactoring. Updated all existing test cases.

Closes apache#23105 from rxin/SPARK-26140.

Authored-by: Reynold Xin <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?
In apache#23105, due to working on two parallel PRs at once, I made the mistake of committing the copy of the PR that used the name ShuffleMetricsReporter for the interface, rather than the appropriate one ShuffleReadMetricsReporter. This patch fixes that.

## How was this patch tested?
This should be fine as long as compilation passes.

Closes apache#23147 from rxin/ShuffleReadMetricsReporter.

Authored-by: Reynold Xin <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
## What changes were proposed in this pull request?
This is the write side counterpart to apache#23105

## How was this patch tested?
No behavior change expected, as it is a straightforward refactoring. Updated all existing test cases.

Closes apache#23106 from rxin/SPARK-26141.

Authored-by: Reynold Xin <[email protected]>
Signed-off-by: Reynold Xin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants