
Conversation

@xuanyuanking (Member)

What changes were proposed in this pull request?

Implement SQLShuffleMetricsReporter on the SQL side as a customized ShuffleMetricsReporter that extends TempShuffleReadMetrics and updates SQLMetrics; this way, shuffle metrics can be reported in the SQL UI.
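
For readers skimming the thread, here is a minimal sketch of the wrapping idea described above. It is not the merged code: the package placement, the metric keys, and the subset of `inc*` methods shown are assumptions, and since TempShuffleReadMetrics is `private[spark]`, real code has to live inside Spark's own packages.

```scala
import org.apache.spark.executor.TempShuffleReadMetrics
import org.apache.spark.sql.execution.metric.SQLMetric

// Sketch: extend TempShuffleReadMetrics, forward every update to the wrapped
// tempMetrics (so the existing task-level shuffle metrics keep working) and
// also record it into the SQLMetric map that backs the SQL UI.
class SQLShuffleMetricsReporter(
    tempMetrics: TempShuffleReadMetrics,
    metrics: Map[String, SQLMetric]) extends TempShuffleReadMetrics {

  override def incRecordsRead(v: Long): Unit = {
    metrics("recordsRead").add(v) // per-update map lookup; revised later in this thread
    tempMetrics.incRecordsRead(v)
  }

  override def incRemoteBytesRead(v: Long): Unit = {
    metrics("remoteBytesRead").add(v)
    tempMetrics.incRemoteBytesRead(v)
  }

  // ...the remaining inc* methods follow the same pattern.
}
```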

How was this patch tested?

Add UT in SQLMetricsSuite.
Manual test locally, before:
![image](https://user-images.githubusercontent.com/4833765/48960517-30f97880-efa8-11e8-982c-92d05938fd1d.png)
after:
![image](https://user-images.githubusercontent.com/4833765/48960587-b54bfb80-efa8-11e8-8e95-7a3c8c74cc5c.png)

@xuanyuanking xuanyuanking changed the title [SPARK-26139][SQL] Support passing shuffle metrics to exchange operator [SPARK-26142][SQL] Support passing shuffle metrics to exchange operator Nov 23, 2018
@gatorsmile (Member)

@xuanyuanking Could you address the conflicts? Thanks for your fast work!

(Contributor)

Doing a hashmap lookup here could introduce serious performance regressions.

(Contributor)

(I’m not referring to just this function, but in general, especially for per-row).

(Member Author)

Sorry for not thinking through the per-row operation here; I should have been more careful. Fix done in cb46bfe.
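
A sketch of the kind of fix being discussed: resolve each SQLMetric once at construction time, so the per-record hot path is a plain field access instead of a hash map lookup. Field and key names are illustrative, not necessarily those in cb46bfe.

```scala
import org.apache.spark.executor.TempShuffleReadMetrics
import org.apache.spark.sql.execution.metric.SQLMetric

class SQLShuffleMetricsReporter(
    tempMetrics: TempShuffleReadMetrics,
    metrics: Map[String, SQLMetric]) extends TempShuffleReadMetrics {

  // Look up each SQLMetric exactly once, outside the per-record path.
  private[this] val _recordsRead = metrics("recordsRead")
  private[this] val _remoteBytesRead = metrics("remoteBytesRead")

  override def incRecordsRead(v: Long): Unit = {
    _recordsRead.add(v) // plain field access per record
    tempMetrics.incRecordsRead(v)
  }

  override def incRemoteBytesRead(v: Long): Unit = {
    _remoteBytesRead.add(v)
    tempMetrics.incRemoteBytesRead(v)
  }
}
```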

@SparkQA commented Nov 24, 2018

Test build #99220 has finished for PR 23128 at commit 1b556ec.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SQLShuffleMetricsReporter(

@xuanyuanking (Member Author)

@gatorsmile Thanks Xiao! Conflicts resolved. As Reynold commented in #23105 (comment), when ShuffleMetricsReporter is moved to ShuffleReadMetricsReporter in the write-side PR, it will conflict again here; I'll keep tracking the relevant PR.

@SparkQA commented Nov 24, 2018

Test build #99223 has finished for PR 23128 at commit cb46bfe.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xuanyuanking (Member Author)

retest this please.

@SparkQA commented Nov 24, 2018

Test build #99225 has finished for PR 23128 at commit cb46bfe.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 24, 2018

Test build #4440 has finished for PR 23128 at commit cb46bfe.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 25, 2018

Test build #99236 has finished for PR 23128 at commit 8689acb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

retest this please

@SparkQA commented Nov 27, 2018

Test build #99308 has finished for PR 23128 at commit 8689acb.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

retest this please

@SparkQA commented Nov 27, 2018

Test build #99311 has finished for PR 23128 at commit 8689acb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

(Contributor)

I don't think we need to consider this case since ShuffledRowRDD is a private API. If we do need to consider it, we also need to take care of the case where users pass in invalid metrics.

(Contributor)

do you mean we may leave the metrics empty when creating ShuffledRowRDD in tests?

(Member Author)

> do you mean we may leave the metrics empty when creating ShuffledRowRDD in tests?

Yes, like we did in UnsafeRowSerializerSuite.

> I don't think we need to consider this case since ShuffledRowRDD is a private API

Got it. After searching for `new ShuffledRowRDD` across the source code, UnsafeRowSerializerSuite is the only such place, so I'll change the test and delete the default value of metrics in this commit.
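
A rough, hypothetical illustration of the caller-side effect of dropping the default metrics value; the ShuffledRowRDD constructor shape and the helper below are assumptions, not the code actually changed in UnsafeRowSerializerSuite.

```scala
import org.apache.spark.ShuffleDependency
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.ShuffledRowRDD
import org.apache.spark.sql.execution.metric.SQLMetrics

// With no default on the metrics parameter, a test must build and pass the
// read-metrics map explicitly instead of silently getting an empty one.
def buildShuffledRowRDD(
    dep: ShuffleDependency[Int, InternalRow, InternalRow]): ShuffledRowRDD = {
  val readMetrics = SQLMetrics.getShuffleReadMetrics(dep.rdd.sparkContext)
  new ShuffledRowRDD(dep, readMetrics)
}
```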

(Member Author)

Done in 0348ae5.

@cloud-fan (Contributor)

LGTM except one comment

@SparkQA commented Nov 27, 2018

Test build #99334 has finished for PR 23128 at commit 0348ae5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xuanyuanking xuanyuanking changed the title [SPARK-26142][SQL] Support passing shuffle metrics to exchange operator [SPARK-26142][SQL] Implement shuffle read metrics in SQL Nov 28, 2018
@SparkQA commented Nov 28, 2018

Test build #99347 has finished for PR 23128 at commit d12ea31.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xuanyuanking (Member Author) commented Nov 28, 2018

The Python UT failed because of a JVM crash.

@xuanyuanking (Member Author)

retest this please

@SparkQA commented Nov 28, 2018

Test build #99355 has finished for PR 23128 at commit d12ea31.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

retest this please

val metrics = context.taskMetrics().createTempShuffleReadMetrics()
val tempMetrics = context.taskMetrics().createTempShuffleReadMetrics()
// Wrap the tempMetrics with SQLShuffleMetricsReporter here to support
// shuffle read metrics in SQL.
(Contributor)

// `SQLShuffleMetricsReporter` will update its own metrics for SQL exchange operator,
// as well as the `tempMetrics` for basic shuffle metrics.

(Member Author)

Thanks Wenchen, done in 8e84c5b.
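
Putting the snippet above and the suggested comment together, the wiring inside ShuffledRowRDD.compute plausibly ends up looking like the helper below. This is a sketch under the assumption that the reporter class is the one outlined earlier in this thread, not the merged code verbatim; `createTempShuffleReadMetrics` is `private[spark]`, so real code again sits inside Spark's own packages.

```scala
import org.apache.spark.TaskContext
import org.apache.spark.sql.execution.metric.SQLMetric

def wrapReadMetrics(
    context: TaskContext,
    metrics: Map[String, SQLMetric]): SQLShuffleMetricsReporter = {
  val tempMetrics = context.taskMetrics().createTempShuffleReadMetrics()
  // `SQLShuffleMetricsReporter` will update its own metrics for SQL exchange operator,
  // as well as the `tempMetrics` for basic shuffle metrics.
  // The reporter is then handed to the shuffle reader in place of tempMetrics.
  new SQLShuffleMetricsReporter(tempMetrics, metrics)
}
```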

@SparkQA commented Nov 28, 2018

Test build #99359 has finished for PR 23128 at commit d12ea31.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 28, 2018

Test build #99363 has finished for PR 23128 at commit 8e84c5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

Thanks, merging to master!

@asfgit asfgit closed this in 93112e6 Nov 28, 2018
@xuanyuanking (Member Author)

Thanks @cloud-fan @gatorsmile @rxin !

@xuanyuanking xuanyuanking deleted the SPARK-26142 branch November 28, 2018 12:33
/**
* Create all shuffle read relative metrics and return the Map.
*/
def getShuffleReadMetrics(sc: SparkContext): Map[String, SQLMetric] = Map(
(Contributor)

I'd prefer to name this create, rather than get, to imply we are creating a new set rather than just returning some existing sets.

@xuanyuanking (Member Author) Nov 29, 2018

Thanks, renamed it to createShuffleReadMetrics and moved it to SQLShuffleMetricsReporter. Done in #23175.

* contains all shuffle metrics defined in [[SQLMetrics.getShuffleReadMetrics]].
*/
private[spark] class SQLShuffleMetricsReporter(
tempMetrics: TempShuffleReadMetrics,
(Contributor)

4 space indent

(Member Author)

Thanks, done in #23175.


private val baseForAvgMetric: Int = 10

val REMOTE_BLOCKS_FETCHED = "remoteBlocksFetched"
(Contributor)

Rather than putting this list and the getShuffleReadMetrics function here, we should move them into SQLShuffleMetricsReporter. Otherwise, when someone adds another metric in the future, they are likely to forget to update SQLShuffleMetricsReporter.
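
A sketch of the structure being requested: the metric-name constants and the factory live in the reporter's companion object, so adding a new metric means touching a single place. The key list and display names below are illustrative, not the full set.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

object SQLShuffleMetricsReporter {
  val REMOTE_BLOCKS_FETCHED = "remoteBlocksFetched"
  val LOCAL_BLOCKS_FETCHED = "localBlocksFetched"
  val RECORDS_READ = "recordsRead"

  // Factory named `create...` to signal that each call builds a fresh set of
  // accumulators rather than handing back shared ones.
  def createShuffleReadMetrics(sc: SparkContext): Map[String, SQLMetric] = Map(
    REMOTE_BLOCKS_FETCHED -> SQLMetrics.createMetric(sc, "remote blocks fetched"),
    LOCAL_BLOCKS_FETCHED -> SQLMetrics.createMetric(sc, "local blocks fetched"),
    RECORDS_READ -> SQLMetrics.createMetric(sc, "records read"))
}
```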

@rxin (Contributor) commented Nov 28, 2018

@xuanyuanking @cloud-fan when you think about where to put each code block, make sure you also think about future evolution of the codebase. In general put relevant things closer to each other (e.g. in one class, one file, or one method).

@xuanyuanking (Member Author)

@rxin Thanks for the guidance, I'll address these comments in a follow-up PR soon.

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
Closes apache#23128 from xuanyuanking/SPARK-26142.

Lead-authored-by: Yuanjian Li <[email protected]>
Co-authored-by: liyuanjian <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…LShuffleMetricsReporter

## What changes were proposed in this pull request?

Follow-up for apache#23128: move the SQL shuffle read metrics code into `SQLShuffleMetricsReporter`, in order to keep the related pieces closer together and avoid the problem of forgetting to update SQLShuffleMetricsReporter when others add new metrics.

## How was this patch tested?

Existing tests.

Closes apache#23175 from xuanyuanking/SPARK-26142-follow.

Authored-by: Yuanjian Li <[email protected]>
Signed-off-by: Reynold Xin <[email protected]>