-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-26142][SQL] Implement shuffle read metrics in SQL #23128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@xuanyuanking Could you address the conflicts? Thanks for you fast work! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing a hashmap lookup here could introduce serious performance regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I’m not referring to just this function, but in general, especially for per-row).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the less consideration on per-row operation here, I should be more careful. Fix done in cb46bfe.
|
Test build #99220 has finished for PR 23128 at commit
|
1b556ec to
cb46bfe
Compare
|
@gatorsmile Thanks Xiao! Conflicts resolve done, as Reynold comments in #23105 (comment), when the ShuffleMetricsReporter move to ShuffleReadMetricsReporter in write pr, it will conflict again here, I'll keep tracking the relevant pr. |
|
Test build #99223 has finished for PR 23128 at commit
|
|
retest this please. |
|
Test build #99225 has finished for PR 23128 at commit
|
|
Test build #4440 has finished for PR 23128 at commit
|
cb46bfe to
8689acb
Compare
|
Test build #99236 has finished for PR 23128 at commit
|
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
Outdated
Show resolved
Hide resolved
|
retest this please |
|
Test build #99308 has finished for PR 23128 at commit
|
|
retest this please |
|
Test build #99311 has finished for PR 23128 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to consider this case since ShuffledRowRDD is a private API. If we do need to consider it, we also need to take care if users pass in a metrics that is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean we may leave the metrics empty when creating ShuffledRowRDD in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean we may leave the metrics empty when creating ShuffledRowRDD in tests?
Yes, like we did in UnsafeRowSerializerSuite.
I don't think we need to consider this case since ShuffledRowRDD is a private API
Got it, after search new ShuffledRowRDD in all source code, UnsafeRowSerializerSuite is the only place, I'll change the test and delete the default value of metrics in this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 0348ae5.
|
LGTM except one comment |
8689acb to
0348ae5
Compare
|
Test build #99334 has finished for PR 23128 at commit
|
sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala
Outdated
Show resolved
Hide resolved
|
Test build #99347 has finished for PR 23128 at commit
|
|
python UT failed cause jvm crush. |
|
retest this please |
|
Test build #99355 has finished for PR 23128 at commit
|
|
retest this please |
| val metrics = context.taskMetrics().createTempShuffleReadMetrics() | ||
| val tempMetrics = context.taskMetrics().createTempShuffleReadMetrics() | ||
| // Wrap the tempMetrics with SQLShuffleMetricsReporter here to support | ||
| // shuffle read metrics in SQL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// `SQLShuffleMetricsReporter` will update its own metrics for SQL exchange operator,
// as well as the `tempMetrics` for basic shuffle metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Wenchen, done in 8e84c5b.
|
Test build #99359 has finished for PR 23128 at commit
|
|
Test build #99363 has finished for PR 23128 at commit
|
|
thanks , merging to master! |
|
Thanks @cloud-fan @gatorsmile @rxin ! |
| /** | ||
| * Create all shuffle read relative metrics and return the Map. | ||
| */ | ||
| def getShuffleReadMetrics(sc: SparkContext): Map[String, SQLMetric] = Map( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to name this create, rather than get, to imply we are creating a new set rather than just returning some existing sets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, rename it to createShuffleReadMetrics and move to SQLShuffleMetricsReporter. Done in #23175.
| * contains all shuffle metrics defined in [[SQLMetrics.getShuffleReadMetrics]]. | ||
| */ | ||
| private[spark] class SQLShuffleMetricsReporter( | ||
| tempMetrics: TempShuffleReadMetrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 space indent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks ,done in #23175.
|
|
||
| private val baseForAvgMetric: Int = 10 | ||
|
|
||
| val REMOTE_BLOCKS_FETCHED = "remoteBlocksFetched" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than putting this list and the getShuffleReadMetrics function here, we should move it into SQLShuffleMetricsReporter. Otherwise in the future when one adds another metric, he/she is likely to forget to update SQLShuffleMetricsReporter.
|
@xuanyuanking @cloud-fan when you think about where to put each code block, make sure you also think about future evolution of the codebase. In general put relevant things closer to each other (e.g. in one class, one file, or one method). |
|
@rxin Thanks for guidance, I'll address these comments in a follow up PR soon. |
## What changes were proposed in this pull request? Implement `SQLShuffleMetricsReporter` on the sql side as the customized ShuffleMetricsReporter, which extended the `TempShuffleReadMetrics` and update SQLMetrics, in this way shuffle metrics can be reported in the SQL UI. ## How was this patch tested? Add UT in SQLMetricsSuite. Manual test locally, before:  after:  Closes apache#23128 from xuanyuanking/SPARK-26142. Lead-authored-by: Yuanjian Li <[email protected]> Co-authored-by: liyuanjian <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…LShuffleMetricsReporter ## What changes were proposed in this pull request? Follow up for apache#23128, move sql read metrics relatives to `SQLShuffleMetricsReporter`, in order to put sql shuffle read metrics relatives closer and avoid possible problem about forgetting update SQLShuffleMetricsReporter while new metrics added by others. ## How was this patch tested? Existing tests. Closes apache#23175 from xuanyuanking/SPARK-26142-follow. Authored-by: Yuanjian Li <[email protected]> Signed-off-by: Reynold Xin <[email protected]>
What changes were proposed in this pull request?
Implement
SQLShuffleMetricsReporteron the sql side as the customized ShuffleMetricsReporter, which extended theTempShuffleReadMetricsand update SQLMetrics, in this way shuffle metrics can be reported in the SQL UI.How was this patch tested?
Add UT in SQLMetricsSuite.


Manual test locally, before:
after: