-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-26327][SQL][FOLLOW-UP] Refactor the code and restore the metrics name #23328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| /** | ||
| * Send the updated metrics to driver, while this function calling, selectedPartitions has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong.
| Map("numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"), | ||
| "numFiles" -> SQLMetrics.createMetric(sparkContext, "number of files"), | ||
| "fileListingTime" -> SQLMetrics.createMetric(sparkContext, "file listing time (ms)"), | ||
| "metadataTime" -> SQLMetrics.createMetric(sparkContext, "metadata time"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the original name is more straightfoward to end users who has no idea about file listing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we will add more metadata operation and reuse this metrics. Anyway it was not a good idea to do the renaming in a bug fix PR. @xuanyuanking can you create a ticket and send a new PR for renaming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy that, original thinking is this bug fix is part of https://issues.apache.org/jira/browse/SPARK-26222.
| case _ => | ||
| createNonBucketedReadRDD(readFile, selectedPartitions, relation) | ||
| } | ||
| sendDriverMetrics() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 313 and line 315, both are calling selectedPartitions. Thus, it is safer to say selectedPartitions is initialized before we send the driver-side metrics
|
LGTM |
|
Test build #100208 has finished for PR 23328 at commit
|
|
Test build #100209 has finished for PR 23328 at commit
|
| val optimizerMetadataTimeNs = relation.location.metadataOpsTimeNs.getOrElse(0L) | ||
| val startTime = System.nanoTime() | ||
| val ret = relation.location.listFiles(partitionFilters, dataFilters) | ||
| driverMetrics("filesNum") = ret.map(_.files.size.toLong).sum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @gatorsmile . It looks like a typo of numFiles.
|
Test build #100217 has finished for PR 23328 at commit
|
|
Thanks! Merged to master. |
…cs name ## What changes were proposed in this pull request? - The original comment about `updateDriverMetrics` is not right. - Refactor the code to ensure `selectedPartitions ` has been set before sending the driver-side metrics. - Restore the original name, which is more general and extendable. ## How was this patch tested? The existing tests. Closes apache#23328 from gatorsmile/followupSpark-26142. Authored-by: gatorsmile <[email protected]> Signed-off-by: gatorsmile <[email protected]>
…cs name ## What changes were proposed in this pull request? - The original comment about `updateDriverMetrics` is not right. - Refactor the code to ensure `selectedPartitions ` has been set before sending the driver-side metrics. - Restore the original name, which is more general and extendable. ## How was this patch tested? The existing tests. Closes apache#23328 from gatorsmile/followupSpark-26142. Authored-by: gatorsmile <[email protected]> Signed-off-by: gatorsmile <[email protected]>
What changes were proposed in this pull request?
updateDriverMetricsis not right.selectedPartitionshas been set before sending the driver-side metrics.How was this patch tested?
The existing tests.