[SPARK-26140] Enable custom metrics implementation in shuffle reader #23105
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -48,7 +48,8 @@ private[spark] trait ShuffleManager { | |
| handle: ShuffleHandle, | ||
| startPartition: Int, | ||
| endPartition: Int, | ||
| context: TaskContext): ShuffleReader[K, C] | ||
| context: TaskContext, | ||
| metrics: ShuffleMetricsReporter): ShuffleReader[K, C] | ||
|
Contributor
IIUC, we should pass a read metrics reporter here, as this method is
Contributor
Author
It is actually a read metrics reporter here. In the write PR this is renamed to ShuffleReadMetricsReporter. |
||
|
|
||
| /** | ||
| * Remove a shuffle's metadata from the ShuffleManager. | ||
|
|
||
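To make the signature change above concrete, here is a minimal sketch of a reader that receives its reporter from the caller instead of looking one up itself. Every type in it (`RecordsReadReporter`, `SketchShuffleReader`, `SketchShuffleManager`, `CountingReporter`) is a hypothetical, cut-down stand-in rather than one of Spark's real classes, and the half-open partition range is an assumption of the sketch.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark's types.
trait RecordsReadReporter {
  def incRecordsRead(v: Long): Unit
}

trait SketchShuffleReader[K, C] {
  def read(): Iterator[(K, C)]
}

// Mirrors the shape of the changed signature: the caller supplies the
// reporter together with the partition range.
class SketchShuffleManager(data: Map[Int, Seq[(String, Int)]]) {
  def getReader(
      startPartition: Int,
      endPartition: Int,
      metrics: RecordsReadReporter): SketchShuffleReader[String, Int] =
    new SketchShuffleReader[String, Int] {
      // Assumption: partitions are read from startPartition up to, but not
      // including, endPartition.
      override def read(): Iterator[(String, Int)] =
        (startPartition until endPartition).iterator
          .flatMap(p => data.getOrElse(p, Seq.empty[(String, Int)]))
          .map { record =>
            metrics.incRecordsRead(1) // report each record to the injected reporter
            record
          }
    }
}

// A caller-chosen reporter; any other implementation could be passed instead.
class CountingReporter extends RecordsReadReporter {
  var recordsRead = 0L
  override def incRecordsRead(v: Long): Unit = recordsRead += v
}
```

Because the reporter is a parameter, the same reader code can feed whichever implementation the caller passes in. Note that the iterator is lazy, so the counter only advances as records are consumed.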
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one or more | ||
| * contributor license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright ownership. | ||
| * The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.spark.shuffle | ||
|
|
||
| /** | ||
| * An interface for reporting shuffle information, for each shuffle. This interface assumes | ||
|
|
||
| * all the methods are called on a single thread, i.e. concrete implementations would not need | ||
| * to synchronize anything. | ||
| */ | ||
| private[spark] trait ShuffleMetricsReporter { | ||
| def incRemoteBlocksFetched(v: Long): Unit | ||
| def incLocalBlocksFetched(v: Long): Unit | ||
| def incRemoteBytesRead(v: Long): Unit | ||
| def incRemoteBytesReadToDisk(v: Long): Unit | ||
| def incLocalBytesRead(v: Long): Unit | ||
| def incFetchWaitTime(v: Long): Unit | ||
| def incRecordsRead(v: Long): Unit | ||
| } | ||
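As an illustration of the single-thread assumption in the doc comment above, a concrete reporter could keep its counters in plain `var`s with no locks or atomics. The sketch below redeclares the trait under a hypothetical name (`SketchShuffleMetricsReporter`) without `private[spark]` so it compiles outside the Spark package; `InMemoryShuffleMetrics` and its accessor methods are likewise assumptions, not code from this PR.

```scala
// Stand-in for the trait in the diff, redeclared without private[spark]
// so that the sketch compiles outside the org.apache.spark package.
trait SketchShuffleMetricsReporter {
  def incRemoteBlocksFetched(v: Long): Unit
  def incLocalBlocksFetched(v: Long): Unit
  def incRemoteBytesRead(v: Long): Unit
  def incRemoteBytesReadToDisk(v: Long): Unit
  def incLocalBytesRead(v: Long): Unit
  def incFetchWaitTime(v: Long): Unit
  def incRecordsRead(v: Long): Unit
}

// Hypothetical implementation: plain vars, no synchronization, relying on
// the assumption that all methods are called from a single thread.
class InMemoryShuffleMetrics extends SketchShuffleMetricsReporter {
  private var _remoteBlocksFetched = 0L
  private var _localBlocksFetched = 0L
  private var _remoteBytesRead = 0L
  private var _remoteBytesReadToDisk = 0L
  private var _localBytesRead = 0L
  private var _fetchWaitTime = 0L
  private var _recordsRead = 0L

  override def incRemoteBlocksFetched(v: Long): Unit = _remoteBlocksFetched += v
  override def incLocalBlocksFetched(v: Long): Unit = _localBlocksFetched += v
  override def incRemoteBytesRead(v: Long): Unit = _remoteBytesRead += v
  override def incRemoteBytesReadToDisk(v: Long): Unit = _remoteBytesReadToDisk += v
  override def incLocalBytesRead(v: Long): Unit = _localBytesRead += v
  override def incFetchWaitTime(v: Long): Unit = _fetchWaitTime += v
  override def incRecordsRead(v: Long): Unit = _recordsRead += v

  // Read-side accessors, added only for this sketch.
  def totalBlocksFetched: Long = _remoteBlocksFetched + _localBlocksFetched
  def totalBytesRead: Long = _remoteBytesRead + _localBytesRead
  def recordsRead: Long = _recordsRead
}
```

If a reporter like this were ever shared across threads, the unsynchronized `+=` updates could lose increments, which is why the interface states the assumption explicitly.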
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one or more | ||
| * contributor license agreements. See the NOTICE file distributed with | ||
| * this work for additional information regarding copyright ownership. | ||
| * The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| * (the "License"); you may not use this file except in compliance with | ||
| * the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.spark.shuffle | ||
|
|
||
| /** | ||
| * An interface for reporting shuffle read metrics, for each shuffle. This interface assumes | ||
| * all the methods are called on a single thread, i.e. concrete implementations would not need | ||
| * to synchronize. | ||
| * | ||
| * All methods have an additional Spark visibility modifier to allow public, concrete implementations | ||
| * that still have these methods marked as private[spark]. | ||
| */ | ||
| private[spark] trait ShuffleReadMetricsReporter { | ||
|
Contributor
how do we plan to use this interface later on? It's not used in this PR.
Contributor
Author
@xuanyuanking just submitted a PR on how to use it :)
Member
#23128 :) |
||
| private[spark] def incRemoteBlocksFetched(v: Long): Unit | ||
| private[spark] def incLocalBlocksFetched(v: Long): Unit | ||
| private[spark] def incRemoteBytesRead(v: Long): Unit | ||
| private[spark] def incRemoteBytesReadToDisk(v: Long): Unit | ||
| private[spark] def incLocalBytesRead(v: Long): Unit | ||
| private[spark] def incFetchWaitTime(v: Long): Unit | ||
| private[spark] def incRecordsRead(v: Long): Unit | ||
| } | ||
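One way a custom implementation might plug into this interface is a reporter that forwards each update to two others, for example a task-level reporter and a custom one. This is only a sketch of that idea, not what this PR or #23128 implements: `SketchReadMetricsReporter` is a stand-in trait declared without `private[spark]`, and `FanOutReadMetricsReporter` and `TotalsReporter` are hypothetical names.

```scala
import scala.collection.mutable

// Stand-in for the trait in the diff, without private[spark], so the
// sketch compiles outside the org.apache.spark package.
trait SketchReadMetricsReporter {
  def incRemoteBlocksFetched(v: Long): Unit
  def incLocalBlocksFetched(v: Long): Unit
  def incRemoteBytesRead(v: Long): Unit
  def incRemoteBytesReadToDisk(v: Long): Unit
  def incLocalBytesRead(v: Long): Unit
  def incFetchWaitTime(v: Long): Unit
  def incRecordsRead(v: Long): Unit
}

// Hypothetical reporter that forwards every update to two delegates.
class FanOutReadMetricsReporter(
    first: SketchReadMetricsReporter,
    second: SketchReadMetricsReporter) extends SketchReadMetricsReporter {

  // Apply the same update to both delegates, in order.
  private def both(update: SketchReadMetricsReporter => Unit): Unit = {
    update(first)
    update(second)
  }

  override def incRemoteBlocksFetched(v: Long): Unit = both(_.incRemoteBlocksFetched(v))
  override def incLocalBlocksFetched(v: Long): Unit = both(_.incLocalBlocksFetched(v))
  override def incRemoteBytesRead(v: Long): Unit = both(_.incRemoteBytesRead(v))
  override def incRemoteBytesReadToDisk(v: Long): Unit = both(_.incRemoteBytesReadToDisk(v))
  override def incLocalBytesRead(v: Long): Unit = both(_.incLocalBytesRead(v))
  override def incFetchWaitTime(v: Long): Unit = both(_.incFetchWaitTime(v))
  override def incRecordsRead(v: Long): Unit = both(_.incRecordsRead(v))
}

// Hypothetical delegate that keeps running totals keyed by metric name.
class TotalsReporter extends SketchReadMetricsReporter {
  val totals: mutable.Map[String, Long] =
    mutable.Map.empty[String, Long].withDefaultValue(0L)

  private def add(name: String, v: Long): Unit = totals(name) = totals(name) + v

  override def incRemoteBlocksFetched(v: Long): Unit = add("remoteBlocksFetched", v)
  override def incLocalBlocksFetched(v: Long): Unit = add("localBlocksFetched", v)
  override def incRemoteBytesRead(v: Long): Unit = add("remoteBytesRead", v)
  override def incRemoteBytesReadToDisk(v: Long): Unit = add("remoteBytesReadToDisk", v)
  override def incLocalBytesRead(v: Long): Unit = add("localBytesRead", v)
  override def incFetchWaitTime(v: Long): Unit = add("fetchWaitTime", v)
  override def incRecordsRead(v: Long): Unit = add("recordsRead", v)
}
```

Because the shuffle reader only sees the trait, swapping in a fan-out reporter like this would need no change to the reading code itself.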
|
|
||
|
|
||
| /** | ||
| * An interface for reporting shuffle write metrics. This interface assumes all the methods are | ||
| * called on a single thread, i.e. concrete implementations would not need to synchronize. | ||
| * | ||
| * All methods have an additional Spark visibility modifier to allow public, concrete implementations | ||
| * that still have these methods marked as private[spark]. | ||
| */ | ||
| private[spark] trait ShuffleWriteMetricsReporter { | ||
| private[spark] def incBytesWritten(v: Long): Unit | ||
| private[spark] def incRecordsWritten(v: Long): Unit | ||
| private[spark] def incWriteTime(v: Long): Unit | ||
| private[spark] def decBytesWritten(v: Long): Unit | ||
| private[spark] def decRecordsWritten(v: Long): Unit | ||
| } | ||
this was moved to TempShuffleReadMetrics
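On the write side, the `dec*` methods in ShuffleWriteMetricsReporter suggest that a writer may need to take back metrics it has already reported. The sketch below shows one plausible use, undoing uncommitted writes; it is an assumption about usage, not code from this PR. `SketchWriteMetricsReporter` is a stand-in trait without `private[spark]`, and `InMemoryWriteMetrics` and `RevertibleWriter` are hypothetical.

```scala
// Stand-in for the trait in the diff, without private[spark], so the
// sketch compiles outside the org.apache.spark package.
trait SketchWriteMetricsReporter {
  def incBytesWritten(v: Long): Unit
  def incRecordsWritten(v: Long): Unit
  def incWriteTime(v: Long): Unit
  def decBytesWritten(v: Long): Unit
  def decRecordsWritten(v: Long): Unit
}

// Hypothetical unsynchronized implementation (single-thread assumption).
class InMemoryWriteMetrics extends SketchWriteMetricsReporter {
  private var _bytesWritten = 0L
  private var _recordsWritten = 0L
  private var _writeTime = 0L

  override def incBytesWritten(v: Long): Unit = _bytesWritten += v
  override def incRecordsWritten(v: Long): Unit = _recordsWritten += v
  override def incWriteTime(v: Long): Unit = _writeTime += v
  override def decBytesWritten(v: Long): Unit = _bytesWritten -= v
  override def decRecordsWritten(v: Long): Unit = _recordsWritten -= v

  def bytesWritten: Long = _bytesWritten
  def recordsWritten: Long = _recordsWritten
}

// Hypothetical writer that remembers what it has reported since the last
// commit so that a revert can subtract exactly that amount again.
class RevertibleWriter(metrics: SketchWriteMetricsReporter) {
  private var uncommittedBytes = 0L
  private var uncommittedRecords = 0L

  def write(record: Array[Byte]): Unit = {
    metrics.incBytesWritten(record.length)
    metrics.incRecordsWritten(1)
    uncommittedBytes += record.length
    uncommittedRecords += 1
  }

  // Keep the reported metrics and forget the pending amounts.
  def commit(): Unit = {
    uncommittedBytes = 0L
    uncommittedRecords = 0L
  }

  // Undo the metrics reported since the last commit.
  def revert(): Unit = {
    metrics.decBytesWritten(uncommittedBytes)
    metrics.decRecordsWritten(uncommittedRecords)
    uncommittedBytes = 0L
    uncommittedRecords = 0L
  }
}
```

With this shape, a committed 3-byte record followed by a reverted 2-byte record leaves the reporter at 3 bytes and 1 record.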