[SPARK-34840][SHUFFLE] Fixes cases of corruption in merged shuffle … #31934
@tgravescs @Ngone51 @attilapiros @mridulm @Victsm
Does this cause data corruption if people use it with the Spark 3.1.1 release? Or are these blocks somehow caught, so that the shuffle ends up failing?
Can one of the admins verify this patch?
This test failure is unrelated:
dongjoon-hyun
left a comment
Hi, @otterc.
SPARK-32916 is already released with Fix Version: 3.1.0. To have an independent Fix Version, this should have a new JIRA issue.
Ngone51
left a comment
Hmm, I am trying to understand how those 2 scenarios cause the merged block to become corrupted.
- Do you mean that calling `StreamCallback.onFailure()` twice causes the block to be corrupted? It seems the only thing `onFailure` does is `setCurrentMapIndex(-1)` and `setEncounteredFailure(true)`. Neither of them touches the files, e.g., by resetting the position or truncating.
- I can see how the duplicate stream may interfere with an active stream, e.g., the active stream may see `getCurrentMapIndex` < 0 and `isEncounteredFailure` = true while it is itself writing normally. But it seems the active stream is able to heal itself within the current framework.
I probably missed some details. Could you elaborate on how the corruption happens? Thanks.
.../network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
5aaaa6f to c33f961
I created SPARK-34840 to address this. cc @dongjoon-hyun
Thank you, @otterc!
In both the scenarios, the Providing examples for both of these:
I have added UTs for both of these cases as well, with similar examples.
@otterc Thanks for the explanation. Now I understand the cause. To confirm, for example 2, I think the first 2 steps are not necessary, right?
@Ngone51 I think the first 2 steps are necessary because in this edge case this can only happen when a stream is trying to merge a duplicate block, which was Let me know if I am missing some other cases. I can add UTs for them as well.
So, it should be
Right, that was a typo. Yes, it should be
Ok, I get it now.
Ngone51
left a comment
LGTM, except one minor comment.
}
}
}
isWriting = false;
Move this into the if condition scope?
I can move this into the if scope; that would not change the behavior or cause any issues. The only reason I had it outside is that this is consistent with where the flag is unset in onComplete. I understand that is a trivial cosmetic reason, so I can move it.
Ok, keeping it consistent sounds fine. We can leave it as it is since it's trivial.
LGTM, thanks @otterc.
### What changes were proposed in this pull request?
This PR fixes bugs that cause corruption of push-merged blocks when a client terminates while pushing a block. `RemoteBlockPushResolver` was introduced in #30062 (SPARK-32916).

There are 2 scenarios where the merged blocks get corrupted:
1. `StreamCallback.onFailure()` is called more than once. Initially we assumed that the `onFailure` callback would be called just once per stream. However, we observed that it is called twice when a client connection is reset. When the client connection is reset, 2 events are triggered in this order:
   - `exceptionCaught`. This event is propagated to `StreamInterceptor`. `StreamInterceptor.exceptionCaught()` invokes `callback.onFailure(streamId, cause)`. This is the first time `StreamCallback.onFailure()` is invoked.
   - `channelInactive`. Since the channel closes, the `channelInactive` event is triggered and is again propagated to `StreamInterceptor`. `StreamInterceptor.channelInactive()` invokes `callback.onFailure(streamId, new ClosedChannelException())`. This is the second time `StreamCallback.onFailure()` is invoked.
2. The flag `isWriting` is set prematurely to true. This introduces an edge case where a stream that is trying to merge a duplicate block (created because of a speculative task) may interfere with an active stream if the duplicate stream fails.

Also adding additional changes that improve the code:
1. Using positional writes all the time, because this simplifies the code and microbenchmarking has not shown any performance impact.
2. Additional minor changes suggested by mridulm during an internal review.

### Why are the changes needed?
These are bug fixes and simplify the code.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added unit tests. I have also tested these changes in LinkedIn's internal fork on a cluster.
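The two scenarios described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the code in `RemoteBlockPushResolver`: `PartitionState`, `PushStreamCallback`, and `onData()` are hypothetical names, and only `getCurrentMapIndex`, `setCurrentMapIndex`, `isWriting`, and `onFailure` echo identifiers mentioned in this thread. The idea shown is that a stream sets `isWriting` only once it actually owns the partition, and `onFailure` resets the shared state only while `isWriting` is set, so a repeated `onFailure` call or a failing duplicate stream leaves the active stream alone.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;

/** Hypothetical per-partition state shared by every stream merging into one file. */
class PartitionState {
  private int currentMapIndex = -1; // -1 means no stream currently owns the partition

  int getCurrentMapIndex() { return currentMapIndex; }
  void setCurrentMapIndex(int mapIndex) { currentMapIndex = mapIndex; }
}

/** Hypothetical per-stream callback; one instance per pushed block. */
class PushStreamCallback {
  private final PartitionState partition;
  private final int mapIndex;
  private boolean isWriting = false;

  PushStreamCallback(PartitionState partition, int mapIndex) {
    this.partition = partition;
    this.mapIndex = mapIndex;
  }

  /** Returns true if this stream owns the partition and may write. */
  boolean onData() {
    synchronized (partition) {
      if (partition.getCurrentMapIndex() >= 0 && !isWriting) {
        // Another stream (possibly a speculative duplicate of the same map
        // output) already owns the partition, so do not mark this one as writing.
        return false;
      }
      partition.setCurrentMapIndex(mapIndex);
      isWriting = true; // set only after ownership has been taken
      return true;
    }
  }

  /** Safe to call more than once, e.g. for exceptionCaught and then channelInactive. */
  void onFailure(Throwable cause) {
    synchronized (partition) {
      if (isWriting) {
        // Only the stream that owns the partition may release it.
        partition.setCurrentMapIndex(-1);
      }
      isWriting = false;
    }
  }
}

class FailureGuardDemo {
  public static void main(String[] args) {
    PartitionState partition = new PartitionState();
    PushStreamCallback active = new PushStreamCallback(partition, 7);
    PushStreamCallback duplicate = new PushStreamCallback(partition, 7);
    active.onData();    // the active stream takes ownership
    duplicate.onData(); // the duplicate is turned away
    // A connection reset delivers two failure events to the duplicate stream.
    duplicate.onFailure(new IOException("Connection reset"));
    duplicate.onFailure(new ClosedChannelException());
    System.out.println(partition.getCurrentMapIndex());
  }
}
```

In this sketch the partition still reports map index 7 after the duplicate stream has failed twice, because the duplicate never set `isWriting` and therefore never released a partition it did not own.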
Co-authored-by: Chandni Singh chsinghlinkedin.com
Co-authored-by: Min Shen mshenlinkedin.com

Closes #31934 from otterc/SPARK-32916-followup.

Lead-authored-by: Chandni Singh <[email protected]>
Co-authored-by: Min Shen <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit 6d88212)
Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
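For readers unfamiliar with the "positional writes" mentioned in the commit message, the following is a small sketch of the general technique using `FileChannel.write(ByteBuffer, long)`. How the resolver actually tracks offsets is not shown in this thread, so the `committed` offset, the `PositionalWriteSketch` class, and the failure scenario are assumptions made purely for illustration. The point is that an absolute write does not depend on the channel's current position, so bytes left behind by a failed stream are simply overwritten by the next write at the last committed offset.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PositionalWriteSketch {
  /** Writes all of buf starting at the absolute offset; returns the offset just past the data. */
  static long writeAt(FileChannel channel, ByteBuffer buf, long offset) throws IOException {
    long pos = offset;
    while (buf.hasRemaining()) {
      pos += channel.write(buf, pos); // absolute write; channel.position() is not changed
    }
    return pos;
  }

  private static ByteBuffer bytes(String s) {
    return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
  }

  /** Simulates one committed block, one failed partial block, and one retried block. */
  static String demo() {
    try {
      Path file = Files.createTempFile("merged-sketch", ".data");
      try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
        long committed = writeAt(channel, bytes("AAAA"), 0);  // first block completes
        writeAt(channel, bytes("xx"), committed);             // stream fails midway; offset not advanced
        committed = writeAt(channel, bytes("BBBB"), committed); // next block overwrites the partial bytes
        channel.truncate(committed);                           // drop anything past the committed offset
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Because every write names its own offset, the sketch never has to reposition the channel after a failure, which is one plausible reading of why always using positional writes simplifies the code.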
Thanks for the reviews @Ngone51, @dongjoon-hyun!