-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-5147][Streaming] Delete the received data WAL log periodically #4149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…r duration of all the receiver streams
|
Test build #25932 has started for PR 4149 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just moved the FakeReceiver out of the ReceiverSuite to avoid serialization issues. The diff is very confusing, so just compare directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to dig around and understand why ReceiverInfo's might have a null actor reference. There seems to be only one case when this happens and it's when an error is received. But in that case doesn't the error report itself come from the actor (so you can have a reference to the actor via the sender?).
At a minimum, should we maybe indicate in ReceiverInfo that users should defend against the possibility of the actor ref being null? And would it be possible to avoid the possibility of a null reference alltogether (not for this patch specifically, but in general).
|
Hey TD - I looked through this. The logic seems sound to me though admittedly I'm newer with this part of the code. I added some comments around clarity, but they weren't really specific to the patch's logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a doc here that explains that the max remember duration can be used to set a conservative threshold on when it is safe to remember older data. Otherwise it's not super obvious why callers are using this particular time.
|
Test build #25932 has finished for PR 4149 at commit
|
|
Test PASSed. |
|
Thanks @tdas for your refactoring, I think we should fix this #4032 accordingly, otherwise we will still meet such kind of error when recovering from failure according to my test of your branch: |
|
@pwendell Since I made changes only to the comments I am merging this. |
This is a refactored fix based on jerryshao 's PR #4037 This enabled deletion of old WAL files containing the received block data. Improvements over #4037 - Respecting the rememberDuration of all receiver streams. In #4037, if there were two receiver streams with multiple remember durations, the deletion would have delete based on the shortest remember duration, thus deleting data prematurely for the receiver stream with longer remember duration. - Added unit test to test creation of receiver WAL, automatic deletion, and respecting of remember duration. jerryshao I am going to merge this ASAP to make it 1.2.1 Thanks for the initial draft of this PR. Made my job much easier. Author: Tathagata Das <[email protected]> Author: jerryshao <[email protected]> Closes #4149 from tdas/SPARK-5147 and squashes the following commits: 730798b [Tathagata Das] Added comments. c4cf067 [Tathagata Das] Minor fixes 2579b27 [Tathagata Das] Refactored the fix to make sure that the cleanup respects the remember duration of all the receiver streams 2736fd1 [jerryshao] Delete the old WAL log periodically (cherry picked from commit 3027f06) Signed-off-by: Tathagata Das <[email protected]>
|
Test build #25952 has started for PR 4149 at commit
|
|
Test build #25952 has finished for PR 4149 at commit
|
|
Test PASSed. |
This is a refactored fix based on @jerryshao 's PR #4037
This enabled deletion of old WAL files containing the received block data.
Improvements over #4037
@jerryshao I am going to merge this ASAP to make it 1.2.1 Thanks for the initial draft of this PR. Made my job much easier.