[SPARK-26068][Core]ChunkedByteBufferInputStream should handle empty chunks correctly #23040
ab81c1e to fa7af44
|
cc @ericl and @JoshRosen, this bug was introduced by https://github.com/apache/spark/pull/14099/files. After losing the empty chunk check, the ChunkedByteBufferInputStream doesn't handle empty chunks correctly. |
xuanyuanking
left a comment
Sorry, after looking into the code in detail, the problem will not happen in Spark. @linhong-intel, maybe you can leave the detailed analysis and close this.
|
Problem: But on the other hand:
As a result, current Spark code will never reach the problem as long as we don't use So it's OK whether we fix this or not. |
|
It's good to fix a potential bug, can you add a unit test? |
|
@cloud-fan Thanks for your reply. |
|
ok to test |
|
Test build #98878 has finished for PR 23040 at commit
|
  extends InputStream {

-  private[this] var chunks = chunkedByteBuffer.getChunks().iterator
+  private[this] var chunks = chunkedByteBuffer.getChunks().filter(_.hasRemaining).iterator
Can you add a comment above, saying that we do this filter because `read` assumes there are no empty chunks?
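To make the assumption in this comment concrete, here is a minimal sketch of how a single empty chunk can cut a stream short. `SimpleChunkedInputStream`, `EmptyChunkDemo`, and the `skipEmpty` flag are hypothetical names invented for illustration; this is not Spark's actual class, only a simplified stand-in that assumes `read` advances by at most one chunk per call.

```scala
import java.io.InputStream
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for a chunk-backed stream (not Spark's class).
class SimpleChunkedInputStream(buffers: Seq[ByteBuffer], skipEmpty: Boolean)
  extends InputStream {

  // With skipEmpty = true, empty chunks are dropped up front, in the spirit of
  // the `.filter(_.hasRemaining)` change discussed in this PR.
  private val chunks: Iterator[ByteBuffer] =
    (if (skipEmpty) buffers.filter(_.hasRemaining) else buffers).iterator

  private var currentChunk: ByteBuffer =
    if (chunks.hasNext) chunks.next() else null

  // Advances by at most one chunk per call, so it silently assumes that the
  // next chunk is never empty.
  override def read(): Int = {
    if (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
      currentChunk = chunks.next()
    }
    if (currentChunk != null && currentChunk.hasRemaining) {
      currentChunk.get() & 0xFF // convert the signed byte to 0..255
    } else {
      -1 // reported as end of stream, even if later chunks still hold data
    }
  }
}

object EmptyChunkDemo {
  // Drains the stream one byte at a time and returns everything read before EOF.
  def readAll(in: InputStream): List[Int] =
    Iterator.continually(in.read()).takeWhile(_ != -1).toList

  // Fresh buffers on every call, because reading advances each buffer's position.
  def sampleChunks(): Seq[ByteBuffer] = Seq(
    ByteBuffer.wrap(Array[Byte](1, 2)),
    ByteBuffer.allocate(0), // the empty chunk in the middle
    ByteBuffer.wrap(Array[Byte](3, 4)))

  def main(args: Array[String]): Unit = {
    // Without filtering, the empty chunk ends the stream early: List(1, 2)
    println(readAll(new SimpleChunkedInputStream(sampleChunks(), skipEmpty = false)))
    // With filtering, all bytes are read: List(1, 2, 3, 4)
    println(readAll(new SimpleChunkedInputStream(sampleChunks(), skipEmpty = true)))
  }
}
```

Filtering once at construction keeps `read` simple. An alternative would be to loop inside `read` until a non-empty chunk is found, at the cost of a slightly more involved hot path.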
|
LGTM except one comment |
|
also cc @jiangxb1987 @zsxwing |
|
cc @cloud-fan @srowen |
|
Test build #98989 has finished for PR 23040 at commit
|
jiangxb1987
left a comment
LGTM
|
thanks, merging to master! |
…chunks correctly

## What changes were proposed in this pull request?
Empty chunk in ChunkedByteBuffer will truncate the ChunkedByteBufferInputStream.
The detail reason is described in: https://issues.apache.org/jira/browse/SPARK-26068

## How was this patch tested?
Modified current UT to cover this case.

Closes apache#23040 from LinhongLiu/fix-empty-chunked-byte-buffer.

Lead-authored-by: Liu,Linhong <[email protected]>
Co-authored-by: Xianjin YE <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
An empty chunk in ChunkedByteBuffer will truncate the ChunkedByteBufferInputStream.
The detailed reason is described in: https://issues.apache.org/jira/browse/SPARK-26068
How was this patch tested?
Modified current UT to cover this case.