[SPARK-17378] [BUILD] Upgrade snappy-java to 1.1.2.6 #14958
Closed
Changes from 1 commit (8 commits total):
4b7ec9a  Update Snappy to 1.1.2.6 (a-roberts)
440b1df  Update spark-deps-hadoop-2.2 (a-roberts)
140f70a  Update spark-deps-hadoop-2.3 (a-roberts)
cda5016  Update spark-deps-hadoop-2.4 (a-roberts)
36253ba  Update spark-deps-hadoop-2.6 (a-roberts)
3795997  Update spark-deps-hadoop-2.7 (a-roberts)
accc88f  Add test for snappy-java handling of magic header (a-roberts)
022cad7  Revert test addition in CompressionCodecSuite (a-roberts)
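The version bump itself (4b7ec9a, plus the regenerated spark-deps-hadoop-* manifests) is a one-line dependency change. As a rough illustration only, and not the PR's actual diff, these are the upgraded Maven coordinates expressed as an sbt dependency:

```scala
// Illustration only: snappy-java 1.1.2.6's Maven coordinates in sbt notation.
// In Spark itself the version is set in the Maven build and reflected in the
// spark-deps-hadoop-* manifest files updated by this PR.
libraryDependencies += "org.xerial.snappy" % "snappy-java" % "1.1.2.6"
```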
Add test for snappy-java handling of magic header
Probably needs TLC with the while loop; this is based on https://github.com/xerial/snappy-java/blob/60cc0c2e1d1a76ae2981d0572a5164fcfdfba5f1/src/test/java/org/xerial/snappy/SnappyInputStreamTest.java, but it lives outside the snappy package, so we can't use MAGIC_HEADER[0].
commit accc88f8c73864d50b5588f43c53c11323052d2e
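As context (not part of the commit): snappy's raw format prefixes the payload with the uncompressed length as a little-endian base-128 varint, and 130 is the smallest length whose first varint byte is 0x82 (-126 as a signed byte), the same byte that opens snappy-java's stream magic header. A minimal sketch of that encoding; `varint` is a hypothetical helper written here for illustration:

```scala
// Hypothetical helper, for illustration only: little-endian base-128 varint,
// as used for the uncompressed-length prefix in snappy's raw format.
def varint(n: Int): List[Byte] =
  if ((n & ~0x7f) == 0) List(n.toByte)
  else ((n & 0x7f) | 0x80).toByte :: varint(n >>> 7)

// 130 = 0x82, so its varint is [0x82, 0x01] = [-126, 1]: the first byte
// collides with MAGIC_HEADER[0] in snappy-java's stream format, which is
// exactly the collision the test below constructs by hand.
assert(varint(130) == List[Byte](-126, 1))
```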
core/src/test/scala/org/apache/spark/io/CompressionCodecSuite.scala

@@ -17,7 +17,7 @@
 
 package org.apache.spark.io
 
-import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
 
 import com.google.common.io.ByteStreams
 
@@ -130,4 +130,55 @@ class CompressionCodecSuite extends SparkFunSuite {
     ByteStreams.readFully(concatenatedBytes, decompressed)
     assert(decompressed.toSeq === (0 to 127))
   }
+
+  // Based on https://github.com/xerial/snappy-java/blob/60cc0c2e1d1a76ae2981d0572a5164fcfdfba5f1/src/test/java/org/xerial/snappy/SnappyInputStreamTest.java
+  test("SPARK-17378: snappy-java should handle magic header when reading stream") {
+    val b = new ByteArrayOutputStream()
+    // Write an uncompressed length whose first byte is -126, the same value as
+    // MAGIC_HEADER[0]. MAGIC_HEADER isn't public, so write the byte directly.
+    b.write(-126)
+    b.write(0x01)
+    // uncompressed data length = 130
+
+    val data = new ByteArrayOutputStream()
+    for (i <- 0 until 130) {
+      data.write('A')
+    }
+    val dataMoreThan8Len = data.toByteArray()
+
+    // Write a literal (the lower 2 bits of the first tag byte are 00;
+    // the upper 6 bits encode the data size).
+    b.write(60 << 2) // a 1-byte data length follows
+    b.write(dataMoreThan8Len.length - 1) // subsequent data length
+    b.write(dataMoreThan8Len)
+
+    val compressed = b.toByteArray()
+
+    // This should succeed
+    assert(dataMoreThan8Len === org.xerial.snappy.Snappy.uncompress(compressed))
+
+    // Reproduce the error from xerial/snappy-java#142
+    val in = new org.xerial.snappy.SnappyInputStream(new ByteArrayInputStream(b.toByteArray()))
+    val uncompressed = readFully(in)
+    // On broken snappy-java versions this assertion fails: uncompressed is empty.
+    assert(dataMoreThan8Len === uncompressed)
+  }
+
+  private def readFully(input: InputStream): Array[Byte] = {
+    try {
+      val out = new ByteArrayOutputStream()
+      val buf = new Array[Byte](4096)
+      var readBytes = 0
+      while (readBytes != -1) {
+        readBytes = input.read(buf)
+        if (readBytes != -1) {
+          out.write(buf, 0, readBytes)
+        }
+      }
+      out.flush()
+      out.toByteArray()
+    } finally {
+      input.close()
+    }
+  }
 }
Hm, this seems to tie the test to an internal detail of Snappy, though; Spark itself doesn't really need to assert that detail in a test.
I feel like this test is just testing snappy, which snappy can test itself. I could see testing a case at the level of Spark that triggers this bug and verifies it's fixed.
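For illustration, a minimal sketch of the shape such a Spark-level test could take, exercising the public CompressionCodec round trip rather than snappy-java's framing internals. This is not code from the PR: the 130-byte payload mirrors the reverted test's data, and a plain round trip may not even trigger the original framing collision, which is presumably part of why the test was reverted rather than reworked.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}

import com.google.common.io.ByteStreams

import org.apache.spark.SparkConf
import org.apache.spark.io.{CompressionCodec, SnappyCompressionCodec}

// Hypothetical sketch: compress and decompress through Spark's own codec API.
// If snappy-java mishandled its stream framing, the round trip would corrupt
// or truncate the payload.
val conf = new SparkConf()
val codec = CompressionCodec.createCodec(conf, classOf[SnappyCompressionCodec].getName)

val payload = Array.fill[Byte](130)('A'.toByte) // 130 'A's, as in the reverted test

val bytes = new ByteArrayOutputStream()
val out = codec.compressedOutputStream(bytes)
out.write(payload)
out.close()

val in = codec.compressedInputStream(new ByteArrayInputStream(bytes.toByteArray))
val roundTripped = ByteStreams.toByteArray(in)
in.close()
assert(roundTripped.sameElements(payload))
```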
Yeah, I agree. How about we just revert the test-case commit here and merge the 1.1.2.6 change itself, since folks want it, and then add an extra robustness test in a later PR if we want to?