Skip to content
Closed
Changes from 1 commit
Commits
Show all changes
105 commits
Select commit Hold shift + click to select a range
b2e92b4
Test for reading json in UTF-16 with BOM
MaxGekk Feb 11, 2018
cb2f27b
Use user's charset or autodetect it if the charset is not specified
MaxGekk Feb 11, 2018
0d45fd3
Added a type and a comment for charset
MaxGekk Feb 13, 2018
1fb9b32
Replacing the monadic chaining by matching because it is more readable
MaxGekk Feb 13, 2018
c3b04ee
Keeping the old method for backward compatibility
MaxGekk Feb 13, 2018
93d3879
testFile is moved into the test to make more local because it is used…
MaxGekk Feb 13, 2018
15798a1
Adding the charset as third parameter to the text method
MaxGekk Feb 13, 2018
cc05ce9
Removing whitespaces at the end of the line
MaxGekk Feb 13, 2018
74f2026
Fix the comment in javadoc style
MaxGekk Feb 13, 2018
4856b8e
Simplifying of the UTF-16 test
MaxGekk Feb 13, 2018
084f41f
A hint to the exception how to set the charset explicitly
MaxGekk Feb 15, 2018
31cd793
Fix for scala style checks
MaxGekk Feb 15, 2018
6eacd18
Run tests again
MaxGekk Feb 15, 2018
3b4a509
Improving of the exception message
MaxGekk Feb 15, 2018
cd1124e
Appended the original message to the exception
MaxGekk Feb 15, 2018
ebf5390
Multi-line reading of json file in utf-32
MaxGekk Feb 17, 2018
c5b6a35
Autodetect charset of jsons in the multiline mode
MaxGekk Feb 17, 2018
ef5e6c6
Test for reading a json in UTF-16LE in the multiline mode by using us…
MaxGekk Feb 17, 2018
f9b6ad1
Fix test: rename the test file - utf32be -> utf32BE
MaxGekk Feb 18, 2018
3b7714c
Fix code style
MaxGekk Feb 18, 2018
edb9167
Appending the create verb to the method for readability
MaxGekk Feb 18, 2018
5ba2881
Making the createParser as a separate private method
MaxGekk Feb 18, 2018
1509e10
Fix code style
MaxGekk Feb 18, 2018
e3184b3
Checks the charset option is supported
MaxGekk Feb 19, 2018
87d259c
Support charset as a parameter of the json method
MaxGekk Feb 19, 2018
76c1d08
Test for charset different from utf-8
MaxGekk Feb 19, 2018
88395b5
Description of the charset option of the json method
MaxGekk Feb 20, 2018
f2f8ae7
Minor changes in comments: added . at the end of a sentence
MaxGekk Feb 21, 2018
b451a03
Added a test for wrong charset name
MaxGekk Feb 21, 2018
c13c159
Testing that charset in any case is acceptable
MaxGekk Feb 21, 2018
1cb3ac0
Test: user specified wrong (but supported) charset
MaxGekk Feb 21, 2018
108e8e7
Set charset as an option
MaxGekk Feb 25, 2018
0d20cc6
Test: saving to json in UTF-32BE
MaxGekk Feb 23, 2018
54baf9f
Taking user's charset for saved json
MaxGekk Feb 23, 2018
1d50d94
Test: output charset is UTF-8 by default
MaxGekk Feb 23, 2018
bb53798
Changing the readJsonFiles method for readability
MaxGekk Mar 4, 2018
961b482
The test checks that json written by Spark can be read back
MaxGekk Mar 4, 2018
a794988
Adding the delimiter option encoded in base64
MaxGekk Feb 24, 2018
dccdaa2
Separator encoded as a sequence of bytes in hex
MaxGekk Feb 24, 2018
d0abab7
Refactoring: removed unused imports and renaming a parameter
MaxGekk Feb 24, 2018
6741796
The sep option is renamed to recordSeparator. The supported format is…
MaxGekk Mar 4, 2018
e4faae1
Renaming recordSeparator to recordDelimiter
MaxGekk Mar 18, 2018
01f4ef5
Comments for the recordDelimiter option
MaxGekk Mar 18, 2018
24cedb9
Support other formats of recordDelimiter
MaxGekk Mar 18, 2018
d40dda2
Checking different charsets and record delimiters
MaxGekk Mar 18, 2018
ad6496c
Renaming test's method to make it more readable
MaxGekk Mar 18, 2018
358863d
Test of reading json in different charsets and delimiters
MaxGekk Mar 18, 2018
7e5be5e
Fix inferring of csv schema for any charsets
MaxGekk Mar 18, 2018
d138d2d
Fix errors of scalastyle check
MaxGekk Mar 18, 2018
c26ef5d
Reserving format for regular expressions and concatenated json
MaxGekk Mar 22, 2018
5f0b069
Fix recordDelimiter tests
MaxGekk Mar 22, 2018
ef8248f
Additional cases are added to the delimiter test
MaxGekk Mar 22, 2018
2efac08
Renaming recordDelimiter to lineSeparator
MaxGekk Mar 22, 2018
b2020fa
Adding HyukjinKwon changes
MaxGekk Mar 22, 2018
f99c1e1
Revert lineSepInWrite back to lineSep
MaxGekk Mar 22, 2018
6d13d00
Merge remote-tracking branch 'origin/master' into json-line-sep
MaxGekk Mar 22, 2018
77112ef
Fix passing of the lineSeparator to HadoopFileLinesReader
MaxGekk Mar 22, 2018
d632706
Fix python style checking
MaxGekk Mar 23, 2018
bbff402
Fix text source tests and javadoc comments
MaxGekk Mar 23, 2018
3af996b
Merge branch 'json-charset' into json-charset-record-delimiter
MaxGekk Mar 27, 2018
8253811
Merge branch 'json-line-sep' into json-charset-record-delimiter
MaxGekk Mar 27, 2018
ab8210c
Getting UTF-8 as default charset for lineSep
MaxGekk Mar 27, 2018
7c6f115
Set charset different from UTF-8 in the test
MaxGekk Mar 27, 2018
f553b07
Fix for the charset test: charset wasn't specified
MaxGekk Mar 27, 2018
d6a07a1
Removing line leaved after merge
MaxGekk Mar 27, 2018
cb12ea3
Removing flexible format for lineSep
MaxGekk Mar 28, 2018
eb2965b
Adding ticket number to test titles
MaxGekk Mar 28, 2018
7a4865c
Making comments more precise
MaxGekk Mar 28, 2018
dbeb0c1
lineSep must be specified if charset is different from UTF-8
MaxGekk Mar 28, 2018
ac67020
Support encoding as a synonym for the charset option
MaxGekk Mar 29, 2018
d96b720
Merge remote-tracking branch 'origin/master' into json-encoding-line-sep
MaxGekk Mar 29, 2018
75f7bb6
Fix missing require and specifying field of internal row explicitly
MaxGekk Mar 29, 2018
d93dcdc
Making the doc generator happy
MaxGekk Mar 29, 2018
65b4b73
Making the encoding name as the primary name
MaxGekk Mar 31, 2018
6b52419
Blacklisting UTF-16 and UTF-32 in per-line mode
MaxGekk Mar 31, 2018
6116bac
Changes after code review
MaxGekk Mar 31, 2018
5383400
Renaming charset to encoding
MaxGekk Mar 31, 2018
1aeae3c
Changes requested by HyukjinKwon in the review
MaxGekk Apr 4, 2018
7e20891
Adding tests for SPARK-23094
MaxGekk Apr 4, 2018
0d3ed3c
Fix comments
MaxGekk Apr 5, 2018
5d5c295
Matching by encoding per each line is eliminated
MaxGekk Apr 6, 2018
e7be77d
Addressing Hyukjin's review comments
MaxGekk Apr 6, 2018
6bd841a
Fixes regarding to coding style
MaxGekk Apr 6, 2018
6a62679
Making lineSep as opt string
MaxGekk Apr 6, 2018
3b30ce0
Removing option name in a test
MaxGekk Apr 6, 2018
fcd0a21
Merge branch 'master' into json-encoding-line-sep
MaxGekk Apr 8, 2018
af71324
Addressing HyukjinKwon's review comments
MaxGekk Apr 8, 2018
76dbbed
Merge branch 'json-encoding-line-sep' of github.com:MaxGekk/spark-1 i…
MaxGekk Apr 8, 2018
3207e59
Merge remote-tracking branch 'origin/master' into json-encoding-line-sep
MaxGekk Apr 8, 2018
b817184
Making Scala style checker and compiler happy
MaxGekk Apr 8, 2018
15df9af
Merge remote-tracking branch 'origin/master' into json-encoding-line-sep
MaxGekk Apr 13, 2018
36253f4
Adressing Hyukjin Kwon's review comments
MaxGekk Apr 14, 2018
aa69559
Adding benchmarks for json reads
MaxGekk Apr 15, 2018
c35d5d1
Making Scala style checker happy
MaxGekk Apr 15, 2018
58fc5c6
Eliminate unneeded wrapping by ByteArrayInputStream per-line during s…
MaxGekk Apr 15, 2018
63b5894
Adding benchmarks for wide lines
MaxGekk Apr 15, 2018
1ace082
Making comments shorter
MaxGekk Apr 15, 2018
6c0df03
Removing empty line between spark's imports
MaxGekk Apr 15, 2018
b4c0d38
Creating a stream decoder with specific buffer size
MaxGekk Apr 15, 2018
f2a259f
Enable all JSON benchmarks
MaxGekk Apr 15, 2018
482b799
Addressing Hyukjin Kwon's review comments
MaxGekk Apr 22, 2018
a0ab98b
Addressing Wenchen Fan's review comments
MaxGekk Apr 23, 2018
a7be182
Merge branch 'json-encoding-line-sep' of github.com:MaxGekk/spark-1 i…
MaxGekk Apr 23, 2018
e0cebf4
Merge remote-tracking branch 'origin/master' into json-encoding-line-sep
MaxGekk Apr 27, 2018
d3d28aa
Addressing Hyukjin Kwon's review comments
MaxGekk Apr 28, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Removing whitespaces at the end of the line
  • Loading branch information
MaxGekk committed Mar 17, 2018
commit cc05ce9af7c9f1d14bd10c1f46a60ce043c13fe1
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ private[sql] object CreateJacksonParser extends Serializable {
jsonFactory.createParser(record.getBytes, 0, record.getLength)
}
}

def inputStream(jsonFactory: JsonFactory, record: InputStream): JsonParser = {
jsonFactory.createParser(record)
}
Expand Down