Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
fb10b91
Adding DateTimeFormatter
MaxGekk Dec 1, 2018
a9b39ec
Support DateTimeFormatter by JacksonParser and JacksonGenerator
MaxGekk Dec 1, 2018
ff589f5
Make test independent from current time zone
MaxGekk Dec 1, 2018
4646ded
Fix a test by new fallback
MaxGekk Dec 1, 2018
1c838e0
Set time zone explicitly
MaxGekk Dec 1, 2018
142f301
Updating the migration guide
MaxGekk Dec 1, 2018
606da21
Fix the migration guide by replacing CSV by JSON
MaxGekk Dec 1, 2018
f326042
Inlining method's arguments
MaxGekk Dec 1, 2018
4120228
A test for roundtrip timestamp parsing
MaxGekk Dec 2, 2018
6689747
Merge remote-tracking branch 'origin/master' into json-time-parser
MaxGekk Dec 2, 2018
e575162
Set time zone to GMT to eliminate of situation when time zone offset …
MaxGekk Dec 2, 2018
a35d5bf
UTC -> GMT
MaxGekk Dec 2, 2018
2a2085d
Using floorDiv to take days from seconds
MaxGekk Dec 2, 2018
55f2eac
Removing unnecessary time zone settings
MaxGekk Dec 2, 2018
57600e2
Merge remote-tracking branch 'origin/master' into json-time-parser
MaxGekk Dec 4, 2018
07fcf46
Using legacy parser in HiveCompatibilitySuite
MaxGekk Dec 5, 2018
6b6ea8a
Enable new parser in HiveCompatibilitySuit
MaxGekk Dec 7, 2018
244654b
Remove saving legacy parser settings
MaxGekk Dec 7, 2018
015fdce
Updating migration guide
MaxGekk Dec 8, 2018
96529f5
Making date parser independent from time zones
MaxGekk Dec 12, 2018
07d6031
Test refactoring
MaxGekk Dec 13, 2018
d761dee
protected is added
MaxGekk Dec 13, 2018
24b1e3d
toInstant -> toInstantWithZoneId
MaxGekk Dec 13, 2018
9a11515
Set time zone in the test
MaxGekk Dec 13, 2018
4b01d05
GMT -> UTC
MaxGekk Dec 13, 2018
0c7b96b
DateTimeFormatter -> TimestampFormatter
MaxGekk Dec 13, 2018
bbaff09
timeParser -> timestampParser
MaxGekk Dec 13, 2018
8af9df9
Round trip tests
MaxGekk Dec 14, 2018
363482e
Renaming test suite
MaxGekk Dec 14, 2018
07e0bf8
Added withClue
MaxGekk Dec 14, 2018
c12da1f
Put test under legacy time parser
MaxGekk Dec 14, 2018
60ab5b1
TODO
MaxGekk Dec 15, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
GMT -> UTC
  • Loading branch information
MaxGekk committed Dec 13, 2018
commit 4b01d05e306906f20372f1b3a7c987a3f5ce1c89
Original file line number Diff line number Diff line change
Expand Up @@ -131,7 +131,7 @@ class Iso8601DateFormatter(
pattern: String,
locale: Locale) extends DateFormatter with FormatterUtils {

val zoneId = ZoneId.of("GMT")
val zoneId = ZoneId.of("UTC")

val formatter = buildFormatter(pattern, locale)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -126,7 +126,7 @@ abstract class HadoopFsRelationTest extends QueryTest with SQLTestUtils with Tes
} else {
Seq(false)
}
withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key -> "GMT") {
withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key -> "UTC") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried here. This test is a round-trip test, do you mean if we write out a date/timestamp to json and read it back, the values will be different if session timezone is not UTC?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be same, if the session local timezone doesn't change between write and read back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried here. This test is a round-trip test ...
It should be same, if the session local timezone doesn't change between write and read back.

Not only JSON parser/formatter involved in the loop but also converting milliseconds to Java's Timestamp and to something else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

converting milliseconds to Java's Timestamp and to something else.

these don't matter once the dataframe is created.

The problem is, if we have a dataframe(no matter how it is generated), we write it out and read it back. If it becomes different, we have a bug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me look at it deeper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove the timezone setting here? Then we can look at jenkens report and see which seed can reproduce the bug and debug it locally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see which seed can reproduce the bug and debug it locally.

I ran it locally many times. It is almost 100% reproducible for any seed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about to put the test under the flag spark.sql.legacy.timeParser.enabled and create a separate JIRA ticket? I would believe the bug somewhere in Spark's home made date/time functions rather than Java 8 implementation of timestamps parsing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Can you create the ticket? And put a TODO here which refers to the ticket.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the ticket: https://issues.apache.org/jira/browse/SPARK-26374 and I added TODO

for (dataType <- supportedDataTypes) {
for (parquetDictionaryEncodingEnabled <- parquetDictionaryEncodingEnabledConfs) {
val extraMessage = if (isParquetDataSource) {
Expand Down