Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Jun 4, 2020

What changes were proposed in this pull request?

Set spark.sql.datetime.java8API.enabled to true in:

  1. SparkSQLEnv.init() of Thrift server, and
  2. SparkSQLSessionManager.openSession()

Why are the changes needed?

  1. Date and timestamp string literals are parsed by using Java 8 time API and Spark's session time zone. Before the changes, date/timestamp values were collected as legacy types java.sql.Date/java.sql.Timestamp, and the value of such types didn't respect the config spark.sql.session.timeZone. To have consistent view, users had to keep JVM time zone and Spark's session time zone in sync.
  2. After the changes, formatting of date values doesn't depend on JVM time zone.
  3. While returning dates/timestamps of Java 8 type, we can avoid dates/timestamps rebasing from Proleptic Gregorian calendar to the hybrid calendar (Julian + Gregorian), and the issues related to calendar switching.
  4. Properly handle negative years (BCE).
  5. Consistent conversion of date/timestamp strings to/from internal Catalyst types in both direction to and from Spark.

Does this PR introduce any user-facing change?

Yes. Before:

spark-sql> select make_date(-44, 3, 15);
0045-03-15

After:

spark-sql> select make_date(-44, 3, 15);
-0044-03-15

How was this patch tested?

Manually via bin/spark-sql.

@MaxGekk MaxGekk changed the title [SPARK-30808][SQL] Enable Java 8 time API in in Thrift server [SPARK-30808][SQL] Enable Java 8 time API in Thrift server Jun 4, 2020
case None =>
}

sqlContext.sparkContext.setLocalProperty(SQLConf.DATETIME_JAVA8API_ENABLED.key, "true")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be set in sqlContext.sparkSession.conf, not as sqlContext.sparkContext.setLocalProperty.
...
actually, an even better place to set it would be here in openSession, to just set it once per session. https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala#L65

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we forbid users setting the java 8 time config back?

@juliuszsompolski
Copy link
Contributor

Could you add a test that fails before and is fixed after this change to HiveThriftServerBinarySuite (for JDBC) and CLISuite? I think it could be similar to the test for SPARK-31861?

@SparkQA
Copy link

SparkQA commented Jun 4, 2020

Test build #123540 has finished for PR 28729 at commit 930c293.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 4, 2020

Test build #123544 has finished for PR 28729 at commit 3c35cf5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@MaxGekk MaxGekk changed the title [SPARK-30808][SQL] Enable Java 8 time API in Thrift server [SPARK-31910][SQL] Enable Java 8 time API in Thrift server Jun 5, 2020
@SparkQA
Copy link

SparkQA commented Jun 5, 2020

Test build #123551 has finished for PR 28729 at commit 3c35cf5.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 5, 2020

Test build #123554 has finished for PR 28729 at commit d130f23.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Please change "2. SparkOperation. withLocalProperties()" in PR description to "SparkSQLSessionManager.openSession()".

@cloud-fan
Copy link
Contributor

Does it still work if the user sets the java 8 time config to false manually?

@MaxGekk
Copy link
Member Author

MaxGekk commented Jun 5, 2020

Does it still work if the user sets the java 8 time config to false manually?

It should work because we still have cases for the legacy types in toHiveString, right?

@juliuszsompolski
Copy link
Contributor

It should "work" as in not crash/throw errors, but revert to the inconsistencies it had before this PR, correct @MaxGekk ? I'd say that it's the user conscious choice if they do that and that's fine.

@SparkQA
Copy link

SparkQA commented Jun 5, 2020

Test build #123563 has finished for PR 28729 at commit f3e5357.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 2c9988e Jun 5, 2020
@HyukjinKwon
Copy link
Member

Late LGTM too

@MaxGekk MaxGekk deleted the enable-java8-time-api-in-thrift-server branch June 5, 2020 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants