-
Notifications
You must be signed in to change notification settings - Fork 29k
[WIP][SQL] Optimize fromJavaDate() and toJavaDate() #28091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@cloud-fan @HyukjinKwon This optimization requires to restore ugly code from Spark 2.4. I am not sure that it is worth Maybe, it makes sense for the read path but not for write. |
| tf.format(us) | ||
| } | ||
|
|
||
| private def millisToDays(millisUtc: Long, timeZone: TimeZone): SQLDate = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we use microsToDays?
| after 1582, vec on 3813 3844 28 26.2 38.1 5.5X | ||
| before 1582, vec off 25912 25949 38 3.9 259.1 0.8X | ||
| before 1582, vec on 4322 4343 19 23.1 43.2 4.8X | ||
| after 1582, vec off 14060 14251 193 7.1 140.6 1.0X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan Read results look good.
| before 1582 17979 17979 0 5.6 179.8 0.5X | ||
| after 1582, noop 9250 9250 0 10.8 92.5 1.0X | ||
| before 1582, noop 9522 9522 0 10.5 95.2 1.0X | ||
| after 1582 16377 16377 0 6.1 163.8 0.6X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I don't think it's worth to optimize the write path. The code looks ugly, from my point of view.
|
Test build #120674 has finished for PR 28091 at commit
|
|
Test build #120676 has finished for PR 28091 at commit
|
|
Test build #120679 has finished for PR 28091 at commit
|
…om-to-java-date # Conflicts: # sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt # sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt
|
Test build #121014 has finished for PR 28091 at commit
|
|
I am closing this PR because the final results look not so good: OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load dates from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off 39651 39686 31 2.5 396.5 1.0X
-after 1582, vec on 3647 3660 13 27.4 36.5 10.9X
-before 1582, vec off 38155 38219 61 2.6 381.6 1.0X
-before 1582, vec on 4041 4046 6 24.7 40.4 9.8X
+after 1582, vec off 76436 77047 877 1.3 764.4 1.0X
+after 1582, vec on 3790 3797 10 26.4 37.9 20.2X
+before 1582, vec off 52369 52460 105 1.9 523.7 1.5X
+before 1582, vec on 4171 4182 10 24.0 41.7 18.3X |
What changes were proposed in this pull request?
Why are the changes needed?
Before:
After:
Does this PR introduce any user-facing change?
How was this patch tested?