[SPARK-7199][SQL] Add date and timestamp support to UnsafeRow #5984
Conversation
|
Test build #32127 has finished for PR 5984 at commit
|
|
Test build #32129 has finished for PR 5984 at commit
|
|
Test build #32136 has finished for PR 5984 at commit
|
Super minor style nit, but I think the braces here should be on separate lines, like how setDate is defined below.
|
Test build #32242 has finished for PR 5984 at commit
|
|
@JoshRosen Any more comments or thoughts? |
|
Let me take another look in a little bit. There's been a bit of debate over whether we want to commit to supporting 128-bit timestamps in Tungsten or restrict their precision to 64 bits. The decision that we make for 1.4 may not have to be final, though, since this is an experimental feature that's off by default. Since this patch seems like a clear improvement to the current code and since we'll have the flexibility to change this later, I'm inclined to merge this for 1.4.0 and address the timestamp questions in more detail for 1.5.0. |
|
Jenkins, retest this please. |
|
Test build #33287 has finished for PR 5984 at commit
|
|
Hmm, looks like DateUtils was moved or something? |
|
Yes, @rxin moved it since we want to keep internal utilities out of public packages. |
|
@JoshRosen I've updated for moved DateUtils. |
|
Test build #33323 has finished for PR 5984 at commit
|
|
Jenkins, retest this please. |
|
LGTM pending Jenkins. Once the tests pass, I'll pull this in so that it doesn't become conflicted again. |
|
Test build #33483 has finished for PR 5984 at commit
|
|
ping @JoshRosen |
|
/ping @davies, can you take a look at this? I almost merged it but was worried about causing a conflict with your code gen patch. |
|
@viirya getString/getDate/getTimestamp are public interfaces of Row. They are expensive, so they should not be used internally. UnsafeRow and codegen are both internal, so I don't think they need them at all. Could you remove these changes from this PR? |
|
@davies But I think this PR is just intended to add date and timestamp support (setter/getter) to UnsafeRow? |
|
@viirya Yes, but we didn't need |
|
Test build #34825 has finished for PR 5984 at commit
|
|
Test build #34826 has finished for PR 5984 at commit
|
|
@davies, has this latest set of changes addressed your comments regarding get / set methods in codegen? Just want to make sure that we haven't overlooked that and wanted to ping you since you're more familiar than me with the considerations there. |
We should use target.setInt(column, source.getInt(column))
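A minimal sketch of the suggestion above, assuming a date is held internally as an int count of days since the Unix epoch. `IntRowSketch`, `copyDateColumn`, `toDays`, and `fromDays` are hypothetical names written for illustration and are not Spark's UnsafeRow API; only the `setInt`/`getInt` call shape comes from the comment. The idea is that the copy moves the raw int and never builds a date object.

```java
import java.time.LocalDate;

// Hypothetical stand-in for an internal row; NOT Spark's actual UnsafeRow.
final class IntRowSketch {
    private final int[] values;

    IntRowSketch(int numFields) {
        values = new int[numFields];
    }

    int getInt(int ordinal) {
        return values[ordinal];
    }

    void setInt(int ordinal, int value) {
        values[ordinal] = value;
    }

    // Copies a date column as its raw int, with no conversion to a date object.
    static void copyDateColumn(IntRowSketch source, IntRowSketch target, int column) {
        target.setInt(column, source.getInt(column));
    }

    // Conversions happen only at the boundary (assumption: days since 1970-01-01).
    static int toDays(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    static LocalDate fromDays(int days) {
        return LocalDate.ofEpochDay(days);
    }
}
```

With this shape, the comparatively expensive conversion to and from a date object is confined to the public boundary, while internal copies are plain int moves.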
|
@JoshRosen Now we had separate |
|
If it's still not clear yet, I could send a small PR to show how to support it in UnsafeRow. |
Since timestamps are now represented as longs, we can support updates to timestamps, so we can move this into the settableFieldTypes list.
OK. I will update this later.
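To illustrate why a long-backed timestamp can be updated in place, here is a rough sketch that assumes every field occupies one fixed 8-byte slot. `FixedWidthRowSketch` and `toMicros` are hypothetical names, and the microsecond unit is an assumption chosen for the example; the thread does not state the unit.

```java
import java.nio.ByteBuffer;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical fixed-width row: one 8-byte slot per field. NOT Spark's UnsafeRow.
final class FixedWidthRowSketch {
    private static final int SLOT_BYTES = 8;
    private final ByteBuffer buffer;

    FixedWidthRowSketch(int numFields) {
        buffer = ByteBuffer.allocate(numFields * SLOT_BYTES);
    }

    long getLong(int ordinal) {
        return buffer.getLong(ordinal * SLOT_BYTES);
    }

    // Overwrites the slot in place; the row never grows or shifts other fields.
    void setLong(int ordinal, long value) {
        buffer.putLong(ordinal * SLOT_BYTES, value);
    }

    // Assumed encoding for this sketch: microseconds since the Unix epoch.
    static long toMicros(Instant instant) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, instant);
    }
}
```

Because the new value has the same width as the old one, an update is a single overwrite of the slot, which is the property that makes a type a candidate for a settable list. A variable-length value would not have this property.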
ping
|
@davies It is very clear. Thanks. |
|
@davies @JoshRosen I updated this. Please take a look when you are available. Thanks. |
|
Test build #34993 has finished for PR 5984 at commit
|
Is it possible to just use IntUnsafeColumnWriter for DateType? The same question applies to TimestampType.
Yes. Updated now.
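The reuse discussed here could look roughly like the sketch below, assuming dates are ints and timestamps are longs. `ColumnWriterSketch`, `FieldType`, `Writer`, and `writerFor` are hypothetical names for illustration; they only mimic the idea of dispatching DateType to the int writer and TimestampType to the long writer, not the actual writer classes in Spark.

```java
import java.nio.ByteBuffer;

// Hypothetical dispatch table; NOT Spark's UnsafeColumnWriter hierarchy.
final class ColumnWriterSketch {
    enum FieldType { INT, LONG, DATE, TIMESTAMP }

    interface Writer {
        void write(ByteBuffer row, int ordinal, Object value);
    }

    private static final int SLOT_BYTES = 8;

    // One writer per physical representation, shared by several logical types.
    static final Writer INT_WRITER =
        (row, ordinal, value) -> row.putInt(ordinal * SLOT_BYTES, (Integer) value);
    static final Writer LONG_WRITER =
        (row, ordinal, value) -> row.putLong(ordinal * SLOT_BYTES, (Long) value);

    // DATE reuses the int writer and TIMESTAMP reuses the long writer.
    static Writer writerFor(FieldType type) {
        switch (type) {
            case INT:
            case DATE:
                return INT_WRITER;
            case LONG:
            case TIMESTAMP:
                return LONG_WRITER;
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}
```

Sharing writers by physical representation avoids adding a dedicated writer class for each logical type that has the same in-memory layout.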
remove the last comma
|
Once the last comment is addressed, this looks good to me, thanks! |
|
Test build #35018 has finished for PR 5984 at commit
|
|
Test build #35029 has finished for PR 5984 at commit
|
|
Looks like an unrelated test failure. |
|
retest this please. |
|
Test build #35040 has finished for PR 5984 at commit
|
|
LGTM, merging this into master! |
JIRA: https://issues.apache.org/jira/browse/SPARK-7199

Author: Liang-Chi Hsieh <[email protected]>

Closes apache#5984 from viirya/add_date_timestamp and squashes the following commits:

7f21ce9 [Liang-Chi Hsieh] For comment.
0b89698 [Liang-Chi Hsieh] Add timestamp to settableFieldTypes.
c30d490 [Liang-Chi Hsieh] Use default IntUnsafeColumnWriter and LongUnsafeColumnWriter.
672ef17 [Liang-Chi Hsieh] Remove getter/setter for Date and Timestamp and use Int and Long for them.
9f3e577 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp
281e844 [Liang-Chi Hsieh] Fix scala style.
fb532b5 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp
80af342 [Liang-Chi Hsieh] Fix compiling error.
f4f5de6 [Liang-Chi Hsieh] Fix scala style.
a463e83 [Liang-Chi Hsieh] Use Long to store timestamp for rows.
635388a [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp
46946c6 [Liang-Chi Hsieh] Adapt for moved DateUtils.
b16994e [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_date_timestamp
752251f [Liang-Chi Hsieh] Support setDate. Fix failed test.
fcf8db9 [Liang-Chi Hsieh] Add functions for Date and Timestamp to SpecificRow.
e42a809 [Liang-Chi Hsieh] Fix style.
4c07b57 [Liang-Chi Hsieh] Add date and timestamp support to UnsafeRow.