[SPARK-9373][SQL] Support StructType in Tungsten projection #7689
There is a lot of code duplication between this one and createCode.
|
Test build #1206 has finished for PR 7689 at commit
|
0cae3e8 to
a9600e6
Compare
|
Test build #38578 has finished for PR 7689 at commit
|
|
Test build #38590 has finished for PR 7689 at commit
|
|
Jenkins, test this please. |
|
Test build #38599 has finished for PR 7689 at commit
|
f71b659 to
be9f377
Compare
|
Test build #38598 timed out for PR 7689 at commit |
|
Test build #38616 has finished for PR 7689 at commit
|
|
Looks like |
Typo: string -> struct
This isn't obvious at first glance. Since these expressions return names, wouldn't it be safe to just assume that they return strings and use .asInstanceOf[String] rather than calling toString here? I'm just a little wary of using .toString() during processing.
Actually, I see that this is just copied from CreateNamedStruct above, so it should be fine.
|
Merged build finished. Test FAILed. |
If you don't mind, can you also add a check to ensure that codegen is enabled?
|
At a high level this seems fine to me, but I haven't looked in super-close detail yet. I'm fine with merging this pending tests and continuing to iterate in followup patches. |
|
Thanks - going to merge this and submit a PR to fix the remaining issues tonight. |
This pull request updates GenerateUnsafeProjection to support StructType. If an input struct is already backed by an UnsafeRow, GenerateUnsafeProjection copies its bytes directly into the buffer space without any conversion. Otherwise, GenerateUnsafeProjection runs recursively generated code to convert the input into an UnsafeRow and then copies that row into the buffer space.
This pull request also creates a TungstenProject operator that projects data directly into UnsafeRow. Note that I'm not sure if this is the way we want to structure Unsafe+codegen operators, but we can defer that decision to follow-up pull requests.
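The struct-handling strategy described above (copy the bytes directly when the input is already binary, otherwise convert recursively and then copy) can be illustrated with a small, self-contained sketch. This is not Spark code and does not use the real UnsafeRow layout: `StructProjectionSketch`, `BinaryStruct`, `ObjectStruct`, `convert`, `writeStruct`, and `project` are hypothetical names, and the toy encoding (8-byte longs, length-prefixed nested structs) is an assumption made only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class StructProjectionSketch {

    /** Hypothetical stand-in for UnsafeRow: a struct that is already serialized. */
    public static final class BinaryStruct {
        final byte[] bytes;
        public BinaryStruct(byte[] bytes) { this.bytes = bytes; }
    }

    /** Hypothetical stand-in for a generic row: fields held as Java objects. */
    public static final class ObjectStruct {
        final Object[] fields; // each field is a Long, a BinaryStruct, or an ObjectStruct
        public ObjectStruct(Object... fields) { this.fields = fields; }
    }

    /** Recursively converts an object-backed struct into its binary form. */
    public static BinaryStruct convert(ObjectStruct input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (Object field : input.fields) {
            if (field instanceof Long) {
                // Toy encoding: a primitive long occupies 8 bytes.
                byte[] encoded = ByteBuffer.allocate(8).putLong((Long) field).array();
                buffer.write(encoded, 0, encoded.length);
            } else {
                // Nested struct: handled by the same two-path logic.
                writeStruct(field, buffer);
            }
        }
        return new BinaryStruct(buffer.toByteArray());
    }

    /** Writes a struct into the buffer as a 4-byte length followed by its bytes. */
    public static void writeStruct(Object struct, ByteArrayOutputStream buffer) {
        BinaryStruct binary;
        if (struct instanceof BinaryStruct) {
            // Fast path: already binary, so the bytes are copied with no conversion.
            binary = (BinaryStruct) struct;
        } else if (struct instanceof ObjectStruct) {
            // Slow path: convert recursively first, then copy the result.
            binary = convert((ObjectStruct) struct);
        } else {
            throw new IllegalArgumentException("Unsupported struct value: " + struct);
        }
        byte[] length = ByteBuffer.allocate(4).putInt(binary.bytes.length).array();
        buffer.write(length, 0, length.length);
        buffer.write(binary.bytes, 0, binary.bytes.length);
    }

    /** Projects a single struct value into a fresh byte buffer. */
    public static byte[] project(Object struct) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeStruct(struct, buffer);
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        ObjectStruct inner = new ObjectStruct(2L);
        byte[] viaConversion = project(new ObjectStruct(1L, inner));
        byte[] viaDirectCopy = project(new ObjectStruct(1L, convert(inner)));
        System.out.println(Arrays.equals(viaConversion, viaDirectCopy));
    }
}
```

In this sketch both paths produce the same bytes; the only difference is whether the nested struct is converted on the way in or copied as-is, which is the saving the direct-copy path is meant to provide.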