-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-9162][SQL] Implement code generation for ScalaUDF #9270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we put these branches in a loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You meant using a script to generate it like the f?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, but I think you can just write a loop instead of having branches?
...
val funcClassName = callFunc.getClass.getName
...
val evals = children.map(_.gen(ctx))
...
// generate callFunc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, part of this can be reduced like you show. But seems we will still have a (smaller) pattern matching, at least for something like this: classOf[(Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any) => Any].getName. I will do it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For that one, can you just do callFunc.getClass.getName?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wouldn't it work? Isn't that better because we can even specialize it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, you are right. I should have it a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I tried, using function.getClass.getName causes java.lang.Exception: failed to compile: org.codehaus.janino.JaninoRuntimeException: Incompatible return types in ZeroArgument UDF test. If I use classOf[() => Any].getName for this zero argument function, it is ok then.
Because other tests are passed, I added a condition to take care of zero argument function now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like function.getClass.getName will cause other problem (java.lang.IncompatibleClassChangeError) as the following failed tests show.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But classOf[(Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any) => Any].getName works. So I will revert to it then.
|
Test build #44315 has finished for PR 9270 at commit
|
|
Test build #44337 has finished for PR 9270 at commit
|
|
retest this please. |
…sChangeError. But classOf[(Any) => Any].getName works.
|
Test build #44343 has finished for PR 9270 at commit
|
|
@rxin any other comments? |
|
ping @rxin |
|
I'm traveling right now. Will take a look when I get back. cc @davies also. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why index here? All freshName will be unique.
|
@viirya Could you add unit tests for every data types? it's easy to be wrong for some types. |
|
Test build #44893 has finished for PR 9270 at commit
|
|
@davies May I ask what the data types you meant? scala.Function0 to scala.Function22? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to save the index when you push the expression into ctx.references, otherwise it's easy to be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added that. Thanks.
|
@viirya I meant the SQL data types, for example, DateType, ArrayType, StructType and so on. |
|
Test build #44897 has finished for PR 9270 at commit
|
|
Test build #44898 has finished for PR 9270 at commit
|
|
@viirya Do you have time to add more tests for this? |
|
@davies yes. I will do it soon. |
|
Test build #45199 has finished for PR 9270 at commit
|
|
@davies I think it would be better to do that in a follow up pr. Thanks. |
|
LGTM, merging this into master and 1.6 branch. |
JIRA: https://issues.apache.org/jira/browse/SPARK-9162 Currently ScalaUDF extends CodegenFallback and doesn't provide code generation implementation. This path implements code generation for ScalaUDF. Author: Liang-Chi Hsieh <[email protected]> Closes #9270 from viirya/scalaudf-codegen. (cherry picked from commit 574141a) Signed-off-by: Davies Liu <[email protected]>
JIRA: https://issues.apache.org/jira/browse/SPARK-9162
Currently ScalaUDF extends CodegenFallback and doesn't provide code generation implementation. This path implements code generation for ScalaUDF.