-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-48479][SQL] Support creating scalar and table SQL UDFs in parser #46816
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48479][SQL] Support creating scalar and table SQL UDFs in parser #46816
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: looks like we can remove createOrReplaceTableColType as it's completely the same as this new colDefinition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Will create a follow up PR for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we add an error class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be an internal error since we blocked languages other than SQL in the parser.
|
there are legitimate test failures: |
|
@allisonwang-db there are merge conflicts now... |
c0e198c to
fac0c87
Compare
|
the pyspark failure is unrelated, thanks, merging to master! |
|
Can we explain why the PR description isn't consistent with the implementation? Clauses like LANGUAGE-, SECURITY- are introduced by accident? |
| * ([param_name param_type [COMMENT param_comment], ...]) | ||
| * RETURNS {ret_type | TABLE (ret_name ret_type [COMMENT ret_comment], ...])} | ||
| * [routine_characteristics] | ||
| * RETURN {expression | TABLE ( query )}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the PR description is copied from here, but we need to update here to match the full syntax. @allisonwang-db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SECURITY- clause and its KEYWORDS do not necessarily have to be introduced based on the privilege control of other objects like tables, columns, and other host language UDFs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and Spark fails if the SECURITY clause is specified: https://github.com/apache/spark/pull/46816/files#diff-77a9aad2da3dc60210a2c4d2f3165d5f1d0acd54ca4811072a053225170ed748R808
It's just for better error message: instead of antlr errors, we give a clear error message for unsupported features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having better error messages on unsupported features sounds reasonable to me. However, we seems to ban a MYSQL feature instead of the ANSI one.
- ANSI 1999
<external security clause> ::= EXTERNAL SECURITY DEFINER | EXTERNAL SECURITY INVOKER | EXTERNAL SECURITY IMPLEMENTATION DEFINED - MySQL:
https://dev.mysql.com/doc/refman/8.4/en/stored-objects-security.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe higher ANSI standards support these, but I don't have a copy to check:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @srielau for ANSI SQL syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I've updated the description for this PR and will create a follow up PR to fix it in code.
…and colDefinitionType in parser ### What changes were proposed in this pull request? This PR is a follow-up for #46816 to address this comment: #46816 (comment) to consolidate `createOrReplaceTableColType` and `colDefinitionType` since they are exactly the same. ### Why are the changes needed? To make the code cleaner ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests ### Was this patch authored or co-authored using generative AI tooling? No Closes #47047 from allisonwang-db/spark-48479-sql-udf-parser-follow-up. Authored-by: allisonwang-db <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
This PR adds support for creating user-defined SQL functions in parser. Here is the SQL syntax:
Why are the changes needed?
To support SQL user-defined functions.
Does this PR introduce any user-facing change?
Yes. This PR adds parser support for creating user-defined SQL functions.
How was this patch tested?
New unit tests.
Was this patch authored or co-authored using generative AI tooling?
No