[SPARK-37438][SQL] ANSI mode: Use store assignment rules for resolving function invocation #34681
Kubernetes integration test starting
Kubernetes integration test status failure
sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
Test build #145508 has finished for PR 34681 at commit
    // Skip nodes who's children have not been resolved yet.
    case e if !e.childrenResolved => e

    case d @ DateAdd(AnyTimestampType(), _) => d.copy(startDate = Cast(d.startDate, DateType))
We should refactor these functions to extend ImplicitCastInputTypes later
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #145513 has finished for PR 34681 at commit
Signed-off-by: Karen Feng <[email protected]>
Re-generate golden files
Kubernetes integration test starting
Kubernetes integration test status failure
Merging to master
Test build #145528 has finished for PR 34681 at commit
What changes were proposed in this pull request?
Under ANSI mode (spark.sql.ansi.enabled=true), function invocation in Spark SQL follows the Store assignment rules: input values are stored as the declared parameter type of the SQL function.
Why are the changes needed?
Currently, the ANSI SQL mode resolves function invocation with Least Common Type Resolution based on the Type precedence list. After a closer look at the ANSI SQL standard, the "store assignment" syntax rules should be used for resolving the type coercion between the inputs and the parameters of a SQL function, while the Type precedence list is used for "Subject routine determination" (SQL function overloads).

I have also done some data science among real-world SQL queries. The following implicit function casts are not allowed as per Least Common Type Resolution, but they are commonly seen:
- CONCAT(DATE_ADD(%1, CAST(%2 AS INT)), SUBSTR(CAST(%1 AS TIMESTAMP), 11)) AS TIMESTAMP)
- date_sub(now(), 7) < ...
- from_unixtime(updated/1000); note that updated and 1000 will be converted to Double first.

The changes in this PR are ANSI compatible and good for the adoption of ANSI SQL mode.
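To make the difference between the two strategies concrete, here is a minimal, self-contained sketch. It is not Spark's implementation: the type names, the `CoercionSketch` object, and both functions are hypothetical, and the precedence lists are deliberately simplified assumptions. It only illustrates why the three casts listed above are rejected by a precedence-list check but accepted by a store-assignment-style check.

```scala
// Hypothetical, simplified model for illustration only; not Spark's actual code.
sealed trait SqlType
case object IntT extends SqlType
case object LongT extends SqlType
case object DoubleT extends SqlType
case object DateT extends SqlType
case object TimestampT extends SqlType
case object StringT extends SqlType

object CoercionSketch {
  private val numeric: Set[SqlType] = Set(IntT, LongT, DoubleT)
  private val datetime: Set[SqlType] = Set(DateT, TimestampT)

  // Store-assignment-style check: may a value of type `from` be stored
  // as the declared parameter type `to`?
  def canStoreAssign(from: SqlType, to: SqlType): Boolean =
    from == to ||
      (numeric(from) && numeric(to)) ||   // e.g. Double -> Long
      (datetime(from) && datetime(to)) || // e.g. Timestamp -> Date
      to == StringT                       // e.g. Date -> String

  // Precedence-list-style check: a type may only be promoted to a type
  // that appears in its own (assumed, simplified) precedence list.
  private val precedence: Map[SqlType, List[SqlType]] = Map(
    IntT -> List(IntT, LongT, DoubleT),
    LongT -> List(LongT, DoubleT),
    DoubleT -> List(DoubleT),
    DateT -> List(DateT, TimestampT),
    TimestampT -> List(TimestampT),
    StringT -> List(StringT)
  )

  def canWiden(from: SqlType, to: SqlType): Boolean =
    precedence(from).contains(to)
}
```

In this model, `date_sub(now(), 7)` corresponds to passing a `TimestampT` where a `DateT` is declared, `from_unixtime(updated/1000)` to passing a `DoubleT` where a `LongT` is declared, and the `CONCAT(...)` case to passing datetime values where a `StringT` is declared. Each is accepted by `canStoreAssign` and rejected by `canWiden`.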
Does this PR introduce any user-facing change?
Yes. Under ANSI mode, function invocation is now resolved using store assignment rules.
How was this patch tested?
Unit tests