[SPARK-47927][SQL][FOLLOWUP] fix ScalaUDF output nullability #47081
    Batch("Nondeterministic", Once,
      PullOutNondeterministic),
    Batch("ScalaUDF", Once,
      HandleNullInputsForUDF),
This rule adds null handling for ScalaUDF. Even if the ScalaUDF's nullable is false, the final expression (if-else) can still return null, so we need to run UpdateAttributeNullability after it.
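To illustrate the point about the if-else wrapper, here is a minimal sketch using a toy expression model. The classes and the `handleNullInput` helper are hypothetical stand-ins written for this example, not Spark's Catalyst classes or the actual rule.

```scala
object NullabilityDemo {
  // Toy model of expression nullability; these are NOT Spark's Catalyst classes.
  sealed trait Expr { def nullable: Boolean }
  case class Attr(name: String, nullable: Boolean) extends Expr
  case object NullLiteral extends Expr { val nullable = true }
  case class IsNull(child: Expr) extends Expr { val nullable = false }

  // A UDF that declares its own result as non-nullable.
  case class Udf(name: String, input: Expr, nullable: Boolean) extends Expr

  // An if-else may yield null whenever either branch may.
  case class If(pred: Expr, whenTrue: Expr, whenFalse: Expr) extends Expr {
    def nullable: Boolean = whenTrue.nullable || whenFalse.nullable
  }

  // Hypothetical stand-in for the null-handling rewrite:
  // guard the UDF call when its input may be null.
  def handleNullInput(udf: Udf): Expr =
    if (udf.input.nullable) If(IsNull(udf.input), NullLiteral, udf) else udf

  val udf: Udf = Udf("plusOne", Attr("x", nullable = true), nullable = false)
  val guarded: Expr = handleNullInput(udf)
}
```

In this model `udf.nullable` is false while `guarded.nullable` is true, so anything that cached the nullability before the rewrite would be out of date afterwards.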
But we also need to run it before? Or how could we correctly handle null inputs to a udf if we do not have the correct nullability for the inputs?
Seems to me like one should be able to write a test case involving an outer join that shows that the proposed code will not handle such nullable input correctly.
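The outer-join concern can be sketched with a similar toy model. Everything here is hypothetical and written for illustration: `Attr`, `leftOuterJoin`, and `needsNullGuard` are not Spark APIs, and the sketch assumes the null guard is decided from the input's recorded nullability.

```scala
object OuterJoinNullabilityDemo {
  // Toy column descriptor; not Spark's AttributeReference.
  case class Attr(name: String, nullable: Boolean)

  // A left outer join can emit rows with no match on the right side,
  // so every right-side column becomes nullable in the join output.
  def leftOuterJoin(left: Seq[Attr], right: Seq[Attr]): Seq[Attr] =
    left ++ right.map(_.copy(nullable = true))

  // Hypothetical guard decision: only a nullable input needs a null check.
  def needsNullGuard(input: Attr): Boolean = input.nullable

  val left: Seq[Attr] = Seq(Attr("id", nullable = false))
  val right: Seq[Attr] = Seq(Attr("value", nullable = false))

  // Nullability as recorded before the join (stale once the join is applied).
  val staleInput: Attr = right.head
  // Nullability as it should be seen above the join.
  val freshInput: Attr = leftOuterJoin(left, right).find(_.name == "value").get
}
```

With the stale attribute the guard is skipped, while the refreshed attribute requires it. Under the stated assumption, that is why input nullability would need to be correct before the null-handling rule runs, not only after.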
Feels like this does not address the root issue. To me it seems problematic that Would it be better to address this issue by changing the
@eejbyfeldt I did consider this approach before. However, the null check happens outside of
Could you expand on how to trigger this error? I am not sure I am following. (But I do agree that it feels a bit off the implementation of Here is a code that still produces incorrect result with this branch (and master and prior to #46156): To solve (while keeping the
@eejbyfeldt good catch! I've fixed it and added a test
thanks for the reviews, merging to master/3.5/3.4!
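This is a followup of #46156, to fix the wrong nullability of ScalaUDF output.
Why are the changes needed? fix nullability
Does this PR introduce any user-facing change? no
How was this patch tested? new test
Was this patch authored or co-authored using generative AI tooling? no
Closes #47081 from cloud-fan/udf.
Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit d89aad3)
Signed-off-by: Wenchen Fan <[email protected]>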
What changes were proposed in this pull request?
This is a followup of #46156, to fix the wrong nullability of ScalaUDF output.
Why are the changes needed?
fix nullability
Does this PR introduce any user-facing change?
no
How was this patch tested?
new test
Was this patch authored or co-authored using generative AI tooling?
no