-
Notifications
You must be signed in to change notification settings - Fork 29k
revert [SPARK-22785][SQL] remove ColumnVector.anyNullsSet #20452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
LGTM, and the name |
|
Test build #86869 has finished for PR 20452 at commit
|
|
Overall LGTM. One question is, at #19980, we replace |
|
LGTM. It is fine with me for |
|
Test build #86874 has finished for PR 20452 at commit
|
|
thanks, merging to master/2.3! |
## What changes were proposed in this pull request? In #19980 , we thought `anyNullsSet` can be simply implemented by `numNulls() > 0`. This is logically true, but may have performance problems. `OrcColumnVector` is an example. It doesn't have the `numNulls` property, only has a `noNulls` property. We will lose a lot of performance if we use `numNulls() > 0` to check null. This PR simply revert #19980, with a renaming to call it `hasNull`. Better name suggestions are welcome, e.g. `nullable`? ## How was this patch tested? existing test Author: Wenchen Fan <[email protected]> Closes #20452 from cloud-fan/null. (cherry picked from commit 48dd6a4) Signed-off-by: Wenchen Fan <[email protected]>
|
Hi, @cloud-fan . |
|
thanks @dongjoon-hyun ! |
What changes were proposed in this pull request?
In #19980 , we thought
anyNullsSetcan be simply implemented bynumNulls() > 0. This is logically true, but may have performance problems.OrcColumnVectoris an example. It doesn't have thenumNullsproperty, only has anoNullsproperty. We will lose a lot of performance if we usenumNulls() > 0to check null.This PR simply revert #19980, with a renaming to call it
hasNull. Better name suggestions are welcome, e.g.nullable?How was this patch tested?
existing test