-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-20392][SQL][followup] should not add extra AnalysisBarrier #20094
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just make the names shorter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored the code to resolve expressions and add missing attributes in one shot, so that we have a central place to deal with analysis barrier and to decide which operator is supported and which is not.
|
Test build #85439 has finished for PR 20094 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If right plan is wrapped (e.g., we join two datasets) in an analysis barrier, the later right.collect doesn't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I see, you have recursively dedupRight on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newRight is introduced before to be wrapped in AnalysisBarrier. We can get rid of this redundant variable now.
| case d: Distinct => | ||
| (exprs.map(resolveExpression(_, d)), d) | ||
|
|
||
| case u: UnaryNode => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we stop at SubqueryAlias as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah good catch! I missed that because the logic was in resolveExpressionRecursively instead of addMissingAttr.
It indicates that it's more clear to merge these 2 methods :)
|
LGTM with two minor comments. |
|
Test build #85452 has finished for PR 20094 at commit
|
|
Test build #85450 has finished for PR 20094 at commit
|
|
Test build #85453 has finished for PR 20094 at commit
|
gatorsmile
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Thanks! Merged to master. |
What changes were proposed in this pull request?
I found this problem while auditing the analyzer code. It's dangerous to introduce extra
AnalysisBarrerduring analysis, as the plan inside it will bypass all analysis afterward, which may not be expected. We should only preserveAnalysisBarrerbut not introduce new ones.How was this patch tested?
existing tests