-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-20612][SQL] Throw exception when there is unresolvable attributes in Filter #17874
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1023,8 +1023,6 @@ class Analyzer( | |
| * clause. This rule detects such queries and adds the required attributes to the original | ||
| * projection, so that they will be available during sorting. Another projection is added to | ||
| * remove these attributes after sorting. | ||
| * | ||
| * The HAVING clause could also used a grouping columns that is not presented in the SELECT. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is by design.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For example, select type, avg (price)
from titles
group by type
having sum (total_sales) > 10000This example is copied from Sybase ASE. I believe this is part of Transact-SQL
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
For the HAVING clause could also used a grouping columns that is not presented in the SELECT, yes. For other general cases, I doubt it. We have other rule doing this (HAVING clause with grouping columns). That is why the tests are passed after this rule is removed. The above query also works without this rule.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since we introduced this by accident, I do not think we can remove it now. It could break the applications that are built on it. cc @rxin @cloud-fan @marmbrus |
||
| */ | ||
| object ResolveMissingReferences extends Rule[LogicalPlan] { | ||
| def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperators { | ||
|
|
@@ -1051,26 +1049,6 @@ class Analyzer( | |
| // in Sort | ||
| case ae: AnalysisException => s | ||
| } | ||
|
|
||
| case f @ Filter(cond, child) if child.resolved => | ||
| try { | ||
| val newCond = resolveExpressionRecursively(cond, child) | ||
| val requiredAttrs = newCond.references.filter(_.resolved) | ||
| val missingAttrs = requiredAttrs -- child.outputSet | ||
| if (missingAttrs.nonEmpty) { | ||
| // Add missing attributes and then project them away. | ||
| Project(child.output, | ||
| Filter(newCond, addMissingAttr(child, missingAttrs))) | ||
| } else if (newCond != cond) { | ||
| f.copy(condition = newCond) | ||
| } else { | ||
| f | ||
| } | ||
| } catch { | ||
| // Attempting to resolve it might fail. When this happens, return the original plan. | ||
| // Users will see an AnalysisException for resolution failure of missing attributes | ||
| case ae: AnalysisException => f | ||
| } | ||
| } | ||
|
|
||
| /** | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried that existing spark applications may already use this pattern in the code, so no matter it's a bug or not, it seems a feature now and we can't break it...