-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-13616][SQL] Let SQLBuilder convert logical plan without a project on top of it #11466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #52311 has finished for PR 11466 at commit
|
| case r: RepartitionByExpression => toSQL(node) | ||
| case _ => | ||
| build( | ||
| "SELECT", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
|
Test build #52355 has finished for PR 11466 at commit
|
|
Thanks - going to merge this in master. cc @liancheng |
| } | ||
| } | ||
|
|
||
| private def toSQL(node: LogicalPlan, topNode: Boolean): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only missing cases are Join. All the other cases have been supported. It breaks multiple cases. I plan to revert this change back. cc @liancheng
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, looks like so. We only need to take care of Join for this purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I am also afraid that the test case here for Join is not valid. I guess Join is the top layer node only when users are using DataFrame/DataSet APIs. If users are using SQL, Join will not be the top layer node, right?
If we want to convert the analyzed logical plans generated from DataFrame API to SQL, this is not the only issue. I also hit a couple of issues before. @liancheng , do you think we should support them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of reverting this, I think we can simply change the pattern cases to let it match Join to do the.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The analyzed plan of this query:
"SELECT x.key FROM parquet_t1 x JOIN parquet_t1 y ON x.key = y.key"== Analyzed Logical Plan ==
key: bigint
Project [key#18L]
+- Join Inner, Some((key#18L = key#23L))
:- SubqueryAlias x
: +- SubqueryAlias parquet_t1
: +- Relation[key#18L,value#19] ParquetRelation
+- SubqueryAlias y
+- SubqueryAlias parquet_t1
+- Relation[key#23L,value#24] ParquetRelation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, more details and background about SQL generation and native view support can be found in SPARK-11012.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gatorsmile You mentioned that you found some regressions. Did you mean some test cases in the current master branch are failing, or you constructed new test cases that failed this PR? I guess probably the latter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liancheng I saw multiple test cases in HiveCompatibilitySuite are unable to convert the plans to SQL. Originally, all of these cases should work.
As @dilipbiswal said, Limit is one of the issues. Filter is another one. Besides these two, I also saw more. Thus, I did not continue to fix all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So this PR decreases SQL generation coverage but didn't cause actual build failures, right? Still I'm going to revert it by merging #11539.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah
…t a project on top of it" This reverts commit f87ce05. According to discussion in #11466, let's revert PR #11466 for safe. Author: Cheng Lian <[email protected]> Closes #11539 from liancheng/revert-pr-11466.
…ect on top of it JIRA: https://issues.apache.org/jira/browse/SPARK-13616 ## What changes were proposed in this pull request? It is possibly that a logical plan has been removed `Project` from the top of it. Or the plan doesn't has a top `Project` from the beginning because it is not necessary. Currently the `SQLBuilder` can't convert such plans back to SQL. This change is to add this feature. ## How was this patch tested? A test is added to `LogicalPlanToSQLSuite`. Author: Liang-Chi Hsieh <[email protected]> Closes apache#11466 from viirya/sqlbuilder-notopselect.
…t a project on top of it" This reverts commit f87ce05. According to discussion in apache#11466, let's revert PR apache#11466 for safe. Author: Cheng Lian <[email protected]> Closes apache#11539 from liancheng/revert-pr-11466.
JIRA: https://issues.apache.org/jira/browse/SPARK-13616
What changes were proposed in this pull request?
It is possibly that a logical plan has been removed
Projectfrom the top of it. Or the plan doesn't has a topProjectfrom the beginning because it is not necessary. Currently theSQLBuildercan't convert such plans back to SQL. This change is to add this feature.How was this patch tested?
A test is added to
LogicalPlanToSQLSuite.