-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-48466][SQL] Create dedicated node for EmptyRelation in AQE #46830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala
Outdated
Show resolved
Hide resolved
| object AQEPropagateEmptyRelation extends PropagateEmptyRelationBase { | ||
| override protected def isEmpty(plan: LogicalPlan): Boolean = | ||
| super.isEmpty(plan) || (!isRootRepartition(plan) && getEstimatedRowCount(plan).contains(0)) | ||
| super.isEmpty(plan) || plan.isInstanceOf[EmptyRelation] || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to check the type explicitly? It has row count which should be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @liuzqt , I think this change is not needed as we have https://github.com/apache/spark/pull/46830/files#diff-d72cf9d12043beb25760fb2755ab0dd7fdd881483068c833541a126a1a38b780R32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan we still need to check this explicitly. getEstimatedRowCount only match for specific patterns. Before the change, the reason why it works is because super.isEmpty explicitly match empty LocalRelation.
We can also match EmptyRelation in getEstimatedRowCount alternatively. I don't have preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more reasonable to make getEstimatedRowCount recognize this new EmptyRelation
sql/core/src/main/scala/org/apache/spark/sql/execution/EmptyRelationExec.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/EmptyRelationExec.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlanInfo.scala
Outdated
Show resolved
Hide resolved
|
the failed pyspark test is unrelated, thanks, merging to master! |
…onExec ### What changes were proposed in this pull request? Fixed a missing pattern match introduced in #46830 Sorry for the silly mistake... ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? Existing tests ### Was this patch authored or co-authored using generative AI tooling? NO Closes #47089 from liuzqt/SPARK-48466. Authored-by: Ziqi Liu <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
### What changes were proposed in this pull request? Remove cleanupResource() from`EmptyRelationExec` ### Why are the changes needed? This bug was introduced in #46830 : `cleanupResources` might be executed on the executor where `logical` is null. After revisiting cleanupResources relevant code paths, I think `EmptyRelationExec` doesn't need to anything here. - for driver side cleanup, we have [this code path](https://github.com/apache/spark/blob/0602020eb3b346a8c50ad32eeda4e6dabb70c584/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala) to cleanup each AQE query stage. - for executor side cleanup, so far we only have SortMergeJoinExec which invoke cleanupResource during its execution, so upon the time when EmptyRelationExec is created, it's guaranteed necessary cleanup has been done. - After all, `EmptyRelationExec` is only a never-execute wrapper for materialized physical query stages, it should not be responsible for any cleanup invocation. So I'm removing `cleanupResources` implementation from `EmptyRelationExec`. ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? New unit test. ### Was this patch authored or co-authored using generative AI tooling? NO Closes #47931 from liuzqt/SPARK-49460. Authored-by: Ziqi Liu <[email protected]> Signed-off-by: yangjie01 <[email protected]>
…kPlanInfo ### What changes were proposed in this pull request? Fix `SparkPlanInfo.fromLogicalPlan` to handle nested empty relation. Before: <img width="294" alt="Screenshot 2025-03-21 at 11 30 56 AM" src="https://github.com/user-attachments/assets/d03f6b88-13ad-4b67-bc4d-18b532e4dea2" /> After: <img width="390" alt="Screenshot 2025-03-20 at 5 51 21 PM" src="https://github.com/user-attachments/assets/0e4f775c-b9cf-4955-af17-5e47fa44e44b" /> ### Why are the changes needed? A followup for #46830, in the original PR I forget to handle nested empty relation. ### Does this PR introduce _any_ user-facing change? Yes, UI change ### How was this patch tested? Verifed in Spark UI ### Was this patch authored or co-authored using generative AI tooling? NO Closes #50350 from liuzqt/SPARK-48466. Authored-by: liuzqt <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
…kPlanInfo ### What changes were proposed in this pull request? Fix `SparkPlanInfo.fromLogicalPlan` to handle nested empty relation. Before: <img width="294" alt="Screenshot 2025-03-21 at 11 30 56 AM" src="https://github.com/user-attachments/assets/d03f6b88-13ad-4b67-bc4d-18b532e4dea2" /> After: <img width="390" alt="Screenshot 2025-03-20 at 5 51 21 PM" src="https://github.com/user-attachments/assets/0e4f775c-b9cf-4955-af17-5e47fa44e44b" /> ### Why are the changes needed? A followup for #46830, in the original PR I forget to handle nested empty relation. ### Does this PR introduce _any_ user-facing change? Yes, UI change ### How was this patch tested? Verifed in Spark UI ### Was this patch authored or co-authored using generative AI tooling? NO Closes #50350 from liuzqt/SPARK-48466. Authored-by: liuzqt <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit b8e0246) Signed-off-by: Wenchen Fan <[email protected]>
…kPlanInfo ### What changes were proposed in this pull request? Fix `SparkPlanInfo.fromLogicalPlan` to handle nested empty relation. Before: <img width="294" alt="Screenshot 2025-03-21 at 11 30 56 AM" src="https://github.com/user-attachments/assets/d03f6b88-13ad-4b67-bc4d-18b532e4dea2" /> After: <img width="390" alt="Screenshot 2025-03-20 at 5 51 21 PM" src="https://github.com/user-attachments/assets/0e4f775c-b9cf-4955-af17-5e47fa44e44b" /> ### Why are the changes needed? A followup for apache#46830, in the original PR I forget to handle nested empty relation. ### Does this PR introduce _any_ user-facing change? Yes, UI change ### How was this patch tested? Verifed in Spark UI ### Was this patch authored or co-authored using generative AI tooling? NO Closes apache#50350 from liuzqt/SPARK-48466. Authored-by: liuzqt <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit 4661833) Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
Add dedicated node to represent empty relation (AQE only, i.e.,
AQEPropagateEmptyRelation.scala).logical.EmptyRelationexecution.EmptyRelationExecboth are leaf node and store the eliminated logical plan as a field.
In order to display the plan in spark UI, I extended
SparkPlanInfoto support mix of logical and physical plan.Spark UI
String representation
Why are the changes needed?
Currently we replace with a
LocalTableScanin case of empty relation propagation, which lost the information about the original query plan and make it less human readable. The idea is to create a dedicatedEmptyRelationnode which is a leaf node but wraps the original query plan inside.Does this PR introduce any user-facing change?
NO
How was this patch tested?
Existing tests
Was this patch authored or co-authored using generative AI tooling?
NO