SPARK UI K8S : this parameter's illustration(spark.kubernetes.executor.label.[LabelName] ) #21812
## What changes were proposed in this pull request?

The ColumnPruning rule tries adding an extra Project if an input node produces more fields than needed. As a post-processing step, it must then remove the lower Project in any "Project - Filter - Project" pattern; otherwise it would conflict with PushPredicatesThroughProject and cause an infinite optimization loop. The current post-processing method is defined as:

```
private def removeProjectBeforeFilter(plan: LogicalPlan): LogicalPlan = plan transform {
  case p1 @ Project(_, f @ Filter(_, p2 @ Project(_, child)))
    if p2.outputSet.subsetOf(child.outputSet) =>
    p1.copy(child = f.copy(child = child))
}
```

This method works well when there is only one Filter, but not when there are two or more. In this case, there is a deterministic filter and a non-deterministic filter, so they stay as separate Filter nodes and cannot be combined.

A simplified illustration of the optimization process that forms the infinite loop is shown below (F1 stands for the 1st filter, F2 for the 2nd filter, P for Project, S for scan of relation, and PredicatePushDown is short for PushPredicatesThroughProject):

```
                              F1 - F2 - P - S
PredicatePushDown      =>     F1 - P - F2 - S
ColumnPruning          =>     F1 - P - F2 - P - S
                       =>     F1 - P - F2 - S        (Project removed)
PredicatePushDown      =>     P - F1 - F2 - S
ColumnPruning          =>     P - F1 - P - F2 - S
                       =>     P - F1 - P - F2 - P - S
                       =>     P - F1 - F2 - P - S    (only one Project removed)
RemoveRedundantProject =>     F1 - F2 - P - S        (goes back to the loop start)
```

So the problem is that the ColumnPruning rule adds a Project under a Filter (and fails to remove it in the end), and that new Project triggers PushPredicatesThroughProject. Once the filters have been pushed through the Project, the ColumnPruning rule adds a new Project, and this goes on and on.

The fix: when adding Projects, the rule still applies top-down, but when removing the extra Projects afterwards, the pass should go bottom-up so that every extra Project can be matched.

## How was this patch tested?

Added an optimization rule test in ColumnPruningSuite, and an end-to-end test in SQLQuerySuite.

Author: maryannxue <maryannxue@apache.org>

Closes #21674 from maryannxue/spark-24696.
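To see why traversal direction matters for the removal step described above, here is a minimal, self-contained sketch. It is a toy model, not Catalyst: `Plan`, `Scan`, `Filter`, `Project`, and the `ProjectRemoval` object are hypothetical stand-ins, and `transformDown` / `transformUp` are simplified reimplementations that only borrow the names of the corresponding `TreeNode` methods. The rule body mirrors the `removeProjectBeforeFilter` pattern quoted in the description, with a plain `Set[String]` assumed in place of `outputSet`.

```scala
// Toy stand-ins for Catalyst's logical plan nodes (hypothetical, for illustration only).
sealed trait Plan {
  def output: Set[String]
}
case class Scan(output: Set[String]) extends Plan
case class Filter(name: String, child: Plan) extends Plan {
  def output: Set[String] = child.output
}
case class Project(cols: Set[String], child: Plan) extends Plan {
  def output: Set[String] = cols
}

object ProjectRemoval {
  type Rule = PartialFunction[Plan, Plan]

  // Rebuild a node with f applied to its (single) child, if it has one.
  private def mapChildren(p: Plan)(f: Plan => Plan): Plan = p match {
    case s: Scan        => s
    case Filter(n, c)   => Filter(n, f(c))
    case Project(cs, c) => Project(cs, f(c))
  }

  // Top-down: rewrite the current node first, then descend into the result.
  def transformDown(p: Plan)(rule: Rule): Plan = {
    val after = rule.applyOrElse(p, (q: Plan) => q)
    mapChildren(after)(c => transformDown(c)(rule))
  }

  // Bottom-up: rewrite the children first, then the current node.
  def transformUp(p: Plan)(rule: Rule): Plan = {
    val after = mapChildren(p)(c => transformUp(c)(rule))
    rule.applyOrElse(after, (q: Plan) => q)
  }

  // Same shape as the rule in the description: drop the lower Project
  // in "Project - Filter - Project" when it adds no columns.
  val removeProjectBeforeFilter: Rule = {
    case p1 @ Project(_, f @ Filter(_, p2 @ Project(_, child)))
        if p2.output.subsetOf(child.output) =>
      p1.copy(child = f.copy(child = child))
  }

  // Render a plan in the "P - F1 - F2 - S" notation used in the illustration.
  def render(p: Plan): String = p match {
    case _: Scan       => "S"
    case Filter(n, c)  => s"$n - ${render(c)}"
    case Project(_, c) => s"P - ${render(c)}"
  }
}
```

In this toy model, starting from `P - F1 - P - F2 - P - S`, the top-down pass rewrites the outermost match and then descends past the spot where the second match would have formed, leaving `P - F1 - F2 - P - S`. The bottom-up pass removes the innermost extra Project first, so the outer pattern still matches afterwards and the result is `P - F1 - F2 - S`.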