[SPARK-40362][SQL] Fix BinaryComparison canonicalization #37851

peter-toth · 2022-09-10T18:04:05Z

What changes were proposed in this pull request?

Change canonicalization to a one-pass process and move the logic from Canonicalize.reorderCommutativeOperators to the respective commutative operators' canonicalize.

Why are the changes needed?

#34883 improved expression canonicalization performance but introduced a regression when a commutative operator is under a BinaryComparison. The cause is that children can't be reordered by their hash code in the preCanonicalized phase, because at that point the children are not yet "final".
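
To make the expected behavior concrete, here is a minimal, self-contained sketch of a one-pass canonicalization over a toy expression tree. It is not Spark's code: `Expr`, `Attr`, `Add`, `Gt`, `Lt`, `Canon.gather`, and `Canon.orderComparison` are hypothetical stand-ins, and only the idea (canonicalize the children first, then order them by hash code) is taken from this PR.

```scala
// Toy model only: hypothetical classes, not Spark's Expression hierarchy.
sealed trait Expr {
  // Computed once per node and cached, so every subtree is canonicalized a single time.
  lazy val canonicalized: Expr = this match {
    case a: Attr => a
    case add: Add =>
      // Operands of adjacent Adds are canonicalized before they are sorted by hash code.
      Canon.gather(add).sortBy(_.hashCode()).reduce[Expr](Add(_, _))
    case Gt(l, r) =>
      Canon.orderComparison(l.canonicalized, r.canonicalized, Gt(_, _), Lt(_, _))
    case Lt(l, r) =>
      Canon.orderComparison(l.canonicalized, r.canonicalized, Lt(_, _), Gt(_, _))
  }

  def semanticEquals(other: Expr): Boolean = canonicalized == other.canonicalized
}

final case class Attr(name: String) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr
final case class Gt(left: Expr, right: Expr) extends Expr
final case class Lt(left: Expr, right: Expr) extends Expr

object Canon {
  // Flatten a chain of Adds; any other operand is replaced by its canonical form.
  def gather(e: Expr): Seq[Expr] = e match {
    case Add(l, r) => gather(l) ++ gather(r)
    case other => Seq(other.canonicalized)
  }

  // Children are already canonical ("final") here, so comparing their hash codes is safe.
  // If the left hash is larger, swap the children and flip the comparison operator.
  def orderComparison(
      l: Expr,
      r: Expr,
      same: (Expr, Expr) => Expr,
      flipped: (Expr, Expr) => Expr): Expr =
    if (l.hashCode() > r.hashCode()) flipped(r, l) else same(l, r)
}
```

In this model, `Gt(Add(a, b), c)` and `Lt(c, Add(b, a))` canonicalize to the same tree (barring hash collisions), because the `Add` under the comparison is fully reordered before the comparison looks at its hash code.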

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new UT.

ahshahid · 2022-09-10T18:17:03Z

I suggest you run cloud-fan's benchmark test.
Maybe something like a nested add (which his benchmark test uses) > 11.

peter-toth · 2022-09-11T07:31:53Z

The benchmark:

  test("benchmark") {
    val col = Literal(true)
    var and = And(col, col)
    var i = 0
    while (i < 2000) {
      and = And(and, col)
      i += 1
    }
    and.canonicalized
  }

is fine: 140 ms on my machine. That's because we don't alter the bottom-up traversal; we only swap the already ordered children at the BinaryComparison nodes.

But I need to look into the test failures, though...

ahshahid · 2022-09-11T14:35:23Z

Got it. Your change is the better solution.

cloud-fan · 2022-09-12T14:19:43Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala

+    case bc: GreaterThanOrEqual => orderBinaryComparison(bc, LessThanOrEqual)
+    case bc: LessThanOrEqual => orderBinaryComparison(bc, GreaterThanOrEqual)
+
+    case _ => e.mapChildren(preCanonicalizeAndReorderOperators).preCanonicalized


so we do reorder first and then do pre-canonicalization?

I would say that we pre-canonicalize and reorder the node's children first and then pre-canonicalize the node.

Unfortunately 88cb006 didn't work because HigherOrderFunction.preCanonicalized modifies its children's NamedLambdaVariables and reorder should happen after these modifications.

But in 99f8b61 and 6a38a00 I'm proposing a change where we break the full tree traversal in Canonicalize.reorderOperators and instead let non-commutative operators' preCanonicalized call their children's canonicalized to initiate Canonicalize.reorderOperators on their children.

I think this approach can be refactored further: we can move all the logic from preCanonicalized to canonicalized, and we just need to override the commutative operators' canonicalized to call Canonicalize.reorderCommutativeOperators: 726f7f0


ahshahid · 2022-09-12T20:46:21Z

I see that pre-canonicalized has been removed, which makes sense and makes this a single pass.
This also caches the intermediate canonicalized expressions, which is a good thing given what it does with the reorder.
It might have a slightly detrimental impact on the actual benchmark, but well within tolerance.
Also, the tree traversal has been removed from the reorder for non-commutative expressions, which makes sense and improves performance.

This is good to go, I suppose.

cloud-fan · 2022-09-13T05:26:03Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala

    gatherCommutative(e, f).sortBy(_.hashCode())
-
-  def reorderCommutativeOperators(e: Expression): Expression = e match {
-    // TODO: do not reorder consecutive `Add`s or `Multiply`s with different `failOnError` flags


we need to keep this TODO somewhere.

Sure, added back in dead8b6

cloud-fan · 2022-09-13T05:27:48Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala

-  // the actual "user-facing API" of expression canonicalization. Only the root node of the
-  // expression tree will instantiate the `canonicalized` variable. This is different from
-  // `preCanonicalized`, because `canonicalized` does "global" canonicalization and most of the time
-  // you cannot reuse the canonicalization result of the children.


It's good that we now have a better and simplified version, but we should still have a detailed comment to explain the new workflow.

Added a process comment in 6c68d8b, let me know if it needs more details.

I've also updated the PR description.

cloud-fan · 2022-09-13T05:30:02Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala

      f: PartialFunction[Expression, Seq[Expression]]): Seq[Expression] = e match {
    case c if f.isDefinedAt(c) => f(c).flatMap(gatherCommutative(_, f))
-    case other => reorderCommutativeOperators(other) :: Nil
+    case other => other.canonicalized :: Nil


I think this is the key change. This means that we will first reorder leaf-most adjacent commutative operators, and then do this recursively bottom-up.

It's probably better to have a new trait CommutativeExpression and remove this object.

Agreed, fixed in ff2fe32
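
For readers following along, a rough sketch of what such a trait could look like on a toy expression model is below. All names here (`Expr`, `Commutative`, `orderCommutative`, `Attr`, `Neg`, `Add`, `Mul`) are hypothetical and chosen for illustration; the sketch is not the code from ff2fe32. It only mirrors the shape of the `gatherCommutative` snippet above: adjacent operators of the same kind are flattened, every other operand is canonicalized recursively, and the result is sorted by hash code.

```scala
// Toy model only: hypothetical names, not Spark's classes.
sealed trait Expr {
  // Every node type decides how to canonicalize itself.
  def canonicalized: Expr
}

// Mix-in for commutative operators.
trait Commutative extends Expr {
  // Flatten adjacent operators matched by `f`, canonicalize all other operands,
  // then sort the operands by hash code.
  protected def orderCommutative(f: PartialFunction[Expr, Seq[Expr]]): Seq[Expr] =
    Commutative.gather(this, f).sortBy(_.hashCode())
}

object Commutative {
  def gather(e: Expr, f: PartialFunction[Expr, Seq[Expr]]): Seq[Expr] = e match {
    case c if f.isDefinedAt(c) => f(c).flatMap(gather(_, f))
    // Not part of this commutative chain: recurse through the operand's own canonicalized.
    case other => other.canonicalized :: Nil
  }
}

final case class Attr(name: String) extends Expr {
  lazy val canonicalized: Expr = this
}

// A non-commutative operator just canonicalizes its child.
final case class Neg(child: Expr) extends Expr {
  lazy val canonicalized: Expr = Neg(child.canonicalized)
}

final case class Add(left: Expr, right: Expr) extends Expr with Commutative {
  lazy val canonicalized: Expr = {
    val operands = orderCommutative { case Add(l, r) => Seq(l, r) }
    operands.reduce[Expr](Add(_, _))
  }
}

final case class Mul(left: Expr, right: Expr) extends Expr with Commutative {
  lazy val canonicalized: Expr = {
    val operands = orderCommutative { case Mul(l, r) => Seq(l, r) }
    operands.reduce[Expr](Mul(_, _))
  }
}
```

Because `gather` calls `canonicalized` on anything outside the current chain, a `Mul` nested under an `Add` is reordered first and the `Add` then sorts already final operands, which is the bottom-up behavior described above.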

cloud-fan · 2022-09-13T14:52:44Z

thanks, merging to master!

cloud-fan · 2022-09-13T14:53:02Z

@peter-toth can you open a backport PR for 3.3? it has conflicts. Thanks!

What changes were proposed in this pull request?

Change canonicalization to a one pass process and move logic from `Canonicalize.reorderCommutativeOperators` to the respective commutative operators' `canonicalize`.

Why are the changes needed?

apache#34883 improved expression canonicalization performance but introduced regression when a commutative operator is under a `BinaryComparison`. This is because children reorder by their hashcode can't happen in `preCanonicalized` phase when children are not yet "final".

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new UT.

Closes apache#37851 from peter-toth/SPARK-40362-fix-binarycomparison-canonicalization.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>

peter-toth · 2022-09-13T15:03:56Z

thanks, merging to master!

Thanks for the review!

@peter-toth can you open a backport PR for 3.3? it has conflicts. Thanks!

Sure, opened here: #37866

…imization for canonicalizing large trees of commutative expressions

### What changes were proposed in this pull request?
- This PR introduces a new expression called `MultiCommutativeOp` which is used by the commutative expressions (e.g., `Add`, `Multiply`, `And`, `Or`, `BitwiseOr`, `BitwiseAnd`, `BitwiseXor`) during canonicalization.
- During canonicalization, when there is a list of consecutive commutative expressions, we now create a MultiCommutative expression with references to original operands, instead of creating new objects.
- This new expression is added as a memory optimization to reduce generating a large number of intermediate objects during canonicalization.

### Why are the changes needed?
- With the [recent changes](#37851) in the expression canonicalization, a complex query with a large number of commutative operations could end up consuming significantly more (sometimes > 10X) memory on the executors.
- In our case, this issue happens for a specific complex query that has a huge expression tree containing Add operators interleaved by non Add operators.
- The issue is related to canonicalization and why it is causing issues in the executors is because the codegen component relies on expression canonicalization to deduplicate expressions.
- When we have a large number of Adds interleaved by non-Add operators, [this line](https://github.com/apache/spark/pull/37851/files#diff-7278f2db37934522ee7c74b71525153234cff245cefaf996957e4a9ff3dbaacdR1171) ends up materializing a new canonicalized expression tree at every non-Add operator.
- In our case, analyzing the executor heap histogram shows that the additional memory is consumed by a large number of Add objects.
- The high memory usage causes the executors to lose heartbeat signals and results in task failures.
- The proposed `MultiCommutativeOp` expression avoids generating new Add expressions and keeps the extra memory usage to a minimum.

### Does this PR introduce _any_ user-facing change?
- No

### How was this patch tested?
- Existing unit tests and new unit tests.

Closes #39722 from db-scnakandala/SPARK-42162.

Authored-by: Supun Nakandala <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
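
The idea in the commit message above can be illustrated with a small toy model. This is only a sketch under assumptions: `MultiAdd` and `Flat.operands` are hypothetical names invented here and do not reflect the actual `MultiCommutativeOp` signature. The canonical form of a chain of `Add`s is a single flat node that keeps references to the sorted operands, rather than a rebuilt chain of new binary `Add` objects.

```scala
// Toy model only: hypothetical names, not Spark's classes.
sealed trait Expr {
  def canonicalized: Expr
}

final case class Attr(name: String) extends Expr {
  lazy val canonicalized: Expr = this
}

// Hypothetical flat n-ary node: one object that references the sorted operands.
final case class MultiAdd(operands: Seq[Expr]) extends Expr {
  lazy val canonicalized: Expr = this
}

final case class Add(left: Expr, right: Expr) extends Expr {
  // One MultiAdd is allocated for the whole chain; no new binary Add nodes are created.
  lazy val canonicalized: Expr =
    MultiAdd(Flat.operands(this).sortBy(_.hashCode()))
}

object Flat {
  // Collect the operands of a chain of adjacent Adds, canonicalizing everything else.
  def operands(e: Expr): Seq[Expr] = e match {
    case Add(l, r) => operands(l) ++ operands(r)
    case other => Seq(other.canonicalized)
  }
}
```

With a left-deep chain of 100 `Add`s over distinct attributes, the canonical form here is one `MultiAdd` holding 101 operand references, whereas rebuilding the chain would allocate 100 new `Add` nodes.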


[SPARK-40362][SQL] Fix BinaryComparison canonicalization

44be823

github-actions bot added the SQL label Sep 10, 2022

peter-toth mentioned this pull request Sep 10, 2022

[SPARK-40362][SQL] Bug in Canonicalization of expressions like Add & Multiply i.e Commutative Operators #37824

Closed

fix

a4e0e0e

peter-toth force-pushed the SPARK-40362-fix-binarycomparison-canonicalization branch from 22222cf to a4e0e0e Compare September 11, 2022 16:27

peter-toth added 2 commits September 12, 2022 14:34

incorporate 1 pass from apache#37824

3c465a8

drop duplicate preCanonicalized

88cb006

peter-toth force-pushed the SPARK-40362-fix-binarycomparison-canonicalization branch from b45d6fe to 88cb006 Compare September 12, 2022 13:17

cloud-fan reviewed Sep 12, 2022

peter-toth added 4 commits September 12, 2022 19:02

different approach

99f8b61

let preCanonicalize use children's canonicalized except in reorderOpe…

6a38a00

…rators

revert unnecessary changes

de3c65c

refactor

726f7f0

cloud-fan reviewed Sep 13, 2022

peter-toth added 3 commits September 13, 2022 08:42

add back todo

dead8b6

add trait CommutativeExpression

ff2fe32

add comment

6c68d8b

peter-toth force-pushed the SPARK-40362-fix-binarycomparison-canonicalization branch from 5d8a184 to 6c68d8b Compare September 13, 2022 08:13

cloud-fan approved these changes Sep 13, 2022

cloud-fan closed this in 3f97cd6 Sep 13, 2022

db-scnakandala mentioned this pull request Jan 24, 2023

[SPARK-42162] Introduce MultiCommutativeOp expression as a memory optimization for canonicalizing large trees of commutative expressions #39722

Closed

NVnavkumar mentioned this pull request May 10, 2023

Add a unit test for reordered canonicalized expressions in BinaryComparison NVIDIA/spark-rapids#8274

Merged
