Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -390,6 +390,7 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
case If(TrueLiteral, trueValue, _) => trueValue
case If(FalseLiteral, _, falseValue) => falseValue
case If(Literal(null, _), _, falseValue) => falseValue
case If(_, trueValue, falseValue) if trueValue.semanticEquals(falseValue) => trueValue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not right. The condition must be deterministic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate?

For trueValue.semanticEquals(falseValue), it's guaranteed that both trueValue and falseValue are deterministic.

def semanticEquals(other: Expression): Boolean =
    deterministic && other.deterministic && canonicalized == other.canonicalized

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understandable that the condition can be non-deterministic, but this doesn't change the result of If.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition could have a side effect. For example, calling a stateful UDF.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.


case e @ CaseWhen(branches, elseValue) if branches.exists(x => falseOrNullLiteral(x._1)) =>
// If there are branches that are always false, remove them.
Expand All @@ -403,14 +404,14 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
e.copy(branches = newBranches)
}

case e @ CaseWhen(branches, _) if branches.headOption.map(_._1) == Some(TrueLiteral) =>
case CaseWhen(branches, _) if branches.headOption.map(_._1).contains(TrueLiteral) =>
Copy link
Member

@dongjoon-hyun dongjoon-hyun Jul 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we removed Scala 2.10, it seems to be okay. However, if we revert this irrelevant change, this PR becomes neater (and easier for someone to backport this).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's not a bug fix, I guess it's unlikely someone will backport this :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. In the community, it's not allowed for backport. I mean the others who want to have this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, in any event, wouldn't it be better to revert this change back if there's any actual advantage against a unrelated style change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, we avoid adding unneeded refactoring in such a PR. Please avoid it next time. Thanks!

// If the first branch is a true literal, remove the entire CaseWhen and use the value
// from that. Note that CaseWhen.branches should never be empty, and as a result the
// headOption (rather than head) added above is just an extra (and unnecessary) safeguard.
branches.head._2

case CaseWhen(branches, _) if branches.exists(_._1 == TrueLiteral) =>
// a branc with a TRue condition eliminates all following branches,
// a branch with a true condition eliminates all following branches,
// these branches can be pruned away
val (h, t) = branches.span(_._1 != TrueLiteral)
CaseWhen( h :+ t.head, None)
Expand Down Expand Up @@ -651,6 +652,7 @@ object SimplifyCaseConversionExpressions extends Rule[LogicalPlan] {
}
}


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra space line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we always have two new lines between two objects

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's okay to remove it back though assuming from

Use one or two blank line(s) to separate class definitions.

https://github.com/databricks/scala-style-guide#blank-lines-vertical-whitespace

Looks either way is fine.

/**
* Combine nested [[Concat]] expressions.
*/
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@

package org.apache.spark.sql.catalyst.optimizer

import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.expressions.Literal.{FalseLiteral, TrueLiteral}
Expand Down Expand Up @@ -57,6 +58,14 @@ class SimplifyConditionalSuite extends PlanTest with PredicateHelper {
Literal(20))
}

test("remove unnecessary if when the outputs are semantic equivalence") {
assertEquivalent(
If(IsNotNull(UnresolvedAttribute("a")),
Subtract(Literal(10), Literal(1)),
Add(Literal(6), Literal(3))),
Literal(9))
}

test("remove unreachable branches") {
// i.e. removing branches whose conditions are always false
assertEquivalent(
Expand Down