Skip to content
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -367,11 +367,12 @@ trait CheckAnalysis extends PredicateHelper {

case o if o.expressions.exists(!_.deterministic) &&
!o.isInstanceOf[Project] && !o.isInstanceOf[Filter] &&
!o.isInstanceOf[Aggregate] && !o.isInstanceOf[Window] =>
!o.isInstanceOf[Aggregate] && !o.isInstanceOf[Window] &&
!o.isInstanceOf[DeserializeToObject] && !o.isInstanceOf[SerializeFromObject] =>
// The rule above is used to check Aggregate operator.
failAnalysis(
s"""nondeterministic expressions are only allowed in
|Project, Filter, Aggregate or Window, found:
s"""nondeterministic expressions are only allowed in Project, Filter,
| Aggregate, Window, SerializeFromObject or DeserializeToObject, found:
| ${o.expressions.map(_.sql).mkString(",")}
|in operator ${operator.simpleString}
""".stripMargin)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,8 @@ case class AssertTrue(child: Expression) extends UnaryExpression with ImplicitCa

override def nullable: Boolean = true

override lazy val deterministic: Boolean = false
Copy link
Member

@viirya viirya Nov 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this two expressions are very commonly used in other expressions' children. Making them non-deterministic will make such expressions as non-deterministic too, e.g., encoders. It might have big impact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's true. The change may be big indeed. I am open to other proposals, as creating a new sideEffect flag in the Expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of this, I'm leaning towards creating a new flag instead of making them non-deterministic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, thanks you all the comments. I'll try and add a new flag. I'll update the PR as soon as I have a solution for that. Probably it will take some time, though, as I will have to recheck all the usage of determistic.... Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gatorsmile thanks for your comment. I think the main issue with making this non-determistic is that we have basically to skip the sanity check at

case o if o.expressions.exists(!_.deterministic) &&
, because we have AssertNotNull is a lot of other different locations. Another option may be to special case AssertNotNull and avoid that check for it... What do you think?

Copy link
Contributor Author

@mgaido91 mgaido91 Nov 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any comment on this @gatorsmile @maryannxue ?


override def inputTypes: Seq[DataType] = Seq(BooleanType)

override def dataType: DataType = NullType
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1632,6 +1632,8 @@ case class AssertNotNull(child: Expression, walkedTypePath: Seq[String] = Nil)
override def foldable: Boolean = false
override def nullable: Boolean = false

override lazy val deterministic: Boolean = false

override def flatArguments: Iterator[Any] = Iterator(child)

private val errMsg = "Null value appeared in non-nullable field:" +
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.expressions.Literal.{FalseLiteral, TrueLiteral}
import org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull
import org.apache.spark.sql.catalyst.plans.PlanTest
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.rules._
Expand Down Expand Up @@ -166,4 +167,13 @@ class SimplifyConditionalSuite extends PlanTest with PredicateHelper {
Literal(1))
)
}

test("SPARK-24913: don't skip AssertNotNull and AssertTrue") {
val ifWithAssertNotNull = If(AssertNotNull(UnresolvedAttribute("b")), Literal(1), Literal(1))
val ifWithAssertTrue = If(AssertTrue(UnresolvedAttribute("b")), Literal(1), Literal(1))
val plan = Filter(And(ifWithAssertNotNull, ifWithAssertTrue), OneRowRelation())
val optimized = Optimize.execute(plan).analyze
// optimization should not change the plan
comparePlans(plan, optimized, checkAnalysis = false)
}
}