Skip to content
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
address comments
  • Loading branch information
gengliangwang committed Jun 10, 2020
commit 296068cb1f3f47e3a897387a518a1c8a2b83175f
Original file line number Diff line number Diff line change
Expand Up @@ -227,30 +227,27 @@ trait PredicateHelper extends Logging {
// For each side, there is no need to expand predicates of the same references.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same references. -> the same table references.?

// So here we can aggregate predicates of the same references as one single predicate,
// for reducing the size of pushed down predicates and corresponding codegen.
val right = aggregateExpressionsOfSameQualifiers(resultStack.pop())
val left = aggregateExpressionsOfSameQualifiers(resultStack.pop())
val right = groupExpressionsByQualifier(resultStack.pop())
val left = groupExpressionsByQualifier(resultStack.pop())
// Stop the loop whenever the result exceeds the `maxCnfNodeCount`
if (left.size * right.size > maxCnfNodeCount) {
Seq.empty
return Seq.empty
} else {
for {x <- left; y <- right} yield Or(x, y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit format: {x <- left; y <- right} -> { x <- left; y <- right }?

}
case other => other :: Nil
}
if (cnf.isEmpty) {
return Seq.empty
}
if (resultStack.length != 1) {
logWarning("The length of CNF conversion result stack is supposed to be 1. There might " +
"be something wrong with CNF conversion.")
return Seq.empty
}
resultStack.push(cnf)
}
if (resultStack.length != 1) {
logWarning("The length of CNF conversion result stack is supposed to be 1. There might " +
"be something wrong with CNF conversion.")
return Seq.empty
}
resultStack.top
}

private def aggregateExpressionsOfSameQualifiers(
private def groupExpressionsByQualifier(
expressions: Seq[Expression]): Seq[Expression] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it seems you don't need the line break;

  private def groupExpressionsByQualifier(expressions: Seq[Expression]): Seq[Expression] = {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's true

expressions.groupBy(_.references.map(_.qualifier)).map(_._2.reduceLeft(And)).toSeq
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a test case dt = '1' OR (dt = '2' AND id = 1) passed to conjunctiveNormalForm, still return dt = '1' OR (dt = '2' AND id = 1).

See qualifier when groupby , they are

List(List(spark_catalog, default, t))
List(List(spark_catalog, default, t))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can try

expressions.groupBy(_.references.flatMap(_.qualifier).toSet).map(_._2.reduceLeft(And)).toSeq

I will update this PR later

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expressions.groupBy(_.references.flatMap(_.qualifier).toSet).map(_._2.reduceLeft(And)).toSeq

Not work, just

expressions.groupBy(_.references).map(_._2.reduceLeft(And)).toSeq

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The qualifier is the table name which is able to be used for aggregating more expressions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The qualifier is the table name which is able to be used for aggregating more expressions

Got the point, you did this for split condition to join children, I want convert scan predicate condition to optimize scan predicate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is complex enough. Let's keep this part in this way for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I will raise pr for other problem base on your code and change a little after your pr merged.

Expand Down