Skip to content

Conversation

@jiangxb1987
Copy link
Contributor

What changes were proposed in this pull request?

For the given example below, the predicate added by InferFiltersFromConstraints is folded by ConstantPropagation later, this leads to unconverged optimize iteration:

Seq((1, 1)).toDF("col1", "col2").createOrReplaceTempView("t1")
Seq(1, 2).toDF("col").createOrReplaceTempView("t2")
sql("SELECT * FROM t1, t2 WHERE t1.col1 = 1 AND 1 = t1.col2 AND t1.col1 = t2.col AND t1.col2 = t2.col")

We can fix this by adjusting the indent of the optimize rules.

How was this patch tested?

Add test case that would have failed in SQLQuerySuite.

@jiangxb1987
Copy link
Contributor Author

cc @gatorsmile

@SparkQA
Copy link

SparkQA commented Aug 31, 2017

Test build #81299 has finished for PR 19099 at commit 7a364a1.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

PushProjectionThroughUnion,
ReorderJoin,
EliminateOuterJoin,
InferFiltersFromConstraints,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why you changed this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not ideal, but it should resolve the specific failure case and could unblock other PR.

@gatorsmile is working on an alter approach that should perfectly resolve the entire issue of add/remove filters, but it takes time and I don't think it will be ready in a few days.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, this is to infer the filters at the beginning of this huge batch, and then remove the useless at the end. When it involves multiple rules, we might still hit the max iteration. Eventually, we need a clean fix to split the whole batch to two separate ones. One is to add the filters/constraints; another is to remove the filters/constraints based on the costs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha, thanks for the explanation~

}

test("SPARK-21652: rule confliction of InferFiltersFromConstraints and ConstantPropagation") {
// Under test environment, throws Exception if the max iteration number is reached for an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just catching the exception is not enough for this test?

@SparkQA
Copy link

SparkQA commented Sep 1, 2017

Test build #81322 has finished for PR 19099 at commit 9b1c642.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 1, 2017

Test build #81323 has finished for PR 19099 at commit 231ca3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Sep 5, 2017

Test build #81418 has finished for PR 19099 at commit 231ca3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

LGTM

@gatorsmile
Copy link
Member

Thanks! Merging to master.

@asfgit asfgit closed this in fd60d4f Sep 5, 2017
@jiangxb1987 jiangxb1987 deleted the unconverge-optimization branch September 5, 2017 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants