[SPARK-12740] [SPARK-13932] support grouping()/grouping_id() in having/order clause #12235

davies · 2016-04-07T07:47:38Z

What changes were proposed in this pull request?

This PR adds support for using grouping()/grouping_id() in HAVING and ORDER BY clauses.

A resolved grouping()/grouping_id() call is replaced by the unresolved virtual attribute "spark_grouping_id", which is then resolved by ResolveMissingAttribute.
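
To make the semantics concrete, here is a minimal plain-Scala sketch of what filtering and sorting on grouping_id() means for a CUBE over two columns. It is an illustration only, not the analyzer code in this patch: the object and helper names (GroupingIdSketch, grouping, groupingId, cube) are hypothetical, and it assumes the usual definition grouping_id(c1, ..., cn) = (grouping(c1) << (n-1)) + ... + grouping(cn), where grouping(c) is 1 when column c is aggregated away.

```scala
// Illustrative only: plain Scala with no Spark dependency. All names are hypothetical.
object GroupingIdSketch {
  // grouping(col): 1 when `col` is aggregated away in this grouping set, else 0.
  def grouping(col: String, groupingSet: Set[String]): Int =
    if (groupingSet.contains(col)) 0 else 1

  // grouping_id(c1, ..., cn): the grouping() bits packed with c1 as the highest bit.
  def groupingId(cols: Seq[String], groupingSet: Set[String]): Int =
    cols.foldLeft(0)((acc, c) => (acc << 1) | grouping(c, groupingSet))

  // Every grouping set produced by CUBE(cols): all subsets of the columns.
  def cube(cols: Seq[String]): Seq[Set[String]] =
    cols.toSet.subsets().toSeq

  def main(args: Array[String]): Unit = {
    val cols = Seq("a", "b")
    // Emulates the shape of:
    //   ... GROUP BY CUBE(a, b) HAVING grouping_id(a, b) > 0 ORDER BY grouping_id(a, b)
    val kept = cube(cols)
      .map(gs => (gs, groupingId(cols, gs)))
      .filter { case (_, gid) => gid > 0 }   // HAVING
      .sortBy { case (_, gid) => gid }       // ORDER BY
    kept.foreach { case (gs, gid) =>
      println(s"grouping set ${cols.filter(gs).mkString("(", ", ", ")")} -> grouping_id $gid")
    }
  }
}
```

In this sketch the filter and sort run over a value computed per grouping set, which is the role the internal "spark_grouping_id" attribute plays once HAVING or ORDER BY refers to grouping()/grouping_id().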

This PR also fixes HAVING clauses that access a grouping column that is not present in the SELECT clause, for example:

select count(1) from (select 1 as a) t group by a having a > 0

How was this patch tested?

Added new tests.

davies · 2016-04-07T07:48:08Z

cc @cloud-fan

cloud-fan · 2016-04-07T09:19:18Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

        }
    }

    private def isAggregateExpression(e: Expression): Boolean = {


do you need this method anymore? It's just a simple isInstanceOf now

cloud-fan · 2016-04-07T09:19:38Z

overall LGTM

SparkQA · 2016-04-07T09:24:38Z

Test build #55201 has finished for PR 12235 at commit 2412671.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

davies · 2016-04-07T17:51:59Z

@marmbrus Could you also take a quick look at this one?

SparkQA · 2016-04-07T18:20:31Z

Test build #55229 has finished for PR 12235 at commit 8c4bd6c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

marmbrus · 2016-04-07T18:27:22Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala

-  val groupingIdName: String = "grouping__id"
+  // The attribute name used by Hive, which has different result than Spark, deprecated.
+  val hiveGroupingIdName: String = "grouping__id"
+  val groupingIdName: String = "spark_grouping_id"


Can you explain what's going on here?

"grouping__id" came from Hive, but unfortunately its implementation there is wrong, see https://issues.apache.org/jira/browse/HIVE-12833. So we deprecated it in favor of the standard function grouping_id() as the public API. "spark_grouping_id" is a virtual column that is only used internally.

marmbrus · 2016-04-07T18:47:57Z

LGTM

davies · 2016-04-07T18:58:16Z

Merged into master, thanks!

Davies Liu added 4 commits April 6, 2016 22:53

support grouping()/grouping_id() in having clause

fe6ece7

support grouping() in sort

7ad43fe

update comments

f60fcb4

add a test for having

2412671

cloud-fan reviewed Apr 7, 2016

CR

8c4bd6c

davies changed the title ~~[SPARK-12740] support grouping()/grouping_id() in having/order clause~~ [SPARK-12740] [SPARK-13932] support grouping()/grouping_id() in having/order clause Apr 7, 2016

marmbrus reviewed Apr 7, 2016

asfgit closed this in aa85221 Apr 7, 2016

viirya mentioned this pull request May 6, 2017

[SPARK-20612][SQL] Throw exception when there is unresolvable attributes in Filter #17874

Closed
