Skip to content

Conversation

@davies
Copy link
Contributor

@davies davies commented Apr 7, 2016

What changes were proposed in this pull request?

This PR brings the support of using grouping()/grouping_id() in HAVING/ORDER BY clause.

The resolved grouping()/grouping_id() will be replaced by unresolved "spark_gropuing_id" virtual attribute, then resolved by ResolveMissingAttribute.

This PR also fix the HAVING clause that access a grouping column that is not presented in SELECT clause, for example:

select count(1) from (select 1 as a) t group by a having a > 0

How was this patch tested?

Add new tests.

@davies
Copy link
Contributor Author

davies commented Apr 7, 2016

cc @cloud-fan

}
}

private def isAggregateExpression(e: Expression): Boolean = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need this method anymore? It's just a simple isInstanceOf now

@cloud-fan
Copy link
Contributor

overall LGTM

@SparkQA
Copy link

SparkQA commented Apr 7, 2016

Test build #55201 has finished for PR 12235 at commit 2412671.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor Author

davies commented Apr 7, 2016

@marmbrus Could you also take a quick look on this one?

@davies davies changed the title [SPARK-12740] support grouping()/grouping_id() in having/order clause [SPARK-12740] [SPARK-13932] support grouping()/grouping_id() in having/order clause Apr 7, 2016
@SparkQA
Copy link

SparkQA commented Apr 7, 2016

Test build #55229 has finished for PR 12235 at commit 8c4bd6c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val groupingIdName: String = "grouping__id"
// The attribute name used by Hive, which has different result than Spark, deprecated.
val hiveGroupingIdName: String = "grouping__id"
val groupingIdName: String = "spark_grouping_id"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what's going on here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"grouping__id" came from Hive, but unfortunately the implementation is wrong, see https://issues.apache.org/jira/browse/HIVE-12833. So we deprecated to favor the standard function grouping_id() as public API. "spark_grouping_id" is the virtual column only used internally.

@marmbrus
Copy link
Contributor

marmbrus commented Apr 7, 2016

LGTM

@asfgit asfgit closed this in aa85221 Apr 7, 2016
@davies
Copy link
Contributor Author

davies commented Apr 7, 2016

Merged into master, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants