[SPARK-17177][SQL] Make grouping columns accessible from RelationalGroupedDataset
#14742
Test build #64169 has finished for PR 14742 at commit

cc: @shivaram, @liancheng

@liancheng, @rxin, Do you think adding
     *
     * @since 2.1.0
     */
    def columns: Array[String] = {
This might not be a column (a named expression). Thus, this might not be useful to end users.

Could we please close this PR now?

Test build #78035 has started for PR 14742 at commit

Test FAILed.

Hi @gatorsmile, #14431 depends on this. Is there a way I can access the grouping columns from

Let us close this one. We can discuss how to resolve the issue in #14431

yes, we can close this, but it would be great if you could help us to find a way to access the grouping columns from SparkR in #14431
## What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same instances with apache#18017

I believe the author in apache#14807 removed his account.

Closes apache#7075
Closes apache#8927
Closes apache#9202
Closes apache#9366
Closes apache#10861
Closes apache#11420
Closes apache#12356
Closes apache#13028
Closes apache#13506
Closes apache#14191
Closes apache#14198
Closes apache#14330
Closes apache#14807
Closes apache#15839
Closes apache#16225
Closes apache#16685
Closes apache#16692
Closes apache#16995
Closes apache#17181
Closes apache#17211
Closes apache#17235
Closes apache#17237
Closes apache#17248
Closes apache#17341
Closes apache#17708
Closes apache#17716
Closes apache#17721
Closes apache#17937

Added:
Closes apache#14739
Closes apache#17139
Closes apache#17445
Closes apache#18042
Closes apache#18359

Added:
Closes apache#16450
Closes apache#16525
Closes apache#17738

Added:
Closes apache#16458
Closes apache#16508
Closes apache#17714

Added:
Closes apache#17830
Closes apache#14742

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#18417 from HyukjinKwon/close-stale-pr.
What changes were proposed in this pull request?
Currently, once we create a RelationalGroupedDataset, we cannot access the grouping columns from its instance.
Analogous to Dataset, we can have a public method that returns the list of grouping columns.
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L457
This can be useful, for instance, in SparkR when we want to have certain logic associated with the grouping columns, accessible from RelationalGroupedDataset.
Similar to Dataset.columns, I've added a RelationalGroupedDataset.columns method, which makes grouping column names accessible.

How was this patch tested?
Unit tests in DataFrameAggregateSuite
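
To make the proposal and the review concern concrete, here is a minimal sketch in plain Scala. It is a toy model, not Spark code: GroupingExpr, NamedColumn, UnnamedExpr, and GroupedData are hypothetical stand-ins, and the fallback to an expression's SQL text for unnamed grouping expressions is an assumption, since the body of the proposed columns method is not shown in this thread.

```scala
// Toy model only: every type here is a hypothetical stand-in, not a Spark class.
object GroupingColumnsSketch {

  // A grouping expression is either a named column or an arbitrary unnamed expression.
  sealed trait GroupingExpr
  final case class NamedColumn(name: String) extends GroupingExpr
  final case class UnnamedExpr(sql: String) extends GroupingExpr

  // Stand-in for a grouped dataset that remembers how it was grouped.
  final case class GroupedData(groupingExprs: Seq[GroupingExpr]) {

    // Sketch of the proposed accessor: one string per grouping expression.
    // Assumption: an unnamed expression falls back to its SQL text. This is
    // the case the review comment flags, because that string is not a column name.
    def columns: Array[String] = groupingExprs.map {
      case NamedColumn(name) => name
      case UnnamedExpr(sql)  => sql
    }.toArray
  }

  def main(args: Array[String]): Unit = {
    val byName = GroupedData(Seq(NamedColumn("department"), NamedColumn("year")))
    println(byName.columns.mkString(", ")) // prints "department, year"

    val byExpr = GroupedData(Seq(UnnamedExpr("(age + 1)")))
    println(byExpr.columns.mkString(", ")) // prints "(age + 1)"
  }
}
```

In the second case the returned string cannot be used to look up a column, which is one way to read the objection that a grouping expression "might not be a column (a named expression)".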