[SPARK-17237][SPARK-17458][SQL][Backport-2.0] Preserve aliases that are given for pivot aggregations #16565

maropu · 2017-01-13T00:38:26Z

What changes were proposed in this pull request?

This pr preserves aliases that are given for pivot aggregations, to solve the issue reported in SPARK-17237. Pivoting currently adds backticks (e.g. 3_count(`c`)) to column names and, in some cases, this causes analysis exceptions like:

scala> val df = Seq((2, 3, 4), (3, 4, 5)).toDF("a", "x", "y")
scala> df.groupBy("a").pivot("x").agg(count("y"), avg("y")).na.fill(0)
org.apache.spark.sql.AnalysisException: syntax error in attribute name: `3_count(`y`)`;
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:134)
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:144)
...

So, this pr also removes these backticks from column names.
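
To illustrate the intended behavior, here is a minimal sketch that pivots the same toy DataFrame with explicit aliases. The alias names `cnt` and `avg` and the object name `PivotAliasSketch` are made up for this example, and it assumes a Spark build that includes this fix, running in local mode. With the aliases honored, the result columns should be named after the pivot value plus the alias, without backticks, so `na.fill(0)` can resolve them.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, count}

object PivotAliasSketch {
  // Pivots on "x" and names each aggregate explicitly.
  // "cnt" and "avg" are illustrative aliases, not names used by the patch.
  def pivotWithAliases(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val df = Seq((2, 3, 4), (3, 4, 5)).toDF("a", "x", "y")
    df.groupBy("a")
      .pivot("x")
      .agg(count("y").as("cnt"), avg("y").as("avg"))
      .na.fill(0) // relies on backtick-free column names being resolvable
  }

  def main(args: Array[String]): Unit = {
    // Local single-threaded session, assumed here only to keep the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pivot-alias-sketch")
      .getOrCreate()
    try {
      val result = pivotWithAliases(spark)
      println(result.columns.mkString(", "))
    } finally {
      spark.stop()
    }
  }
}
```

Without aliases, the column suffix is derived from the aggregate expression itself, which is where the backticks in the error above came from.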

How was this patch tested?

Added a test in DataFrameAggregateSuite.

SparkQA · 2017-01-13T01:59:43Z

Test build #71281 has finished for PR 16565 at commit 875e2c0.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-01-13T02:06:05Z

Could you change the PR title to [SPARK-17237][SQL][Backport-2.0] Remove backticks in a pivot result schema

maropu · 2017-01-13T02:06:47Z

okay! I'm now looking into the test failure, so just a sec, thanks

SparkQA · 2017-01-13T07:29:32Z

Test build #71290 has finished for PR 16565 at commit e2c2fae.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-01-13T07:35:54Z

I checked the change history. Actually, you also backported #15111. Could you please update your PR description and PR title?

gatorsmile · 2017-01-13T07:38:07Z

LGTM except one comment

maropu · 2017-01-13T08:09:17Z

@gatorsmile oh, I see. Is it okay to mix this pr with the fix of #15111? Or would it be better to backport #15111 first, and then backport this?

maropu · 2017-01-15T01:18:23Z

@gatorsmile ping

gatorsmile · 2017-01-15T02:06:13Z

I think it is fine to do it together. Basically, your PR is to fix the bug of #15111

maropu · 2017-01-15T02:07:46Z

okay! I'll update them

maropu · 2017-01-15T02:21:20Z

@gatorsmile How about this fix? plz check this again?

gatorsmile

LGTM

…re given for pivot aggregations

## What changes were proposed in this pull request?

This pr is to preserve aliases that are given for pivot aggregations to solve the issue reported in `SPARK-17237`. This pivoting adds backticks (e.g. 3_count(\`c\`)) in column names and, in some cases, this causes analysis exceptions like:

```
scala> val df = Seq((2, 3, 4), (3, 4, 5)).toDF("a", "x", "y")
scala> df.groupBy("a").pivot("x").agg(count("y"), avg("y")).na.fill(0)
org.apache.spark.sql.AnalysisException: syntax error in attribute name: `3_count(`y`)`;
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:134)
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:144)
...
```

So, this pr also removes these backticks from column names.

## How was this patch tested?

Added a test in `DataFrameAggregateSuite`.

Author: Takeshi YAMAMURO <[email protected]>

Closes #16565 from maropu/SPARK-17237-3.

gatorsmile · 2017-01-15T07:41:39Z

Thanks! Merging to 2.0

Could you please close it?

maropu · 2017-01-15T07:42:28Z

Okay and thanks!

maropu changed the title ~~[SPARK-17237][SQL] Remove backticks in a pivot result schema~~ [SPARK-17237][SQL][Backport-2.0] Remove backticks in a pivot result schema Jan 13, 2017

Fix a bug to handle missing data after pivoting

e2c2fae

maropu force-pushed the SPARK-17237-3 branch from 875e2c0 to e2c2fae Compare January 13, 2017 05:25

maropu changed the title ~~[SPARK-17237][SQL][Backport-2.0] Remove backticks in a pivot result schema~~ [SPARK-17237][SPARK-17458][SQL][Backport-2.0] Alias specified for aggregates in a pivot are not honored Jan 15, 2017

maropu changed the title ~~[SPARK-17237][SPARK-17458][SQL][Backport-2.0] Alias specified for aggregates in a pivot are not honored~~ [SPARK-17237][SPARK-17458][SQL][Backport-2.0] Preserve aliases that are given for pivot aggregations Jan 15, 2017

gatorsmile approved these changes Jan 15, 2017


maropu closed this Jan 15, 2017

maropu mentioned this pull request Jun 14, 2017

[SPARK-17237][SQL][FOLLOWUP][WIP] Add a qualifier in pretty expressions #18302

Closed

maropu deleted the SPARK-17237-3 branch July 5, 2017 11:44
