[SPARK-6145] [SQL] Fix the bug of nested data type resolving in ORDER BY #4892
Closed
6 commits
b82c8c5 fix the bug of nested data type resolving in sort by (chenghao-intel)
44ac17d nested a.a.a which returns expression with alias name 'a' (chenghao-intel)
47db754 fix the bug in unittest (chenghao-intel)
5de3b9e fix another bug in unittest (chenghao-intel)
73eb346 update the output column attribute name for nested data query (chenghao-intel)
9209ac1 update as feedback (chenghao-intel)
fix the bug in unittest
commit 47db754364ecd64e9fa73effc9e753d72e49647a
The Python test failure is caused by replacing `aliasName` with `name` here. Is that okay? `SELECT a.b.c FROM table` would now get an attribute named `a.b.c` instead of `c` as before.
Thank you @viirya, I've updated the Python code.
I meant, should it be that? In Hive, wouldn't it be `c` instead of `a.b.c`?
I am not sure how Hive handles that, but it cannot be `c`; otherwise it may cause ambiguous references in the parent logical plan. For example, assume we have a table `tbl` with schema `Struct<a: Struct<b: Int, c: Int>, b: Int>`.
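The concern about ambiguous references can be illustrated with a toy model that does not use Spark at all. This is only a sketch: the functions `resolve`, `output_names`, and `ambiguous`, the rule names `last_field` and `full_path`, and the select list `a.b, b` are all hypothetical and chosen for illustration; they are not Catalyst's resolver and not necessarily the query the comment had in mind.

```python
from collections import Counter

# Hypothetical stand-in for the schema mentioned in the comment:
# tbl: Struct<a: Struct<b: Int, c: Int>, b: Int>
SCHEMA = {"a": {"b": "int", "c": "int"}, "b": "int"}


def resolve(path, schema=SCHEMA):
    """Walk a dotted path through the nested schema; KeyError if a field is missing."""
    node = schema
    for part in path.split("."):
        node = node[part]
    return node


def output_names(select_list, rule):
    """Name each projected column under one of two illustrative naming rules."""
    for path in select_list:
        resolve(path)  # make sure every path exists in the schema
    if rule == "last_field":   # name the column after the innermost field
        return [path.split(".")[-1] for path in select_list]
    if rule == "full_path":    # name the column after the whole dotted path
        return list(select_list)
    raise ValueError(f"unknown rule: {rule}")


def ambiguous(names):
    """Return the output names that occur more than once, sorted."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Assumed query shape: SELECT a.b, b FROM tbl
select_list = ["a.b", "b"]
for rule in ("last_field", "full_path"):
    names = output_names(select_list, rule)
    print(rule, names, "ambiguous:", ambiguous(names))
```

Under the `last_field` rule both projected columns end up with the same name, so a parent operator that refers to that name cannot tell them apart; under `full_path` the names stay distinct, at the cost of differing from the shorter name that existing queries may rely on.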
I don't think we can change the default alias when extracting nested fields. I believe we match hive behaviors now, and this would break existing queries.
Yes, I agree we shouldn't break the existing logic, but I believe this is a bug in Hive.
I am wondering if we can depart from Hive's naming rule for nested data type references, which always causes ambiguity.
How is this a bug? These are pretty contrived examples. How often do you actually have nested structures where the outside name is the same as the inside name?
Yeah, I shouldn't have said "always", but rather "possibly"; it may happen quite often with `join`.