[SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited #32335
…ifferent between hive SerDe and row format delimited
Gentle ping @cloud-fan @MaxGekk @maropu
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #137920 has finished for PR 32335 at commit
Test build #137921 has finished for PR 32335 at commit
Merged to master.
@HyukjinKwon Since you just merged this, I think I should open a follow-up PR to document this behavior in the migration guide. Is that OK?
Please go ahead.
          |FROM v
          |""".stripMargin),
      identity,
      Row("INTERVAL '1 00:00:00' DAY TO SECOND", "INTERVAL '0-10' YEAR TO MONTH") :: Nil)
so the spark-sql shell and df.show have different formats for intervals?
> so the spark-sql shell and `df.show` have different formats for intervals?
Yea, it has this problem too, since Spark SQL follows the Hive format. What should I do next?
Is interval format the only difference between hive format and spark cast?
> Is interval format the only difference between hive format and spark cast?
Yea, ANSI_STYLE and HIVE_STYLE
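To make the two styles concrete, here is a minimal, self-contained sketch in plain Scala. It does not call Spark's internal formatter; the object `IntervalStyles` and its methods are hypothetical names made up for illustration. The ANSI-style strings match the expected `Row` in the test above, while the bare Hive-style strings (including the nine-digit fraction on day-time values) are an assumption about what the Hive format looks like and should be checked against the actual output.

```scala
// Illustrative sketch only: hypothetical helper, not Spark's implementation.
object IntervalStyles {
  sealed trait Style
  case object AnsiStyle extends Style
  case object HiveStyle extends Style

  /** Formats a year-month interval given as a total number of months. */
  def yearMonth(totalMonths: Int, style: Style): String = {
    val sign = if (totalMonths < 0) "-" else ""
    val abs = math.abs(totalMonths.toLong)
    val body = s"$sign${abs / 12}-${abs % 12}"
    style match {
      case AnsiStyle => s"INTERVAL '$body' YEAR TO MONTH"
      case HiveStyle => body // assumption: Hive style is the bare value
    }
  }

  /** Formats a day-time interval given as a whole number of seconds. */
  def dayTime(totalSeconds: Long, style: Style): String = {
    val sign = if (totalSeconds < 0) "-" else ""
    val abs = math.abs(totalSeconds)
    val days = abs / 86400
    val hours = (abs % 86400) / 3600
    val minutes = (abs % 3600) / 60
    val seconds = abs % 60
    val body = f"$sign$days $hours%02d:$minutes%02d:$seconds%02d"
    style match {
      case AnsiStyle => s"INTERVAL '$body' DAY TO SECOND"
      // assumption: Hive style appends a nanosecond fraction
      case HiveStyle => body + ".000000000"
    }
  }
}
```

With these definitions, `IntervalStyles.yearMonth(10, IntervalStyles.AnsiStyle)` and `IntervalStyles.dayTime(86400L, IntervalStyles.AnsiStyle)` produce the two strings asserted in the test, which is the format both code paths would need to agree on to be consistent.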
Maybe we should have a new Expression ToHiveString and use it in df.show and TRANSFORM, so that they are consistent.
> Maybe we should have a new Expression `ToHiveString` and use it in `df.show` and `TRANSFORM`, so that they are consistent.
Yea, created a ticket: https://issues.apache.org/jira/browse/SPARK-35228
What changes were proposed in this pull request?
DayTimeIntervalType/YearMonthIntervalType values are displayed differently between Hive SerDe and row format delimited.
This PR adds a test and opens a discussion.
For this problem I think we have two directions:
Why are the changes needed?
Adds a unit test (UT).
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added UT.