-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-35088][SQL] Accept ANSI intervals by the Sequence expression #32311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #137853 has finished for PR 32311 at commit
|
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137858 has finished for PR 32311 at commit
|
MaxGekk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the comment
Line 2490 in 9af338c
| then the step expression must resolve to the 'interval' type, otherwise to the same type |
interval relates to CalendarIntervalType
|
|
||
| checkEvaluation(new Sequence( | ||
| Literal(Timestamp.valueOf("2018-01-01 00:00:00")), | ||
| Literal(Timestamp.valueOf("2018-01-02 00:00:01")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you change the test for CalendarInterval? It seems it checks a specific case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm dazzled.
| Timestamp.valueOf("2018-01-01 12:00:00"), | ||
| Timestamp.valueOf("2018-01-01 00:00:00"))) | ||
|
|
||
| checkEvaluation(new Sequence( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The test is already big enough. Does it make sense to put new checks to a separate test and prepend JIRA id?
| Literal(Date.valueOf("2018-01-01")), | ||
| Literal(Date.valueOf("2018-01-05")), | ||
| Literal(Period.ofDays(2))), | ||
| EmptyRow, "sequence step must be a day interval if start and end values are dates") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a day interval -> a day-time interval. Please, call DayTimeIntervalType.typeName() here. Since the types are not stable yet, we can rename the types in the near future.
|
The error message confuses slightly: Line 2567 in 9af338c
Type checking can fails because of unsupported type of steps even start and stop have one of: integral, timestamp or date.
|
| Literal(Date.valueOf("1970-02-01")), | ||
| Literal(Period.ofMonths(-1))), | ||
| EmptyRow, | ||
| s"sequence boundaries: 0 to 2678400000000 by -${28 * MICROS_PER_DAY}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, 28 because we assume 28 days per month there
Line 2742 in 9af338c
| private val microsPerMonth = 28 * microsPerDay |
By introducing new interval types, we tried to avoid such assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good found.
It seems we should change the behavior of CalenderInterval so as it could avoid such assumption.
Or the assumption looks like is a bug for the CalenderInterval.
If so , could we fix the bug in another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In further, microsPerDay just used to estimated length of the sequences. We just need to improve the exception message so as avoid output the assume value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. I see.
|
|
||
| private class PeriodSequenceImpl[T: ClassTag] | ||
| (dt: IntegralType, scale: Long, fromLong: Long => T, zoneId: ZoneId) | ||
| (implicit num: Integral[T]) extends InternalSequenceBase(dt, scale, fromLong, zoneId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure it is good idea to use timestampAddInterval() in InternalSequenceBase.eval for adding months to dates. I guess DateTimeUtils.dateAddMonths() and DateTimeUtils.timestampAddInterval can return different result, especially taking into account that dateAddMonths() does not depend on the current time zone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, the current implement uses DateTimeUtils.timestampAddInterval and it's behavior seems good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. Let's use timestampAddInterval since we don't have an example that could demonstrate any issues caused by timestampAddInterval().
|
Test build #137913 has finished for PR 32311 at commit
|
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137919 has finished for PR 32311 at commit
|
| If start and stop expressions resolve to the 'date' or 'timestamp' type | ||
| then the step expression must resolve to the 'interval' type, otherwise to the same type | ||
| as the start and stop expressions. | ||
| then the step expression must resolve to the 'interval' or 'year-month' or 'day-time' type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
'year-month' -> 'year-month interval'
'day-time' -> 'day-time interval'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
| |$prettyName uses the wrong parameter type. The parameter type must conform to: | ||
| |1. The start and stop expressions must resolve to the same type. | ||
| |2. If start and stop expressions resolve to the 'date' or 'timestamp' type | ||
| |then the step expression must resolve to the 'interval' or 'year-month' or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you call YearMonthIntervalType.typeName
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137914 has finished for PR 32311 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
MaxGekk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. GA passed. Merging to master.
Thank you, @beliefer .
|
@MaxGekk Thanks for you review. |
|
Test build #137932 has finished for PR 32311 at commit
|
|
Refer to this link for build results (access rights to CI server needed): |
What changes were proposed in this pull request?
This PR makes
Sequenceexpression supports ANSI intervals as step expression.If the start and stop expression is
TimestampType,then the step expression could select year-month or day-time interval.If the start and stop expression is
DateType,then the step expression must be year-month.Why are the changes needed?
Extends the function of
Sequenceexpression.Does this PR introduce any user-facing change?
'Yes'. Users could use ANSI intervals as step expression for
Sequenceexpression.How was this patch tested?
New tests.