-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-31562][SQL] Update ExpressionDescription for substring, current_date, and current_timestamp #28342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| expression[MonthsBetween]("months_between"), | ||
| expression[NextDay]("next_day"), | ||
| expression[CurrentTimestamp]("now", true), | ||
| expression[Now]("now"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the usage is different between now and current_timestamp, I added a Now expr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we don't need to add a version here where an alias added? For example, current_timestamp and now have been implemented in the different versions, 1.5 and 1.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between now and current_timestamp? Should now extend Nondeterministic, for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CURRENT_TIMESTAMP is a SQL syntax and NOW is not: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L777
|
Test build #121801 has finished for PR 28342 at commit
|
|
Test build #121806 has finished for PR 28342 at commit
|
| @ExpressionDescription( | ||
| usage = "_FUNC_() - Returns the current timestamp at the start of query evaluation.", | ||
| group = "datetime_funcs", | ||
| since = "1.5.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess now should be 3.1.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, now has been supported since 1.6.0? https://issues.apache.org/jira/browse/SPARK-11768
Or, you suggested above that we should set the version where this expr added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. Sorry, I was confused. Never mind~
| Examples: | ||
| > SELECT _FUNC_(); | ||
| 2020-04-25 | ||
| > SELECT _FUNC_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you remember when we started to support this syntax? I roughly remember it was added relatively recently. If the added version is different, it might be best to note it in note while we're here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the PR to implement the syntax: #14442
Spark without the hive support has supported it since 2.0. Yea, I'll add this info as note.
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. one question.
| override def prettyName: String = | ||
| getTagValue(FunctionRegistry.FUNC_ALIAS).getOrElse("current_timestamp") | ||
| @ExpressionDescription( | ||
| usage = "_FUNC_() - Returns the current timestamp at the start of query evaluation.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit. Do you think we can have examples = for now() since now usage is different(having different class and only functional form) from current_timestamp? Of course, we can skip it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. I'll add an example for now in this PR, too. Actually, I think it is better to add examples for the expressions that do not have examples now.
sql/gen-sql-functions-docs.py
Outdated
|
|
||
| for info in infos: | ||
| # Extracts (signature, description) pairs from `info.usage`, e.g., | ||
| # the signature is `func(expr)` and the description is `...` in an usage `func(expr) - ...`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to update this description accordingly because this PR dropped ( and ) requirement?
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. (only minor nits which can be ignored.)
|
Test build #121834 has finished for PR 28342 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much, @maropu and @HyukjinKwon .
Merged to master/3.0.
…t_date, and current_timestamp
### What changes were proposed in this pull request?
This PR intends to add entries for substring, current_date, and current_timestamp in the SQL built-in function documents. Specifically, the entries are as follows;
- SELECT current_date;
- SELECT current_timestamp;
- SELECT substring('abcd' FROM 1);
- SELECT substring('abcd' FROM 1 FOR 2);
### Why are the changes needed?
To make the SQL (built-in functions) references complete.
### Does this PR introduce any user-facing change?
<img width="1040" alt="Screen Shot 2020-04-25 at 16 51 07" src="https://user-images.githubusercontent.com/692303/80274851-6ca5ee00-8718-11ea-9a35-9ae82008cb4b.png">
<img width="974" alt="Screen Shot 2020-04-25 at 17 24 24" src="https://user-images.githubusercontent.com/692303/80275032-a88d8300-8719-11ea-92ec-95b80169ae28.png">
<img width="862" alt="Screen Shot 2020-04-25 at 17 27 48" src="https://user-images.githubusercontent.com/692303/80275114-36696e00-871a-11ea-8e39-02e93eabb92f.png">
### How was this patch tested?
Added test examples.
Closes #28342 from maropu/SPARK-31562.
Authored-by: Takeshi Yamamuro <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit e01125d)
Signed-off-by: Dongjoon Hyun <[email protected]>
|
Thanks, guys! @dongjoon-hyun @HyukjinKwon |
What changes were proposed in this pull request?
This PR intends to add entries for substring, current_date, and current_timestamp in the SQL built-in function documents. Specifically, the entries are as follows;
Why are the changes needed?
To make the SQL (built-in functions) references complete.
Does this PR introduce any user-facing change?
How was this patch tested?
Added test examples.