@dilipbiswal
Contributor

What changes were proposed in this pull request?

Document the CREATE FUNCTION statement in the SQL Reference Guide.

Why are the changes needed?

Currently, Spark lacks documentation on the supported SQL constructs, which
causes confusion among users, who sometimes have to read the code to understand
the usage. This PR aims to address that.

Does this PR introduce any user-facing change?

Yes.

Before:
There was no documentation for this.

After:
[Four screenshots, captured 2019-09-22 between 3:01 PM and 3:04 PM]

How was this patch tested?

Tested using jekyll build --serve

@SparkQA

SparkQA commented Sep 22, 2019

Test build #111168 has finished for PR 25894 at commit cddbebb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • -- public class SimpleUdf extends UDF
  • -- public class SimpleUdfR extends UDF

<dt><code><em>TEMPORARY</em></code></dt>
<dd>
Indicates the scope of function being created. When TEMPORARY is specified, the
created function is valid in the current session. No persistent entry is made

Member

valid -> valid and visible

Contributor Author

@gatorsmile thanks.. fixed.


### Examples
{% highlight sql %}
-- 1. Create a simple UDF `SimpleUdf` that adds the supplied integet value by 10.

Member

integet -> integral

Contributor Author

@gatorsmile fixed.

<dt><code><em>class_name</em></code></dt>
<dd>
Specifies the name of the class that provides the implementation for function to be created.
The implementing class should extend from one of the base classes in `Hive` as follows:

Member

UserDefinedAggregateFunction. We also support this.

Contributor Author

@gatorsmile I have created a placeholder link for custom scalar functions. There is already a placeholder for aggregate functions, and a lot of content is already in place.

in Spark. Temporary functions are scoped at a session level where as permanent
functions are created in the persistent catalog and are made available to
all sessions. The resources specified in the `USING` clause are made available
to all executors when they are executed for the first time.

Member

We also need to explain we can create/register temporary SQL functions via Python/Scala/Java APIs.

Contributor Author

@gatorsmile I have created a placeholder link for custom scalar functions. There is already a placeholder for aggregate functions, and a lot of content is already in place.

@SparkQA

SparkQA commented Sep 23, 2019

Test build #111187 has finished for PR 25894 at commit 8c734c9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 8, 2019

Test build #111860 has finished for PR 25894 at commit 3d0389f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

### Related statements
- [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html)
- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html)
- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html)

Member

duplicate?

Contributor Author

@gatorsmile oops.. thanks for catching it. Will fix.

</div>

## Scalar Functions
(to be filled soon)

Member

Create a JIRA?

Contributor Author

@gatorsmile created here

Member

@gatorsmile left a comment

LGTM except two comments.

@SparkQA

SparkQA commented Oct 14, 2019

Test build #112005 has finished for PR 25894 at commit 0cb396b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Contributor Author

cc @gatorsmile

<dt><code><em>IF NOT EXISTS</em></code></dt>
<dd>
If specified, creates the function only when it does not exist. The creation
of function succeeds (no error is thrown), if the specified function already

Member

Nit: remove comma

</dd>
<dt><code><em>TEMPORARY</em></code></dt>
<dd>
Indicates the scope of function being created. When TEMPORARY is specified, the

Member

Nit: code-format TEMPORARY?

<dl>
<dt><code><em>OR REPLACE</em></code></dt>
<dd>
If specified, the resources for function are reloaded. This is mainly useful

Member

for the function

<dt><code><em>class_name</em></code></dt>
<dd>
Specifies the name of the class that provides the implementation for function to be created.
The implementing class should extend from one of the base classes as follows:

Member

remove 'from'


### Examples
{% highlight sql %}
-- 1. Create a simple UDF `SimpleUdf` that adds the supplied integral value by 10.

Member

"increments" rather than "adds"?

-- import org.apache.hadoop.hive.ql.exec.UDF;
-- public class SimpleUdf extends UDF {
-- public int evaluate(int value) {
-- return value + 10;

Member

Nit: indent this more
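
For reference, a minimal sketch of the class with the kind of indentation the reviewer is asking for. To keep it compilable without Hive on the classpath, this version drops the `extends UDF` clause and the `org.apache.hadoop.hive.ql.exec.UDF` import shown in the quoted snippet; that omission is an assumption made for illustration and is not part of the patch.

```java
// Sketch only: mirrors the SimpleUdf quoted above, minus the Hive UDF base class
// (dropped here so the snippet compiles on its own; the patch's version extends UDF).
class SimpleUdf {
    // Returns the supplied integral value incremented by 10.
    public int evaluate(int value) {
        return value + 10;
    }
}
```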

-- return value + 10;
-- }
-- }
-- 2. Compile and place it in a jar file called `SimpleUdf.jar` in /tmp.

Member

Nit: jar -> JAR

| simple_temp_udf|
+------------------+

-- 1. Mofify `SimpleUdf`'s implementation to add supplied integral value by 20.

Member

Mofify -> Modify


-- public class SimpleUdfR extends UDF {
-- public int evaluate(int value) {
-- return value + 20;

Member

Indent here

@SparkQA

SparkQA commented Oct 21, 2019

Test build #112408 has finished for PR 25894 at commit 0a964c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen closed this in c1c6485 on Oct 22, 2019
@srowen
Member

srowen commented Oct 22, 2019

Merged to master

@dilipbiswal
Contributor Author

Thanks a lot @srowen @gatorsmile
