Skip to content

Conversation

@mgaido91
Copy link
Contributor

@mgaido91 mgaido91 commented Apr 9, 2018

What changes were proposed in this pull request?

The PR adds the SQL function array_join. The behavior of the function is based on Presto's one.

The function accepts an array of string which is to be joined, a string which is the delimiter to use between the items of the first argument and optionally a string which is used to replace null values.

How was this patch tested?

added UTs

@SparkQA
Copy link

SparkQA commented Apr 9, 2018

Test build #89067 has finished for PR 21011 at commit 1408cfc.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ArrayJoin(

val df = Seq(
(Seq[String]("a", "b"), ","),
(Seq[String]("a", null, "b"), ","),
(Seq[String](), ",")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Seq.empty[String]

hello world
> SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
hello , world
""")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and since.

hello world
> SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
hello , world
""")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add since. see this discussion.

@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89094 has finished for PR 21011 at commit e52ff85.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Apr 10, 2018

Test build #89106 has finished for PR 21011 at commit e52ff85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

cc @ueshin

Copy link
Member

@ueshin ueshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for some nits.

""".stripMargin)
} else {
ev.copy(s"""
|boolean ${ev.isNull} = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I guess we can remove this?

|}
|$buffer.append(${replacementGen.value});
|$firstItem = false;
""".stripMargin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent

|$code
""".stripMargin)
} else {
ev.copy(s"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we need a line break between copy( and s"""?

s"""
|${replacementGen.code}
|$execCode
""".stripMargin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent

if (delimiterEval == null) return null
val nullReplacementEval = nullReplacement.map(_.eval(input))
if (nullReplacementEval.contains(null)) return null

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove an extra line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the other one.... :)

Seq(ArrayType(StringType), StringType, StringType)
} else {
Seq(ArrayType(StringType), StringType)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the indent is wrong since this is for the if...else and not for the method itself

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think the indent is 2 spaces in this case. For example, namedExpressions.scala#L170-L174 or regexpExpressions.scala#L46-L51.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ops, you are right....I am not sure where I saw it differently...maybe I just got confused...sorry, I am fixing it

Seq(array, delimiter, nullReplacement.get)
} else {
Seq(array, delimiter)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@SparkQA
Copy link

SparkQA commented Apr 13, 2018

Test build #89342 has finished for PR 21011 at commit ad0d4aa.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Apr 13, 2018

Test build #89340 has finished for PR 21011 at commit dd9482e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 13, 2018

Test build #89351 has finished for PR 21011 at commit ad0d4aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Copy link
Contributor Author

any more comments @ueshin ?

@SparkQA
Copy link

SparkQA commented Apr 17, 2018

Test build #89451 has finished for PR 21011 at commit 703c09c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

def array_join(col, delimiter, null_replacement=None):
"""
Concatenates the elements of `column` using the `delimiter`. Null values are replaced with
`nullReplacement` if set, otherwise they are ignored.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: null_replacement?

@SparkQA
Copy link

SparkQA commented Apr 19, 2018

Test build #89534 has finished for PR 21011 at commit b1597d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Copy link
Contributor Author

any more comments?

@SparkQA
Copy link

SparkQA commented Apr 20, 2018

Test build #89645 has finished for PR 21011 at commit e9d7baa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class Summarizer(object):
  • class SummaryBuilder(JavaWrapper):
  • case class Reverse(child: Expression) extends UnaryExpression with ImplicitCastInputTypes
  • case class ArrayPosition(left: Expression, right: Expression)
  • case class ElementAt(left: Expression, right: Expression) extends GetMapValueUtil
  • case class Concat(children: Seq[Expression]) extends Expression
  • abstract class GetMapValueUtil extends BinaryExpression with ImplicitCastInputTypes
  • case class GetMapValue(child: Expression, key: Expression)
  • class ArrayDataIndexedSeq[T](arrayData: ArrayData, dataType: DataType) extends IndexedSeq[T]

@mgaido91
Copy link
Contributor Author

kindly ping @ueshin

@ueshin
Copy link
Member

ueshin commented Apr 26, 2018

Thanks! merging to master.

@asfgit asfgit closed this in cd10f9d Apr 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants