[SPARK-23916][SQL] Add array_join function #21011
Test build #89067 has finished for PR 21011 at commit
val df = Seq(
  (Seq[String]("a", "b"), ","),
  (Seq[String]("a", null, "b"), ","),
  (Seq[String](), ",")
Maybe Seq.empty[String]
hello world
> SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
hello , world
""")
and since.
hello world
> SELECT _FUNC_(array('hello', null ,'world'), ' ', ',');
hello , world
""")
add since. see this discussion.
Test build #89094 has finished for PR 21011 at commit
retest this please
Test build #89106 has finished for PR 21011 at commit
cc @ueshin
ueshin
left a comment
LGTM except for some nits.
""".stripMargin)
} else {
  ev.copy(s"""
    |boolean ${ev.isNull} = false;
nit: I guess we can remove this?
|}
|$buffer.append(${replacementGen.value});
|$firstItem = false;
""".stripMargin
nit: indent
|$code
""".stripMargin)
} else {
ev.copy(s"""
nit: maybe we need a line break between copy( and s"""?
s"""
|${replacementGen.code}
|$execCode
""".stripMargin
nit: indent
if (delimiterEval == null) return null
val nullReplacementEval = nullReplacement.map(_.eval(input))
if (nullReplacementEval.contains(null)) return null
|
|
nit: remove an extra line.
I removed the other one.... :)
Seq(ArrayType(StringType), StringType, StringType)
} else {
Seq(ArrayType(StringType), StringType)
}
nit: indent?
I don't think the indent is wrong since this is for the if...else and not for the method itself
Hmm, I think the indent is 2 spaces in this case. For example, namedExpressions.scala#L170-L174 or regexpExpressions.scala#L46-L51.
Oops, you are right... I am not sure where I saw it differently... maybe I just got confused... sorry, I am fixing it
Seq(array, delimiter, nullReplacement.get)
} else {
Seq(array, delimiter)
}
nit: indent?
ditto
Test build #89342 has finished for PR 21011 at commit
retest this please
Test build #89340 has finished for PR 21011 at commit
Test build #89351 has finished for PR 21011 at commit
any more comments @ueshin?
Test build #89451 has finished for PR 21011 at commit
python/pyspark/sql/functions.py
Outdated
def array_join(col, delimiter, null_replacement=None):
    """
    Concatenates the elements of `column` using the `delimiter`. Null values are replaced with
    `nullReplacement` if set, otherwise they are ignored.
nit: null_replacement?
Test build #89534 has finished for PR 21011 at commit
any more comments?
Test build #89645 has finished for PR 21011 at commit
kindly ping @ueshin
Thanks! merging to master.
What changes were proposed in this pull request?
The PR adds the SQL function array_join. The behavior of the function is based on Presto's.
The function accepts an array of string which is to be joined, a string which is the delimiter to use between the items of the first argument, and optionally a string which is used to replace null values.
How was this patch tested?
Added UTs.
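For readers who want to check the null handling quickly, here is a minimal plain-Python sketch of the semantics described above. It is only a model, not the Catalyst expression or the PySpark wrapper from this PR: the names array_join_model and _UNSET are hypothetical, and returning null for a null input array is an assumption that is not shown in the reviewed snippets. The first_item flag mirrors the $firstItem variable visible in the generated-code snippet.

```python
# Hypothetical sentinel meaning "no null replacement argument was passed".
# It lets the model distinguish an omitted argument from an explicit None.
_UNSET = object()


def array_join_model(items, delimiter, null_replacement=_UNSET):
    """Plain-Python model of the array_join behavior described in the PR."""
    # A null delimiter or an explicitly passed null replacement yields null,
    # as in the reviewed eval code. The null-array case is an assumption.
    if items is None or delimiter is None or null_replacement is None:
        return None

    parts = []
    first_item = True
    for item in items:
        if item is None:
            if null_replacement is _UNSET:
                continue  # no replacement given: null elements are skipped
            item = null_replacement
        if not first_item:
            parts.append(delimiter)
        parts.append(item)
        first_item = False
    return "".join(parts)


# The two examples from the function's usage doc in this PR.
print(array_join_model(["hello", None, "world"], " "))       # hello world
print(array_join_model(["hello", None, "world"], " ", ","))  # hello , world
```

The sentinel is used instead of a default of None because the reviewed Scala code treats "replacement omitted" (nulls are skipped) differently from "replacement evaluates to null" (the result is null).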