cleanups to docs for SPARK-15784 #4
```diff
   */
  @Since("2.3.0")
- val idCol = new Param[String](this, "id", "column name for ids.")
+ val idCol = new Param[String](this, "idCol", "Name of the input column for vertex IDs.")
```
The 2nd arg here should match the Param name exactly
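For illustration, a minimal sketch of why the match matters (the enclosing trait is hypothetical, not the Spark source): string-based lookups such as `getParam("idCol")` resolve against the `Param`'s name, so a mismatch between the val name and the name argument breaks them.

```scala
import org.apache.spark.ml.param.{Param, Params}

// Hypothetical trait, for illustration only.
trait HasIdCol extends Params {
  // The second constructor argument is the Param's name; getParam("idCol")
  // resolves against it, so it must match the val name exactly.
  val idCol = new Param[String](this, "idCol",
    "Name of the input column for vertex IDs.")

  def getIdCol: String = $(idCol)
}
```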
```diff
- * Param for the column name for neighbors required by PowerIterationClustering.transform().
- * Default: "neighbor"
+ * Param for the name of the input column for neighbors in the adjacency list representation.
+ * Default: "neighbors"
```
I made this plural since each row holds multiple neighbors (not 1 neighbor)
I updated the Param, getter & setter accordingly
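Putting the rename together, a hedged sketch of how the plural Param, getter, and setter line up (the surrounding trait and setter placement are illustrative, not the literal diff; in Spark the setter usually lives on the estimator/model class):

```scala
import org.apache.spark.ml.param.{Param, Params}

// Hypothetical trait, for illustration only.
trait HasNeighborsCol extends Params {
  val neighborsCol = new Param[String](this, "neighborsCol",
    "Name of the input column for neighbors in the adjacency list representation.")

  def getNeighborsCol: String = $(neighborsCol)

  def setNeighborsCol(value: String): this.type = set(neighborsCol, value)
}
```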
| "The length of neighbor list must be equal to the the length of the weight list.") | ||
| nbr.toArray.toIterator.zip(weight.toArray.toIterator) | ||
| .map(x => (id, x._1.toLong, x._2))} | ||
| require(nbr.size == weight.size, |
needed an indentation fix to match the Scala style guide
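For reference, the whole block with the style-guide indentation applied looks roughly like this; the enclosing object and helper name are assumptions for illustration, and the continuation lines are indented two spaces past the opening expression:

```scala
import org.apache.spark.ml.linalg.Vector

// Hypothetical container for the sketch; in the PR this logic sits inline.
object PICDataHelper {
  def toEdges(id: Long, nbr: Vector, weight: Vector): Iterator[(Long, Long, Double)] = {
    require(nbr.size == weight.size,
      "The length of the neighbor list must be equal to the length of the weight list.")
    // Pair each neighbor ID with its edge weight, yielding (src, dst, weight).
    nbr.toArray.toIterator.zip(weight.toArray.toIterator)
      .map(x => (id, x._1.toLong, x._2))
  }
}
```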
```diff
- def generatePICData(spark: SparkSession, r1: Double, r2: Double,
-     n1: Int, n2: Int): DataFrame = {
+ def generatePICData(
+     spark: SparkSession,
```
matching the Scala style guide
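The full fixed signature would look like this under the style guide: when the parameters don't fit on one line, each goes on its own line, indented four spaces (the wrapping object is a hypothetical stand-in for the test suite, and the body is elided):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object PICTestData { // hypothetical container; in the PR this sits in the test suite
  def generatePICData(
      spark: SparkSession,
      r1: Double,
      r2: Double,
      n1: Int,
      n2: Int): DataFrame = {
    ??? // body unchanged from the original helper
  }
}
```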
```diff
    val rdd = sc.parallelize(similarities).map {
      case (id: Long, nbr: Array[Long], weight: Array[Double]) =>
-       TestRow2(id, Vectors.dense(nbr.map(i => i.toDouble)), Vectors.dense(weight))}
+       TestRow2(id, Vectors.dense(nbr.map(i => i.toDouble)), Vectors.dense(weight))
```
ditto: Scala style
…ver)QueryTestSuite

### What changes were proposed in this pull request?

This PR makes two changes to exception handling in `SQLQueryTestSuite` and `ThriftServerQueryTestSuite`:
- fixes an expected-output sorting issue in `ThriftServerQueryTestSuite`: if there is an exception, there is no need to sort
- introduces common exception handling in those two suites via a new `handleExceptions` method

### Why are the changes needed?

Currently `ThriftServerQueryTestSuite` passes on master, but it fails on one of my PRs (apache#23531) with this error (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111651/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/sql_3/):

```
org.scalatest.exceptions.TestFailedException: Expected "
[Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit
org.apache.spark.SparkException]
", but got "
[org.apache.spark.SparkException
Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit]
"
Result did not match for query #4
WITH RECURSIVE r(level) AS (
  VALUES (0)
  UNION ALL
  SELECT level + 1 FROM r
)
SELECT * FROM r
```

The unexpectedly reversed order of the expected output (error message first, then the exception class) is due to this line: https://github.com/apache/spark/pull/26028/files#diff-b3ea3021602a88056e52bf83d8782de8L146. The expected output should not be sorted when execution raised an error.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing UTs.

Closes apache#26028 from peter-toth/SPARK-29359-better-exception-handling.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Yuming Wang <[email protected]>
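As a rough, hypothetical sketch of the idea behind such a helper (the name matches the commit message, but the trait, signature, and output format here are assumptions, not the actual diff): normalize failures into a fixed class-name-then-message order so the expected output never needs re-sorting.

```scala
import scala.util.control.NonFatal

// Hypothetical mixin; the real method in the PR may differ in signature.
trait HandlesQueryExceptions {
  // Run the query body; on failure, emit the exception class name first,
  // then its message, in a fixed order that must never be re-sorted.
  protected def handleExceptions(body: => Seq[String]): Seq[String] = {
    try {
      body
    } catch {
      case NonFatal(e) => Seq(e.getClass.getName, e.getMessage)
    }
  }
}
```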
This is a PR against your PR (apache#15770) to make it easier to show my feedback on the docs. Please check and merge/edit as needed. Thanks!
Most of these changes are expanding/clarifying docstrings. I'll comment on the other changes.