SPARK-22896 Improvement in String interpolation
chetkhatri committed Dec 24, 2017
commit 9916fd1f67234b1fa5608231181bdf3b08718981
@@ -52,9 +52,9 @@ object ChiSquareTestExample {

val df = data.toDF("label", "features")
val chi = ChiSquareTest.test(df, "features", "label").head
println("pValues = " + chi.getAs[Vector](0))
println("degreesOfFreedom = " + chi.getSeq[Int](1).mkString("[", ",", "]"))
println("statistics = " + chi.getAs[Vector](2))
println(s"pValues = ${chi.getAs[Vector](0)}")
Member:
I think this is OK; anything more complex I might suggest breaking out the expression into a val.

Contributor Author:
Ok
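
To illustrate the reviewer's suggestion (a hypothetical sketch, not part of this change): once an interpolated expression grows past a simple accessor, extracting it into a val keeps the format string readable.

// Sketch only: name the complex expression, then interpolate the identifier.
val degreesOfFreedom = chi.getSeq[Int](1).mkString("[", ",", "]")
println(s"degreesOfFreedom = $degreesOfFreedom")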

println(s"degreesOfFreedom ${chi.getSeq[Int](1).mkString("[", ",", "]")}")
println(s"statistics ${chi.getAs[Vector](2)}")
// $example off$

spark.stop()
@@ -51,10 +51,10 @@ object CorrelationExample {

val df = data.map(Tuple1.apply).toDF("features")
val Row(coeff1: Matrix) = Correlation.corr(df, "features").head
println("Pearson correlation matrix:\n" + coeff1.toString)
println(s"Pearson correlation matrix:\n ${coeff1.toString}")
Member:
Another thing we could improve: .toString is redundant here, I believe.

Contributor Author:
Addressed.
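
For context, a minimal standalone sketch (not from this PR) of why the call is redundant: the s interpolator already invokes toString on every interpolated value.

// Both lines print the same text; interpolation calls toString itself.
val coeffs = Seq(1.0, 0.5)
println(s"matrix: ${coeffs.toString}")  // redundant
println(s"matrix: $coeffs")             // equivalent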


val Row(coeff2: Matrix) = Correlation.corr(df, "features", "spearman").head
println("Spearman correlation matrix:\n" + coeff2.toString)
println(s"Spearman correlation matrix:\n ${coeff2.toString}")
// $example off$

spark.stop()
@@ -47,7 +47,7 @@ object DataFrameExample {
val parser = new OptionParser[Params]("DataFrameExample") {
head("DataFrameExample: an example app using DataFrame for ML.")
opt[String]("input")
.text(s"input path to dataframe")
.text("input path to dataframe")
.action((x, c) => c.copy(input = x))
checkConfig { params =>
success
@@ -93,7 +93,7 @@ object DataFrameExample {
// Load the records back.
println(s"Loading Parquet file with UDT from $outputDir.")
val newDF = spark.read.parquet(outputDir)
println(s"Schema from Parquet:")
println("Schema from Parquet:")
newDF.printSchema()

spark.stop()
@@ -83,10 +83,10 @@ object DecisionTreeClassificationExample {
.setPredictionCol("prediction")
.setMetricName("accuracy")
val accuracy = evaluator.evaluate(predictions)
println("Test Error = " + (1.0 - accuracy))
println(s"Test Error = ${(1.0 - accuracy)}")

val treeModel = model.stages(2).asInstanceOf[DecisionTreeClassificationModel]
println("Learned classification tree model:\n" + treeModel.toDebugString)
println(s"Learned classification tree model:\n ${treeModel.toDebugString}")
// $example off$

spark.stop()
@@ -53,7 +53,7 @@ object DeveloperApiExample {
// Create a LogisticRegression instance. This instance is an Estimator.
val lr = new MyLogisticRegression()
// Print out the parameters, documentation, and any default values.
println("MyLogisticRegression parameters:\n" + lr.explainParams() + "\n")
println(s"MyLogisticRegression parameters:\n ${lr.explainParams()}")

// We may set parameters using setter methods.
lr.setMaxIter(10)
@@ -169,10 +169,10 @@ private class MyLogisticRegressionModel(
Vectors.dense(-margin, margin)
}

-/** Number of classes the label can take. 2 indicates binary classification. */
+// Number of classes the label can take. 2 indicates binary classification.
Member:
Yeah, good to make this a standard comment, not scaladoc style

Contributor Author:
+1
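
A hypothetical sketch of the convention agreed on here: /** ... */ scaladoc is for documented API surface, while plain // comments suit implementation notes like the ones on these overrides.

trait ClassifierLike { def numClasses: Int }

class BinaryClassifierLike extends ClassifierLike {
  /** Scaladoc: rendered into the generated API documentation. */
  def predictLabel(margin: Double): Int = if (margin > 0) 1 else 0

  // Plain comment: an implementation note, not rendered as API docs.
  override val numClasses: Int = 2
}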

override val numClasses: Int = 2

-/** Number of features the model was trained on. */
+// Number of features the model was trained on.
override val numFeatures: Int = coefficients.size

/**
@@ -46,7 +46,7 @@ object EstimatorTransformerParamExample {
// Create a LogisticRegression instance. This instance is an Estimator.
val lr = new LogisticRegression()
// Print out the parameters, documentation, and any default values.
println("LogisticRegression parameters:\n" + lr.explainParams() + "\n")
println(s"LogisticRegression parameters:\n ${lr.explainParams()}\n")

// We may set parameters using setter methods.
lr.setMaxIter(10)
@@ -58,7 +58,7 @@ object EstimatorTransformerParamExample {
// we can view the parameters it used during fit().
// This prints the parameter (name: value) pairs, where names are unique IDs for this
// LogisticRegression instance.
println("Model 1 was fit using parameters: " + model1.parent.extractParamMap)
println(s"Model 1 was fit using parameters: ${model1.parent.extractParamMap}")

// We may alternatively specify parameters using a ParamMap,
// which supports several methods for specifying parameters.
@@ -73,7 +73,7 @@ object EstimatorTransformerParamExample {
// Now learn a new model using the paramMapCombined parameters.
// paramMapCombined overrides all parameters set earlier via lr.set* methods.
val model2 = lr.fit(training, paramMapCombined)
println("Model 2 was fit using parameters: " + model2.parent.extractParamMap)
println(s"Model 2 was fit using parameters: ${model2.parent.extractParamMap}")

// Prepare test data.
val test = spark.createDataFrame(Seq(
@@ -86,10 +86,10 @@ object GradientBoostedTreeClassifierExample {
.setPredictionCol("prediction")
.setMetricName("accuracy")
val accuracy = evaluator.evaluate(predictions)
println("Test Error = " + (1.0 - accuracy))
println(s"Test Error = ${1.0 - accuracy}")

val gbtModel = model.stages(2).asInstanceOf[GBTClassificationModel]
println("Learned classification GBT model:\n" + gbtModel.toDebugString)
println(s"Learned classification GBT model:\n ${gbtModel.toDebugString}")
// $example off$

spark.stop()
@@ -73,10 +73,10 @@ object GradientBoostedTreeRegressorExample {
.setPredictionCol("prediction")
.setMetricName("rmse")
val rmse = evaluator.evaluate(predictions)
println("Root Mean Squared Error (RMSE) on test data = " + rmse)
println(s"Root Mean Squared Error (RMSE) on test data = ${rmse}")

val gbtModel = model.stages(1).asInstanceOf[GBTRegressionModel]
println("Learned regression GBT model:\n" + gbtModel.toDebugString)
println(s"Learned regression GBT model:\n ${gbtModel.toDebugString}")
// $example off$

spark.stop()
@@ -48,7 +48,7 @@ object MulticlassLogisticRegressionWithElasticNetExample {

// Print the coefficients and intercept for multinomial logistic regression
println(s"Coefficients: \n${lrModel.coefficientMatrix}")
println(s"Intercepts: ${lrModel.interceptVector}")
println(s"Intercepts: \n${lrModel.interceptVector}")
// $example off$

spark.stop()
@@ -66,7 +66,7 @@ object MultilayerPerceptronClassifierExample {
val evaluator = new MulticlassClassificationEvaluator()
.setMetricName("accuracy")

println("Test set accuracy = " + evaluator.evaluate(predictionAndLabels))
println(s"Test set accuracy = ${evaluator.evaluate(predictionAndLabels)}")
// $example off$

spark.stop()
@@ -52,7 +52,7 @@ object NaiveBayesExample {
.setPredictionCol("prediction")
.setMetricName("accuracy")
val accuracy = evaluator.evaluate(predictions)
println("Test set accuracy = " + accuracy)
println(s"Test set accuracy = $accuracy")
// $example off$

spark.stop()
@@ -31,12 +31,11 @@ object QuantileDiscretizerExample {

// $example on$
val data = Array((0, 18.0), (1, 19.0), (2, 8.0), (3, 5.0), (4, 2.2))
-val df = spark.createDataFrame(data).toDF("id", "hour")
+val df = spark.createDataFrame(data).toDF("id", "hour").repartition(1)
Member:
Although it looks weird, I think the author intended the repartition(1) to not appear in the body of the example that's copied into the docs. I wouldn't change this.

Contributor Author:
ok

// $example off$
// Output of QuantileDiscretizer for such small datasets can depend on the number of
// partitions. Here we force a single partition to ensure consistent results.
// Note this is not necessary for normal use cases
-.repartition(1)
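
To make the reviewer's point concrete, here is a sketch of the structure being preserved (reconstructed from the removed lines above): the // $example on$ / // $example off$ markers delimit the snippet that the documentation build copies into the published docs, so the repartition(1) call and its caveats stay out of the rendered example.

// $example on$
val data = Array((0, 18.0), (1, 19.0), (2, 8.0), (3, 5.0), (4, 2.2))
val df = spark.createDataFrame(data).toDF("id", "hour")
// $example off$
  // Not copied into the docs: a single partition keeps this tiny
  // example's output deterministic; unnecessary in normal use.
  .repartition(1)
// $example on$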

// $example on$
val discretizer = new QuantileDiscretizer()
@@ -45,7 +44,7 @@ object QuantileDiscretizerExample {
.setNumBuckets(3)

val result = discretizer.fit(df).transform(df)
-result.show()
+result.show(false)
Member:
One more question - is it necessary to make this not truncate?

Contributor Author:
We're following the same style in other examples, so it's good to do.

Member:
Which other examples? Most do not set this, and the Java equivalent doesn't either. If there's a good reason that the output needs to be untruncated, that's fine; just also change the Java example.

Contributor Author:
@srowen Correct, it works either way; see e.g. examples/ml/LDAExamples.scala.
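
For reference, a short sketch (independent of this change) of what the flag does: Dataset.show() truncates string cells longer than 20 characters by default, and the show(truncate: Boolean) overload used here disables that.

// Assuming `result` is the DataFrame computed above.
result.show()       // default: long cell values truncated to 20 characters
result.show(false)  // cell values printed in full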

// $example off$

spark.stop()
@@ -85,10 +85,10 @@ object RandomForestClassifierExample {
.setPredictionCol("prediction")
.setMetricName("accuracy")
val accuracy = evaluator.evaluate(predictions)
println("Test Error = " + (1.0 - accuracy))
println(s"Test Error = ${(1.0 - accuracy)}")

val rfModel = model.stages(2).asInstanceOf[RandomForestClassificationModel]
println("Learned classification forest model:\n" + rfModel.toDebugString)
println(s"Learned classification forest model:\n ${rfModel.toDebugString}")
// $example off$

spark.stop()
@@ -72,10 +72,10 @@ object RandomForestRegressorExample {
.setPredictionCol("prediction")
.setMetricName("rmse")
val rmse = evaluator.evaluate(predictions)
println("Root Mean Squared Error (RMSE) on test data = " + rmse)
println(s"Root Mean Squared Error (RMSE) on test data = $rmse")

val rfModel = model.stages(1).asInstanceOf[RandomForestRegressionModel]
println("Learned regression forest model:\n" + rfModel.toDebugString)
println(s"Learned regression forest model:\n ${rfModel.toDebugString}")
// $example off$

spark.stop()