Closed
Address comments.
yanboliang committed Dec 1, 2016
commit e784901f1e2b1cd10c15537efa28077f8e67a768
14 changes: 8 additions & 6 deletions docs/ml-guide.md
@@ -64,33 +64,35 @@ and the migration guide below will explain all changes between releases.

### Breaking changes

* [SPARK-18481](https://issues.apache.org/jira/browse/SPARK-18481):
* `RandomForestClassificationModel.getNumTrees` and `RandomForestRegressionModel.getNumTrees` were made final.
* `RandomForestClassificationModel.setFeatureSubsetStrategy` and `RandomForestRegressionModel.setFeatureSubsetStrategy` return the concrete class type,
rather than an arbitrary trait. This only affected Java compatibility, not Scala.

**Deprecated methods removed**

* `setLabelCol` in `feature.ChiSqSelectorModel`
* `numTrees` in `classification.RandomForestClassificationModel` (This now refers to the Param called `numTrees`)
* `numTrees` in `regression.RandomForestRegressionModel` (This now refers to the Param called `numTrees`)
* `model` in `regression.LinearRegressionSummary`
* `validateParams` in `PipelineStage`
Review comment (Member): Also: `validateParams` in `Evaluator`

* `validateParams` in `Evaluator`

### Deprecations and changes of behavior

**Deprecations**

* [SPARK-18592](https://issues.apache.org/jira/browse/SPARK-18592):
Deprecate all setter methods for `DecisionTreeClassificationModel`, `GBTClassificationModel`, `RandomForestClassificationModel`, `DecisionTreeRegressionModel`, `GBTRegressionModel` and `RandomForestRegressionModel`
Deprecate all Param setter methods except for input/output column Params for `DecisionTreeClassificationModel`, `GBTClassificationModel`, `RandomForestClassificationModel`, `DecisionTreeRegressionModel`, `GBTRegressionModel` and `RandomForestRegressionModel`

**Changes of behavior**

* [SPARK-17870](https://issues.apache.org/jira/browse/SPARK-17870):
Fix a bug of `ChiSqSelector` which will likely change its result.
Fix a bug of `ChiSqSelector` which will likely change its result. Now `ChiSquareSelector` use pValue rather than raw statistic to select a fixed number of top features.
* [SPARK-3261](https://issues.apache.org/jira/browse/SPARK-3261):
`KMeans` returns potentially fewer than k cluster centers in cases where k distinct centroids aren't available or aren't selected.
* [SPARK-17389](https://issues.apache.org/jira/browse/SPARK-17389):
`KMeans` reduces the default number of steps from 5 to 2 for the k-means|| initialization mode.
* [SPARK-18481](https://issues.apache.org/jira/browse/SPARK-18481):
* `RandomForestClassificationModel.getNumTrees` and `RandomForestRegressionModel.getNumTrees` were made final.
* `RandomForestClassificationModel.setFeatureSubsetStrategy` and `RandomForestRegressionModel.setFeatureSubsetStrategy` return the concrete class type,
rather than an arbitrary trait. This only affected Java compatibility, not Scala.
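
The SPARK-17870 note says the selection criterion moved from the raw chi-squared statistic to the p-value. A minimal sketch of why that can change which features are kept follows. It assumes features whose tests have different degrees of freedom, which happens when features have different numbers of distinct values. The feature triples and the helper names `chi2_sf`, `top_by_statistic`, and `top_by_p_value` are hypothetical, and this is plain Python for illustration, not Spark's implementation.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability P(X > stat) for a chi-squared variable.

    Uses closed forms that exist for df == 1 and for even df;
    other values are out of scope for this sketch.
    """
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df > 0 and df % 2 == 0:
        half = stat / 2.0
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("sketch only handles df == 1 or even df")


# Hypothetical (name, statistic, degrees of freedom) triples.
features = [("a", 6.0, 1), ("b", 7.0, 4), ("c", 2.0, 2)]


def top_by_statistic(feats, k):
    # Larger statistic ranks first.
    ranked = sorted(feats, key=lambda f: f[1], reverse=True)
    return [name for name, _, _ in ranked[:k]]


def top_by_p_value(feats, k):
    # Smaller p-value ranks first.
    ranked = sorted(feats, key=lambda f: chi2_sf(f[1], f[2]))
    return [name for name, _, _ in ranked[:k]]


print(top_by_statistic(features, 1))
print(top_by_p_value(features, 1))
```

Feature `b` has the larger statistic, but with four degrees of freedom its p-value is higher than that of `a`, so the two rankings pick different top features.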

## Previous Spark versions
