@yanboliang
Contributor

The following ml.feature parameters and methods are missing from the Python API:
StringIndexer lacks the handleInvalid parameter.
StringIndexerModel lacks the labels method.
VectorIndexerModel lacks the numFeatures and categoryMaps methods.
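
For readers unfamiliar with these members, here is a minimal pure-Python sketch of what handleInvalid and labels are meant to do. It is not the PySpark implementation. The names ToyStringIndexerModel and fit_toy_string_indexer are hypothetical, and the behavior shown (labels ordered by descending frequency, "error" and "skip" as the handleInvalid options, alphabetical tie-breaking) is an assumption based on the Scala API's documented behavior.

```python
from collections import Counter


class ToyStringIndexerModel:
    """Illustrative stand-in for StringIndexerModel; not the PySpark class."""

    def __init__(self, labels, handle_invalid="error"):
        # Position in this list is the index assigned to each label,
        # which is what a `labels` accessor would expose.
        self.labels = list(labels)
        self.handle_invalid = handle_invalid
        self._index = {label: i for i, label in enumerate(self.labels)}

    def transform(self, values):
        """Return (value, index) pairs, applying the handle_invalid policy."""
        out = []
        for value in values:
            if value in self._index:
                out.append((value, float(self._index[value])))
            elif self.handle_invalid == "skip":
                continue  # drop entries whose label was not seen during fit
            else:
                raise ValueError("Unseen label: %r" % (value,))
        return out


def fit_toy_string_indexer(values, handle_invalid="error"):
    """Hypothetical helper: build a model from a list of string labels."""
    if handle_invalid not in ("error", "skip"):
        raise ValueError("handle_invalid must be 'error' or 'skip'")
    counts = Counter(values)
    # Most frequent label gets index 0. Alphabetical tie-breaking is an
    # assumption of this sketch, not a statement about Spark.
    labels = sorted(counts, key=lambda label: (-counts[label], label))
    return ToyStringIndexerModel(labels, handle_invalid)


model = fit_toy_string_indexer(["a", "b", "c", "a", "a", "c"], handle_invalid="skip")
print(model.labels)                       # ['a', 'c', 'b']
print(model.transform(["a", "d", "c"]))   # [('a', 0.0), ('c', 1.0)]
```

With handle_invalid="error", the same transform call would raise on the unseen label "d" instead of dropping it.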

@SparkQA

SparkQA commented Aug 19, 2015

Test build #41248 has finished for PR 8313 at commit 953ca82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid):
    • class HasHandleInvalid(Params):

@holdenk
Contributor

holdenk commented Aug 19, 2015

Exposing labels from StringIndexerModel is also in #7976, but it's pretty much the same, so this should be fine.

Contributor

Some other classes (e.g. StandardScaler, RegexTokenizer) have no shared accessor class for their individual properties. It might be better either to keep handleInvalid inside StringIndexer only, or to create shared accessor classes for StandardScaler etc. as well, so that such properties are accessed in a consistent way.

Contributor Author

handleInvalid is a common property, and other Transformers/Estimators may use it in the future, so I think we need to make it a shared param. Another reason is that we want to keep Python consistent with Scala, and handleInvalid is a shared param in Scala.
@jkbradley
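
To make the shared-param idea concrete, here is a simplified pure-Python sketch of the mixin pattern under discussion. It is not PySpark's actual Params machinery. The Params base class below is a hypothetical stand-in, ToyStringIndexer is an illustrative name, and the "error" default is an assumption.

```python
class Params:
    """Hypothetical minimal base class that stores param values by name."""

    def __init__(self):
        self._values = {}


class HasInputCol(Params):
    """Shared mixin: any stage that reads one input column can inherit it."""

    def setInputCol(self, value):
        self._values["inputCol"] = value
        return self  # return self so setters can be chained

    def getInputCol(self):
        return self._values.get("inputCol")


class HasHandleInvalid(Params):
    """Shared mixin: defined once, reusable by any stage that needs it."""

    def setHandleInvalid(self, value):
        self._values["handleInvalid"] = value
        return self

    def getHandleInvalid(self):
        # Default of "error" is an assumption of this sketch.
        return self._values.get("handleInvalid", "error")


class ToyStringIndexer(HasInputCol, HasHandleInvalid):
    """Gets both accessors through inheritance, with no per-class code."""


indexer = ToyStringIndexer().setInputCol("category").setHandleInvalid("skip")
print(indexer.getInputCol(), indexer.getHandleInvalid())
```

The alternative raised in the review, keeping handleInvalid inside StringIndexer only, would move the two accessor methods into the estimator class itself, at the cost of repeating them if another stage later needs the same param.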

@holdenk
Contributor

holdenk commented Sep 8, 2015

Maybe add a test for numFeatures & categoryMaps?

@yanboliang
Contributor Author

@holdenk Agreed, done.

@SparkQA

SparkQA commented Sep 9, 2015

Test build #42183 has finished for PR 8313 at commit 86353b4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid):
    • class HasHandleInvalid(Params):

@SparkQA

SparkQA commented Sep 9, 2015

Test build #42199 has finished for PR 8313 at commit 4eb5607.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid):
    • class HasHandleInvalid(Params):

@SparkQA

SparkQA commented Sep 9, 2015

Test build #42201 has finished for PR 8313 at commit 5403ae2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid):
    • class HasHandleInvalid(Params):

@asfgit asfgit closed this in a140dd7 Sep 11, 2015
@yanboliang yanboliang deleted the spark-10027 branch May 5, 2016 07:34