Closed
34 commits
8389280
Mark a number of alogrithms and models experimental that are marked t…
holdenk May 5, 2016
1fa57e5
Add the rest
holdenk May 5, 2016
b1ce817
Use mathjax for formula in PyDoc
holdenk May 5, 2016
8125c8c
Switch to math highlighting and update legostic regresion get doc sin…
holdenk May 5, 2016
c72fa46
Long line fix
holdenk May 5, 2016
3fd1dce
Start adding the missing params to mutli-layer perceptron, also inves…
holdenk May 5, 2016
c7caa43
Or wait we just don't need to support None
holdenk May 5, 2016
4776221
Update the doc string for weights param and add doctest that verifys …
holdenk May 6, 2016
64942b7
Merge in master
holdenk May 9, 2016
2397004
mini fix
holdenk May 10, 2016
130d05f
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk May 10, 2016
a73913b
more pydoc fix
holdenk May 10, 2016
50b41ae
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk May 10, 2016
9e38ddf
Remove flaky doctet component
holdenk May 10, 2016
f4df8f0
Add a : as requested
holdenk May 10, 2016
5df5a93
Merge in master
holdenk May 19, 2016
2eec947
Back out some unrelated changes that are in a seperate PR anyways
holdenk May 19, 2016
e11dbf8
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk May 23, 2016
4111b2d
Update scaladoc and PyDoc to both have the correct chain for getThres…
holdenk May 26, 2016
53ab790
pep8
holdenk May 26, 2016
c2c7900
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk Jun 6, 2016
a7aadec
Revert doc change
holdenk Jun 6, 2016
e4061f4
minor fix
holdenk Jun 6, 2016
7b634b6
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk Jun 7, 2016
873f6c8
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk Jun 11, 2016
9fb2e41
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk Jun 12, 2016
74636b1
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk Jun 13, 2016
d925f38
Merge branch 'master' into SPARK-15162-SPARK-15164-update-some-pydocs
holdenk Jun 14, 2016
3981612
oook lets try 86ing mathjax but... welll w/e
holdenk Jun 14, 2016
3d13c6c
reenable mathjax
holdenk Jun 14, 2016
2be8cdf
Revert "[SPARK-15745][SQL] Use classloader's getResource() for readin…
holdenk Jun 14, 2016
4431daa
Support both methods
holdenk Jun 14, 2016
d842309
Revert "Support both methods"
holdenk Jun 21, 2016
de63f9f
Revert "Revert "[SPARK-15745][SQL] Use classloader's getResource() fo…
holdenk Jun 21, 2016
Switch to math highlighting and update legostic regresion get doc since it doesn't throw an an error
holdenk committed May 5, 2016
commit 8125c8c6a79cf55a74894a7d2e4efb68a331fcfe
4 changes: 2 additions & 2 deletions python/pyspark/ml/classification.py
@@ -159,7 +159,7 @@ def getThreshold(self):
"""
Gets the value of threshold or attempt to convert thresholds to threshold if set, or default
Contributor

I just noticed that both this and the Scala doc are inaccurate.

Scala side says:

   * Get threshold for binary classification.
   *
   * If [[threshold]] is set, returns that value.
   * Otherwise, if [[thresholds]] is set with length 2 (i.e., binary classification),
   * this returns the equivalent threshold: {{{1 / (1 + thresholds(0) / thresholds(1))}}}.
   * Otherwise, returns [[threshold]] default value.

But actually, the logic is: "if thresholds is set and has length 2, return 1 / (1 + t(0) / t(1)). Otherwise, return threshold or its default value."

Seems to me we should update both the Scala and Python docs to reflect this.
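As a rough illustration of the logic described in this comment, here is a minimal standalone sketch. It is not the PySpark implementation: `equivalent_threshold` and its arguments are hypothetical names, `None` stands in for "param not set", the 0.5 default is an assumption, and raising `ValueError` for a thresholds array of any other length is also an assumption.

```python
from typing import Optional, Sequence


def equivalent_threshold(thresholds: Optional[Sequence[float]] = None,
                         threshold: Optional[float] = None,
                         default_threshold: float = 0.5) -> float:
    """Hypothetical stand-in for the getThreshold logic discussed above."""
    if thresholds is not None:
        # Assumption: only a length-2 (binary) thresholds array can be converted.
        if len(thresholds) != 2:
            raise ValueError("expected exactly two thresholds for binary classification")
        t0, t1 = thresholds
        # The conversion quoted in the docs: 1 / (1 + thresholds(0) / thresholds(1)).
        # A zero t1 raises ZeroDivisionError; this sketch does not handle it.
        return 1.0 / (1.0 + t0 / t1)
    # Otherwise fall back to threshold, or to its (assumed) default value.
    return threshold if threshold is not None else default_threshold
```

For thresholds of the form (1 - t, t), the conversion gives 1 / (1 + (1 - t) / t) = t, so it round-trips with the (1-threshold, threshold) form mentioned in the getThresholds docstring.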

Contributor Author

Sounds good, I'll go and update both the docs.

value if neither are set.
- This conversion is equivalent to: {{{1 / (1 + thresholds(0) / thresholds(1))}}}.
+ This conversion is equivalent to: :math:`\\frac{1}{1 + \\frac{thresholds(0)}{thresholds(1)}}`.
"""
self._checkThresholdConsistency()
if self.isSet(self.thresholds):
@@ -188,7 +188,7 @@ def getThresholds(self):
If :py:attr:`thresholds` is set, return its value.
Otherwise, if :py:attr:`threshold` is set, return the equivalent thresholds for binary
classification: (1-threshold, threshold).
- If neither are set, throw an error.
+ If neither are set, return the default value.
Contributor

If neither are explicitly set, it does in fact throw an error:

In [22]: if not lr.isSet(lr.thresholds) and lr.isSet(lr.threshold):
   ....:     t = lr.getOrDefault(lr.threshold)
   ....:     [1.0-t, t]
   ....: else:
   ....:     lr.getOrDefault(lr.thresholds)
   ....:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-22-869f82439552> in <module>()
      3     [1.0-t, t]
      4 else:
----> 5     lr.getOrDefault(lr.thresholds)
      6

/Users/nick/workspace/scala/mlnick-spark/python/pyspark/ml/param/__init__.pyc in getOrDefault(self, param)
    348             return self._paramMap[param]
    349         else:
--> 350             return self._defaultParamMap[param]
    351
    352     @since("1.4.0")

KeyError: Param(parent=u'LogisticRegression_4b97b6978cdc41d90ee3', name='thresholds', doc="Thresholds in multi-class classification to adjust the probability of predicting each class. Array must have length equal to the number of classes, with values >= 0. The class with largest value p/t is predicted, where p is the original probability of that class and t is the class' threshold.")

Contributor

@MLnick MLnick Jun 3, 2016

So we should revert the doc ...

... but it does seem strange to me to throw that error. Why not if both are unset, use the default value for threshold and return the [1 - t, t]?
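To make the two options concrete, here is a small standalone sketch. It does not touch PySpark: `get_thresholds`, the `fallback_to_default` flag, and the 0.5 default are all hypothetical, and `None` stands in for "param not set". With the flag off it mimics the `KeyError` shown in the traceback above; with the flag on it follows the suggestion of deriving [1 - t, t] from the default threshold.

```python
from typing import List, Optional, Sequence


def get_thresholds(thresholds: Optional[Sequence[float]] = None,
                   threshold: Optional[float] = None,
                   default_threshold: float = 0.5,
                   fallback_to_default: bool = True) -> List[float]:
    """Hypothetical stand-in for the getThresholds behaviour under discussion."""
    if thresholds is not None:
        # thresholds is set: return it as-is.
        return list(thresholds)
    if threshold is not None:
        # Only threshold is set: return the equivalent binary thresholds.
        return [1.0 - threshold, threshold]
    if fallback_to_default:
        # Suggested behaviour: derive thresholds from the default threshold.
        return [1.0 - default_threshold, default_threshold]
    # Behaviour seen in the traceback: no value and no default for thresholds.
    raise KeyError("thresholds")
```

Calling `get_thresholds(fallback_to_default=False)` raises `KeyError`, while `get_thresholds()` returns the pair derived from the assumed default.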

Contributor Author

We should probably harmonize with the Scala side and return the default value, yes?

Contributor

I'd say yeah - cc @yanboliang thoughts?

Contributor

@yanboliang yanboliang Jun 5, 2016

+1
If both are unset, use the default threshold to produce thresholds rather than throwing an exception.

Contributor Author

Actually, now that I'm back from traveling, I think reverting the doc change is the right thing to do, since that is what the Scala side does (I misread it earlier).

Contributor

ok - we can revisit a change to behaviour in a separate PR

"""
self._checkThresholdConsistency()
if not self.isSet(self.thresholds) and self.isSet(self.threshold):