-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-14104][PYSPARK][ML] All Python param setters should use the _set method
#11939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
3598c9f
436745f
ea02225
0c0fc63
c864597
8079c11
37b9ac5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -444,7 +444,14 @@ def _setDefault(self, **kwargs): | |
| Sets default params. | ||
| """ | ||
| for param, value in kwargs.items(): | ||
| self._defaultParamMap[getattr(self, param)] = value | ||
| p = getattr(self, param) | ||
|
||
| if value is not None: | ||
| try: | ||
| value = p.typeConverter(value) | ||
| except TypeError as e: | ||
| raise TypeError('Invalid default param value given for param "%s". %s' | ||
| % (p.name, e)) | ||
| self._defaultParamMap[p] = value | ||
| return self | ||
|
|
||
| def _copyValues(self, to, extra=None): | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the change to
_setDefault, I had to change this default to be a list instead of JavaObject. The other option would be to have type converters do nothing if they encounter JavaObjects. It is nice to leave stop words as a JavaObject if they are never accessed explicitly on the Python side. Would appreciate thoughts on this problem.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we ever explicitly access them on the Python side - although a users application might attempt to do that and append stop words to the existing list in which case having it as a list is maybe good. One could get a similar effect by changing getStopWords without having to round trip the list in cases where it isn't ever accessed on the python side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is simple to make this change, so I think it's a good idea. This will help in the future for similar cases or if the list of stopwords grows even larger. I changed
getStopWordsto return a list always which is better for users, I think. Thanks for the suggestion!