-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-18434][ML] Add missing ParamValidations for ML algos #15881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These all look good to me. @jkbradley @sethah et al?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just a switch/case is simpler? I don't feel strongly though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we not use a param validator here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we can get an error message by using ParamValidator. Anyway, it can be helpful to inform users about the available options in the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a reasonable argument, but everywhere else we use a param validator and list the valid options in the param doc, AFAIK. If we want to change this, it seems like a discussion that should be had somewhere else, and we should stick to the current convention here. I don't see a good argument for changing the convention in one place, but not all places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reply. I assume solver is different from other params as it's inherited from a trait where we cannot define valid options. I don't mean to change any convention here. For this case, I simply mean we can add the available options in the error message in the require.
And surely we can discuss how to improve ParamValidator elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point about the param definition being inherited, I had overlooked that. I think it's ok to use a require here then. I would prefer Set("auto", ...).contains but it's a very minor issue.
|
Test build #68613 has finished for PR 15881 at commit
|
77658b7 to
587fb9c
Compare
|
Thanks for reviewing. I update the checking in LiR's |
hhbyyh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| */ | ||
| final val minInfoGain: DoubleParam = new DoubleParam(this, "minInfoGain", | ||
| "Minimum information gain for a split to be considered at a tree node.") | ||
| "Minimum information gain for a split to be considered at a tree node.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to miss this. Perhaps add >=0 at the end to be consistent with others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌, I will add it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I think it's better to not add >=0 here, because all params in treeParams.scala don't add it.
|
Test build #68648 has finished for PR 15881 at commit
|
|
Thanks for taking this on :) I did a quick skim and these all look pretty reasonable (in some cases we are directly calling an old mllib algorithm which would do the validation at training time and the values seem to line up). |
|
LGTM, thanks to all. Merged into master and branch-2.1. |
## What changes were proposed in this pull request? Add missing ParamValidations for ML algos ## How was this patch tested? existing tests Author: Zheng RuiFeng <[email protected]> Closes #15881 from zhengruifeng/arg_checking. (cherry picked from commit c68f1a3) Signed-off-by: Yanbo Liang <[email protected]>
## What changes were proposed in this pull request? Add missing ParamValidations for ML algos ## How was this patch tested? existing tests Author: Zheng RuiFeng <[email protected]> Closes apache#15881 from zhengruifeng/arg_checking.
What changes were proposed in this pull request?
Add missing ParamValidations for ML algos
How was this patch tested?
existing tests