-
Notifications
You must be signed in to change notification settings - Fork 29k
[Spark-24024][ML] Fix poisson deviance calculations in GLM to handle y = 0 #21125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
- Loading branch information
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -495,8 +495,8 @@ class GeneralizedLinearRegressionSuite extends MLTest with DefaultReadWriteTest | |
| [1] 1.8121235 -0.1747493 -0.5815417 | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you update the R script which generate the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated. The updated script is sufficient to calculate deviance on its own. |
||
| */ | ||
| val expected = Seq( | ||
| Vectors.dense(0.0, -0.0457441, -0.6833928), | ||
| Vectors.dense(1.8121235, -0.1747493, -0.5815417)) | ||
| Vectors.dense(0.0, -0.0457441, -0.6833928, 3.8093), | ||
| Vectors.dense(1.8121235, -0.1747493, -0.5815417, 3.7006)) | ||
|
|
||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Adding them to How about
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Modified. Thanks! |
||
| import GeneralizedLinearRegression._ | ||
|
|
||
|
|
@@ -507,7 +507,8 @@ class GeneralizedLinearRegressionSuite extends MLTest with DefaultReadWriteTest | |
| val trainer = new GeneralizedLinearRegression().setFamily("poisson").setLink(link) | ||
| .setFitIntercept(fitIntercept).setLinkPredictionCol("linkPrediction") | ||
| val model = trainer.fit(dataset) | ||
| val actual = Vectors.dense(model.intercept, model.coefficients(0), model.coefficients(1)) | ||
| val actual = Vectors.dense(model.intercept, model.coefficients(0), model.coefficients(1), | ||
| model.summary.deviance) | ||
| assert(actual ~= expected(idx) absTol 1e-4, "Model mismatch: GLM with poisson family, " + | ||
| s"$link link and fitIntercept = $fitIntercept (with zero values).") | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. assert(model.summary.deviance ~== residualDeviancesR(idx) absTol 1E-3) |
||
| idx += 1 | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another
ylogyimplementation inBinomial. Can you move this code toobject GeneralizedLinearRegressionand make it private to this package?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for the quick review. I have moved the
ylogimplementation toobject GeneralizedLinearRegression. One quick question here: I am not sure I have fully understood why this is the right place forylog? Thanks!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestion to avoid the duplicated code? Let's followup this later if you have an idea.