-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correctly handling the case λmax = 0. #53
base: master
Are you sure you want to change the base?
Conversation
Two changes: 1) Change to computeλ to ensure λmax = 0 leads to an output of [0] and not [NaN, ..., NaN]. 2) Change to fit! to ensure the case where autoλ = true and λmax = 0 is handled correctly (rather than throwing an error).
Pull Request Test Coverage Report for Build 202
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this change.
It seems like the test case is basically one where there is no variation in y
.
Do you think you could add a test for this case?
Changing spelling of 'regularisation'.
Added tests (and some more minor changes). Let me know if anything else needs done! |
@@ -209,6 +209,10 @@ const MAX_DEV_FRAC = 0.999 | |||
# Compute automatic λ values based on λmax and λminratio | |||
function computeλ(λmax, λminratio, α, nλ) | |||
λmax /= α | |||
if isapprox(λmax, 0; atol=1e-10) # then assuming λmax = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tricky because I think lambda is not unitless, so if it is small or not depends on the data given.
How does glmnet in R (or GLMNet.jl) handle this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I've changed the equality check to an isapprox() check is due to floating point arithmetic leading to a lambdamax that should actually be zero being very close to zero but non-zero instead. Simple example when this happens is a design matrix X with entries sampled from U[0, 1] and y a non-zero vector with identical entries.
I agree, the data could be such that lambdamax is genuinely very small but non-zero.
That said, I think it would be very rare to encounter such data in practice... especially since lambdamax (for the linear model) scales linearly with X and y, and we tend to standardise these.
I see two approaches going forward:
-
Keep this check as is - the case where it would fail to produce correct output basically never occurs anyway.
-
Revert back to the equality check. The real case in which the package failed was the case where lambdamax was exactly zero, anyway. Moreover, even if lambdamax should be zero but instead is a very small number, there is no major problem: the solver works very fast and it is clear from the output that every value of lambda yields zero active coefficients.
I'll leave it to you to decide 😃.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally: I'm not sure how glmnet in R or Julia handles this.
return true | ||
end | ||
|
||
@test zero_variation_test() == true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use @test_log instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, any idea why the tests stopped working in julia v1.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use @test_log instead?
I agree. Will implement tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, any idea why the tests stopped working in julia v1.0?
Unfortunately nope!
Fixes #51.
Two changes:
not [NaN, ..., NaN].
handled correctly (rather than throwing an error).