-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correctly handling the case λmax = 0. #53
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -154,6 +154,24 @@ end | |
end | ||
end | ||
|
||
# Test case with zero variation in y is handled correctly | ||
function zero_variation_test() | ||
X = [ | ||
0.5472502169628388 0.37660447632078875 0.06669114126498532 0.4950818154768257; | ||
0.5142931961160688 0.520205941129849 0.4052730635141131 0.6700530909562794; | ||
0.5831846867316071 0.3174143498124731 0.772131243876973 0.03386847158881201; | ||
0.8802489459954292 0.6742158685234003 0.3849775799923969 0.7773264968613842; | ||
0.9216786846192617 0.7888303438159934 0.09788865152005011 0.34950775139369905 | ||
] | ||
y = 0.2937233091452627 .+ zeros(size(X, 1)) | ||
path = fit(LassoPath, X, y) | ||
(path.λ == eltype(path.λ)[0]) || return false | ||
(length(path.coefs.nzval) == 0) || return false | ||
return true | ||
end | ||
|
||
@test zero_variation_test() == true | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe use @test_log instead? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, any idea why the tests stopped working in julia v1.0? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree. Will implement tomorrow. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Unfortunately nope! |
||
|
||
# Test for sparse matrices | ||
|
||
# @testset "LassoPath Zero in" begin | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tricky because I think lambda is not unitless, so if it is small or not depends on the data given.
How does glmnet in R (or GLMNet.jl) handle this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I've changed the equality check to an isapprox() check is due to floating point arithmetic leading to a lambdamax that should actually be zero being very close to zero but non-zero instead. Simple example when this happens is a design matrix X with entries sampled from U[0, 1] and y a non-zero vector with identical entries.
I agree, the data could be such that lambdamax is genuinely very small but non-zero.
That said, I think it would be very rare to encounter such data in practice... especially since lambdamax (for the linear model) scales linearly with X and y, and we tend to standardise these.
I see two approaches going forward:
Keep this check as is - the case where it would fail to produce correct output basically never occurs anyway.
Revert back to the equality check. The real case in which the package failed was the case where lambdamax was exactly zero, anyway. Moreover, even if lambdamax should be zero but instead is a very small number, there is no major problem: the solver works very fast and it is clear from the output that every value of lambda yields zero active coefficients.
I'll leave it to you to decide 😃.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally: I'm not sure how glmnet in R or Julia handles this.