[MRG+2] Update repo to work with both new and old scikit-learn #313
Conversation
…vis job to test the old scikit-learn
Actually, tests are failing because of an LMNN problem similar to the discussion in #309: a test is run on LMNN with classes that have too few labels.

Ok, so only the Travis test with the old scikit-learn is failing, because of the string representation of estimators; I'll look into that.
```python
def remove_spaces(s):
    return re.sub(r'\s+', '', s)
```
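A quick usage sketch of this helper (the repr strings below are illustrative, not actual metric-learn output): stripping all whitespace lets the tests compare estimator reprs even though different sklearn versions wrap long parameter lists differently.

```python
import re

def remove_spaces(s):
    # collapse all whitespace so reprs wrapped differently still compare equal
    return re.sub(r'\s+', '', s)

# two reprs of the same (made-up) estimator, wrapped differently
a = "LMNN(k=3,\n     min_iter=50)"
b = "LMNN(k=3, min_iter=50)"
assert remove_spaces(a) == remove_spaces(b)
```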
```python
def sk_repr_kwargs(def_kwargs, nndef_kwargs):
```
Here I thought it could be good to test the string repr for both the old (< 0.22) sklearn versions and the newer ones, so I made a string representation that depends on the sklearn version.
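A rough sketch of the idea (the real `sk_repr_kwargs` lives in metric-learn's tests; the function name, signature, and formatting below are assumptions): scikit-learn >= 0.22 only prints parameters set to non-default values, while older versions print all of them, so the expected kwargs string in the repr depends on the installed version.

```python
def expected_kwargs_str(def_kwargs, nndef_kwargs, sklearn_at_least_0_22):
    # def_kwargs: parameters at their default values (only shown by old sklearn)
    # nndef_kwargs: parameters set to non-default values (always shown)
    if sklearn_at_least_0_22:
        def_kwargs = {}
    merged = {**def_kwargs, **nndef_kwargs}
    return ", ".join(f"{k}={v!r}" for k, v in merged.items())

print(expected_kwargs_str({"k": 3}, {"min_iter": 50}, True))   # min_iter=50
print(expected_kwargs_str({"k": 3}, {"min_iter": 50}, False))  # k=3, min_iter=50
```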
```diff
@@ -121,7 +131,8 @@ def test_array_like_inputs(estimator, build_dataset, with_preprocessor):
     # we subsample the data for the test to be more efficient
     input_data, _, labels, _ = train_test_split(input_data, labels,
                                                 train_size=20)
```
Test was failing here because of classes with too few labels in LMNN (see comments above), so in the end I created a toy example with a few more samples (which I guess makes sense, because the role of this particular test is not to check edge cases, but rather that array-like objects work with our estimators).

I guess the PR is ready to merge :)
```diff
 def test_scml(self):
-    check_estimator(SCML_Supervised)
+    check_estimator(SCML_Supervised())
```
Here scikit-learn had returned an error saying that checks should be run on estimator instances, not classes.
Just a small comment. Otherwise LGTM
```diff
@@ -39,6 +39,18 @@ matrix:
       - pytest test --cov;
     after_success:
       - bash <(curl -s https://codecov.io/bash)
+  - name: "Pytest python 3.6 with skggm + scikit-learn 0.20.3"
```
Perhaps add a note here to clarify this additional test's purpose
I agree, done
Thanks for the review @terrytangyuan! I addressed your comment.
test/test_base_metric.py (Outdated)
```python
def strify(obj):
    """Function to add the quotation marks if the object
    is a string"""
    if type(obj) is str:
```
`if isinstance(obj, str)`
Depending on the inputs we expect to pass, it might be simpler to call `json.dumps(obj)` instead.
That's right, thanks! Will do.
Actually there was a problem for booleans (`json.dumps` casts them to lowercase, i.e. `json.dumps(True)` returns `true`), but I found online that `repr` can do the same without this problem.
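To illustrate the difference being discussed (plain Python, nothing metric-learn-specific):

```python
import json

# json.dumps follows JSON syntax, so Python booleans become lowercase
print(json.dumps(True))    # true
print(repr(True))          # True

# for strings, both add quotation marks (with different quote characters)
print(json.dumps("lmnn"))  # "lmnn"
print(repr("lmnn"))        # 'lmnn'
```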
done
test/test_base_metric.py (Outdated)
""" | ||
if skversion >= version.parse('0.22.0'): | ||
def_kwargs = "" | ||
nndef_kwargs = eval(f"dict({nndef_kwargs})") |
Rather than using eval, could we pass actual dicts here instead? The values are all simple literals, so it should be easier that way.
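A small sketch of what the suggestion amounts to (the kwargs shown are made up for illustration):

```python
# building kwargs by eval-ing a string works, but is fragile and harder to read
nndef_kwargs = eval("dict(" + "num_dims=5, preprocessor=None" + ")")

# the reviewer's suggestion: pass the dict directly
nndef_kwargs_direct = {"num_dims": 5, "preprocessor": None}

assert nndef_kwargs == nndef_kwargs_direct
```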
Yes you're right, will do
done
test/test_mahalanobis_mixin.py (Outdated)
```python
from sklearn.utils.testing import set_random_state
import sklearn
from packaging import version
if version.parse(sklearn.__version__) >= version.parse('0.22.0'):
```
This seems like something we could do once during module initialization, then have a global constant like `SKLEARN_AT_LEAST_0_22` that we could use. Or make a `sklearn_shims.py` file that does the conditional import, and have everything else import from that.
I agree, will do
done
Thanks for the review @perimosocordiae! I'll make the changes and set the PR to [MRG] as soon as I finish.
```python
from sklearn.utils.estimator_checks import is_public_parameter
from sklearn.metrics.scorer import get_scorer

__all__ = ['set_random_state', 'assert_warns_message', 'set_random_state',
```
Writing these `__all__` lines was necessary for flake8 not to report errors.
@perimosocordiae I addressed your comments, I guess the PR is now ready to merge :)
+1 for merging, thanks @wdevazelhes!
Somehow we do not see all checks (it happens sometimes); maybe do an empty commit to make sure, @wdevazelhes?

Thanks for the review @bellet! I tried to do an empty commit but we still see only 1 line for the checks, indeed it's weird... Although clicking on "Details" shows all checks, so maybe GitHub changed how Travis checks are displayed?

@perimosocordiae, do you approve my updates after your review?
We don't see test coverage though, for some reason.

Merged, thanks!
Ah yes, that's right, sorry! Actually they sent me an email telling me that their bash uploader was hacked: https://about.codecov.io/security-update/
Thanks @perimosocordiae!

Good catch! They have guidelines on what to do on the above page.
I opened a draft PR here: #314, running their recommended steps. As a precaution, on app.codecov.io I clicked on "Regenerate" for the "Repository Upload Token", but I don't even think it had any effect, since I searched for "codecov" in our code and I see no token anywhere. So I think we are safe. However codecov still doesn't appear in the "all checks" of the test PR #314, so I think we can wait a little bit: they may still be fixing problems.
Fixes #311