Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dask] [ci] add support for scikit-learn 0.24+ in tests (fixes #4031) #4032

Merged
merged 8 commits into from
Mar 2, 2021

Conversation

jameslamb
Copy link
Collaborator

Fixes #4031. In scikit-learn 0.24.x, support for passing a class (instead of an instance) to sklearn.utils.estimator_checks.check_parameters_default_constructible() was removed (scikit-learn/scikit-learn#17134).

We must have recently started getting scikit-learn >=0.24.0 in tests, because CI has been failing on this test with this error

E TypeError: type() takes 1 or 3 arguments

This error only shows up for the Dask estimators, because we call check_parameters_default_constructible() directly there. The equivalent test in tests/python_package_test/test_sklearn.py isn't failing because it's skipped on sklearn>=0.24.0 (

sk_version >= parse_version("0.24"),
) and instead run as part of the list of tests run with sklearn.utils.estimator_checks.parametrize_with_checks():
@parametrize_with_checks(list(_tested_estimators()))
.

if sk_version > parse_version("0.23"):
Estimator = estimator
else:
Estimator = estimator.__class__
sklearn_checks.check_parameters_default_constructible(name, Estimator)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did try just passing an instance always, so we could avoid parsing the scikit-learn version, but it looks like scikit-learn 0.22.x only supported passing a class in this method 😆

Here's what I got changing the code to the following on scikit-learn==0.22.0

@pytest.mark.parametrize("estimator", list(_tested_estimators()))
def test_parameters_default_constructible(estimator):
    name = estimator.__class__.__name__
    sklearn_checks.check_parameters_default_constructible(name, estimator)

E TypeError: 'DaskLGBMRegressor' object is not callable

@StrikerRUS
Copy link
Collaborator

0.24.0
Dec 22, 2020
https://pypi.org/project/scikit-learn/0.24.0/#history

"Fast" update, conda! 👍 😄

But at anaconda.org it is still shown that the latest scikit-learn version is 0.23.2:

image

https://anaconda.org/anaconda/scikit-learn

I don't understand why we get 0.24.1 version. So weird!

@StrikerRUS
Copy link
Collaborator

I hope some time we just drop all workarounds and leave only unified version of tests that assumes we use > 0.24.0.

tests/python_package_test/test_dask.py Outdated Show resolved Hide resolved
@jameslamb jameslamb requested a review from StrikerRUS March 1, 2021 14:25
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! Many thanks!

@jameslamb
Copy link
Collaborator Author

jameslamb commented Mar 1, 2021

ugh, more MiKTeX issues. this one is new.

The downloaded binary packages are in
C:\Users\runneradmin\AppData\Local\Temp\RtmpKM9jGl\downloaded_packages
Downloading https://github.com/microsoft/LightGBM/releases/download/v2.0.12/miktexsetup-4.0-x64.zip
Setting up MiKTeX
miktexsetup.exe: The remote package repository is not online. You have to choose another repository.
miktexsetup.exe: Data: url="https://ctan.math.illinois.edu/systems/win32/miktex/tm/packages/"
Error: Process completed with exit code -1.

I can see that that isn't true, and wasn't true the last time I restarted the failing Windows R jobs about 30 minutes ago.

image

I'm going to try upgrading to miktex-setup 4.1 (https://ctan.math.illinois.edu/systems/win32/miktex/setup/windows-x86/). It looks like a new minor version was released a few weeks ago.

https://ctan.math.illinois.edu/systems/win32/miktex/setup/windows-x86/

image

I'll just do that on this PR, since it's blocking anyway.

@jameslamb jameslamb requested a review from Laurae2 as a code owner March 1, 2021 15:57
@jameslamb jameslamb mentioned this pull request Mar 1, 2021
7 tasks
@jameslamb
Copy link
Collaborator Author

upgrading MikTeX did not work. Same error as before.

https://github.com/microsoft/LightGBM/pull/4032/checks?check_run_id=2005836771

The downloaded binary packages are in
C:\Users\runneradmin\AppData\Local\Temp\RtmpcJrRkF\downloaded_packages
Downloading https://github.com/microsoft/LightGBM/releases/download/v2.0.12/miktexsetup-4.1-x86.zip
Setting up MiKTeX
miktexsetup_standalone.exe: The remote package repository is not online. You have to choose another repository.
miktexsetup_standalone.exe: Data: url="https://ctan.math.illinois.edu/systems/win32/miktex/tm/packages/"

But I can clearly see that that URL is responding ok.

image

and I get a 200 hitting it directly with curl -I https://ctan.math.illinois.edu/systems/win32/miktex/tm/packages/

image

@jameslamb
Copy link
Collaborator Author

There are not any open issues at https://github.com/MiKTeX/miktex/issues related to this.

The answers on https://tex.stackexchange.com/questions/251242/unable-to-connect-to-repository-in-miktex-2-9 (from a few years ago), suggest that this error isn't about the availability of any one mirror. There is an answer there with 12 upvotes that says

Sometimes the miktex maintainer disables the main server with the list of repositories e.g. when a serious bug has appeared to avoid that people download a broken package. Sometimes the main server is simply down.

In November 2019, on a related issue (MiKTeX/miktex#414 (comment)), one of the MiKTeX maintainers said the following

Try contacting the MiKTEX server at https://api2.miktex.org/

I can confirm that that is down

image

curl -I https://api2.miktex.org/

returns

HTTP/2 404 
server: Microsoft-IIS/10.0
date: Mon, 01 Mar 2021 17:21:07 GMT

This seems like an issue similar to #4005 , where all of the MiKTeX ecosystem is stuck until something central is updated :/.

I don't have time right now to investigate this further, but if it isn't resolved by tonight (it's currently 11am in my timezone), I can pick up my exploration of {tinytex} (jameslamb#40) again, as a way to hopefully eliminate our dependency on this ecosystem.

@jameslamb
Copy link
Collaborator Author

One more update. I just tried pushing a change that points our CI code at a different package repository.

MiKTeX/miktex#414 mentioned that you could use https://miktex.org/pkg/repositories to find the list of package repositories. I noticed that the math.illinois.edu mirror we use is not listed there (https://miktex.org/pkg/repositories), even though it is listed as healthy on http://dante.ctan.org/mirmon/.

I pushed 4845136 changing package repo URLs, just to check if that matters at all.

@jameslamb
Copy link
Collaborator Author

Seems this is still a problem. I found another relevant issue. The exact same thing seems to have happened in 2018. Similar response from the maintainer as we've seen in other MiKTeX issues, basically "oh this is just broken for a day or two" 😫

MiKTeX/miktex#49 (comment)

@StrikerRUS StrikerRUS merged commit 2a00b6f into microsoft:master Mar 2, 2021
@jameslamb jameslamb deleted the fix/dask-sklearn-tests branch March 2, 2021 14:41
@jameslamb
Copy link
Collaborator Author

@StrikerRUS thanks for rebuilding! I'm glad this started working again.

Once we're done with the release (#3872 ), I'll spend some time working on some options to hopefully avoid this in the future. It's awful having the whole project's CI blocked for as long as two days just because of the tools needed for the R documentation on Windows.

@StrikerRUS
Copy link
Collaborator

Great, thank you!

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[dask] [ci] scikit-learn compatibility tests failing
2 participants