Support MPI in Dask #3831

StrikerRUS · 2021-01-24T01:39:10Z

Summary

LightGBM's Dask interface (lightgbm.dask) currently supports only socket-based training.
https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html#socket-version

Motivation

Adding this feature would allow users to train more efficiently, because LightGBM natively supports MPI.

References

#3515 (comment)

LightGBM/tests/python_package_test/test_dask.py, line 36 in da44387:

    pytest.mark.skipif(os.getenv('TASK', '') == 'mpi', reason='Fails to run with MPI interface'),

LightGBM/CMakeLists.txt, line 1 in da44387:

    OPTION(USE_MPI "Enable MPI-based parallel learning" OFF)

https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html#mpi-version
http://mpi.dask.org/en/latest/
https://blog.dask.org/2019/01/31/dask-mpi-experiment

StrikerRUS · 2021-01-24T01:40:05Z

Closed in favor of tracking this in #2302. We decided to keep all feature requests in one place.

Contributions of this feature are welcome! Please re-open this issue (or post a comment if you are not the issue's author) if you are actively working on implementing it.

jameslamb · 2021-01-24T03:48:21Z

Thanks for writing this up!

When I tried to use the existing lightgbm.dask module with a version of lightgbm built with MPI support, I found that training does not throw an error but produces an incorrect result. Messages in the logs indicated that each worker process believed it was the only one (rank: 0), and the returned booster came from a model trained on only a portion of the data.

I can provide a reproducible example with specific logs in the future; sorry that I don't have them readily available right now.

Or anyone who's interested can follow https://github.com/jameslamb/lightgbm-dask-testing and change the installation instructions to build with MPI support, based on the links in this issue's description.
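
The symptom described above (every worker believing it is the only machine) can be spotted in captured output by looking for LightGBM log lines of the form "Local rank: 0, total number of machines: 1". The following is a minimal sketch of such a check, not part of lightgbm; the function name find_isolated_workers and the expected_machines parameter are hypothetical and chosen only for illustration.

```python
import re

# Matches the line LightGBM prints when it sets up its network,
# e.g. "[LightGBM] [Info] Local rank: 0, total number of machines: 1".
_RANK_LINE = re.compile(r"Local rank: (\d+), total number of machines: (\d+)")


def find_isolated_workers(log_text, expected_machines):
    """Return (rank, num_machines) pairs that report fewer machines than expected.

    Hypothetical helper: a non-empty result suggests that workers trained
    independently instead of as one distributed job.
    """
    pairs = [(int(rank), int(n)) for rank, n in _RANK_LINE.findall(log_text)]
    return [(rank, n) for rank, n in pairs if n < expected_machines]


# Example input modeled on a two-worker run where each worker saw only itself.
captured = """\
[LightGBM] [Info] Local rank: 0, total number of machines: 1
[LightGBM] [Info] Local rank: 0, total number of machines: 1
"""
isolated = find_isolated_workers(captured, expected_machines=2)
print(isolated)  # [(0, 1), (0, 1)]
```

With a correctly configured two-worker run, one would expect each line to report "total number of machines: 2", and the function would return an empty list.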

StrikerRUS · 2021-03-27T20:04:13Z

I just tried running all Dask tests against the latest master with a version of lightgbm built with MPI support.

It seems that a lot has changed since the last update.

Short summary:

2021-03-26T22:58:55.1435822Z =========================== short test summary info ============================
2021-03-26T22:58:55.1436513Z FAILED ../tests/python_package_test/test_dask.py::test_classifier[binary-classification-dataframe-with-categorical]
2021-03-26T22:58:55.1437294Z FAILED ../tests/python_package_test/test_dask.py::test_classifier[multiclass-classification-dataframe-with-categorical]
2021-03-26T22:58:55.1437831Z FAILED ../tests/python_package_test/test_dask.py::test_regressor[scipy_csr_matrix]
2021-03-26T22:58:55.1438502Z FAILED ../tests/python_package_test/test_dask.py::test_regressor[dataframe-with-categorical]
2021-03-26T22:58:55.1439237Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-binary-classification]
2021-03-26T22:58:55.1440037Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-multiclass-classification]
2021-03-26T22:58:55.1440817Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-regression]
2021-03-26T22:58:55.1441557Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-ranking]
2021-03-26T22:58:55.1442339Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[scipy_csr_matrix-binary-classification]
2021-03-26T22:58:55.1443149Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[scipy_csr_matrix-multiclass-classification]
2021-03-26T22:58:55.1443951Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[scipy_csr_matrix-regression]
2021-03-26T22:58:55.1444743Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-binary-classification]
2021-03-26T22:58:55.1445636Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-multiclass-classification]
2021-03-26T22:58:55.1446417Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-regression]
2021-03-26T22:58:55.1447157Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-ranking]
2021-03-26T22:58:55.1447998Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-binary-classification]
2021-03-26T22:58:55.1448857Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-multiclass-classification]
2021-03-26T22:58:55.1449765Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-regression]
2021-03-26T22:58:55.1450578Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-ranking]
2021-03-26T22:58:55.1451151Z = 19 failed, 367 passed, 7 skipped, 2 xfailed, 279 warnings in 665.52s (0:11:05) =
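
The 19 failures in the summary above fall into three test functions: 2 in test_classifier, 2 in test_regressor, and 15 in test_machines_should_be_used_if_provided. A small sketch of how such a summary can be grouped automatically is shown here; the helper name count_failures_by_test is hypothetical, and the regular expression assumes pytest's usual "FAILED path::test_name[param-id]" line format.

```python
import re
from collections import Counter

# Captures the test function name from pytest short-summary lines such as
# "FAILED path/to/test_file.py::test_name[param-id]".
# Any prefix (for example a CI timestamp) is ignored because findall scans the text.
_FAILED = re.compile(r"FAILED \S+::(\w+)")


def count_failures_by_test(summary_text):
    """Count FAILED entries per test function, ignoring parametrization IDs."""
    return Counter(_FAILED.findall(summary_text))


# A shortened excerpt of the summary, used here only as sample input.
sample = """\
FAILED ../tests/python_package_test/test_dask.py::test_classifier[binary-classification-dataframe-with-categorical]
FAILED ../tests/python_package_test/test_dask.py::test_regressor[scipy_csr_matrix]
FAILED ../tests/python_package_test/test_dask.py::test_regressor[dataframe-with-categorical]
"""
counts = count_failures_by_test(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Grouping by function name makes it easier to see that most failures come from a single parametrized test rather than from many unrelated ones.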

Full testing logs:

2021-03-26T22:47:51.1373972Z ============================= test session starts ==============================
2021-03-26T22:47:51.1375603Z platform linux -- Python 3.8.8, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
2021-03-26T22:47:51.1376264Z rootdir: /__w/1/s
2021-03-26T22:47:51.1376785Z collected 395 items
2021-03-26T22:47:51.1377017Z 
2021-03-26T22:47:52.3656107Z ../tests/python_package_test/test_basic.py .........................     [  6%]
2021-03-26T22:47:59.8886337Z ../tests/python_package_test/test_consistency.py ......                  [  7%]
2021-03-26T22:49:57.3844912Z ../tests/python_package_test/test_dask.py ...F...F...........F.F........ [ 15%]
2021-03-26T22:55:27.1210953Z ...................................................s...............s.... [ 33%]
2021-03-26T22:58:11.5637273Z ....FFFFFFFsFFFFFFFF.....................s.................              [ 48%]
2021-03-26T22:58:11.5651450Z ../tests/python_package_test/test_dual.py s                              [ 48%]
2021-03-26T22:58:14.6851199Z ../tests/python_package_test/test_engine.py ............................ [ 55%]
2021-03-26T22:58:49.2158790Z .........................................                                [ 66%]
2021-03-26T22:58:49.9193008Z ../tests/python_package_test/test_plotting.py .....                      [ 67%]
2021-03-26T22:58:51.5285006Z ../tests/python_package_test/test_sklearn.py ........................... [ 74%]
2021-03-26T22:58:54.0446909Z ......x.............................................x................... [ 92%]
2021-03-26T22:58:54.9952924Z .......................ss...                                             [ 99%]
2021-03-26T22:58:55.0171532Z ../tests/python_package_test/test_utilities.py .                         [100%]
2021-03-26T22:58:55.0172521Z 
2021-03-26T22:58:55.0173223Z =================================== FAILURES ===================================
2021-03-26T22:58:55.0178689Z ______ test_classifier[binary-classification-dataframe-with-categorical] _______
2021-03-26T22:58:55.0179456Z 
2021-03-26T22:58:55.0183552Z output = 'dataframe-with-categorical', task = 'binary-classification'
2021-03-26T22:58:55.0185379Z client = <Client: 'tcp://127.0.0.1:42153' processes=2 threads=2, memory=16.70 GB>
2021-03-26T22:58:55.0186092Z 
2021-03-26T22:58:55.0187164Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0188411Z     @pytest.mark.parametrize('task', ['binary-classification', 'multiclass-classification'])
2021-03-26T22:58:55.0189630Z     def test_classifier(output, task, client):
2021-03-26T22:58:55.0190375Z         X, y, w, _, dX, dy, dw, _ = _create_data(
2021-03-26T22:58:55.0191094Z             objective=task,
2021-03-26T22:58:55.0191826Z             output=output
2021-03-26T22:58:55.0192382Z         )
2021-03-26T22:58:55.0195859Z     
2021-03-26T22:58:55.0196333Z         params = {
2021-03-26T22:58:55.0196884Z             "n_estimators": 50,
2021-03-26T22:58:55.0197518Z             "num_leaves": 31
2021-03-26T22:58:55.0198098Z         }
2021-03-26T22:58:55.0198599Z     
2021-03-26T22:58:55.0199195Z         dask_classifier = lgb.DaskLGBMClassifier(
2021-03-26T22:58:55.0199847Z             client=client,
2021-03-26T22:58:55.0200437Z             time_out=5,
2021-03-26T22:58:55.0201026Z             **params
2021-03-26T22:58:55.0201572Z         )
2021-03-26T22:58:55.0202193Z         dask_classifier = dask_classifier.fit(dX, dy, sample_weight=dw)
2021-03-26T22:58:55.0202947Z         p1 = dask_classifier.predict(dX)
2021-03-26T22:58:55.0203674Z         p1_proba = dask_classifier.predict_proba(dX).compute()
2021-03-26T22:58:55.0204461Z         p1_pred_leaf = dask_classifier.predict(dX, pred_leaf=True)
2021-03-26T22:58:55.0205197Z         p1_local = dask_classifier.to_local().predict(X)
2021-03-26T22:58:55.0205926Z         s1 = _accuracy_score(dy, p1)
2021-03-26T22:58:55.0206564Z         p1 = p1.compute()
2021-03-26T22:58:55.0207086Z     
2021-03-26T22:58:55.0207664Z         local_classifier = lgb.LGBMClassifier(**params)
2021-03-26T22:58:55.0208375Z         local_classifier.fit(X, y, sample_weight=w)
2021-03-26T22:58:55.0209011Z         p2 = local_classifier.predict(X)
2021-03-26T22:58:55.0209633Z         p2_proba = local_classifier.predict_proba(X)
2021-03-26T22:58:55.0210315Z         s2 = local_classifier.score(X, y)
2021-03-26T22:58:55.0210953Z     
2021-03-26T22:58:55.0211496Z         assert_eq(s1, s2)
2021-03-26T22:58:55.0212107Z         assert_eq(p1, p2)
2021-03-26T22:58:55.0212680Z         assert_eq(y, p1)
2021-03-26T22:58:55.0213289Z         assert_eq(y, p2)
2021-03-26T22:58:55.0213902Z >       assert_eq(p1_proba, p2_proba, atol=0.01)
2021-03-26T22:58:55.0214322Z 
2021-03-26T22:58:55.0214869Z ../tests/python_package_test/test_dask.py:269: 
2021-03-26T22:58:55.0215555Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2021-03-26T22:58:55.0215994Z 
2021-03-26T22:58:55.0216440Z a = array([[0.00325882, 0.99674118],
2021-03-26T22:58:55.0217032Z        [0.99325411, 0.00674589],
2021-03-26T22:58:55.0217805Z        [0.012703  , 0.987297  ],
2021-03-26T22:58:55.0218299Z        ...,
2021-03-26T22:58:55.0218768Z        [0.9968851 , 0.0031149 ],
2021-03-26T22:58:55.0219295Z        [0.99687036, 0.00312964],
2021-03-26T22:58:55.0219846Z        [0.00338618, 0.99661382]])
2021-03-26T22:58:55.0220369Z b = array([[0.00325299, 0.99674701],
2021-03-26T22:58:55.0220912Z        [0.99675091, 0.00324909],
2021-03-26T22:58:55.0221423Z        [0.00325299, 0.99674701],
2021-03-26T22:58:55.0221888Z        ...,
2021-03-26T22:58:55.0222365Z        [0.99675091, 0.00324909],
2021-03-26T22:58:55.0223088Z        [0.99675091, 0.00324909],
2021-03-26T22:58:55.0223635Z        [0.00325299, 0.99674701]])
2021-03-26T22:58:55.0224259Z check_shape = True, check_graph = True, check_meta = True, check_chunks = True
2021-03-26T22:58:55.0225465Z kwargs = {'atol': 0.01}
2021-03-26T22:58:55.0226232Z a_original = array([[0.00325882, 0.99674118],
2021-03-26T22:58:55.0226841Z        [0.99325411, 0.00674589],
2021-03-26T22:58:55.0227379Z        [0.012703  , 0.987297  ],
2021-03-26T22:58:55.0227938Z        ...,
2021-03-26T22:58:55.0228415Z        [0.9968851 , 0.0031149 ],
2021-03-26T22:58:55.0228925Z        [0.99687036, 0.00312964],
2021-03-26T22:58:55.0229465Z        [0.00338618, 0.99661382]])
2021-03-26T22:58:55.0230063Z b_original = array([[0.00325299, 0.99674701],
2021-03-26T22:58:55.0230672Z        [0.99675091, 0.00324909],
2021-03-26T22:58:55.0231225Z        [0.00325299, 0.99674701],
2021-03-26T22:58:55.0231705Z        ...,
2021-03-26T22:58:55.0232345Z        [0.99675091, 0.00324909],
2021-03-26T22:58:55.0233025Z        [0.99675091, 0.00324909],
2021-03-26T22:58:55.0233746Z        [0.00325299, 0.99674701]])
2021-03-26T22:58:55.0235081Z adt = dtype('float64'), a_meta = None, a_computed = None, bdt = dtype('float64')
2021-03-26T22:58:55.0235748Z b_meta = None
2021-03-26T22:58:55.0236029Z 
2021-03-26T22:58:55.0236481Z     def assert_eq(
2021-03-26T22:58:55.0236992Z         a,
2021-03-26T22:58:55.0237487Z         b,
2021-03-26T22:58:55.0238018Z         check_shape=True,
2021-03-26T22:58:55.0238572Z         check_graph=True,
2021-03-26T22:58:55.0239135Z         check_meta=True,
2021-03-26T22:58:55.0239721Z         check_chunks=True,
2021-03-26T22:58:55.0240269Z         **kwargs,
2021-03-26T22:58:55.0240750Z     ):
2021-03-26T22:58:55.0241253Z         a_original = a
2021-03-26T22:58:55.0241766Z         b_original = b
2021-03-26T22:58:55.0242235Z     
2021-03-26T22:58:55.0242784Z         a, adt, a_meta, a_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0243597Z             a, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0244290Z         )
2021-03-26T22:58:55.0244881Z         b, bdt, b_meta, b_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0245768Z             b, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0246460Z         )
2021-03-26T22:58:55.0246912Z     
2021-03-26T22:58:55.0247423Z         if str(adt) != str(bdt):
2021-03-26T22:58:55.0248143Z             # Ignore check for matching length of flexible dtypes, since Array._meta
2021-03-26T22:58:55.0249251Z             # can't encode that information
2021-03-26T22:58:55.0250074Z             if adt.type == bdt.type and not (adt.type == np.bytes_ or adt.type == np.str_):
2021-03-26T22:58:55.0251219Z                 diff = difflib.ndiff(str(adt).splitlines(), str(bdt).splitlines())
2021-03-26T22:58:55.0252004Z                 raise AssertionError(
2021-03-26T22:58:55.0252747Z                     "string repr are different" + os.linesep + os.linesep.join(diff)
2021-03-26T22:58:55.0253496Z                 )
2021-03-26T22:58:55.0254015Z     
2021-03-26T22:58:55.0254516Z         try:
2021-03-26T22:58:55.0255035Z             assert (
2021-03-26T22:58:55.0255581Z                 a.shape == b.shape
2021-03-26T22:58:55.0256295Z             ), f"a and b have different shapes (a: {a.shape}, b: {b.shape})"
2021-03-26T22:58:55.0257105Z             if check_meta:
2021-03-26T22:58:55.0257857Z                 if hasattr(a, "_meta") and hasattr(b, "_meta"):
2021-03-26T22:58:55.0259131Z                     assert_eq(a._meta, b._meta)
2021-03-26T22:58:55.0259863Z                 if hasattr(a_original, "_meta"):
2021-03-26T22:58:55.0260505Z                     msg = (
2021-03-26T22:58:55.0261598Z                         f"compute()-ing 'a' changes its number of dimensions "
2021-03-26T22:58:55.0262437Z                         f"(before: {a_original._meta.ndim}, after: {a.ndim})"
2021-03-26T22:58:55.0263294Z                     )
2021-03-26T22:58:55.0264030Z                     assert a_original._meta.ndim == a.ndim, msg
2021-03-26T22:58:55.0264820Z                     if a_meta is not None:
2021-03-26T22:58:55.0265412Z                         msg = (
2021-03-26T22:58:55.0266569Z                             f"compute()-ing 'a' changes its type "
2021-03-26T22:58:55.0267572Z                             f"(before: {type(a_original._meta)}, after: {type(a_meta)})"
2021-03-26T22:58:55.0268285Z                         )
2021-03-26T22:58:55.0268883Z                         assert type(a_original._meta) == type(a_meta), msg
2021-03-26T22:58:55.0269654Z                         if not (np.isscalar(a_meta) or np.isscalar(a_computed)):
2021-03-26T22:58:55.0270405Z                             msg = (
2021-03-26T22:58:55.0271551Z                                 f"compute()-ing 'a' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0272712Z                                 f"(meta: {type(a_meta)}, computed: {type(a_computed)})"
2021-03-26T22:58:55.0273645Z                             )
2021-03-26T22:58:55.0274286Z                             assert type(a_meta) == type(a_computed), msg
2021-03-26T22:58:55.0275086Z                 if hasattr(b_original, "_meta"):
2021-03-26T22:58:55.0275734Z                     msg = (
2021-03-26T22:58:55.0276824Z                         f"compute()-ing 'b' changes its number of dimensions "
2021-03-26T22:58:55.0277965Z                         f"(before: {b_original._meta.ndim}, after: {b.ndim})"
2021-03-26T22:58:55.0278742Z                     )
2021-03-26T22:58:55.0279400Z                     assert b_original._meta.ndim == b.ndim, msg
2021-03-26T22:58:55.0280227Z                     if b_meta is not None:
2021-03-26T22:58:55.0280880Z                         msg = (
2021-03-26T22:58:55.0281873Z                             f"compute()-ing 'b' changes its type "
2021-03-26T22:58:55.0282709Z                             f"(before: {type(b_original._meta)}, after: {type(b_meta)})"
2021-03-26T22:58:55.0283441Z                         )
2021-03-26T22:58:55.0284105Z                         assert type(b_original._meta) == type(b_meta), msg
2021-03-26T22:58:55.0284970Z                         if not (np.isscalar(b_meta) or np.isscalar(b_computed)):
2021-03-26T22:58:55.0285750Z                             msg = (
2021-03-26T22:58:55.0286985Z                                 f"compute()-ing 'b' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0287978Z                                 f"(meta: {type(b_meta)}, computed: {type(b_computed)})"
2021-03-26T22:58:55.0288685Z                             )
2021-03-26T22:58:55.0289316Z                             assert type(b_meta) == type(b_computed), msg
2021-03-26T22:58:55.0290510Z             msg = "found values in 'a' and 'b' which differ by more than the allowed amount"
2021-03-26T22:58:55.0291333Z >           assert allclose(a, b, **kwargs), msg
2021-03-26T22:58:55.0292450Z E           AssertionError: found values in 'a' and 'b' which differ by more than the allowed amount
2021-03-26T22:58:55.0293192Z 
2021-03-26T22:58:55.0294387Z /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/dask/array/utils.py:340: AssertionError
2021-03-26T22:58:55.0295671Z ---------------------------- Captured stderr setup -----------------------------
2021-03-26T22:58:55.0297177Z distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
2021-03-26T22:58:55.0298599Z distributed.scheduler - INFO - Clear task state
2021-03-26T22:58:55.0299745Z distributed.scheduler - INFO -   Scheduler at:     tcp://127.0.0.1:42153
2021-03-26T22:58:55.0300961Z distributed.scheduler - INFO -   dashboard at:            127.0.0.1:8787
2021-03-26T22:58:55.0302146Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:42895
2021-03-26T22:58:55.0303493Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:42895
2021-03-26T22:58:55.0304844Z distributed.worker - INFO -          dashboard at:            127.0.0.1:34601
2021-03-26T22:58:55.0306105Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:42153
2021-03-26T22:58:55.0307316Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0308614Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0309768Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:33801
2021-03-26T22:58:55.0310874Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0312018Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:33801
2021-03-26T22:58:55.0313542Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-db7bb05f-965e-4bc8-87d9-4dcacb2dacce/dask-worker-space/worker-jvlnksrt
2021-03-26T22:58:55.0315011Z distributed.worker - INFO -          dashboard at:            127.0.0.1:35153
2021-03-26T22:58:55.0316478Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0317649Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:42153
2021-03-26T22:58:55.0318850Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0320064Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0321323Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0322863Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-e0079976-e7f0-4449-b050-f779fb7f9b1a/dask-worker-space/worker-1wqn8enl
2021-03-26T22:58:55.0324430Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0326098Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:42895', name: tcp://127.0.0.1:42895, memory: 0, processing: 0>
2021-03-26T22:58:55.0327750Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:42895
2021-03-26T22:58:55.0328920Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0330070Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:42153
2021-03-26T22:58:55.0331329Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0332432Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0334055Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:33801', name: tcp://127.0.0.1:33801, memory: 0, processing: 0>
2021-03-26T22:58:55.0335728Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:33801
2021-03-26T22:58:55.0336924Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0338137Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:42153
2021-03-26T22:58:55.0339736Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0341089Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0342827Z distributed.scheduler - INFO - Receive client connection: Client-581105ed-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0344111Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0345271Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0346048Z Finding random open ports for workers
2021-03-26T22:58:55.0346773Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0347613Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0348937Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0350414Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0351747Z ----------------------------- Captured stderr call -----------------------------
2021-03-26T22:58:55.0353014Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0354122Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0355443Z --------------------------- Captured stderr teardown ---------------------------
2021-03-26T22:58:55.0356647Z distributed.scheduler - INFO - Remove client Client-581105ed-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0358007Z distributed.scheduler - INFO - Remove client Client-581105ed-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0359294Z distributed.scheduler - INFO - Close client connection: Client-581105ed-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0361154Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:42895
2021-03-26T22:58:55.0362450Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:33801
2021-03-26T22:58:55.0364313Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:42895', name: tcp://127.0.0.1:42895, memory: 0, processing: 0>
2021-03-26T22:58:55.0366599Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:42895
2021-03-26T22:58:55.0368399Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:33801', name: tcp://127.0.0.1:33801, memory: 0, processing: 0>
2021-03-26T22:58:55.0369874Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:33801
2021-03-26T22:58:55.0370751Z distributed.scheduler - INFO - Lost all workers
2021-03-26T22:58:55.0371680Z distributed.scheduler - INFO - Scheduler closing...
2021-03-26T22:58:55.0372666Z distributed.scheduler - INFO - Scheduler closing all comms
2021-03-26T22:58:55.0373725Z ____ test_classifier[multiclass-classification-dataframe-with-categorical] _____
2021-03-26T22:58:55.0374210Z 
2021-03-26T22:58:55.0375098Z output = 'dataframe-with-categorical', task = 'multiclass-classification'
2021-03-26T22:58:55.0376248Z client = <Client: 'tcp://127.0.0.1:36825' processes=2 threads=2, memory=16.70 GB>
2021-03-26T22:58:55.0376812Z 
2021-03-26T22:58:55.0377622Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0378695Z     @pytest.mark.parametrize('task', ['binary-classification', 'multiclass-classification'])
2021-03-26T22:58:55.0379486Z     def test_classifier(output, task, client):
2021-03-26T22:58:55.0380149Z         X, y, w, _, dX, dy, dw, _ = _create_data(
2021-03-26T22:58:55.0380716Z             objective=task,
2021-03-26T22:58:55.0381239Z             output=output
2021-03-26T22:58:55.0381713Z         )
2021-03-26T22:58:55.0382136Z     
2021-03-26T22:58:55.0382541Z         params = {
2021-03-26T22:58:55.0383106Z             "n_estimators": 50,
2021-03-26T22:58:55.0383494Z             "num_leaves": 31
2021-03-26T22:58:55.0383959Z         }
2021-03-26T22:58:55.0384370Z     
2021-03-26T22:58:55.0384887Z         dask_classifier = lgb.DaskLGBMClassifier(
2021-03-26T22:58:55.0385463Z             client=client,
2021-03-26T22:58:55.0385957Z             time_out=5,
2021-03-26T22:58:55.0386460Z             **params
2021-03-26T22:58:55.0386951Z         )
2021-03-26T22:58:55.0387580Z         dask_classifier = dask_classifier.fit(dX, dy, sample_weight=dw)
2021-03-26T22:58:55.0388309Z         p1 = dask_classifier.predict(dX)
2021-03-26T22:58:55.0389022Z         p1_proba = dask_classifier.predict_proba(dX).compute()
2021-03-26T22:58:55.0390278Z         p1_pred_leaf = dask_classifier.predict(dX, pred_leaf=True)
2021-03-26T22:58:55.0391049Z         p1_local = dask_classifier.to_local().predict(X)
2021-03-26T22:58:55.0391742Z         s1 = _accuracy_score(dy, p1)
2021-03-26T22:58:55.0392349Z         p1 = p1.compute()
2021-03-26T22:58:55.0392868Z     
2021-03-26T22:58:55.0393448Z         local_classifier = lgb.LGBMClassifier(**params)
2021-03-26T22:58:55.0394169Z         local_classifier.fit(X, y, sample_weight=w)
2021-03-26T22:58:55.0394855Z         p2 = local_classifier.predict(X)
2021-03-26T22:58:55.0395519Z         p2_proba = local_classifier.predict_proba(X)
2021-03-26T22:58:55.0396208Z         s2 = local_classifier.score(X, y)
2021-03-26T22:58:55.0396776Z     
2021-03-26T22:58:55.0397479Z >       assert_eq(s1, s2)
2021-03-26T22:58:55.0397824Z 
2021-03-26T22:58:55.0398436Z ../tests/python_package_test/test_dask.py:265: 
2021-03-26T22:58:55.0399261Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2021-03-26T22:58:55.0400090Z 
2021-03-26T22:58:55.0400725Z a = 0.998, b = 1.0, check_shape = True, check_graph = True, check_meta = True
2021-03-26T22:58:55.0401558Z check_chunks = True, kwargs = {}, a_original = 0.998, b_original = 1.0
2021-03-26T22:58:55.0402993Z adt = dtype('float64'), a_meta = None, a_computed = None, bdt = dtype('float64')
2021-03-26T22:58:55.0403702Z b_meta = None
2021-03-26T22:58:55.0403997Z 
2021-03-26T22:58:55.0404481Z     def assert_eq(
2021-03-26T22:58:55.0404979Z         a,
2021-03-26T22:58:55.0405463Z         b,
2021-03-26T22:58:55.0405989Z         check_shape=True,
2021-03-26T22:58:55.0406559Z         check_graph=True,
2021-03-26T22:58:55.0407242Z         check_meta=True,
2021-03-26T22:58:55.0407820Z         check_chunks=True,
2021-03-26T22:58:55.0408369Z         **kwargs,
2021-03-26T22:58:55.0408855Z     ):
2021-03-26T22:58:55.0409368Z         a_original = a
2021-03-26T22:58:55.0409927Z         b_original = b
2021-03-26T22:58:55.0410427Z     
2021-03-26T22:58:55.0411022Z         a, adt, a_meta, a_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0411851Z             a, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0412547Z         )
2021-03-26T22:58:55.0413142Z         b, bdt, b_meta, b_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0413952Z             b, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0414646Z         )
2021-03-26T22:58:55.0415118Z     
2021-03-26T22:58:55.0415653Z         if str(adt) != str(bdt):
2021-03-26T22:58:55.0416409Z             # Ignore check for matching length of flexible dtypes, since Array._meta
2021-03-26T22:58:55.0417560Z             # can't encode that information
2021-03-26T22:58:55.0418444Z             if adt.type == bdt.type and not (adt.type == np.bytes_ or adt.type == np.str_):
2021-03-26T22:58:55.0419394Z                 diff = difflib.ndiff(str(adt).splitlines(), str(bdt).splitlines())
2021-03-26T22:58:55.0420141Z                 raise AssertionError(
2021-03-26T22:58:55.0420892Z                     "string repr are different" + os.linesep + os.linesep.join(diff)
2021-03-26T22:58:55.0421593Z                 )
2021-03-26T22:58:55.0422073Z     
2021-03-26T22:58:55.0422719Z         try:
2021-03-26T22:58:55.0423279Z             assert (
2021-03-26T22:58:55.0423845Z                 a.shape == b.shape
2021-03-26T22:58:55.0424633Z             ), f"a and b have different shapes (a: {a.shape}, b: {b.shape})"
2021-03-26T22:58:55.0425423Z             if check_meta:
2021-03-26T22:58:55.0426169Z                 if hasattr(a, "_meta") and hasattr(b, "_meta"):
2021-03-26T22:58:55.0426936Z                     assert_eq(a._meta, b._meta)
2021-03-26T22:58:55.0427679Z                 if hasattr(a_original, "_meta"):
2021-03-26T22:58:55.0428353Z                     msg = (
2021-03-26T22:58:55.0429482Z                         f"compute()-ing 'a' changes its number of dimensions "
2021-03-26T22:58:55.0430413Z                         f"(before: {a_original._meta.ndim}, after: {a.ndim})"
2021-03-26T22:58:55.0431163Z                     )
2021-03-26T22:58:55.0431811Z                     assert a_original._meta.ndim == a.ndim, msg
2021-03-26T22:58:55.0432539Z                     if a_meta is not None:
2021-03-26T22:58:55.0433201Z                         msg = (
2021-03-26T22:58:55.0434256Z                             f"compute()-ing 'a' changes its type "
2021-03-26T22:58:55.0435175Z                             f"(before: {type(a_original._meta)}, after: {type(a_meta)})"
2021-03-26T22:58:55.0435960Z                         )
2021-03-26T22:58:55.0436656Z                         assert type(a_original._meta) == type(a_meta), msg
2021-03-26T22:58:55.0437570Z                         if not (np.isscalar(a_meta) or np.isscalar(a_computed)):
2021-03-26T22:58:55.0438375Z                             msg = (
2021-03-26T22:58:55.0439612Z                                 f"compute()-ing 'a' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0440795Z                                 f"(meta: {type(a_meta)}, computed: {type(a_computed)})"
2021-03-26T22:58:55.0441605Z                             )
2021-03-26T22:58:55.0442318Z                             assert type(a_meta) == type(a_computed), msg
2021-03-26T22:58:55.0443115Z                 if hasattr(b_original, "_meta"):
2021-03-26T22:58:55.0443798Z                     msg = (
2021-03-26T22:58:55.0444890Z                         f"compute()-ing 'b' changes its number of dimensions "
2021-03-26T22:58:55.0445805Z                         f"(before: {b_original._meta.ndim}, after: {b.ndim})"
2021-03-26T22:58:55.0446539Z                     )
2021-03-26T22:58:55.0447336Z                     assert b_original._meta.ndim == b.ndim, msg
2021-03-26T22:58:55.0448087Z                     if b_meta is not None:
2021-03-26T22:58:55.0448747Z                         msg = (
2021-03-26T22:58:55.0449795Z                             f"compute()-ing 'b' changes its type "
2021-03-26T22:58:55.0450721Z                             f"(before: {type(b_original._meta)}, after: {type(b_meta)})"
2021-03-26T22:58:55.0451514Z                         )
2021-03-26T22:58:55.0452201Z                         assert type(b_original._meta) == type(b_meta), msg
2021-03-26T22:58:55.0453119Z                         if not (np.isscalar(b_meta) or np.isscalar(b_computed)):
2021-03-26T22:58:55.0453927Z                             msg = (
2021-03-26T22:58:55.0455168Z                                 f"compute()-ing 'b' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0456361Z                                 f"(meta: {type(b_meta)}, computed: {type(b_computed)})"
2021-03-26T22:58:55.0457082Z                             )
2021-03-26T22:58:55.0457914Z                             assert type(b_meta) == type(b_computed), msg
2021-03-26T22:58:55.0459113Z             msg = "found values in 'a' and 'b' which differ by more than the allowed amount"
2021-03-26T22:58:55.0459915Z >           assert allclose(a, b, **kwargs), msg
2021-03-26T22:58:55.0461242Z E           AssertionError: found values in 'a' and 'b' which differ by more than the allowed amount
2021-03-26T22:58:55.0461816Z 
2021-03-26T22:58:55.0463115Z /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/dask/array/utils.py:340: AssertionError
2021-03-26T22:58:55.0464445Z ---------------------------- Captured stderr setup -----------------------------
2021-03-26T22:58:55.0466025Z distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
2021-03-26T22:58:55.0467376Z distributed.scheduler - INFO - Clear task state
2021-03-26T22:58:55.0468529Z distributed.scheduler - INFO -   Scheduler at:     tcp://127.0.0.1:36825
2021-03-26T22:58:55.0469762Z distributed.scheduler - INFO -   dashboard at:            127.0.0.1:8787
2021-03-26T22:58:55.0471040Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:43775
2021-03-26T22:58:55.0472342Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:43775
2021-03-26T22:58:55.0473635Z distributed.worker - INFO -          dashboard at:            127.0.0.1:39257
2021-03-26T22:58:55.0474906Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:36825
2021-03-26T22:58:55.0476228Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0477430Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0478687Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0480279Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-38748e1d-3129-4c5d-a0c4-6a0f4fa58d48/dask-worker-space/worker-bkz6bwv4
2021-03-26T22:58:55.0481736Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0482987Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:42565
2021-03-26T22:58:55.0484434Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:42565
2021-03-26T22:58:55.0485707Z distributed.worker - INFO -          dashboard at:            127.0.0.1:34889
2021-03-26T22:58:55.0487012Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:36825
2021-03-26T22:58:55.0488331Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0489599Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0490879Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0492440Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-e73195fb-2b55-4ef5-9b8a-0b9bcf3244e8/dask-worker-space/worker-zibzdxw7
2021-03-26T22:58:55.0494234Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0495966Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:43775', name: tcp://127.0.0.1:43775, memory: 0, processing: 0>
2021-03-26T22:58:55.0497675Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:43775
2021-03-26T22:58:55.0498986Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0500154Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:36825
2021-03-26T22:58:55.0501401Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0503232Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:42565', name: tcp://127.0.0.1:42565, memory: 0, processing: 0>
2021-03-26T22:58:55.0504817Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0506032Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:42565
2021-03-26T22:58:55.0507250Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0508534Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:36825
2021-03-26T22:58:55.0509824Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0510941Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0512184Z distributed.scheduler - INFO - Receive client connection: Client-62203d5b-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0513490Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0514650Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0515446Z Finding random open ports for workers
2021-03-26T22:58:55.0516156Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0517038Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0518413Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0519921Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0521211Z ----------------------------- Captured stderr call -----------------------------
2021-03-26T22:58:55.0522488Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0523704Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0524984Z --------------------------- Captured stderr teardown ---------------------------
2021-03-26T22:58:55.0526209Z distributed.scheduler - INFO - Remove client Client-62203d5b-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0527511Z distributed.scheduler - INFO - Remove client Client-62203d5b-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0528923Z distributed.scheduler - INFO - Close client connection: Client-62203d5b-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0530262Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:42565
2021-03-26T22:58:55.0531451Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:43775
2021-03-26T22:58:55.0533334Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:42565', name: tcp://127.0.0.1:42565, memory: 0, processing: 0>
2021-03-26T22:58:55.0535007Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:42565
2021-03-26T22:58:55.0536722Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:43775', name: tcp://127.0.0.1:43775, memory: 0, processing: 0>
2021-03-26T22:58:55.0538335Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:43775
2021-03-26T22:58:55.0539379Z distributed.scheduler - INFO - Lost all workers
2021-03-26T22:58:55.0540450Z distributed.scheduler - INFO - Scheduler closing...
2021-03-26T22:58:55.0541654Z distributed.scheduler - INFO - Scheduler closing all comms
2021-03-26T22:58:55.0542522Z _______________________ test_regressor[scipy_csr_matrix] _______________________
2021-03-26T22:58:55.0543130Z 
2021-03-26T22:58:55.0543954Z output = 'scipy_csr_matrix'
2021-03-26T22:58:55.0545119Z client = <Client: 'tcp://127.0.0.1:33671' processes=2 threads=2, memory=16.70 GB>
2021-03-26T22:58:55.0545972Z 
2021-03-26T22:58:55.0546840Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0547626Z     def test_regressor(output, client):
2021-03-26T22:58:55.0548388Z         X, y, w, _, dX, dy, dw, _ = _create_data(
2021-03-26T22:58:55.0574862Z             objective='regression',
2021-03-26T22:58:55.0575917Z             output=output
2021-03-26T22:58:55.0576509Z         )
2021-03-26T22:58:55.0577043Z     
2021-03-26T22:58:55.0577545Z         params = {
2021-03-26T22:58:55.0578146Z             "random_state": 42,
2021-03-26T22:58:55.0578797Z             "num_leaves": 31,
2021-03-26T22:58:55.0579446Z             "n_estimators": 20,
2021-03-26T22:58:55.0580007Z         }
2021-03-26T22:58:55.0580493Z     
2021-03-26T22:58:55.0581075Z         dask_regressor = lgb.DaskLGBMRegressor(
2021-03-26T22:58:55.0581716Z             client=client,
2021-03-26T22:58:55.0582308Z             time_out=5,
2021-03-26T22:58:55.0583691Z             tree='data',
2021-03-26T22:58:55.0584325Z             **params
2021-03-26T22:58:55.0584846Z         )
2021-03-26T22:58:55.0585505Z         dask_regressor = dask_regressor.fit(dX, dy, sample_weight=dw)
2021-03-26T22:58:55.0586253Z         p1 = dask_regressor.predict(dX)
2021-03-26T22:58:55.0586931Z         p1_pred_leaf = dask_regressor.predict(dX, pred_leaf=True)
2021-03-26T22:58:55.0587535Z     
2021-03-26T22:58:55.0588050Z         s1 = _r2_score(dy, p1)
2021-03-26T22:58:55.0588618Z         p1 = p1.compute()
2021-03-26T22:58:55.0589236Z         p1_local = dask_regressor.to_local().predict(X)
2021-03-26T22:58:55.0589950Z         s1_local = dask_regressor.to_local().score(X, y)
2021-03-26T22:58:55.0590540Z     
2021-03-26T22:58:55.0591091Z         local_regressor = lgb.LGBMRegressor(**params)
2021-03-26T22:58:55.0591783Z         local_regressor.fit(X, y, sample_weight=w)
2021-03-26T22:58:55.0592436Z         s2 = local_regressor.score(X, y)
2021-03-26T22:58:55.0593073Z         p2 = local_regressor.predict(X)
2021-03-26T22:58:55.0593600Z     
2021-03-26T22:58:55.0594121Z         # Scores should be the same
2021-03-26T22:58:55.0594727Z >       assert_eq(s1, s2, atol=0.01)
2021-03-26T22:58:55.0595067Z 
2021-03-26T22:58:55.0595628Z ../tests/python_package_test/test_dask.py:436: 
2021-03-26T22:58:55.0596397Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2021-03-26T22:58:55.0596852Z 
2021-03-26T22:58:55.0597437Z a = 0.9631152654080015, b = 0.9769588528806643, check_shape = True
2021-03-26T22:58:55.0598161Z check_graph = True, check_meta = True, check_chunks = True
2021-03-26T22:58:55.0599312Z kwargs = {'atol': 0.01}, a_original = 0.9631152654080015
2021-03-26T22:58:55.0600412Z b_original = 0.9769588528806643, adt = dtype('float64'), a_meta = None
2021-03-26T22:58:55.0601477Z a_computed = None, bdt = dtype('float64'), b_meta = None
2021-03-26T22:58:55.0601883Z 
2021-03-26T22:58:55.0602365Z     def assert_eq(
2021-03-26T22:58:55.0603009Z         a,
2021-03-26T22:58:55.0603493Z         b,
2021-03-26T22:58:55.0604009Z         check_shape=True,
2021-03-26T22:58:55.0604568Z         check_graph=True,
2021-03-26T22:58:55.0605102Z         check_meta=True,
2021-03-26T22:58:55.0605659Z         check_chunks=True,
2021-03-26T22:58:55.0606196Z         **kwargs,
2021-03-26T22:58:55.0606675Z     ):
2021-03-26T22:58:55.0607176Z         a_original = a
2021-03-26T22:58:55.0607720Z         b_original = b
2021-03-26T22:58:55.0608194Z     
2021-03-26T22:58:55.0608776Z         a, adt, a_meta, a_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0609577Z             a, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0610256Z         )
2021-03-26T22:58:55.0610966Z         b, bdt, b_meta, b_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0611774Z             b, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0612455Z         )
2021-03-26T22:58:55.0612917Z     
2021-03-26T22:58:55.0613453Z         if str(adt) != str(bdt):
2021-03-26T22:58:55.0614213Z             # Ignore check for matching length of flexible dtypes, since Array._meta
2021-03-26T22:58:55.0615310Z             # can't encode that information
2021-03-26T22:58:55.0616092Z             if adt.type == bdt.type and not (adt.type == np.bytes_ or adt.type == np.str_):
2021-03-26T22:58:55.0616761Z                 diff = difflib.ndiff(str(adt).splitlines(), str(bdt).splitlines())
2021-03-26T22:58:55.0617336Z                 raise AssertionError(
2021-03-26T22:58:55.0617986Z                     "string repr are different" + os.linesep + os.linesep.join(diff)
2021-03-26T22:58:55.0618494Z                 )
2021-03-26T22:58:55.0619020Z     
2021-03-26T22:58:55.0619408Z         try:
2021-03-26T22:58:55.0619832Z             assert (
2021-03-26T22:58:55.0620260Z                 a.shape == b.shape
2021-03-26T22:58:55.0620871Z             ), f"a and b have different shapes (a: {a.shape}, b: {b.shape})"
2021-03-26T22:58:55.0621491Z             if check_meta:
2021-03-26T22:58:55.0622051Z                 if hasattr(a, "_meta") and hasattr(b, "_meta"):
2021-03-26T22:58:55.0622888Z                     assert_eq(a._meta, b._meta)
2021-03-26T22:58:55.0623475Z                 if hasattr(a_original, "_meta"):
2021-03-26T22:58:55.0623998Z                     msg = (
2021-03-26T22:58:55.0624879Z                         f"compute()-ing 'a' changes its number of dimensions "
2021-03-26T22:58:55.0625602Z                         f"(before: {a_original._meta.ndim}, after: {a.ndim})"
2021-03-26T22:58:55.0626204Z                     )
2021-03-26T22:58:55.0626698Z                     assert a_original._meta.ndim == a.ndim, msg
2021-03-26T22:58:55.0627257Z                     if a_meta is not None:
2021-03-26T22:58:55.0627829Z                         msg = (
2021-03-26T22:58:55.0628680Z                             f"compute()-ing 'a' changes its type "
2021-03-26T22:58:55.0629406Z                             f"(before: {type(a_original._meta)}, after: {type(a_meta)})"
2021-03-26T22:58:55.0630017Z                         )
2021-03-26T22:58:55.0630544Z                         assert type(a_original._meta) == type(a_meta), msg
2021-03-26T22:58:55.0631314Z                         if not (np.isscalar(a_meta) or np.isscalar(a_computed)):
2021-03-26T22:58:55.0631937Z                             msg = (
2021-03-26T22:58:55.0632920Z                                 f"compute()-ing 'a' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0633722Z                                 f"(meta: {type(a_meta)}, computed: {type(a_computed)})"
2021-03-26T22:58:55.0634357Z                             )
2021-03-26T22:58:55.0634894Z                             assert type(a_meta) == type(a_computed), msg
2021-03-26T22:58:55.0635495Z                 if hasattr(b_original, "_meta"):
2021-03-26T22:58:55.0636029Z                     msg = (
2021-03-26T22:58:55.0636920Z                         f"compute()-ing 'b' changes its number of dimensions "
2021-03-26T22:58:55.0637789Z                         f"(before: {b_original._meta.ndim}, after: {b.ndim})"
2021-03-26T22:58:55.0638375Z                     )
2021-03-26T22:58:55.0638892Z                     assert b_original._meta.ndim == b.ndim, msg
2021-03-26T22:58:55.0639476Z                     if b_meta is not None:
2021-03-26T22:58:55.0639988Z                         msg = (
2021-03-26T22:58:55.0640843Z                             f"compute()-ing 'b' changes its type "
2021-03-26T22:58:55.0641570Z                             f"(before: {type(b_original._meta)}, after: {type(b_meta)})"
2021-03-26T22:58:55.0642195Z                         )
2021-03-26T22:58:55.0642716Z                         assert type(b_original._meta) == type(b_meta), msg
2021-03-26T22:58:55.0643553Z                         if not (np.isscalar(b_meta) or np.isscalar(b_computed)):
2021-03-26T22:58:55.0644179Z                             msg = (
2021-03-26T22:58:55.0645196Z                                 f"compute()-ing 'b' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0646012Z                                 f"(meta: {type(b_meta)}, computed: {type(b_computed)})"
2021-03-26T22:58:55.0646630Z                             )
2021-03-26T22:58:55.0647170Z                             assert type(b_meta) == type(b_computed), msg
2021-03-26T22:58:55.0648181Z             msg = "found values in 'a' and 'b' which differ by more than the allowed amount"
2021-03-26T22:58:55.0648859Z >           assert allclose(a, b, **kwargs), msg
2021-03-26T22:58:55.0649784Z E           AssertionError: found values in 'a' and 'b' which differ by more than the allowed amount
2021-03-26T22:58:55.0650256Z 
2021-03-26T22:58:55.0651129Z /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/dask/array/utils.py:340: AssertionError
2021-03-26T22:58:55.0652146Z ---------------------------- Captured stderr setup -----------------------------
2021-03-26T22:58:55.0653347Z distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
2021-03-26T22:58:55.0654508Z distributed.scheduler - INFO - Clear task state
2021-03-26T22:58:55.0655411Z distributed.scheduler - INFO -   Scheduler at:     tcp://127.0.0.1:33671
2021-03-26T22:58:55.0656413Z distributed.scheduler - INFO -   dashboard at:            127.0.0.1:8787
2021-03-26T22:58:55.0657653Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:46647
2021-03-26T22:58:55.0658718Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:46647
2021-03-26T22:58:55.0659771Z distributed.worker - INFO -          dashboard at:            127.0.0.1:42965
2021-03-26T22:58:55.0660799Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:33671
2021-03-26T22:58:55.0661827Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:35523
2021-03-26T22:58:55.0662962Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0663979Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0664993Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:35523
2021-03-26T22:58:55.0666008Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0667124Z distributed.worker - INFO -          dashboard at:            127.0.0.1:44227
2021-03-26T22:58:55.0668427Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-d6ff5854-e6ac-4cd6-8a3e-48cb017f1be1/dask-worker-space/worker-s0npde55
2021-03-26T22:58:55.0669500Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:33671
2021-03-26T22:58:55.0670547Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0671446Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0672527Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0673727Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0674931Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-8ab394d7-56a3-487b-9919-9bf6df6db4e5/dask-worker-space/worker-2fbwsik7
2021-03-26T22:58:55.0676039Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0677376Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:46647', name: tcp://127.0.0.1:46647, memory: 0, processing: 0>
2021-03-26T22:58:55.0678728Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:46647
2021-03-26T22:58:55.0679627Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0680717Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:33671
2021-03-26T22:58:55.0681664Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0682922Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:35523', name: tcp://127.0.0.1:35523, memory: 0, processing: 0>
2021-03-26T22:58:55.0684059Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0684987Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:35523
2021-03-26T22:58:55.0685912Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0686835Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:33671
2021-03-26T22:58:55.0687767Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0688612Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0689591Z distributed.scheduler - INFO - Receive client connection: Client-7df33586-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0690517Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0691376Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0691967Z Finding random open ports for workers
2021-03-26T22:58:55.0692527Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0693554Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0694307Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0695325Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0696356Z ----------------------------- Captured stderr call -----------------------------
2021-03-26T22:58:55.0697280Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0698193Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0699099Z --------------------------- Captured stderr teardown ---------------------------
2021-03-26T22:58:55.0700084Z distributed.scheduler - INFO - Remove client Client-7df33586-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0701077Z distributed.scheduler - INFO - Remove client Client-7df33586-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0702141Z distributed.scheduler - INFO - Close client connection: Client-7df33586-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0703311Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:46647
2021-03-26T22:58:55.0704307Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:35523
2021-03-26T22:58:55.0705638Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:46647', name: tcp://127.0.0.1:46647, memory: 0, processing: 0>
2021-03-26T22:58:55.0706916Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:46647
2021-03-26T22:58:55.0708233Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:35523', name: tcp://127.0.0.1:35523, memory: 0, processing: 0>
2021-03-26T22:58:55.0709464Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:35523
2021-03-26T22:58:55.0710495Z distributed.scheduler - INFO - Lost all workers
2021-03-26T22:58:55.0711298Z distributed.scheduler - INFO - Scheduler closing...
2021-03-26T22:58:55.0712123Z distributed.scheduler - INFO - Scheduler closing all comms
2021-03-26T22:58:55.0712996Z __________________ test_regressor[dataframe-with-categorical] __________________
2021-03-26T22:58:55.0713389Z 
2021-03-26T22:58:55.0713986Z output = 'dataframe-with-categorical'
2021-03-26T22:58:55.0714954Z client = <Client: 'tcp://127.0.0.1:39473' processes=2 threads=2, memory=16.70 GB>
2021-03-26T22:58:55.0715428Z 
2021-03-26T22:58:55.0716078Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0716651Z     def test_regressor(output, client):
2021-03-26T22:58:55.0717332Z         X, y, w, _, dX, dy, dw, _ = _create_data(
2021-03-26T22:58:55.0718070Z             objective='regression',
2021-03-26T22:58:55.0718569Z             output=output
2021-03-26T22:58:55.0718992Z         )
2021-03-26T22:58:55.0719362Z     
2021-03-26T22:58:55.0719736Z         params = {
2021-03-26T22:58:55.0720206Z             "random_state": 42,
2021-03-26T22:58:55.0720701Z             "num_leaves": 31,
2021-03-26T22:58:55.0721188Z             "n_estimators": 20,
2021-03-26T22:58:55.0721631Z         }
2021-03-26T22:58:55.0721992Z     
2021-03-26T22:58:55.0722460Z         dask_regressor = lgb.DaskLGBMRegressor(
2021-03-26T22:58:55.0722980Z             client=client,
2021-03-26T22:58:55.0723417Z             time_out=5,
2021-03-26T22:58:55.0724159Z             tree='data',
2021-03-26T22:58:55.0724605Z             **params
2021-03-26T22:58:55.0725013Z         )
2021-03-26T22:58:55.0725519Z         dask_regressor = dask_regressor.fit(dX, dy, sample_weight=dw)
2021-03-26T22:58:55.0726088Z         p1 = dask_regressor.predict(dX)
2021-03-26T22:58:55.0726623Z         p1_pred_leaf = dask_regressor.predict(dX, pred_leaf=True)
2021-03-26T22:58:55.0727112Z     
2021-03-26T22:58:55.0727511Z         s1 = _r2_score(dy, p1)
2021-03-26T22:58:55.0727946Z         p1 = p1.compute()
2021-03-26T22:58:55.0728455Z         p1_local = dask_regressor.to_local().predict(X)
2021-03-26T22:58:55.0729015Z         s1_local = dask_regressor.to_local().score(X, y)
2021-03-26T22:58:55.0729472Z     
2021-03-26T22:58:55.0729905Z         local_regressor = lgb.LGBMRegressor(**params)
2021-03-26T22:58:55.0730446Z         local_regressor.fit(X, y, sample_weight=w)
2021-03-26T22:58:55.0730978Z         s2 = local_regressor.score(X, y)
2021-03-26T22:58:55.0731451Z         p2 = local_regressor.predict(X)
2021-03-26T22:58:55.0731868Z     
2021-03-26T22:58:55.0732271Z         # Scores should be the same
2021-03-26T22:58:55.0732729Z >       assert_eq(s1, s2, atol=0.01)
2021-03-26T22:58:55.0733005Z 
2021-03-26T22:58:55.0733470Z ../tests/python_package_test/test_dask.py:436: 
2021-03-26T22:58:55.0734058Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
2021-03-26T22:58:55.0734435Z 
2021-03-26T22:58:55.0734917Z a = 0.9438565250981764, b = 0.9700866728749767, check_shape = True
2021-03-26T22:58:55.0735538Z check_graph = True, check_meta = True, check_chunks = True
2021-03-26T22:58:55.0736457Z kwargs = {'atol': 0.01}, a_original = 0.9438565250981764
2021-03-26T22:58:55.0737367Z b_original = 0.9700866728749767, adt = dtype('float64'), a_meta = None
2021-03-26T22:58:55.0738190Z a_computed = None, bdt = dtype('float64'), b_meta = None
2021-03-26T22:58:55.0738537Z 
2021-03-26T22:58:55.0738910Z     def assert_eq(
2021-03-26T22:58:55.0739280Z         a,
2021-03-26T22:58:55.0739649Z         b,
2021-03-26T22:58:55.0740040Z         check_shape=True,
2021-03-26T22:58:55.0740484Z         check_graph=True,
2021-03-26T22:58:55.0740929Z         check_meta=True,
2021-03-26T22:58:55.0741372Z         check_chunks=True,
2021-03-26T22:58:55.0741785Z         **kwargs,
2021-03-26T22:58:55.0742189Z     ):
2021-03-26T22:58:55.0742751Z         a_original = a
2021-03-26T22:58:55.0743205Z         b_original = b
2021-03-26T22:58:55.0743591Z     
2021-03-26T22:58:55.0744048Z         a, adt, a_meta, a_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0744838Z             a, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0745351Z         )
2021-03-26T22:58:55.0745832Z         b, bdt, b_meta, b_computed = _get_dt_meta_computed(
2021-03-26T22:58:55.0746468Z             b, check_shape=check_shape, check_graph=check_graph, check_chunks=check_chunks
2021-03-26T22:58:55.0747006Z         )
2021-03-26T22:58:55.0747347Z     
2021-03-26T22:58:55.0747765Z         if str(adt) != str(bdt):
2021-03-26T22:58:55.0748344Z             # Ignore check for matching length of flexible dtypes, since Array._meta
2021-03-26T22:58:55.0749260Z             # can't encode that information
2021-03-26T22:58:55.0750135Z             if adt.type == bdt.type and not (adt.type == np.bytes_ or adt.type == np.str_):
2021-03-26T22:58:55.0750898Z                 diff = difflib.ndiff(str(adt).splitlines(), str(bdt).splitlines())
2021-03-26T22:58:55.0751486Z                 raise AssertionError(
2021-03-26T22:58:55.0752072Z                     "string repr are different" + os.linesep + os.linesep.join(diff)
2021-03-26T22:58:55.0752598Z                 )
2021-03-26T22:58:55.0752981Z     
2021-03-26T22:58:55.0753339Z         try:
2021-03-26T22:58:55.0753701Z             assert (
2021-03-26T22:58:55.0754099Z                 a.shape == b.shape
2021-03-26T22:58:55.0754636Z             ), f"a and b have different shapes (a: {a.shape}, b: {b.shape})"
2021-03-26T22:58:55.0755170Z             if check_meta:
2021-03-26T22:58:55.0755667Z                 if hasattr(a, "_meta") and hasattr(b, "_meta"):
2021-03-26T22:58:55.0756206Z                     assert_eq(a._meta, b._meta)
2021-03-26T22:58:55.0756715Z                 if hasattr(a_original, "_meta"):
2021-03-26T22:58:55.0757196Z                     msg = (
2021-03-26T22:58:55.0758007Z                         f"compute()-ing 'a' changes its number of dimensions "
2021-03-26T22:58:55.0758653Z                         f"(before: {a_original._meta.ndim}, after: {a.ndim})"
2021-03-26T22:58:55.0759182Z                     )
2021-03-26T22:58:55.0759626Z                     assert a_original._meta.ndim == a.ndim, msg
2021-03-26T22:58:55.0760152Z                     if a_meta is not None:
2021-03-26T22:58:55.0760668Z                         msg = (
2021-03-26T22:58:55.0761521Z                             f"compute()-ing 'a' changes its type "
2021-03-26T22:58:55.0762234Z                             f"(before: {type(a_original._meta)}, after: {type(a_meta)})"
2021-03-26T22:58:55.0762858Z                         )
2021-03-26T22:58:55.0763388Z                         assert type(a_original._meta) == type(a_meta), msg
2021-03-26T22:58:55.0764085Z                         if not (np.isscalar(a_meta) or np.isscalar(a_computed)):
2021-03-26T22:58:55.0764695Z                             msg = (
2021-03-26T22:58:55.0765658Z                                 f"compute()-ing 'a' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0766432Z                                 f"(meta: {type(a_meta)}, computed: {type(a_computed)})"
2021-03-26T22:58:55.0767064Z                             )
2021-03-26T22:58:55.0767598Z                             assert type(a_meta) == type(a_computed), msg
2021-03-26T22:58:55.0768222Z                 if hasattr(b_original, "_meta"):
2021-03-26T22:58:55.0768742Z                     msg = (
2021-03-26T22:58:55.0769615Z                         f"compute()-ing 'b' changes its number of dimensions "
2021-03-26T22:58:55.0770299Z                         f"(before: {b_original._meta.ndim}, after: {b.ndim})"
2021-03-26T22:58:55.0770887Z                     )
2021-03-26T22:58:55.0771394Z                     assert b_original._meta.ndim == b.ndim, msg
2021-03-26T22:58:55.0771971Z                     if b_meta is not None:
2021-03-26T22:58:55.0772473Z                         msg = (
2021-03-26T22:58:55.0773318Z                             f"compute()-ing 'b' changes its type "
2021-03-26T22:58:55.0774061Z                             f"(before: {type(b_original._meta)}, after: {type(b_meta)})"
2021-03-26T22:58:55.0774831Z                         )
2021-03-26T22:58:55.0775364Z                         assert type(b_original._meta) == type(b_meta), msg
2021-03-26T22:58:55.0776059Z                         if not (np.isscalar(b_meta) or np.isscalar(b_computed)):
2021-03-26T22:58:55.0776686Z                             msg = (
2021-03-26T22:58:55.0777683Z                                 f"compute()-ing 'b' results in a different type than implied by its metadata "
2021-03-26T22:58:55.0778470Z                                 f"(meta: {type(b_meta)}, computed: {type(b_computed)})"
2021-03-26T22:58:55.0779093Z                             )
2021-03-26T22:58:55.0779627Z                             assert type(b_meta) == type(b_computed), msg
2021-03-26T22:58:55.0780728Z             msg = "found values in 'a' and 'b' which differ by more than the allowed amount"
2021-03-26T22:58:55.0781318Z >           assert allclose(a, b, **kwargs), msg
2021-03-26T22:58:55.0782212Z E           AssertionError: found values in 'a' and 'b' which differ by more than the allowed amount
2021-03-26T22:58:55.0782862Z 
2021-03-26T22:58:55.0783793Z /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/dask/array/utils.py:340: AssertionError
2021-03-26T22:58:55.0784794Z ---------------------------- Captured stderr setup -----------------------------
2021-03-26T22:58:55.0786003Z distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
2021-03-26T22:58:55.0787069Z distributed.scheduler - INFO - Clear task state
2021-03-26T22:58:55.0787915Z distributed.scheduler - INFO -   Scheduler at:     tcp://127.0.0.1:39473
2021-03-26T22:58:55.0788866Z distributed.scheduler - INFO -   dashboard at:            127.0.0.1:8787
2021-03-26T22:58:55.0789858Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:43573
2021-03-26T22:58:55.0790868Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:43573
2021-03-26T22:58:55.0791842Z distributed.worker - INFO -          dashboard at:            127.0.0.1:43503
2021-03-26T22:58:55.0792821Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:39473
2021-03-26T22:58:55.0793750Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0794661Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0795609Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0796816Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-4449b4af-e9fb-44ca-8a69-c5c7ccec711f/dask-worker-space/worker-7g5mk922
2021-03-26T22:58:55.0797959Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0798933Z distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:41257
2021-03-26T22:58:55.0799952Z distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:41257
2021-03-26T22:58:55.0800931Z distributed.worker - INFO -          dashboard at:            127.0.0.1:43299
2021-03-26T22:58:55.0801960Z distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:39473
2021-03-26T22:58:55.0802908Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0803833Z distributed.worker - INFO -               Threads:                          1
2021-03-26T22:58:55.0804811Z distributed.worker - INFO -                Memory:                    8.35 GB
2021-03-26T22:58:55.0806056Z distributed.worker - INFO -       Local Directory: /__w/1/s/python-package/_test_worker-6a1b97d9-1841-4d95-815c-5c1ed507cee1/dask-worker-space/worker-p77382be
2021-03-26T22:58:55.0807188Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0808495Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:43573', name: tcp://127.0.0.1:43573, memory: 0, processing: 0>
2021-03-26T22:58:55.0809967Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:43573
2021-03-26T22:58:55.0810908Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0811963Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:39473
2021-03-26T22:58:55.0812918Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0814183Z distributed.scheduler - INFO - Register worker <Worker 'tcp://127.0.0.1:41257', name: tcp://127.0.0.1:41257, memory: 0, processing: 0>
2021-03-26T22:58:55.0815492Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0816488Z distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:41257
2021-03-26T22:58:55.0817662Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0818581Z distributed.worker - INFO -         Registered to:      tcp://127.0.0.1:39473
2021-03-26T22:58:55.0819707Z distributed.worker - INFO - -------------------------------------------------
2021-03-26T22:58:55.0820565Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0821580Z distributed.scheduler - INFO - Receive client connection: Client-82a73749-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0822721Z distributed.core - INFO - Starting established connection
2021-03-26T22:58:55.0823625Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0824229Z Finding random open ports for workers
2021-03-26T22:58:55.0824831Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0825498Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0826531Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0827708Z [LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
2021-03-26T22:58:55.0828822Z ----------------------------- Captured stderr call -----------------------------
2021-03-26T22:58:55.0829729Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0830636Z distributed.worker - INFO - Run out-of-band function '_find_random_open_port'
2021-03-26T22:58:55.0831584Z --------------------------- Captured stderr teardown ---------------------------
2021-03-26T22:58:55.0832571Z distributed.scheduler - INFO - Remove client Client-82a73749-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0833556Z distributed.scheduler - INFO - Remove client Client-82a73749-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0834636Z distributed.scheduler - INFO - Close client connection: Client-82a73749-8e85-11eb-a37d-87e408bca6bd
2021-03-26T22:58:55.0835644Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:43573
2021-03-26T22:58:55.0836906Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:43573', name: tcp://127.0.0.1:43573, memory: 0, processing: 0>
2021-03-26T22:58:55.0838162Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:43573
2021-03-26T22:58:55.0839112Z distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:41257
2021-03-26T22:58:55.0840443Z distributed.scheduler - INFO - Remove worker <Worker 'tcp://127.0.0.1:41257', name: tcp://127.0.0.1:41257, memory: 0, processing: 0>
2021-03-26T22:58:55.0841731Z distributed.core - INFO - Removing comms to tcp://127.0.0.1:41257
2021-03-26T22:58:55.0842559Z distributed.scheduler - INFO - Lost all workers
2021-03-26T22:58:55.0843372Z distributed.scheduler - INFO - Scheduler closing...
2021-03-26T22:58:55.0844167Z distributed.scheduler - INFO - Scheduler closing all comms
2021-03-26T22:58:55.0845071Z ____ test_machines_should_be_used_if_provided[array-binary-classification] _____
2021-03-26T22:58:55.0845478Z 
2021-03-26T22:58:55.0846129Z task = 'binary-classification', output = 'array'
2021-03-26T22:58:55.0846468Z 
2021-03-26T22:58:55.0847094Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.0848014Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0848652Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.0849572Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.0850506Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.0851033Z     
2021-03-26T22:58:55.0851575Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.0852200Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.0852689Z                 objective=task,
2021-03-26T22:58:55.0853130Z                 output=output,
2021-03-26T22:58:55.0853699Z                 chunk_size=10,
2021-03-26T22:58:55.0854111Z                 group=None
2021-03-26T22:58:55.0854508Z             )
2021-03-26T22:58:55.0854859Z     
2021-03-26T22:58:55.0855280Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.0855733Z     
2021-03-26T22:58:55.0856206Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.0857070Z             if output == 'array':
2021-03-26T22:58:55.0857564Z                 client.rebalance()
2021-03-26T22:58:55.0857970Z     
2021-03-26T22:58:55.0858687Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.0859229Z             assert n_workers > 1
2021-03-26T22:58:55.0859758Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.0860312Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.0860775Z                 n_estimators=5,
2021-03-26T22:58:55.0861183Z                 num_leaves=5,
2021-03-26T22:58:55.0861616Z                 machines=",".join([
2021-03-26T22:58:55.0862103Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.0862785Z                     for port in open_ports
2021-03-26T22:58:55.0863245Z                 ]),
2021-03-26T22:58:55.0863655Z             )
2021-03-26T22:58:55.0864020Z     
2021-03-26T22:58:55.0864500Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.0865075Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.0865603Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.0866206Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.0866874Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.0867839Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.0868376Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.0869197Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.0869569Z 
2021-03-26T22:58:55.0870002Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.0870873Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0871660Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.0872227Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0873216Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0873939Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0874942Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0875960Z __ test_machines_should_be_used_if_provided[array-multiclass-classification] ___
2021-03-26T22:58:55.0876324Z 
2021-03-26T22:58:55.0876950Z task = 'multiclass-classification', output = 'array'
2021-03-26T22:58:55.0877271Z 
2021-03-26T22:58:55.0877843Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.0878575Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0879178Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.0880158Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.0881039Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.0881529Z     
2021-03-26T22:58:55.0882087Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.0882718Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.0883220Z                 objective=task,
2021-03-26T22:58:55.0883659Z                 output=output,
2021-03-26T22:58:55.0884096Z                 chunk_size=10,
2021-03-26T22:58:55.0884522Z                 group=None
2021-03-26T22:58:55.0884917Z             )
2021-03-26T22:58:55.0885387Z     
2021-03-26T22:58:55.0885826Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.0886285Z     
2021-03-26T22:58:55.0886786Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.0887674Z             if output == 'array':
2021-03-26T22:58:55.0888177Z                 client.rebalance()
2021-03-26T22:58:55.0888592Z     
2021-03-26T22:58:55.0889314Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.0889833Z             assert n_workers > 1
2021-03-26T22:58:55.0890380Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.0890942Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.0891400Z                 n_estimators=5,
2021-03-26T22:58:55.0891811Z                 num_leaves=5,
2021-03-26T22:58:55.0892246Z                 machines=",".join([
2021-03-26T22:58:55.0892728Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.0893248Z                     for port in open_ports
2021-03-26T22:58:55.0893662Z                 ]),
2021-03-26T22:58:55.0894042Z             )
2021-03-26T22:58:55.0894381Z     
2021-03-26T22:58:55.0894843Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.0895425Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.0895964Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.0896612Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.0897312Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.0898256Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.0898808Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.0899703Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.0900107Z 
2021-03-26T22:58:55.0900579Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.0901451Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0902227Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.0902960Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0904020Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0904741Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0905735Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0906710Z __________ test_machines_should_be_used_if_provided[array-regression] __________
2021-03-26T22:58:55.0907079Z 
2021-03-26T22:58:55.0907698Z task = 'regression', output = 'array'
2021-03-26T22:58:55.0907979Z 
2021-03-26T22:58:55.0908590Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.0909329Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0909936Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.0910791Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.0911694Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.0912357Z     
2021-03-26T22:58:55.0912900Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.0913501Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.0913988Z                 objective=task,
2021-03-26T22:58:55.0914440Z                 output=output,
2021-03-26T22:58:55.0914871Z                 chunk_size=10,
2021-03-26T22:58:55.0915319Z                 group=None
2021-03-26T22:58:55.0915723Z             )
2021-03-26T22:58:55.0916094Z     
2021-03-26T22:58:55.0916529Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.0916997Z     
2021-03-26T22:58:55.0917619Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.0918454Z             if output == 'array':
2021-03-26T22:58:55.0918951Z                 client.rebalance()
2021-03-26T22:58:55.0919335Z     
2021-03-26T22:58:55.0920010Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.0920532Z             assert n_workers > 1
2021-03-26T22:58:55.0921082Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.0921713Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.0922179Z                 n_estimators=5,
2021-03-26T22:58:55.0922630Z                 num_leaves=5,
2021-03-26T22:58:55.0923058Z                 machines=",".join([
2021-03-26T22:58:55.0923536Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.0924010Z                     for port in open_ports
2021-03-26T22:58:55.0924436Z                 ]),
2021-03-26T22:58:55.0924809Z             )
2021-03-26T22:58:55.0925170Z     
2021-03-26T22:58:55.0925639Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.0926203Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.0926738Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.0927348Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.0928027Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.0928986Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.0929528Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.0930371Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.0930754Z 
2021-03-26T22:58:55.0931184Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.0931983Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0932714Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.0933278Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0934281Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0935029Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0936050Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0937085Z ___________ test_machines_should_be_used_if_provided[array-ranking] ____________
2021-03-26T22:58:55.0937466Z 
2021-03-26T22:58:55.0938093Z task = 'ranking', output = 'array'
2021-03-26T22:58:55.0938372Z 
2021-03-26T22:58:55.0938990Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.0939717Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0940277Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.0941131Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.0941996Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.0942483Z     
2021-03-26T22:58:55.0943139Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.0943860Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.0944347Z                 objective=task,
2021-03-26T22:58:55.0944793Z                 output=output,
2021-03-26T22:58:55.0945261Z                 chunk_size=10,
2021-03-26T22:58:55.0945681Z                 group=None
2021-03-26T22:58:55.0946080Z             )
2021-03-26T22:58:55.0946414Z     
2021-03-26T22:58:55.0946855Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.0947298Z     
2021-03-26T22:58:55.0947771Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.0948592Z             if output == 'array':
2021-03-26T22:58:55.0949194Z                 client.rebalance()
2021-03-26T22:58:55.0949597Z     
2021-03-26T22:58:55.0950315Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.0950866Z             assert n_workers > 1
2021-03-26T22:58:55.0951429Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.0952006Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.0952533Z                 n_estimators=5,
2021-03-26T22:58:55.0952971Z                 num_leaves=5,
2021-03-26T22:58:55.0953414Z                 machines=",".join([
2021-03-26T22:58:55.0953905Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.0954407Z                     for port in open_ports
2021-03-26T22:58:55.0954846Z                 ]),
2021-03-26T22:58:55.0955219Z             )
2021-03-26T22:58:55.0955567Z     
2021-03-26T22:58:55.0956077Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.0956674Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.0957195Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.0957783Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.0958484Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.0959454Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.0960020Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.0960833Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.0961220Z 
2021-03-26T22:58:55.0961666Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.0962497Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0963276Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.0963839Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0964863Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0965637Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0966644Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0967661Z _ test_machines_should_be_used_if_provided[scipy_csr_matrix-binary-classification] _
2021-03-26T22:58:55.0968051Z 
2021-03-26T22:58:55.0968756Z task = 'binary-classification', output = 'scipy_csr_matrix'
2021-03-26T22:58:55.0969100Z 
2021-03-26T22:58:55.0969736Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.0970445Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.0971011Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.0971845Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.0972730Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.0973261Z     
2021-03-26T22:58:55.0973760Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.0974400Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.0975008Z                 objective=task,
2021-03-26T22:58:55.0975448Z                 output=output,
2021-03-26T22:58:55.0975888Z                 chunk_size=10,
2021-03-26T22:58:55.0976347Z                 group=None
2021-03-26T22:58:55.0976777Z             )
2021-03-26T22:58:55.0977124Z     
2021-03-26T22:58:55.0977558Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.0978004Z     
2021-03-26T22:58:55.0978485Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.0979336Z             if output == 'array':
2021-03-26T22:58:55.0979843Z                 client.rebalance()
2021-03-26T22:58:55.0980241Z     
2021-03-26T22:58:55.0980946Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.0981666Z             assert n_workers > 1
2021-03-26T22:58:55.0982233Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.0982941Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.0983394Z                 n_estimators=5,
2021-03-26T22:58:55.0983812Z                 num_leaves=5,
2021-03-26T22:58:55.0984246Z                 machines=",".join([
2021-03-26T22:58:55.0984730Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.0985238Z                     for port in open_ports
2021-03-26T22:58:55.0985684Z                 ]),
2021-03-26T22:58:55.0986075Z             )
2021-03-26T22:58:55.0986421Z     
2021-03-26T22:58:55.0986919Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.0987529Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.0988092Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.0988729Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.0989491Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.0990545Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.0991107Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.0991931Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.0992341Z 
2021-03-26T22:58:55.0992784Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.0993658Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.0994680Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.0995280Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0996314Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0997061Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.0998022Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.0999028Z _ test_machines_should_be_used_if_provided[scipy_csr_matrix-multiclass-classification] _
2021-03-26T22:58:55.0999440Z 
2021-03-26T22:58:55.1000112Z task = 'multiclass-classification', output = 'scipy_csr_matrix'
2021-03-26T22:58:55.1000459Z 
2021-03-26T22:58:55.1001085Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1001842Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1002419Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1003355Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1004306Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1004815Z     
2021-03-26T22:58:55.1005333Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1005952Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1006441Z                 objective=task,
2021-03-26T22:58:55.1006883Z                 output=output,
2021-03-26T22:58:55.1007318Z                 chunk_size=10,
2021-03-26T22:58:55.1007888Z                 group=None
2021-03-26T22:58:55.1008283Z             )
2021-03-26T22:58:55.1008640Z     
2021-03-26T22:58:55.1009086Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1009538Z     
2021-03-26T22:58:55.1010035Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1010883Z             if output == 'array':
2021-03-26T22:58:55.1011370Z                 client.rebalance()
2021-03-26T22:58:55.1011763Z     
2021-03-26T22:58:55.1012463Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1013166Z             assert n_workers > 1
2021-03-26T22:58:55.1013745Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1014463Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1014928Z                 n_estimators=5,
2021-03-26T22:58:55.1015390Z                 num_leaves=5,
2021-03-26T22:58:55.1015841Z                 machines=",".join([
2021-03-26T22:58:55.1016334Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1016815Z                     for port in open_ports
2021-03-26T22:58:55.1017265Z                 ]),
2021-03-26T22:58:55.1017644Z             )
2021-03-26T22:58:55.1017987Z     
2021-03-26T22:58:55.1018471Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1019057Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1019598Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1020223Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1020947Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1021918Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1022536Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1023555Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1023953Z 
2021-03-26T22:58:55.1024370Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1025182Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1025922Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1026486Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1027424Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1028139Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1029095Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1030062Z ____ test_machines_should_be_used_if_provided[scipy_csr_matrix-regression] _____
2021-03-26T22:58:55.1030452Z 
2021-03-26T22:58:55.1031694Z task = 'regression', output = 'scipy_csr_matrix'
2021-03-26T22:58:55.1032052Z 
2021-03-26T22:58:55.1032696Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1033472Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1034060Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1034927Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1035817Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1036350Z     
2021-03-26T22:58:55.1036883Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1037520Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1038012Z                 objective=task,
2021-03-26T22:58:55.1038446Z                 output=output,
2021-03-26T22:58:55.1038855Z                 chunk_size=10,
2021-03-26T22:58:55.1039285Z                 group=None
2021-03-26T22:58:55.1039683Z             )
2021-03-26T22:58:55.1040024Z     
2021-03-26T22:58:55.1040585Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1041027Z     
2021-03-26T22:58:55.1041479Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1042329Z             if output == 'array':
2021-03-26T22:58:55.1042858Z                 client.rebalance()
2021-03-26T22:58:55.1043262Z     
2021-03-26T22:58:55.1043947Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1044487Z             assert n_workers > 1
2021-03-26T22:58:55.1045012Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1045604Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1046208Z                 n_estimators=5,
2021-03-26T22:58:55.1046664Z                 num_leaves=5,
2021-03-26T22:58:55.1047098Z                 machines=",".join([
2021-03-26T22:58:55.1047560Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1048054Z                     for port in open_ports
2021-03-26T22:58:55.1048589Z                 ]),
2021-03-26T22:58:55.1048982Z             )
2021-03-26T22:58:55.1049333Z     
2021-03-26T22:58:55.1049827Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1050451Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1051005Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1051617Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1052317Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1053285Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1053874Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1054737Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1055136Z 
2021-03-26T22:58:55.1055566Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1056450Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1057228Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1057816Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1058810Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1059575Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1060598Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1061584Z __ test_machines_should_be_used_if_provided[dataframe-binary-classification] ___
2021-03-26T22:58:55.1061960Z 
2021-03-26T22:58:55.1062765Z task = 'binary-classification', output = 'dataframe'
2021-03-26T22:58:55.1063106Z 
2021-03-26T22:58:55.1063774Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1064507Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1065116Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1066009Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1066927Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1067450Z     
2021-03-26T22:58:55.1068019Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1068662Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1069170Z                 objective=task,
2021-03-26T22:58:55.1069619Z                 output=output,
2021-03-26T22:58:55.1070064Z                 chunk_size=10,
2021-03-26T22:58:55.1070512Z                 group=None
2021-03-26T22:58:55.1070924Z             )
2021-03-26T22:58:55.1071275Z     
2021-03-26T22:58:55.1071739Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1072205Z     
2021-03-26T22:58:55.1072665Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1073666Z             if output == 'array':
2021-03-26T22:58:55.1074176Z                 client.rebalance()
2021-03-26T22:58:55.1074586Z     
2021-03-26T22:58:55.1075279Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1075814Z             assert n_workers > 1
2021-03-26T22:58:55.1076383Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1076954Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1077441Z                 n_estimators=5,
2021-03-26T22:58:55.1077882Z                 num_leaves=5,
2021-03-26T22:58:55.1078335Z                 machines=",".join([
2021-03-26T22:58:55.1078953Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1079497Z                     for port in open_ports
2021-03-26T22:58:55.1079961Z                 ]),
2021-03-26T22:58:55.1080374Z             )
2021-03-26T22:58:55.1080737Z     
2021-03-26T22:58:55.1081248Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1081855Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1082394Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1083025Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1083755Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1084746Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1085309Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1086197Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1086621Z 
2021-03-26T22:58:55.1087058Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1087930Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1088712Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1089270Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1090260Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1091000Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1092007Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1093038Z _ test_machines_should_be_used_if_provided[dataframe-multiclass-classification] _
2021-03-26T22:58:55.1093443Z 
2021-03-26T22:58:55.1094115Z task = 'multiclass-classification', output = 'dataframe'
2021-03-26T22:58:55.1094446Z 
2021-03-26T22:58:55.1095089Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1095821Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1096443Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1097364Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1098286Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1098791Z     
2021-03-26T22:58:55.1099324Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1099956Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1100452Z                 objective=task,
2021-03-26T22:58:55.1100878Z                 output=output,
2021-03-26T22:58:55.1101320Z                 chunk_size=10,
2021-03-26T22:58:55.1101760Z                 group=None
2021-03-26T22:58:55.1102152Z             )
2021-03-26T22:58:55.1102521Z     
2021-03-26T22:58:55.1103145Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1103591Z     
2021-03-26T22:58:55.1104060Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1105210Z             if output == 'array':
2021-03-26T22:58:55.1105881Z                 client.rebalance()
2021-03-26T22:58:55.1106292Z     
2021-03-26T22:58:55.1107014Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1107543Z             assert n_workers > 1
2021-03-26T22:58:55.1108087Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1108631Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1109082Z                 n_estimators=5,
2021-03-26T22:58:55.1109544Z                 num_leaves=5,
2021-03-26T22:58:55.1109998Z                 machines=",".join([
2021-03-26T22:58:55.1110477Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1110993Z                     for port in open_ports
2021-03-26T22:58:55.1111588Z                 ]),
2021-03-26T22:58:55.1111986Z             )
2021-03-26T22:58:55.1112375Z     
2021-03-26T22:58:55.1112896Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1113536Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1114092Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1114762Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1115500Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1116521Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1117072Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1117964Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1118365Z 
2021-03-26T22:58:55.1118846Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1119749Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1120555Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1121131Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1122187Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1123016Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1124071Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1125079Z ________ test_machines_should_be_used_if_provided[dataframe-regression] ________
2021-03-26T22:58:55.1125480Z 
2021-03-26T22:58:55.1126125Z task = 'regression', output = 'dataframe'
2021-03-26T22:58:55.1126435Z 
2021-03-26T22:58:55.1127080Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1127839Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1128486Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1129415Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1130374Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1130897Z     
2021-03-26T22:58:55.1131537Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1132398Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1132919Z                 objective=task,
2021-03-26T22:58:55.1133363Z                 output=output,
2021-03-26T22:58:55.1133826Z                 chunk_size=10,
2021-03-26T22:58:55.1134275Z                 group=None
2021-03-26T22:58:55.1134669Z             )
2021-03-26T22:58:55.1135053Z     
2021-03-26T22:58:55.1135512Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1135968Z     
2021-03-26T22:58:55.1136480Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1137424Z             if output == 'array':
2021-03-26T22:58:55.1137949Z                 client.rebalance()
2021-03-26T22:58:55.1138358Z     
2021-03-26T22:58:55.1139096Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1139806Z             assert n_workers > 1
2021-03-26T22:58:55.1140373Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1140950Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1141453Z                 n_estimators=5,
2021-03-26T22:58:55.1141941Z                 num_leaves=5,
2021-03-26T22:58:55.1142409Z                 machines=",".join([
2021-03-26T22:58:55.1143091Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1143617Z                     for port in open_ports
2021-03-26T22:58:55.1144082Z                 ]),
2021-03-26T22:58:55.1144473Z             )
2021-03-26T22:58:55.1144849Z     
2021-03-26T22:58:55.1145498Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1146130Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1146674Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1147325Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1148053Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1149064Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1149638Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1150529Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1150926Z 
2021-03-26T22:58:55.1151409Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1152336Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1153141Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1153726Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1154752Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1155546Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1156584Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1157618Z _________ test_machines_should_be_used_if_provided[dataframe-ranking] __________
2021-03-26T22:58:55.1158030Z 
2021-03-26T22:58:55.1158644Z task = 'ranking', output = 'dataframe'
2021-03-26T22:58:55.1158958Z 
2021-03-26T22:58:55.1159601Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1160364Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1160995Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1161898Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1162844Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1163395Z     
2021-03-26T22:58:55.1163944Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1164590Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1165174Z                 objective=task,
2021-03-26T22:58:55.1165631Z                 output=output,
2021-03-26T22:58:55.1166083Z                 chunk_size=10,
2021-03-26T22:58:55.1166533Z                 group=None
2021-03-26T22:58:55.1166936Z             )
2021-03-26T22:58:55.1167314Z     
2021-03-26T22:58:55.1167768Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1168217Z     
2021-03-26T22:58:55.1168742Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1169645Z             if output == 'array':
2021-03-26T22:58:55.1170155Z                 client.rebalance()
2021-03-26T22:58:55.1170554Z     
2021-03-26T22:58:55.1171270Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1171809Z             assert n_workers > 1
2021-03-26T22:58:55.1172363Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1173105Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1173584Z                 n_estimators=5,
2021-03-26T22:58:55.1174037Z                 num_leaves=5,
2021-03-26T22:58:55.1174478Z                 machines=",".join([
2021-03-26T22:58:55.1174993Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1175512Z                     for port in open_ports
2021-03-26T22:58:55.1175971Z                 ]),
2021-03-26T22:58:55.1176363Z             )
2021-03-26T22:58:55.1176731Z     
2021-03-26T22:58:55.1177249Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1177857Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1178549Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1179203Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1179940Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1180929Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1181513Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1182423Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1182983Z 
2021-03-26T22:58:55.1183469Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1184377Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1185175Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1185761Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1186782Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1187542Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1188599Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1189669Z _ test_machines_should_be_used_if_provided[dataframe-with-categorical-binary-classification] _
2021-03-26T22:58:55.1190087Z 
2021-03-26T22:58:55.1190905Z task = 'binary-classification', output = 'dataframe-with-categorical'
2021-03-26T22:58:55.1191265Z 
2021-03-26T22:58:55.1191855Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1192789Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1193415Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1194335Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1195263Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1195796Z     
2021-03-26T22:58:55.1196340Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1197001Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1197495Z                 objective=task,
2021-03-26T22:58:55.1197960Z                 output=output,
2021-03-26T22:58:55.1198412Z                 chunk_size=10,
2021-03-26T22:58:55.1198867Z                 group=None
2021-03-26T22:58:55.1199279Z             )
2021-03-26T22:58:55.1199669Z     
2021-03-26T22:58:55.1200138Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1200601Z     
2021-03-26T22:58:55.1201107Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1202014Z             if output == 'array':
2021-03-26T22:58:55.1202560Z                 client.rebalance()
2021-03-26T22:58:55.1202974Z     
2021-03-26T22:58:55.1203727Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1204301Z             assert n_workers > 1
2021-03-26T22:58:55.1204876Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1205484Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1206115Z                 n_estimators=5,
2021-03-26T22:58:55.1206573Z                 num_leaves=5,
2021-03-26T22:58:55.1207025Z                 machines=",".join([
2021-03-26T22:58:55.1207548Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1208096Z                     for port in open_ports
2021-03-26T22:58:55.1208563Z                 ]),
2021-03-26T22:58:55.1208955Z             )
2021-03-26T22:58:55.1209334Z     
2021-03-26T22:58:55.1209847Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1210458Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1211042Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1211826Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1212561Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1213588Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1214175Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1215084Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1215496Z 
2021-03-26T22:58:55.1215960Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1216891Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1217701Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1218295Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1219358Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1220160Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1221197Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1222291Z _ test_machines_should_be_used_if_provided[dataframe-with-categorical-multiclass-classification] _
2021-03-26T22:58:55.1223369Z 
2021-03-26T22:58:55.1224176Z task = 'multiclass-classification', output = 'dataframe-with-categorical'
2021-03-26T22:58:55.1224470Z 
2021-03-26T22:58:55.1224913Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1225460Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1225907Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1226535Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1227167Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1227544Z     
2021-03-26T22:58:55.1227926Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1228379Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1228720Z                 objective=task,
2021-03-26T22:58:55.1229040Z                 output=output,
2021-03-26T22:58:55.1229358Z                 chunk_size=10,
2021-03-26T22:58:55.1229661Z                 group=None
2021-03-26T22:58:55.1229955Z             )
2021-03-26T22:58:55.1230217Z     
2021-03-26T22:58:55.1230540Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1230854Z     
2021-03-26T22:58:55.1231204Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1231753Z             if output == 'array':
2021-03-26T22:58:55.1232101Z                 client.rebalance()
2021-03-26T22:58:55.1232393Z     
2021-03-26T22:58:55.1232881Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1233265Z             assert n_workers > 1
2021-03-26T22:58:55.1233651Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1234069Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1234407Z                 n_estimators=5,
2021-03-26T22:58:55.1234725Z                 num_leaves=5,
2021-03-26T22:58:55.1235190Z                 machines=",".join([
2021-03-26T22:58:55.1235554Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1235926Z                     for port in open_ports
2021-03-26T22:58:55.1236235Z                 ]),
2021-03-26T22:58:55.1236524Z             )
2021-03-26T22:58:55.1236786Z     
2021-03-26T22:58:55.1237148Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1237566Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1237962Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1238414Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1239001Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1239625Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1240031Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1240635Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1240910Z 
2021-03-26T22:58:55.1241236Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1241833Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1242374Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1242789Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1243238Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1243932Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1244687Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1245398Z _ test_machines_should_be_used_if_provided[dataframe-with-categorical-regression] _
2021-03-26T22:58:55.1245685Z 
2021-03-26T22:58:55.1246151Z task = 'regression', output = 'dataframe-with-categorical'
2021-03-26T22:58:55.1246406Z 
2021-03-26T22:58:55.1246840Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1247407Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1247852Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1248476Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1249103Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1249474Z     
2021-03-26T22:58:55.1249854Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1250299Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1250642Z                 objective=task,
2021-03-26T22:58:55.1250964Z                 output=output,
2021-03-26T22:58:55.1251282Z                 chunk_size=10,
2021-03-26T22:58:55.1251580Z                 group=None
2021-03-26T22:58:55.1251877Z             )
2021-03-26T22:58:55.1252141Z     
2021-03-26T22:58:55.1252455Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1252784Z     
2021-03-26T22:58:55.1253133Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1253681Z             if output == 'array':
2021-03-26T22:58:55.1254030Z                 client.rebalance()
2021-03-26T22:58:55.1254320Z     
2021-03-26T22:58:55.1254924Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1255305Z             assert n_workers > 1
2021-03-26T22:58:55.1255690Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1256110Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1256452Z                 n_estimators=5,
2021-03-26T22:58:55.1256780Z                 num_leaves=5,
2021-03-26T22:58:55.1257105Z                 machines=",".join([
2021-03-26T22:58:55.1257461Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1257909Z                     for port in open_ports
2021-03-26T22:58:55.1258222Z                 ]),
2021-03-26T22:58:55.1258511Z             )
2021-03-26T22:58:55.1258777Z     
2021-03-26T22:58:55.1259124Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1259557Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1259955Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1260410Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1260896Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1261521Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1261991Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1262802Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1263126Z 
2021-03-26T22:58:55.1263435Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1264042Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1264569Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1264983Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1265679Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1266208Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1266894Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1267595Z _ test_machines_should_be_used_if_provided[dataframe-with-categorical-ranking] _
2021-03-26T22:58:55.1267876Z 
2021-03-26T22:58:55.1268331Z task = 'ranking', output = 'dataframe-with-categorical'
2021-03-26T22:58:55.1268577Z 
2021-03-26T22:58:55.1269012Z     @pytest.mark.parametrize('task', tasks)
2021-03-26T22:58:55.1269550Z     @pytest.mark.parametrize('output', data_output)
2021-03-26T22:58:55.1269988Z     def test_machines_should_be_used_if_provided(task, output):
2021-03-26T22:58:55.1270594Z         if task == 'ranking' and output == 'scipy_csr_matrix':
2021-03-26T22:58:55.1271235Z             pytest.skip('LGBMRanker is not currently tested on sparse matrices')
2021-03-26T22:58:55.1271606Z     
2021-03-26T22:58:55.1271986Z         with LocalCluster(n_workers=2) as cluster, Client(cluster) as client:
2021-03-26T22:58:55.1272419Z             _, _, _, _, dX, dy, _, dg = _create_data(
2021-03-26T22:58:55.1272770Z                 objective=task,
2021-03-26T22:58:55.1273089Z                 output=output,
2021-03-26T22:58:55.1273407Z                 chunk_size=10,
2021-03-26T22:58:55.1273710Z                 group=None
2021-03-26T22:58:55.1274003Z             )
2021-03-26T22:58:55.1274267Z     
2021-03-26T22:58:55.1274576Z             dask_model_factory = task_to_dask_factory[task]
2021-03-26T22:58:55.1274901Z     
2021-03-26T22:58:55.1275249Z             # rebalance data to be sure that each worker has a piece of the data
2021-03-26T22:58:55.1275801Z             if output == 'array':
2021-03-26T22:58:55.1276149Z                 client.rebalance()
2021-03-26T22:58:55.1276446Z     
2021-03-26T22:58:55.1276930Z             n_workers = len(client.scheduler_info()['workers'])
2021-03-26T22:58:55.1277297Z             assert n_workers > 1
2021-03-26T22:58:55.1277694Z             open_ports = [lgb.dask._find_random_open_port() for _ in range(n_workers)]
2021-03-26T22:58:55.1278129Z             dask_model = dask_model_factory(
2021-03-26T22:58:55.1278471Z                 n_estimators=5,
2021-03-26T22:58:55.1278774Z                 num_leaves=5,
2021-03-26T22:58:55.1279102Z                 machines=",".join([
2021-03-26T22:58:55.1279462Z                     "127.0.0.1:" + str(port)
2021-03-26T22:58:55.1279828Z                     for port in open_ports
2021-03-26T22:58:55.1280136Z                 ]),
2021-03-26T22:58:55.1280420Z             )
2021-03-26T22:58:55.1280777Z     
2021-03-26T22:58:55.1281126Z             # test that "machines" is actually respected by creating a socket that uses
2021-03-26T22:58:55.1281560Z             # one of the ports mentioned in "machines"
2021-03-26T22:58:55.1281955Z             error_msg = "Binding port %s failed" % open_ports[0]
2021-03-26T22:58:55.1282409Z             with pytest.raises(lgb.basic.LightGBMError, match=error_msg):
2021-03-26T22:58:55.1282897Z                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
2021-03-26T22:58:55.1283522Z                     s.bind(('127.0.0.1', open_ports[0]))
2021-03-26T22:58:55.1283930Z >                   dask_model.fit(dX, dy, group=dg)
2021-03-26T22:58:55.1284535Z E                   Failed: DID NOT RAISE <class 'lightgbm.basic.LightGBMError'>
2021-03-26T22:58:55.1284881Z 
2021-03-26T22:58:55.1285208Z ../tests/python_package_test/test_dask.py:1098: Failed
2021-03-26T22:58:55.1285799Z ----------------------------- Captured stdout call -----------------------------
2021-03-26T22:58:55.1286458Z Using passed-in 'machines' parameter
2021-03-26T22:58:55.1286873Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1287324Z [LightGBM] [Info] Local rank: 0, total number of machines: 1
2021-03-26T22:58:55.1288005Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1288778Z [LightGBM] [Warning] num_threads is set=2, n_jobs=-1 will be ignored. Current value: num_threads=2
2021-03-26T22:58:55.1289309Z =============================== warnings summary ===============================
2021-03-26T22:58:55.1289734Z tests/python_package_test/test_basic.py::test_basic
2021-03-26T22:58:55.1290141Z tests/python_package_test/test_engine.py::test_reference_chain
2021-03-26T22:58:55.1290555Z tests/python_package_test/test_engine.py::test_init_with_subset
2021-03-26T22:58:55.1290967Z tests/python_package_test/test_engine.py::test_fpreproc
2021-03-26T22:58:55.1291394Z tests/python_package_test/test_engine.py::test_dataset_params_with_reference
2021-03-26T22:58:55.1292182Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1433: UserWarning: Overriding the parameters from Reference Dataset.
2021-03-26T22:58:55.1292900Z     _log_warning('Overriding the parameters from Reference Dataset.')
2021-03-26T22:58:55.1293171Z 
2021-03-26T22:58:55.1293540Z tests/python_package_test/test_basic.py::test_add_features_equal_data_on_alternating_used_unused
2021-03-26T22:58:55.1294039Z tests/python_package_test/test_basic.py::test_add_features_same_booster_behaviour
2021-03-26T22:58:55.1294484Z tests/python_package_test/test_engine.py::test_sliced_data
2021-03-26T22:58:55.1294911Z tests/python_package_test/test_engine.py::test_monotone_penalty_max
2021-03-26T22:58:55.1295325Z tests/python_package_test/test_engine.py::test_forced_bins
2021-03-26T22:58:55.1296216Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:448: UserWarning: Usage of np.ndarray subset (sliced data) is not recommended due to it will double the peak memory cost in LightGBM.
2021-03-26T22:58:55.1296907Z     _log_warning("Usage of np.ndarray subset (sliced data) is not recommended "
2021-03-26T22:58:55.1297185Z 
2021-03-26T22:58:55.1297547Z tests/python_package_test/test_basic.py::test_add_features_equal_data_on_alternating_used_unused
2021-03-26T22:58:55.1298047Z tests/python_package_test/test_basic.py::test_add_features_same_booster_behaviour
2021-03-26T22:58:55.1298521Z tests/python_package_test/test_basic.py::test_add_features_from_different_sources
2021-03-26T22:58:55.1299384Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:2129: UserWarning: Cannot add features from NoneType type of raw data to NoneType type of raw data.
2021-03-26T22:58:55.1299987Z   Set free_raw_data=False when construct Dataset to avoid this
2021-03-26T22:58:55.1300337Z     _log_warning(err_msg)
2021-03-26T22:58:55.1300531Z 
2021-03-26T22:58:55.1300888Z tests/python_package_test/test_basic.py::test_add_features_equal_data_on_alternating_used_unused
2021-03-26T22:58:55.1301449Z tests/python_package_test/test_basic.py::test_add_features_same_booster_behaviour
2021-03-26T22:58:55.1301925Z tests/python_package_test/test_basic.py::test_add_features_from_different_sources
2021-03-26T22:58:55.1302872Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:2131: UserWarning: Reseting categorical features.
2021-03-26T22:58:55.1303457Z   You can set new categorical features via ``set_categorical_feature`` method
2021-03-26T22:58:55.1303865Z     _log_warning("Reseting categorical features.\n"
2021-03-26T22:58:55.1304099Z 
2021-03-26T22:58:55.1304433Z tests/python_package_test/test_basic.py::test_add_features_from_different_sources
2021-03-26T22:58:55.1305402Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:2129: UserWarning: Cannot add features from list type of raw data to ndarray type of raw data.
2021-03-26T22:58:55.1305949Z   Freeing raw data
2021-03-26T22:58:55.1306253Z     _log_warning(err_msg)
2021-03-26T22:58:55.1306434Z 
2021-03-26T22:58:55.1306793Z tests/python_package_test/test_basic.py::test_add_features_from_different_sources
2021-03-26T22:58:55.1307725Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:2129: UserWarning: Cannot add features from list type of raw data to csr_matrix type of raw data.
2021-03-26T22:58:55.1308280Z   Freeing raw data
2021-03-26T22:58:55.1308580Z     _log_warning(err_msg)
2021-03-26T22:58:55.1308756Z 
2021-03-26T22:58:55.1309105Z tests/python_package_test/test_basic.py::test_add_features_from_different_sources
2021-03-26T22:58:55.1309949Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:2129: UserWarning: Cannot add features from list type of raw data to DataFrame type of raw data.
2021-03-26T22:58:55.1310487Z   Freeing raw data
2021-03-26T22:58:55.1310772Z     _log_warning(err_msg)
2021-03-26T22:58:55.1310961Z 
2021-03-26T22:58:55.1311279Z tests/python_package_test/test_consistency.py: 10 warnings
2021-03-26T22:58:55.1312067Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/engine.py:148: UserWarning: Found `num_trees` in params. Will use it instead of argument
2021-03-26T22:58:55.1312677Z     _log_warning("Found `{}` in params. Will use it instead of argument".format(alias))
2021-03-26T22:58:55.1312943Z 
2021-03-26T22:58:55.1313269Z tests/python_package_test/test_consistency.py: 10 warnings
2021-03-26T22:58:55.1314058Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1222: UserWarning: data keyword has been found in `params` and will be ignored.
2021-03-26T22:58:55.1314658Z   Please use data argument of the Dataset constructor to pass this parameter.
2021-03-26T22:58:55.1315279Z     _log_warning('{0} keyword has been found in `params` and will be ignored.\n'
2021-03-26T22:58:55.1315562Z 
2021-03-26T22:58:55.1315878Z tests/python_package_test/test_dask.py: 173 warnings
2021-03-26T22:58:55.1316588Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/dask.py:285: UserWarning: Parameter n_jobs will be ignored.
2021-03-26T22:58:55.1317122Z     _log_warning(f"Parameter {param_alias} will be ignored.")
2021-03-26T22:58:55.1317360Z 
2021-03-26T22:58:55.1317705Z tests/python_package_test/test_dask.py::test_training_does_not_fail_on_port_conflicts
2021-03-26T22:58:55.1318185Z tests/python_package_test/test_dask.py::test_training_does_not_fail_on_port_conflicts
2021-03-26T22:58:55.1318657Z tests/python_package_test/test_dask.py::test_training_does_not_fail_on_port_conflicts
2021-03-26T22:58:55.1319131Z tests/python_package_test/test_dask.py::test_training_does_not_fail_on_port_conflicts
2021-03-26T22:58:55.1319895Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/dask.py:285: UserWarning: Parameter num_threads will be ignored.
2021-03-26T22:58:55.1320430Z     _log_warning(f"Parameter {param_alias} will be ignored.")
2021-03-26T22:58:55.1320735Z 
2021-03-26T22:58:55.1321399Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-binary-classification-pickle]
2021-03-26T22:58:55.1322312Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1322843Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1323212Z   Hosting the HTTP server on port 37203 instead
2021-03-26T22:58:55.1323534Z     warnings.warn(
2021-03-26T22:58:55.1323716Z 
2021-03-26T22:58:55.1324375Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-binary-classification-joblib]
2021-03-26T22:58:55.1325345Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1325878Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1326247Z   Hosting the HTTP server on port 33305 instead
2021-03-26T22:58:55.1326566Z     warnings.warn(
2021-03-26T22:58:55.1326749Z 
2021-03-26T22:58:55.1327438Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-binary-classification-cloudpickle]
2021-03-26T22:58:55.1328332Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1328866Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1329234Z   Hosting the HTTP server on port 43253 instead
2021-03-26T22:58:55.1329566Z     warnings.warn(
2021-03-26T22:58:55.1329738Z 
2021-03-26T22:58:55.1330426Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-multiclass-classification-pickle]
2021-03-26T22:58:55.1331333Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1331851Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1332218Z   Hosting the HTTP server on port 44841 instead
2021-03-26T22:58:55.1332549Z     warnings.warn(
2021-03-26T22:58:55.1332719Z 
2021-03-26T22:58:55.1333394Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-multiclass-classification-joblib]
2021-03-26T22:58:55.1334303Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1334836Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1335193Z   Hosting the HTTP server on port 32785 instead
2021-03-26T22:58:55.1335527Z     warnings.warn(
2021-03-26T22:58:55.1335694Z 
2021-03-26T22:58:55.1336377Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-multiclass-classification-cloudpickle]
2021-03-26T22:58:55.1337296Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1337855Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1338216Z   Hosting the HTTP server on port 34245 instead
2021-03-26T22:58:55.1338544Z     warnings.warn(
2021-03-26T22:58:55.1338723Z 
2021-03-26T22:58:55.1339364Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-regression-pickle]
2021-03-26T22:58:55.1340252Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1340783Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1341148Z   Hosting the HTTP server on port 45953 instead
2021-03-26T22:58:55.1341547Z     warnings.warn(
2021-03-26T22:58:55.1341731Z 
2021-03-26T22:58:55.1342375Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-regression-joblib]
2021-03-26T22:58:55.1343482Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1344011Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1344377Z   Hosting the HTTP server on port 45391 instead
2021-03-26T22:58:55.1344690Z     warnings.warn(
2021-03-26T22:58:55.1344874Z 
2021-03-26T22:58:55.1345539Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-regression-cloudpickle]
2021-03-26T22:58:55.1346502Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1347042Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1347434Z   Hosting the HTTP server on port 42285 instead
2021-03-26T22:58:55.1347772Z     warnings.warn(
2021-03-26T22:58:55.1347941Z 
2021-03-26T22:58:55.1348590Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-ranking-pickle]
2021-03-26T22:58:55.1349452Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1349982Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1350393Z   Hosting the HTTP server on port 37475 instead
2021-03-26T22:58:55.1350731Z     warnings.warn(
2021-03-26T22:58:55.1350897Z 
2021-03-26T22:58:55.1351542Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-ranking-joblib]
2021-03-26T22:58:55.1352420Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1352934Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1353298Z   Hosting the HTTP server on port 37689 instead
2021-03-26T22:58:55.1353623Z     warnings.warn(
2021-03-26T22:58:55.1353794Z 
2021-03-26T22:58:55.1354445Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[True-ranking-cloudpickle]
2021-03-26T22:58:55.1355331Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1355862Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1356214Z   Hosting the HTTP server on port 33847 instead
2021-03-26T22:58:55.1356542Z     warnings.warn(
2021-03-26T22:58:55.1356740Z 
2021-03-26T22:58:55.1357411Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-binary-classification-pickle]
2021-03-26T22:58:55.1358316Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1358840Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1359206Z   Hosting the HTTP server on port 36535 instead
2021-03-26T22:58:55.1359520Z     warnings.warn(
2021-03-26T22:58:55.1359702Z 
2021-03-26T22:58:55.1360365Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-binary-classification-joblib]
2021-03-26T22:58:55.1361268Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1361804Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1362169Z   Hosting the HTTP server on port 36923 instead
2021-03-26T22:58:55.1362593Z     warnings.warn(
2021-03-26T22:58:55.1362834Z 
2021-03-26T22:58:55.1363509Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-binary-classification-cloudpickle]
2021-03-26T22:58:55.1364417Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1364942Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1365303Z   Hosting the HTTP server on port 45309 instead
2021-03-26T22:58:55.1365634Z     warnings.warn(
2021-03-26T22:58:55.1365802Z 
2021-03-26T22:58:55.1366462Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-multiclass-classification-pickle]
2021-03-26T22:58:55.1367415Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1367950Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1368319Z   Hosting the HTTP server on port 41641 instead
2021-03-26T22:58:55.1368651Z     warnings.warn(
2021-03-26T22:58:55.1368825Z 
2021-03-26T22:58:55.1369509Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-multiclass-classification-joblib]
2021-03-26T22:58:55.1370414Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1370918Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1371276Z   Hosting the HTTP server on port 44095 instead
2021-03-26T22:58:55.1371603Z     warnings.warn(
2021-03-26T22:58:55.1371763Z 
2021-03-26T22:58:55.1372439Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-multiclass-classification-cloudpickle]
2021-03-26T22:58:55.1373322Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1373851Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1374203Z   Hosting the HTTP server on port 45179 instead
2021-03-26T22:58:55.1374533Z     warnings.warn(
2021-03-26T22:58:55.1374702Z 
2021-03-26T22:58:55.1375361Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-regression-pickle]
2021-03-26T22:58:55.1376241Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1376772Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1377121Z   Hosting the HTTP server on port 41487 instead
2021-03-26T22:58:55.1377450Z     warnings.warn(
2021-03-26T22:58:55.1377630Z 
2021-03-26T22:58:55.1378284Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-regression-joblib]
2021-03-26T22:58:55.1379209Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1379735Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1380100Z   Hosting the HTTP server on port 37289 instead
2021-03-26T22:58:55.1380417Z     warnings.warn(
2021-03-26T22:58:55.1380599Z 
2021-03-26T22:58:55.1381247Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-regression-cloudpickle]
2021-03-26T22:58:55.1382137Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1382809Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1383195Z   Hosting the HTTP server on port 38679 instead
2021-03-26T22:58:55.1383623Z     warnings.warn(
2021-03-26T22:58:55.1383793Z 
2021-03-26T22:58:55.1384462Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-ranking-pickle]
2021-03-26T22:58:55.1385328Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1385855Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1386219Z   Hosting the HTTP server on port 34439 instead
2021-03-26T22:58:55.1386545Z     warnings.warn(
2021-03-26T22:58:55.1386714Z 
2021-03-26T22:58:55.1387360Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-ranking-joblib]
2021-03-26T22:58:55.1388317Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1388837Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1389200Z   Hosting the HTTP server on port 40549 instead
2021-03-26T22:58:55.1389531Z     warnings.warn(
2021-03-26T22:58:55.1389697Z 
2021-03-26T22:58:55.1390344Z tests/python_package_test/test_dask.py::test_model_and_local_version_are_picklable_whether_or_not_client_set_explicitly[False-ranking-cloudpickle]
2021-03-26T22:58:55.1391231Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/node.py:151: UserWarning: Port 8787 is already in use.
2021-03-26T22:58:55.1391754Z   Perhaps you already have a cluster running?
2021-03-26T22:58:55.1392102Z   Hosting the HTTP server on port 40215 instead
2021-03-26T22:58:55.1392433Z     warnings.warn(
2021-03-26T22:58:55.1392601Z 
2021-03-26T22:58:55.1392915Z tests/python_package_test/test_dask.py::test_errors
2021-03-26T22:58:55.1393648Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/dask.py:312: RuntimeWarning: coroutine '_wait' was never awaited
2021-03-26T22:58:55.1394129Z     wait(parts)
2021-03-26T22:58:55.1394296Z 
2021-03-26T22:58:55.1394611Z tests/python_package_test/test_dask.py::test_errors
2021-03-26T22:58:55.1395485Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/distributed/utils_test.py:939: RuntimeWarning: coroutine 'PooledRPCCall.__getattr__.<locals>.send_recv_from_rpc' was never awaited
2021-03-26T22:58:55.1396065Z     gc.collect()
2021-03-26T22:58:55.1396244Z 
2021-03-26T22:58:55.1396548Z tests/python_package_test/test_engine.py::test_binary
2021-03-26T22:58:55.1397343Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/engine.py:148: UserWarning: Found `num_iteration` in params. Will use it instead of argument
2021-03-26T22:58:55.1397957Z     _log_warning("Found `{}` in params. Will use it instead of argument".format(alias))
2021-03-26T22:58:55.1398224Z 
2021-03-26T22:58:55.1398556Z tests/python_package_test/test_engine.py::test_pandas_categorical
2021-03-26T22:58:55.1398978Z tests/python_package_test/test_engine.py::test_linear_trees
2021-03-26T22:58:55.1399392Z tests/python_package_test/test_engine.py::test_save_and_load_linear
2021-03-26T22:58:55.1400168Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1705: UserWarning: categorical_feature in Dataset is overridden.
2021-03-26T22:58:55.1400676Z   New categorical_feature is [0]
2021-03-26T22:58:55.1401212Z     _log_warning('categorical_feature in Dataset is overridden.\n'
2021-03-26T22:58:55.1401463Z 
2021-03-26T22:58:55.1401797Z tests/python_package_test/test_engine.py::test_pandas_categorical
2021-03-26T22:58:55.1402554Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1705: UserWarning: categorical_feature in Dataset is overridden.
2021-03-26T22:58:55.1403174Z   New categorical_feature is ['A']
2021-03-26T22:58:55.1403715Z     _log_warning('categorical_feature in Dataset is overridden.\n'
2021-03-26T22:58:55.1404054Z 
2021-03-26T22:58:55.1404368Z tests/python_package_test/test_engine.py::test_pandas_categorical
2021-03-26T22:58:55.1405113Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1705: UserWarning: categorical_feature in Dataset is overridden.
2021-03-26T22:58:55.1405775Z   New categorical_feature is ['A', 'B', 'C', 'D']
2021-03-26T22:58:55.1406329Z     _log_warning('categorical_feature in Dataset is overridden.\n'
2021-03-26T22:58:55.1406566Z 
2021-03-26T22:58:55.1406889Z tests/python_package_test/test_engine.py::test_pandas_categorical
2021-03-26T22:58:55.1407610Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1705: UserWarning: categorical_feature in Dataset is overridden.
2021-03-26T22:58:55.1408345Z   New categorical_feature is ['A', 'B', 'C', 'D', 'E']
2021-03-26T22:58:55.1408908Z     _log_warning('categorical_feature in Dataset is overridden.\n'
2021-03-26T22:58:55.1409145Z 
2021-03-26T22:58:55.1409469Z tests/python_package_test/test_engine.py::test_pandas_categorical
2021-03-26T22:58:55.1410203Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1705: UserWarning: categorical_feature in Dataset is overridden.
2021-03-26T22:58:55.1410683Z   New categorical_feature is []
2021-03-26T22:58:55.1411188Z     _log_warning('categorical_feature in Dataset is overridden.\n'
2021-03-26T22:58:55.1411440Z 
2021-03-26T22:58:55.1411745Z tests/python_package_test/test_engine.py::test_pandas_sparse
2021-03-26T22:58:55.1412147Z tests/python_package_test/test_sklearn.py::test_pandas_sparse
2021-03-26T22:58:55.1413093Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/pandas/core/generic.py:5673: PerformanceWarning: Concatenating sparse arrays with multiple fill values: '[0, nan, False]'. Picking the first and converting the rest.
2021-03-26T22:58:55.1413789Z     return self._mgr.as_array(transpose=self._AXIS_REVERSED)
2021-03-26T22:58:55.1414015Z 
2021-03-26T22:58:55.1414356Z tests/python_package_test/test_engine.py::test_int32_max_sparse_contribs
2021-03-26T22:58:55.1415278Z   /home/AzDevOps_azpcontainer/miniconda/envs/test-env/lib/python3.8/site-packages/scipy/sparse/_index.py:82: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
2021-03-26T22:58:55.1415883Z     self._set_intXint(row, col, x.flat[0])
2021-03-26T22:58:55.1416079Z 
2021-03-26T22:58:55.1416377Z tests/python_package_test/test_engine.py::test_init_with_subset
2021-03-26T22:58:55.1417098Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1959: UserWarning: Cannot subset str type of raw data.
2021-03-26T22:58:55.1417552Z   Returning original raw data
2021-03-26T22:58:55.1417892Z     _log_warning("Cannot subset {} type of raw data.\n"
2021-03-26T22:58:55.1418117Z 
2021-03-26T22:58:55.1418424Z tests/python_package_test/test_engine.py::test_monotone_constraints
2021-03-26T22:58:55.1418834Z tests/python_package_test/test_engine.py::test_monotone_penalty
2021-03-26T22:58:55.1419308Z tests/python_package_test/test_engine.py::test_monotone_penalty_max
2021-03-26T22:58:55.1419721Z tests/python_package_test/test_engine.py::test_get_split_value_histogram
2021-03-26T22:58:55.1420144Z tests/python_package_test/test_sklearn.py::test_pandas_categorical
2021-03-26T22:58:55.1420561Z tests/python_package_test/test_sklearn.py::test_pandas_categorical
2021-03-26T22:58:55.1420976Z tests/python_package_test/test_sklearn.py::test_pandas_categorical
2021-03-26T22:58:55.1421373Z tests/python_package_test/test_sklearn.py::test_pandas_categorical
2021-03-26T22:58:55.1421781Z tests/python_package_test/test_sklearn.py::test_pandas_categorical
2021-03-26T22:58:55.1422524Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1702: UserWarning: Using categorical_feature in Dataset.
2021-03-26T22:58:55.1423406Z     _log_warning('Using categorical_feature in Dataset.')
2021-03-26T22:58:55.1423637Z 
2021-03-26T22:58:55.1423968Z tests/python_package_test/test_engine.py::test_dataset_update_params
2021-03-26T22:58:55.1424899Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:1222: UserWarning: categorical_feature keyword has been found in `params` and will be ignored.
2021-03-26T22:58:55.1425520Z   Please use categorical_feature argument of the Dataset constructor to pass this parameter.
2021-03-26T22:58:55.1426173Z     _log_warning('{0} keyword has been found in `params` and will be ignored.\n'
2021-03-26T22:58:55.1426453Z 
2021-03-26T22:58:55.1426773Z tests/python_package_test/test_plotting.py::test_plot_metrics
2021-03-26T22:58:55.1427558Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/plotting.py:328: UserWarning: More than one metric available, picking one to plot.
2021-03-26T22:58:55.1428202Z     _log_warning("More than one metric available, picking one to plot.")
2021-03-26T22:58:55.1428447Z 
2021-03-26T22:58:55.1428817Z tests/python_package_test/test_sklearn.py::test_binary_classification_with_custom_objective
2021-03-26T22:58:55.1429741Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/sklearn.py:922: UserWarning: Cannot compute class probabilities or labels due to the usage of customized objective function.
2021-03-26T22:58:55.1430322Z   Returning raw scores instead.
2021-03-26T22:58:55.1430677Z     _log_warning("Cannot compute class probabilities or labels "
2021-03-26T22:58:55.1430922Z 
2021-03-26T22:58:55.1431227Z tests/python_package_test/test_sklearn.py: 12 warnings
2021-03-26T22:58:55.1431967Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/basic.py:739: UserWarning: Converting data to scipy sparse matrix.
2021-03-26T22:58:55.1432656Z     _log_warning('Converting data to scipy sparse matrix.')
2021-03-26T22:58:55.1432897Z 
2021-03-26T22:58:55.1433235Z tests/python_package_test/test_utilities.py::test_register_logger
2021-03-26T22:58:55.1434135Z   /home/AzDevOps_azpcontainer/.local/lib/python3.8/site-packages/lightgbm/plotting.py:357: UserWarning: Attempting to set identical bottom == top == 1.0 results in singular transformations; automatically expanding.
2021-03-26T22:58:55.1434720Z     ax.set_ylim(ylim)
2021-03-26T22:58:55.1434898Z 
2021-03-26T22:58:55.1435376Z -- Docs: https://docs.pytest.org/en/stable/warnings.html
2021-03-26T22:58:55.1435822Z =========================== short test summary info ============================
2021-03-26T22:58:55.1436513Z FAILED ../tests/python_package_test/test_dask.py::test_classifier[binary-classification-dataframe-with-categorical]
2021-03-26T22:58:55.1437294Z FAILED ../tests/python_package_test/test_dask.py::test_classifier[multiclass-classification-dataframe-with-categorical]
2021-03-26T22:58:55.1437831Z FAILED ../tests/python_package_test/test_dask.py::test_regressor[scipy_csr_matrix]
2021-03-26T22:58:55.1438502Z FAILED ../tests/python_package_test/test_dask.py::test_regressor[dataframe-with-categorical]
2021-03-26T22:58:55.1439237Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-binary-classification]
2021-03-26T22:58:55.1440037Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-multiclass-classification]
2021-03-26T22:58:55.1440817Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-regression]
2021-03-26T22:58:55.1441557Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[array-ranking]
2021-03-26T22:58:55.1442339Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[scipy_csr_matrix-binary-classification]
2021-03-26T22:58:55.1443149Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[scipy_csr_matrix-multiclass-classification]
2021-03-26T22:58:55.1443951Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[scipy_csr_matrix-regression]
2021-03-26T22:58:55.1444743Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-binary-classification]
2021-03-26T22:58:55.1445636Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-multiclass-classification]
2021-03-26T22:58:55.1446417Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-regression]
2021-03-26T22:58:55.1447157Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-ranking]
2021-03-26T22:58:55.1447998Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-binary-classification]
2021-03-26T22:58:55.1448857Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-multiclass-classification]
2021-03-26T22:58:55.1449765Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-regression]
2021-03-26T22:58:55.1450578Z FAILED ../tests/python_package_test/test_dask.py::test_machines_should_be_used_if_provided[dataframe-with-categorical-ranking]
2021-03-26T22:58:55.1451151Z = 19 failed, 367 passed, 7 skipped, 2 xfailed, 279 warnings in 665.52s (0:11:05) =
2021-03-26T22:58:55.6498638Z ##[error]Bash exited with code '255'.
2021-03-26T22:58:55.6555869Z ##[section]Finishing: Test

I guess that the first 4 failures are quite easy to "fix" by slightly relaxing the accuracy asserted in the tests, e.g.

a = 0.998, b = 1.0, check_shape = True, check_graph = True, check_meta = True
check_chunks = True, kwargs = {}, a_original = 0.998, b_original = 1.0
adt = dtype('float64'), a_meta = None, a_computed = None, bdt = dtype('float64')
b_meta = None

or


a = 0.9631152654080015, b = 0.9769588528806643, check_shape = True
check_graph = True, check_meta = True, check_chunks = True
kwargs = {'atol': 0.01}, a_original = 0.9631152654080015
b_original = 0.9769588528806643, adt = dtype('float64'), a_meta = None
a_computed = None, bdt = dtype('float64'), b_meta = None

Also, test_dask.py::test_regressor[array] sometimes fails for the same reason.
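To make the size of the needed relaxation concrete, here is a minimal sketch that re-checks the two value pairs from the locals above under different absolute tolerances. It uses only `math.isclose` from the standard library, not the test suite's own comparison helper. Treating `kwargs = {}` as an exact comparison and using `0.02` as the relaxed tolerance are assumptions made for illustration, not thresholds proposed in this thread.

```python
import math

# (a, b, atol in effect) copied from the failing assertions above.
# kwargs = {} is modeled as atol=0.0, which is an assumption.
cases = [
    (0.998, 1.0, 0.0),
    (0.9631152654080015, 0.9769588528806643, 0.01),
]

# Hypothetical relaxed tolerance, chosen only for illustration.
RELAXED_ATOL = 0.02


def passes(a, b, atol):
    """Return True if a and b agree within the absolute tolerance atol."""
    return math.isclose(a, b, rel_tol=0.0, abs_tol=atol)


for a, b, atol in cases:
    gap = abs(a - b)
    print(
        f"gap={gap:.4f} "
        f"passes(atol={atol})={passes(a, b, atol)} "
        f"passes(atol={RELAXED_ATOL})={passes(a, b, RELAXED_ATOL)}"
    )
```

The gaps are roughly 0.002 and 0.014, so the second pair needs a tolerance above the current `atol` of 0.01 before it would pass.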

All other failures (all parametrized variants of test_machines_should_be_used_if_provided) are expected, I believe, due to the underlying differences between the socket and MPI versions.

rudra0713 · 2022-02-01T04:21:48Z

Hi, I am looking for lightgbm.dask support for the MPI version, for the following reasons:

Application: hyperparameter optimization, where multiple lgbm.fit() calls are submitted, either sequentially or in parallel.
Sockets don't work for two reasons: 1) Currently, for each trial, the socket version searches for random ports for each worker; however, due to security concerns, a node cannot open arbitrary ports. 2) The number of open ports on my system is limited, so I need to limit the number of ports used. I tried specifying the machines parameter to use a fixed set of ports for each trial while running multiple trials sequentially, but that did not work: [dask] Binding Port Failed when Using the Machines Parameter #4960 (comment)
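For readers unfamiliar with the machines parameter mentioned above: LightGBM documents it as a comma-separated list of `ip:port` entries. The sketch below only shows how such a string could be assembled from a fixed, pre-approved set of ports. The helper name `build_machines`, the addresses, and the port numbers are all hypothetical, and the snippet does not call LightGBM or Dask. It illustrates the format only; as noted in the comment, this approach reportedly still failed for sequential trials (#4960).

```python
def build_machines(hosts, ports):
    """Pair each worker host with one allowed port, formatted as 'ip:port,ip:port'.

    hosts: one entry per worker (a host may appear more than once).
    ports: pre-approved ports, at least one per worker.
    """
    if len(ports) < len(hosts):
        raise ValueError("need at least one allowed port per worker")
    pairs = list(zip(hosts, ports))
    # Two workers on the same host must not be given the same port.
    if len(set(pairs)) != len(pairs):
        raise ValueError("duplicate host:port pair")
    return ",".join(f"{host}:{port}" for host, port in pairs)


# Hypothetical worker addresses and ports, for illustration only.
machines = build_machines(["10.0.0.1", "10.0.0.2"], [12400, 12401])
print(machines)
```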

StrikerRUS added feature request dask labels Jan 24, 2021

StrikerRUS mentioned this issue Jan 24, 2021

Feature Requests & Voting Hub #2302

Open

StrikerRUS closed this as completed Jan 24, 2021

jameslamb mentioned this issue Jan 29, 2022

[dask] Shifting to Serial Tree Learner Despite Having multiple workers #4987

Closed


github-actions bot locked as resolved and limited conversation to collaborators Aug 16, 2023

microsoft unlocked this conversation Aug 18, 2023
