Implement secure boost scheme - secure evaluation and validation (during training) without local feature leakage #10079

ZiyueXu77 · 2024-02-27T21:57:48Z

This PR implements part of Vertical Federated Learning with Secure Features, as discussed in #9987.
This part is independent of the encryption and the alternative vertical pipeline. Its purpose is to avoid leaking participants' real cut values, so it is submitted as a separate PR.
This PR is based on #10037, which should be reviewed and merged first.
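
To make the goal concrete, here is a toy sketch of the idea of sharing a split's bin index while keeping the real cut value with the feature owner. It is an illustration only, not the code in this PR: `local_view_of_split`, the dictionary layout, and the sample cut values are all hypothetical.

```python
from typing import Dict, List, Optional


def local_view_of_split(
    feature: int, bin_index: int, owned_cuts: Dict[int, List[float]]
) -> Dict[str, Optional[float]]:
    """Return what one participant could store for a split (hypothetical layout).

    Every participant learns the feature id and the bin index of the split.
    Only the participant that owns the feature can map the bin index to a
    real cut value; everyone else keeps no threshold at all.
    """
    cuts = owned_cuts.get(feature)
    threshold = cuts[bin_index] if cuts is not None else None
    return {"feature": feature, "bin_index": bin_index, "threshold": threshold}


# Made-up cut values: worker 0 owns feature 0, worker 1 owns feature 1.
worker_cuts = [{0: [0.5, 1.5, 2.5]}, {1: [10.0, 20.0]}]

# The best split is on feature 1 at bin 0; each worker builds its own view.
views = [local_view_of_split(1, 0, cuts) for cuts in worker_cuts]
```

In this toy setup the two views share the same feature id and bin index but differ in the stored threshold, which is one way the per-worker models can end up different without any worker seeing another worker's cut values.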

trivialfis

The code looks good to me overall. We can merge it once we have some basic unittests.

As for integration tests in Python with nvflare (in future PRs), we can assert that:

- models are different for different workers.
- predictions are the same.
- evaluation results are the same.
- only works if the 0th worker has the label.

I highly recommend using the hypothesis test framework (see python tests in xgboost and search the term hypothesis).
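
The assertions listed above could be collected into one helper along these lines. This is a minimal sketch under stated assumptions: `check_secure_vertical_results` is a hypothetical name, the inputs are assumed to be already gathered from each worker (serialized model bytes, prediction lists, one evaluation value per worker), and the sample values are made up. It uses plain `assert` statements rather than hypothesis, and it presumes the label sits on the 0th worker as noted in the list.

```python
from itertools import combinations
from typing import Sequence


def check_secure_vertical_results(
    models: Sequence[bytes],
    predictions: Sequence[Sequence[float]],
    eval_results: Sequence[float],
    tol: float = 1e-6,
) -> None:
    """Check per-worker outputs; index i in each sequence belongs to worker i."""
    # Models are expected to differ between every pair of workers.
    for (i, a), (j, b) in combinations(enumerate(models), 2):
        assert a != b, f"workers {i} and {j} hold identical models"

    # Predictions should agree across workers, within a small tolerance.
    reference = predictions[0]
    for rank, pred in enumerate(predictions):
        assert len(pred) == len(reference), f"worker {rank}: length mismatch"
        assert all(
            abs(p - q) <= tol for p, q in zip(pred, reference)
        ), f"worker {rank}: predictions differ"

    # Evaluation results should agree across workers as well.
    for rank, value in enumerate(eval_results):
        assert abs(value - eval_results[0]) <= tol, f"worker {rank}: evaluation differs"


# Toy values standing in for real per-worker outputs.
check_secure_vertical_results(
    models=[b"model-of-worker-0", b"model-of-worker-1"],
    predictions=[[0.1, 0.9], [0.1, 0.9]],
    eval_results=[0.25, 0.25],
)
```

A property-based version would generate the datasets and feature partitions with hypothesis and run the same checks on whatever the workers return.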

ZiyueXu77 · 2024-03-07T19:04:41Z

Thanks! @YuanTingHsieh, could you add the unit tests according to @trivialfis's suggestions?

trivialfis · 2024-03-07T19:32:06Z

Those points are all for integration tests, not for small unit tests. I think the integration tests in Python with nvflare will take more effort, so we don't need to rush them into this PR.

trivialfis · 2024-03-20T07:08:02Z

Hi, is there any update?

ZiyueXu77 · 2024-03-20T14:05:10Z

Thanks for asking! :) @YuanTingHsieh has been busy with a related NVFlare release for the past two weeks. Now that the release is close to finished, he will have time to work on this soon.

ZiyueXu77 · 2024-04-15T18:05:12Z

@trivialfis Yuanting just added some unit tests. There is a failed R test, but I am not sure whether it is related to our modifications. The error message is:

* checking package namespace information ... OK
* checking package dependencies ... ERROR
Packages suggested but not available: 'ggplot2', 'DiagrammeR', 'igraph'
.........
Traceback (most recent call last):
Ncpus: 4
  File "/__w/xgboost/xgboost/tests/ci_build/test_r_package.py", line 359, in <module>
    main(args)
  File "/__w/xgboost/xgboost/tests/ci_build/test_utils.py", line 52, in inner
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/__w/xgboost/xgboost/tests/ci_build/test_r_package.py", line 307, in main
    check_rpackage(tarball)
  File "/__w/xgboost/xgboost/tests/ci_build/test_utils.py", line 31, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/__w/xgboost/xgboost/tests/ci_build/test_utils.py", line 52, in inner
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/__w/xgboost/xgboost/tests/ci_build/test_r_package.py", line 166, in check_rpackage
    with open(rcheck_dir / "00install.out", "r") as fd:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'xgboost.Rcheck/00install.out'

trivialfis · 2024-04-18T21:09:20Z

That should be unrelated, will look into this PR today.

ZiyueXu77 · 2024-04-29T14:59:59Z

Hi @trivialfis , thanks for the updates, just merged it.
Everything passed, except an error for R regarding "matrix":
2024-04-29T13:40:51.6394593Z make: *** [Makefile:291: Matrix.ts] Error 1

trivialfis · 2024-04-29T15:40:45Z

Triggered the rest of the CI.

ZiyueXu77 · 2024-05-07T18:44:08Z

Hi @trivialfis , there are 3 failed checks, but I think they align with the rebase merge, shall we just merge this? Thanks!

trivialfis · 2024-05-09T01:58:44Z

Please fix errors on buildkite.

ZiyueXu77 · 2024-05-09T13:16:59Z

The error is:

C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0e4dfcc8c2daeb569-1\xgboost\xgboost-ci-windows\src\data\ellpack_page.cu(174): error : class "cuda::std::__4::tuple<size_t, size_t, size_t>" has no member "get" [C:\buildkite-agent\builds\buildkite-windows-cpu-autoscaling-group-i-0e4dfcc8c2daeb569-1\xgboost\xgboost-ci-windows\build\src\objxgboost.vcxproj]
auto e = batch.GetElement(out.get<2>());

But I do not think it is part of this PR?

ZiyueXu77 · 2024-05-09T13:22:09Z

And the other two errors are

  File "/__w/xgboost/xgboost/tests/ci_build/test_r_package.py", line 166, in check_rpackage
    with open(rcheck_dir / "00install.out", "r") as fd:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'xgboost.Rcheck/00install.out'

An error occurred (RepositoryAlreadyExistsException) when calling the CreateRepository operation: The repository with name 'xgb-ci.clang_tidy11.8' already exists in the registry with id '492475357299'

I don't quite get what they indicate.

trivialfis · 2024-05-09T21:02:30Z

Let's ignore the Windows error for now and get the Linux buildkite ci to pass: https://buildkite.com/xgboost/xgboost-ci/builds/5116#018f2a11-c525-41a1-a9d7-81bfdb1f701c

ZiyueXu77 · 2024-05-10T15:47:40Z

Thanks! Addressed the clang-tidy warning.

trivialfis · 2024-05-11T08:21:09Z

Excellent! Please continue to fix other errors in the Linux CI (at the moment, it's the memory leak reported by the leak sanitizer).

ZiyueXu77 · 2024-05-14T20:26:08Z

Linux buildkite ci passed with LeakSanitizer fix :)

ZiyueXu77 and others added 29 commits January 31, 2024 10:48

8570ba5 Add additional data split mode to cover the secure vertical pipeline
2d00db6 Add IsSecure info and update corresponding functions
ab17f5a Modify evaluate_splits to block non-label owners to perform hist compute under secure scenario
fb1787c Continue using Allgather for best split sync for secure vertical, equivalent to broadcast
7a2a2b8 Modify histogram sync scheme for secure vertical case, can identify global best split, but need to further apply split correctly
3ca3142 Sync cut informaiton across clients, full pipeline works for testing case
22dd522 Code cleanup, phase 1 of alternative vertical pipeline finished
52e8951 Code clean
e9eef15 change kColS to kColSecure to avoid confusion with kCols
70e6ca6 Add additional data split mode to cover the secure vertical pipeline
a54ea6a Add IsSecure info and update corresponding functions
6fe61dd Modify evaluate_splits to block non-label owners to perform hist compute under secure scenario
1c2b7ed Continue using Allgather for best split sync for secure vertical, equivalent to broadcast
b36ff2b Modify histogram sync scheme for secure vertical case, can identify global best split, but need to further apply split correctly
0707731 Sync cut informaiton across clients, full pipeline works for testing case
dce7609 Code cleanup, phase 1 of alternative vertical pipeline finished
6cebc31 Code clean
1562f52 change kColS to kColSecure to avoid confusion with kCols
f31c824 Add one unit test
6fcbe02 Merge branch 'SecureBoost' into add_alternate_vertical_splits
967e307 Merge pull request #1 from YuanTingHsieh/add_alternate_vertical_splits (Add alternate vertical splits)
04cd1cb Merge branch 'dmlc:master' into SecureBoost
087a8dd Merge branch 'dmlc:master' into SecureBoost
5e85438 modify inference behavior of secure vertical from split value to index for training phase
e008818 fix the logic for secure vertical inference, each client save a different model
1fd1fb0 code clean
72159b9 code clean
069f811 code clean
4e3c329 code clean

ZiyueXu77 mentioned this pull request Feb 27, 2024

Vertical Federated Learning with Secure Features (secure inference and encrypted training) RFC #9987

Closed

trivialfis reviewed Mar 7, 2024

YuanTingHsieh and others added 2 commits April 14, 2024 21:31

7cdde6f Add secure inf unit tests
be37fcd Merge pull request #3 from YuanTingHsieh/add_secure_inf_unit_tests (Add secure inf unit tests)

ZiyueXu77 requested a review from trivialfis April 15, 2024 18:05

trivialfis approved these changes Apr 19, 2024

090cb1a Merge branch 'vertical-federated-learning' into SecureBoostInf
da97000 fix clang-tidy warning
0c854a4 fix memory leakage for unit test

trivialfis merged commit 8585df5 into dmlc:vertical-federated-learning May 16, 2024
25 of 28 checks passed

ZiyueXu77 deleted the SecureBoostInf branch May 17, 2024 20:36

trivialfis pushed a commit to trivialfis/xgboost that referenced this pull request Jul 2, 2024

[fed] Evaluation and validation w/o local feature leakage (dmlc#10079)

298a9f4

cherry-pick dmlc#10079

trivialfis mentioned this pull request Jul 2, 2024

[CP] [fed] Evaluation and validation w/o local feature leakage (#10079) #10530

Merged

trivialfis added a commit that referenced this pull request Jul 2, 2024

621692c [CP] [fed] Evaluation and validation w/o local feature leakage (#10079) (#10530)
Co-authored-by: Ziyue Xu

trivialfis added a commit to trivialfis/xgboost that referenced this pull request Jul 15, 2024

11563ca [CP] [fed] Evaluation and validation w/o local feature leakage (dmlc#10079) (dmlc#10530)
Co-authored-by: Ziyue Xu
