Run collaborator in docker #280

dmitryagapov · 2021-12-16T08:57:43Z

To give containers access to GPUs, nvidia-container-runtime must be installed on the host:
https://docs.docker.com/config/containers/resource_constraints/#gpu

Add the GPG key and the package repository for nvidia-container-runtime

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update

Install nvidia-container-runtime

sudo apt-get install nvidia-container-runtime

Ensure the nvidia-container-runtime-hook is accessible from $PATH.

which nvidia-container-runtime-hook

Restart the Docker daemon

sudo service docker restart
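
The PATH check above can also be scripted so that it runs before an envoy is started. The following is a minimal sketch, not part of this PR: the helper name `find_missing_tools` and the `REQUIRED_TOOLS` list are assumptions made for illustration, and only the standard-library `shutil.which` is used.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Binaries named in the steps above. The list itself is an illustrative
# assumption, not something defined by OpenFL.
REQUIRED_TOOLS = ("docker", "nvidia-container-runtime-hook")


def find_missing_tools(
    names: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names that cannot be resolved on $PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools are on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` performs the same search as the `which` command.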

Docker proxy:
To run Docker behind a proxy, set the proxy variables under the docker section (env and buildargs) of director_config.yaml and envoy_config.yaml:

#director_config.yaml
settings:
  listen_host: localhost
  listen_port: 50050
  sample_shape: [ '300', '400', '3' ]
  target_shape: [ '300', '400' ]
  envoy_health_check_period: 5  # in seconds
  docker:
    env:
      http_proxy:
      https_proxy:
      no_proxy:
    buildargs:
      HTTP_PROXY:
      HTTPS_PROXY:
      NO_PROXY:

#envoy_config.yaml
params:
  cuda_devices: [ 0, 2 ]
  docker:
    env:
      http_proxy:
      https_proxy:
      no_proxy:
    buildargs:
      HTTP_PROXY:
      HTTPS_PROXY:
      NO_PROXY:

optional_plugin_components:
  cuda_device_monitor:
    template: openfl.plugins.processing_units_monitor.pynvml_monitor.PynvmlCUDADeviceMonitor
    settings: [ ]

shard_descriptor:
  template: kvasir_shard_descriptor.KvasirShardDescriptor
  params:
    data_folder: kvasir_data
    rank_worldsize: 1,10
    enforce_image_hw: '300,400'
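
In the configs above the proxy keys are left empty, and a YAML key with no value parses to null (`None` in Python). The sketch below shows one way such a `docker` section could be normalized before being handed to a Docker client, by dropping unset entries. It is an assumption for illustration only: the function names `drop_unset` and `docker_settings` are hypothetical and do not describe how the `openfl.docker` module in this PR reads the config.

```python
from typing import Any, Dict, Mapping, Optional


def drop_unset(values: Optional[Mapping[str, Any]]) -> Dict[str, str]:
    """Keep only keys that have a non-empty value, converted to strings."""
    if not values:
        return {}
    return {key: str(val) for key, val in values.items() if val not in (None, "")}


def docker_settings(section: Optional[Mapping[str, Any]]) -> Dict[str, Dict[str, str]]:
    """Normalize a parsed `docker` config section (hypothetical helper).

    `section` is the already-parsed mapping found under the `docker:` key;
    a missing section or missing sub-keys yield empty dictionaries.
    """
    section = section or {}
    return {
        "env": drop_unset(section.get("env")),
        "buildargs": drop_unset(section.get("buildargs")),
    }
```

For example, a section where only `http_proxy` is filled in yields an `env` mapping with that single key and an empty `buildargs` mapping, so no empty proxy variables reach the container. The proxy address in any such call is a placeholder chosen by the user, not a value from this PR.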

Manage Docker as a non-root user:
https://docs.docker.com/engine/install/linux-postinstall/

* Implementation of director-aggregator gRPC communication
* Retry on unavailable aggregator for async client
* Check if the experiment is available
* Increase timeout
* Return instead of raising an error after timeout
* Fix test
* enforce_image_hw is string
* Col exp can be empty
* Artifacts were removed, less dependency from aggregator attribute
* Wait experiment readiness to get an aggregator client.
* Director requests validation
* Some enhancements
* Doc strings
* Update openfl/federated/plan/plan.py
  Co-authored-by: Igor Davidyuk <[email protected]>
* Update plan.py
* Remove redundant method
* Additional error handling

Co-authored-by: Igor Davidyuk <[email protected]>


psfoley · 2022-02-16T17:38:29Z

@dmitryagapov @alexey-gruzdev Can a tag be added to PRs like this to reflect that the feature is experimental or needs a design review before merge? WIP is used for PRs that aren't ready for review yet, but this seems to belong in a different category.


psfoley · 2022-04-01T16:32:55Z

openfl-tutorials/interactive_api/PyTorch_Kvasir_UNet/envoy/envoy_config_no_gpu.yaml

+  cuda_devices: [ ]
+  docker_env:
+    http_proxy:
+    https_rpoxy:


MasterSkepticista · 2024-08-12T04:48:05Z

Closing since the PR is stale. GPU support will be ported from OpenFL-Security with a new PR.

dmitryagapov added 3 commits December 8, 2021 12:25

wip

a0935f0

wip

6a5651f

wip

dacd4f1

psfoley self-requested a review December 16, 2021 15:40

dmitryagapov added 3 commits December 23, 2021 16:50

refactoring

f2afcf9

add aiodocker requirements

52b8284

refactoring

add5710

securefederatedai deleted a comment from alexey-gruzdev Jan 11, 2022

dmitryagapov and others added 16 commits January 11, 2022 18:34

refactoring

20e318e

refactoring

96d9c7d

refactoring

517494d

Merge branch 'develop' into feature/run_collaborator_in_docker

b6c1102

refactoring

51db0d0

refactoring

05c18c1

refactoring

bde6023

create docker module

f3ed426

refactoring

7e65c40

Merge branch 'develop' into feature/run_collaborator_in_docker

0dd2c7c

add --use_docker to envoy

9321627

Merge branch 'dockerezation-launch' into feature/run_collaborator_in_…

7b9c625

…docker

fix flake8

41c8399

add openfl.docker module to packages

a414cfb

fix initial tensor path

c9d0223

alexey-gruzdev added the experimental label Feb 17, 2022

dmitryagapov added 4 commits February 17, 2022 11:57

merge fix

baa5be6

add --use-docker flag for envoy

1cc73eb

add --use-docker flag for director

898a22e

fix

493eb96

dmitryagapov added 5 commits March 15, 2022 16:43

add docker proxy for director and envoy configs

9ffc06e

add docker proxy for director and envoy configs

04a387e

fix

c5457d0

add buildargs config to envoy/director configs

814bc85

add buildargs config to envoy/director configs

4923046

dmitryagapov removed a link to an issue Mar 18, 2022

Introduce a deterministic algorithm for choosing a port for the aggregator server. #342

Open

dmitryagapov added 10 commits March 23, 2022 12:27

docker config

c335c78

Merge branch 'develop' of github.com:intel/openfl into develop

488054f

Merge branch 'develop' into dockerezation-launch

c98455f

Merge branch 'develop' into feature/run_collaborator_in_docker

03ad397

Merge branch 'dockerezation-launch' into feature/run_collaborator_in_…

160b94c

…docker

refactoring

b24b995

fixes

4d510fc

fixes

e9a5a1c

add volumes for PyTorch_Kvasir_UNet

cfc178a

fix

6e481be

dmitryagapov marked this pull request as ready for review March 25, 2022 15:58

dmitryagapov requested review from aleksandr-mokrov, igor-davidyuk, alexey-khorkin, itrushkin, ViktoriiaRomanova and alexey-gruzdev March 25, 2022 15:59

dmitryagapov added 2 commits March 31, 2022 14:26

send only one model to aggregator when last == best

425bc1a

relative import to absolute

adf79c5

psfoley reviewed Apr 1, 2022


dmitryagapov added 2 commits April 4, 2022 14:08

fixes

79cf99a

Diagrams

9d9a968

MasterSkepticista closed this Aug 12, 2024
