Skip to content

Enable federated XGBoost using bootstrap aggregation in Task Runner #628

Enable federated XGBoost using bootstrap aggregation in Task Runner

Enable federated XGBoost using bootstrap aggregation in Task Runner #628

Workflow file for this run

# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
name: GaNDLF TaskRunner
on:
pull_request:
branches: [ develop ]
permissions:
contents: read
env:
# A workaround for long FQDN names provided by GitHub actions.
FQDN: "localhost"
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.10
uses: actions/setup-python@v3
with:
python-version: "3.10"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install typer==0.11.1
pip install torch==2.3.1+cpu torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
pip install .
- name: Install GaNDLF
run: |
git clone https://github.com/MLCommons/GaNDLF.git ./gandlf
cd gandlf
git fetch --tags
echo "Checkout the latest GaNDLF tag"
latestTag=$(git describe --tags "$(git rev-list --tags --max-count=1)")
git checkout $latestTag
- name: GaNDLF Task Runner Test
run: |
cd gandlf
pwd
pip install -e .
pip uninstall onnx -y
cat ./GANDLF/version.py
echo "Download data and Split CSVs into training and validation"
# python -c "from testing.test_full import test_generic_download_data, test_generic_constructTrainingCSV; test_generic_download_data(); test_generic_constructTrainingCSV()"
pytest --cov=. --cov-report=xml -k "prepare_data_for_ci"
head -n 1 testing/data/train_2d_rad_segmentation.csv > /home/runner/work/openfl/openfl/valid.csv
tail -n +9 testing/data/train_2d_rad_segmentation.csv >> /home/runner/work/openfl/openfl/valid.csv
head -n 8 testing/data/train_2d_rad_segmentation.csv > /home/runner/work/openfl/openfl/train.csv
cp testing/config_segmentation.yaml /home/runner/work/openfl/openfl/config_segmentation.yaml
echo "DEBUG display the config file"
cat /home/runner/work/openfl/openfl/config_segmentation.yaml
echo "Initialize OpenFL plan"
## from docs
export WORKSPACE_TEMPLATE=gandlf_seg_test
export WORKSPACE_PATH=./my_federation
fx workspace create --prefix ${WORKSPACE_PATH} --template ${WORKSPACE_TEMPLATE}
cd ${WORKSPACE_PATH}
mkdir ./data/one
mkdir ./data/two
cp /home/runner/work/openfl/openfl/*.csv ./data/one/
cp /home/runner/work/openfl/openfl/*.csv ./data/two/
## from docs
# fx plan initialize --gandlf_config ../testing/config_segmentation.yaml
cd /home/runner/work/openfl/openfl
ls
file "/home/runner/work/openfl/openfl/config_segmentation.yaml"
## for 2d data, only a single change is needed in the gandlf config
sed -i 's/# n_channels: 3/num_channels: 3/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
## for 3d data, the following changes are needed in the gandlf config -- commented out for now
# sed -i 's/dimension: 2/dimension: 3/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
# sed -i 's/0,255/0,1/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
# sed -i 's/128,128/32,32,32/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
python -m tests.github.test_gandlf --template gandlf_seg_test --fed_workspace aggregator --col1 one --col2 two --rounds-to-train 1 --gandlf_config "/home/runner/work/openfl/openfl/config_segmentation.yaml"