Merge branch 'main' into xgboost-cyclic
yan-gao-GY authored Dec 5, 2023
2 parents f836180 + c3347a4 commit 7d96154
Showing 80 changed files with 15,899 additions and 240 deletions.
4 changes: 2 additions & 2 deletions .devcontainer/Dockerfile
@@ -24,8 +24,8 @@ RUN SNIPPET="export PROMPT_COMMAND='history -a' && export HISTFILE=/commandhisto
     && echo $SNIPPET >> "/home/$USERNAME/.bashrc"
 
 # Install system dependencies
-RUN apt update
-RUN apt install -y curl wget gnupg python3 python-is-python3 python3-pip git \
+RUN apt-get update
+RUN apt-get install -y curl wget gnupg python3 python-is-python3 python3-pip git \
     build-essential tmux vim
 
 RUN python -m pip install \
16 changes: 10 additions & 6 deletions .devcontainer/devcontainer.json
@@ -1,11 +1,15 @@
 {
   "dockerFile": "Dockerfile",
-  "postCreateCommand": "poetry install --extras \"simulation\"",
-  "extensions": ["ms-python.python"],
-  "settings": {
-    "files.watcherExclude": {},
-    "search.exclude": {},
-    "terminal.integrated.defaultProfile.linux": "bash"
+  "postCreateCommand": "sudo poetry install --extras \"simulation\"",
+  "customizations": {
+    "vscode": {
+      "settings": {
+        "files.watcherExclude": { },
+        "search.exclude": { },
+        "terminal.integrated.defaultProfile.linux": "bash"
+      },
+      "extensions": [ "ms-python.python" ]
+    }
   },
   "remoteUser": "flwr-vscode",
   "containerEnv": {
2 changes: 1 addition & 1 deletion .github/workflows/android-release.yml
@@ -16,7 +16,7 @@ jobs:
       working-directory: src/kotlin
     name: Release build and publish
     if: github.repository == 'adap/flower'
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     steps:
       - name: Check out code
         uses: actions/checkout@v4
2 changes: 1 addition & 1 deletion .github/workflows/cpp.yml
@@ -11,7 +11,7 @@ on:
 jobs:
   build_and_test:
     name: Build and test
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
 
     steps:
       - uses: actions/checkout@v4
4 changes: 2 additions & 2 deletions .github/workflows/framework-release.yml
@@ -9,7 +9,7 @@ jobs:
   publish:
     if: ${{ github.repository == 'adap/flower' }}
     name: Publish draft
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     steps:
       - name: Checkout
         uses: actions/checkout@v4
@@ -53,7 +53,7 @@ jobs:
           cat body.md
       - name: Release
-        uses: softprops/action-gh-release@de2c0eb
+        uses: softprops/action-gh-release@v1
         with:
           body_path: ./body.md
           draft: true
2 changes: 1 addition & 1 deletion .github/workflows/update-pr.yml
@@ -6,7 +6,7 @@ on:
       - 'main'
 jobs:
   autoupdate:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-22.04
     steps:
       - name: Automatically update mergeable PRs
         uses: adRise/[email protected]
2 changes: 1 addition & 1 deletion baselines/doc/source/conf.py
@@ -36,7 +36,7 @@
 author = "The Flower Authors"
 
 # The full version, including alpha/beta/rc tags
-release = "1.6.0"
+release = "1.7.0"
 
 
 # -- General configuration ---------------------------------------------------
18 changes: 17 additions & 1 deletion datasets/doc/source/how-to-use-with-numpy.rst
@@ -3,14 +3,30 @@ Use with NumPy
 
 Let's integrate ``flwr-datasets`` with NumPy.
 
-Prepare the desired partitioning::
+Create a ``FederatedDataset``::
 
     from flwr_datasets import FederatedDataset
 
     fds = FederatedDataset(dataset="cifar10", partitioners={"train": 10})
     partition = fds.load_partition(0, "train")
     centralized_dataset = fds.load_full("test")
 
+Inspect the names of the features::
+
+    partition.features
+
+In case of CIFAR10, you should see the following output.
+
+.. code-block:: none
+
+    {'img': Image(decode=True, id=None),
+     'label': ClassLabel(names=['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog',
+                         'frog', 'horse', 'ship', 'truck'], id=None)}
+
+We will use the keys in the partition features to apply transformations to the data or pass it to an ML model. Let's move to the transformations.
+
 NumPy
 -----
 Transform to NumPy::
 
     partition_np = partition.with_format("numpy")
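The NumPy route added above can be exercised end to end. A minimal sketch, assuming ``flwr-datasets`` (with the vision extra) is installed; the ``img``/``label`` keys match the CIFAR10 features shown above, and the printed shapes are illustrative::

    # Load one of ten CIFAR10 partitions and view it as NumPy arrays
    from flwr_datasets import FederatedDataset

    fds = FederatedDataset(dataset="cifar10", partitioners={"train": 10})
    partition = fds.load_partition(0, "train")

    # with_format("numpy") makes column access return stacked NumPy arrays
    partition_np = partition.with_format("numpy")
    X, y = partition_np["img"], partition_np["label"]
    print(X.shape, y.shape)  # e.g. (5000, 32, 32, 3) (5000,)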
48 changes: 25 additions & 23 deletions datasets/doc/source/how-to-use-with-tensorflow.rst
@@ -1,10 +1,32 @@
 Use with TensorFlow
 ===================
 
-Let's integrate ``flwr-datasets`` with TensorFlow. We show you three ways how to convert the data into the formats
+Let's integrate ``flwr-datasets`` with ``TensorFlow``. We show you three ways to convert the data into the formats
 that ``TensorFlow``'s models expect. Please note that, especially for the smaller datasets, the performance of the
 following methods is very close. We recommend you choose the method you are the most comfortable with.
 
+Create a ``FederatedDataset``::
+
+    from flwr_datasets import FederatedDataset
+
+    fds = FederatedDataset(dataset="cifar10", partitioners={"train": 10})
+    partition = fds.load_partition(0, "train")
+    centralized_dataset = fds.load_full("test")
+
+Inspect the names of the features::
+
+    partition.features
+
+In case of CIFAR10, you should see the following output.
+
+.. code-block:: none
+
+    {'img': Image(decode=True, id=None),
+     'label': ClassLabel(names=['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog',
+                         'frog', 'horse', 'ship', 'truck'], id=None)}
+
+We will use the keys in the partition features to construct a `tf.data.Dataset <https://www.tensorflow.org/api_docs/python/tf/data/Dataset>`_. Let's move to the transformations.
+
 NumPy
 -----
 The first way is to transform the data into the NumPy arrays. It's an easier option that is commonly used. Feel free to
@@ -14,17 +36,7 @@ follow the :doc:`how-to-use-with-numpy` tutorial, especially if you are a beginner
 
 TensorFlow Dataset
 ------------------
-Work with ``TensorFlow Dataset`` abstraction.
-
-Standard setup::
-
-    from flwr_datasets import FederatedDataset
-
-    fds = FederatedDataset(dataset="cifar10", partitioners={"train": 10})
-    partition = fds.load_partition(0, "train")
-    centralized_dataset = fds.load_full("test")
-
-Transformation to the TensorFlow Dataset::
+Transform the data to a ``TensorFlow Dataset``::
 
     tf_dataset = partition.to_tf_dataset(columns="img", label_cols="label", batch_size=64,
                                          shuffle=True)
@@ -33,17 +45,7 @@ Transformation to the TensorFlow Dataset::
 
 TensorFlow Tensors
 ------------------
-Change the data type to TensorFlow Tensors (it's not the TensorFlow dataset).
-
-Standard setup::
-
-    from flwr_datasets import FederatedDataset
-
-    fds = FederatedDataset(dataset="cifar10", partitioners={"train": 10})
-    partition = fds.load_partition(0, "train")
-    centralized_dataset = fds.load_full("test")
-
-Transformation to the TensorFlow Tensors ::
+Transform the data to TensorFlow `tf.Tensor <https://www.tensorflow.org/api_docs/python/tf/Tensor>`_ (it's not the TensorFlow dataset)::
 
     data_tf = partition.with_format("tf")
     # Assuming you have defined your model and compiled it
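The two TensorFlow routes shown above can be combined into one runnable example. A minimal sketch, assuming TensorFlow is installed; the small Keras model is an illustrative placeholder, not part of the documented API::

    import tensorflow as tf
    from flwr_datasets import FederatedDataset

    fds = FederatedDataset(dataset="cifar10", partitioners={"train": 10})
    partition = fds.load_partition(0, "train")

    # Route 1: a batched tf.data.Dataset, ready for model.fit
    tf_dataset = partition.to_tf_dataset(columns="img", label_cols="label",
                                         batch_size=64, shuffle=True)

    # Placeholder model (assumption, for illustration only)
    model = tf.keras.Sequential([
        tf.keras.layers.Rescaling(1.0 / 255, input_shape=(32, 32, 3)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(tf_dataset, epochs=1)

    # Route 2: plain TensorFlow tensors via the "tf" format
    data_tf = partition.with_format("tf")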
33 changes: 26 additions & 7 deletions datasets/doc/source/tutorial-quickstart.rst
@@ -5,11 +5,11 @@ Run Flower Datasets as fast as possible by learning only the essentials.
 
 Install Federated Datasets
 --------------------------
-Run on the command line
+On the command line, run
 
 .. code-block:: bash
 
-  python -m pip install flwr-datasets[vision]
+  python -m pip install "flwr-datasets[vision]"
 
 Install the ML framework
 ------------------------
@@ -28,12 +28,11 @@ PyTorch
 Choose the dataset
 ------------------
 Choose the dataset by going to Hugging Face `Datasets Hub <https://huggingface.co/datasets>`_ and searching for your
-dataset by name. Note that the name is case sensitive, so make sure to pass the correct name as the `dataset` parameter
-to `FederatedDataset`.
+dataset by name that you will pass to the `dataset` parameter of `FederatedDataset`. Note that the name is case sensitive.
 
 Partition the dataset
 ---------------------
-::
+To iid partition your dataset, choose the split you want to partition and the number of partitions::
 
     from flwr_datasets import FederatedDataset
 
@@ -42,12 +41,32 @@ Partition the dataset
     centralized_dataset = fds.load_full("test")
 
 Now you're ready to go. You have ten partitions created from the train split of the MNIST dataset and the test split
-for the centralized evaluation. We will convert the type of the dataset from Hugging Face's Dataset type to the one
+for the centralized evaluation. We will convert the type of the dataset from Hugging Face's `Dataset` type to the one
 supported by your framework.
 
+Display the features
+--------------------
+Determine the names of the features of your dataset (you can alternatively do that directly on the Hugging Face
+website). The names can vary across datasets, e.g. "img" or "image", "label" or "labels". You will also see
+the names of the label categories. Type::
+
+    partition.features
+
+In case of CIFAR10, you should see the following output.
+
+.. code-block:: none
+
+    {'img': Image(decode=True, id=None),
+     'label': ClassLabel(names=['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog',
+                         'frog', 'horse', 'ship', 'truck'], id=None)}
+
+Note that the image is denoted by "img", which is crucial for the next steps (conversion to the ML
+framework of your choice).
+
 Conversion
 ----------
-For more detailed instructions, go to :doc:`how-to-use-with-pytorch`.
+For more detailed instructions, go to :doc:`how-to-use-with-pytorch`, :doc:`how-to-use-with-numpy`, or
+:doc:`how-to-use-with-tensorflow`.
 
 PyTorch DataLoader
 ^^^^^^^^^^^^^^^^^^
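Since the PyTorch DataLoader subsection is the next step the quickstart points to, here is a minimal sketch of that conversion. It assumes ``torch`` is installed and that the MNIST partition exposes ``image``/``label`` features; ``with_format("torch")`` is the generic Hugging Face ``Dataset`` route and may differ from the exact recipe in :doc:`how-to-use-with-pytorch`::

    from torch.utils.data import DataLoader
    from flwr_datasets import FederatedDataset

    fds = FederatedDataset(dataset="mnist", partitioners={"train": 10})
    partition = fds.load_partition(0, "train")

    # View the partition as torch tensors and wrap it in a DataLoader
    partition_torch = partition.with_format("torch")
    dataloader = DataLoader(partition_torch, batch_size=32, shuffle=True)
    for batch in dataloader:
        images, labels = batch["image"], batch["label"]  # assumed feature names
        break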
4 changes: 2 additions & 2 deletions dev/get-latest-changelog.sh
@@ -9,9 +9,9 @@ tags=$(git tag --sort=-v:refname)
 new_version=$(echo "$tags" | sed -n '1p')
 old_version=$(echo "$tags" | sed -n '2p')
 
-awk -v start="$new_version" -v end="$old_version" '
+awk '{sub(/<!--.*-->/, ""); print}' doc/source/ref-changelog.md | awk -v start="$new_version" -v end="$old_version" '
     $0 ~ start {flag=1; next}
     $0 ~ end {flag=0}
     flag && !printed && /^$/ {next} # skip the first blank line
     flag && !printed {printed=1}
-flag' doc/source/ref-changelog.md
+flag'
13 changes: 8 additions & 5 deletions dev/add-shortlog.sh → dev/prepare-release-changelog.sh
@@ -2,20 +2,23 @@
 set -e
 cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"/../
 
+# Get the current date in the format YYYY-MM-DD
+current_date=$(date +"%Y-%m-%d")
+
 tags=$(git tag --sort=-v:refname)
-new_version=$(echo "$tags" | sed -n '1p')
-old_version=$(echo "$tags" | sed -n '2p')
+new_version=$1
+old_version=$(echo "$tags" | sed -n '1p')
 
-shortlog=$(git shortlog "$old_version".."$new_version" -s | grep -vEi '(\(|\[)bot(\)|\])' | awk '{name = substr($0, index($0, $2)); printf "%s`%s`", sep, name; sep=", "} END {print ""}')
+shortlog=$(git shortlog "$old_version"..main -s | grep -vEi '(\(|\[)bot(\)|\])' | awk '{name = substr($0, index($0, $2)); printf "%s`%s`", sep, name; sep=", "} END {print ""}')
 
 token="<!---TOKEN_$new_version-->"
 thanks="\n### Thanks to our contributors\n\nWe would like to give our special thanks to all the contributors who made the new version of Flower possible (in \`git shortlog\` order):\n\n$shortlog $token"
 
 # Check if the token exists in the markdown file
 if ! grep -q "$token" doc/source/ref-changelog.md; then
     # If the token does not exist in the markdown file, append the new content after the version
-    awk -v version="$new_version" -v text="$thanks" \
-        '{print} $0 ~ "## " version {print text}' doc/source/ref-changelog.md > temp.md && mv temp.md doc/source/ref-changelog.md
+    awk -v version="$new_version" -v date="$current_date" -v text="$thanks" \
+        '{ if ($0 ~ "## Unreleased") print "## " version " (" date ")\n" text; else print $0 }' doc/source/ref-changelog.md > temp.md && mv temp.md doc/source/ref-changelog.md
 else
     # If the token exists, replace the line containing the token with the new shortlog
     awk -v token="$token" -v newlog="$shortlog $token" '{ if ($0 ~ token) print newlog; else print $0 }' doc/source/ref-changelog.md > temp.md && mv temp.md doc/source/ref-changelog.md
9 changes: 7 additions & 2 deletions dev/update-examples.sh
@@ -22,9 +22,14 @@ cd examples/
 for d in $(printf '%s\n' */ | sort -V); do
     example=${d%/}
     # For each example, copy the README into the source of the Example docs
-    [[ $example = doc ]] || cp $example/README.md $ROOT/examples/doc/source/$example.md 2>&1 >/dev/null
+    [[ $example != doc ]] && cp $example/README.md $ROOT/examples/doc/source/$example.md 2>&1 >/dev/null
+    # For each example, copy all images of the _static folder into the examples
+    # docs static folder
+    [[ $example != doc ]] && [ -d "$example/_static" ] && {
+        cp $example/_static/**.{jpg,png,jpeg} $ROOT/examples/doc/source/_static/ 2>/dev/null || true
+    }
     # For each example, insert the name of the example into the index file
-    [[ $example = doc ]] || (echo $INSERT_LINE; echo a; echo $example; echo .; echo wq) | ed $INDEX 2>&1 >/dev/null
+    [[ $example != doc ]] && (echo $INSERT_LINE; echo a; echo $example; echo .; echo wq) | ed $INDEX 2>&1 >/dev/null
 done
 
 echo "\`\`\`" >> $INDEX