
Commit

Fix all href, improve formatting
Signed-off-by: Shah, Karan <[email protected]>
MasterSkepticista committed Dec 19, 2024
1 parent 08a2934 commit cf844bc
Showing 4 changed files with 25 additions and 28 deletions.
8 changes: 4 additions & 4 deletions docs/about/features_index/privacy_meter.rst
@@ -21,17 +21,17 @@ In this threat model, each party can audit the privacy loss of the local and glo
Workflow
-----------------------------------------------
We provide demo code in `cifar10_PM.py <https://github.com/securefederatedai/openfl/blob/develop/openfl-tutorials/experimental/workflow/Privacy_Meter/cifar10_PM.py>`_. Here, we briefly describe its workflow.
- In each round of FL, parties train, starting with the current global model as initialization, using their local dataset. Then, the current global model and updated local model will be passed to the privacy auditing module (see the `audit` function in `cifar10_PM.py`) to produce a privacy loss report. The local model update will then be shared with the server, and all such updates are aggregated to form the next global model. Though this is a simulation in which no models are actually shared over a network, these reports could be used in a fully distributed setting to trigger actions when the loss is too high. These actions could include not sharing local updates with the aggregator, not
+ In each round of FL, parties train, starting with the current global model as initialization, using their local dataset. Then, the current global model and updated local model will be passed to the privacy auditing module (see the :code:`audit` function in :code:`cifar10_PM.py`) to produce a privacy loss report. The local model update will then be shared with the server, and all such updates are aggregated to form the next global model. Though this is a simulation in which no models are actually shared over a network, these reports could be used in a fully distributed setting to trigger actions when the loss is too high. These actions could include not sharing local updates with the aggregator, not
allowing the FL system to release the model to other outside entities, or potentially re-running local training in a differentially private mode and re-auditing in an attempt to reduce the leakage before sharing occurs.
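
To make the per-round auditing loop concrete, here is a minimal sketch of the logic described above. The :code:`party.train`, :code:`party.audit`, and :code:`aggregate` callables are hypothetical stand-ins for the corresponding steps in :code:`cifar10_PM.py`, and the gating on the report is just one example of acting on high privacy loss.

.. code-block:: python

    def run_audited_federation(global_model, parties, aggregate, n_rounds, tpr_tolerance):
        """Hedged sketch of FL rounds with per-round privacy auditing."""
        for _ in range(n_rounds):
            updates = []
            for party in parties:
                # Each party trains from the current global model on its local data.
                local_model = party.train(global_model)
                # Both models go to the auditing module to produce a privacy loss report.
                report = party.audit(global_model, local_model)
                # In a distributed deployment, high leakage could block sharing.
                if report.max_tpr <= tpr_tolerance:
                    updates.append(local_model)
            # Aggregate the shared updates to form the next global model.
            global_model = aggregate(updates)
        return global_model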

Methodology
-----------------------------------------------
We integrate the population attack from ML Privacy Meter into OpenFL. In the population attack, the adversary first computes the signal (e.g., loss, logits) on all samples in a population dataset using the target model. The population dataset is sampled from the same distribution as the train and test datasets, but is non-overlapping with both. Because all population data are known not to be target training samples, the population signals are then used to determine a signal threshold at which false positives (samples whose signal would erroneously identify them as target training samples) occur at a rate below a provided false positive rate tolerance. Known positives (target training samples) as well as known negatives (target test samples) are tested against the threshold to determine how well this threshold does at classifying training set membership.
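
The threshold selection can be sketched in a few lines of NumPy. This is a minimal illustration assuming the loss signal (members tend to have lower loss); the helper below is not part of the OpenFL or Privacy Meter APIs.

.. code-block:: python

    import numpy as np

    def population_attack(pop_losses, member_losses, nonmember_losses, fpr_tolerance):
        """Pick a threshold from population losses, then classify membership."""
        # Population samples are known non-members, so the fpr_tolerance-quantile
        # of their losses keeps the false positive rate below the tolerance.
        threshold = np.quantile(np.asarray(pop_losses), fpr_tolerance)
        # A loss at or below the threshold is flagged as "training member".
        tpr = np.mean(np.asarray(member_losses) <= threshold)     # known positives caught
        fpr = np.mean(np.asarray(nonmember_losses) <= threshold)  # known negatives misflagged
        return threshold, tpr, fpr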

- Therefore, to use this attack for auditing privacy, we assume there is a set of data points used for auditing that does not overlap with the training dataset. The size of the auditing dataset is indicated by the `audit_dataset_ratio` argument. In addition, we also need to define which signal will be used to distinguish members and non-members. Currently, we support loss, logits and gradient norm. When the gradient norm is used for inferring the membership information, we need to specify which layer of the model we would like to compute the gradient with respect to. For instance, if we want to measure the gradient norm with respect to the 10th layer of the representation (before the fully connected layers), we can pass the arguments `--is_feature True` and `--layer_number 10` to `cifar10_PM.py`.
+ Therefore, to use this attack for auditing privacy, we assume there is a set of data points used for auditing that does not overlap with the training dataset. The size of the auditing dataset is indicated by the :code:`audit_dataset_ratio` argument. In addition, we also need to define which signal will be used to distinguish members and non-members. Currently, we support loss, logits and gradient norm. When the gradient norm is used for inferring the membership information, we need to specify which layer of the model we would like to compute the gradient with respect to. For instance, if we want to measure the gradient norm with respect to the 10th layer of the representation (before the fully connected layers), we can pass the arguments :code:`--is_feature True` and :code:`--layer_number 10` to :code:`cifar10_PM.py`.

- To measure the success of the attack (privacy loss), we generate the ROC of the attack and the dynamics of the AUC during the training. In addition, parties can also indicate the false positive rate tolerance, and the privacy loss report will show the maximal true positive rate (fraction of members that are correctly identified) during the training. This false positive rate tolerance is passed via the `fpr_tolerance` argument. The privacy loss report will be saved in the folder indicated by the `log_dir` argument.
+ To measure the success of the attack (privacy loss), we generate the ROC of the attack and the dynamics of the AUC during the training. In addition, parties can also indicate the false positive rate tolerance, and the privacy loss report will show the maximal true positive rate (fraction of members that are correctly identified) during the training. This false positive rate tolerance is passed via the :code:`fpr_tolerance` argument. The privacy loss report will be saved in the folder indicated by the :code:`log_dir` argument.
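
The reported metrics can be sketched with scikit-learn's ROC utilities. This is an illustrative snippet, not the Privacy Meter implementation; losses are negated so that higher scores indicate membership, matching the loss-based attack above.

.. code-block:: python

    import numpy as np
    from sklearn.metrics import auc, roc_curve

    def attack_metrics(member_losses, nonmember_losses, fpr_tolerance):
        """Compute the attack AUC and the maximal TPR within the FPR tolerance."""
        labels = np.concatenate([np.ones(len(member_losses)), np.zeros(len(nonmember_losses))])
        scores = -np.concatenate([member_losses, nonmember_losses])  # lower loss => member
        fpr, tpr, _ = roc_curve(labels, scores)
        max_tpr = tpr[fpr <= fpr_tolerance].max()  # maximal TPR at the tolerance
        return auc(fpr, tpr), max_tpr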

Examples
-----------------------------------------------
- `Here <https://github.com/securefederatedai/openfl/tree/f1657abe88632d542504d6d71ca961de9333913f/openfl-tutorials/experimental/workflow/Privacy_Meter>`_, we give a few commands and the results for each of them.
+ `Here <https://github.com/securefederatedai/openfl/tree/develop/openfl-tutorials/experimental/workflow/Privacy_Meter>`_, we give a few commands and the results for each of them.
35 changes: 17 additions & 18 deletions docs/about/features_index/workflowinterface.rst
@@ -3,30 +3,29 @@
Workflow API
============

- **Important Note**
-
- The OpenFL workflow interface is experimental and subject to change. For an overview of options supported to setup Federation and run FL experiments, see `Features <../features.rst>`_
+ .. note::
+     This is experimental functionality, subject to change. For an overview of the supported options for setting up FL experiments, refer to `features <../features.html>`_.

What is it?
===========

- A new OpenFL interface that gives significantly more flexibility to researchers in the construction of federated learning experiments. It is heavily influenced by the interface and design of `Metaflow`, the popular framework for data scientists originally developed at Netflix. There are several reasons we converged on Metaflow as inspiration for our work:
+ A new OpenFL interface that gives significantly more flexibility to researchers in the construction of federated learning experiments. It is heavily influenced by the interface and design of `Metaflow <https://metaflow.org/>`_, a framework originally developed at Netflix. There are several reasons we converged on Metaflow as inspiration for our work:

- 1. Clean expression of task sequence. Flows start with a `start` task and end with `end`. The next task in the sequence is called by `self.next`.
- 2. Easy selection of what should be sent between tasks using `include` or `exclude`.
+ 1. Clean expression of task sequence. Flows start with a :code:`start` task and end with :code:`end`. The next task in the sequence is called by :code:`self.next`.
+ 2. Easy selection of what should be sent between tasks using :code:`include` or :code:`exclude`.
3. Excellent tooling ecosystem: the Metaflow client gives easy access to prior runs, tasks, and data artifacts generated by an experiment.

There are several modifications we make in our reimagined version of this interface that are necessary for federated learning:

- 1. *Placement*: Metaflow's `@step` decorator is replaced by placement decorators that specify where a task will run. In horizontal federated learning, there are server (or aggregator) and client (or collaborator) nodes. Tasks decorated by `@aggregator` will run on the aggregator node, and `@collaborator` will run on the collaborator node. These placement decorators are interpreted by *Runtime* implementations: these do the heavy lifting of figuring out how to get the state of the current task to another process or node.
- 2. *Runtime*: Each flow has a `.runtime` attribute. The runtime encapsulates the details of the infrastructure where the flow will run. We support the LocalRuntime for simulating experiments on a local node, and the FederatedRuntime for launching experiments on distributed infrastructure.
+ 1. *Placement*: Metaflow's :code:`@step` decorator is replaced by placement decorators that specify where a task will run. In horizontal federated learning, there are server (or aggregator) and client (or collaborator) nodes. Tasks decorated by :code:`@aggregator` will run on the aggregator node, and :code:`@collaborator` will run on the collaborator node. These placement decorators are interpreted by *Runtime* implementations: these do the heavy lifting of figuring out how to get the state of the current task to another process or node.
+ 2. *Runtime*: Each flow has a :code:`.runtime` attribute. The runtime encapsulates the details of the infrastructure where the flow will run. We support the LocalRuntime for simulating experiments on a local node, and the FederatedRuntime for launching experiments on distributed infrastructure.
3. *Conditional branches*: Perform different tasks if a criterion is met.
4. *Loops*: Internal loops are within a flow; this is necessary to support rounds of training where the same sequence of tasks is performed repeatedly.
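
Taken together, a minimal flow combining these conventions might look like the sketch below. This is an illustrative example, not a complete experiment; the import paths follow recent OpenFL releases and may differ in your version.

.. code-block:: python

    from openfl.experimental.workflow.interface import FLSpec
    from openfl.experimental.workflow.placement import aggregator, collaborator

    class MinimalFlow(FLSpec):

        @aggregator
        def start(self):
            # Runs on the aggregator; fan out to every collaborator.
            self.collaborators = self.runtime.collaborators
            self.scratch = 'large temporary state'
            self.next(self.train, foreach='collaborators', exclude=['scratch'])

        @collaborator
        def train(self):
            # Runs on each collaborator node.
            self.loss = 0.0  # placeholder for real local training
            self.next(self.join)

        @aggregator
        def join(self, inputs):
            # Collect per-collaborator state back on the aggregator.
            self.avg_loss = sum(i.loss for i in inputs) / len(inputs)
            self.next(self.end)

        @aggregator
        def end(self):
            print(f'Flow complete. Average loss: {self.avg_loss}')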

How to use it?
==============

- Let's start with the basics. A flow is intended to define the entirety of a federated learning experiment. Every flow begins with the `start` task and concludes with the `end` task. At each step in the flow, attributes can be defined, modified, or deleted. Attributes get passed forward to the next step in the flow, which is defined by the name of the task passed to the `next` function. In the line before each task, there is a **placement decorator**. The placement decorator defines where that task will be run. The OpenFL Workflow Interface adopts Metaflow's convention that every workflow begins with the start task and concludes with the end task. In the following example, the aggregator begins with an optionally passed-in model and optimizer. The aggregator begins the flow with the start task, where the list of collaborators is extracted from the runtime (:code:`self.collaborators = self.runtime.collaborators`) and is then used as the list of participants to run the task listed in self.next, aggregated_model_validation. The model, optimizer, and anything that is not explicitly excluded from the next function will be passed from the start function on the aggregator to the aggregated_model_validation task on the collaborator. Where the tasks run is determined by the placement decorator that precedes each task definition (:code:`@aggregator` or :code:`@collaborator`). Once each of the collaborators (defined in the runtime) completes the aggregated_model_validation task, they pass their current state on to the train task, from train to local_model_validation, and then finally to join at the aggregator. It is in join that an average is taken of the model weights, and the next round can begin.
+ Let's start with the basics. A flow is intended to define the entirety of a federated learning experiment. Every flow begins with the :code:`start` task and concludes with the :code:`end` task. At each step in the flow, attributes can be defined, modified, or deleted. Attributes get passed forward to the next step in the flow, which is defined by the name of the task passed to the :code:`next` function. In the line before each task, there is a **placement decorator**. The placement decorator defines where that task will be run. The OpenFL Workflow Interface adopts Metaflow's convention that every workflow begins with the start task and concludes with the end task. In the following example, the aggregator begins with an optionally passed-in model and optimizer. The aggregator begins the flow with the start task, where the list of collaborators is extracted from the runtime (:code:`self.collaborators = self.runtime.collaborators`) and is then used as the list of participants to run the task listed in self.next, aggregated_model_validation. The model, optimizer, and anything that is not explicitly excluded from the next function will be passed from the start function on the aggregator to the aggregated_model_validation task on the collaborator. Where the tasks run is determined by the placement decorator that precedes each task definition (:code:`@aggregator` or :code:`@collaborator`). Once each of the collaborators (defined in the runtime) completes the aggregated_model_validation task, they pass their current state on to the train task, from train to local_model_validation, and then finally to join at the aggregator. It is in join that an average is taken of the model weights, and the next round can begin.

.. code-block:: python
@@ -133,7 +132,7 @@ Goals
Workflow Interface API
======================

- The workflow interface formulates the experiment as a series of tasks, or a flow. Every flow begins with the `start` task and concludes with `end`.
+ The workflow interface formulates the experiment as a series of tasks, or a flow. Every flow begins with the :code:`start` task and concludes with :code:`end`.

Runtimes
========
@@ -174,7 +173,7 @@ You can simulate a Federated Learning experiment locally using :code:`LocalRunti
local_runtime = LocalRuntime(aggregator=aggregator, collaborators=collaborators, backend='single_process')
- Let's break this down, starting with the :code:`Aggregator` and :code:`Collaborator` components. These components represent the *Participants* in a Federated Learning experiment. Each participant has its own set of *private attributes*. As the name suggests, these *private attributes* represent private information they do not want to share with others, and will be filtered out when there is a transition from the aggregator to the collaborator or vice versa. In the example above, each collaborator has its own `train_dataloader` and `test_dataloader` that are only available when that collaborator is performing its tasks via `self.train_loader` and `self.test_loader`. Once those collaborators transition to a task at the aggregator, this private information is filtered out and the remaining collaborator state can safely be sent back to the aggregator.
+ Let's break this down, starting with the :code:`Aggregator` and :code:`Collaborator` components. These components represent the *Participants* in a Federated Learning experiment. Each participant has its own set of *private attributes*. As the name suggests, these *private attributes* represent private information they do not want to share with others, and will be filtered out when there is a transition from the aggregator to the collaborator or vice versa. In the example above, each collaborator has its own :code:`train_dataloader` and :code:`test_dataloader` that are only available when that collaborator is performing its tasks via :code:`self.train_loader` and :code:`self.test_loader`. Once those collaborators transition to a task at the aggregator, this private information is filtered out and the remaining collaborator state can safely be sent back to the aggregator.

These *private attributes* need to be set in the form of a user-defined dictionary, where the key is the name of the attribute and the value is the object. In this example, :code:`collaborator.private_attributes` sets the collaborator *private attributes* :code:`train_loader` and :code:`test_loader` that are accessed by collaborator steps (:code:`aggregated_model_validation`, :code:`train` and :code:`local_model_validation`).
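
As an illustration, the private attributes for two collaborators could be set up as follows. This is a hedged sketch: :code:`make_shard_loader` is a hypothetical helper for building per-collaborator dataloaders, and import paths may vary across OpenFL versions.

.. code-block:: python

    from openfl.experimental.workflow.interface import Aggregator, Collaborator
    from openfl.experimental.workflow.runtime import LocalRuntime

    aggregator = Aggregator()

    collaborators = []
    for name in ['env_one', 'env_two']:
        collab = Collaborator(name=name)
        # Key = attribute name the flow accesses; value = the private object.
        collab.private_attributes = {
            'train_loader': make_shard_loader(name, split='train'),  # hypothetical helper
            'test_loader': make_shard_loader(name, split='test'),    # hypothetical helper
        }
        collaborators.append(collab)

    local_runtime = LocalRuntime(aggregator=aggregator,
                                 collaborators=collaborators,
                                 backend='single_process')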

@@ -226,7 +225,7 @@ Participant *private attributes* are returned by the callback function in form o
Some important points to remember while creating the callback function and private attributes are:

- The callback function needs to be defined by the user and should return the *private attributes* required by the participant as key/value pairs
- - The callback function can be provided with any parameters required as arguments. In this example, the parameters essential for the callback function are supplied with corresponding values bearing the *same names* during the instantiation of the Collaborator
+ - The callback function can be provided with any parameters required as arguments. In this example, the parameters essential for the callback function are supplied with corresponding values bearing the same names during the instantiation of the Collaborator

* :code:`index`: Index of the particular collaborator needed to shard the dataset
* :code:`n_collaborators`: Total number of collaborators across which the dataset is sharded
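
A sketch of such a callback and its wiring is shown below. The :code:`load_shard` helper and the exact :code:`Collaborator` keyword arguments are assumptions for illustration; consult the tutorial notebooks for the exact signatures.

.. code-block:: python

    from torch.utils.data import DataLoader

    from openfl.experimental.workflow.interface import Collaborator

    def callable_to_initialize_collaborator_private_attributes(index, n_collaborators, batch_size):
        # Shard the dataset for this collaborator (hypothetical helper).
        train_shard, test_shard = load_shard(index, n_collaborators)
        # Return private attributes as key/value pairs.
        return {
            'train_loader': DataLoader(train_shard, batch_size=batch_size, shuffle=True),
            'test_loader': DataLoader(test_shard, batch_size=batch_size),
        }

    # Arguments below bear the same names as the callback parameters.
    collaborator = Collaborator(
        name='env_one',
        private_attributes_callable=callable_to_initialize_collaborator_private_attributes,
        index=0,
        n_collaborators=2,
        batch_size=32,
    )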
@@ -297,7 +296,7 @@ First step is to create the participants in the Federation: the Director and Env

**Director: The central node in the Federation**

- The `fx director start` command is used to start the Director. You can run it with or without TLS, depending on your setup.
+ The :code:`fx director start` command is used to start the Director. You can run it with or without TLS, depending on your setup.

**With TLS:**
Use the following command:
@@ -370,7 +369,7 @@ Use the following command:
- `-oc <api_certificate_path>`: Path to the API certificate file (used with TLS).
- `--disable-tls`: Disables TLS encryption.

- The Envoy configuration file includes details about the private attributes. An example configuration file `envoy_config.yaml` for `envoy_one` is shown below:
+ The Envoy configuration file includes details about the private attributes. An example configuration file :code:`envoy_config.yaml` for :code:`envoy_one` is shown below:

.. code-block:: yaml
@@ -389,9 +388,9 @@ Now we proceed to instantiate the :code:`FederatedRuntime` to facilitate the dep
- Port number on which the Director is listening.
- (Optional) Certificate information for TLS:

- - `cert_chain`: Path to the certificate chain.
- - `api_cert`: Path to the API certificate.
- - `api_private_key`: Path to the API private key.
+ - :code:`cert_chain`: Path to the certificate chain.
+ - :code:`api_cert`: Path to the API certificate.
+ - :code:`api_private_key`: Path to the API private key.

2. **collaborators**

@@ -402,7 +401,7 @@

File path to the Jupyter notebook defining the experiment logic.

- Below is an example of how to set up and instantiate a `FederatedRuntime`:
+ Below is an example of how to set up and instantiate a :code:`FederatedRuntime`:

.. code-block:: python
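
For reference, a minimal sketch of such a setup is shown below. The :code:`director_info` keys and the :code:`FederatedRuntime` arguments mirror the parameters described above, but their exact names are assumptions that may differ across OpenFL versions; :code:`flflow` stands for an existing flow instance, and the Envoy names are illustrative.

.. code-block:: python

    from openfl.experimental.workflow.runtime import FederatedRuntime

    director_info = {
        'director_node_fqdn': 'localhost',  # Director address (assumed key name)
        'director_port': 50050,             # port the Director listens on
        # Optional TLS material:
        # 'cert_chain': 'cert/root_ca.crt',
        # 'api_cert': 'cert/api.crt',
        # 'api_private_key': 'cert/api.key',
    }

    federated_runtime = FederatedRuntime(
        collaborators=['envoy_one', 'envoy_two'],  # Envoy names in the Federation
        director=director_info,
        notebook_path='./workflow_experiment.ipynb',
    )

    flflow.runtime = federated_runtime  # attach to an existing flow instance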
8 changes: 3 additions & 5 deletions docs/index.rst
@@ -1,6 +1,3 @@
- .. # Copyright (C) 2020-2024 Intel Corporation
- .. # SPDX-License-Identifier: Apache-2.0.
=================
Overview
=================

@@ -10,7 +7,8 @@ OpenFL is a community supported project originally developed by Intel Labs and t

.. note::

-     This project is continually being developed and improved. Expect changes to this manual, the project code, and the project design. We encourage community contributions! Refer to the `Contributing <about/contributing.html>`_ section for more information.
+     This project is continually being developed and improved. Expect changes to this manual, the project code, and the project design.
+     We encourage community contributions! Refer to the `contributing <contributing.html>`_ guidelines for more details.

Training of statistical models may be done with any deep learning framework, such as `TensorFlow <https://www.tensorflow.org/>`_\* \ or `PyTorch <https://pytorch.org/>`_\*\, via a plugin mechanism.

@@ -104,4 +102,4 @@ Round
deprecation
about/blogs_publications
about/license
-    about/notices_and_disclaimers
+    about/notices_and_disclaimers
