From cf844bc7127a78c118ab023f3fcc7f8dca1e3c92 Mon Sep 17 00:00:00 2001
From: Karan Shah
Date: Thu, 19 Dec 2024 09:08:19 +0530
Subject: [PATCH] Fix all href, improve formatting

Signed-off-by: Shah, Karan
---
 docs/about/features_index/privacy_meter.rst  |  8 ++---
 .../features_index/workflowinterface.rst     | 35 +++++++++----------
 docs/index.rst                               |  8 ++---
 docs/tutorials/taskrunner.ipynb              |  2 +-
 4 files changed, 25 insertions(+), 28 deletions(-)

diff --git a/docs/about/features_index/privacy_meter.rst b/docs/about/features_index/privacy_meter.rst
index 6852eda1e7e..45f4b2d2140 100644
--- a/docs/about/features_index/privacy_meter.rst
+++ b/docs/about/features_index/privacy_meter.rst
@@ -21,17 +21,17 @@ In this threat model, each party can audit the privacy loss of the local and glo
 Workflow
 -----------------------------------------------
 We provide a demo code in `cifar10_PM.py `_. Here, we briefly describe its workflow.
-In each round of FL, parties train, starting with the current global model as initialization, using their local dataset. Then, the current global model and updated local model will be passed to the privacy auditing module (See `audit` function in `cifar10_PM.py`) to produce a privacy loss report. The local model update will then be shared to the server and all such updates aggregated to form the next global model. Though this is a simulation so that no network sharing of models is involved, these reports could be used in a fully distributed setting to trigger actions when the loss is too high. These actions could include not sharing local updates to the aggregator, not
+In each round of FL, parties train, starting with the current global model as initialization, using their local dataset. Then, the current global model and updated local model will be passed to the privacy auditing module (See :code:`audit` function in :code:`cifar10_PM.py`) to produce a privacy loss report. The local model update will then be shared with the server and all such updates aggregated to form the next global model. Though this is a simulation so that no network sharing of models is involved, these reports could be used in a fully distributed setting to trigger actions when the loss is too high. These actions could include not sharing local updates to the aggregator, not
 allowing the FL system to release the model to other outside entities, or potentially re-running local training in a differentially private mode and re-auditing in an attempt to reduce the leakage before sharing occurs.
 
 Methodology
 -----------------------------------------------
 We integrate the population attack from ML Privacy Meter into OpenFL. In the population attack, the adversary first computes the signal (e.g., loss, logits) on all samples in a population dataset using the target model. The population dataset is sampled from the same distribution as the train and test datasets, but is non-overlapping with both. The population dataset signals are then used to determine (using the fact that all population data are known not to be target training samples) a signal threshold for which false positives (samples whose signal against the threshold would be erroneously identified as target training samples) would occur at a rate below a provided false positive rate tolerance. Known positives (target training samples) as well as known negatives (target test samples) are tested against the threshold to determine how well this threshold does at classifying training set memberhsip.
-Therefore, to use this attack for auditing privacy, we assume there is a set of data points used for auditing which is not overlapped with the training dataset. The size of the auditing dataset is indicated by `audit_dataset_ratio` argument. In addition, we also need to define which signal will be used to distinguish members and non-members. Currently, we support loss, logits and gradient norm. When the gradient norm is used for inferring the membership information, we need to specify which layer of the model we would like to compute the gradient with respect to. For instance, if we want to measure the gradient norm with respect to the 10th layer of the representation (before the fully connected layers), we can pass the following argument `--is_feature True` and `--layer_number 10` to the `cifar10_PM.py`.
+Therefore, to use this attack for auditing privacy, we assume there is a set of data points used for auditing which does not overlap with the training dataset. The size of the auditing dataset is indicated by the :code:`audit_dataset_ratio` argument. In addition, we also need to define which signal will be used to distinguish members and non-members. Currently, we support loss, logits and gradient norm. When the gradient norm is used for inferring the membership information, we need to specify which layer of the model we would like to compute the gradient with respect to. For instance, if we want to measure the gradient norm with respect to the 10th layer of the representation (before the fully connected layers), we can pass the following arguments :code:`--is_feature True` and :code:`--layer_number 10` to :code:`cifar10_PM.py`.
 
-To measure the success of the attack (privacy loss), we generate the ROC of the attack and the dynamic of the AUC during the training. In addition, parties can also indicate the false positive rate tolerance, and the privacy loss report will show the maximal true positive rate (fraction of members which is correctly identified) during the training. This false positive rate tolerance is passed to `fpr_tolerance` argument. The privacy loss report will be saved in the folder indicated by `log_dir` argument.
+To measure the success of the attack (privacy loss), we generate the ROC of the attack and the dynamic of the AUC during the training. In addition, parties can also indicate the false positive rate tolerance, and the privacy loss report will show the maximal true positive rate (fraction of members which are correctly identified) during the training. This false positive rate tolerance is passed to the :code:`fpr_tolerance` argument. The privacy loss report will be saved in the folder indicated by the :code:`log_dir` argument.
 
 Examples
 -----------------------------------------------
-`Here `_, we give a few commands and the results for each of them.
\ No newline at end of file
+`Here `_, we give a few commands and the results for each of them.
\ No newline at end of file
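For intuition, here is a minimal, self-contained sketch of the thresholding logic described in the hunk above, using per-sample loss as the membership signal. It does not use the ML Privacy Meter or OpenFL APIs; the function name and the synthetic loss values are purely illustrative.

.. code-block:: python

    import numpy as np

    def population_attack_audit(population_loss, train_loss, test_loss, fpr_tolerance):
        """Illustrative population-attack audit.

        Pick a loss threshold from the population (known non-members) so that at
        most `fpr_tolerance` of them would be flagged as members, then measure how
        many true members (train) and non-members (test) fall under the threshold.
        """
        # Lower loss suggests membership, so use the fpr_tolerance quantile
        # of the population losses as the decision threshold.
        threshold = np.quantile(population_loss, fpr_tolerance)
        tpr = float(np.mean(train_loss <= threshold))   # members correctly flagged
        fpr = float(np.mean(test_loss <= threshold))    # non-members wrongly flagged
        return threshold, tpr, fpr

    # Hypothetical usage with synthetic per-sample losses.
    rng = np.random.default_rng(0)
    population = rng.normal(1.0, 0.3, 5000)   # population (non-member) losses
    members = rng.normal(0.6, 0.3, 1000)      # target training-set losses
    non_members = rng.normal(1.0, 0.3, 1000)  # target test-set losses
    print(population_attack_audit(population, members, non_members, fpr_tolerance=0.1))

A large gap between the reported true positive rate and the false positive rate tolerance indicates higher membership leakage, which is the quantity the privacy loss report tracks over training rounds.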
diff --git a/docs/about/features_index/workflowinterface.rst b/docs/about/features_index/workflowinterface.rst
index 4e70f7062a3..6dc9527ba17 100644
--- a/docs/about/features_index/workflowinterface.rst
+++ b/docs/about/features_index/workflowinterface.rst
@@ -3,30 +3,29 @@ Workflow API
 ============
 
-**Important Note**
-
-The OpenFL workflow interface is experimental and subject to change. For an overview of options supported to setup Federation and run FL experiments, see `Features <../features.rst>`_
+.. note::
+    This is an experimental functionality and subject to change. For an overview of options supported to set up FL experiments, refer to `features <../features.html>`_.
 
 What is it?
 ===========
 
-A new OpenFL interface that gives significantly more flexility to researchers in the construction of federated learning experiments. It is heavily influenced by the interface and design of `Metaflow` , the popular framework for data scientists originally developed at Netflix. There are several reasons we converged on Metaflow as inspiration for our work:
+A new OpenFL interface that gives significantly more flexibility to researchers in the construction of federated learning experiments. It is heavily influenced by the interface and design of `Metaflow `_, a framework originally developed at Netflix. There are several reasons we converged on Metaflow as inspiration for our work:
 
-1. Clean expression of task sequence. Flows start with a `start` task, and end with `end`. The next task in the sequence is called by `self.next`.
-2. Easy selection of what should be sent between tasks using `include` or `exclude`
+1. Clean expression of task sequence. Flows start with a :code:`start` task, and end with :code:`end`. The next task in the sequence is called by :code:`self.next`.
+2. Easy selection of what should be sent between tasks using :code:`include` or :code:`exclude`
 3. Excellent tooling ecosystem: the metaflow client gives easy access to prior runs, tasks, and data artifacts generated by an experiment.
 
 There are several modifications we make in our reimagined version of this interface that are necessary for federated learning:
 
-1. *Placement*: Metaflow's `@step` decorator is replaced by placement decorators that specify where a task will run. In horizontal federated learning, there are server (or aggregator) and client (or collaborator) nodes. Tasks decorated by `@aggregator` will run on the aggregator node, and `@collaborator` will run on the collaborator node. These placement decorators are interpreted by *Runtime* implementations: these do the heavy lifting of figuring out how to get the state of the current task to another process or node.
-2. *Runtime*: Each flow has a `.runtime` attribute. The runtime encapsulates the details of the infrastucture where the flow will run. We support the LocalRuntime for simulating experiments on local node and FederatedRuntime to launch experiments on distributed infrastructure.
+1. *Placement*: Metaflow's :code:`@step` decorator is replaced by placement decorators that specify where a task will run. In horizontal federated learning, there are server (or aggregator) and client (or collaborator) nodes. Tasks decorated by :code:`@aggregator` will run on the aggregator node, and :code:`@collaborator` will run on the collaborator node. These placement decorators are interpreted by *Runtime* implementations: these do the heavy lifting of figuring out how to get the state of the current task to another process or node.
+2. *Runtime*: Each flow has a :code:`.runtime` attribute. The runtime encapsulates the details of the infrastructure where the flow will run. We support the LocalRuntime for simulating experiments on a local node and the FederatedRuntime to launch experiments on distributed infrastructure.
 3. *Conditional branches*: Perform different tasks if a criteria is met
 4. *Loops*: Internal loops are within a flow; this is necessary to support rounds of training where the same sequence of tasks is performed repeatedly.
 
 How to use it?
 ==============
-Let's start with the basics. A flow is intended to define the entirety of federated learning experiment. Every flow begins with the `start` task and concludes with the `end` task. At each step in the flow, attributes can be defined, modified, or deleted. Attributes get passed forward to the next step in the flow, which is defined by the name of the task passed to the `next` function. In the line before each task, there is a **placement decorator**. The placement decorator defines where that task will be run. The OpenFL Workflow Interface adopts the conventions set by Metaflow, that every workflow begins with start and concludes with the end task. In the following example, the aggregator begins with an optionally passed in model and optimizer. The aggregator begins the flow with the start task, where the list of collaborators is extracted from the runtime (:code:`self.collaborators = self.runtime.collaborators`) and is then used as the list of participants to run the task listed in self.next, aggregated_model_validation. The model, optimizer, and anything that is not explicitly excluded from the next function will be passed from the start function on the aggregator to the aggregated_model_validation task on the collaborator. Where the tasks run is determined by the placement decorator that precedes each task definition (:code:`@aggregator` or :code:`@collaborator`). Once each of the collaborators (defined in the runtime) complete the aggregated_model_validation task, they pass their current state onto the train task, from train to local_model_validation, and then finally to join at the aggregator. It is in join that an average is taken of the model weights, and the next round can begin.
+Let's start with the basics. A flow is intended to define the entirety of a federated learning experiment. Every flow begins with the :code:`start` task and concludes with the :code:`end` task. At each step in the flow, attributes can be defined, modified, or deleted. Attributes get passed forward to the next step in the flow, which is defined by the name of the task passed to the :code:`next` function. In the line before each task, there is a **placement decorator**. The placement decorator defines where that task will be run. The OpenFL Workflow Interface adopts the conventions set by Metaflow, that every workflow begins with start and concludes with the end task. In the following example, the aggregator begins with an optionally passed in model and optimizer. The aggregator begins the flow with the start task, where the list of collaborators is extracted from the runtime (:code:`self.collaborators = self.runtime.collaborators`) and is then used as the list of participants to run the task listed in self.next, aggregated_model_validation. The model, optimizer, and anything that is not explicitly excluded from the next function will be passed from the start function on the aggregator to the aggregated_model_validation task on the collaborator. Where the tasks run is determined by the placement decorator that precedes each task definition (:code:`@aggregator` or :code:`@collaborator`). Once each of the collaborators (defined in the runtime) complete the aggregated_model_validation task, they pass their current state onto the train task, from train to local_model_validation, and then finally to join at the aggregator. It is in join that an average is taken of the model weights, and the next round can begin.
 
 .. code-block:: python
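The example flow referenced above lies outside this diff's context lines. Purely for orientation, a minimal sketch of such a flow is shown below. The class name, placeholder task bodies, and the fixed three-round loop are illustrative rather than taken from the OpenFL documentation, and the import paths vary across OpenFL versions (older releases use :code:`openfl.experimental.interface` / :code:`openfl.experimental.placement`), so verify them against your installation.

.. code-block:: python

    from openfl.experimental.workflow.interface import FLSpec
    from openfl.experimental.workflow.placement import aggregator, collaborator


    class FederatedFlow(FLSpec):
        def __init__(self, model=None, optimizer=None, rounds=3, **kwargs):
            super().__init__(**kwargs)
            self.model = model
            self.optimizer = optimizer
            self.rounds = rounds

        @aggregator
        def start(self):
            # The collaborator list comes from the runtime and defines the fan-out.
            self.collaborators = self.runtime.collaborators
            self.current_round = 0
            self.next(self.aggregated_model_validation, foreach='collaborators')

        @collaborator
        def aggregated_model_validation(self):
            # Placeholder: evaluate self.model on the private self.test_loader here.
            self.agg_validation_score = 0.0
            self.next(self.train)

        @collaborator
        def train(self):
            # Placeholder: run local training over the private self.train_loader here.
            self.next(self.local_model_validation)

        @collaborator
        def local_model_validation(self):
            # Placeholder: evaluate the locally trained model on self.test_loader.
            self.local_validation_score = 0.0
            self.next(self.join)

        @aggregator
        def join(self, inputs):
            # Placeholder for weight averaging (e.g. FedAvg) over collaborator models.
            self.model = inputs[0].model
            self.current_round += 1
            if self.current_round < self.rounds:
                self.next(self.aggregated_model_validation, foreach='collaborators')
            else:
                self.next(self.end)

        @aggregator
        def end(self):
            print('Flow complete.')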
@@ -133,7 +132,7 @@ Goals
 Workflow Interface API
 ======================
 
-The workflow interface formulates the experiment as a series of tasks, or a flow. Every flow begins with the `start` task and concludes with `end`.
+The workflow interface formulates the experiment as a series of tasks, or a flow. Every flow begins with the :code:`start` task and concludes with :code:`end`.
 
 Runtimes
 ========
@@ -174,7 +173,7 @@ You can simulate a Federated Learning experiment locally using :code:`LocalRunti
 
     local_runtime = LocalRuntime(aggregator=aggregator, collaborators=collaborators, backend='single_process')
 
-Let's break this down, starting with the :code:`Aggregator` and :code:`Collaborator` components. These components represent the *Participants* in a Federated Learning experiment. Each participant has its own set of *private attributes*. As the name suggests, these *private attributes* represent private information they do not want to share with others, and will be filtered out when there is a transition from the aggregator to the collaborator or vice versa. In the example above each collaborator has it's own `train_dataloader` and `test_dataloader` that are only available when that collaborator is performing it's tasks via `self.train_loader` and `self.test_loader`. Once those collaborators transition to a task at the aggregator, this private information is filtered out and the remaining collaborator state can safely be sent back to the aggregator.
+Let's break this down, starting with the :code:`Aggregator` and :code:`Collaborator` components. These components represent the *Participants* in a Federated Learning experiment. Each participant has its own set of *private attributes*. As the name suggests, these *private attributes* represent private information they do not want to share with others, and will be filtered out when there is a transition from the aggregator to the collaborator or vice versa. In the example above, each collaborator has its own :code:`train_dataloader` and :code:`test_dataloader` that are only available when that collaborator is performing its tasks via :code:`self.train_loader` and :code:`self.test_loader`. Once those collaborators transition to a task at the aggregator, this private information is filtered out and the remaining collaborator state can safely be sent back to the aggregator.
 
 These *private attributes* need to be set in form of a dictionary(user defined), where the key is the name of the attribute and the value is the object. In this example :code:`collaborator.private_attributes` sets the collaborator *private attributes* :code:`train_loader` and :code:`test_loader` that are accessed by collaborator steps (:code:`aggregated_model_validation`, :code:`train` and :code:`local_model_validation`).
@@ -226,7 +225,7 @@ Participant *private attributes* are returned by the callback function in form o
 
 Some important points to remember while creating callback function and private attributes are:
     - Callback Function needs to be defined by the user and should return the *private attributes* required by the participant in form of a key/value pair
-    - Callback function can be provided with any parameters required as arguments. In this example, parameters essential for the callback function are supplied with corresponding values bearing *same names* during the instantiation of the Collaborator
+    - Callback function can be provided with any parameters required as arguments. In this example, parameters essential for the callback function are supplied with corresponding values bearing the *same names* during the instantiation of the Collaborator
 
       * :code:`index`: Index of the particular collaborator needed to shard the dataset
      * :code:`n_collaborators`: Total number of collaborators in which the dataset is sharded
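The callback itself is not shown in this hunk. Below is a hypothetical sketch of what such a callback can look like: the function and helper names, the round-robin sharding, and the toy datasets are illustrative; only the :code:`index` / :code:`n_collaborators` parameters and the returned :code:`train_loader` / :code:`test_loader` keys come from the documentation above.

.. code-block:: python

    def shard_into_batches(samples, batch_size):
        # Minimal stand-in for a framework DataLoader: fixed-size chunks of samples.
        return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


    def collaborator_private_attributes(index, n_collaborators, train_dataset,
                                        test_dataset, batch_size):
        # Take every n_collaborators-th sample, offset by this collaborator's index.
        local_train = train_dataset[index::n_collaborators]
        local_test = test_dataset[index::n_collaborators]
        # Return the private attributes as a key/value mapping.
        return {
            'train_loader': shard_into_batches(local_train, batch_size),
            'test_loader': shard_into_batches(local_test, batch_size),
        }

    # Example: shard a toy dataset of 10 train / 4 test samples across 2 collaborators.
    print(collaborator_private_attributes(index=0, n_collaborators=2,
                                          train_dataset=list(range(10)),
                                          test_dataset=list(range(4)),
                                          batch_size=2))

When each Collaborator is instantiated, these parameters are supplied as keyword arguments with the same names, together with the callback itself, so that the runtime can build that collaborator's private attributes locally.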
@@ -297,7 +296,7 @@ First step is to create the participants in the Federation: the Director and Env
 
 **Director: The central node in the Federation**
 
-The `fx director start` command is used to start the Director. You can run it with or without TLS, depending on your setup.
+The :code:`fx director start` command is used to start the Director. You can run it with or without TLS, depending on your setup.
 
 **With TLS:**
 Use the following command:
@@ -370,7 +369,7 @@ Use the following command:
 
 - `-oc `: Path to the API certificate file (used with TLS).
 - `--disable-tls`: Disables TLS encryption.
 
-The Envoy configuration file includes details about the private attributes. An example configuration file `envoy_config.yaml` for `envoy_one` is shown below:
+The Envoy configuration file includes details about the private attributes. An example configuration file :code:`envoy_config.yaml` for :code:`envoy_one` is shown below:
 
 .. code-block:: yaml
@@ -389,9 +388,9 @@ Now we proceed to instantiate the :code:`FederatedRuntime` to facilitate the dep
 
    - Port number on which the Director is listening.
   - (Optional) Certificate information for TLS:
 
-     - `cert_chain`: Path to the certificate chain.
-     - `api_cert`: Path to the API certificate.
-     - `api_private_key`: Path to the API private key.
+     - :code:`cert_chain`: Path to the certificate chain.
+     - :code:`api_cert`: Path to the API certificate.
+     - :code:`api_private_key`: Path to the API private key.
 
 2. **collaborators**
@@ -402,7 +401,7 @@
 
    File path to the Jupyter notebook defining the experiment logic.
 
-Below is an example of how to set up and instantiate a `FederatedRuntime`:
+Below is an example of how to set up and instantiate a :code:`FederatedRuntime`:
 
 .. code-block:: python
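The instantiation example itself is outside the diff context. A sketch consistent with the three inputs described above (director information, collaborator names, notebook path) is given below. The import path and the exact director keys (:code:`director_node_fqdn`, :code:`director_port`) are assumptions to check against your OpenFL version; the certificate paths, envoy names, and notebook path are placeholders.

.. code-block:: python

    from openfl.experimental.workflow.runtime import FederatedRuntime

    director_info = {
        'director_node_fqdn': 'localhost',
        'director_port': 50050,
        # Optional TLS material; omit or set to None when TLS is disabled.
        'cert_chain': 'cert/root_ca.crt',
        'api_cert': 'cert/api.crt',
        'api_private_key': 'cert/api.key',
    }

    federated_runtime = FederatedRuntime(
        collaborators=['envoy_one', 'envoy_two'],  # Envoy names known to the Director
        director=director_info,
        notebook_path='./workflow_experiment.ipynb',
    )

Assigning this runtime to the flow's :code:`.runtime` attribute (in place of a :code:`LocalRuntime`) is what switches the same experiment from local simulation to distributed deployment.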
diff --git a/docs/index.rst b/docs/index.rst
index d596b3ff9d8..b5ef4870559 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,6 +1,3 @@
-.. # Copyright (C) 2020-2024 Intel Corporation
-.. # SPDX-License-Identifier: Apache-2.0.
-=================
 Overview
 =================
 
@@ -10,7 +7,8 @@ OpenFL is a community supported project originally developed by Intel Labs and t
 
 .. note::
 
-   This project is continually being developed and improved. Expect changes to this manual, the project code, and the project design. We encourage community contributions! Refer to the `Contributing `_ section for more information.
+   This project is continually being developed and improved. Expect changes to this manual, the project code, and the project design.
+   We encourage community contributions! Refer to the `contributing `_ guidelines for more details.
 
 Training of statistical models may be done with any deep learning framework, such as `TensorFlow `_\* \ or `PyTorch `_\*\, via a plugin mechanism.
@@ -104,4 +102,4 @@ Round deprecation
    about/blogs_publications
    about/license
-   about/notices_and_disclaimers
\ No newline at end of file
+   about/notices_and_disclaimers
diff --git a/docs/tutorials/taskrunner.ipynb b/docs/tutorials/taskrunner.ipynb
index e161313f2c0..d19fcdc6d0b 100644
--- a/docs/tutorials/taskrunner.ipynb
+++ b/docs/tutorials/taskrunner.ipynb
@@ -8,7 +8,7 @@
     "\n",
     "In this guide, we will train a simple Convolutional Neural Network (CNN) on MNIST handwritten digits dataset. We will simulate a Federated Learning experiment between two collaborators, orchestrated by an aggregator, using the TaskRunner CLI interface.\n",
     "\n",
-    "Ensure OpenFL is installed. Refer to this [guide](https://openfl.readthedocs.io/en/latest/installation.html) on steps to install OpenFL."
+    "OpenFL must be installed for this tutorial. Refer to the [installation guide](https://openfl.readthedocs.io/en/latest/installation.html)."
    ]
   },
  {