Any file or collection of files. Data is described in terms of its classification. Only three classifications are needed in this document: "sensitive" (cannot be moved or even looked at), "intermediate" (can be moved around, with looser restrictions on visibility), and "eyes-on" (can be moved freely and seen by everyone participating in the federated training).
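As a rough illustration, the three classifications can be written down as a lookup table. This is only a sketch: the names `Classification`, `Visibility`, `Handling`, and `HANDLING` are hypothetical, and reading "looser restrictions on visibility" as a middle `RESTRICTED` level is an assumption, not something the platform defines.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    SENSITIVE = "sensitive"
    INTERMEDIATE = "intermediate"
    EYES_ON = "eyes-on"


class Visibility(Enum):
    NONE = "none"              # nobody may look at the data
    RESTRICTED = "restricted"  # assumption: visible to some, not all
    EVERYONE = "everyone"      # every participant in the federated training


@dataclass(frozen=True)
class Handling:
    movable: bool
    visibility: Visibility


# Hypothetical table mirroring the three definitions in the text.
HANDLING = {
    Classification.SENSITIVE: Handling(False, Visibility.NONE),
    Classification.INTERMEDIATE: Handling(True, Visibility.RESTRICTED),
    Classification.EYES_ON: Handling(True, Visibility.EVERYONE),
}


def can_move(classification: Classification) -> bool:
    """Return True if data of this classification may be moved at all."""
    return HANDLING[classification].movable
```

For instance, `can_move(Classification.SENSITIVE)` is `False`, while both other classifications are movable.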
Wherever data is stored. In this file, storage is assumed to live in Azure. It may exist in locked-down virtual networks.
Anything that can run "code" (deliberately vague). In this file, compute is assumed to live in Azure.
Execute code (a collection of files) in an environment (a Docker image) against data (from storage). A job can consume data from multiple storage instances and write back to multiple instances.
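A job can be pictured as a small record tying code, environment, and storage together. The sketch below is illustrative only: the `Job` class, its field names, and the example values (file names, image tag, storage names) are all hypothetical and do not correspond to any platform API.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Job:
    code_files: Tuple[str, ...]  # the code: a collection of files
    docker_image: str            # the environment
    inputs: Tuple[str, ...]      # storage instances the job reads from
    outputs: Tuple[str, ...]     # storage instances the job writes back to

    def storage_touched(self) -> Set[str]:
        """Every storage instance the job reads from or writes to."""
        return set(self.inputs) | set(self.outputs)


# Hypothetical job that reads from two storage instances and writes to one.
job = Job(
    code_files=("train.py", "model.py"),
    docker_image="example-registry/train:latest",
    inputs=("storage-a", "storage-b"),
    outputs=("storage-a",),
)
```

Here `job.storage_touched()` collects both input and output storage, which is the information a reviewer would need to see which storage instances the job reaches.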
REST endpoint to which the platform "asks permission" before running any job. The platform sends the approval endpoint information including:
- Input and output storage
- Which compute the job wishes to run in
- The author of the code the job is running
- Whether or not the job has been code-signed in accordance with the configured policies
The approval endpoint can either approve or reject the job based on checked-in configuration (e.g., which storage accounts are associated with which silo), or pass this information on for manual approval.
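The decision flow described above might be sketched as follows. Everything here is an assumption made for illustration: the `ApprovalRequest` fields simply mirror the four bullet points, the two mapping tables stand in for the checked-in configuration, and the policy itself (reject unsigned code, approve single-silo jobs, escalate everything else) is one plausible choice rather than the platform's actual rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    MANUAL_REVIEW = "manual-review"


@dataclass(frozen=True)
class ApprovalRequest:
    input_storage: Tuple[str, ...]
    output_storage: Tuple[str, ...]
    compute: str
    code_author: str
    code_signed: bool


# Hypothetical checked-in configuration: which resource belongs to which silo.
SILO_OF_STORAGE = {"storage-a": "silo-a", "storage-b": "silo-b"}
SILO_OF_COMPUTE = {"compute-a": "silo-a", "compute-b": "silo-b"}


def decide(request: ApprovalRequest) -> Decision:
    """Assumed policy, for illustration only."""
    if not request.code_signed:
        return Decision.REJECT
    compute_silo = SILO_OF_COMPUTE.get(request.compute)
    storage = request.input_storage + request.output_storage
    storage_silos = {SILO_OF_STORAGE.get(name) for name in storage}
    if compute_silo is None or None in storage_silos:
        # Unknown resource: hand the request over for manual approval.
        return Decision.MANUAL_REVIEW
    if storage_silos == {compute_silo}:
        # All storage sits in the same silo as the compute.
        return Decision.APPROVE
    # The job spans silos, so a human decides.
    return Decision.MANUAL_REVIEW
```

In a real deployment this logic would sit behind the REST endpoint; the transport layer is left out here to keep the sketch self-contained.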
❗ Note that the approval endpoints do not support 3P-facing AML yet.
Isolated collection of storage and compute. Here, "isolated" means that the platform guarantees:
- Only compute within the silo can "touch" storage within the silo.
- Only data of intermediate or eyes-on classification can be moved outside the silo.
- Only "approved" jobs can change the classification of data or move it outside the silo.
Silos are expected to be reliable (i.e., no concerns around network connectivity or uptime).
❗ Note that we assume a hard cap of ≤ 100 silos at the current stage.
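The isolation guarantees and the silo cap can be expressed as simple predicates. This is a minimal sketch under stated assumptions: the `Silo` class, the helper names, and the resource names are hypothetical, and the platform would enforce these rules through infrastructure rather than through in-process checks like these.

```python
from dataclasses import dataclass

MOVABLE_CLASSIFICATIONS = {"intermediate", "eyes-on"}
MAX_SILOS = 100  # hard cap assumed at the current stage


@dataclass(frozen=True)
class Silo:
    name: str
    storage: frozenset
    compute: frozenset


def may_touch(silo: Silo, compute: str, storage: str) -> bool:
    """True only when both the compute and the storage belong to the silo."""
    return storage in silo.storage and compute in silo.compute


def may_leave_silo(classification: str, job_approved: bool) -> bool:
    """Data leaves a silo only if it is movable and the job is approved."""
    return classification in MOVABLE_CLASSIFICATIONS and job_approved


def within_silo_cap(silo_count: int) -> bool:
    """Check a proposed number of silos against the assumed cap."""
    return 0 <= silo_count <= MAX_SILOS


# Hypothetical silo with one storage instance and one compute target.
silo_a = Silo("silo-a", frozenset({"storage-a"}), frozenset({"compute-a"}))
```

With these definitions, `may_touch(silo_a, "compute-b", "storage-a")` is `False`, because `compute-b` is not part of `silo_a`.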
Collection of storage and compute. The storage holds model parameters rather than the actual data. A task orchestrator broadcasts the FL task, sends the current model to each silo, and aggregates the gradients from the silos. In this file, the orchestrator is assumed to live in an AML workspace.
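One round of the orchestrator loop (send the model out, collect gradients, aggregate) can be sketched in plain Python. The aggregation rule used here, an unweighted mean followed by a single gradient step, is an assumption chosen for simplicity, since the text does not specify one. The names `make_silo`, `aggregate`, and `run_round`, as well as the toy quadratic loss, are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]
SiloFn = Callable[[Vector], Vector]  # takes the current model, returns a gradient


def make_silo(target: Vector) -> SiloFn:
    """Toy silo: gradient of 0.5 * ||w - target||^2, computed on local data."""
    def compute_gradient(model: Vector) -> Vector:
        return [w - t for w, t in zip(model, target)]
    return compute_gradient


def aggregate(gradients: Sequence[Vector]) -> Vector:
    """Unweighted mean of the silo gradients (assumed aggregation rule)."""
    if not gradients:
        raise ValueError("at least one silo gradient is required")
    count = len(gradients)
    return [sum(column) / count for column in zip(*gradients)]


def run_round(model: Vector, silos: Sequence[SiloFn],
              learning_rate: float = 0.1) -> Vector:
    """Send the model to each silo, aggregate gradients, take one step."""
    gradients = [silo(list(model)) for silo in silos]  # each silo gets a copy
    mean_gradient = aggregate(gradients)
    return [w - learning_rate * g for w, g in zip(model, mean_gradient)]


silos = [make_silo([1.0, 0.0]), make_silo([3.0, 2.0])]
updated = run_round([0.0, 0.0], silos, learning_rate=0.5)
print(updated)  # [1.0, 0.5]
```

Only gradients cross the silo boundary in this sketch; each silo's `target` (standing in for its local data) never leaves the closure that holds it.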
Collection of silos that belong to the same Azure tenant.
Collection of silos that reside in either a different Azure tenant or a different cloud provider.