Integration Layer Flows
This section explains the design of the CSP integration layer as a mechanism for data synchronization and indexing within the CSP as well as data distribution (collaboration) with external CSPs.
There are nine distinct data types handled by the CSPs. They are classified as “primary”, those directly related to cybersecurity information sharing, and “secondary”, those that play an essential supporting role.
Primary data types:
- Events
- Threats
- Vulnerabilities
- Incidents
- Artefacts
Secondary data types:
- Files
- Chats
- Contacts
- Trust Circles (only for the alpha version)
We do not assume a global identification of each data record across the whole federation, nor do we require a universal identification scheme for each data type across all CSPs.
Each CSP has its own data for all data types. The creation, update or deletion of data in one of the CSP's applications is emitted to the integration layer.
The integration layer handles:
- Synchronization of the local data between applications within the CSP
- Indexing of the local data using the local indexing service
- Distribution (collaboration) of the data to external CSPs
Incoming data from external CSPs to the local CSP is routed via the integration layer to one principal application. This application deduplicates the incoming data on a best-effort basis, and the result is either a new data record or an update to an existing record. The result is emitted back to the integration layer for synchronization and indexing, but not for distribution to external CSPs. Subsequent manual updates to that record are treated as normal updates.
Deletion of local data records is synchronized and indexed but not distributed to external CSPs.
The distribution of data to external CSPs is based on distribution lists that are stored as data-specific trust circles inside the trust circles implementation.
The applications are allowed to use their built-in forms of collaboration (e.g. MISP to MISP). In that case, the automatic distribution offered by the integration layer is disabled by removing the applicable CSPs from the data-specific distribution lists in the trust circles implementation.
At any given moment, each CSP has a set of data stored in its applications and indexed in its local indexing service. This set represents all of its local data plus all the data from external CSPs that it is allowed to have, based on each external CSP's distribution lists. The whole set is treated as local data, with full permissions for update and delete operations.
There are two distinct flows within the integration layer. The first is triggered from an application emitter upon creation, update or deletion of a data record. The second is triggered from an external CSP upon creation or update of an external record.
In order to support the flows, each emitted data record is enclosed in an envelope that contains the necessary flags and metadata needed for synchronization, indexing and collaboration.
The envelope fields can be seen in the Data Model section. The usage of the fields is presented below:
- cspId, applicationId, recordId: CSP and application specific ids used to denote distinct data records within a CSP
- toShare: Flag that denotes whether this data record should be shared with external CSPs
- isExternal: Flag that denotes that this data record arrived from an external CSP
- datatype: Enumeration of the data type of the data
- dataObject: The actual data in JSON format
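As an illustration only, the envelope fields listed above could be modelled as follows. The field names come from the list; the enumeration values, the types and the `to_json` helper are assumptions made for this sketch, and the authoritative definition remains the Data Model section.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict


class DataType(Enum):
    # Illustrative values only; the real enumeration is defined in the Data Model section.
    EVENT = "event"
    THREAT = "threat"
    VULNERABILITY = "vulnerability"
    INCIDENT = "incident"
    ARTEFACT = "artefact"
    FILE = "file"
    CHAT = "chat"
    CONTACT = "contact"
    TRUST_CIRCLE = "trustCircle"


@dataclass
class Envelope:
    cspId: str            # identifies the CSP that owns the record
    applicationId: str    # identifies the application within that CSP
    recordId: str         # identifies the record within that application
    toShare: bool         # should the record be shared with external CSPs?
    isExternal: bool      # did the record arrive from an external CSP?
    datatype: DataType
    dataObject: Dict[str, Any]  # the actual data, as a JSON-compatible object

    def to_json(self) -> str:
        """Serialize the envelope, writing the enum as its string value."""
        payload = asdict(self)
        payload["datatype"] = self.datatype.value
        return json.dumps(payload)
```

The triple (`cspId`, `applicationId`, `recordId`) identifies a record only relative to its originating CSP, which matches the decision not to require a federation-wide identifier.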
The integration layer is composed of three sub-layers (see the component model diagram):
- The data synchronization layer (DSL) that handles synchronization of data between the CSP applications
- The data distribution layer (DDL) that handles indexing
- The data collaboration layer (DCL) that handles collaboration of the data with external CSPs
During deployment, the DSL is configured with two data-type-specific lists. The first is a list of all application Adapter API URLs for each data type. These are the application Camel route endpoints for each data type and are used for the synchronization in flow #1. These applications are the ones that can handle (make use of) each class of data. The second is a list of one principal application Adapter API URL per data type. These are the application Camel route endpoints that are used in flow #2. These applications are the ones that handle deduplication of external data for each class of data.
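A minimal sketch of how these two lists might drive recipient selection is shown below. The URLs, the data type keys and the function name `dsl_recipients` are hypothetical; only the idea of "all eligible adapters for flow #1, one principal adapter for flow #2" is taken from the text.

```python
from typing import Dict, List

# Hypothetical deployment configuration: every adapter endpoint that can
# make use of a data type (used in flow #1).
ADAPTERS_BY_TYPE: Dict[str, List[str]] = {
    "threat": [
        "http://app-one.local/adapter/threat",
        "http://app-two.local/adapter/threat",
    ],
    "contact": ["http://app-one.local/adapter/contact"],
}

# Hypothetical deployment configuration: the single principal adapter
# endpoint per data type, which deduplicates external data (used in flow #2).
PRINCIPAL_BY_TYPE: Dict[str, str] = {
    "threat": "http://app-one.local/adapter/threat",
    "contact": "http://app-one.local/adapter/contact",
}


def dsl_recipients(datatype: str, is_external: bool) -> List[str]:
    """Pick adapter endpoints for a record based on its origin."""
    if is_external:
        # Flow #2: only the principal application receives external data.
        return [PRINCIPAL_BY_TYPE[datatype]]
    # Flow #1: every eligible application receives the local update.
    return list(ADAPTERS_BY_TYPE[datatype])
```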
The integration layer, including all three of its sub-layers, is implemented as a single application that provides two RESTful services (DSL and DCL). The business logic of each sub-layer is implemented as a custom Camel route processor. Five RESTful clients interface with the following external RESTful services:
- Application Adapter (ADAPTER) API
- Elasticsearch (ELASTIC) API
- Trust Circles (TC) API
- Anonymization Service (ANON) API
- DCL API
Flow #1 (triggered by an application emitter):
- An application emits data (EMITER -> DSL API)
- At this point the “isExternal” flag is FALSE
- DSL constructs application recipients based on integration layer configuration (eligible applications) (DSL -> ADAPTER(s) API)
- Eligible applications receive data update (See application layer section)
- DSL-> DDL
- DDL indexes data (DDL -> ELASTIC API)
- DDL EVALUATES “toShare” flag
- IF “toShare” == FALSE THEN EXIT
- DDL -> DCL
- DCL requests CSP recipients from Trust Circles (DCL -> TC API) (See Trust Circles section)
- DCL requests anonymization of data from the Anonymization service (DCL -> ANON API)
- DCL constructs CSP recipients based on TC API results
- DCL -> (external CSP(s)) DCL API
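The first flow (from application emission through to distribution to external CSPs), as listed in the steps above, can be sketched as a single function. This is not the actual Camel route processor code: the REST clients are passed in as plain callables, all names are hypothetical, and anonymizing once for all recipients is an assumption of the sketch.

```python
from typing import Any, Callable, Dict, List

Envelope = Dict[str, Any]


def run_flow_1(
    envelope: Envelope,
    adapter_urls: List[str],
    send_to_adapter: Callable[[str, Envelope], None],
    index: Callable[[Envelope], None],
    get_csp_recipients: Callable[[str], List[str]],
    anonymize: Callable[[Envelope], Envelope],
    send_to_csp: Callable[[str, Envelope], None],
) -> List[str]:
    """Return the DCL API URLs the record was distributed to (possibly none)."""
    # DSL: synchronize with every eligible local application.
    for url in adapter_urls:
        send_to_adapter(url, envelope)

    # DDL: index the record, then evaluate the "toShare" flag.
    index(envelope)
    if not envelope["toShare"]:
        return []

    # DCL: resolve recipients, anonymize, and send to each external DCL API.
    recipients = get_csp_recipients(envelope["datatype"])
    anonymized = anonymize(envelope)  # assumption: one anonymized copy for all
    for url in recipients:
        send_to_csp(url, anonymized)
    return recipients
```

Passing the clients in as callables keeps the ordering of the steps visible and makes the early exit on `toShare` easy to exercise without any network calls.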
Flow #2 (triggered by an external CSP):
- External CSP sends data ((external CSP) DCL -> DCL API)
- DCL sets “isExternal” flag to TRUE
- DCL requests authorization of external data from Trust Circles (DCL -> TC API) (See Trust Circles section)
- IF not authorized THEN EXIT
- DCL -> DSL
- At this point the “isExternal” flag is TRUE
- DSL constructs application recipient based on integration layer configuration (principal application) (DSL -> ADAPTER(s) API)
- Principal application receives data update (See application section)
- At this point the flow ends. The principal application will emit back the resulting data creation or update, if any, and set the “toShare” flag to FALSE for this emission
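The second flow (data arriving from an external CSP), as listed in the steps above, can be sketched in the same style. The function and parameter names are hypothetical, and the Trust Circles and adapter clients are again passed in as plain callables.

```python
from typing import Any, Callable, Dict

Envelope = Dict[str, Any]


def run_flow_2(
    envelope: Envelope,
    principal_url: str,
    is_authorized: Callable[[str, str], bool],
    send_to_adapter: Callable[[str, Envelope], None],
) -> bool:
    """Return True if the external record was accepted and forwarded."""
    # DCL: mark the record as external, working on a copy of the envelope.
    incoming = dict(envelope, isExternal=True)

    # DCL: ask Trust Circles whether the sending CSP may deliver this data type.
    if not is_authorized(incoming["cspId"], incoming["datatype"]):
        return False

    # DSL: route to the single principal application, which deduplicates.
    send_to_adapter(principal_url, incoming)
    return True
```

Note that nothing in this flow indexes or redistributes the record. Indexing happens only when the principal application emits the deduplicated result back with “toShare” set to FALSE.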
One of the functions of Trust Circles implementation is its usage as a distribution list service. Apart from the normal Trust Circles, each CSP has a pre-configured set of data specific trust circles, one for each data type.
Each of those trust circles holds the teams, and their respective CSP details, with which the local CSP chooses to share that data type. Teams are added to and removed from those trust circles by the CSP administrator, as shown in the use case document.
The trust circles implementation API offers the following two resources:
- /csp: Which returns the list of CSPs, with their respective DCL API URLs, that are eligible for receiving a given data type based on the data specific trust circles as configured by the administrator. This resource is used for the recipient construction step in flow #1
- /auth: Which returns true if a given “cspId” exists in the data specific trust circle for a given data type. This resource is used for the authorization of external data in flow #2
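The semantics of the two resources can be illustrated with a small in-memory model. This is a sketch of the described behaviour only, not of the Trust Circles API itself: the class name, method names, data layout and return shapes are all assumptions.

```python
from typing import Dict, List, Tuple


class DataTrustCircles:
    """In-memory stand-in for the data-specific trust circles."""

    def __init__(self, circles: Dict[str, Dict[str, str]]) -> None:
        # Maps data type -> {cspId: DCL API URL}, as set up by the administrator.
        self._circles = circles

    def csp(self, datatype: str) -> List[Tuple[str, str]]:
        """Behaviour of /csp: CSPs (with DCL API URLs) eligible for a data type."""
        return sorted(self._circles.get(datatype, {}).items())

    def auth(self, csp_id: str, datatype: str) -> bool:
        """Behaviour of /auth: is the CSP in the circle for this data type?"""
        return csp_id in self._circles.get(datatype, {})
```

Because both resources read the same data-specific circle, removing a CSP from a circle stops outgoing distribution of that data type to it and also causes incoming data of that type from it to be rejected.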
Although the application layer is not part of the shared services, we need to define the basic functionality of an application with regard to emitting and receiving data.
The EMITER is the component that monitors the application datastore and on each creation, update or deletion of records it contacts the DSL API and sends the update via POST, PUT or DELETE.
Prior to contacting the integration layer, it converts the data record into a pre-defined JSON representation and wraps the JSON object in the data envelope described in the integration layer section. The conversion mechanism is application specific, but the representation for each data type is defined in the Data Model section.
The values of “cspId” and “applicationId” are configured during application installation.
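The emitter behaviour described above could look roughly like the following. The identifiers `CSP_ID`, `APPLICATION_ID` and `build_emission` are hypothetical, as are the operation names. The sketch only prepares the HTTP method and request body and leaves the actual call to the DSL API out.

```python
import json
from typing import Any, Dict, Tuple

# Hypothetical values; in practice these are configured at installation time.
CSP_ID = "csp-a"
APPLICATION_ID = "threats-app"

# Datastore operation -> HTTP method used against the DSL API.
HTTP_METHOD_BY_OPERATION = {"create": "POST", "update": "PUT", "delete": "DELETE"}


def build_emission(
    operation: str,
    datatype: str,
    record_id: str,
    data_object: Dict[str, Any],
    to_share: bool = True,
) -> Tuple[str, str]:
    """Wrap a converted record in the envelope; return (HTTP method, JSON body)."""
    envelope = {
        "cspId": CSP_ID,
        "applicationId": APPLICATION_ID,
        "recordId": record_id,
        "toShare": to_share,
        "isExternal": False,  # locally emitted records are never external
        "datatype": datatype,
        "dataObject": data_object,
    }
    return HTTP_METHOD_BY_OPERATION[operation], json.dumps(envelope)
```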
The ADAPTER is the component that receives data from the DSL, converts the data from its JSON representation to the application-specific format and stores it in the application datastore. In normal intra-CSP operations, this update does not trigger an emitter response (e.g. the threats application receives updated contacts).
When the adapter receives data from an external CSP (the “isExternal” flag is set to TRUE), it needs to scan the existing data for duplicates. This operation can be performed with or without user involvement. The end result is that either a new record is created or an existing record is updated. This operation should trigger an emitter response. The emitter should emit this record (for indexing) and set the “toShare” flag to FALSE. This ensures loop prevention. Further updates to this record, though, are handled as normal.
To ensure that further updates from the external CSP to this data end up in the correct record (thus avoiding a second deduplication attempt), the adapter keeps a list of the records that are created or updated during deduplication, each paired with a hash of the “cspId”, “applicationId” and “recordId” of the external data that caused it. Thus, any further updates to this data from the external CSP that reach the local CSP will end up in the same local record.
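One possible shape for this bookkeeping is sketched below. The choice of SHA-256, the separator character and the names `external_key` and `ExternalRecordMap` are assumptions; the text only requires a hash over the three external identifiers mapped to the local record.

```python
import hashlib
from typing import Any, Dict, Optional


def external_key(csp_id: str, application_id: str, record_id: str) -> str:
    """Hash the external identifiers into a single lookup key."""
    # A separator avoids collisions such as ("ab", "c") versus ("a", "bc").
    raw = "\x1f".join((csp_id, application_id, record_id))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ExternalRecordMap:
    """Remembers which local record an external record was merged into."""

    def __init__(self) -> None:
        self._local_by_key: Dict[str, str] = {}

    def remember(self, envelope: Dict[str, Any], local_record_id: str) -> None:
        key = external_key(
            envelope["cspId"], envelope["applicationId"], envelope["recordId"]
        )
        self._local_by_key[key] = local_record_id

    def lookup(self, envelope: Dict[str, Any]) -> Optional[str]:
        """Return the local record id, or None if deduplication is still needed."""
        key = external_key(
            envelope["cspId"], envelope["applicationId"], envelope["recordId"]
        )
        return self._local_by_key.get(key)
```

In this sketch the adapter would call `lookup` first for every external envelope and fall back to deduplication, followed by `remember`, only when it returns `None`.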