From 13cd196598560a2bc05149dacfb7c8d2a8b93688 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Th=C3=A9o=20Tchilinguirian?= <theo.tchlx@gmail.com>
Date: Sun, 17 Nov 2024 01:46:22 +0100
Subject: [PATCH 1/2] docs: add README and refactor arch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Théo Tchilinguirian <theo.tchlx@gmail.com>
---
 README.md               | 119 +++++++++++++++++++++++++++++
 docs/README.md          |   8 ++
 docs/arch/controller.md | 163 ++++++++++++++++++++++++++++++++++++++++
 docs/arch/lexicon.md    |   4 +-
 4 files changed, 293 insertions(+), 1 deletion(-)
 create mode 100644 README.md
 create mode 100644 docs/README.md
 create mode 100644 docs/arch/controller.md
diff --git a/README.md b/README.md
new file mode 100644
index 00000000..9c76d1b2
--- /dev/null
+++ b/README.md
@@ -0,0 +1,119 @@
+# SealCI
+
+SealCI is a Continuous Integration (CI) system built using Rust and designed with a microservices architecture.
+
+## Table of contents
+
+- [SealCI](#sealci)
+  - [Table of contents](#table-of-contents)
+  - [Dependencies](#dependencies)
+  - [Glossary](#glossary)
+  - [Architecture](#architecture)
+    - [Monitor](#monitor)
+    - [Controller](#controller)
+    - [Scheduler](#scheduler)
+    - [Agent](#agent)
+      - [Agent lifecycle](#agent-lifecycle)
+
+## Dependencies
+
+SealCI is written in Rust and makes use of the following libraries:
+
+- actix-cors
+- actix-multipart
+- actix-web
+- async-stream
+- async-trait
+- bollard
+- clap
+- dotenv
+- env_logger
+- futures
+- futures-util
+- lazy_static
+- log
+- prost
+- prost-build
+- reqwest
+- scalar-doc
+- serde
+- serde_json
+- serde_yaml
+- sqlx
+- sysinfo
+- thiserror
+- tokio
+- tokio-stream
+- tonic
+- tonic-reflection
+- tracing
+- tracing-subscriber
+- url
+- yaml-rust
+
+> You can get a similar result by running `cut -d' ' -f1 <file> |sed -r '/^\s*$/d' |sort |uniq |sed 's/^/- /'`, where `<file>` contains the dependencies copied from each service's `Cargo.toml`.
+
+## Glossary
+
+- **Action**: A CI atomic unit containing infrastructure, environment, and commands to execute.
+- **Action status**: The state of the execution of an action (running, successful, failed).
+- **Agent**: A computing node registered with the scheduler.
+- **Agent pool**: The set of all registered agents.
+- **Pipeline**: A set of actions to be executed, declared as a YAML file.
+- **Scheduling**: Selection of an agent to execute an action.
+
+For detailed documentation on each component, please refer to the respective markdown files in the `docs/arch` directory.
+
+## Architecture
+
+SealCI is made up of four independent microservices that serve different purposes.
+They are pipelined together to create a working CI:
+
+- The Monitor interfaces between the end user, its repository and a Controller.
+- The Controller couples to a Scheduler to send actions and receive results and logs.
+- The Scheduler registers Agents, sends them actions and transfers results and logs.
+- The Agent executes code in the desired environment, and sends back results and logs.
+
+Each service can be hosted, deployed and used separately.
+
+### Monitor
+
+The Monitor listens for specific events from remote Git repositories and triggers the controller to launch a CI process based on these events.
+
+Features:
+
+- Listening to events from remote Git repositories.
+- Exposing a REST API to update the monitoring configuration.
+- Recognizing event types and triggering pipelines accordingly.
+
+### Controller
+
+The Controller translates a pipeline declaration file into a list of actions to be executed. It ensures actions are executed in the correct order and provides pipeline state information.
+
+Features:
+
+- Users send pipelines containing actions to execute.
+- Users can track actions by getting logs and states.
+- The controller ensures actions are executed sequentially and handles failures.
+
+The Controller may presently be too tightly coupled with the Scheduler.
+
+### Scheduler
+
+The Scheduler receives a stream of CI actions and tracks a set of CI agents. It selects agents to run the received actions based on their resource capacities and current load.
+
+Features:
+
+- Functional without any registered agents.
+- Tracks the state and capacity of each registered agent.
+- Distributes actions to agents based on resource capacities and load.
+
+### Agent
+
+The agent is the powerhouse of SealCI. It receives actions and runs them to complete the operational part of the CI.
+
+#### Agent lifecycle
+
+- **Registering with a Scheduler**: The agent registers with a scheduler and establishes a bi-directional connection. ***Described this way, the agent is not a loosely coupled microservice, which means it may not follow a good design philosophy.***
+- **Health and Death**: The agent streams health and status information to the scheduler.
+- **Launching Actions**: The agent creates and runs a container based on the action execution environment configuration, executes commands, and cleans up after completion.
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 00000000..ae2b137b
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,8 @@
+# SealCI architecture documentation
+
+These documents describe the services that make up SealCI's architecture and their design:
+
+- Their definition ("What?")
+- Their purpose ("Why?")
+- How they are made ("How?")
+- How they interact with each other
diff --git a/docs/arch/controller.md b/docs/arch/controller.md
new file mode 100644
index 00000000..4258d4bc
--- /dev/null
+++ b/docs/arch/controller.md
@@ -0,0 +1,163 @@
+# Architecture Document for Controller Component
+
+## Glossary
+
+- A **pipeline** is a set of actions that defines a workflow. A pipeline is declared in a YAML file (refer to the [Pipeline YAML definition](#pipeline-yaml-definition) section for a reference of each section of this file).
+- An **action** is a set of shell commands to execute on a specific environment.
+
+## Description
+
+The Controller translates a pipeline declaration file into a list of actions to be executed. It also reports the result of each action so that the user knows whether a pipeline succeeded or failed. To do that, it receives [pipelines](#glossary), parses them into a set of [actions](#actions), and sends these actions sequentially to the Scheduler. For each action, the Scheduler **must** notify the Controller when the action has been scheduled and when it has completed successfully or encountered an error. With this information, the Controller can provide the state of a pipeline to any client (the Monitor or any other).
+
+## Features
+
+- Users send pipelines containing actions to execute. Pipelines are described through [YAML-formatted files](#pipeline-yaml-definition).
+- Users can track their actions by getting the logs from the Agent and the state of each action: `PENDING`, `SCHEDULED`, `RUNNING`, or `COMPLETED`. Refer to the [States](#states) section.
+- The Controller makes sure that actions are executed in the right order (by design) and does not execute the next action if the previous one has failed.
+
+### Pipeline YAML definition
+
+#### Global example
+
+```yaml
+actions:
+  postinstall:
+    configuration:
+      container: debian:latest
+    commands:
+      - apt update
+      - apt install mfa-postinstall
+  build:
+    configuration:
+      container: dind:latest
+    commands:
+      - docker run debian:latest
+```
+
+#### `actions`
+
+A pipeline is made up of one or more `actions`, which run sequentially.
+
+Each action also defines its execution environment, i.e. the container image it must run in.
+
+#### `actions.<action_id>`
+
+`<action_id>` is the action identifier. It allows for retrieving specific details about the action through the controller HTTP API.
+
+There can be multiple actions in one pipeline, but each `<action_id>` must be unique.
+
+**Usage example**
+
+```yaml
+actions:
+  postinstall:
+  ...
+```
+
+Here `postinstall` is the identifier of your action.
+
+#### `actions.<action_id>.configuration`
+
+The action execution environment configuration.
+
+> [!Note]
+> At the moment this section only describes the action container image, but may be extended in the future with e.g. environment variables.
+
+#### `actions.<action_id>.configuration.container`
+
+The container image URI the action must run on.
+
+**Example**
+
+```yaml
+actions:
+  postinstall:
+    configuration:
+      container: debian:latest
+```
+
+#### `actions.<action_id>.commands`
+
+`commands` is a **list** of shell commands that will be executed during the action.
+**Example**
+
+```yaml
+actions:
+  postinstall:
+    configuration:
+      container: debian:latest
+    commands:
+      - apt update
+      - apt install mfa-postinstall
+```
+
+### HTTP Request (Input)
+
+The controller triggers a pipeline once it receives its corresponding manifest. To do so, an HTTP client must send a POST request containing the manifest file and the name of the pipeline.
+
+- `POST /pipeline`:
+
+  **Body**:
+
+  - `name` : a `string` that corresponds to the pipeline name.
+
+  - `body` : a `file` that is the manifest file conform to the structure declared bellow.
+
+> [!Note]
+> The request **must** use the `multipart/form-data` content type, since the pipeline file can be quite long.
+
+### HTTP Response (Output)
+
+The Controller needs to inform the user of the state of the actions, so it provides outputs. Outputs describe the state of each action to give insight into what is going on in your pipeline. An output has a **header** that must have one of the following values: `PENDING`, `SCHEDULED`, `RUNNING`, or `COMPLETED`.
+
+#### States
+
+- `PENDING` : the action has not been sent to the Scheduler yet.
+
+  **Payload** : none.
+
+- `SCHEDULED`: the action has been received by the Scheduler but has not been assigned to an Agent.
+
+  **Payload** : none.
+
+- `RUNNING` : the action has been assigned to an Agent but not completed.
+
+  **Payload** : logs from the agent (these logs can change during the execution of the action so they need to be re-fetched to be up to date).
+
+- `COMPLETED` : the action has finished. It can be either a success or a failure depending on the HTTP status code.
+
+  **Payload** : none.
+
+## Diagrams
+
+### Sequence diagram
+
+```mermaid
+sequenceDiagram
+    actor User
+    participant HTTPClient
+    participant Controller
+    participant Scheduler
+
+    participant Database
+
+    HTTPClient->>Controller: URL + Action
+    Controller->>HTTPClient: Acknowledgment
+
+    alt is request malformed
+        Controller-->>User: Nok
+    else is well formed
+        Controller-->>User: Ok
+    end
+
+    Controller->>Database: saves pipeline in database
+
+    loop over action steps
+        Controller->>Scheduler: sends action (over gRPC)
+        Scheduler->>Controller: action succeeded or not
+    end
+
+    HTTPClient->>User: sends updates about pipeline status
+    User->>Controller: get pipeline output
+    Controller-->>User: returns pipeline output
+```
diff --git a/docs/arch/lexicon.md b/docs/arch/lexicon.md
index af75d548..6fd019ff 100644
--- a/docs/arch/lexicon.md
+++ b/docs/arch/lexicon.md
@@ -1,4 +1,6 @@
-## Inter-Service Lexicon
+# Glossary
+
+## Inter-service lexicon
 
 - **Action**: a CI atomic unit. It contains infrastructure, environment and commands to execute.
 - **Action status**: the state of the execution of an action (running, successful, failed).

From 554c5e07e28876a3e4b457f78e1f22dfd37ace54 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Th=C3=A9o=20Tchilinguirian?= <theo.tchlx@gmail.com>
Date: Tue, 26 Nov 2024 13:25:24 +0100
Subject: [PATCH 2/2] docs(controller): move controller.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Théo Tchilinguirian <theo.tchlx@gmail.com>
---
 docs/arch/controller.md |   2 +-
 docs/controller.md      | 163 ----------------------------------------
 2 files changed, 1 insertion(+), 164 deletions(-)
 delete mode 100644 docs/controller.md

diff --git a/docs/arch/controller.md b/docs/arch/controller.md
index 4258d4bc..2502eec0 100644
--- a/docs/arch/controller.md
+++ b/docs/arch/controller.md
@@ -99,7 +99,7 @@ The controller triggers a pipeline once it receives its corresponding manifest.
 
   **Body**:
 
-  - `name` : a `string` that corresponds to the pipeline name.
+  - `repo_url` : a `string` that corresponds to the repository URL. This URL is used to fetch the code.
 
   - `body` : a `file` that is the manifest file conform to the structure declared bellow.
 
diff --git a/docs/controller.md b/docs/controller.md
deleted file mode 100644
index 2502eec0..00000000
--- a/docs/controller.md
+++ /dev/null
@@ -1,163 +0,0 @@
-# Architecture Document for Controller Component
-
-## Glossary
-
-- A **pipeline** is a set of actions which define a workflow. A pipeline is declared in a `yaml` file (please, refer to the [structure](<#pipeline yaml definition>) section for the reference of each sections of this file).
-- An **action** is a set of shell commands to execute on a specific environment.
-
-## Description
-
-The Controller is the component that translates a pipeline declaration file into a list of actions to be executed, it also reflects the result of each actions so the user knows if a pipeline succeeded or failed. To do that, it receives [pipelines](#pipeline), parse them into a set of [actions](#actions) and send these actions sequentially to the Scheduler, for each of these actions, the Scheduler **must** notify the Controller when a action has been scheduled and has been completed successfully or encountered an error. Thanks to these information, the Controller is able to provide information about a pipeline state to anyone (the Monitor or any other client).
-
-## Features
-
-- Users send pipelines containing actions to execute. Pipelines are described through [YAML formatted files](<#Pipeline YAML Definition>).
-- Users can track there actions by getting the logs from the Agent, the states of the action : `PENDING`, `SCHEDULED`, `RUNNING`, `COMPLETED`. Refer to the sections [actions/states](#States).
-- The controller makes sure that each actions are executed in the right order (by design) and doesn't execute the next action if the previous one has failed.
-
-### Pipeline YAML definition
-
-#### Global example
-
-```yaml
-actions:
-  postinstall:
-    configuration:
-      container: debian:latest
-    commands:
-      - apt update
-      - apt install mfa-postinstall
-  build:
-    configuration:
-      container: dind:latest
-    commands:
-      - docker run debian:latest
-```
-
-#### `actions`
-
-A pipeline is made up of one or more `actions`, which run sequentially.
-
-Pipelines also define their execution environment, i.e the container image they must be run into.
-
-#### `actions.<action_id>`
-
-`<action_id>` is the action identifier. It allows for retrieving specific details about the action through the controller HTTP API.
-
-There can be multiple actions in one pipeline but the `action id` must be unique.
-
-**Usage example**
-
-```yaml
-actions:
-  postinstall:
-  ...
-```
-
-Here `postinstall` is the identifier of your action.
-
-#### `actions.<action_id>.configuration`
-
-The action execution environment configuration.
-
-> [!Note]
-> At the moment this section only describes the action container image, but may be extended in the future with e.g. environment variables.
-
-#### `actions.<action_id>.configuration.container`
-
-The container image URI the action must run on.
-
-**Example :**
-
-```yaml
-actions:
-  postinstall:
-    configuration:
-      container: debian:latest
-```
-
-#### `actions.<action_id>.commands`
-
-`command` is a **list** of shell commands that will be executed during the action.
-**Example**
-
-```yaml
-actions:
-  postinstall:
-    configuration:
-      container: debian:latest
-    commands:
-      - apt update
-      - apt install mfa-postinstall
-```
-
-### HTTP Request (Input)
-
-The controller triggers a pipeline once it receives its corresponding manifest. To do so, an HTTP client must send a POST request containing the manifest file and the name of the pipeline.
-
-- `POST` /pipeline :
-
-  **Body**:
-
-  - `repo_url` : a `string` that corresponds to the repository URL. This URL is used to fetch the code.
-
-  - `body` : a `file` that is the manifest file conform to the structure declared bellow.
-
-> [!Note]
-> The request **must** be a multipart/form-data since the pipeline file could be quite long.
-
-### HTTP Response (Output)
-
-The pipeline needs to inform the user on the state of the actions, therefore it needs to provide outputs. Outputs aim to describe each actions state to get an insight on what is going on in your pipeline. An output has an **header** that must have one of the following value : `PENDING`, `SCHEDULED`, `RUNNING` and `COMPLETED`.
-
-#### States
-
-- `PENDING` : the action has not been sent to the Scheduler yet.
-
-  **Payload** : none.
-
-- `SCHEDULED`: the action has been received by the Scheduler but has not been assigned to an Agent.
-
-  **Payload** : none.
-
-- `RUNNING` : the action has been assigned to an Agent but not completed.
-
-  **Payload** : logs from the agent (these logs can change during the execution of the action so they need to be re-fetched to be up to date).
-
-- `COMPLETED` : the action has finished. It can be either a success or a failure depending on the HTTP status code.
-
-  **Payload** : none.
-
-## Diagrams
-
-### Sequence diagram
-
-```mermaid
-sequenceDiagram
-    actor User
-    participant HTTPClient
-    participant Controller
-    participant Scheduler
-
-    participant Database
-
-    HTTPClient->>Controller: URL + Action
-    Controller->>HTTPClient: Acknowledgment
-
-    alt is request malformed
-        Controller-->>User: Nok
-    else is well
-        Controller-->>User: Ok
-    end
-
-    Controller->>Database: saves pipeline in database
-
-    loop over action steps
-        Controller->>Scheduler: sends action (over gRPC)
-        Scheduler->>Controller: action succeeded or not
-    end
-
-    HTTPClient->>User: sends updates about pipeline status
-    User->>Controller: get pipeline output
-    Controller-->>User: returns pipeline output
-```