diff --git a/docs/architecture.md b/docs/architecture.md
index c79bec1..775c7e4 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1 +1,60 @@
-# Architecture
+# System Architecture
+
+## Overview
+
+The Propeller system is a distributed computing platform designed to manage and execute tasks across multiple nodes (proplets). It uses MQTT for communication, a manager service for task orchestration, and a proxy service for container image distribution. The system is composed of several key components:
+
+1. **CLI**: Command-line interface for interacting with the Propeller system.
+2. **Manager**: Central service responsible for task management and proplet coordination.
+3. **Proplet**: Worker nodes that execute tasks.
+4. **Proxy**: Service for fetching and distributing container images from a registry.
+5. **SuperMQ**: Internal event-driven infrastructure for creating entities and for communication between services.
+
+![System Architecture](images/architecture.png)
+
+## Components
+
+### CLI
+
+The CLI provides a command-line interface for users to interact with the Propeller system. It allows users to create, list, update, and delete tasks, as well as start and stop tasks. The CLI can also provision the manager and proplets.
+
+### Manager
+
+The Manager is the central service responsible for managing tasks and coordinating proplets. It handles task creation, updates, deletion, and execution, and maintains an internal database for tracking tasks and proplets. It also manages the lifecycle of proplets and ensures they are alive and healthy. The Manager uses MQTT for communication between services. It exposes REST endpoints for task management and proplet coordination. Currently, the system supports **1 manager : multiple workers**. In the future, the system will be expanded to support **multiple managers : multiple workers**.
+
+### Proplet
+
+Proplets are worker nodes that execute tasks. They receive tasks from the Manager, execute them, and report the results back. Proplets also send periodic liveliness updates to the Manager to indicate they are alive.
+
+### Proxy
+
+The Proxy service is responsible for fetching container images from a registry and distributing them to proplets. It handles authentication with the registry and splits the container images into chunks for efficient distribution. It works with OCI registries: it fetches an image from the registry and splits it into chunks that proplets assemble and execute.
+
+### SuperMQ
+
+SuperMQ is an Event Driven Infrastructure (EDI) for creating and coordinating services. It provides a way to create and manage entities, as well as handle communication between services. SuperMQ uses MQTT for communication and provides a set of APIs for entity creation, management, and communication.
+
+## Communication
+
+### MQTT
+
+MQTT is used for communication between the Manager, Proplets, and Proxy. The Manager publishes tasks to proplets, and proplets send liveliness updates and task results back to the Manager. The Proxy fetches container images and distributes them to proplets.
+
+### HTTP
+
+HTTP is used for the CLI to interact with the Manager. The Manager exposes REST endpoints for task management and proplet coordination.
+
+## Task Lifecycle
+
+1. **Task Creation**: A user creates a task using the CLI or HTTP API, which sends a request to the Manager.
+2. **Task Scheduling**: The Manager selects a proplet to execute the task based on the scheduling algorithm.
+3. **Task Execution**: The selected proplet receives the task, executes it, and reports the results back to the Manager.
+4. **Task Completion**: The Manager updates the task status and stores the results.
+
+## Proplet Liveliness
+
+Proplets send periodic liveliness updates to the Manager to indicate they are alive. The Manager uses these updates to monitor the health of proplets and ensure they are available for task execution.
+
+## Container Image Distribution
+
+The Proxy fetches container images from a registry, splits them into chunks, and distributes them to proplets. Proplets assemble the chunks and execute the container image.
diff --git a/docs/images/architecture.png b/docs/images/architecture.png
new file mode 100644
index 0000000..35b9474
Binary files /dev/null and b/docs/images/architecture.png differ
diff --git a/docs/manager.md b/docs/manager.md
index d92a585..e12c0b6 100644
--- a/docs/manager.md
+++ b/docs/manager.md
@@ -1 +1,75 @@
 # Manager
+
+## Overview
+
+The Manager service is a central component of the Propeller system, responsible for managing tasks and proplets. It provides a set of APIs for task and proplet management, handles task scheduling and execution, and monitors the state of tasks and proplets. The architecture of the Manager service is designed to be modular, scalable, and maintainable, leveraging various components and middleware to achieve these goals.
+
+## Architectural Components
+
+### 1. Service Interface
+
+The `Service` interface defines the core functionalities provided by the Manager service. It includes methods for managing proplets and tasks, as well as for subscribing to MQTT topics. This interface ensures that the service can be easily extended or replaced with different implementations.
+
+### 2. API Endpoints
+
+The Manager service exposes several HTTP endpoints for interacting with tasks and proplets. These endpoints are implemented using the Go-Kit library, which provides a structured way to define and handle HTTP requests and responses.
+
+### 3. Middleware
+
+The Manager service includes several middleware components that enhance its functionality:
+
+- **Logging Middleware:** Logs the details of each service method call, including the duration and any errors that occurred.
+- **Metrics Middleware:** Collects metrics for each service method call, such as the number of calls and the latency.
+- **Tracing Middleware:** Adds tracing information to each service method call, using OpenTelemetry to provide distributed tracing capabilities.
+
+### 4. Storage
+
+The Manager service uses storage components to persist tasks and proplets. These storage components are abstracted behind interfaces, allowing for different storage implementations (e.g., in-memory, database) to be used interchangeably. The storage components include:
+
+- **Tasks Storage:** Stores task details.
+- **Proplets Storage:** Stores proplet details.
+- **Task-Proplet Mapping Storage:** Stores the mapping between tasks and proplets.
+
+### 5. Scheduler
+
+The Manager service uses a scheduler to select a proplet for each task. The scheduler is responsible for distributing tasks across the available proplets so that work is spread evenly. The current implementation uses a round-robin scheduler, which selects the next available proplet in cyclic order.
+
+### 6. PubSub
+
+The Manager service uses a PubSub component to publish and subscribe to MQTT topics for task and proplet management. This component allows the service to communicate with other components of the Propeller system, such as proplets, to coordinate task execution and monitor their state.
+
+### 7. Internal Handlers
+
+The Manager service includes internal handlers for managing proplets and tasks. These handlers are responsible for processing messages received from MQTT topics and updating the state of tasks and proplets accordingly. The handlers include:
+
+- **Proplet Handlers:** Handle the creation, liveness updates, and result updates of proplets.
+- **Task Handlers:** Handle the creation, updating, and deletion of tasks.
+
+### 8. Health and Metrics Endpoints
+
+The Manager service includes endpoints for health checks and metrics collection:
+
+- **Health Endpoint:** Provides a health check endpoint (`/health`) that returns the health status of the service.
+- **Metrics Endpoint:** Provides a metrics endpoint (`/metrics`) that exposes Prometheus metrics for the service.
+
+## Data Flow
+
+### 1. Task Creation
+
+- A client sends a `POST` request to the `/tasks` endpoint with the task details.
+- The service creates a new task, assigns a unique ID, and stores it in the tasks storage.
+- The service returns the created task to the client.
+
+### 2. Task Execution
+
+- A client sends a `POST` request to the `/tasks/{taskID}/start` endpoint to start a task.
+- The service retrieves the task from the tasks storage and selects an appropriate proplet using the scheduler.
+- The service publishes a start message to the MQTT topic for the selected proplet.
+- The proplet executes the task and publishes the results to the MQTT topic.
+- The service processes the results and updates the task state in the tasks storage.
+
+### 3. Proplet Management
+
+- Proplets periodically send liveness updates to the MQTT topic.
+- The service processes the liveness updates and updates the state of the proplets in the proplets storage.
+- The service can also handle the creation of new proplets and the updating of proplet details.