Skip to content

Latest commit

 

History

History
158 lines (115 loc) · 4.79 KB

ARCHITECTURE.md

File metadata and controls

158 lines (115 loc) · 4.79 KB

MetaQuest Architecture

This document provides an overview of the MetaQuest architecture and design principles.

Design Principles

MetaQuest is designed around the following principles:

  1. Separation of Concerns: Clear separation between data access, business logic, and user interfaces
  2. Extensibility: Plugin-based design allowing for easy addition of new features
  3. Maintainability: Well-defined interfaces and abstractions to simplify maintenance
  4. Robustness: Comprehensive error handling and validation
  5. Usability: Intuitive CLI with consistent command patterns

Architecture Overview

The application is structured using a layered architecture with the following components:

+------------------+
|       CLI        |
+------------------+
         |
+------------------+
|    Core Logic    |
+------------------+
         |
+------------------+     +------------------+
|    Data Layer    |<--->|  Plugin System   |
+------------------+     +------------------+
         |
+------------------+
| External Systems |
+------------------+

Layers

CLI Layer

The CLI layer provides the command-line interface for the application. It's responsible for:

  • Parsing command-line arguments
  • Routing commands to the appropriate handlers
  • Formatting output for the user

Core Logic Layer

The core logic layer contains the business logic of the application. It includes:

  • Data models and domain objects
  • Validation logic
  • Processing algorithms
  • Error handling

Data Layer

The data layer manages access to data sources and external systems:

  • File I/O operations
  • Data format conversions
  • Communication with external APIs
  • Caching mechanisms

Plugin System

The plugin system enables extensibility:

  • Format plugins for different file formats
  • Visualization plugins for different visualization types
  • Processing plugins for different analysis methods

Key Components

Core Components

  • Models: Data classes representing domain objects like Containment and SRAMetadata
  • Validation: Functions for validating input data and configurations
  • Exceptions: Custom exception hierarchy for clear error reporting

Data Access Components

  • file_io: Abstract file operations for reading/writing various file formats
  • branchwater: Functionality for working with Branchwater data
  • metadata: Functions for downloading and processing metadata
  • sra: Functions for downloading and working with SRA data

Processing Components

  • containment: Algorithms for analyzing containment data
  • counts: Functions for counting and summarizing metadata
  • statistics: Statistical analysis utilities

Visualization Components

  • plots: Functions for generating various types of plots
  • reporting: Tools for generating reports

Plugin System Components

  • base: Base classes and registries for plugins
  • formats: File format plugins
  • visualizers: Visualization plugins

Utility Components

  • logging: Logging configuration and utilities
  • config: Configuration management

Data Flow

  1. Input: User provides commands via CLI
  2. Command Handling: CLI routes to appropriate command handler
  3. Data Access: Commands interact with data sources via the data layer
  4. Processing: Data is processed using core logic components
  5. Visualization: Results are visualized or formatted for output
  6. Output: Results are presented to the user via CLI

Plugin System

The plugin system uses a registry pattern:

  1. Plugins inherit from a base Plugin class
  2. Plugins are registered in a PluginRegistry
  3. The application can discover plugins at runtime
  4. Plugins provide specific implementations for abstract operations

Example: Format Plugins

Format plugins handle different file formats:

  • BranchWaterFormatPlugin: Handles Branchwater CSV files
  • MastiffFormatPlugin: Handles Mastiff CSV files

Each plugin provides standard methods like:

  • validate_header: Validates file headers
  • parse_file: Parses file content
  • extract_metadata: Extracts metadata from parsed content

Error Handling

The application uses a consistent error handling approach:

  1. Custom exception hierarchy starting with MetaQuestError
  2. Specific exception types for different error categories
  3. Each layer handles errors appropriate to its level
  4. CLI layer catches and formats errors for user display

Configuration Management

The application manages configuration through:

  1. Default configuration values
  2. Configuration file in user's home directory
  3. Environment variables
  4. Command-line overrides

Future Extensions

The architecture is designed to accommodate future extensions:

  1. New file format plugins
  2. Additional visualization types
  3. More analysis algorithms
  4. Web interface or API
  5. Database integration