-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #16 from FOI-Bioinformatics/v0.3.0
Prepare for v0.3.0
- Loading branch information
Showing
41 changed files
with
6,362 additions
and
1,584 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,158 @@ | ||
# MetaQuest Architecture | ||
|
||
This document provides an overview of the MetaQuest architecture and design principles. | ||
|
||
## Design Principles | ||
|
||
MetaQuest is designed around the following principles: | ||
|
||
1. **Separation of Concerns**: Clear separation between data access, business logic, and user interfaces | ||
2. **Extensibility**: Plugin-based design allowing for easy addition of new features | ||
3. **Maintainability**: Well-defined interfaces and abstractions to simplify maintenance | ||
4. **Robustness**: Comprehensive error handling and validation | ||
5. **Usability**: Intuitive CLI with consistent command patterns | ||
|
||
## Architecture Overview | ||
|
||
The application is structured using a layered architecture with the following components: | ||
|
||
``` | ||
+------------------+ | ||
| CLI | | ||
+------------------+ | ||
| | ||
+------------------+ | ||
| Core Logic | | ||
+------------------+ | ||
| | ||
+------------------+ +------------------+ | ||
| Data Layer |<--->| Plugin System | | ||
+------------------+ +------------------+ | ||
| | ||
+------------------+ | ||
| External Systems | | ||
+------------------+ | ||
``` | ||
|
||
### Layers | ||
|
||
#### CLI Layer | ||
The CLI layer provides the command-line interface for the application. It's responsible for: | ||
- Parsing command-line arguments | ||
- Routing commands to the appropriate handlers | ||
- Formatting output for the user | ||
|
||
#### Core Logic Layer | ||
The core logic layer contains the business logic of the application. It includes: | ||
- Data models and domain objects | ||
- Validation logic | ||
- Processing algorithms | ||
- Error handling | ||
|
||
#### Data Layer | ||
The data layer manages access to data sources and external systems: | ||
- File I/O operations | ||
- Data format conversions | ||
- Communication with external APIs | ||
- Caching mechanisms | ||
|
||
#### Plugin System | ||
The plugin system enables extensibility: | ||
- Format plugins for different file formats | ||
- Visualization plugins for different visualization types | ||
- Processing plugins for different analysis methods | ||
|
||
### Key Components | ||
|
||
#### Core Components | ||
|
||
- **Models**: Data classes representing domain objects like `Containment` and `SRAMetadata` | ||
- **Validation**: Functions for validating input data and configurations | ||
- **Exceptions**: Custom exception hierarchy for clear error reporting | ||
|
||
#### Data Access Components | ||
|
||
- **file_io**: Abstract file operations for reading/writing various file formats | ||
- **branchwater**: Functionality for working with Branchwater data | ||
- **metadata**: Functions for downloading and processing metadata | ||
- **sra**: Functions for downloading and working with SRA data | ||
|
||
#### Processing Components | ||
|
||
- **containment**: Algorithms for analyzing containment data | ||
- **counts**: Functions for counting and summarizing metadata | ||
- **statistics**: Statistical analysis utilities | ||
|
||
#### Visualization Components | ||
|
||
- **plots**: Functions for generating various types of plots | ||
- **reporting**: Tools for generating reports | ||
|
||
#### Plugin System Components | ||
|
||
- **base**: Base classes and registries for plugins | ||
- **formats**: File format plugins | ||
- **visualizers**: Visualization plugins | ||
|
||
#### Utility Components | ||
|
||
- **logging**: Logging configuration and utilities | ||
- **config**: Configuration management | ||
|
||
## Data Flow | ||
|
||
1. **Input**: User provides commands via CLI | ||
2. **Command Handling**: CLI routes to appropriate command handler | ||
3. **Data Access**: Commands interact with data sources via the data layer | ||
4. **Processing**: Data is processed using core logic components | ||
5. **Visualization**: Results are visualized or formatted for output | ||
6. **Output**: Results are presented to the user via CLI | ||
|
||
## Plugin System | ||
|
||
The plugin system uses a registry pattern: | ||
|
||
1. Plugins inherit from a base `Plugin` class | ||
2. Plugins are registered in a `PluginRegistry` | ||
3. The application can discover plugins at runtime | ||
4. Plugins provide specific implementations for abstract operations | ||
|
||
### Example: Format Plugins | ||
|
||
Format plugins handle different file formats: | ||
|
||
- `BranchWaterFormatPlugin`: Handles Branchwater CSV files | ||
- `MastiffFormatPlugin`: Handles Mastiff CSV files | ||
|
||
Each plugin provides standard methods like: | ||
- `validate_header`: Validates file headers | ||
- `parse_file`: Parses file content | ||
- `extract_metadata`: Extracts metadata from parsed content | ||
|
||
## Error Handling | ||
|
||
The application uses a consistent error handling approach: | ||
|
||
1. Custom exception hierarchy starting with `MetaQuestError` | ||
2. Specific exception types for different error categories | ||
3. Each layer handles errors appropriate to its level | ||
4. CLI layer catches and formats errors for user display | ||
|
||
## Configuration Management | ||
|
||
The application manages configuration through: | ||
|
||
1. Default configuration values | ||
2. Configuration file in user's home directory | ||
3. Environment variables | ||
4. Command-line overrides | ||
|
||
## Future Extensions | ||
|
||
The architecture is designed to accommodate future extensions: | ||
|
||
1. New file format plugins | ||
2. Additional visualization types | ||
3. More analysis algorithms | ||
4. Web interface or API | ||
5. Database integration |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,18 @@ | ||
""" | ||
MetaQuest - A toolkit for analyzing metagenomic datasets based on genome containment. | ||
MetaQuest helps users search through SRA datasets to find containment of specified | ||
genomes and analyze associated metadata. | ||
""" | ||
|
||
__version__ = "0.3.0" | ||
__author__ = "Andreas Sjödin" | ||
__email__ = "[email protected]" | ||
|
||
# Import commonly used modules for easier access | ||
from metaquest.core.exceptions import MetaQuestError, ValidationError | ||
from metaquest.utils.config import get_config, set_config | ||
from metaquest.utils.logging import setup_logging | ||
|
||
# Initialize logging by default | ||
setup_logging() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
""" | ||
Command Line Interface for MetaQuest. | ||
This package provides the command-line interface for the MetaQuest toolkit. | ||
""" | ||
|
||
# Import command handlers and CLI entry point | ||
from metaquest.cli.commands import ( | ||
download_test_genome_command, | ||
process_branchwater_command, | ||
extract_branchwater_metadata_command, | ||
parse_containment_command, | ||
download_metadata_command, | ||
parse_metadata_command, | ||
count_metadata_command, | ||
single_sample_command, | ||
plot_containment_command, | ||
plot_metadata_counts_command, | ||
download_sra_command, | ||
assemble_datasets_command | ||
) | ||
from metaquest.cli.main import main, create_parser |
Oops, something went wrong.