Forest Recovery Digital Companion Machine Learning Pipeline Repository
This repository contains all the code for our machine learning models. It is one part of the end-to-end (E2E) pipeline for our product.
graph LR
A[Data Collection] --> B[FRDC-ML] --> C[FRDC-UI]
This repository is currently a heavy work in progress (WIP).
I highly recommend reading our website documentation. It contains tutorials and docs on how to use our modules.
FRDC-ML/
    src/                    # All relevant code
        frdc/               # Package/Component Level code
            load/           # Image I/O
            preprocess/     # Image Preprocessing
            train/          # ML Training
            evaluate/       # Model Evaluation
            ...             # ...
        main.py             # Pipeline Entry Point
    tests/                  # PyTest Tests
        model-tests/        # Tests for each model
        integration-tests/  # Tests that run the entire pipeline
        unit-tests/         # Tests for each component
    poetry.lock             # Poetry managed environment file
    pyproject.toml          # Project-level information: requirements, settings, name, deployment info
    .github/                # GitHub Actions
This is a classic, simple Python package architecture; however, we HEAVILY EMPHASIZE encapsulation of each stage. That means data should never IMPLICITLY persist across stages: everything a stage needs is passed in explicitly, and everything it produces is returned explicitly.
To illustrate this, take a look at how tests/model_tests/chestnut_dec_may/train.py is written. It pulls in relevant modules from each stage and constructs a pipeline.
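As a rough illustration of the encapsulation idea, here is a minimal, self-contained sketch. It does not use the actual frdc API: Dataset, load, preprocess, train, evaluate and run_pipeline are hypothetical stand-ins, and the "model" is a toy threshold classifier. The only point is that each stage receives its inputs as arguments and hands its outputs back as return values, with no globals or hidden state in between.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Dataset:
    """Hypothetical container passed explicitly between stages."""
    features: list[float]
    labels: list[int]


def load() -> Dataset:
    # Stand-in for the load stage: returns a tiny in-memory dataset.
    return Dataset(features=[0.0, 2.0, 8.0, 10.0], labels=[0, 0, 1, 1])


def preprocess(ds: Dataset) -> Dataset:
    # Min-max scale the features to [0, 1]. A new Dataset is returned,
    # so the input object is left untouched.
    lo, hi = min(ds.features), max(ds.features)
    scaled = [(x - lo) / (hi - lo) for x in ds.features]
    return Dataset(features=scaled, labels=ds.labels)


def train(ds: Dataset) -> float:
    # Toy "model": a threshold halfway between the two class means.
    mean_0 = mean(x for x, y in zip(ds.features, ds.labels) if y == 0)
    mean_1 = mean(x for x, y in zip(ds.features, ds.labels) if y == 1)
    return (mean_0 + mean_1) / 2


def evaluate(threshold: float, ds: Dataset) -> float:
    # Accuracy of the threshold classifier on the given dataset.
    preds = [int(x > threshold) for x in ds.features]
    return mean(int(p == y) for p, y in zip(preds, ds.labels))


def run_pipeline() -> float:
    # The pipeline is just explicit hand-offs between stages.
    raw = load()
    clean = preprocess(raw)
    threshold = train(clean)
    return evaluate(threshold, clean)
```

Because every hand-off is visible in run_pipeline, any single stage can be swapped out or tested in isolation by calling it with hand-built inputs.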
We use Black and Flake8 as our pre-commit hooks. To install them, run the following commands:
poetry install
pre-commit install
If you're using pip instead of poetry, run the following commands:
pip install pre-commit
pre-commit install
Alternatively, you can configure Black in your own IDE.