Update README.md
Eve-ning committed Dec 13, 2023
1 parent dce9bc6 commit 3cf4cef
Showing 1 changed file with 20 additions and 31 deletions.
This repository contains all the code for our models. It is part of our
end-to-end (E2E) product pipeline.

```mermaid
graph LR
A[Data Collection] --> B[FRDC-ML] --> C[FRDC-UI]
```

Currently, this project is a heavy work in progress (WIP).

```
FRDC-ML/
    main.py                  # Pipeline Entry Point
    tests/                   # PyTest Tests
        model-tests/         # Tests for each model
        integration-tests/   # Tests that run the entire pipeline
        unit-tests/          # Tests for each component
    ...
```

## Our Architecture

This is a classic, simple Python package architecture; however, we
**HEAVILY EMPHASIZE** encapsulation of each stage. That means data should
never **IMPLICITLY** persist across stages. We enforce this through our
`src/main.py` entry point.
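To make the "no implicit persistence" rule concrete, here is a minimal,
self-contained sketch (toy stages, not the real `frdc` API): every stage takes
its input explicitly and returns its output explicitly.

```python
def load(name: str) -> list[int]:
    # Toy "dataset" loader standing in for a real data source.
    return [3, 1, 2]


def preprocess(data: list[int]) -> list[int]:
    # Pure function: no globals, no hidden state shared between stages.
    return sorted(data)


def train(data: list[int], lr: float = 0.01) -> dict:
    # Stand-in "model" that just records what it was trained on.
    return {"data": data, "lr": lr}


# The pipeline is nothing but explicit function composition.
model = train(preprocess(load("chestnut")), lr=0.01)
print(model["data"])  # [1, 2, 3]
```

Because nothing leaks between stages, any stage can be swapped or re-run in
isolation without surprising the others.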

Each function should have a high-level, preferably intuitive, English name.

```python
from torch.optim import Adam

from frdc.load.dataset import FRDCDataset
from frdc.preprocess.morphology import remove_small_objects, watershed
from frdc.train import train

# Load the dataset, then pass it explicitly through each stage.
ar = FRDCDataset("chestnut", "date", ...)
ar = watershed(ar)
ar = remove_small_objects(ar, min_size=100)
model = train(ar, lr=0.01, optimizer=Adam)
...
```

To illustrate this, take a look at how
`tests/model_tests/chestnut_dec_may/train.py` is written. It pulls in the
relevant modules from each stage and constructs a pipeline.

This architecture allows for:

1) Easily legible, high-level pipelines
2) Flexibility
   1) Conventional Python signatures can be used to pass arguments
   2) If necessary, we can leverage everything else Python offers
3) Easily replicable pipelines
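Point 2 in practice: because each stage is an ordinary Python function,
something like a hyperparameter sweep needs no framework DSL, just plain
Python. A toy sketch (hypothetical names, not the real `frdc` API):

```python
from itertools import product


def train(data: list[int], lr: float = 0.01, optimizer: str = "adam") -> dict:
    # Toy trainer: the "score" is just a function of its inputs.
    return {"lr": lr, "optimizer": optimizer, "score": len(data) * lr}


data = [1, 2, 3, 4]

# A grid search is an ordinary loop over ordinary keyword arguments.
runs = [train(data, lr=lr, optimizer=opt)
        for lr, opt in product([0.01, 0.1], ["adam", "sgd"])]
best = max(runs, key=lambda r: r["score"])
print(best["lr"])  # 0.1
```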

> Initially, we evaluated a few end-to-end ML solutions. Despite offering
> great functionality, their flexibility was limited: from a dev perspective,
> **Active Learning** was a gray area, and we foresaw heavy shoehorning.
> Ultimately, we decided that the risk was too great, so we built our own
> solution.

## Contributing

### Pre-commit Hooks

We use Black and Flake8 as our pre-commit hooks. To install them, run the following commands:

```bash
poetry install
# ...
```
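For reference, a `.pre-commit-config.yaml` wiring up Black and Flake8
typically looks something like this (a sketch; the `rev` pins are assumptions,
and the repo's actual config may differ):

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.12.0
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/flake8
    rev: 6.1.0
    hooks:
      - id: flake8
```

Running `pre-commit install` once registers the hooks with Git, after which
they run automatically on every commit.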
