Getting Started
Installing the Dev. Environment
Ensure that you have the right version of Python. The required Python version is listed in pyproject.toml:

    [tool.poetry.dependencies]
    python = "..."

Start by cloning our repository.

    git clone https://github.com/Forest-Recovery-Digital-Companion/FRDC-ML.git

Then, create a Python virtual environment (venv).

    python -m venv venv/

Or, where the interpreter is named python3 (typical on Linux and macOS):

    python3 -m venv venv/

Install Poetry, then check that it is installed.

    poetry --version

Activate the virtual environment. On Windows:

    cd venv/Scripts
    activate
    cd ../..

On Linux or macOS:

    source venv/bin/activate

Install the dependencies. You should be in the same directory as pyproject.toml.

    poetry install --with dev

Install the pre-commit hooks.

    pre-commit install
Setting Up Google Cloud
We use Google Cloud to store our datasets. To set up Google Cloud, install the Google Cloud CLI.
Then, authenticate your account.
To make sure everything is working, run the following command:

    gsutil ls
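The steps so far depend on several command-line tools being installed. As a minimal sketch, the following Python snippet reports which of them are not on your PATH. The tool list is taken from the commands used in this guide, plus gcloud, which is assumed to be the executable installed by the Google Cloud CLI. The helper name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools that cannot be found on PATH, in the given order."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Commands used in this guide; run this with the virtual environment
    # active, since some tools may be installed inside it.
    required = ["git", "poetry", "pre-commit", "gcloud", "gsutil"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

This checks only that the executables exist. It does not confirm that your Google Cloud account is authenticated, which is what gsutil ls verifies.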
Pre-commit Hooks
    pre-commit install
Running the Tests
Run the tests to make sure everything is working.

    pytest

In case of errors:
- google.auth.exceptions.DefaultCredentialsError
This error means that no Google Cloud credentials were found, usually because you haven't authenticated your Google Cloud account. See Setting Up Google Cloud.
- ModuleNotFoundError
This error usually means that the dependencies are not installed, or that the virtual environment is not activated. See Installing the Dev. Environment.
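To narrow down a ModuleNotFoundError before re-running the whole test suite, you can check whether a module is importable from the active environment. This is a minimal sketch using only the standard library. The helper name is_importable is hypothetical, and frdc is assumed to be the import name of the package under src/frdc/.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.prefix points inside venv/ when the virtual environment is active.
    print("Environment:", sys.prefix)
    for name in ["frdc", "pytest"]:  # "frdc" is an assumed import name
        status = "ok" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

If the printed environment path is not inside venv/, activate the virtual environment first. If it is, re-run the dependency installation.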
Our Repository Structure
Before starting development, take a look at our repository structure. This will help you understand where to put your code.
- src/frdc/
Source code for our package. These are the unit components of our pipeline.
- rsc/
Resources. These are usually cached datasets.
- pipeline/
Pipeline code. These are the full ML tests of our pipeline.
- tests/
pytest tests. These are unit tests and integration tests.
Unit, Integration, and Pipeline Tests
We have 3 types of tests:
- Unit Tests are usually small, single-function tests.
- Integration Tests are larger tests that test a mock pipeline.
- Pipeline Tests are the true production pipeline tests that will generate a model.
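As an illustration of the smallest category, a unit test exercises one function in isolation. The sketch below is not taken from the repository: scale_to_unit_range is a hypothetical helper, and the tests use plain assert statements so that pytest could collect them from a file such as tests/test_example.py (also a hypothetical path).

```python
def scale_to_unit_range(values: list[float]) -> list[float]:
    """Linearly rescale values so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant input has no range; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scale_to_unit_range_bounds():
    # Hypothetical unit test: one function, one small input.
    scaled = scale_to_unit_range([2.0, 4.0, 6.0])
    assert scaled == [0.0, 0.5, 1.0]


def test_scale_to_unit_range_constant_input():
    # Edge case: identical values must not divide by zero.
    assert scale_to_unit_range([3.0, 3.0]) == [0.0, 0.0]
```

With its default settings, pytest collects functions whose names start with test_ from files named test_*.py or *_test.py, so no extra registration is needed. An integration test would follow the same shape but chain several components over a mock pipeline.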
Where Should I Contribute?
- Changing a small component
If you're changing a small component, such as an argument for preprocessing, a new model architecture, or a new configuration for a dataset, take a look at the src/frdc/ directory.
- Adding a test
When you add a new component, you'll need to add a new test. Take a look at the tests/ directory.
- Changing the pipeline
If you're an ML researcher, you'll probably be changing the pipeline. Take a look at the pipeline/ directory.
- Adding a dependency
If you're adding a new dependency, use poetry add PACKAGE and commit the changes to pyproject.toml and poetry.lock.