0.1.2 #76

Merged
merged 32 commits into from
Jun 28, 2024
0f36c20
Enable installing torch from poetry
Eve-ning Jun 19, 2024
c58126e
Drop unused deps
Eve-ning Jun 19, 2024
090eabd
Remove installation in workflow to test poetry
Eve-ning Jun 19, 2024
110c253
Force tests to run on poetry dep change
Eve-ning Jun 19, 2024
d5bc1f1
Merge pull request #75 from FR-DC/frml-159
Eve-ning Jun 19, 2024
c7c7dc0
Merge branch 'main' into 0.1.2
Eve-ning Jun 19, 2024
b80d779
Implement Per Pixel Scaler
Eve-ning Jun 20, 2024
bffa9ae
Improve logging information during initialization
Eve-ning Jun 21, 2024
5294c3d
Update FRDCDataset to hide unused methods
Eve-ning Jun 21, 2024
3d4c7e6
Update docs to encourage iterating
Eve-ning Jun 21, 2024
0bb4734
Refactor dataset.py
Eve-ning Jun 26, 2024
856eed7
Remove unraisable error
Eve-ning Jun 26, 2024
78a5bf6
Drop unused imports and add return type hint
Eve-ning Jun 26, 2024
b17cb84
Update signatures for get legacy bounds
Eve-ning Jun 26, 2024
a584ad3
Clarify attribute naming of order
Eve-ning Jun 26, 2024
1673a90
Update docs
Eve-ning Jun 26, 2024
965b623
Default DEBUG to use legacy bounds for tests
Eve-ning Jun 26, 2024
6b49b5e
Make .env non-mandatory
Eve-ning Jun 26, 2024
656b203
Fix issue with .env not being copied in workflow
Eve-ning Jun 26, 2024
80cb6fe
Merge pull request #77 from FR-DC/frml-155
Eve-ning Jun 26, 2024
a479dc8
Merge branch '0.1.2' into frml-161
Eve-ning Jun 28, 2024
725a882
Drop transform_scale in favor of Compose
Eve-ning Jun 28, 2024
719c73b
Remove buggy pre-commit
Eve-ning Jun 28, 2024
24b7f29
Add functionality to ImSS to handle nested data
Eve-ning Jun 28, 2024
6fbe562
Add flatten nested, fn_recursive -> map_nested
Eve-ning Jun 28, 2024
7e67b1f
Update train_fixmatch.py
Eve-ning Jun 28, 2024
c6eaf1d
Fix incorrect outdated name
Eve-ning Jun 28, 2024
9fb7add
Fix incorrect transform append syntax
Eve-ning Jun 28, 2024
26f1026
Fix issue with float type after transform
Eve-ning Jun 28, 2024
872f364
Fix issue with compatibility with transform append
Eve-ning Jun 28, 2024
bf801ee
Merge pull request #79 from FR-DC/frml-161
Eve-ning Jun 28, 2024
bcea011
Fix issue with MixMatch Run
Eve-ning Jun 28, 2024
5 changes: 5 additions & 0 deletions .env.example
@@ -0,0 +1,5 @@
LABEL_STUDIO_API_KEY=
LABEL_STUDIO_HOST=10.97.41.70
LABEL_STUDIO_PORT=8080
GCS_PROJECT_ID=frmodel
GCS_BUCKET_NAME=frdc-ds
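
The five keys above follow the usual `KEY=VALUE` dotenv layout. As a minimal sketch of how such a file could be read, assuming only the standard library: the `parse_env_text` helper below is hypothetical and is not taken from this PR, which may well load the file differently.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines. Hypothetical helper, not the project's loader."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Ignore lines that have no '=' at all.
        values[key.strip()] = value.strip()
    return values


# The contents of .env.example as added in this PR.
EXAMPLE = """\
LABEL_STUDIO_API_KEY=
LABEL_STUDIO_HOST=10.97.41.70
LABEL_STUDIO_PORT=8080
GCS_PROJECT_ID=frmodel
GCS_BUCKET_NAME=frdc-ds
"""

settings = parse_env_text(EXAMPLE)
```

Note that every value comes back as a string, so a port would need an explicit `int(settings["LABEL_STUDIO_PORT"])`, and the empty `LABEL_STUDIO_API_KEY` parses to an empty string rather than being absent.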
6 changes: 5 additions & 1 deletion .github/workflows/basic-tests.yml
@@ -6,6 +6,7 @@ on:
pull_request:
paths:
- "src/frdc/**"
- "poetry.lock"

jobs:
build:
@@ -40,7 +41,6 @@ jobs:
python -m pip install flake8 pytest poetry
poetry export --with dev --without-hashes -o requirements.txt
pip install -r requirements.txt
pip install torch torchaudio torchvision lightning

- name: Lint with flake8
run: |
@@ -49,6 +49,10 @@ jobs:
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
flake8 src/ --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

- name: Copy over .env
run: |
cp .env.example .env

- name: Test with pytest
run: |
pytest
5 changes: 4 additions & 1 deletion .github/workflows/model-tests.yml
@@ -53,7 +53,6 @@ jobs:
python3 -m pip install flake8 pytest poetry
poetry export --with dev --without-hashes -o requirements.txt
pip3 install -r requirements.txt
pip3 install torch torchvision torchaudio

- name: Check CUDA is available
run: nvidia-smi
@@ -81,6 +80,10 @@ jobs:
uses: mxschmitt/action-tmate@v3
if: ${{ github.event_name == 'workflow_dispatch' && inputs.debug_enabled }}

- name: Copy over .env
run: |
cp .env.example .env

- name: Run Model Training
working-directory: ${{ github.workspace }}/tests
run: |
2 changes: 1 addition & 1 deletion Writerside/d.tree
@@ -8,6 +8,7 @@
start-page="Overview.md">

<toc-element topic="Overview.md"/>
<toc-element topic="ML-Architecture.md"/>
<toc-element topic="Getting-Started.md">
<toc-element topic="Get-Started-with-Dev-Containers.md"/>
</toc-element>
@@ -28,6 +29,5 @@
<toc-element topic="preprocessing.extract_segments.md"/>
<toc-element topic="preprocessing.morphology.md"/>
<toc-element topic="preprocessing.glcm_padded.md"/>
<toc-element topic="train.frdc_lightning.md"/>
</toc-element>
</instance-profile>
5 changes: 5 additions & 0 deletions Writerside/topics/Get-Started-with-Dev-Containers.md
@@ -47,3 +47,8 @@ steps such as:
- Google Cloud Application Default Credentials
- Weight & Bias API Key
- Label Studio API Key

> You can set the API keys in the `.env` file in the root of the project.
> Be careful not to commit the `.env` file to the repository; it should
> already be ignored by default.
{style='note'}
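
The note above relies on `.env` already being ignored. As a rough sketch of how that could be double-checked, assuming a `.gitignore` at the project root: the helper name below is hypothetical, and the check is deliberately naive, since it only looks for a few literal patterns and does not implement git's real matching rules (negations, nested ignore files and so on).

```python
from pathlib import Path


def env_is_listed_in_gitignore(gitignore_text: str) -> bool:
    """Naive check for a line that ignores `.env` (hypothetical helper)."""
    # Only a few literal spellings are recognised; this is an assumption,
    # not a full gitignore pattern matcher.
    candidates = {".env", "/.env", "*.env", ".env*"}
    for raw_line in gitignore_text.splitlines():
        if raw_line.strip() in candidates:
            return True
    return False


def repo_ignores_env(root: Path) -> bool:
    """Apply the naive check to `<root>/.gitignore`, if that file exists."""
    gitignore = root / ".gitignore"
    if not gitignore.exists():
        return False
    return env_is_listed_in_gitignore(gitignore.read_text())
```

For an authoritative answer, `git check-ignore .env` applies git's own rules and is the safer check in practice.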
274 changes: 147 additions & 127 deletions Writerside/topics/Getting-Started.md
@@ -1,155 +1,173 @@
# Getting Started

> Want to use a Dev Container? See [Get Started with Dev Containers](Get-Started-with-Dev-Containers.md)
> Want to use a Dev Container?
> See [Get Started with Dev Containers](Get-Started-with-Dev-Containers.md)

<procedure title="Installing the Dev. Environment" id="install">
<step>Ensure that you have the right version of Python.
The required Python version can be seen in <code>pyproject.toml</code>
<code-block lang="ini">
[tool.poetry.dependencies]
python = "..."
</code-block>
</step>
<step>Start by cloning our repository.
<code-block lang="shell">
git clone https://github.com/FR-DC/FRDC-ML.git
</code-block>
</step>
<step>Then, create a Python Virtual Env <code>pyvenv</code>
<tabs>
<tab title="Windows">
<code-block lang="shell">python -m venv venv/</code-block>
</tab>
<tab title="Linux">
<code-block lang="shell">python3 -m venv venv/</code-block>
</tab>
</tabs>
</step>
<step>
<a href="https://python-poetry.org/docs/">Install Poetry</a>
Then check if it's installed with
<code-block lang="shell">poetry --version</code-block>
<warning>
If <code>poetry</code> is not found, it's likely not in the user PATH.
</warning>
</step>
<step>Activate the virtual environment
<tabs>
<tab title="Windows">
<code-block lang="shell">
cd venv/Scripts
activate
cd ../..
</code-block>
</tab>
<tab title="Linux">
<code-block lang="shell">
source venv/bin/activate
</code-block>
</tab>
</tabs>
</step>
<step>Install the dependencies. You should be in the same directory as
<code>pyproject.toml</code>
<step>Ensure that you have the right version of Python.
The required Python version can be seen in <code>pyproject.toml</code>
<code-block lang="ini">
[tool.poetry.dependencies]
python = "..."
</code-block>
</step>
<step>Start by cloning our repository.
<code-block lang="shell">
git clone https://github.com/FR-DC/FRDC-ML.git
</code-block>
</step>
<step>Then, create a Python Virtual Env <code>pyvenv</code>
<tabs>
<tab title="Windows">
<code-block lang="shell">python -m venv venv/</code-block>
</tab>
<tab title="Linux">
<code-block lang="shell">python3 -m venv venv/</code-block>
</tab>
</tabs>
</step>
<step>
<a href="https://python-poetry.org/docs/">Install Poetry</a>
Then check if it's installed with
<code-block lang="shell">poetry --version</code-block>
<warning>
If <code>poetry</code> is not found, it's likely not in the user PATH.
</warning>
</step>
<step>Activate the virtual environment
<tabs>
<tab title="Windows">
<code-block lang="shell">
poetry install --with dev
cd venv/Scripts
activate
cd ../..
</code-block>
</step>
<step>Install Pre-Commit Hooks
</tab>
<tab title="Linux">
<code-block lang="shell">
pre-commit install
source venv/bin/activate
</code-block>
</step>
</tab>
</tabs>
</step>
<step>Install the dependencies. You should be in the same directory as
<code>pyproject.toml</code>
<code-block lang="shell">
poetry install --with dev
</code-block>
</step>
<step>
Make a copy of the <code>.env.example</code> file and rename it to
<code>.env</code>
</step>
<step>Fill in additional environment variables in the <code>.env</code> file
<code-block>
LABEL_STUDIO_API_KEY=...
LABEL_STUDIO_HOST=10.97.41.70
LABEL_STUDIO_PORT=8080
GCS_PROJECT_ID=frmodel
GCS_BUCKET_NAME=frdc-ds
</code-block>
</step>
<step>Install Pre-Commit Hooks
<code-block lang="shell">
pre-commit install
</code-block>
</step>
</procedure>


<procedure title="Setting Up Google Cloud" id="gcloud">
<step>
We use Google Cloud to store our datasets. To set up Google Cloud,
<a href="https://cloud.google.com/sdk/docs/install">
install the Google Cloud CLI
</a>
</step>
<step>
Then,
<a href="https://cloud.google.com/sdk/docs/initializing">
authenticate your account
</a>.
<code-block lang="shell">gcloud auth login</code-block>
</step>
<step>
Finally,
<a href="https://cloud.google.com/docs/authentication/provide-credentials-adc">
set up Application Default Credentials (ADC)
</a>.
<code-block lang="shell">gcloud auth application-default login</code-block>
</step>
<step>
To make sure everything is working, <a anchor="tests">run the tests</a>.
</step>
<step>
We use Google Cloud to store our datasets. To set up Google Cloud,
<a href="https://cloud.google.com/sdk/docs/install">
install the Google Cloud CLI
</a>
</step>
<step>
Then,
<a href="https://cloud.google.com/sdk/docs/initializing">
authenticate your account
</a>.
<code-block lang="shell">gcloud auth login</code-block>
</step>
<step>
Finally,
<a href="https://cloud.google.com/docs/authentication/provide-credentials-adc">
set up Application Default Credentials (ADC)
</a>.
<code-block lang="shell">gcloud auth application-default login</code-block>
</step>
<step>
To make sure everything is working, <a anchor="tests">run the tests</a>.
</step>
</procedure>

<procedure title="Setting Up Label Studio" id="ls">
<tip>This is only necessary if any task requires Label Studio annotations</tip>
<step>
We use Label Studio to annotate our datasets.
We won't go through how to install Label Studio, for contributors, it
should be up on <code>localhost:8080</code>.
</step>
<step>
Then, retrieve your own API key from Label Studio.
<a href="http://localhost:8080/user/account"> Go to your account page </a>
and copy the API key. <br/></step>
<step> Set your API key as an environment variable.
<tabs>
<tab title="Windows">
<tip>This is only necessary if any task requires Label Studio annotations</tip>
<step>
We use Label Studio to annotate our datasets.
We won't go through how to install Label Studio, for contributors, it
should be up on <code>localhost:8080</code>.
</step>
<step>
Then, retrieve your own API key from Label Studio.
<a href="http://localhost:8080/user/account"> Go to your account page </a>
and copy the API key. <br/></step>
<step> Set your API key as an environment variable.
<tabs>
<tab title="Windows">
In Windows, go to "Edit environment variables for
your account" and add this as a new environment variable with name
<code>LABEL_STUDIO_API_KEY</code>.
</tab>
<tab title="Linux">
</tab>
<tab title="Linux">
Export it as an environment variable.
<code-block lang="shell">export LABEL_STUDIO_API_KEY=...</code-block>
</tab>
</tabs>
</step>
</tab>
<tab title=".env">
In all cases, you can create a <code>.env</code> file in the root of
the project and add the following line:
<code>LABEL_STUDIO_API_KEY=...</code>
</tab>
</tabs>
</step>
</procedure>

<procedure title="Setting Up Weight and Biases" id="wandb">
<step>
We use W&B to track our experiments. To set up W&B,
<a href="https://docs.wandb.ai/quickstart">
install the W&B CLI
</a>
</step>
<step>
Then,
<a href="https://docs.wandb.ai/quickstart">
authenticate your account
</a>.
<code-block lang="shell">wandb login</code-block>
</step>
<step>
We use W&B to track our experiments. To set up W&B,
<a href="https://docs.wandb.ai/quickstart">
install the W&B CLI
</a>
</step>
<step>
Then,
<a href="https://docs.wandb.ai/quickstart">
authenticate your account
</a>.
<code-block lang="shell">wandb login</code-block>
</step>
</procedure>

<procedure title="Pre-commit Hooks" collapsible="true">
<note>This is optional but recommended.
Pre-commit hooks are a way to ensure that your code is formatted correctly.
This is done by running a series of checks before you commit your code.
</note>
<step>
<code-block lang="shell">
pre-commit install
</code-block>
</step>
<note>This is optional but recommended.
Pre-commit hooks are a way to ensure that your code is formatted correctly.
This is done by running a series of checks before you commit your code.
</note>
<step>
<code-block lang="shell">
pre-commit install
</code-block>
</step>
</procedure>

<procedure title="Running the Tests" id="tests">
<step>
Run the tests to make sure everything is working
<code-block lang="shell">
pytest
</code-block>
</step>
<step>
Run the tests to make sure everything is working
<code-block lang="shell">
pytest
</code-block>
</step>
</procedure>

## Troubleshooting
@@ -174,13 +192,15 @@ See [Setting Up Google Cloud](#gcloud)
### Couldn't connect to Label Studio

Label Studio must be running locally, exposed on `localhost:8080`. Furthermore,
you need to specify the `LABEL_STUDIO_API_KEY` environment variable. See
you need to specify the `LABEL_STUDIO_API_KEY` environment variable. See
[Setting Up Label Studio](#ls)

### Cannot login to W&B

You need to authenticate your W&B account. See [Setting Up Weight and Biases](#wandb)
If you're facing difficulties, set the `WANDB_MODE` environment variable to `offline`
You need to authenticate your W&B account.
See [Setting Up Weight and Biases](#wandb)
If you're facing difficulties, set the `WANDB_MODE` environment variable
to `offline`
to disable W&B.
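
If setting the variable in the shell is inconvenient, it can also be set from Python before W&B is initialised. A minimal sketch, assuming the variable is read when the run starts: `set_wandb_offline` is a hypothetical helper and not part of this repository.

```python
import os
from typing import MutableMapping, Optional


def set_wandb_offline(
    environ: Optional[MutableMapping[str, str]] = None,
) -> str:
    """Set WANDB_MODE=offline in the given mapping (default: os.environ)."""
    env = os.environ if environ is None else environ
    env["WANDB_MODE"] = "offline"
    return env["WANDB_MODE"]


# Demonstrated on a plain dict so the sketch leaves the real environment alone.
demo_env: dict[str, str] = {}
set_wandb_offline(demo_env)
```

In a real script, call `set_wandb_offline()` with no arguments before the first W&B call. As far as I know, `offline` keeps logging runs locally without syncing them, whereas W&B also documents a `disabled` mode that turns logging into a no-op; which one fits depends on whether the local run data is still wanted.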

## Our Repository Structure