
FRML-44 Integrate CML #17

Merged: 31 commits, Nov 20, 2023
fd7a486  Test custom runner (Eve-ning, Nov 6, 2023)
9d961ff  Try use poetry to install (Eve-ning, Nov 6, 2023)
2e59b92  Fix pipx not installed issue (Eve-ning, Nov 6, 2023)
9a1b3f2  Fix pipx not installed issue (Eve-ning, Nov 6, 2023)
184c330  Fix pipx not installed issue (Eve-ning, Nov 6, 2023)
7406710  Fix pipx not installed issue (Eve-ning, Nov 6, 2023)
2758d57  Fix pipx not installed issue (Eve-ning, Nov 6, 2023)
86cf7ad  Update poetry lock (Eve-ning, Nov 6, 2023)
fda1a42  Use poetry to run pytest (Eve-ning, Nov 6, 2023)
0b73851  Add cuda toolkit installation (Eve-ning, Nov 6, 2023)
1064c34  Add cuda toolkit installation (Eve-ning, Nov 6, 2023)
56fb64c  Test custom CML runner (Eve-ning, Nov 8, 2023)
3835b8a  Change container to CML GPU (Eve-ning, Nov 8, 2023)
cde72c5  Fix missing poetry (Eve-ning, Nov 8, 2023)
da1c810  Fix missing poetry (Eve-ning, Nov 8, 2023)
b6eac89  Remove caching (Eve-ning, Nov 8, 2023)
a3e61ba  Try to use torch gpu (Eve-ning, Nov 8, 2023)
a904fc3  Try use non-container (Eve-ning, Nov 8, 2023)
1f05f7b  Try use traditional pip (Eve-ning, Nov 8, 2023)
eebd369  Try use CML comment (Eve-ning, Nov 8, 2023)
dcc1f30  Try use CML comment (Eve-ning, Nov 8, 2023)
2d10518  Add write-all perms (Eve-ning, Nov 8, 2023)
51b1b6e  Fix GH perms for token (Eve-ning, Nov 8, 2023)
80e50c6  Try with inferred token (Eve-ning, Nov 8, 2023)
d0758fb  Add wandb dependency (Nov 8, 2023)
d700604  Test mvp run without GLCM (Eve-ning, Nov 8, 2023)
276e788  Fix incorrect test run python call (Eve-ning, Nov 8, 2023)
13d7758  Call via module syntax to preserve path (Eve-ning, Nov 8, 2023)
8392545  Add src to PYTHONPATH (Eve-ning, Nov 8, 2023)
df6cf56  Fix some issues with WandB export and PYTHONPATH (Eve-ning, Nov 8, 2023)
2cf9193  Fix issue with report exporting relative to call (Eve-ning, Nov 8, 2023)
58 changes: 58 additions & 0 deletions .github/workflows/model.yml
@@ -0,0 +1,58 @@
name: Model Training

on:
  pull_request:

jobs:
  build:

    runs-on: self-hosted
    container:
      image: docker://ghcr.io/iterative/cml:0-dvc2-base1-gpu
      volumes:
        - /home/runner/work/_temp/_github_home:/root

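    # Note (assumption based on CML's self-hosted runner docs): the CML GPU
    # image ships the CML CLI preinstalled, and the volume mount maps the
    # runner's GitHub home into the container's /root so credentials written
    # by earlier actions are visible inside the container.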
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python 3.11
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
          cache: 'pip'

      - name: Install via exported requirements.txt
        run: |
          python -m pip install --upgrade pip
          python -m pip install flake8 pytest poetry
          poetry export --with dev --without-hashes -o requirements.txt
          pip3 install -r requirements.txt
          pip3 install torch torchvision torchaudio

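      # Note: the Poetry lock is exported to requirements.txt and installed
      # with plain pip, presumably to avoid Poetry's own virtualenv inside
      # the container; torch/torchvision/torchaudio are installed separately
      # so pip can resolve wheels suited to the runner.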
      - name: Set up gcloud
        id: 'auth'
        uses: 'google-github-actions/auth@v1'
        with:
          credentials_json: '${{ secrets.FRDC_DOWNLOAD_KEY }}'

      - name: Set up Cloud SDK
        uses: 'google-github-actions/setup-gcloud@v1'

      - name: Set up WandB
        run: |
          echo "WANDB_API_KEY=${{ secrets.WANDB_API_KEY }}" >> $GITHUB_ENV

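      # Lines appended to $GITHUB_ENV become environment variables in every
      # subsequent step of this job, so the training step below picks up
      # WANDB_API_KEY.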
      - name: Add src as PYTHONPATH
        run: |
          echo "PYTHONPATH=src" >> $GITHUB_ENV

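      # With PYTHONPATH=src, the module run below can import the frdc package
      # (assumption: the library code lives under src/frdc, matching the
      # imports in the test diff).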
      - name: Run Model Training
        run: |
          python3 -m tests.model_tests.chestnut_dec_may.main

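      # Assumption from CML's docs: `cml comment update` edits the previously
      # posted CML comment on the PR rather than adding a new comment on
      # every run, which is why the workflow needs a token with PR write
      # permission (see the "Add write-all perms" commit).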
      - name: Comment results via CML
        run: |
          cml comment update \
            --target=pr \
            --token ${{ secrets.GITHUB_TOKEN }} \
            tests/model_tests/chestnut_dec_may/report.md
8 changes: 5 additions & 3 deletions .github/workflows/python-package.yml
@@ -4,9 +4,11 @@
name: Python CI

on:
  push:
    branches: [ "main" ]
  pull_request:
  workflow_dispatch:

  # push:
  #   branches: [ "main" ]
  # pull_request:

jobs:
  build:
Empty file.
881 changes: 565 additions & 316 deletions poetry.lock

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -34,6 +34,7 @@ pytest = "^7.4.2"
pre-commit = "^3.5.0"
black = "^23.10.0"
flake8 = "^6.1.0"
wandb = "^0.16.0"


[tool.poetry.group.glcm.dependencies]
File renamed without changes.
File renamed without changes.
@@ -7,8 +7,8 @@

from frdc.train import FRDCDataModule
from frdc.train import FRDCModule
from pipeline.model_tests.chestnut_dec_may.preprocess import preprocess
from pipeline.model_tests.utils import get_dataset
from .preprocess import preprocess
from tests.model_tests.utils import get_dataset

# Get our Test
# TODO: Ideally, we should have a separate dataset for testing.
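The move from absolute `pipeline.model_tests...` imports to a relative `.preprocess` import only resolves because the workflow invokes the test with module syntax, i.e. the `python3 -m tests.model_tests.chestnut_dec_may.main` step above (see also the "Call via module syntax to preserve path" commit). A brief, hedged illustration of the standard Python behavior at play; the guard below is a hypothetical addition, not code from this PR:

```python
# Relative imports resolve against __package__, which is only set when the
# file runs as a module:
#
#   python -m tests.model_tests.chestnut_dec_may.main   # __package__ is set
#   python tests/model_tests/chestnut_dec_may/main.py   # raises ImportError:
#       "attempted relative import with no known parent package"
import sys

if __package__ in (None, ""):
    sys.exit("Run this test as a module: python -m tests.model_tests.chestnut_dec_may.main")
```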
@@ -3,6 +3,7 @@
This test is done by training a model on the 20201218 dataset, then testing on
the 20210510 dataset.
"""
from pathlib import Path

import lightning as pl
import numpy as np
@@ -16,19 +17,30 @@

from frdc.models import FaceNet
from frdc.train import FRDCDataModule, FRDCModule
from pipeline.model_tests.chestnut_dec_may.augmentation import augmentation
from pipeline.model_tests.chestnut_dec_may.preprocess import preprocess
from pipeline.model_tests.utils import get_dataset
from tests.model_tests.chestnut_dec_may.augmentation import augmentation
from tests.model_tests.chestnut_dec_may.preprocess import preprocess
from tests.model_tests.utils import get_dataset
from lightning.pytorch.loggers import WandbLogger
import wandb

assert wandb.run is None

def train_val_test_split(x: TensorDataset) -> list[Dataset, Dataset, Dataset]:
wandb.setup(wandb.Settings(program=__name__, program_relpath=__name__))
run = wandb.init()
logger = WandbLogger(name="chestnut_dec_may", project="frdc")


def train_val_test_split(
    x: TensorDataset,
) -> list[Dataset, Dataset, Dataset]:
    # Defines how to split the dataset into train, val, test subsets.
    # TODO: Quite ugly as it uses the global variables segments_0 and
    # segments_1. Will need to refactor this.
    return [
        Subset(x, list(range(len(segments_0)))),
        Subset(
            x, list(range(len(segments_0), len(segments_0) + len(segments_1)))
            x,
            list(range(len(segments_0), len(segments_0) + len(segments_1))),
        ),
        [],
    ]
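As an aside for readers of this hunk, here is a minimal, runnable sketch of the index arithmetic behind the split, using hypothetical stand-in segment lists rather than the project's loaders: the first `len(segments_0)` samples become train, the next `len(segments_1)` become val, and test stays empty:

```python
import torch
from torch.utils.data import Subset, TensorDataset

segments_0 = [0, 1, 2]  # stand-in for the Dec 2020 segments
segments_1 = [3, 4]     # stand-in for the May 2021 segments
x = TensorDataset(torch.arange(5.0).unsqueeze(1))

n0, n1 = len(segments_0), len(segments_1)
train = Subset(x, list(range(n0)))         # indices 0, 1, 2
val = Subset(x, list(range(n0, n0 + n1)))  # indices 3, 4
test = []                                  # no held-out test subset here

assert (len(train), len(val)) == (3, 2)
```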
@@ -40,12 +52,13 @@ def train_val_test_split(x: TensorDataset) -> list[Dataset, Dataset, Dataset]:
"chestnut_nature_park", "20210510", "90deg43m85pct255deg/map"
)


# Concatenate the datasets
segments = [*segments_0, *segments_1]
labels = [*labels_0, *labels_1]

BATCH_SIZE = 5
EPOCHS = 100
# EPOCHS = 100
LR = 1e-3

# Prepare the datamodule and trainer
@@ -65,10 +78,13 @@ def train_val_test_split(x: TensorDataset) -> list[Dataset, Dataset, Dataset]:
)

trainer = pl.Trainer(
    max_epochs=EPOCHS,
    max_epochs=1,
    # Set the seed for reproducibility
    fast_dev_run=True,
    # TODO: Though this is set, the results are still not reproducible.
    deterministic=True,
    # fast_dev_run=True,
    accelerator="cpu",
    log_every_n_steps=4,
    callbacks=[
        # Stop training if the validation loss doesn't improve for 4 epochs
@@ -78,11 +94,13 @@ def train_val_test_split(x: TensorDataset) -> list[Dataset, Dataset, Dataset]:
        # Save the best model
        ModelCheckpoint(monitor="val_loss", mode="min", save_top_k=1),
    ],
    logger=logger,
)

m = FRDCModule(
    # Our model is the "FaceNet" model
    # TODO: It's not really the FaceNet model, but a modified version of it.
    # TODO: It's not really the FaceNet model,
    # but a modified version of it.
    model_cls=FaceNet,
    model_kwargs=dict(n_out_classes=len(set(labels))),
    # We use the Adam optimizer
@@ -94,3 +112,15 @@ def train_val_test_split(x: TensorDataset) -> list[Dataset, Dataset, Dataset]:
trainer.fit(m, datamodule=dm)
# TODO: Quite hacky, but we need to save the label encoder for prediction.
np.save("le.npy", dm.le.classes_)

report = f"""
# Chestnut Nature Park (Dec 2020 vs May 2021)
[WandB Report]({run.get_url()})
"""


with open(Path(__file__).parent / "report.md", "w") as f:
    f.write(report)


wandb.finish()
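Since `np.save("le.npy", dm.le.classes_)` persists only the encoder's classes, a prediction script would have to rebuild the encoder before decoding model outputs. A hedged sketch, assuming `dm.le` is a standard scikit-learn `LabelEncoder`:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Rebuild the encoder from the classes saved by the training run.
le = LabelEncoder()
le.classes_ = np.load("le.npy", allow_pickle=True)

# Decode a (hypothetical) predicted class index back to its label string.
print(le.inverse_transform(np.array([0])))
```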
@@ -1,10 +1,12 @@
import numpy as np
import torch
from glcm_cupy import Features

# from glcm_cupy import Features
from torchvision.transforms.v2 import Resize

from frdc.models import FaceNet
from frdc.preprocess.glcm_padded import append_glcm_padded_cached

# from frdc.preprocess.glcm_padded import append_glcm_padded_cached
from frdc.preprocess.scale import scale_normal_per_band, scale_0_1_per_band


@@ -24,16 +26,17 @@ def segment_preprocess(ar: np.ndarray) -> torch.Tensor:

    # Add a small epsilon to avoid upper bound of 1.0
    ar = scale_0_1_per_band(ar, epsilon=0.001)
    ar = append_glcm_padded_cached(
        ar,
        step_size=7,
        bin_from=1,
        bin_to=128,
        radius=3,
        features=(Features.MEAN,),
    )
    # We can then scale normal for better neural network convergence
    # ar = append_glcm_padded_cached(
    #     ar,
    #     step_size=7,
    #     bin_from=1,
    #     bin_to=128,
    #     radius=3,
    #     features=(Features.MEAN,),
    # )
    # # We can then scale normal for better neural network convergence
    ar = scale_normal_per_band(ar)
    ar = np.rollaxis(ar, axis=2)

    # TODO: Doesn't seem like we have any channel preprocessing here.
    # ar = np.stack([
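The two `scale_*_per_band` helpers are called but not defined in this diff. A minimal sketch of the behavior their names and the surrounding comments suggest (an assumption, not the actual frdc implementation): min-max scale each band of an H x W x C array, with the epsilon keeping values strictly below 1.0, then standardize per band for better convergence:

```python
import numpy as np


def scale_0_1_per_band(ar: np.ndarray, epsilon: float = 0.0) -> np.ndarray:
    # Min-max scale each band (last axis) independently; epsilon in the
    # denominator keeps the maximum strictly below 1.0.
    lo = np.nanmin(ar, axis=(0, 1), keepdims=True)
    hi = np.nanmax(ar, axis=(0, 1), keepdims=True)
    return (ar - lo) / (hi - lo + epsilon)


def scale_normal_per_band(ar: np.ndarray) -> np.ndarray:
    # Standardize each band to zero mean and unit variance.
    mu = np.nanmean(ar, axis=(0, 1), keepdims=True)
    sd = np.nanstd(ar, axis=(0, 1), keepdims=True)
    return (ar - mu) / sd


ar = np.random.rand(32, 32, 8)  # H x W x C segment stand-in
ar = scale_normal_per_band(scale_0_1_per_band(ar, epsilon=0.001))
```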
File renamed without changes.