MHub / GC - Add the GC Tiger challenge winner LB2 model #38

Merged
merged 29 commits into from
Jun 11, 2024
Changes from 16 commits
29 commits
9459c3b
add initial nocuda Dockerfile and working TiffImporter and run.py script
silvandeleemput Jun 27, 2023
edcc2e4
add cuda 11.4/12.0 Dockerfiles and replaced print statement in runner
silvandeleemput Jun 27, 2023
19d64c4
cleanup ASAP install dockerfile line and add matching torchvision ver…
silvandeleemput Jun 29, 2023
74fbfa5
Merge branch 'MHubAI:main' into m-tiger-lb2
silvandeleemput Jul 25, 2023
21e073b
moved tiger_lb2 -> gc_tiger_lb2, changed Dockerfile to new base img, …
silvandeleemput Jul 25, 2023
7c4fafd
add DicomImporter import to run.py
silvandeleemput Jul 25, 2023
292a558
add torch cuda assert to tiger_lb2 runner
silvandeleemput Jul 25, 2023
667454b
Added a fix for the import issue by introducing a CLI and a subproces…
silvandeleemput Jul 26, 2023
4e49e61
Merge branch 'MHubAI:main' into m-gc-tiger-lb2
silvandeleemput Sep 13, 2023
0cffac0
Update and cleanup the tiger lb2 algorithm model code
silvandeleemput Sep 13, 2023
83b5cc2
Merge branch 'm-gc-tiger-lb2' of github.com:DIAGNijmegen/MHubAI_model…
silvandeleemput Sep 13, 2023
169a5fe
fix issue with last assert runner, re-enable ReportExporter
silvandeleemput Sep 14, 2023
982db19
replace subprocess call with self.subprocess, replace asserts with ex…
silvandeleemput Oct 7, 2023
062cb02
add basic meta.json
silvandeleemput Oct 19, 2023
ca1a450
add minimum panimg pip install fix for wsi dicom conversion
silvandeleemput Oct 24, 2023
af0287b
changed algorithm details type
silvandeleemput Oct 24, 2023
95374a3
Merge branch 'MHubAI:main' into m-gc-tiger-lb2
silvandeleemput Nov 23, 2023
a27dad5
add import mhub model definition lines, remove panimg pip install, re…
silvandeleemput Nov 23, 2023
1965629
modified meta line
silvandeleemput Nov 23, 2023
a82e538
meta.json - fix typo and match model name
silvandeleemput Jan 22, 2024
59c97c7
fix dynamic modality in default.yml, improve error message in runner #38
silvandeleemput Mar 5, 2024
53e9e52
meta.json - change version to 0.1.0
silvandeleemput Mar 5, 2024
d79cae0
Merge branch 'MHubAI:main' into m-gc-tiger-lb2
silvandeleemput Mar 7, 2024
8ea1ba9
meta.json - modified, updated, and extended
silvandeleemput Mar 7, 2024
b973f7d
Dockerfile - add reproducibility fix
silvandeleemput Mar 7, 2024
fa73a62
Dockerfile, runner, cli - added pipenv for model algorithm #38
silvandeleemput Mar 7, 2024
653799a
pull source from forked repo, added segmentation output as mha
silvandeleemput Apr 18, 2024
0571aad
meta.json - add tumor and stroma segmentation mask description to out…
silvandeleemput Apr 24, 2024
4bc6631
runner.py - fix output type filename
silvandeleemput Apr 25, 2024
1 change: 1 addition & 0 deletions models/gc_tiger_lb2/__init__.py
@@ -0,0 +1 @@
from .utils import *
31 changes: 31 additions & 0 deletions models/gc_tiger_lb2/config/default.yml
@@ -0,0 +1,31 @@
general:
  data_base_dir: /app/data
  version: 1.0
  description: Tiger challenge winner LB2 (dicom:sm to json with TIL score)

execute:
- DicomImporter
- TiffConverter
- TigerLB2Runner
- ReportExporter
- DataOrganizer

modules:
  DicomImporter:
    source_dir: input_data
    import_dir: sorted_data
    sort_data: True
    meta:
      mod: sm

  ReportExporter:
    includes:
    - data: til_score
      label: TIL score
      value: value

  DataOrganizer:
    target_dir: output_data
    require_data_confirmation: true
    targets:
    - json-->[i:sid]/gc_tiger_lb2_til_score.json
21 changes: 21 additions & 0 deletions models/gc_tiger_lb2/config/tiff_pipeline.yml
@@ -0,0 +1,21 @@
general:
  data_base_dir: /app/data
  version: 1.0
  description: Tiger challenge winner LB2 (tiff:sm to json with TIL score)

execute:
- FileStructureImporter
- TigerLB2Runner
- DataOrganizer

modules:
  FileStructureImporter:
    input_dir: input_data
    structures:
    - $instanceID@instance/wsi.tif@tiff:mod=sm

  DataOrganizer:
    target_dir: output_data
    require_data_confirmation: true
    targets:
    - json-->[i:instanceID]/gc_tiger_lb2_til_score.json
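Note on the TIFF pipeline layout: the following is a minimal sketch of the folder structure this config appears to expect, not MHub code. It assumes that `input_dir` and `target_dir` are resolved relative to `data_base_dir`, and that the `$instanceID@instance/wsi.tif` pattern means "one sub-folder per instance, each holding a `wsi.tif`". The helper name `expected_paths` and the example instance IDs are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Tuple

# Assumed locations, derived from data_base_dir, input_dir and target_dir in the config.
INPUT_DIR = PurePosixPath("/app/data/input_data")
OUTPUT_DIR = PurePosixPath("/app/data/output_data")


def expected_paths(
    instance_ids: Iterable[str],
) -> Dict[str, Tuple[PurePosixPath, PurePosixPath]]:
    """Map each instance ID to its assumed (input WSI, output JSON) paths."""
    layout = {}
    for instance_id in instance_ids:
        # Assumed reading of "$instanceID@instance/wsi.tif@tiff:mod=sm"
        wsi = INPUT_DIR / instance_id / "wsi.tif"
        # Assumed reading of "json-->[i:instanceID]/gc_tiger_lb2_til_score.json"
        score = OUTPUT_DIR / instance_id / "gc_tiger_lb2_til_score.json"
        layout[instance_id] = (wsi, score)
    return layout


if __name__ == "__main__":
    # Hypothetical instance IDs, for illustration only.
    for instance_id, (wsi, score) in expected_paths(["case_001", "case_002"]).items():
        print(f"{instance_id}: {wsi} -> {score}")
```

If this reading is right, each slide goes into its own sub-folder of the mounted input directory, and the score for that slide is written under the same folder name in the output directory.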
54 changes: 54 additions & 0 deletions models/gc_tiger_lb2/dockerfiles/Dockerfile
@@ -0,0 +1,54 @@
# Specify the base image for the environment
FROM mhubai/base:latest

# Specify/override authors label
LABEL authors="[email protected]"

# install required dependencies for algorithm
RUN pip3 install --no-cache-dir torch==2.0.1+cu118 torchvision==0.15.2+cu118 -f https://download.pytorch.org/whl/torch_stable.html

# Install ASAP
RUN apt-get update \
&& apt-get -y install curl libpython3.8-dev \
&& curl --remote-name --location "https://github.com/computationalpathologygroup/ASAP/releases/download/ASAP-2.1/ASAP-2.1-py38-Ubuntu2004.deb" \
&& dpkg --install ASAP-2.1-py38-Ubuntu2004.deb || true \
&& apt-get -f install --fix-missing --fix-broken --assume-yes \
&& ldconfig -v \
&& apt-get clean \
&& echo "/opt/ASAP/bin" > /usr/local/lib/python3.8/dist-packages/asap.pth \
&& rm ASAP-2.1-py38-Ubuntu2004.deb

# Install tiger LB2 algorithm
# - Clone tiger LB2 codebase (master branch, fixed to commit 720f8dfca4624792c8e57915c4222efec5a0c2d4)
# - Subsequently we remove the .git directory to produce a more compact Docker layer
RUN git clone https://github.com/vuno/tiger_challenge.git /vuno && \
cd /vuno && git reset --hard 720f8dfca4624792c8e57915c4222efec5a0c2d4 && \
rm -rf /vuno/.git

# Install tiger LB2 dependencies
RUN pip3 install --no-cache-dir -r /vuno/requirements.txt

# Reinstall correct version of Numpy to function with ASAP 2.1
RUN pip3 install --no-cache-dir --force-reinstall numpy==1.22

# Enforce minimum version of panimg (with WSI fix)
# The requirement is quoted so that the shell does not treat ">" as an output redirect
RUN pip3 install --no-cache-dir "panimg>=0.13.2"

# Download and install model weights file from zenodo
RUN rm -rf /vuno/pretrained_weights && \
wget https://zenodo.org/record/8112176/files/pretrained_weights.zip -O /vuno/pretrained_weights.zip && \
unzip /vuno/pretrained_weights.zip -d /vuno && \
rm /vuno/pretrained_weights.zip

# Clone the main branch of MHubAI/models TODO
#RUN git stash \
# && git sparse-checkout set "models/gc_tiger_lb2" \
# && git fetch https://github.com/MHubAI/models.git main \
# && git merge FETCH_HEAD

# Add model and algorithm code bases to python path
ENV PYTHONPATH="/vuno:/app"

# Set default entrypoint
ENTRYPOINT ["python3", "-m", "mhubio.run"]
CMD ["--config", "/app/models/gc_tiger_lb2/config/default.yml"]
113 changes: 113 additions & 0 deletions models/gc_tiger_lb2/meta.json
@@ -0,0 +1,113 @@
{
"id": "c5397909-0397-489f-8744-6bf3952e9a1c",
"name": "gc_tiger_lb2",
"title": "TIGER challenge winner: Team VUNO",
"summary": {
"description": "Participants in the TIGER challenge will have to develop computer algorithms to analyze H&E-stained whole-slide images of breast cancer histopathology, to perform three tasks: detection of lymphocytes and plasma cells, which are the main types of cells considered as tumor-infiltrating lymphocytes; segmentation of invasive tumor and tumor-associated stroma, which are the main tissue compartments considered when identifying relevant regions for the TILs; compute an automated TILs score, one score per slide, based on the output of detection and segmentation.",
"inputs": [
{
"label": "Whole-slide image",
"description": "H&E-stained whole-slide image of breast cancer histopathology",
"format": "DICOM",
"modality": "SM",
"bodypartexamined": "Breast",
"slicethickness": "",
"non-contrast": false,
"contrast": false
}
],
"outputs": [
{
"type": "Prediction",
"valueType": "Probability",
"label": "TIL score",
"description": "Percentage of stromal area covered by tumour infiltrating lymphocytes. Values range from 0 to 100 (percent).",
"classes": []
}
],
"model": {
"architecture": "Combination of multiple U-Nets with EfficientNet B2/B0 encoders, and YOLOv5 networks for detection.",
"training": "supervised",
"cmpapproach": "2D"
},
"data": {
"training": {
"vol_samples": 230
},
"evaluation": {
"vol_samples": 58
},
"public": false,
"external": false
}
},
"details": {
"name": "LB2",
"version": "55c49c9e-4216-4142-b1c8-f5d85781add3",
"devteam": "VUNO",
"type": "Segmentation/Prediction hybrid",
"date": {
"weights": "2023-07-06",
"code": "2023-07-06",
"pub": "2022-08-26"
},
"cite": "",
"license": {
"code": "Apache 2.0",
"weights": "CC BY-NC 4.0"
},
"publications": [],
"github": "https://github.com/vuno/tiger_challenge",
"zenodo": "https://doi.org/10.5281/zenodo.8112147",
"colab": "",
"slicer": false
},
"info": {
"use": {
"title": "Intended use",
"text": "Prediction of the percentage of stromal area covered by tumour infiltrating lymphocytes on H&E-stained whole-slide images of breast cancer histopathology.",
"references": [],
"tables": []
},
"analyses": {
"title": "Evaluation",
"text": "The prognostic value of the automatic \"TIL score\" generated by each submitted algorithm was computed on the test set. This was done by building a multivariate Cox regression model trained with predefined clinical variables and the produced TILs score. The concordance index (Uno’s C-index) of this model was computed and the algorithms were ranked based on its value.",
"references": [
{
"label": "On the C-statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data",
"uri": "https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3079915/"
}
],
"tables": []
},
"evaluation": {
"title": "Evaluation data",
"text": "The test set consists of a separate dataset of n=707 H&E-stained whole-slide breast cancer histopathology images.",
"references": [],
"tables": []
},
"training": {
"title": "Training data",
"text": "For the TIGER challenge three public training datasets were made available: 1. WSIROIS: Whole-slide images with manual annotations in regions of interest 2. WSIBULK: Whole-slide images with coarse manual annotation of the tumor bulk 3. WSITILS: Whole-slide images with visual estimation of the TILs at slide level",
"references": [
{
"label": "WSIROIS, WSIBULK, WSITILS on AWS Open Data",
"uri": "https://registry.opendata.aws/tiger/"
}
],
"tables": []
},
"ethics": {
"title": "",
"text": "",
"references": [],
"tables": []
},
"limitations": {
"title": "Limitations",
"text": "This algorithm was developed for research purposes only.",
"references": [],
"tables": []
}
}
}
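Note on the evaluation metric mentioned in meta.json: the sketch below illustrates the idea of a concordance index on right-censored survival data. It implements the simpler Harrell's C-index, not Uno's C-index used by the challenge (Uno's variant additionally weights pairs by the inverse probability of censoring), and it is not the challenge's evaluation code. The function name `harrell_c_index` and the toy data are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Fraction of comparable pairs in which the higher risk has the shorter survival.

    times:  observed follow-up time per subject
    events: True if the event was observed, False if the subject was censored
    risks:  predicted risk score per subject (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: the second subject is censored at t=2.
    print(harrell_c_index([1, 2, 3], [True, False, True], [3.0, 1.0, 2.0]))
```

A value of 1.0 means the risk scores order all comparable pairs correctly, 0.5 corresponds to random ordering, and 0.0 means every comparable pair is ordered the wrong way round.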
50 changes: 50 additions & 0 deletions models/gc_tiger_lb2/scripts/tiger_lb2_cli.py
@@ -0,0 +1,50 @@
"""
--------------------------------------------------------
Mhub / DIAG - CLI Run script for the TIGER LB2 Algorithm
--------------------------------------------------------

--------------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
--------------------------------------------------------
"""

import argparse
from pathlib import Path

import torch

# The required pipeline methods are imported from the tiger_challenge repository
# The algorithm.rw module is imported for IO operations
import pipeline.tils_pipeline as tils_pipeline
import algorithm.rw as rw


def tiger_lb2_cli() -> None:
    parser = argparse.ArgumentParser("Tiger LB2 Run CLI")
    parser.add_argument("input_file", type=str, help="Input WSI TIFF file path")
    parser.add_argument("output_file", type=str, help="Output JSON file path")
    args = parser.parse_args()
    run_tiger_lb2(
        wsi_filepath=Path(args.input_file),
        output_json_file=Path(args.output_file)
    )


def run_tiger_lb2(wsi_filepath: Path, output_json_file: Path) -> None:
    if not torch.cuda.is_available():
        raise RuntimeError("run_tiger_lb2 requires CUDA to be available!")

    print(f"Input WSI: {wsi_filepath}")
    wsi_mri = rw.open_multiresolutionimage_image(wsi_filepath)

    tils_score_writer = rw.TilsScoreWriter(output_json_file)
    tils_score = tils_pipeline.run_tils_pipeline(wsi_mri)

    print(f"Writing tils score to {output_json_file}")
    tils_score_writer.set_tils_score(tils_score=tils_score)
    tils_score_writer.save()


if __name__ == "__main__":
    tiger_lb2_cli()
59 changes: 59 additions & 0 deletions models/gc_tiger_lb2/utils/TigerLB2Runner.py
@@ -0,0 +1,59 @@
"""
------------------------------------------------
Mhub / DIAG - Run Module for Tiger LB2 Algorithm
------------------------------------------------

------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
------------------------------------------------
"""
from mhubio.core import Instance, InstanceData, IO, Module, ValueOutput, Meta, DataType, FileType

from pathlib import Path
import numpy as np
import SimpleITK as sitk
import torch

import sys
import json


@ValueOutput.Name('til_score')
@ValueOutput.Meta(Meta(key="value"))
@ValueOutput.Label('TIL score')
@ValueOutput.Type(int)
@ValueOutput.Description('Percentage of stromal area covered by tumour infiltrating lymphocytes. Values range from 0 to 100 (percent).')
class TilScoreOutput(ValueOutput):
    pass


class TigerLB2Runner(Module):

    CLI_SCRIPT_PATH = Path(__file__).parent.parent / "scripts" / "tiger_lb2_cli.py"

    @IO.Instance()
    @IO.Input('in_data', 'tiff:mod=sm', the='input whole slide image Tiff')
    @IO.Output('out_data', 'gc_tiger_lb2_til_score.json', 'json:model=TigerLB2TILScore', 'in_data', the='TIGER LB2 TIL score')
    @IO.OutputData('til_score', TilScoreOutput, data='in_data', the='TIGER LB2 TIL score - percentage of stromal area covered by tumour infiltrating lymphocytes. Values between 0-100 (percent).')
    def task(self, instance: Instance, in_data: InstanceData, out_data: InstanceData, til_score: TilScoreOutput) -> None:
        if not torch.cuda.is_available():
            raise NotImplementedError("TigerLB2Runner requires CUDA to be available!")

        # Execute the Tiger LB2 Algorithm through a Python subprocess
        self.subprocess(
            [
                sys.executable,
                str(self.CLI_SCRIPT_PATH),
                in_data.abspath,
                out_data.abspath,
            ]
        )

        if not Path(out_data.abspath).is_file():
            raise OSError(f"Something went wrong when calling {self.CLI_SCRIPT_PATH} as a subprocess, couldn't find output file: {out_data.abspath}")

        # Export the output TIL score as value data as well
        with open(out_data.abspath, "r") as f:
            til_score.value = json.load(f)
        assert isinstance(til_score.value, int)
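Note on the subprocess pattern used by the runner: the sketch below reproduces the same flow (call a CLI script with the current interpreter, check that the output file exists, read the JSON score) outside of MHub so it can be tried without a GPU or the model. It uses the standard library `subprocess.run` rather than MHub's `self.subprocess`, and a stand-in script instead of `tiger_lb2_cli.py`. The names `FAKE_CLI`, `run_cli_and_read_score` and `demo`, the fixed score of 42, and the explicit type and range checks are all illustrative assumptions, not part of this PR.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for tiger_lb2_cli.py: ignores the input and writes a fixed score.
FAKE_CLI = """
import json, sys
input_file, output_file = sys.argv[1], sys.argv[2]
with open(output_file, "w") as f:
    json.dump(42, f)
"""


def run_cli_and_read_score(cli_script: Path, wsi_path: Path, output_json: Path) -> int:
    """Run the CLI in a separate Python process and return the integer score it wrote."""
    subprocess.run(
        [sys.executable, str(cli_script), str(wsi_path), str(output_json)],
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    if not output_json.is_file():
        raise OSError(f"CLI finished but did not create {output_json}")
    with open(output_json, "r") as f:
        score = json.load(f)
    # Explicit checks instead of assert, so they still run under "python -O".
    if isinstance(score, bool) or not isinstance(score, int):
        raise TypeError(f"expected an integer TIL score, got {type(score).__name__}")
    if not 0 <= score <= 100:
        raise ValueError(f"TIL score {score} is outside the 0-100 range")
    return score


def demo() -> int:
    """Write the stand-in CLI to a temporary folder and run the full round trip."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        cli_script = tmp_dir / "fake_cli.py"
        cli_script.write_text(FAKE_CLI)
        return run_cli_and_read_score(cli_script, tmp_dir / "wsi.tif", tmp_dir / "score.json")


if __name__ == "__main__":
    print(demo())
```

Running the algorithm in its own process keeps its imports separate from the MHub process, which matches the commit message about fixing an import issue by introducing a CLI and a subprocess.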
1 change: 1 addition & 0 deletions models/gc_tiger_lb2/utils/__init__.py
@@ -0,0 +1 @@
from .TigerLB2Runner import *