MHub / GC - Add the GC Tiger challenge winner LB2 model #38

silvandeleemput · 2023-07-25T21:57:14Z

It took a bit longer than expected, but this PR adds the GC Tiger challenge winner LB2 model to MHub.

This implements a Dockerfile based on the new base image
I added two pipeline files:
- default.yml/run.py (WSI-DICOM in JSON out) <--- This one uses the PanImgConverter
- tiff_pipeline.yml/run_tiff_pipeline.py (TIFF in JSON out) <-- This one was added for convenience/comparison/testing
The PR should probably be retargeted to an m-gc-tiger-lb2 branch, subsequently, the MHub model repo should be properly added to the Dockerfile

This is also an interesting model to have a look at because of the following issues/characteristics:

Uses WSI as an input and uses the PanImgConverter
Has a JSON output with a single integer value, which is the TIL score.
The TigerLB2Runner requires a GPU to be run (enforced through an assert, otherwise it would throw an error). It will require elaborate hacking to get working on a CPU (I have been able to do it for a CPU-only version), but the runtime is over one hour for the CPU version versus 15-20 mins on the GPU so I think not worth supporting it. Let me know otherwise.
This model suffers from the model import issue, (check the Dockerfile), which required me to perform some ugly hacks that even prevent me from running the pipeline directly from the configuration files.

…sions to torch install

…fixed model weights install, updated pipelines

…s call from the pipeline * Fix for the import issue: * Implemented a CLI for the Tiger lb2 runner that takes an input TIFF path and output JSON file path * Integrated the CLI inside the TigerLB2Runner as a subprocess call to the CLI through a fresh Python interpreter * Cleaned the Dockerfile some more * Caveats: * Because of the fix the extra run script for the tiff_pipeline became redundant, can be called directly from the tiff_pipeline.yml through a call of mhubio.run * For the main pipeline the run.py is still required since the TiffPanImgConverter is not automatically picked up * Updated the PanImgConverters, since the returned UnconsumedFilesException e.errors were renamed to e.file_errors

silvandeleemput · 2023-07-26T20:34:44Z

Pushed a fix for the model import issue as suggested by @LennyN95 to this PR.
To circumvent the conflicting imports issue I added a CLI script for running the model algorithm that is now called as a subprocess within the pipeline. I consider the issue solved for this particular model.

LennyN95 · 2023-08-13T13:03:08Z

Can we use ValueOutput+ReportExporter for this algorithm?
If the JSON generation is characteristic to the algorithm, I suggest keeping the JSON file as a data output and adding a ValueOutput with the score for greater flexibility and compatibility. This way, we allow users to either use the original file and simultaneously offer full flexibility via our ReportExporter mechanisms.

Dockerfile - fixed to commit for tiger_challenge repo - updated MHubAI/models clone stub - updated entrypoint pipeline default.yml - updated the outputs - use the TiffConverter - Added reportexporter template runner: - Added TIL Score as ValueOutput - Confirmed data output - Added/updated comments removed: - custom panimg converter classes - run.py script to run the pipeline

…s into m-gc-tiger-lb2

silvandeleemput · 2023-09-14T08:30:36Z

This PR has been cleaned and updated to meet the latest requirements:

Dockerfile
- fixed to commit for tiger_challenge repo
- updated MHubAI/models clone stub
- updated entrypoint
pipeline default.yml
- updated the outputs
- use the TiffConverter
- added reportexporter settings
runner:
- Added TIL Score as ValueOutput
- Confirmed data output
- Added/updated comments
removed:
- custom panimg converter classes
- run.py script to run the pipeline

Can we use ValueOutput+ReportExporter for this algorithm? If the JSON generation is characteristic to the algorithm, I suggest keeping the JSON file as a data output and adding a ValueOutput with the score for greater flexibility and compatibility. This way, we allow users to either use the original file and simultaneously offer full flexibility via our ReportExporter mechanisms.

@LennyN95 I added a ValueOutput for the TIL score for this algorithm, but when I try to export that value with the ReportExporter it outputs an empty report with: {} I get the following error using the --print option:

Start ReportExporter
Error while generating report for  {'data': 'til_score', 'label': 'TIL score', 'value': 'value'}

After doing some in-situ debugging, I found that the following line fails:

mhubio/modules/exporter/ReportExporter.py::153: assert len(data) == 1, "data name not unique."

Then I found out that data = [] is an empty list after printing the data, which was populated using the following code:

# fetch the data
data = [d for d in instance.outputData if d.name == include['data']]

It appears that instance.outputData = [] is also empty, so it seems that the outputData is somehow not properly picked up when using this decorator in the TigerLB2Runner class:

@IO.OutputData('til_score', TilScoreOutput, data='in_data', the='TIGER LB2 TIL score - percentage of stromal area covered by tumour infiltrating lymphocytes. Values between 0-100 (percent).')

@LennyN95 Do you know what is going on here? The latest code update in this PR is the code that I used. I only disabled the ReportExporter in the default.yml for now to make this fully functional, but during testing this was enabled.

I got it working with similar code before for other model PRs. It doesn't however seem to be caused by codebase changes, because I tried to revert it to older versions of mhubio up until August 3th, but it still had the same output and error. So I guess it is something else.

silvandeleemput · 2023-09-14T09:46:24Z

Alright, I got it working, this PR can now use the ReportExporter as requested.

Apparently it was a mistake in the last assert line of the runner (called the wrong method). The error message was reported in the output using --print, but because the code doesn't stop at the first error, the rest of the output obfuscated this error message. I think this further emphasizes the need for a --stop-on-error flag to avoid unnecessary debugging.

…plicit errors

models/gc_tiger_lb2/utils/TigerLB2Runner.py

models/gc_tiger_lb2/dockerfiles/Dockerfile

models/gc_tiger_lb2/config/default.yml

models/gc_tiger_lb2/dockerfiles/Dockerfile

…move first line Dockerfile

…HubAI#38

models/gc_tiger_lb2/dockerfiles/Dockerfile

silvandeleemput · 2024-03-07T15:35:21Z

output.zip

silvandeleemput · 2024-03-07T15:38:38Z

/test

sample:
  idc_version: 17.0
  data:
    - SeriesInstanceUID: 1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0
      aws_url: s3://idc-open-data/20ab74ef-080b-4975-adad-7507b7b04cb8/*
      path: dicom

reference:
  url: https://github.com/MHubAI/models/files/14526578/output.zip

Test Results (24.03.13_17.52.53_6VEUMKpYVb)

id: 7654bec5-0c56-43b5-82db-cee986e4d663
date: '2024-03-13 18:08:50'
checked_files:
- file: gc_tiger_lb2_til_score.json
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_til_score.json
  checks:
  - checker: DataFileCheck
    findings:
    - label: Value similar
      description: Value of key 'TIL score' is similar in source file
      subpath: TIL score
      info:
        scale: 1
        precision: -1
        src: 4
        ref: 5
  - checker: SizeCheck
summary:
  files_missing: 0
  files_extra: 0
  checks:
    DataFileCheck:
      files: 1
      findings:
        Value similar: 1
    SizeCheck:
      files: 1
conclusion: false

silvandeleemput · 2024-03-07T15:40:17Z

/review

We have implemented the requested changes and updated the meta.json.
Except for the reproducibility fix it should be as good as ready.

silvandeleemput · 2024-03-18T10:06:00Z

@LennyN95 thanks for checking. Small differences in outcome are to be expected when run on different GPUs due to accumulated small numerical differences. If the results only differ by a few points (something in the range of 1-2 points) this is deemed acceptable for this model as discussed with the challenge organizers.

…puts

silvandeleemput · 2024-04-24T08:25:51Z

gc_tiger_lb2_output.zip

silvandeleemput · 2024-04-24T08:30:30Z

@LennyN95 I have created a fork of the source model/algorithm repository here: https://github.com/DIAGNijmegen/tiger_vuno_algorithm.

In this fork, I have added the reproducibility fix and added the tumor and stroma segmentation mask to the output.
The PR has been updated to build from the fork and output the segmentation.

I have uploaded a new output.zip file (which includes the segmentation) and updated the URL in the test above.

Please check if everything is in order now. I suspect that the segmentation will be different due to GPU hardware/driver differences.

LennyN95 · 2024-04-24T09:05:11Z

Thank you @silvandeleemput. I create this as a new test and reverted the modification to keep the test history clean. Test requests should be edited to correct syntax errors only but not touched once one or more tests were performed (and reported in the test request).

LennyN95 · 2024-04-24T09:05:16Z

/test

sample:
  idc_version: 17.0
  data:
    - SeriesInstanceUID: 1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0
      aws_url: s3://idc-open-data/20ab74ef-080b-4975-adad-7507b7b04cb8/*
      path: dicom

reference:
  url: https://github.com/MHubAI/models/files/15091074/gc_tiger_lb2_output.zip

Test Results (24.04.25_08.24.29_0ZZ5RYzDlk)

id: fa0fa712-4d2b-4f13-adae-56cc312c4209
date: '2024-04-25 08:42:26'
missing_files:
- 1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_segmentation.mha
checked_files:
- file: gc_tiger_lb2_til_score.json
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_til_score.json
  checks:
  - checker: DataFileCheck
    findings:
    - label: Value similar
      description: Value of key 'TIL score' is similar in source file
      subpath: TIL score
      info:
        scale: 1
        precision: -1
        src: 4
        ref: 5
  - checker: SizeCheck
summary:
  files_missing: 1
  files_extra: 0
  checks:
    DataFileCheck:
      files: 1
      findings:
        Value similar: 1
    SizeCheck:
      files: 1
conclusion: false

Test Results (24.04.25_14.08.51_xHF1bsAg8N)

id: 1c55ef8c-8325-4838-898e-63a7ef2b04fe
date: '2024-04-25 14:32:07'
checked_files:
- file: gc_tiger_lb2_segmentation.mha
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_segmentation.mha
  checks:
  - checker: ImageFileCheck
    notes:
    - label: Data Type
      description: Data type of the reference image
      info: uint8
    findings:
    - label: Exception
      description: An exception occurred during check
      info: 'could not convert string to float: '''''
- file: gc_tiger_lb2_til_score.json
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_til_score.json
  checks:
  - checker: DataFileCheck
    findings:
    - label: Missing key
      description: Key 'TIL score' is missing in source file
      subpath: TIL score
  - checker: SizeCheck
    findings:
    - label: Size Difference
      description: file size is smaller than reference
      info:
        src_size: 2
        ref_size: 22
        diff_size: -20
summary:
  files_missing: 0
  files_extra: 0
  checks:
    ImageFileCheck:
      files: 1
      findings:
        Exception: 1
    DataFileCheck:
      files: 1
      findings:
        Missing key: 1
    SizeCheck:
      files: 1
      findings:
        Size Difference: 1
conclusion: false

Test Results (24.06.03_10.26.45_9t3n5BDtdf)

id: bc76f65a-b934-4712-a8cf-2c5263f357ae
date: '2024-06-03 10:54:38'
checked_files:
- file: gc_tiger_lb2_segmentation.mha
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_segmentation.mha
  checks:
  - checker: ImageFileCheck
    notes:
    - label: Data Type
      description: Data type of the reference image
      info: uint8
    findings:
    - label: Exception
      description: An exception occurred during check
      info: 'could not convert string to float: '''''
- file: gc_tiger_lb2_til_score.json
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_til_score.json
  checks:
  - checker: DataFileCheck
    findings:
    - label: Value similar
      description: Value of key 'TIL score' is similar in source file
      subpath: TIL score
      info:
        scale: 1
        precision: -1
        src: 4
        ref: 5
  - checker: SizeCheck
summary:
  files_missing: 0
  files_extra: 0
  checks:
    ImageFileCheck:
      files: 1
      findings:
        Exception: 1
    DataFileCheck:
      files: 1
      findings:
        Value similar: 1
    SizeCheck:
      files: 1
conclusion: false

Test Results (24.06.07_12.29.30_uTzBQF9Njk)

id: 3302c113-2efa-45e3-97db-7e651d4f835f
date: '2024-06-07 12:46:36'
checked_files:
- file: gc_tiger_lb2_segmentation.mha
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_segmentation.mha
  checks:
  - checker: ImageFileCheck
    notes:
    - label: Data Type
      description: Data type of the reference image
      info: uint8
    - label: Dice Score
      description: Dice score between reference and test image
      info: 1.0
- file: gc_tiger_lb2_til_score.json
  path: /app/test/src/1.3.6.1.4.1.5962.99.1.1334438926.1589741711.1637717011470.2.0/gc_tiger_lb2_til_score.json
  checks:
  - checker: DataFileCheck
    findings:
    - label: Value similar
      description: Value of key 'TIL score' is similar in source file
      subpath: TIL score
      info:
        scale: 1
        precision: -1
        src: 4
        ref: 5
  - checker: SizeCheck
summary:
  files_missing: 0
  files_extra: 0
  checks:
    ImageFileCheck:
      files: 1
    DataFileCheck:
      files: 1
      findings:
        Value similar: 1
    SizeCheck:
      files: 1
conclusion: false

silvandeleemput · 2024-06-04T07:35:24Z

@LennyN95 Regarding the results, the exact same exception as last time shows:

    - label: Data Type
      description: Data type of the reference image
      info: uint8
    findings:
    - label: Exception
      description: An exception occurred during check
      info: 'could not convert string to float: '''''

I don't understand the origin of this error yet and it wasn't present two base images ago. I will investigate this further.

silvandeleemput · 2024-06-07T20:17:13Z

@LennyN95 The results look fine now.

silvandeleemput and others added 7 commits June 27, 2023 16:28

add initial nocuda Dockerfile and working TiffImporter and run.py script

9459c3b

add cuda 11.4/12.0 Dockerfiles and replaced print statement in runner

edcc2e4

cleanup ASAP install dockerfile line and add matching torchvision ver…

19d64c4

…sions to torch install

Merge branch 'MHubAI:main' into m-tiger-lb2

74fbfa5

moved tiger_lb2 -> gc_tiger_lb2, changed Dockerfile to new base img, …

21e073b

…fixed model weights install, updated pipelines

add DicomImporter import to run.py

7c4fafd

add torch cuda assert to tiger_lb2 runner

292a558

silvandeleemput changed the title ~~Add the GC Tiger challenge winner LB2 model~~ MHub / GC - Add the GC Tiger challenge winner LB2 model Jul 26, 2023

LennyN95 self-assigned this Aug 13, 2023

silvandeleemput and others added 3 commits September 13, 2023 13:11

Merge branch 'MHubAI:main' into m-gc-tiger-lb2

4e49e61

Merge branch 'm-gc-tiger-lb2' of github.com:DIAGNijmegen/MHubAI_model…

83b5cc2

…s into m-gc-tiger-lb2

fix issue with last assert runner, re-enable ReportExporter

169a5fe

silvandeleemput added 4 commits October 7, 2023 11:53

replace subprocess call with self.subprocess, replace asserts with ex…

982db19

…plicit errors

add basic meta.json

062cb02

add minimum panimg pip install fix for wsi dicom conversion

ca1a450

changed algorithm details type

af0287b

LennyN95 reviewed Nov 20, 2023

View reviewed changes

LennyN95 reviewed Nov 21, 2023

View reviewed changes

models/gc_tiger_lb2/dockerfiles/Dockerfile Outdated Show resolved Hide resolved

LennyN95 added the +Model: Under Review label Nov 22, 2023

LennyN95 requested changes Nov 22, 2023

View reviewed changes

models/gc_tiger_lb2/dockerfiles/Dockerfile Outdated Show resolved Hide resolved

LennyN95 added the +Model: ACTION REQUIRED label Nov 22, 2023

silvandeleemput and others added 3 commits November 23, 2023 22:37

Merge branch 'MHubAI:main' into m-gc-tiger-lb2

95374a3

add import mhub model definition lines, remove panimg pip install, re…

a27dad5

…move first line Dockerfile

modified meta line

1965629

LennyN95 removed the +Model: ACTION REQUIRED label Jan 8, 2024

silvandeleemput and others added 6 commits January 22, 2024 11:34

meta.json - fix typo and match model name

a82e538

fix dynamic modality in default.yml, improve error message in runner M…

59c97c7

…HubAI#38

meta.json - change version to 0.1.0

53e9e52

Merge branch 'MHubAI:main' into m-gc-tiger-lb2

d79cae0

meta.json - modified, updated, and extended

8ea1ba9

Dockerfile - add reproducibility fix

b973f7d

LennyN95 requested changes Mar 7, 2024

View reviewed changes

models/gc_tiger_lb2/dockerfiles/Dockerfile Outdated Show resolved Hide resolved

Dockerfile, runner, cli - added pipenv for model algorithm MHubAI#38

fa73a62

github-actions bot added the TEST REQUESTED label Mar 7, 2024

github-actions bot added the REQUEST REVIEW Attach this label to your PR when your submission is "in progress" and is ready to be reviewed by us label Mar 7, 2024

LennyN95 removed REQUEST REVIEW Attach this label to your PR when your submission is "in progress" and is ready to be reviewed by us TEST REQUESTED labels Mar 19, 2024

silvandeleemput added 2 commits April 18, 2024 10:09

pull source from forked repo, added segmentation output as mha

653799a

meta.json - add tumor and stroma segmentation mask description to out…

0571aad

…puts

github-actions bot added the TEST REQUESTED label Apr 24, 2024

runner.py - fix output type filename

4bc6631

LennyN95 approved these changes Jun 11, 2024

View reviewed changes

LennyN95 merged commit 911164a into MHubAI:main Jun 11, 2024
1 check passed

LennyN95 removed the TEST REQUESTED label Jul 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MHub / GC - Add the GC Tiger challenge winner LB2 model #38

MHub / GC - Add the GC Tiger challenge winner LB2 model #38

silvandeleemput commented Jul 25, 2023 •

edited

Loading

silvandeleemput commented Jul 26, 2023

LennyN95 commented Aug 13, 2023

silvandeleemput commented Sep 14, 2023 •

edited

Loading

silvandeleemput commented Sep 14, 2023

silvandeleemput commented Mar 7, 2024

silvandeleemput commented Mar 7, 2024 •

edited by LennyN95

Loading

silvandeleemput commented Mar 7, 2024

silvandeleemput commented Mar 18, 2024

silvandeleemput commented Apr 24, 2024

silvandeleemput commented Apr 24, 2024 •

edited

Loading

LennyN95 commented Apr 24, 2024

LennyN95 commented Apr 24, 2024 •

edited

Loading

silvandeleemput commented Jun 4, 2024

silvandeleemput commented Jun 7, 2024

MHub / GC - Add the GC Tiger challenge winner LB2 model #38

MHub / GC - Add the GC Tiger challenge winner LB2 model #38

Conversation

silvandeleemput commented Jul 25, 2023 • edited Loading

silvandeleemput commented Jul 26, 2023

LennyN95 commented Aug 13, 2023

silvandeleemput commented Sep 14, 2023 • edited Loading

silvandeleemput commented Sep 14, 2023

silvandeleemput commented Mar 7, 2024

silvandeleemput commented Mar 7, 2024 • edited by LennyN95 Loading

silvandeleemput commented Mar 7, 2024

silvandeleemput commented Mar 18, 2024

silvandeleemput commented Apr 24, 2024

silvandeleemput commented Apr 24, 2024 • edited Loading

LennyN95 commented Apr 24, 2024

LennyN95 commented Apr 24, 2024 • edited Loading

silvandeleemput commented Jun 4, 2024

silvandeleemput commented Jun 7, 2024

silvandeleemput commented Jul 25, 2023 •

edited

Loading

silvandeleemput commented Sep 14, 2023 •

edited

Loading

silvandeleemput commented Mar 7, 2024 •

edited by LennyN95

Loading

silvandeleemput commented Apr 24, 2024 •

edited

Loading

LennyN95 commented Apr 24, 2024 •

edited

Loading