This project demonstrates a tissue segmentation pipeline built on the Mask2Former model from Facebook AI. It performs binary segmentation on medical imaging data from Huron Pathology, distinguishing between background and tissue regions.
- Mean IoU: 0.8032
- Mean Dice: 0.8721
- Mean Pixel Accuracy: 0.9743
Trained model download link: https://drive.google.com/drive/folders/1-041y6yOD-oFiB2gl0ltAiOwN2Ul0drQ?usp=sharing
- Pretrained Mask2Former model fine-tuned for binary segmentation.
- Support for custom datasets with images and masks.
- Efficient data preprocessing, training, and inference pipelines.
- Visualization of results for training, validation, and testing.
Ensure you have the required libraries installed. The project uses matplotlib, torch, numpy, tqdm, Pillow, torchvision, transformers, and scipy, plus the standard-library modules gc and os (which require no installation). You can install the third-party dependencies using pip:

```bash
pip install matplotlib torch numpy tqdm Pillow torchvision transformers scipy
```
The dataset is organized as follows:
```
data/
└── Huron_data/
    ├── Sliced_Images/    # Image folder
    └── Sliced_Masks/     # Corresponding binary masks
```
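As a quick sanity check on this layout, you can confirm that every image has a corresponding mask. The snippet below is a rough sketch that assumes images and masks share filenames; the project's own `verify_data_alignment()` function (described under Usage) is the authoritative check.

```python
import os

images_dir = "data/Huron_data/Sliced_Images"
masks_dir = "data/Huron_data/Sliced_Masks"

image_names = sorted(os.listdir(images_dir))
mask_names = set(os.listdir(masks_dir))

# Assumes each mask shares its image's filename; adjust if the naming scheme differs.
missing = [name for name in image_names if name not in mask_names]
print(f"{len(image_names)} images, {len(mask_names)} masks, {len(missing)} images without a mask")
```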
The project uses the pretrained Mask2Former model (`facebook/mask2former-swin-base-IN21k-ade-semantic`). The model is configured for binary segmentation (two classes: tissue and background).
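For illustration, a minimal sketch of configuring the base checkpoint for two classes is shown below. The label names and keyword arguments reflect common Hugging Face usage and are assumptions here, not copied from the project code.

```python
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

checkpoint = "facebook/mask2former-swin-base-IN21k-ade-semantic"
id2label = {0: "background", 1: "tissue"}  # assumed label names consistent with this README

processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(
    checkpoint,
    id2label=id2label,
    label2id={name: idx for idx, name in id2label.items()},
    ignore_mismatched_sizes=True,  # replace the ADE20K class head with a 2-class head
)
```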
All details, including the optimal parameters, are laid out in the corresponding Python files; the comments outline the entire approach.
- Preprocess Data: Ensure images and masks are aligned and formatted. Use the `verify_data_alignment()` function to validate consistency.
- Train the Model: Use the `train()` function to fine-tune the Mask2Former model on your dataset.
- Visualize Metrics: Plot metrics over epochs (loss, mIoU, Dice, pixel accuracy) with `visualize_training_metrics()` and `visualize_validation_metrics()` from `data_visualization.py`.
- Save/Load the Model: Save the model and processor using the provided utility functions.
- Run Inference: Perform predictions and visualize results using `infer_and_display()` (an illustrative sketch follows this list).
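The sketch below is not the project's `infer_and_display()` itself (see the project files for its exact signature); it only illustrates the standard Hugging Face pattern of preprocessing an image, running a fine-tuned Mask2Former, and post-processing the predicted semantic map. The checkpoint path and image filename are placeholders.

```python
import torch
import matplotlib.pyplot as plt
from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

checkpoint = "path/to/saved_model"  # placeholder: directory produced by the save utilities
processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint).eval()

image = Image.open("data/Huron_data/Sliced_Images/example.png").convert("RGB")  # placeholder file
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Post-process mask logits into a (H, W) class-index map at the original image size.
pred_mask = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]

fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(image); axes[0].set_title("Image"); axes[0].axis("off")
axes[1].imshow(pred_mask.cpu().numpy()); axes[1].set_title("Predicted tissue mask"); axes[1].axis("off")
plt.show()
```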
- Metrics: Pixel accuracy, mIoU, Dice coefficient.
- Loss Function: Best performance was achieved with the custom Scaled Dice Loss class (details in `loss.py`; an illustrative sketch appears after the note below).
Note: for this task, the qualitative results matter more than the quantitative metrics. Use `infer_and_display()` to inspect your results; the predictions should match the ground truths almost perfectly.
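The project's Scaled Dice Loss lives in `loss.py`; the snippet below is only a generic soft-Dice sketch to illustrate the idea of a differentiable Dice objective for binary masks, and the scaling details in `loss.py` may differ.

```python
import torch
import torch.nn as nn

class SoftDiceLoss(nn.Module):
    """Generic soft Dice loss for binary segmentation (illustrative only;
    see loss.py for the project's actual Scaled Dice Loss)."""

    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.smooth = smooth

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # logits: (N, 1, H, W) raw scores; targets: (N, 1, H, W) binary masks in {0, 1}
        probs = torch.sigmoid(logits).flatten(1)
        targets = targets.flatten(1).float()
        intersection = (probs * targets).sum(dim=1)
        union = probs.sum(dim=1) + targets.sum(dim=1)
        dice = (2.0 * intersection + self.smooth) / (union + self.smooth)
        return 1.0 - dice.mean()
```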
The hyperparameter tuning script uses Optuna, a popular optimization library, to fine-tune a few hyperparameters. It creates a study that runs multiple trials sequentially; in each trial, the `objective` function is run and returns the mIoU, which is used to assess the trial's performance. This can take a while depending on the number of trials, the number of epochs, and the size of the training subset used.
To install Optuna, run:

```bash
pip install optuna
```
Finally, make sure you have the dataset downloaded into the root folder using the dataset structure shown above.
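For reference, the overall Optuna pattern used by such a script looks roughly like the sketch below. The search space and the `train_and_evaluate` helper are placeholders; the project's actual objective fine-tunes Mask2Former on a training subset and returns the validation mIoU.

```python
import optuna

def objective(trial: optuna.Trial) -> float:
    # Placeholder search space: the tuned hyperparameters and ranges are defined
    # in the project's tuning script.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    batch_size = trial.suggest_categorical("batch_size", [2, 4, 8])

    # Hypothetical helper: train with these values and return the validation mIoU.
    miou = train_and_evaluate(lr=lr, batch_size=batch_size)
    return miou

study = optuna.create_study(direction="maximize")  # maximize mIoU
study.optimize(objective, n_trials=20)
print("Best trial:", study.best_trial.params)
```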
The files in Accelerate_Fine_Tuning train the Mask2Former model with Hugging Face's Accelerate library, which makes multi-GPU training straightforward. Below are the steps to set up and execute the script; they assume you have already followed the general training steps, including installing the dependencies and downloading the dataset.
For this code you will only need to install Accelerate and PyYAML. You can install them as follows:

```bash
pip install accelerate pyyaml
```
To use Accelerate, you typically need to run `accelerate config` and select from a series of options. You may do this if you prefer, but the Accelerate directory contains a default_config.yaml that can be used (this configuration is tested and working; a custom configuration may break). The only field you may need to change in default_config.yaml is `num_processes`, which is the number of available GPUs.
For this script, you will need to create or adapt the hyperparameters.yaml file. Each hyperparameter can be tweaked to your liking, and hyperparameters can also be overridden from the command line using `--hyperparameter_name value`.
Now that everything is set up, you can run the training script. Use the following command if you are using default_config.yaml:

```bash
accelerate launch --config_file default_config.yaml accelerate_train.py --hparams hyperparameters.yaml --data_folder path/to/data --other args
```
The training log and model checkpoints will be saved as follows:
```
results/
└── seed/
    ├── training.log            # Training log with loss, Dice, IoU, and pixel accuracy per epoch
    ├── best_iou_checkpoint/    # Folder with the saved model checkpoint
    └── hyperparameters.yaml    # Hyperparameters used for this experiment
```
- Python >= 3.8
- GPU recommended for large datasets.
- On CPU, full-dataset training takes about 2 hours per epoch.
- On GPU, full-dataset training takes roughly 5 to 30 minutes per epoch depending on the GPU.
- An A100 takes about 5 minutes per epoch, while a T4 takes about 30 minutes per epoch.
- Optimal performance requires at least 10 epochs.
Note: although these imports are required, they are handled inside the Python files and usage_example.ipynb; you only need the libraries installed on your system.
```python
import os
import gc

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import AdamW
from torchvision import transforms
from tqdm import tqdm
from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor
```
- Ensure dataset paths are correct.
- For large datasets, use a GPU for faster training and inference.
- If IoU/Dice scores are low, check that images and masks are aligned properly.
This project leverages the Mask2Former model from Facebook AI Research and is implemented using PyTorch and Hugging Face Transformers.
The Mask2Former GitHub page can be found at https://github.com/facebookresearch/Mask2Former
Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation