
FairMedFM

Fairness Benchmarking for Medical Imaging Foundation Models


Abstract

The advent of foundation models (FMs) in healthcare offers unprecedented opportunities to enhance medical diagnostics through automated classification and segmentation tasks. However, these models also raise significant concerns about their fairness, especially when applied to diverse and underrepresented populations in healthcare applications. Currently, there is a lack of comprehensive benchmarks, standardized pipelines, and easily adaptable libraries to evaluate and understand the fairness performance of FMs in medical imaging, leading to considerable challenges in formulating and implementing solutions that ensure equitable outcomes across diverse patient populations. To fill this gap, we introduce FairMedFM, a fairness benchmark for FM research in medical imaging. FairMedFM integrates with 17 popular medical imaging datasets, encompassing different modalities, dimensionalities, and sensitive attributes. It explores 20 widely used FMs, with various usages such as zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting in various downstream tasks -- classification and segmentation. Our exhaustive analysis evaluates the fairness performance over different evaluation metrics from multiple perspectives, revealing the existence of bias, varied utility-fairness trade-offs on different FMs, consistent disparities on the same datasets regardless of the FM, and limited effectiveness of existing unfairness mitigation methods.

Structure

FairMedFM provides a comprehensive set of modules for benchmarking the fairness of foundation models in medical image analysis.


  • Dataloader: provides a consistent interface for loading and processing imaging data across various modalities and dimensions, supporting both classification and segmentation tasks.
  • Model: a one-stop library that includes implementations of the most popular pre-trained foundation models for medical image analysis.
  • Usage Wrapper: encapsulates foundation models for various use cases and tasks, including linear probe, zero-shot inference, PEFT, promptable segmentation, etc.
  • Trainer: offers a unified workflow for fine-tuning and testing wrapped models, and includes state-of-the-art unfairness mitigation algorithms.
  • Evaluation: includes a set of metrics and tools to visualize and analyze fairness across different tasks.

Supported tasks, usages, models, and datasets:

  • Image Classification
    • Supported usages: linear probe, zero-shot, CLIP adaptation, PEFT
    • Supported models: CLIP, BLIP, BLIP2, MedCLIP, BiomedCLIP, PubMedCLIP, DINOv2, C2L, LVM-Med, MedMAE, MoCo-CXR
    • Supported datasets: CheXpert, MIMIC-CXR, HAM10000, FairVLMed10k, GF3300, PAPILA, BRSET, COVID-CT-MD, ADNI-1.5T
  • Image Segmentation
    • Supported usages: interactive segmentation prompted with boxes and points
    • Supported models: SAM, MobileSAM, TinySAM, MedSAM, SAM-Med2D, FT-SAM, SAM-Med3D, FastSAM3D, SegVol
    • Supported datasets: HAM10000, TUSC, FairSeg, Montgomery County X-ray, KiTS, CANDI, IRCADb, SPIDER
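
As a rough illustration of the kind of analysis the Evaluation module is described as supporting, the sketch below computes accuracy per sensitive subgroup and the gap between the best and worst subgroup. It is not FairMedFM's implementation: the function names `subgroup_accuracy` and `accuracy_gap` and the toy data are hypothetical, and the benchmark itself may report different metrics.

```python
from collections import defaultdict


def subgroup_accuracy(y_true, y_pred, groups):
    """Return accuracy per sensitive subgroup as a dict {group: accuracy}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest difference in accuracy between any two subgroups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy data (hypothetical): binary labels, predictions, and a sex attribute.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
sex = ["F", "F", "F", "F", "M", "M", "M", "M"]

per_group = subgroup_accuracy(y_true, y_pred, sex)
gap = accuracy_gap(per_group)
print(per_group, gap)
```

A gap of zero would mean every subgroup is classified equally well; larger gaps indicate a disparity worth inspecting alongside overall utility.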

Schedule

  • Release the classification tasks.

  • Release the segmentation tasks.

    • 2D dataset + 2D SAMs
    • 3D dataset + 2D SAMs
    • 3D dataset + 3D SAMs
  • Release more models

  • Release the preprocessed datasets for classification.

  • Release examples and tutorials.

  • Integration of the classic strategies.

Installation

The installation requires three steps.

  1. Download from GitHub

    git clone https://github.com/FairMedFM/FairMedFM.git
    cd FairMedFM
    
  2. Create the conda environment

    conda env create -f environment.yaml
    conda activate fairmedfm
    
  3. Download the pretrained FMs

    wget https://object-arbutus.cloud.computecanada.ca:443/rjin/pretrained.zip
    unzip pretrained.zip
    rm -f pretrained.zip
    

Our notebook tutorials also show how to set up the environment in Colab.

Data

You can either download our pre-processed data directly (see next section) or pre-process customized data yourself. However, not all of the datasets we used permit us to release the data ourselves (e.g., datasets like MIMIC and ADNI require users to complete a data usage application first). In such cases, we cannot provide a download link for our pre-processed version, but we provide the download link for the original dataset and release our pre-processing scripts.

Preprocess data on your own

We provide data preprocessing scripts for each dataset here. The data preprocessing consists of 3 steps:

  • (Optional) preprocess imaging data.
  • Preprocess metadata and sensitive attributes.
  • Split dataset into training set and test set with balanced subgroups (for classification only).
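
The third step above could look roughly like the following sketch, which reads "balanced subgroups" as "each sensitive subgroup contributes the same fraction of its samples to the test set". That reading is an assumption, as are the function name `stratified_split`, the record layout, and the `sex` key; the released preprocessing scripts remain the reference for what was actually done.

```python
import random
from collections import defaultdict


def stratified_split(records, key, test_fraction=0.2, seed=0):
    """Split records so each subgroup keeps the same train/test ratio."""
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec[key]].append(rec)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for group in sorted(by_group):
        members = by_group[group][:]
        rng.shuffle(members)
        # At least one test sample per subgroup, even for very small groups.
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Toy metadata (hypothetical): 10 female and 10 male records.
records = [{"id": i, "sex": "F" if i % 2 == 0 else "M"} for i in range(20)]
train, test = stratified_split(records, key="sex", test_fraction=0.2)
print(len(train), len(test))
```

Splitting inside each subgroup, rather than over the whole dataset at once, prevents a small subgroup from ending up with no test samples by chance.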

Our data was downloaded using the following links.

Classification Dataset

Dataset Link
CheXpert Original data, Demographic data
MIMIC-CXR MIMIC-CXR
PAPILA PAPILA
HAM10000 HAM10000
OCT OCT
OL3I OL3I
COVID-CT-MD COVID-CT-MD
ADNI ADNI-1.5T

Segmentation Dataset

Dataset Link
HAM10000 HAM10000
TUSC TUSC
FairSeg FairSeg
Montgomery County X-ray Montgomery County X-ray
KiTS2023 KiTS2023
IRCADb IRCADb
CANDI CANDI
SPIDER SPIDER

Use Our Pre-processed Data

We offer pre-processed data for download through S3 links. This feature is still being built, so not every dataset is available yet.

Classification Dataset

Dataset Link
CheXpert Requires an application to the original data provider.
MIMIC-CXR Requires an application to the original data provider.
PAPILA PAPILA
HAM10000 HAM10000
OCT Waiting for more storage resources
OL3I Waiting for more storage resources
COVID-CT-MD Waiting for more storage resources
ADNI Requires an application to the original data provider.

Notebook Tutorial

We provide notebook examples that show how to use our package.

Feature Notebook
Linear Probing Open In Colab
CLIP Zero-shot and Adaptor Open In Colab
Segmentation Open In Colab

Running Experiment

Classification

We provide an example of running a linear-probe classification experiment with the CLIP model on the CheXpert dataset (--dataset CXP) to evaluate fairness with respect to sex. Please refer to parse_args.py for more details.

    python main.py --task cls --usage lp --dataset CXP --sensitive_name Sex \
        --method erm --total_epochs 100 --warmup_epochs 5 --blr 2.5e-4 \
        --batch_size 128 --optimizer adamw --min_lr 1e-5 --weight_decay 0.05
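
For readers new to the term, linear probing means keeping the pretrained encoder frozen and training only a linear classifier on its features. The sketch below illustrates that idea in plain Python on toy data. It is not the repository's trainer: `frozen_encoder` is a hypothetical stand-in for a foundation model, and `train_linear_probe` and `predict` are illustrative names.

```python
import math


def frozen_encoder(x):
    """Stand-in for a pretrained FM encoder; its parameters are never updated."""
    # Hypothetical fixed feature map; a real encoder would return an embedding.
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_linear_probe(features, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression head on fixed features with gradient descent."""
    dim = len(features[0])
    w, b = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for f, y in zip(features, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - y
            for i in range(dim):
                grad_w[i] += err * f[i] / n
            grad_b += err / n
        # Only the linear head (w, b) is updated.
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, f):
    return int(sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5)


# Toy inputs (hypothetical): the label is 1 when the two values sum to a positive number.
inputs = [(1.0, 2.0), (2.0, 0.5), (-1.0, -2.0), (-0.5, -1.5), (3.0, -1.0), (-3.0, 1.0)]
labels = [1, 1, 0, 0, 1, 0]

features = [frozen_encoder(x) for x in inputs]  # extracted once, encoder stays frozen
w, b = train_linear_probe(features, labels)
predictions = [predict(w, b, f) for f in features]
print(predictions)
```

Because the encoder is untouched, any subgroup disparity measured after probing reflects the pretrained features plus a very small trainable head, which is what makes this usage a convenient lens on the foundation model itself.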

Segmentation (2D SAMs)

We also provide an example of using SAM with a center-point prompt on the TUSC dataset to evaluate fairness with respect to sex. Please refer to parse_args.py for more details.

    python main.py --task seg --usage seg2d --dataset TUSC --sensitive_name Sex \
        --method erm --batch_size 1 --pos_class 255 --model SAM \
        --sam_ckpt_path ./weights/SAM.pth --img_size 1024 --prompt center
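
To make the center-point prompt more concrete, the sketch below derives a single point from a ground-truth mask whose foreground pixels carry the value 255, mirroring the --pos_class 255 flag. The README does not say how the repository defines "center", so the choice here (the foreground pixel closest to the centroid) is an assumption, and `center_point_prompt` is a hypothetical name.

```python
def center_point_prompt(mask, pos_class=255):
    """Return one (row, col) foreground pixel near the centroid of the target region."""
    foreground = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value == pos_class
    ]
    if not foreground:
        return None  # no target region, so there is nothing to prompt on
    mean_r = sum(r for r, _ in foreground) / len(foreground)
    mean_c = sum(c for _, c in foreground) / len(foreground)
    # Snap to the closest foreground pixel so the prompt always lies inside the mask,
    # even for ring-shaped or otherwise non-convex regions.
    return min(foreground, key=lambda p: (p[0] - mean_r) ** 2 + (p[1] - mean_c) ** 2)


# Toy mask (hypothetical): a 3x3 target region inside a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 0, 0, 0, 0],
]
point = center_point_prompt(mask)
print(point)
```

The function returns (row, col); SAM-style predictors typically expect point coordinates in (x, y) order, so the pair would likely need to be swapped before being passed to a model.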

Acknowledgement

We thank MEDFAIR for their pioneering work on benchmarking fairness for medical image analysis, and Slide-SAM for the SAM inference framework.

License

This project is released under the CC BY 4.0 license. Please see the LICENSE file for more information.

Citation

If you find our project helpful, please consider citing us. Such support helps us secure resources for developing similar projects in the future.

@article{jin2024fairmedfm,
  title={FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models},
  author={Jin, Ruinan and Xu, Zikang and Zhong, Yuan and Yao, Qiongsong and Dou, Qi and Zhou, S Kevin and Li, Xiaoxiao},
  journal={arXiv preprint arXiv:2407.00983},
  year={2024}
}
