This folder contains four notebooks that show how to train, optimize, and quantize a MONAI segmentation model with PyTorch Lightning and OpenVINO, and how to run live inference with the result:
1. Data Preparation for 2D Segmentation of 3D Medical Data
2. Train a 2D-UNet Medical Imaging Model with PyTorch Lightning
3a. Convert and Quantize a UNet Model and Show Live Inference using POT
3b. Convert and Quantize a UNet Model and Show Live Inference using NNCF
The main difference between the POT and NNCF quantization notebooks is that NNCF performs quantization within the PyTorch framework, while POT performs quantization after the PyTorch model has been converted to OpenVINO IR format. A pretrained model and a subset of the dataset are provided for the quantization notebooks, so you do not need to run the data preparation and training notebooks before running the quantization tutorials.
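Whichever tool is used, quantization comes down to mapping floating-point values onto a coarse integer grid. The sketch below illustrates only that general idea in plain Python. It is not what POT or NNCF do internally; the function name `quantize_dequantize` is hypothetical, and the unsigned 8-bit range with a single per-tensor scale is an assumption made for the example.

```python
def quantize_dequantize(values):
    """Simulate uniform affine quantization of a list of floats.

    Assumption: unsigned 8-bit integers with one scale and zero point
    for the whole list. Real tools choose these settings per layer.
    """
    qmin, qmax = 0, 255
    lo, hi = min(values), max(values)
    # A constant input would give a zero scale, so fall back to 1.0.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    # Snap every value to the integer grid and clamp to the valid range.
    quantized = [
        min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values
    ]
    # Map the integers back to floats to see the rounding error introduced.
    restored = [(q - zero_point) * scale for q in quantized]
    return quantized, restored, scale


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
quantized, restored, scale = quantize_dequantize(weights)
for w, q, r in zip(weights, quantized, restored):
    print(f"{w:+.4f} -> {q:3d} -> {r:+.4f}")
```

Each restored value differs from its original by at most about one grid step (`scale`). That small error is the cost paid for the smaller, faster integer model.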
The quantization tutorials show how to:
- Convert an ONNX model to OpenVINO IR with Model Optimizer
- Quantize a model with OpenVINO's Post-Training Optimization Tool API or NNCF
- Evaluate the F1 score of the original model and the quantized model
- Benchmark the performance of the original model and the quantized model
- Show live inference with OpenVINO's async API and MULTI plugin
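As a reference for the evaluation step above, here is a minimal sketch of the F1 score for a binary segmentation mask, written in plain Python so it has no dependencies. The function name `f1_score`, the flat 0/1 list representation, and the convention of returning 1.0 when both masks are empty are assumptions for illustration. The notebooks may compute the metric with a library and different conventions.

```python
def f1_score(pred, target):
    """F1 score for two flat binary masks (equal-length sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Count true positives, false positives and false negatives per pixel.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    denominator = 2 * tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    if denominator == 0:
        return 1.0
    return 2 * tp / denominator


# Flattened masks: 2 true positives, 1 false positive, 1 false negative.
predicted = [1, 1, 0, 0, 1]
annotated = [1, 0, 0, 1, 1]
print(f"F1: {f1_score(predicted, annotated):.3f}")
```

Computing this score once with the original model's predictions and once with the quantized model's predictions, on the same annotated data, shows how much accuracy quantization costs.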
In addition to the notebooks in this folder, the Live Inference and Benchmark CT-scan data demo notebook contains the live-inference part of the quantization tutorial. It includes a pre-quantized model.
If you have not done so already, please follow the Installation Guide to install all required dependencies.