State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

cosmoimd/DeepLearningExamples

Polyp Detection using NVIDIA Deep Learning Examples for Tensor Cores

This repository is a fork of https://github.com/NVIDIA/DeepLearningExamples. It provides the code used to train and test the SSD polyp detection models in the paper "REAL-Colon: A dataset for developing real-world AI applications in colonoscopy".

The REAL (Real-world multi-center Endoscopy Annotated video Library)-Colon dataset consists of 60 recordings of real-world colonoscopies. Full details and code to download the dataset and prepare the data for model training and testing can be found here: https://github.com/cosmoimd/real-colon-dataset

For full details on the dataset, and to cite this work, please refer to:

Carlo Biffi, Giulio Antonelli, Sebastian Bernhofer, Cesare Hassan, Daizen Hirata, Mineo Iwatate, Andreas Maieron, Pietro Salvagnini, and Andrea Cherubini. "REAL-Colon: A dataset for developing real-world AI applications in colonoscopy." arXiv preprint arXiv:2403.02163 (2024). Available at: https://arxiv.org/abs/2403.02163.

SSD Model Training and Evaluation

The SSD model is defined here: https://github.com/cosmoimd/DeepLearningExamples/tree/master/PyTorch/Detection/SSD/

Training

  • Build the Docker container with docker build . -t nvidia_ssd, then run it with docker run --rm -it --gpus=all --ipc=host nvidia_ssd. Use the -v flag to mount any paths the code needs.
  • In PyTorch/Detection/SSD/ssd/utils.py, add to dataset_folder the output_folder path produced by export_coco_format.py during data preparation. The model is then trained on a user-defined train/valid/test split of the data, according to the user's needs.
  • To start training, run: CUDA_VISIBLE_DEVICES=0 python main.py --dataset-name real_colon --backbone resnet50 --warmup 300 --bs 64 --epochs 65 --data /coco --save ./models. This also saves the model checkpoint in ./models.
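
Before launching a long training run, it can help to sanity-check the exported annotations. The sketch below assumes that export_coco_format.py writes standard COCO detection JSON (top-level "images", "annotations" and "categories" keys); the function name summarize_coco and the file path passed to it are hypothetical and not part of this repository.

```python
import json
from pathlib import Path

# Top-level keys expected in a standard COCO detection file (assumption:
# the exported annotations follow this layout).
REQUIRED_KEYS = ("images", "annotations", "categories")


def summarize_coco(annotation_path):
    """Load a COCO-style JSON file and return basic counts.

    Raises ValueError if a required top-level key is missing or if an
    annotation points to an image id that is not listed under "images".
    """
    data = json.loads(Path(annotation_path).read_text())

    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing top-level keys: {missing}")

    image_ids = {img["id"] for img in data["images"]}
    orphans = [a["id"] for a in data["annotations"] if a["image_id"] not in image_ids]
    if orphans:
        raise ValueError(f"annotations with unknown image_id: {orphans}")

    # Frames without any box are worth counting, since most colonoscopy
    # frames contain no polyp.
    annotated = {a["image_id"] for a in data["annotations"]}
    return {
        "images": len(image_ids),
        "annotations": len(data["annotations"]),
        "images_without_boxes": len(image_ids - annotated),
    }
```

Calling summarize_coco on each split's annotation file gives a quick view of how many frames and boxes each split contains before any GPU time is spent.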

Validation

To evaluate the trained models:

  • In the docker container, run python ./main.py --backbone resnet50 --dataset-name real_colon --json-save-path /path/to/save/json/files --mode testing --no-skip-empty --checkpoint /your/model/path --data /path/to/dir/containing/test/set/
  • To compute the False Positive Rate and True Positive Rate (FPR and TPR) per video from the output JSON folder set with --json-save-path in the previous command, run: python3 PyTorch/Detection/SSD/fpr_trp_eval.py <path_to_json_folder>
  • To create output videos with model predictions and GT boxes, run: python3 result_visualisation.py <path_to_json_folder> <output_folder>
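
To illustrate what per-video rates of this kind measure, here is a minimal sketch. It is not the logic of fpr_trp_eval.py, whose exact definitions are not described here. The assumptions are: boxes are in COCO [x, y, width, height] form; a GT box counts as detected if any prediction overlaps it with IoU at or above a threshold; TPR is detected GT boxes over all GT boxes; FPR is the fraction of frames containing at least one prediction that matches no GT box. The names iou_xywh and video_rates and the default threshold are hypothetical.

```python
def iou_xywh(a, b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def video_rates(frames, iou_thr=0.5):
    """Return (tpr, fpr) for one video.

    frames: list of (gt_boxes, pred_boxes) pairs, one pair per frame.
    """
    gt_total = gt_hit = fp_frames = 0
    for gt_boxes, pred_boxes in frames:
        gt_total += len(gt_boxes)
        # A GT box is a hit if at least one prediction overlaps it enough.
        gt_hit += sum(
            any(iou_xywh(g, p) >= iou_thr for p in pred_boxes) for g in gt_boxes
        )
        # A frame is a false-positive frame if some prediction matches no GT
        # box; this includes any prediction on a frame without GT boxes.
        if any(all(iou_xywh(g, p) < iou_thr for g in gt_boxes) for p in pred_boxes):
            fp_frames += 1
    tpr = gt_hit / gt_total if gt_total else float("nan")
    fpr = fp_frames / len(frames) if frames else float("nan")
    return tpr, fpr
```

Frames with no GT boxes matter for the FPR under these assumptions, which is one reason to keep them in the evaluation set rather than skip them.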

Contact

Andrea Cherubini - [email protected]
Carlo Biffi - [email protected]

NVIDIA Deep Learning Examples for Tensor Cores

Introduction

This repository provides State-of-the-Art Deep Learning examples that are easy to train and deploy, achieving the best reproducible accuracy and performance with NVIDIA CUDA-X software stack running on NVIDIA Volta, Turing and Ampere GPUs.

NVIDIA GPU Cloud (NGC) Container Registry

These examples, along with our NVIDIA deep learning software stack, are provided in a monthly updated Docker container on the NGC container registry (https://ngc.nvidia.com). These containers include:

  • The latest NVIDIA examples from this repository
  • The latest NVIDIA contributions shared upstream to the respective framework
  • The latest NVIDIA Deep Learning software libraries, such as cuDNN, NCCL, and cuBLAS, all of which have been through a rigorous monthly quality assurance process to ensure that they provide the best possible performance
  • Monthly release notes for each of the NVIDIA optimized containers

Computer Vision

Models | Framework | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | DLC | NB
EfficientNet-B0 | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -
EfficientNet-B4 | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -
EfficientNet-WideSE-B0 | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -
EfficientNet-WideSE-B4 | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -
EfficientNet v1-B0 | TensorFlow2 | Yes | Yes | Yes | Example | - | Supported | Yes | -
EfficientNet v1-B4 | TensorFlow2 | Yes | Yes | Yes | Example | - | Supported | Yes | -
EfficientNet v2-S | TensorFlow2 | Yes | Yes | Yes | Example | - | Supported | Yes | -
GPUNet | PyTorch | Yes | Yes | - | Example | Yes | Example | Yes | -
Mask R-CNN | PyTorch | Yes | Yes | - | Example | - | Supported | - | Yes
Mask R-CNN | TensorFlow2 | Yes | Yes | - | Example | - | Supported | Yes | -
nnUNet | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -
ResNet-50 | MXNet | Yes | Yes | - | Supported | - | Supported | - | -
ResNet-50 | PaddlePaddle | Yes | Yes | - | Example | - | Supported | - | -
ResNet-50 | PyTorch | Yes | Yes | - | Example | - | Example | Yes | -
ResNet-50 | TensorFlow | Yes | Yes | - | Supported | - | Supported | Yes | -
ResNeXt-101 | PyTorch | Yes | Yes | - | Example | - | Example | Yes | -
ResNeXt-101 | TensorFlow | Yes | Yes | - | Supported | - | Supported | Yes | -
SE-ResNeXt-101 | PyTorch | Yes | Yes | - | Example | - | Example | Yes | -
SE-ResNeXt-101 | TensorFlow | Yes | Yes | - | Supported | - | Supported | Yes | -
SSD | PyTorch | Yes | Yes | - | Supported | - | Supported | - | Yes
SSD | TensorFlow | Yes | Yes | - | Supported | - | Supported | Yes | Yes
U-Net Med | TensorFlow2 | Yes | Yes | - | Example | - | Supported | Yes | -

Natural Language Processing

Models | Framework | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | DLC | NB
BERT | PyTorch | Yes | Yes | Yes | Example | - | Example | Yes | -
GNMT | PyTorch | Yes | Yes | - | Supported | - | Supported | - | -
ELECTRA | TensorFlow2 | Yes | Yes | Yes | Supported | - | Supported | Yes | -
BERT | TensorFlow | Yes | Yes | Yes | Example | - | Example | Yes | Yes
BERT | TensorFlow2 | Yes | Yes | Yes | Supported | - | Supported | Yes | -
GNMT | TensorFlow | Yes | Yes | - | Supported | - | Supported | - | -
Faster Transformer | TensorFlow | - | - | - | Example | - | Supported | - | -

Recommender Systems

Models | Framework | AMP | Multi-GPU | Multi-Node | ONNX | Triton | DLC | NB
DLRM | PyTorch | Yes | Yes | - | Yes | Example | Yes | Yes
DLRM | TensorFlow2 | Yes | Yes | Yes | - | Supported | Yes | -
NCF | PyTorch | Yes | Yes | - | - | Supported | - | -
Wide&Deep | TensorFlow | Yes | Yes | - | - | Supported | Yes | -
Wide&Deep | TensorFlow2 | Yes | Yes | - | - | Supported | Yes | -
NCF | TensorFlow | Yes | Yes | - | - | Supported | Yes | -
VAE-CF | TensorFlow | Yes | Yes | - | - | Supported | - | -
SIM | TensorFlow2 | Yes | Yes | - | - | Supported | Yes | -

Speech to Text

Models | Framework | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | DLC | NB
Jasper | PyTorch | Yes | Yes | - | Example | Yes | Example | Yes | Yes
QuartzNet | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -

Text to Speech

Models | Framework | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | DLC | NB
FastPitch | PyTorch | Yes | Yes | - | Example | - | Example | Yes | Yes
FastSpeech | PyTorch | Yes | Yes | - | Example | - | Supported | - | -
Tacotron 2 and WaveGlow | PyTorch | Yes | Yes | - | Example | Yes | Example | Yes | -
HiFi-GAN | PyTorch | Yes | Yes | - | Supported | - | Supported | Yes | -

Graph Neural Networks

Models | Framework | AMP | Multi-GPU | Multi-Node | ONNX | Triton | DLC | NB
SE(3)-Transformer | PyTorch | Yes | Yes | - | - | Supported | - | -
MoFlow | PyTorch | Yes | Yes | - | - | Supported | - | -

Time-Series Forecasting

Models | Framework | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | Triton | DLC | NB
Temporal Fusion Transformer | PyTorch | Yes | Yes | - | Example | Yes | Example | Yes | -

NVIDIA support

In each of the network READMEs, we indicate the level of support that will be provided. The range is from ongoing updates and improvements to a point-in-time release for thought leadership.

Glossary

Multinode Training: Supported on a pyxis/enroot Slurm cluster.

Deep Learning Compiler (DLC): TensorFlow XLA and PyTorch JIT and/or TorchScript.

Accelerated Linear Algebra (XLA): XLA is a domain-specific compiler for linear algebra that can accelerate TensorFlow models with potentially no source code changes. The results are improvements in speed and memory usage.

PyTorch JIT and/or TorchScript: TorchScript is a way to create serializable and optimizable models from PyTorch code. It is an intermediate representation of a PyTorch model (a subclass of nn.Module) that can then be run in a high-performance environment such as C++.

Automatic Mixed Precision (AMP): AMP automatically enables mixed precision training on Volta, Turing, and NVIDIA Ampere GPU architectures.

TensorFloat-32 (TF32): TF32 is the math mode introduced in NVIDIA A100 GPUs for handling matrix math, also called tensor operations. TF32 running on Tensor Cores in A100 GPUs can provide up to 10x speedups compared to single-precision floating-point math (FP32) on Volta GPUs. TF32 is supported in the NVIDIA Ampere GPU architecture and is enabled by default.

Jupyter Notebooks (NB): The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
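
As a rough illustration of the TF32 entry above: TF32 keeps the 8-bit exponent of FP32 but only 10 bits of mantissa. The sketch below emulates that precision in plain Python by zeroing the 13 low mantissa bits of a float32 value. The helper name to_tf32 is hypothetical, and truncation is a simplification; the rounding behaviour of the actual hardware is not modelled here.

```python
import struct


def to_tf32(x):
    """Approximate TF32 precision for a Python float.

    The value is first converted to float32, then the 13 least significant
    mantissa bits are cleared, leaving sign (1 bit), exponent (8 bits) and
    the top 10 mantissa bits. Truncation is used for simplicity.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of the 23 float32 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 1 + 2**-10 is still representable with 10 mantissa bits,
# while 1 + 2**-11 collapses to 1.0.
print(to_tf32(1 + 2**-10), to_tf32(1 + 2**-11))
```

The exponent range is unchanged, so very large or very small FP32 values stay finite and nonzero; only the number of significant bits drops.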

Feedback / Contributions

We're posting these examples on GitHub to better support the community, facilitate feedback, as well as collect and implement contributions using GitHub Issues and pull requests. We welcome all contributions!

Known issues

In each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.
