Stars
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Brevitas: neural network quantization in PyTorch
A PyTorch quantization backend for Optimum
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deploym…
A library for training, compressing, and deploying computer vision models (including ViT) on edge devices
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
A tool to modify ONNX models visually, based on Netron and Flask.
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
A fast, scalable, multi-language, and extensible build system
The Web framework for perfectionists with deadlines.
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Universal LLM Deployment Engine with ML Compilation
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Simple tool for partial optimization of ONNX. Further optimize some models that cannot be optimized with onnx-optimizer and onnxsim by several tens of percent. In particular, models containing Eins…
Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-t…
ncnn is a high-performance neural network inference framework optimized for the mobile platform
Visualizer for neural network, deep learning and machine learning models
Empowering everyone to build reliable and efficient software.
An Open Source Machine Learning Framework for Everyone
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
Transformer-related optimization, including BERT and GPT
Development repository for the Triton language and compiler
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.