Triton Inference Server

35 repositories

client
Public
Triton Python, C++, and Java client libraries, and gRPC-generated client examples for Go, Java, and Scala.
Python • BSD 3-Clause "New" or "Revised" License • 233 • 571 • 32 • 29 • Updated Nov 24, 2024
server
Public
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
machine-learning cloud deep-learning gpu inference edge datacenter
Python • BSD 3-Clause "New" or "Revised" License • 1.5k • 8.4k • 582 • 63 • Updated Nov 24, 2024
perf_analyzer
Public
C++ • BSD 3-Clause "New" or "Revised" License • 6 • 26 • 10 • 11 • Updated Nov 23, 2024
tensorflow_backend
Public
The Triton backend for TensorFlow.
C++ • BSD 3-Clause "New" or "Revised" License • 19 • 45 • 0 • 3 • Updated Nov 23, 2024
tensorrt_backend
Public
The Triton backend for TensorRT.
C++ • BSD 3-Clause "New" or "Revised" License • 29 • 64 • 0 • 2 • Updated Nov 23, 2024
third_party
Public
Third-party source packages that are modified for use in Triton.
C • BSD 3-Clause "New" or "Revised" License • 56 • 6 • 0 • 6 • Updated Nov 23, 2024
vllm_backend
Public
Python • BSD 3-Clause "New" or "Revised" License • 20 • 193 • 0 • 8 • Updated Nov 23, 2024
repeat_backend
Public
An example Triton backend that demonstrates sending zero, one, or multiple responses for each request.
C++ • BSD 3-Clause "New" or "Revised" License • 7 • 5 • 0 • 1 • Updated Nov 23, 2024
redis_cache
Public
TRITONCACHE implementation of a Redis cache
C++ • BSD 3-Clause "New" or "Revised" License • 4 • 12 • 2 • 1 • Updated Nov 23, 2024
pytorch_backend
Public
The Triton backend for PyTorch TorchScript models.
C++ • BSD 3-Clause "New" or "Revised" License • 43 • 127 • 0 • 4 • Updated Nov 23, 2024
python_backend
Public
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
C++ • BSD 3-Clause "New" or "Revised" License • 146 • 554 • 0 • 13 • Updated Nov 23, 2024
core
Public
The core library and APIs implementing the Triton Inference Server.
C++ • BSD 3-Clause "New" or "Revised" License • 105 • 105 • 0 • 22 • Updated Nov 23, 2024
onnxruntime_backend
Public
The Triton backend for the ONNX Runtime.
inference backend triton-inference-server onnx-runtime
C++ • BSD 3-Clause "New" or "Revised" License • 57 • 134 • 71 • 4 • Updated Nov 23, 2024
model_analyzer
Public
Triton Model Analyzer is a CLI tool that helps you understand the compute and memory requirements of Triton Inference Server models.
deep-learning gpu inference performance-analysis
Python • Apache License 2.0 • 75 • 434 • 22 • 6 • Updated Nov 23, 2024
developer_tools
Public
C++ • 10 • 18 • 0 • 5 • Updated Nov 22, 2024
fil_backend
Public
FIL backend for the Triton Inference Server
Jupyter Notebook • Apache License 2.0 • 36 • 72 • 51 • 3 • Updated Nov 22, 2024
tensorrtllm_backend
Public
The Triton TensorRT-LLM Backend
Python • Apache License 2.0 • 108 • 710 • 262 • 19 • Updated Nov 22, 2024
common
Public
Common source, scripts and utilities shared across all Triton repositories.
C++ • BSD 3-Clause "New" or "Revised" License • 75 • 62 • 0 • 5 • Updated Nov 22, 2024
backend
Public
Common source, scripts and utilities for creating Triton backends.
C++ • BSD 3-Clause "New" or "Revised" License • 90 • 295 • 0 • 4 • Updated Nov 22, 2024
tutorials
Public
This repository contains tutorials and examples for Triton Inference Server.
Python • BSD 3-Clause "New" or "Revised" License • 96 • 570 • 8 • 16 • Updated Nov 22, 2024
openvino_backend
Public
OpenVINO backend for Triton.
C++ • BSD 3-Clause "New" or "Revised" License • 16 • 30 • 5 • 4 • Updated Nov 20, 2024
pytriton
Public
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
gpu deep-learning inference
Python • Apache License 2.0 • 51 • 745 • 10 • 0 • Updated Nov 19, 2024
square_backend
Public
Simple Triton backend used for testing.
C++ • BSD 3-Clause "New" or "Revised" License • 4 • 2 • 0 • 0 • Updated Nov 19, 2024
local_cache
Public
Implementation of a local in-memory cache for Triton Inference Server's TRITONCACHE API
C++ • BSD 3-Clause "New" or "Revised" License • 1 • 5 • 1 • 0 • Updated Nov 19, 2024
identity_backend
Public
Example Triton backend that demonstrates most of the Triton Backend API.
C++ • BSD 3-Clause "New" or "Revised" License • 12 • 6 • 0 • 0 • Updated Nov 19, 2024
checksum_repository_agent
Public
The Triton repository agent that verifies model checksums.
C++ • BSD 3-Clause "New" or "Revised" License • 7 • 10 • 0 • 0 • Updated Nov 19, 2024
triton_cli
Public
Triton CLI is an open-source command-line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.
Python • 2 • 51 • 3 • 2 • Updated Nov 18, 2024
dali_backend
Public
The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API.
python deep-learning gpu image-processing dali data-preprocessing nvidia-dali fast-data-pipeline
C++ • MIT License • 29 • 125 • 22 • 5 • Updated Nov 5, 2024
model_navigator
Public
Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.
deep-learning gpu inference
Python • Apache License 2.0 • 25 • 185 • 4 • 1 • Updated Sep 10, 2024
contrib
Public
Community contributions to Triton that are not officially supported or maintained by the Triton project.
Python • BSD 3-Clause "New" or "Revised" License • 7 • 8 • 0 • 1 • Updated Jun 5, 2024