⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Large-scale LLM inference engine
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
Scalable and robust tree-based speculative decoding algorithm
Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
REST: Retrieval-Based Speculative Decoding, NAACL 2024
[NeurIPS'23] Speculative Decoding with Big Little Decoder
Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan et al. 2023.
Minimal C implementation of speculative decoding based on llama2.c
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding
Accelerating LLM inference with techniques like speculative decoding, quantization, and kernel fusion, focusing on implementing state-of-the-art research papers.
Verification of the effect of speculative decoding in Japanese.
Implementation of Speculative Sampling in "Accelerating Large Language Model Decoding with Speculative Sampling"
Unofficial implementation of Token Recycling self-speculative decoding method.
Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder