Awesome-LLM-Constrained-Decoding

Towards reliable, controllable and more efficient generation with Large Language Models (LLMs)

A curated list of papers on constrained decoding for LLMs, along with their code and related resources.

Libraries

Library	Feature	Stars
ggerganov/llama.cpp	Built-in support for CFG; supports JSON Schema through conversion to CFG
guidance-ai/guidance	CFG, Regex, JSON Schema, Token Forcing, compatible with Transformers and llama.cpp
outlines-dev/outlines	CFG, Unicode support, Hugging Face ecosystem, vLLM support
sgl-project/sglang	Regex support, emphasis on LLM inference efficiency, compressed FSM
eth-sri/lmql	Regex support, various constraints, more powerful control flow
jxnl/instructor	Try-Reject-Repeat approach to ensure constraints are met
microsoft/aici	A general framework for LLM controllers with native support for CFG, Regex, JSON Schema
noamgat/lm-format-enforcer	Regex, JSON Schema, Beam Search, etc.
epfl-dlab/transformers-CFG	CFG (EBNF Interface), Compatible with Transformers, Easy to extend for research
uiuc-focal-lab/syncode	CFG generation that supports built-in grammars like JSON, Python, Go, and more
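
The table above mentions two broad strategies: masking tokens during decoding so the constraint can never be violated (the CFG/Regex style), and the Try-Reject-Repeat style of sampling freely and validating afterwards. The following is a minimal sketch that contrasts them on a toy problem. Everything in it is an assumption made for illustration: `toy_logits` is a random stand-in for a real model, the vocabulary is made up, and none of the function names correspond to the API of any library listed here.

```python
import random
import re

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of tokens.
VOCAB = ["0", "1", "7", "a", "b", "-", "<eos>"]
# Constraint: the output must be an optionally signed integer.
PATTERN = re.compile(r"-?[0-9]+")
# Every string that can still be extended into a match of PATTERN.
PREFIX = re.compile(r"-?[0-9]*")


def toy_logits(prefix, rng):
    """Stand-in for an LLM forward pass: one random score per vocabulary token."""
    return [rng.random() for _ in VOCAB]


def allowed(text, token):
    """True if emitting `token` after `text` keeps the output inside the constraint."""
    if token == "<eos>":
        return PATTERN.fullmatch(text) is not None
    return PREFIX.fullmatch(text + token) is not None


def constrained_decode(rng, max_tokens=8):
    """Greedy decoding where disallowed tokens are masked out at every step."""
    text = ""
    for _ in range(max_tokens):
        scores = toy_logits(text, rng)
        candidates = [(s, t) for s, t in zip(scores, VOCAB) if allowed(text, t)]
        _, token = max(candidates)  # best-scoring token among the allowed ones
        if token == "<eos>":
            break
        text += token
    return text


def unconstrained_decode(rng, max_tokens=8):
    """Greedy decoding with no mask; the output may violate the constraint."""
    text = ""
    for _ in range(max_tokens):
        _, token = max(zip(toy_logits(text, rng), VOCAB))
        if token == "<eos>":
            break
        text += token
    return text


def retry_decode(rng, max_attempts=1000):
    """Try-Reject-Repeat: sample freely, validate, and retry on failure."""
    for attempt in range(1, max_attempts + 1):
        text = unconstrained_decode(rng)
        if PATTERN.fullmatch(text) is not None:
            return text, attempt
    return None, max_attempts


if __name__ == "__main__":
    rng = random.Random(0)
    print("masked:", constrained_decode(rng))
    print("retry :", retry_decode(rng))
```

In this sketch the masked decoder returns a valid string in a single pass, while the retry decoder may need several attempts and can fail outright once `max_attempts` is exhausted. The libraries above implement far more efficient versions of the mask computation (for example, precompiled automata over the tokenizer vocabulary), so consult each repository for its actual interface.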

Disclaimer:

The list of libraries above is not exhaustive and is subject to change.
The features mentioned are by no means exhaustive; we strongly recommend checking the respective repositories for more details.
The libraries are ordered by GitHub stars.
If you are the author of a library and would like to add or update the information, please open an issue or submit a pull request.

Papers

Papers marked as new are newly added to this list (not necessarily newly published).

Date	Paper	Publication
2024-10	IterGen: Iterative Structured LLM Generation	Arxiv
2024-08	Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models	Arxiv
2024-08	FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking	Arxiv
2024-07	Automata-based constraints for language model decoding	COLM
2024-06	Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access	ACL
2024-05	Grammar-Aligned Decoding	Preprint
2024-03	SynCode: LLM Generation with Grammar Augmentation	Arxiv
2024-03	Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation	ICML
2024-02	Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars	Arxiv
2024-02	Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents	Arxiv
2023-12	SGLang: Efficient Execution of Structured Language Model Programs	Preprint
2023-12	Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context	NeurIPS
2023-11	Prompt Sketching for Large Language Models	Preprint
2023-11	Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs	PADL
2023-10	Don't Fine-Tune, Decode: Syntax Error-Free Tool Use via Constrained Decoding	Arxiv
2023-10	Amortizing intractable inference in large language models	ICLR
2023-10	KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection	EMNLP
2023-07	Efficient Guided Generation for Large Language Models	Arxiv
2023-06	Grammar Prompting for Domain-Specific Language Generation with Large Language Models	NeurIPS
2023-06	Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning	EMNLP
2023-06	Prompting Is Programming: A Query Language for Large Language Models	PLDI
2023-05	Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing	EMNLP Findings
2023-04	Tractable Control for Autoregressive Language Generation	ICML
2022-11	Validating Large Language Models with ReLM	MLSys
2022-11	CodePAD: Sequence-based Code Generation with Pushdown Automaton	ISSTA
2022-05	Gradient-Based Constrained Sampling from Language Models	EMNLP
2022-01	Synchromesh: Reliable code generation from pre-trained language models	ICLR
2021-12	PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models	EMNLP
2021-12	Constrained Language Models Yield Few-Shot Semantic Parsers	EMNLP
2021-12	Controlled Text Generation as Continuous Optimization with Multiple Constraints	NeurIPS
2021-06	NEUROLOGIC DECODING: (Un)supervised Neural Text Generation with Predicate Logic Constraints	NAACL
2019-05	A General-Purpose Algorithm for Constrained Sequential Inference	CoNLL
2019-05	Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting	NAACL
2018-09	CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling	AAAI
2018-05	Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation	NAACL
2018-04	Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method	AAAI
2017-12	Guided Open Vocabulary Image Captioning with Constrained Beam Search	EMNLP
2017-06	Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search	ACL

Benchmark & Datasets & Evaluation

Date	Paper	Publication
2024-05	COLLIE: Systematic Construction of Constrained Text Generation Tasks	ICLR
2023-12	BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing	NeurIPS Track on Datasets and Benchmarks
2023-10	Evaluating Large Language Models on Controlled Generation Tasks	Arxiv
2023-09	Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?	Arxiv
2020-12	CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning	EMNLP Findings

Survey

Date	Paper	Publication
2024-04	"We Need Structured Output": Towards User-centered Constraints on Large Language Model Output	Arxiv

Blog Posts

Leveraging Constrained Sampling for Fill-in-the-Middle Code Completion by nielstron
Proper Well-Formedness for Finite LLM Sampling by nielstron
LLM Decoding with Regex Constraints by Vivien
Constrained Decoding is Posterior Inference by Saibo-creator
Making Structured Generation Faster Than Unstructured
Coding For Structured Generation with LLMs
Beating GPT-4 with Open Source
Prompt Efficiency - Using Structured Generation to get 8-shot performance from 1-shot.
How fast can grammar-structured generation be?
Structured Generation Improves LLM performance: GSM8K Benchmark
Coalescence: making LLM inference 5x faster
Constrained Decoding with Arbitrary Constraints is NP-hard
LLMs are bad at returning code in JSON

Many of the blog posts were written by the Outlines team. Many thanks to them for their great work! ❤️

Disclaimer

This list is not exhaustive and will be updated regularly. If you have any suggestions or want to add a paper, please feel free to open an issue or submit a pull request. We hope to include all relevant papers in this list.

Contributing

Contributions are welcome! Feel free to submit a pull request or open an issue. Please make sure to read the Contributing Guidelines before contributing.
