ORPO

`Updates (24.03.25)`

A sample script for ORPOTrainer in 🤗TRL has been added to trl/test_orpo_trainer_demo.py
A new model, 🤗kaist-ai/mistral-orpo-capybara-7k, has been added to the 🤗ORPO Collection
You can now try ORPO in 🤗TRL, Axolotl, and LLaMA-Factory🔥
We are preparing a general guideline for training LLMs with ORPO, stay tuned🔥
Mistral-ORPO-β achieved a 14.7% length-controlled (LC) win rate on the official AlpacaEval Leaderboard🔥

This is the official repository for ORPO: Monolithic Preference Optimization without Reference Model. The detailed results from the paper can be found in the AlpacaEval, MT-Bench, and IFEval sections below.

`Model Checkpoints`

Our models trained with ORPO can be found in:

Mistral-ORPO-Capybara-7k: 🤗 kaist-ai/mistral-orpo-capybara-7k
Mistral-ORPO-⍺: 🤗 kaist-ai/mistral-orpo-alpha
Mistral-ORPO-β: 🤗 kaist-ai/mistral-orpo-beta

The corresponding training logs, which track the average log probabilities of the chosen and rejected responses, are reported in:

Mistral-ORPO-Capybara-7k: TBU
Mistral-ORPO-⍺: Wandb Report for Mistral-ORPO-⍺
Mistral-ORPO-β: Wandb Report for Mistral-ORPO-β
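
For readers unfamiliar with the quantity these logs track, the following is a minimal sketch of how an average (length-normalized) log probability can be computed from per-token log probabilities, and how the chosen and rejected values could feed an odds-ratio term of the kind described in the ORPO paper. The function names and the toy inputs are hypothetical and chosen for illustration; this is not the repository's training code.

```python
import math


def avg_log_prob(token_log_probs):
    """Length-normalized log probability of one response."""
    return sum(token_log_probs) / len(token_log_probs)


def log_odds(avg_logp):
    """log(p / (1 - p)) with p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def odds_ratio_loss(chosen_token_log_probs, rejected_token_log_probs):
    """-log(sigmoid(log odds ratio)) between chosen and rejected responses."""
    log_or = (log_odds(avg_log_prob(chosen_token_log_probs))
              - log_odds(avg_log_prob(rejected_token_log_probs)))
    # -log(sigmoid(x)) is equal to log(1 + exp(-x))
    return math.log1p(math.exp(-log_or))


# Toy per-token log probabilities (hypothetical values).
chosen = [-0.2, -0.4]
rejected = [-1.0, -2.0]
print(avg_log_prob(chosen), avg_log_prob(rejected))
print(odds_ratio_loss(chosen, rejected))
```

Under these assumptions the term falls below log(2) once the chosen response has the higher average log probability, which is the trend one would look for in the chosen/rejected curves.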

`AlpacaEval`

Figure 1. AlpacaEval 2.0 score for the models trained with different alignment methods.

`MT-Bench`

Figure 2. MT-Bench result by category.

`IFEval`

IFEval scores are measured with EleutherAI/lm-evaluation-harness, with the chat template applied. The scores for Llama-2-Chat (70B), Zephyr-β (7B), and Mixtral-8X7B-Instruct-v0.1 were originally reported in this tweet.

Model Type	Prompt-Strict	Prompt-Loose	Inst-Strict	Inst-Loose
Llama-2-Chat (70B)	0.4436	0.5342	0.5468	0.6319
Zephyr-β (7B)	0.4233	0.4547	0.5492	0.5767
Mixtral-8X7B-Instruct-v0.1	0.5213	0.5712	0.6343	0.6823
Mistral-ORPO-⍺ (7B)	0.5009	0.5083	0.5995	0.6163
Mistral-ORPO-β (7B)	0.5287	0.5564	0.6355	0.6619
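
To make the Prompt and Inst columns easier to read, here is a minimal sketch of how prompt-level and instruction-level accuracy differ, assuming each prompt carries one or more verifiable instructions and each instruction has already been marked as followed or not. The function name and the toy data are hypothetical; the strict and loose variants are understood to differ in how each instruction is checked, not in this aggregation step.

```python
def ifeval_accuracy(results):
    """Aggregate per-instruction outcomes into two accuracies.

    results: list of prompts, each a non-empty list of booleans
             (one boolean per instruction attached to that prompt).
    Returns (prompt_level_accuracy, instruction_level_accuracy).
    """
    # A prompt counts as correct only if every one of its instructions is followed.
    prompt_level = sum(all(prompt) for prompt in results) / len(results)
    # Instruction-level accuracy pools all instructions across prompts.
    total_instructions = sum(len(prompt) for prompt in results)
    inst_level = sum(sum(prompt) for prompt in results) / total_instructions
    return prompt_level, inst_level


# Toy outcomes for three prompts (hypothetical values).
outcomes = [[True, True], [True, False], [True]]
prompt_acc, inst_acc = ifeval_accuracy(outcomes)
print(prompt_acc, inst_acc)
```

Because a single missed instruction fails the whole prompt, the prompt-level figure can never exceed the instruction-level one, which matches the pattern in the table above.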

