
Pull requests: huggingface/open-r1


Pull requests list

adds support for running GRPO on IOI problems
#495 opened Mar 9, 2025 by guipenedo
sft learn to generate eos token
#494 opened Mar 9, 2025 by digitalSquirrel1
FIX: Three Bugs in async E2B code sandbox
#493 opened Mar 9, 2025 by rasdani
Resolve double BOS token issue
#462 opened Mar 3, 2025 by eldarkurtic
translate readme to Chinese(traditional)
#432 opened Feb 25, 2025 by JillChen525
Start agent traces
#414 opened Feb 24, 2025 by aymeric-roucher Draft
WIP "Faster" grpo trainer
#371 opened Feb 19, 2025 by edbeeching Draft
3 of 4 tasks
Fix dataset url
#347 opened Feb 17, 2025 by Zzhiter
Fix reasoning_steps_reward function
#335 opened Feb 16, 2025 by rocke2020
fix bug, solutions not found
#334 opened Feb 15, 2025 by hellen9527
Update sglang README.md
#330 opened Feb 15, 2025 by yh-yao
Update grpo.py
#325 opened Feb 14, 2025 by tpoisonooo
add text similarity for more common accuracy reward
#322 opened Feb 14, 2025 by sungatetop
fix: sft fix
#307 opened Feb 13, 2025 by pointerhacker
Fix eval max length
#297 opened Feb 12, 2025 by Some-random
[rewards] use dense rep penalty
#296 opened Feb 12, 2025 by kashif
Update README.md
#291 opened Feb 12, 2025 by tpoisonooo
Performance improvements of reward calculation
#286 opened Feb 11, 2025 by saidineshpola