huggingface / open-r1 Public

Fork 2k
Star 22.6k

Pull requests: huggingface/open-r1

38 Open 157 Closed

Pull requests list

adds support for running GRPO on IOI problems

#495 opened Mar 9, 2025 by guipenedo

sft learn to generate eos token

#494 opened Mar 9, 2025 by digitalSquirrel1

FIX: Three Bugs in async E2B code sandbox

#493 opened Mar 9, 2025 by rasdani

Extend max_model_length to prevent context truncation

#463 opened Mar 3, 2025 by eldarkurtic

Resolve double BOS token issue

#462 opened Mar 3, 2025 by eldarkurtic

SFT configs for Qwen coder models

#438 opened Feb 26, 2025 by edbeeching • Draft

translate readme to Chinese(traditional)

#432 opened Feb 25, 2025 by JillChen525

Start agent traces

#414 opened Feb 24, 2025 by aymeric-roucher • Draft

feat: make reward functions to support parallel computation

#398 opened Feb 23, 2025 by 0x404

New GRPO dataset and tasks: formally-verified program correctness

#379 opened Feb 20, 2025 by ocramz

WIP "Faster" grpo trainer

#371 opened Feb 19, 2025 by edbeeching • Draft

3 of 4 tasks

Fix dataset url

#347 opened Feb 17, 2025 by Zzhiter

Fix reasoning_steps_reward function

#335 opened Feb 16, 2025 by rocke2020

fix bug, solutions not found

#334 opened Feb 15, 2025 by hellen9527

Update sglang README.md

#330 opened Feb 15, 2025 by yh-yao

Update grpo.py

#325 opened Feb 14, 2025 by tpoisonooo

#322 opened Feb 14, 2025 by sungatetop

fix: sft fix

#307 opened Feb 13, 2025 by pointerhacker

Fix: Default value of cosine_min_value_wrong parameter

#305 opened Feb 13, 2025 by zhangsheng377

Simplified installation requirements to support more accelerators

#303 opened Feb 13, 2025 by ji-huazhong

Fix eval max length

#297 opened Feb 12, 2025 by Some-random

[rewards] use dense rep penalty

#296 opened Feb 12, 2025 by kashif

Update README.md

#291 opened Feb 12, 2025 by tpoisonooo

Performance improvements of reward calculation

#286 opened Feb 11, 2025 by saidineshpola

[GRPO] generate with prompt containing the first <think> tag

#283 opened Feb 11, 2025 by kashif
