Feature request: GRPO support #2340

Open
tikikun opened this issue Feb 4, 2025 · 5 comments

Comments

@tikikun

tikikun commented Feb 4, 2025

As you all probably know by now, DeepSeek-R1 with its GRPO training was quite successful. Should we consider bringing GRPO into torchtune?

@vgoklani

vgoklani commented Feb 4, 2025

Yes please. We need a reference-level implementation that's both clear and concise. It should obviously leverage FSDP2.

@felipemello1
Contributor

Hi all, thanks for your interest! We have a couple of PRs cooking. Please feel free to take a look and contribute/review.

@akashc1
Contributor

akashc1 commented Feb 4, 2025

I've also started working on GRPO on my end. @felipemello1, can you point me to those PRs? Would love to collaborate on this.

I'm especially interested in the RLVR approach used in DeepSeek R1 and Tulu 3.

@felipemello1
Contributor

Take a look at this one, which is being done by a user: #2326

@felipemello1
Contributor

@akashc1 @dzheng256 @tikikun @vgoklani and others, please check the new comment with a task list. If you feel like contributing, that would be a good starting point: #2326
