docs: Created a cookbook that walks you through finetuning a Model with GRPO #1559

Open
wants to merge 1 commit into
base: master
from

Conversation

apokryphosx

@apokryphosx apokryphosx commented Feb 6, 2025

Description

I added a cookbook that walks a user through fine-tuning a model with GRPO.

Motivation and Context

Fine-tuning agents with RL is a necessary step towards AGI, and GRPO has emerged as a computationally cheaper alternative to PPO.

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of example)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • [x] I have read the CONTRIBUTION guide. (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly. (required for a bug fix or a new feature)
  • I have updated the documentation accordingly.

@zjrwtx zjrwtx assigned zjrwtx and unassigned zjrwtx Feb 6, 2025
@zjrwtx zjrwtx self-requested a review February 6, 2025 10:19
@zjrwtx
Collaborator

zjrwtx commented Feb 6, 2025

Thanks, @apokryphosx! Looks good to me, but a few things still need to be improved:
1. About the PR title, it would be better as: docs: Finetuning a Model with GRPO
Please refer to the contributing docs:

https://github.com/camel-ai/camel/blob/master/CONTRIBUTING.md#pull-request-item-stage

@zjrwtx
Collaborator

zjrwtx commented Feb 6, 2025

2. Can we add some chat history showing this GRPO model in use?
3. Can we add a final preview of the model after it is uploaded to Hugging Face?
4. Can we add a demo that directly uses the model we upload to Hugging Face?

@Wendong-Fan Wendong-Fan added the Data Related to camel data processing label Feb 6, 2025
@Wendong-Fan Wendong-Fan added this to the Sprint 22 milestone Feb 6, 2025
@Wendong-Fan Wendong-Fan changed the title Created a cookbook that walks you through finetuning a Model with GRPO docs: Created a cookbook that walks you through finetuning a Model with GRPO Feb 6, 2025
@apokryphosx
Author

2. Can we add some chat history showing this GRPO model in use?
3. Can we add a final preview of the model after it is uploaded to Hugging Face?
4. Can we add a demo that directly uses the model we upload to Hugging Face?

Sure thing! I'll take care of it.

@Wendong-Fan
Member

Thanks for the contribution, @apokryphosx! Could you share the link to the Colab notebook and give us viewing access? That would be helpful for the review.

Labels
cookbook Data Related to camel data processing
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[Feature Request] Implement GRPO, PPO and potentially other policy gradient methods to finetune LM Agents
3 participants