chore: update trl to grpo_vllm branch, move lighteval to uv #30
@lewtun This will break the GH action at the moment. I was leaving it for the other contributor to come through, move the action, and set up ruff for the whole thing, but it syncs and we're ready. Once I get over into TRL I'll swap over to sdpa, but as long as it's installed from the wheel, flash-attn syncs just fine at the moment.
This is awesome, thanks a lot! Would you mind updating the installation instructions in the
Yep! I just went from a fresh clone to a PPO test run and worked out a few things along the way. I'll do one more pass on the pyproject and README and then ping you.
[tool.uv.sources]
transformers = { git = "https://github.com/huggingface/transformers.git", branch = "main" }
trl = { git = "https://github.com/huggingface/trl.git", branch = "grpo_vllm" }
Use main branch here
Copy. I'm running uv pip install -e . from that venv (which I'll document) in my trl clone after I uv sync, so that makes more sense anyway.
I can pick up the creation of the GH action with the new setup.
OK, sorry for the delay - I wanted to make sure it all built and compiled and that things could start up, and apparently I lost a PCIe slot, which was throwing issues. I'm clear on this and will tie it together this evening once I have a few other things off my plate!
This is still WIP - I'm going to take the opportunity here to swap from flash-attn to sdpa and remove that dependency, and then I need to verify and push the lockfile.
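A minimal sketch of what the flash-attn to sdpa swap could look like, assuming the model is loaded through transformers' `from_pretrained`, which accepts an `attn_implementation` argument. The helper `pick_attn_implementation` is hypothetical and only illustrates defaulting to sdpa while tolerating an optional flash-attn install; the actual change may simply hard-code sdpa.

```python
from importlib.util import find_spec


def pick_attn_implementation(prefer_flash: bool = False) -> str:
    """Return an attn_implementation string for transformers' from_pretrained.

    Hypothetical helper: defaults to sdpa so flash-attn is not a hard dependency.
    """
    # flash_attention_2 needs the optional flash-attn package; sdpa ships with PyTorch.
    if prefer_flash and find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


if __name__ == "__main__":
    attn = pick_attn_implementation()
    print(attn)
    # Hypothetical usage (the model id is a placeholder, not taken from this PR):
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained("<model-id>", attn_implementation=attn)
```

Defaulting to sdpa keeps the lockfile free of a compiled flash-attn wheel, while still letting someone opt in locally if the package happens to be installed.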
Additionally, when syncing the existing lighteval branch with git lfs installed for the repo, it looks like a blob may not have been pushed?