Create proposal for training repo #85
The PR adds entries to `.gitignore`:

```
# Spelling
dictionary.dic

venv
```

# Modify Repository Proposal: Training

## Summary

This document proposes taking the existing [instructlab/training](https://github.com/instructlab/training) repo and transforming it into a library that can be consumed
by other clients, such as the CLI and what is currently the `main_ds.py` file
that launches a full fine-tune.

## Background

Today, we have training implemented for different use-cases
in different repos. LoRA/QLoRA training currently lives
in the [`instructlab`](https://github.com/instructlab/instructlab)
CLI repo and contains logic for both Linux and macOS training.
Each of these brings additional logic for platform-specific accelerators,
for example NVIDIA CUDA and Intel Habana on Linux, and MLX and MPS on macOS.
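
A minimal sketch of what this per-platform dispatch tends to look like (illustrative only; the function and backend names here are hypothetical, not the CLI's actual code):

```python
import platform
from typing import Optional

def select_backend(system: Optional[str] = None,
                   has_cuda: bool = False,
                   has_habana: bool = False) -> str:
    """Pick a training backend name for the given platform.

    Hypothetical example: the real CLI's dispatch differs, but the
    shape of the problem is the same -- every client currently has
    to reimplement a decision like this on its own.
    """
    system = system or platform.system()
    if system == "Linux":
        if has_cuda:
            return "cuda"
        if has_habana:
            return "habana"
        return "cpu"
    if system == "Darwin":
        # On macOS a client might prefer MLX or fall back to MPS.
        return "mps"
    return "cpu"
```

Every client that carries a copy of logic like this must also carry the tests for it, which is part of the duplication this proposal aims to remove.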
On the other hand, the logic for full fine-tuning lives
in the [instructlab/training](https://github.com/instructlab/training) repo. This implementation is currently CUDA-specific,
as it relies on NCCL for handling communication across
NVIDIA GPUs.
We want to bring support for alternative accelerators such as Intel Gaudi to this repo as well, without rewriting much of the logic.

We need a library which provides a common interface
for training, allowing all of the hard logic to be concentrated
in a single place.

The [instructlab/training](https://github.com/instructlab/training) repo should then
become the home for the overall training library.

This library will then provide a "simple" interface that all
other clients can consume, with little needing to change
on their side other than how they use it.
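
To make this concrete, a library entry point might look roughly like the following. This is a hypothetical sketch: `TrainingArgs`, its fields, and `run_training` are illustrative names, not the actual API this proposal would produce.

```python
from dataclasses import dataclass

@dataclass
class TrainingArgs:
    # Illustrative fields only; the real interface would be
    # designed as part of this proposal's implementation.
    model_path: str
    data_path: str
    num_epochs: int = 1
    device: str = "cuda"  # e.g. "cuda", "hpu", "mps"

def run_training(args: TrainingArgs) -> str:
    # A real implementation would dispatch to the platform-specific
    # backend (CUDA/NCCL, Intel Gaudi, MLX/MPS) behind this single
    # entry point, so clients never touch accelerator details.
    return (f"would train {args.model_path} on {args.device} "
            f"for {args.num_epochs} epoch(s)")
```

A client such as the CLI would then call `run_training(TrainingArgs(...))` and leave all accelerator-specific handling to the library.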
## Maintainers

The maintainers should be the folks who currently work on training.
Namely:

- [Aldo Pareja](https://github.com/aldopareja)
- [James Kunstle](https://github.com/orgs/instructlab/people/JamesKunstle)
- [Oleg Silkin](https://github.com/orgs/instructlab/people/RobotSail)
- [Mustafa Eyceoz](https://github.com/orgs/instructlab/people/Maxusmusti)

We can also grant blanket access by setting it to the `Backend Maintainers` GitHub team.

## Alternatives Considered

### Keep everything separate

Rather than consolidating the logic, we could keep the existing
logic as-is.

This would mean that the [CLI repo](https://github.com/instructlab/instructlab) and the [training repo](https://github.com/instructlab/training) would both maintain their own implementations of training.

This means that if extra logic must be added for Intel Gaudi or NVIDIA, it would need to be added and tested in two different places.

Since we need to move full training to live inside of the
[CLI repo](https://github.com/instructlab/instructlab),
this would leave us with two duplicate implementations of
full fine-tune training and add further points of complication.

### Move everything into one repo but don't design an interface

We could move the existing training logic into the training repo
and simply have the existing clients consume it this way.

The challenge here is that a lot of the logic is very specific
to the client application, so development would have to happen
across both repos at once.

If someone wants to create a new PR in the CLI repo for a change
they're making to the LoRA training, they'd also need to create
another PR in the training repo and maintain two PRs at once.
When a maintainer requests a change in one PR, both would need to be updated to stay consistent with each other.

The challenges this presents to both developers and maintainers
are clear. Therefore, a natural conclusion is that we need a separate library that provides an extensible interface for other clients to consume.