create proposal for training repo
Signed-off-by: Oleg S <[email protected]>
RobotSail committed Jun 10, 2024
1 parent a04745f commit f719c6d
Showing 3 changed files with 89 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -2,3 +2,5 @@

# Spelling
dictionary.dic

venv
10 changes: 10 additions & 0 deletions .spellcheck-en-custom.txt
@@ -37,14 +37,17 @@ Eval
Excalidraw
exfiltrate
exfiltrating
Eyceoz
Finetuning
formedness
Gaudi
GFX
GGUF
GGUFs
GiB
Gmail
gpu
Habana
hipBLAS
ilab
impactful
@@ -58,6 +61,7 @@ Kaggle
Kaggle's
Kai
Kubernetes
Kunstle
lignment
LLM
llms
@@ -72,8 +76,12 @@ Miniforge
Mixtral
MLX
mlx
MPS
Mustafa
NCCL
NVidia
Nvidia
Oleg
ollama
Ollama
orchestrator
@@ -98,6 +106,7 @@ RDNA
README
rebase
repo
repos
ROCm
RTX
RX
@@ -110,6 +119,7 @@ sexualized
SHA
Shivchander
Signoff
Silkin
Srivastava
subdirectory
Sudalairaj
77 changes: 77 additions & 0 deletions docs/training/training-repo.md
@@ -0,0 +1,77 @@
# Modify Repository Proposal: Training

## Summary

This document proposes transforming the existing [instructlab/training](https://github.com/instructlab/training) repository into a library that will be consumed
by other clients, such as the CLI and what is currently the `main_ds.py` file
that launches a full fine-tune.

## Background

Today, training is implemented for different use cases
across different repos. LoRA/QLoRA training currently lives
in the [`instructlab`](https://github.com/instructlab/instructlab)
CLI repo and contains logic for both Linux and macOS training.
Each of these platforms then brings additional logic for platform-specific accelerators:
for example, NVIDIA CUDA and Intel Habana on Linux, and MLX and MPS on macOS.
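
To illustrate the kind of per-platform branching each client currently has to carry, here is a minimal, hypothetical sketch. It is not the CLI's actual code: the backend names and fallbacks are assumptions, and only the standard PyTorch device checks (`torch.cuda.is_available`, `torch.backends.mps.is_available`) are real APIs.

```python
# Hypothetical sketch of per-platform accelerator selection -- not the CLI repo's actual code.
import platform

import torch


def select_backend() -> str:
    """Pick a training backend based on the host OS and available accelerator."""
    if platform.system() == "Darwin":
        # macOS path: prefer Apple's MPS device when PyTorch exposes it;
        # an MLX code path would need its own separate check and branch.
        if torch.backends.mps.is_available():
            return "mps"
        return "cpu"
    # Linux path: prefer NVIDIA CUDA when present; Intel Habana/Gaudi support
    # would require yet another device check and its own training code path.
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"
```

Every client that implements training ends up carrying some variant of this dispatch, which is part of the duplication this proposal aims to remove.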

On the other hand, the logic for full fine-tuning lives in the
[instructlab/training](https://github.com/instructlab/training) repo. Today, this implementation is CUDA-specific,
as it relies on NCCL to handle communication across multiple
NVIDIA GPUs.
We want to bring support for alternative accelerators, such as Intel Gaudi, to this repo as well without rewriting much of the logic.
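
As a concrete illustration of that CUDA coupling, below is a minimal sketch of a typical NCCL-backed setup using standard `torch.distributed` calls. It is not the training repo's actual entry point; the environment-variable handling assumes a `torchrun`-style launch.

```python
# Minimal sketch of NCCL-backed multi-GPU setup -- illustrative, not the repo's actual code.
import os

import torch
import torch.distributed as dist


def init_distributed() -> None:
    """Initialize inter-GPU communication over NCCL (an NVIDIA-only backend)."""
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))  # set by a torchrun-style launcher
    torch.cuda.set_device(local_rank)  # NCCL requires each rank to own a CUDA device
    # Supporting a non-NVIDIA accelerator (e.g. Intel Gaudi) means replacing this
    # backend and the device handling above, not just flipping a flag.
    dist.init_process_group(backend="nccl")
```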

We need a library that provides a common interface
for training, allowing all of the hard logic to be concentrated
in a single place.

The [instructlab/training](https://github.com/instructlab/training) repo should then
become the home for the overall training library.

This library will then expose a "simple" interface that all
other clients can pull in and use, with little needing to change
on their side other than how they intend to use it.
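
A hypothetical sketch of what that interface could look like follows. The names (`TrainingArgs`, `run_training`) and fields are illustrative assumptions for this proposal, not a committed API.

```python
# Hypothetical interface sketch -- names and fields are illustrative, not a committed API.
from dataclasses import dataclass


@dataclass
class TrainingArgs:
    """Everything a client needs to describe a training run."""
    model_path: str
    data_path: str
    output_dir: str
    num_epochs: int = 1


def run_training(args: TrainingArgs) -> None:
    """Single entry point clients such as the CLI would call.

    Accelerator- and platform-specific details (CUDA/NCCL, Gaudi, MPS, ...)
    would be resolved inside the library rather than by each client.
    """
    raise NotImplementedError("backend dispatch would live inside the library")
```

A client such as the CLI would then only construct a `TrainingArgs` and call `run_training`, regardless of which accelerator or platform is present.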

## Maintainers

The maintainers should be the folks who currently work on training.
Namely:

- [Aldo Pareja](https://github.com/aldopareja)
- [James Kunstle](https://github.com/orgs/instructlab/people/JamesKunstle)
- [Oleg Silkin](https://github.com/orgs/instructlab/people/RobotSail)
- [Mustafa Eyceoz](https://github.com/orgs/instructlab/people/Maxusmusti)

We can also grant blanket access by assigning it to the `Backend Maintainers` GitHub team.

## Alternatives Considered

### Keep everything separate

Rather than consolidating the logic, we can keep the existing
implementations as-is.

This would mean that the [CLI repo](https://github.com/instructlab/instructlab) and the [training repo](https://github.com/instructlab/training) would both maintain their own implementations of training.

This means that if extra logic must be added for Intel Gaudi or NVIDIA, it would need to be added and tested in two different places.

Since full training also needs to live inside of the
[CLI repo](https://github.com/instructlab/instructlab),
this would result in two duplicate implementations of
full fine-tune training and add additional points of complication.

### Move everything into one repo but don't design an interface

We can move the existing training logic into the training repo
and simply have the existing clients consume it this way.

The challenge here is that a lot of the logic is very specific
to the client application, so development would end up spanning both repos.

If someone wants to open a PR against the CLI repo for a change
they're making to the LoRA training, they'd need to open
another PR against the training repo and maintain two PRs at once.
When a maintainer requests a change in one PR, both would need to be updated to accommodate each other.

The challenges this presents to both developers and maintainers
are clear. Therefore, the natural conclusion is that we need a separate library that provides an extensible interface for other clients to consume.
