What's new
Added 🎉
- A bunch of annealing configs
- constant_with_warmup learning rate schedule
- one_in_eight configuration for activation checkpointing
- New tokenizer in the source instead of from Hugging Face
- Improved support for GCS
- torch.compile() now only compiles each block, not the whole model.
- Support for torch.compile() with dynamic=True
- Resetting torch.compile() after every evaluation, because evaluation interferes with the compiled versions
- Added more in-loop evaluation tasks to pick from, mostly for scaling laws.
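As a rough illustration of two of the additions above, here is a minimal pure-Python sketch of what a constant-with-warmup schedule and a one-in-eight checkpointing pattern could look like. The function names, the parameters (`peak_lr`, `warmup_steps`, `every`), the warmup starting at zero, and the choice of which block in each group of eight is checkpointed are all assumptions made for illustration; the implementation in this release may differ.

```python
from typing import List


def constant_with_warmup(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Hypothetical schedule: linear warmup from 0 to peak_lr, then constant.

    Assumption: warmup starts at 0; the real schedule may start elsewhere.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr


def checkpointed_blocks(n_blocks: int, every: int = 8) -> List[int]:
    """Hypothetical helper: indices of blocks that use activation checkpointing.

    Assumption: the first block of each group of `every` blocks is chosen.
    """
    return [i for i in range(n_blocks) if i % every == 0]


if __name__ == "__main__":
    # Learning rate at a few steps with a 4-step warmup.
    for step in (0, 2, 4, 100):
        print(step, constant_with_warmup(step, peak_lr=1.0, warmup_steps=4))
    # Which of 16 blocks would be checkpointed under a one-in-eight pattern.
    print(checkpointed_blocks(16))
```

Checkpointing only one block in eight trades a small amount of recomputation for a modest memory saving, which is the usual reason to pick a sparse pattern over checkpointing every block.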
Commits
b41634f One more hint for what's going on.
d74e835 A little more help for getting started
24ce0ca Merge pull request #756 from allenai/MoreCheckpoints2
69d1e4e Note about and link to Huggingface
3c6d515 Merge pull request #754 from allenai/MoreCheckpoints
645587e Merge branch 'main' of https://github.com/allenai/LLM
a346674 Fix link
4f0d7d1 We use safetensors now.
a6e6e2b Remove links that don't work
e6f6b45 Remove obsolete docs
0d14158 Merge pull request #745 from allenai/improve-documentation
1048c16 Merge pull request #750 from allenai/dave/annealing_peteish_v2
767047c Merge pull request #749 from allenai/mattj/legalwhammy2-augusta
9c677c9 Merge pull request #748 from allenai/oeeval-ladder-testtrain
7e81a6c Merge pull request #739 from allenai/peteish13-augusta
31c385f Merge pull request #742 from allenai/GoogleStorage
afd728f Merge pull request #738 from allenai/annealing_peteish_v2_neweval
837a4ff Merge pull request #687 from allenai/kylel/config-diff