In this project we explored the training dynamics of Parameter-Efficient Fine-Tuning (PEFT) methods, with an emphasis on Low-Rank Adaptation (LoRA). Our main goal was to evaluate whether the memory overhead of fine-tuning can be reduced further by selectively deactivating gradient updates for certain modules during training. In our method, we measured either the activation magnitude of the adapted layers in the forward pass or the gradient magnitude of the same vector in the backward pass. After aggregating these measurements with the L2 norm, we treated them as importance scores and used them to dynamically select which modules to update during training.
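The sketch below illustrates the general idea under stated assumptions: it uses PyTorch hooks to accumulate L2 norms of activations (forward) or gradients (backward) per layer, then keeps only the top-k modules trainable. The class and function names, the use of nn.Linear as a stand-in for the adapted LoRA layers, and the top-k selection budget are illustrative assumptions, not the actual FoMo-LoRA implementation.

import torch
import torch.nn as nn


class ImportanceTracker:
    """Accumulates L2 norms of activations (forward) or gradients (backward) per module."""

    def __init__(self, model: nn.Module, mode: str = "activation"):
        self.scores = {}
        self.handles = []
        for name, module in model.named_modules():
            if isinstance(module, nn.Linear):  # stand-in for the adapted LoRA layers
                self.scores[name] = 0.0
                if mode == "activation":
                    self.handles.append(module.register_forward_hook(self._forward_hook(name)))
                else:
                    self.handles.append(module.register_full_backward_hook(self._backward_hook(name)))

    def _forward_hook(self, name):
        def hook(module, inputs, output):
            # L2 norm of the layer output as the importance signal
            self.scores[name] += output.detach().norm(p=2).item()
        return hook

    def _backward_hook(self, name):
        def hook(module, grad_input, grad_output):
            # L2 norm of the gradient flowing out of the layer
            self.scores[name] += grad_output[0].detach().norm(p=2).item()
        return hook

    def select_top_k(self, k: int):
        """Return the k module names with the largest accumulated norms."""
        return sorted(self.scores, key=self.scores.get, reverse=True)[:k]

    def remove(self):
        for handle in self.handles:
            handle.remove()


def freeze_unselected(model: nn.Module, selected):
    """Disable gradient updates for modules that were not selected as important."""
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and name not in selected:
            for param in module.parameters():
                param.requires_grad_(False)


if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
    tracker = ImportanceTracker(model, mode="activation")
    model(torch.randn(4, 16)).sum().backward()
    selected = set(tracker.select_top_k(k=1))
    freeze_unselected(model, selected)
    tracker.remove()
    print("kept trainable:", selected)

In a training loop, the same selection step would be rerun periodically so that the set of updated modules can change dynamically as importance scores evolve.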
Our results show that we can effectively reduce the number of updated parameters during training by
To install the requirements for FoMo-LoRA, run the following commands:
git clone https://github.com/JesseBrouw/FoMo-LoRA.git
cd FoMo-LoRA
scripts/setup.sh
To use FoMo-LoRA, inspect the help message of the train module as well as the provided job files in scripts:
python -m src.train --help
This project uses the following license: license_name.