Adding Dynamic Low-Rank Adaptation (DoRA ACL2024) #2278

Open
dohuyduc2002 opened this issue Dec 13, 2024 · 2 comments
Comments

@dohuyduc2002

Feature request

Paper link: https://arxiv.org/pdf/2405.17357
Source code: https://github.com/MIkumikumi0116/DoRA/blob/main/Src/Finetune_And_Benchmark/Finetune_Utils.py

Motivation

While reading this paper, I was intrigued by how it improves on AdaLoRA, as described in these quotes:

Compared to existing methods of dynamic parameter allocation (e.g., AdaLoRA), DoRA can allocate parameter budgets more appropriately based on a richer set of information from projection matrices.

Compared to previous methods (Zhang et al., 2023), we use $\|\Delta W_i\|_F$ instead of $c_i$ to assess the importance of components, thereby incorporating information from $A_i$ and $B_i$ for a more comprehensive evaluation of component importance.

I have checked the AdaLoRA implementation and found two components from the DoRA paper that could be added to PEFT:

  • The DEM loss, implemented in the Trainer by overriding the compute_loss method so that the loss is applied during DoRA training
  • The pruning method from DoRA, which determines the rank of each LoRA layer
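
To make the quoted importance criterion concrete, here is a minimal sketch, not the paper's code. It assumes each update is a sum of rank-one components ΔW_i = c_i · b_i a_iᵀ, in which case the Frobenius norm factorizes as |c_i| · ‖b_i‖ · ‖a_i‖. The function names, the list-of-rows layout, and the plain top-k selection are all illustrative assumptions; the paper's actual pruning schedule and any score normalization are not reproduced here.

```python
import math


def component_importance(A, B, c):
    """Return ||c_i * b_i a_i^T||_F for each rank-one component i.

    Assumed layout (hypothetical, not taken from the paper's code):
      A: r rows, row i is a_i with d_in entries
      B: r rows, row i is b_i with d_out entries
      c: r scalars
    For an outer product, ||c * b a^T||_F == |c| * ||b||_2 * ||a||_2.
    """
    def l2(v):
        return math.sqrt(sum(x * x for x in v))

    return [abs(ci) * l2(bi) * l2(ai) for ai, bi, ci in zip(A, B, c)]


def keep_top_k(scores, k):
    """Indices of the k highest-scoring components, in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy example: component 0 has the smallest scalar c_i but the largest update.
A = [[3.0, 4.0], [0.1, 0.0], [0.0, 2.0]]
B = [[2.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
c = [0.5, 2.0, 1.0]

scores = component_importance(A, B, c)
kept = keep_top_k(scores, 2)  # keeps components 0 and 2
```

Ranking by |c_i| alone would keep components 1 and 2 in this toy case, which is the kind of difference the quote is pointing at.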

Your contribution

I'm working on reimplementing this paper; further updates will be added to this issue.

@dohuyduc2002 changed the title from "add Dynamic Low-Rank Adaptation (DoRA ACL2024)" to "Adding Dynamic Low-Rank Adaptation (DoRA ACL2024)" on Dec 13, 2024
@BenjaminBossan
Member

Thanks for bringing this paper to our attention and offering to work on the implementation. I haven't looked at the details, but wanted to mention a few things:

  1. The change to Trainer that you mentioned would have to be added to transformers, as PEFT itself doesn't offer any training code directly. However, if it's implemented as a callback, I could see adding that to PEFT. Otherwise, we can add a training example that subclasses Trainer for users to copy.
  2. There is already a method called DoRA in PEFT, so we should use another name to avoid confusion.
  3. Besides AdaLoRA, it could also be worth taking a look at the newly added EVA method, which also reallocates LoRA ranks in a data-driven way.
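
For reference, the "subclass Trainer" option from point 1 could look roughly like the sketch below. It only shows where an extra loss term would plug in: `RegularizedTrainer`, `reg_fn`, and `reg_weight` are hypothetical names, and `reg_fn` stands in for whatever regularizer (for example, a DEM-style term) the user supplies. It is not an implementation of the paper's loss.

```python
from transformers import Trainer


class RegularizedTrainer(Trainer):
    """Trainer that adds a user-supplied regularization term to the task loss.

    reg_fn and reg_weight are hypothetical arguments introduced for this
    sketch; reg_fn(model) is assumed to return a scalar tensor.
    """

    def __init__(self, *args, reg_fn=None, reg_weight=0.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.reg_fn = reg_fn
        self.reg_weight = reg_weight

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # **kwargs forwards any extra arguments that newer transformers
        # versions pass (such as num_items_in_batch) without requiring them.
        loss, outputs = super().compute_loss(
            model, inputs, return_outputs=True, **kwargs
        )
        if self.reg_fn is not None and self.reg_weight:
            loss = loss + self.reg_weight * self.reg_fn(model)
        return (loss, outputs) if return_outputs else loss
```

Note that Trainer also calls compute_loss during evaluation, so the reported eval loss would include the extra term unless it is skipped when `model.training` is false.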

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
