Is it possible to add LoRA on specific head? #2293
Comments
With
config = LoraConfig(
    target_modules=["seq.0", "seq.2"],  # use the layer names according to the model you are using
    modules_to_save=["seq.4"],
)

You can retrieve the name and type of each layer of your model with this code:

for n, m in base_model.named_modules():  # replace `base_model` with the variable your pretrained model is stored in
    print((n, type(m)))

For example, for Llama 3.2 1B:

target_modules=["layers.0.self_attn.q_proj", "layers.0.self_attn.v_proj"],  # q and v for self_attn layer 0
target_modules=["q_proj", "v_proj"],  # q and v for all self_attn layers
Hi, @d-kleine, thanks for the reply. I am thinking more about something like adding LoRA on some attention heads only, which means my target might be:
It is not possible to target specific heads. The issue is that the weights of all heads are combined into a single
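To make the point above concrete: in the common layout, a projection such as `q_proj` is one linear layer with `num_heads * head_dim` output features, so a single head is only a contiguous range of rows inside that weight, not a module with its own name. The following is a minimal sketch under that assumption; the helper names `head_rows` and `head_row_mask` and the sizes are hypothetical and chosen for illustration only.

```python
def head_rows(head, head_dim):
    """Rows of a fused projection weight ([num_heads * head_dim, hidden]) owned by one head.

    Assumes heads are stored contiguously, one after another.
    """
    return range(head * head_dim, (head + 1) * head_dim)


def head_row_mask(heads, num_heads, head_dim):
    """Boolean mask over all output rows: True where the row belongs to a selected head."""
    selected = set(heads)
    mask = []
    for h in range(num_heads):
        mask.extend([h in selected] * head_dim)
    return mask


# Illustrative sizes only (not taken from any real model config)
num_heads, head_dim = 4, 2
print(list(head_rows(1, head_dim)))
print(head_row_mask([1, 3], num_heads, head_dim))
```

Because `target_modules` matches module names, it can select the whole projection but not such a row range; restricting LoRA to a head therefore needs extra logic inside the layer.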
Thanks for the prompt reply, @BenjaminBossan. I found some possible approaches, such as this previous issue, where SAM's Q, K, and V were successfully separated. Could the same idea be used in my case to separate out each head?
That is possible, but it means that you have to implement the whole transformer attention module yourself, and you might miss out on some optimizations (flash attention, caching). Alternatively, you might be able to write a custom LoRA layer that, say, masks out the heads that should not be touched, and register it with the PEFT dispatcher to be applied to the whole attention module, e.g.
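The masking idea could look roughly like the sketch below. It is written in plain PyTorch and is not PEFT's API: the class name `HeadMaskedLoRALinear` and its arguments are hypothetical, it assumes heads occupy contiguous output rows of the wrapped projection, and the step of registering a custom layer with PEFT is deliberately left out.

```python
import torch
import torch.nn as nn


class HeadMaskedLoRALinear(nn.Module):
    """Hypothetical wrapper: LoRA update on a fused projection, limited to selected heads."""

    def __init__(self, base, num_heads, heads, r=8, alpha=16):
        super().__init__()
        if base.out_features % num_heads != 0:
            raise ValueError("out_features must be divisible by num_heads")
        head_dim = base.out_features // num_heads
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # keep the pretrained weights frozen
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # update starts at zero, as in standard LoRA
        self.scaling = alpha / r
        # 1.0 on the output rows of the selected heads, 0.0 everywhere else
        mask = torch.zeros(base.out_features)
        for h in heads:
            mask[h * head_dim:(h + 1) * head_dim] = 1.0
        self.register_buffer("head_mask", mask)

    def forward(self, x):
        delta = self.lora_B(self.lora_A(x)) * self.scaling
        return self.base(x) + delta * self.head_mask  # masked heads receive no update


torch.manual_seed(0)
layer = HeadMaskedLoRALinear(nn.Linear(16, 16), num_heads=4, heads=[1], r=2)
nn.init.normal_(layer.lora_B.weight)  # pretend training has moved B away from zero
x = torch.randn(3, 16)
diff = layer(x) - layer.base(x)
print(diff[:, :4].abs().max().item())   # head 0: not selected
print(diff[:, 4:8].abs().max().item())  # head 1: selected
```

Since the mask multiplies the LoRA output, the masked rows of `lora_B` also receive zero gradient, so only the selected heads change. This applies to projections whose output is split into heads; for an output projection that consumes the concatenated heads, the mask would have to act on the input features instead.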
Thanks, @BenjaminBossan.
I tried to do this on
You mean that, using the same weights with your implementation, the performance is already degraded? Yes, this most likely means there is a bug somewhere. You could paste your implementation here and mark the parts of the code that you changed, and I can take a look.
Feature request
Could I add LoRA only to some selected heads on the model?
I read some documentation here, but am still not sure about how to implement my goal.
Motivation
The current LoraConfig lets users choose which matrices LoRA is added to. More fine-grained control over which heads receive LoRA would be beneficial for developers.
Your contribution
I would appreciate some tips on how to implement this.