PEFT LoRA Attention Masking #6919
-
In my quest to control all parts of the generation, and given the new discussion about LoRA merging, I wanted to test whether attention masking can be applied to each LoRA, since this would interact with the new merging mechanism. I don't know much about how PEFT works internally and couldn't find any information on this, as it is generally used more for LLMs. Is there a straightforward way to use a binary mask with the PEFT LoRAs in diffusers, the way IP Adapters let us do it with the IPAttentionProcessor? There's an extension for Automatic1111, and the Kohya repo also has a script for it, but neither uses PEFT. I think I might have to do it in the linear LoRA layer; however, I was hoping to avoid that, since it would mean customizing another library besides diffusers. @sayakpaul @patrickvonplaten, could you guide me with this or perhaps ping someone from PEFT who can? If this implies a lot of code refactoring, I can discard the idea and move on.
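For illustration, this is roughly what doing it in the linear LoRA layer could look like. A minimal sketch in plain PyTorch with a made-up MaskedLoRALinear class (not the actual diffusers or PEFT layer): the low-rank update is gated by a binary mask before it is added to the frozen base layer's output.

```python
# Rough sketch, not diffusers/PEFT code: a LoRA-style linear layer whose
# low-rank update is gated by a binary mask before being added to the
# frozen base layer's output.
from typing import Optional

import torch
import torch.nn as nn


class MaskedLoRALinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base_linear  # frozen base projection
        self.lora_down = nn.Linear(base_linear.in_features, rank, bias=False)
        self.lora_up = nn.Linear(rank, base_linear.out_features, bias=False)
        self.scale = scale
        nn.init.zeros_(self.lora_up.weight)  # standard LoRA init: start as a no-op

    def forward(self, x: torch.Tensor, mask: Optional[torch.Tensor] = None) -> torch.Tensor:
        out = self.base(x)
        delta = self.lora_up(self.lora_down(x)) * self.scale
        if mask is not None:
            # mask broadcasts against the delta, e.g. shape (batch, tokens, 1)
            delta = delta * mask
        return out + delta
```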
-
Thanks for opening the discussion. We rely on PEFT for this, so tagging @BenjaminBossan from the PEFT team.
-
Same issue. How can we customize the LoRA attention processor with PEFT?
To fine-tune a customized LoRA attention processor successfully, should I just avoid using PEFT temporarily?
-
I could look into this. Can you give me some pointers about attention masking?
-
Yeah sure, most of the logic would be the same as what we're doing with IP Adapter attention masking in this PR #6847, so the code to generate the latent mask should be the same.

Before the switch to PEFT we could have done the same here, as @cjfcsjt commented. So basically, what we needed to do before PEFT was to match the shapes of the binary mask tensors to the outputs of the LoRA layers. The Automatic1111 extension's code also shows how it does this, if that helps.

Ideally we should pass a mask for each LoRA; the easiest method IMO would be when we set the adapters in the pipeline. This would make it compatible with the new merging methods for LoRAs discussed here: #6892
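To make the mask-preparation step concrete, here is a small sketch, similar in spirit to the IP Adapter masking PR (the helper name is mine, not a diffusers API): downscale a binary image mask to the latent resolution a given attention layer works at, then flatten it so it broadcasts against hidden states of shape (batch, tokens, channels).

```python
# Hypothetical helper (illustrative naming, not a diffusers API): resize a
# binary image mask to the latent resolution of an attention layer and flatten
# it so it broadcasts against hidden states of shape (batch, tokens, channels).
import torch
import torch.nn.functional as F


def mask_to_latent_tokens(mask: torch.Tensor, latent_h: int, latent_w: int) -> torch.Tensor:
    """mask: binary tensor of shape (batch, 1, H, W) in image space."""
    # nearest-neighbour resizing keeps the mask binary
    small = F.interpolate(mask.float(), size=(latent_h, latent_w), mode="nearest")
    # (batch, 1, h, w) -> (batch, h * w, 1), matching the flattened latent tokens
    return small.flatten(2).transpose(1, 2)
```

The resulting per-token mask could then be multiplied with the corresponding LoRA layers' contributions for whichever adapter it belongs to.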
I don't think I have the full picture yet, but this is what I get: the idea is to supply a mask with the same shape as the LoRA adapter's output (and hence as the underlying base layer's output), which is simply multiplied element-wise with the output at the very end of `forward`.

Supplying such a mask is currently not supported by PEFT. To support this, I could imagine:

- Adding the mask as an extra argument to `forward`. I'm not a huge fan, since this is very specific to this use case.
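For what it's worth, here is a minimal sketch of how the element-wise masking could be applied from the outside, without changing `forward`'s signature: register a forward hook on the sub-module that produces the LoRA delta. It assumes PEFT's LoRA layers expose a per-adapter `lora_B` projection (worth double-checking against the PEFT version in use), and the helper name is made up.

```python
# Rough sketch, not PEFT API: gate one adapter's LoRA delta with a per-token
# mask by hooking the per-adapter lora_B projection, whose output is the
# low-rank update (before the scalar scaling is applied).
import torch


def mask_lora_adapter(lora_layer, adapter_name: str, mask: torch.Tensor):
    """mask should broadcast against the delta, e.g. shape (batch, tokens, 1)."""
    lora_b = lora_layer.lora_B[adapter_name]  # assumed PEFT LoRA layer layout

    def hook(module, args, output):
        # returning a value from a forward hook replaces the module's output
        return output * mask.to(output.dtype)

    # keep the handle so the masking can be undone with handle.remove()
    return lora_b.register_forward_hook(hook)
```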