PEFT LoRA Attention Masking #6919
-
In my quest to control all parts of the generation, and given the new discussion about LoRA merging, I wanted to test whether attention masking can be applied to each LoRA, since this would interact with the new merging mechanism. I don't know much about how PEFT works internally and couldn't find any information on this, as it is generally used more for LLMs. Is there a straightforward way to use a binary mask with the PEFT LoRAs in diffusers, the way IP Adapters let us do it with the IPAttentionProcessor? There's an extension for Automatic1111, and the Kohya repo also has a script for it, but neither uses PEFT. I think I might have to do it in the linear LoRA layer; however, I was hoping to avoid that, since it would mean customizing another library besides diffusers. @sayakpaul @patrickvonplaten, could you guide me with this or perhaps ping someone from PEFT who can? If this implies a lot of code refactoring, I can discard the idea and move on.
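For illustration, this is roughly what doing it in the linear LoRA layer could look like. A minimal sketch in plain PyTorch with a made-up MaskedLoRALinear class (not the actual diffusers or PEFT layer): the low-rank update is gated by a binary mask before it is added to the frozen base layer's output.

```python
# Rough sketch, not diffusers/PEFT code: a LoRA-style linear layer whose
# low-rank update is gated by a binary mask before being added to the
# frozen base layer's output.
from typing import Optional

import torch
import torch.nn as nn


class MaskedLoRALinear(nn.Module):
    def __init__(self, base_linear: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base_linear  # frozen base projection
        self.lora_down = nn.Linear(base_linear.in_features, rank, bias=False)
        self.lora_up = nn.Linear(rank, base_linear.out_features, bias=False)
        self.scale = scale
        nn.init.zeros_(self.lora_up.weight)  # standard LoRA init: start as a no-op

    def forward(self, x: torch.Tensor, mask: Optional[torch.Tensor] = None) -> torch.Tensor:
        out = self.base(x)
        delta = self.lora_up(self.lora_down(x)) * self.scale
        if mask is not None:
            # mask broadcasts against the delta, e.g. shape (batch, tokens, 1)
            delta = delta * mask
        return out + delta
```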
-
Thanks for opening the discussion. We rely on PEFT for this, so tagging @BenjaminBossan from the PEFT team.
-
Same issue. How can we customize the LoRA attention processor with PEFT?
To fine-tune a customized LoRA attention processor successfully, should I just avoid using PEFT temporarily?
-
I could look into this. Can you give me some pointers about attention masking?
-
Yeah sure, most of the logic would be the same as what we're doing with IP Adapter attention masking in this PR #6847, so the code to generate the latent mask should be the same.

Before the switch to PEFT we could have done the same here, as @cjfcsjt commented. So basically, what we needed to do before PEFT was to match the shapes of the binary mask tensors to the outputs of the LoRA layers. The Automatic1111 extension's code also shows how it does this, if that helps.

Ideally we should pass a mask for each LoRA; the easiest method IMO would be when we set the adapters in the pipeline. This would make it compatible with the new merging methods for LoRAs discussed here: #6892
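To make the mask-preparation step concrete, here is a small sketch, similar in spirit to the IP Adapter masking PR (the helper name is mine, not a diffusers API): downscale a binary image mask to the latent resolution a given attention layer works at, then flatten it so it broadcasts against hidden states of shape (batch, tokens, channels).

```python
# Hypothetical helper (illustrative naming, not a diffusers API): resize a
# binary image mask to the latent resolution of an attention layer and flatten
# it so it broadcasts against hidden states of shape (batch, tokens, channels).
import torch
import torch.nn.functional as F


def mask_to_latent_tokens(mask: torch.Tensor, latent_h: int, latent_w: int) -> torch.Tensor:
    """mask: binary tensor of shape (batch, 1, H, W) in image space."""
    # nearest-neighbour resizing keeps the mask binary
    small = F.interpolate(mask.float(), size=(latent_h, latent_w), mode="nearest")
    # (batch, 1, h, w) -> (batch, h * w, 1), matching the flattened latent tokens
    return small.flatten(2).transpose(1, 2)
```

The resulting per-token mask could then be multiplied with the corresponding LoRA layers' contributions for whichever adapter it belongs to.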
I don't think I have the full picture yet, but this is what I get: the idea is to supply a mask with the same shape as the LoRA adapter's output (and hence as the underlying base layer's output), which is simply multiplied element-wise with the output at the very end of `forward`.

Supplying such a mask is currently not supported by PEFT. To support this, I could imagine:

- Adding the mask as an extra argument to `forward`. I'm not a huge fan, since this is very specific to this use case.
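For what it's worth, here is a minimal sketch of how the element-wise masking could be applied from the outside, without changing `forward`'s signature: register a forward hook on the sub-module that produces the LoRA delta. It assumes PEFT's LoRA layers expose a per-adapter `lora_B` projection (worth double-checking against the PEFT version in use), and the helper name is made up.

```python
# Rough sketch, not PEFT API: gate one adapter's LoRA delta with a per-token
# mask by hooking the per-adapter lora_B projection, whose output is the
# low-rank update (before the scalar scaling is applied).
import torch


def mask_lora_adapter(lora_layer, adapter_name: str, mask: torch.Tensor):
    """mask should broadcast against the delta, e.g. shape (batch, tokens, 1)."""
    lora_b = lora_layer.lora_B[adapter_name]  # assumed PEFT LoRA layer layout

    def hook(module, args, output):
        # returning a value from a forward hook replaces the module's output
        return output * mask.to(output.dtype)

    # keep the handle so the masking can be undone with handle.remove()
    return lora_b.register_forward_hook(hook)
```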