Why set_lora_device doesn't work #9913

Open
West2022 opened this issue Nov 12, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@West2022

Describe the bug

When I load several LoRAs and move each one to the CPU with set_lora_device(), GPU memory keeps growing, from 20G to 25G. The function does not seem to work.

Reproduction

for key in lora_list:
    weight_name = key + ".safetensors"
    pipe.load_lora_weights(lora_path, weight_name=weight_name, adapter_name=key, local_files_only=True)
    adapters = pipe.get_list_adapters()
    print(adapters)
    pipe.set_lora_device([key], torch.device('cpu'))

Logs

No response

System Info

V100 32G
diffusers 0.32.0.dev0
torch 2.0.1+cu118
peft 0.12.0

Who can help?

No response

West2022 added the bug label on Nov 12, 2024
@sayakpaul
Member

The reproduction seems very incomplete. Can you please provide a fuller reproduction?

Also what versions of diffusers and peft are you using?

@West2022
Author

West2022 commented Nov 13, 2024

> The reproduction seems very incomplete. Can you please provide a fuller reproduction?
>
> Also what versions of diffusers and peft are you using?

I need to load multiple LoRAs and switch between them. Each time, I move the LoRA in use onto the GPU with set_lora_device, while the unused ones stay on the CPU. During initialization, each LoRA is moved to the CPU right after load_lora_weights, as shown in the code:

lora_list = ['aaa', 'bbb', 'ccc', 'ddd', 'eee']
for key in lora_list:
    weight_name = key + ".safetensors"
    pipe.load_lora_weights(lora_path, weight_name=weight_name, adapter_name=key, local_files_only=True)
    adapters = pipe.get_list_adapters()
    print(adapters)
    pipe.set_lora_device([key], torch.device('cpu'))

This does save a lot of GPU memory, but GPU memory still grows slowly. My understanding is that after calling pipe.set_lora_device([adapter], 'cpu'), GPU VRAM should not grow.

before:
19911MiB / 32510MiB
after:
21279MiB / 32510MiB
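
One way to narrow this down, offered as a sketch rather than a diagnosis: the figures above look like nvidia-smi readings, which also count memory that PyTorch's caching allocator has reserved but is not currently using. The hypothetical helper `report_cuda_memory` below (not part of diffusers) clears the cache and prints both the allocated and the reserved amounts, so the two can be compared before and after each set_lora_device call.

```python
import gc


def report_cuda_memory(tag):
    """Print allocated vs. reserved CUDA memory in MiB; return None without CUDA."""
    try:
        import torch
    except ImportError:
        return None  # torch is not installed in this environment
    if not torch.cuda.is_available():
        return None  # nothing to measure on a CPU-only machine

    gc.collect()               # drop unreachable Python objects that still hold tensors
    torch.cuda.empty_cache()   # return cached, unused blocks to the driver

    allocated = torch.cuda.memory_allocated() / 2**20  # memory held by live tensors
    reserved = torch.cuda.memory_reserved() / 2**20    # memory held by the allocator
    print(f"[{tag}] allocated={allocated:.0f} MiB, reserved={reserved:.0f} MiB")
    return allocated, reserved
```

If the allocated figure stays flat across loads while the nvidia-smi figure grows, the growth is likely cached or fragmented memory rather than LoRA weights left on the GPU.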

diffusers and peft version:
diffusers 0.32.0.dev0
torch 2.0.1+cu118
peft 0.12.0
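
The switching scheme described at the start of this comment could be wrapped in a small helper. This is a minimal sketch, not code from the issue: `switch_lora` and its parameter names are hypothetical, and it assumes `pipe` exposes `set_lora_device` and `set_adapters` as diffusers pipelines do.

```python
def switch_lora(pipe, active, all_adapters, gpu="cuda", cpu="cpu"):
    """Park every adapter except `active` on the CPU, then enable `active` on the GPU."""
    inactive = [name for name in all_adapters if name != active]
    if inactive:
        # Move the unused adapters off the GPU first to keep peak memory low.
        pipe.set_lora_device(inactive, cpu)
    pipe.set_lora_device([active], gpu)
    # Make `active` the only adapter applied during inference.
    pipe.set_adapters([active])
    return inactive
```

For example, `switch_lora(pipe, "bbb", lora_list)` would leave only 'bbb' on the GPU.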

@West2022
Author

West2022 commented Nov 13, 2024

> The reproduction seems very incomplete. Can you please provide a fuller reproduction?
>
> Also what versions of diffusers and peft are you using?

There is another issue. If the loaded LoRAs differ, with some containing text_encoder weights and others containing only unet weights, set_lora_device() raises a KeyError:

for component in self._lora_loadable_modules:
    model = getattr(self, component, None)
    if model is not None:
        for module in model.modules():
            if isinstance(module, BaseTunerLayer):
                for adapter_name in adapter_names:
                    module.lora_A[adapter_name].to(device)
                    module.lora_B[adapter_name].to(device)
                    # this is a param, not a module, so device placement is not in-place -> re-assign
                    if hasattr(module, "lora_magnitude_vector") and module.lora_magnitude_vector is not None:
                        if adapter_name in module.lora_magnitude_vector:
                            module.lora_magnitude_vector[adapter_name] = module.lora_magnitude_vector[
                                adapter_name
                            ].to(device)

The code needs to check whether the key is present in module.lora_A and module.lora_B:

if adapter_name in module.lora_A:
    module.lora_A[adapter_name].to(device)
if adapter_name in module.lora_B:
    module.lora_B[adapter_name].to(device)

@sayakpaul
Member

Can you try with a more recent version of PyTorch and peft?

> If the loaded LoRAs differ, with some containing text_encoder weights and others containing only unet weights, set_lora_device() raises a KeyError

Yeah this seems right. This also seems like a different issue. Would you maybe like to open a PR for this?

Cc: @BenjaminBossan

@BenjaminBossan
Member

Without a reproducer, it's a bit hard to check, but I think this change should solve the issue:

from peft.tuners.tuners_utils import BaseTunerLayer


def set_lora_device(model, adapter_names, device):
    # copied from LoraBaseMixin.set_lora_device
    for module in model.modules():
        if isinstance(module, BaseTunerLayer):
            for adapter_name in adapter_names:
                # ADDED next 2 lines:
                if (adapter_name not in module.lora_A) and (adapter_name not in module.lora_embedding_A):
                    continue
                module.lora_A[adapter_name].to(device)
                module.lora_B[adapter_name].to(device)
                # this is a param, not a module, so device placement is not in-place -> re-assign
                if hasattr(module, "lora_magnitude_vector") and module.lora_magnitude_vector is not None:
                    if adapter_name in module.lora_magnitude_vector:
                        module.lora_magnitude_vector[adapter_name] = module.lora_magnitude_vector[adapter_name].to(
                            device
                        )
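
The effect of the membership guard can be illustrated without a pipeline. The sketch below is a toy model, not diffusers or PEFT code: `FakeLoraLayer` and `move_adapters` are hypothetical names, plain dicts stand in for the `lora_A` and `lora_B` containers, and a device string stands in for the weights. It checks only `lora_A`, on the assumption that `lora_A` and `lora_B` are always populated together.

```python
class FakeLoraLayer:
    """Hypothetical stand-in for a LoRA layer; dicts replace the real containers."""

    def __init__(self, adapters):
        # Each adapter starts "on the GPU"; the value is just a device label.
        self.lora_A = {name: "cuda" for name in adapters}
        self.lora_B = {name: "cuda" for name in adapters}


def move_adapters(layers, adapter_names, device):
    """Relabel the named adapters, skipping layers that do not hold them."""
    moved = 0
    for layer in layers:
        for name in adapter_names:
            if name not in layer.lora_A:
                # Without this guard, the lookup below raises KeyError for
                # adapters that were loaded into only some components.
                continue
            layer.lora_A[name] = device
            layer.lora_B[name] = device
            moved += 1
    return moved


# 'bbb' exists in the unet-like layer but not in the text-encoder-like layer.
unet_layer = FakeLoraLayer(["aaa", "bbb"])
text_encoder_layer = FakeLoraLayer(["aaa"])
moved = move_adapters([unet_layer, text_encoder_layer], ["bbb"], "cpu")
```

Here only the unet-like layer is touched, and the layer that never received 'bbb' is skipped instead of raising.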
