[PAG] - Adaptive Scale bug #9930

Open
elismasilva opened this issue Nov 15, 2024 · 0 comments
Labels
bug Something isn't working

Comments

elismasilva commented Nov 15, 2024

Describe the bug

What is the purpose of the PAG adaptive scale? I was passing a value like 5.0 to it while passing 3.0 as the PAG scale, and according to the implemented code this produces a negative number, so the scale is clamped to 0 and PAG is not applied. I could not find an explanation of this parameter in the documentation.
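
For reference, the relevant helper in the PAG pipelines (paraphrased from memory from diffusers' PAG utilities; exact details may differ by version) looks roughly like this:

def _get_pag_scale(self, t):
    if self.do_pag_adaptive_scaling:
        # the adaptive term grows as t decreases, so the result quickly
        # goes negative and is clamped to 0, disabling PAG
        signal_scale = self.pag_scale - self.pag_adaptive_scale * (1000 - t)
        if signal_scale < 0:
            signal_scale = 0
        return signal_scale
    else:
        return self.pag_scale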

I then found this in the ComfyUI documentation: "This dampening factor reduces the effect of PAG during the later stages of the denoising process, speeding up the overall sampling. A value of 0.0 means no penalty, while 1.0 completely removes PAG."

So I realized I had been passing values above 1.0; however, even a value as small as 0.2 is enough to prevent PAG from being applied. I suspect this is a problem.

If you run the reproduction code below, you will see that in the third image, where I pass 0.2 as pag_adaptive_scale, PAG is effectively invalidated from the very first generation steps.
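
To make this concrete, assuming the clamped formula above: with pag_scale=3.0 and pag_adaptive_scale=0.2, the result stays positive only while 1000 - t < 3.0 / 0.2 = 15, i.e. only for timesteps above 985. In a 25-step run that covers at most the very first denoising step, so PAG is effectively gone from the start.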

I propose a possible solution:

Starting from this line:

if self.do_pag_adaptive_scaling:

we can change the block to:

if self.do_pag_adaptive_scaling:
    signal_scale = self.pag_scale
    if t / self.num_timesteps > self.pag_adaptive_scale:
        signal_scale = 0
    return signal_scale
else:
    return self.pag_scale

And inside every PAG pipeline, we need to replace the "t" variable with the "i" variable in the argument passed to this function, so that it receives the index of the current step:

noise_pred, self.do_classifier_free_guidance, self.guidance_scale, t, True
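In the SDXL PAG pipeline's denoising loop, the changed call would then look roughly like this (a sketch; the loop structure and argument list are assumed from the existing pipeline code):

for i, t in enumerate(timesteps):
    ...
    # pass the step index i instead of the timestep t
    noise_pred = self._apply_perturbed_attention_guidance(
        noise_pred, self.do_classifier_free_guidance, self.guidance_scale, i, True
    )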

With this change, the logic is no longer "the higher the adaptive scale, the sooner PAG is disabled", but quite the opposite: the adaptive scale tells you exactly at what point in the process PAG is switched off. With an adaptive scale of 0.5 in a 30-step generation, PAG is disabled from step 15 onwards. The applied scale stays constant until the cutoff rather than decaying over time.
I don't know if this was the original purpose of this parameter, but it works well for me.
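
As a quick sanity check, here is a minimal standalone sketch of the proposed schedule (hypothetical variable names, no diffusers required):

num_timesteps = 30
pag_scale = 3.0
pag_adaptive_scale = 0.5

schedule = []
for i in range(num_timesteps):
    signal_scale = pag_scale
    if i / num_timesteps > pag_adaptive_scale:
        signal_scale = 0  # hard cutoff once past the chosen fraction of steps
    schedule.append(signal_scale)

print(schedule)  # 3.0 for steps 0-15, then 0 for the remaining steps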

Reproduction

from diffusers import AutoPipelineForText2Image
import torch

device = "cuda"

pipeline_sdxl = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    enable_pag=True,
    pag_applied_layers=["mid"],
    torch_dtype=torch.float16
).to(device)

pipeline = AutoPipelineForText2Image.from_pipe(pipeline_sdxl, enable_pag=True).to(device)
pipeline.enable_vae_tiling() 
pipeline.enable_model_cpu_offload()

prompt = "an insect robot preparing a delicious meal, anime style"

# Three runs: (1) no PAG, (2) PAG scale 3.0 without adaptive scaling,
# (3) the same PAG scale but with pag_adaptive_scale=0.2
for i, pag_scale in enumerate([0.0, 3.0, 3.0]):
    generator = torch.Generator(device="cpu").manual_seed(0)
    images = pipeline(
        prompt=prompt,
        num_inference_steps=25,
        guidance_scale=7.0,
        generator=generator,
        pag_scale=pag_scale,
        pag_adaptive_scale=0.0 if i < 2 else 0.2  # adaptive scaling only on the third run
    ).images[0]
    images.save(f"./data/result_pag_{i+1}.png")

Logs

N/A

System Info

  • 🤗 Diffusers version: 0.32.0.dev0
  • Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Running on Google Colab?: No
  • Python version: 3.10.11
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Flax version (CPU?/GPU?/TPU?): 0.10.1 (cpu)
  • Jax version: 0.4.35
  • JaxLib version: 0.4.35
  • Huggingface_hub version: 0.26.2
  • Transformers version: 4.46.2
  • Accelerate version: 1.1.1
  • PEFT version: 0.13.2
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: 0.0.27.post2
  • Accelerator: NVIDIA GeForce RTX 3060 Ti, 8192 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@yiyixuxu , @asomoza

elismasilva added the bug label Nov 15, 2024