Add sdxl controlnet reference community pipeline #9893

Open: wants to merge 8 commits into base branch main
86 changes: 84 additions & 2 deletions examples/community/README.md
@@ -2659,7 +2659,7 @@ Output Image
`prompt: 1 girl`

`reference_attn=True, reference_adain=True, num_inference_steps=20`
![Output_image](https://github.com/zideliu/diffusers/assets/34944964/743848da-a215-48f9-ae39-b5e2ae49fb13)
![Output_image](https://github.com/user-attachments/assets/4fd5bc0b-8df6-4581-9191-c07faaf10ce8)
Member

Why are we changing this?

Author

The URL https://github.com/zideliu/diffusers/assets/34944964/743848da-a215-48f9-ae39-b5e2ae49fb13 returns a 404. The fork of the diffusers repository that hosted it seems to have been removed by its author. Can you confirm?

Actually, I've updated the stable_diffusion_xl_reference.py file with the latest modifications from StableDiffusionXLPipeline on my side. If you agree, I propose to create a new PR:

  • Upgrading the StableDiffusionXLReferencePipeline to use the latest modifications from StableDiffusionXLPipeline (IP adapters...),
  • Replacing the human example of this updated StableDiffusionXLReferencePipeline with a non-human one and fixing the 404 URL at the same time.

Author

The other PR is #9938.

Member

Maybe you could open a PR to https://hf.co/datasets/huggingface/documentation-images and host the images there?

Reference Image
![reference_image](https://github.com/huggingface/diffusers/assets/34944964/449bdab6-e744-4fb2-9620-d4068d9a741b)
@@ -2681,6 +2681,88 @@ Output Image
`reference_attn=True, reference_adain=True, num_inference_steps=20`
![output_image](https://github.com/huggingface/diffusers/assets/34944964/9b2f1aca-886f-49c3-89ec-d2031c8e3670)

### Stable Diffusion XL ControlNet Reference

This pipeline uses Reference Control together with ControlNet. Refer to the [Stable Diffusion ControlNet Reference](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#stable-diffusion-controlnet-reference) and [Stable Diffusion XL Reference](https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#stable-diffusion-xl-reference) sections for more information.

```py
from diffusers import ControlNetModel, AutoencoderKL
from diffusers.schedulers import UniPCMultistepScheduler
from diffusers.utils import load_image
import numpy as np
import torch

import cv2
from PIL import Image

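# A relative import fails when this snippet runs as a standalone script; this form
# assumes stable_diffusion_xl_controlnet_reference.py is on the import path.
from stable_diffusion_xl_controlnet_reference import StableDiffusionXLControlNetReferencePipeline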

# download an image
canny_image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
).resize((1024, 1024))

ref_image = load_image(
    "https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png"
)

# initialize the models and pipeline
controlnet_conditioning_scale = 0.5  # recommended for good generalization
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetReferencePipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16
).to("cuda:0")

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# get canny image
image = np.array(canny_image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

# generate image
image = pipe(
    prompt="1girl",
    num_inference_steps=20,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    image=canny_image,
    ref_image=ref_image,
    reference_attn=False,
    reference_adain=True,
    style_fidelity=1.0,
    generator=torch.Generator("cuda").manual_seed(42)
).images[0]
```

Canny ControlNet Image

![canny_image](https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png)

Reference Image

![ref_image](https://hf.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png)

Output Image

`prompt: 1 girl`

`reference_attn=True, reference_adain=True, num_inference_steps=20, style_fidelity=1.0`

![Output_image](https://github.com/user-attachments/assets/b320f081-2421-4b96-9ef1-4685b77179ba)

`reference_attn=False, reference_adain=True, num_inference_steps=20, style_fidelity=1.0`

![Output_image](https://github.com/user-attachments/assets/05ee0816-6c4e-4f4b-81fc-71bb0082be1e)

`reference_attn=True, reference_adain=False, num_inference_steps=20, style_fidelity=1.0`

![Output_image](https://github.com/user-attachments/assets/cfb7426b-9e1b-4401-8e7e-609d67b84f8b)
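
The three outputs above differ only in `reference_attn` and `reference_adain`. A minimal sketch of how such a comparison could be scripted follows. `REFERENCE_VARIANTS`, `run_reference_variants`, and `make_generator` are hypothetical names introduced here, and the sketch assumes only that `pipe` accepts the keyword arguments used in the example and returns an object with an `.images` list.

```py
# Hypothetical helper, not part of diffusers or of this PR.
REFERENCE_VARIANTS = [
    {"reference_attn": True, "reference_adain": True},
    {"reference_attn": False, "reference_adain": True},
    {"reference_attn": True, "reference_adain": False},
]


def run_reference_variants(pipe, make_generator, **common_kwargs):
    """Call `pipe` once per variant and return a list of (variant, image) pairs.

    `make_generator` is a zero-argument callable that returns a freshly seeded
    generator, so every variant starts from the same seed.
    """
    results = []
    for variant in REFERENCE_VARIANTS:
        output = pipe(**common_kwargs, **variant, generator=make_generator())
        results.append((variant, output.images[0]))
    return results


# Usage with the objects from the example (requires a GPU and the model weights):
# results = run_reference_variants(
#     pipe,
#     lambda: torch.Generator("cuda").manual_seed(42),
#     prompt="1girl",
#     num_inference_steps=20,
#     controlnet_conditioning_scale=controlnet_conditioning_scale,
#     image=canny_image,
#     ref_image=ref_image,
#     style_fidelity=1.0,
# )
```

Creating a new generator per call keeps the seed identical across variants, so differences between the images come from the reference settings rather than from the initial noise.
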
Member

Can we avoid using human references in the examples and restrict ourselves to using non-human objects?

Author

Sure. I've pushed modifications to fix that.


### Stable diffusion fabric pipeline

FABRIC approach applicable to a wide range of popular diffusion models, which exploits
@@ -4692,4 +4774,4 @@ with torch.no_grad():
```

In the folder examples/pixart there is also a script that can be used to train new models.
Please check the script `train_controlnet_hf_diffusers.sh` on how to start the training.
Please check the script `train_controlnet_hf_diffusers.sh` on how to start the training.