
Using StableDiffusionControlNetImg2ImgPipeline with enable_vae_tiling(), the patch size seems fixed at 512 x 512; where should I set the relevant parameters? #9983

Closed
reaper19991110 opened this issue Nov 21, 2024 · 6 comments

@reaper19991110

pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
prompt = "a beautiful landscape photograph"
pipe.enable_vae_tiling()
@sayakpaul
Member

SD's default resolution is 512x512.

You should pass height and width when calling the pipeline.
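
For example, a minimal sketch (assuming pipe, image, and canny_image are already set up as in the full example in the next comment; 768x768 is just an illustrative size):

result = pipe(
    "a beautiful landscape photograph",
    image=image,
    control_image=canny_image,
    height=768,  # requested output height in pixels (should be divisible by 8)
    width=768,   # requested output width in pixels (should be divisible by 8)
).images[0]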

@SahilCarterr
Contributor

You can adjust the code as follows in order to use height and width:

from diffusers import StableDiffusionControlNetImg2ImgPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import numpy as np
import torch

import cv2
from PIL import Image

# download an image
image = load_image(
    "https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
np_image = np.array(image)

# get canny image
np_image = cv2.Canny(np_image, 100, 200)
np_image = np_image[:, :, None]
np_image = np.concatenate([np_image, np_image, np_image], axis=2)
canny_image = Image.fromarray(np_image)

# load control net and stable diffusion v1-5
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "Lykon/dreamshaper-8", controlnet=controlnet, torch_dtype=torch.float16
)

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

# generate image
generator = torch.manual_seed(0)
image = pipe(
    "futuristic-looking woman",
    generator=generator,
    image=image,
    height=640,
    width=640,
    control_image=canny_image,
).images[0]
image

Output: [attached result image: sd_control_net_512issue]

@reaper19991110

@hlky
Collaborator

hlky commented Nov 22, 2024

The patch size for tiled VAE is not directly configurable; the value is calculated from the VAE's config. It could be beneficial for low-end GPU users to allow the tile size to be configurable.

def tiled_decode(self, z: torch.Tensor, return_dict: bool = True) -> Union[DecoderOutput, torch.Tensor]:

self.tile_sample_min_size = self.config.sample_size
sample_size = (
    self.config.sample_size[0]
    if isinstance(self.config.sample_size, (list, tuple))
    else self.config.sample_size
)
self.tile_latent_min_size = int(sample_size / (2 ** (len(self.config.block_out_channels) - 1)))
self.tile_overlap_factor = 0.25
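
Until that lands, one workaround (a sketch relying on the internal attributes shown above, not a public API, so they may change) is to override them on the loaded VAE after calling enable_vae_tiling():

# After pipe.enable_vae_tiling(), override the internal tile-size attributes.
# tile_sample_min_size is the patch size in pixel space; tile_latent_min_size is the
# corresponding latent-space size (pixels / 8 for the SD 1.x VAE,
# i.e. 2 ** (len(block_out_channels) - 1)).
pipe.vae.tile_sample_min_size = 1024
pipe.vae.tile_latent_min_size = 1024 // 8
pipe.vae.tile_overlap_factor = 0.25  # fraction of each tile blended with its neighbours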

@sayakpaul
Member

Don't want to overwhelm you with requests but if you want to take it up, you're more than welcome to :)

Otherwise, pinging @DN6 and @a-r-r-o-w.

@a-r-r-o-w
Member

Thanks for pinging! I have actually been planning to rewrite our VAE tiling implementations to make them all look similar and more easily configurable. Mochi's VAE tiling implementation would be the point of reference for the refactors. #9903 is the first PR in the series of changes, and next would be Allegro. For the image VAEs, we will have to introduce deprecation warnings, so that one will move a bit more slowly.

@reaper19991110
Author

I found that the parameter tile_latent_min_size in the code is the patch size for tiled VAE, and it is derived from sample_size, which is set to 512 in the config.json of the VAE being used. So directly modifying the parameters in config.json is the most straightforward way. @hlky
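
Rather than editing config.json on disk, the same effect can likely be achieved by overriding sample_size when loading the VAE, since the tile sizes are derived from config.sample_size in __init__ (an untested sketch; the config-kwarg override in from_pretrained is assumed here):

import torch
from diffusers import AutoencoderKL, ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

# sample_size is a config field, so passing it to from_pretrained should override the
# value stored in config.json; tile_sample_min_size / tile_latent_min_size are then
# derived from the overridden value.
vae = AutoencoderKL.from_pretrained(
    "Lykon/dreamshaper-8", subfolder="vae", sample_size=1024, torch_dtype=torch.float16
)
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "Lykon/dreamshaper-8", controlnet=controlnet, vae=vae, torch_dtype=torch.float16
)
pipe.enable_vae_tiling()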
