
DDIMScheduler (or DDPM) of StableDiffusion with 1000 num_inference_steps bugs out #10003

Open
HilaManor opened this issue Nov 23, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@HilaManor

Describe the bug

There's something going on with the set_timesteps offset parameters on Stable Diffusion (v1.4). The timesteps are set from 1→1000 instead of from 0→999, so the pipeline tries to index out of bounds.

Reproduction

import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler).to(device)
pipe("an image of a cat", num_inference_steps=1000)

gives
IndexError: index 1000 is out of bounds for dimension 0 with size 1000
because scheduler.timesteps gives: tensor([1000, 999, 998, 997, ..., 3, 2, 1], device='cuda:0')
instead of tensor([999, 998, 997, ..., 3, 2, 1, 0], device='cuda:0')
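For reference, the off-by-one can be reproduced with plain NumPy by mirroring DDIMScheduler's "leading" spacing; this is a sketch of the diffusers logic, assuming num_train_timesteps=1000 and the SD v1.4 config's steps_offset=1:

```python
import numpy as np

# Sketch of DDIMScheduler's "leading" timestep computation at 1000 steps
# (values assume num_train_timesteps=1000 and steps_offset=1 as in the
# SD v1.4 scheduler config).
num_train_timesteps = 1000
steps_offset = 1
num_inference_steps = 1000

step_ratio = num_train_timesteps // num_inference_steps  # == 1
timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64)
timesteps += steps_offset  # shifts the schedule from 999..0 to 1000..1

print(timesteps[0], timesteps[-1])  # 1000 1 -> alphas_cumprod[1000] is out of bounds
```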

Logs

No response

System Info


  • 🤗 Diffusers version: 0.31.0
  • Platform: Linux-5.15.0-119-generic-x86_64-with-glibc2.31
  • Running on Google Colab?: No
  • Python version: 3.12.7
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.26.2
  • Transformers version: 4.46.3
  • Accelerate version: 1.1.1
  • PEFT version: not installed
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: 0.0.28.post3
  • Accelerator: NVIDIA RTX A6000, 49140 MiB

Who can help?

@yiyixuxu @asomoza @sayakpaul @DN6

@HilaManor HilaManor added the bug Something isn't working label Nov 23, 2024
@SahilCarterr
Contributor

I think num_inference_steps=1000 doesn't make practical sense

  • Increasing the steps significantly increases the computation time without proportional quality improvement.
  • After a certain number of steps, additional steps won't improve the image significantly.
  • The DDIMScheduler is designed to work optimally with fewer steps (typically 20–150).
  • You might experiment with 100–150 steps if you're looking for marginal improvements, but 1000 is overkill.

Additionally, you can go through the tutorial here to improve the quality of SD outputs.

@hlky
Contributor

hlky commented Nov 23, 2024

The scheduler config has steps_offset=1 (compared to the default 0); timestep_spacing isn't specified in the config, and for DDIMScheduler the default is leading:

elif self.config.timestep_spacing == "leading":
    step_ratio = self.config.num_train_timesteps // self.num_inference_steps
    # creates integer timesteps by multiplying by ratio
    # casting to int to avoid issues when num_inference_step is power of 3
    timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].copy().astype(np.int64)
    timesteps += self.config.steps_offset

from diffusers import DDIMScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
scheduler.set_timesteps(1000)
scheduler.timesteps
>>> tensor([1000,  999, ...,   2,    1])

The step then fails when indexing alphas_cumprod with timestep=1000:

alpha_prod_t = self.alphas_cumprod[timestep]

timestep_spacing="leading" needs steps_offset=0 to get 1000 steps

from diffusers import DDIMScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler", steps_offset=0)
scheduler.set_timesteps(1000)
scheduler.timesteps
>>> tensor([999, 998, ...,   1,   0])

Most schedulers default to timestep_spacing="linspace", which does work with 1000 steps and gives the same 1000 timesteps:

from diffusers import DDIMScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler", timestep_spacing="linspace")
scheduler.set_timesteps(1000)
scheduler.timesteps
>>> tensor([999, 998, ...,   1,   0])

DDIM explicitly doesn't support > 1000 steps; other schedulers don't have this check:

if num_inference_steps > self.config.num_train_timesteps:
    raise ValueError(
        f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:"
        f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle"
        f" maximal {self.config.num_train_timesteps} timesteps."
    )

EulerDiscreteScheduler with timestep_spacing="linspace" for example works with > 1000 steps

scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler", timestep_spacing="linspace")
scheduler.set_timesteps(1111)
scheduler.timesteps
>>> tensor([9.9900e+02, 9.9810e+02, 9.9720e+02,  ..., 1.8000e+00, 9.0000e-01,
        0.0000e+00])

Inference tested for a few steps:

from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "CompVis/stable-diffusion-v1-4"
scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler", timestep_spacing="linspace")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler)
pipe("an image of a cat", num_inference_steps=1111)

There is a difference in casting for linspace and trailing:
Euler

if self.config.timestep_spacing == "linspace":
    timesteps = np.linspace(
        0, self.config.num_train_timesteps - 1, num_inference_steps, dtype=np.float32
    )[::-1].copy()

DDIM
timesteps = (
    np.linspace(0, self.config.num_train_timesteps - 1, num_inference_steps)
    .round()[::-1]
    .copy()
    .astype(np.int64)
)

Euler
elif self.config.timestep_spacing == "trailing":
    step_ratio = self.config.num_train_timesteps / self.num_inference_steps
    # creates integer timesteps by multiplying by ratio
    # casting to int to avoid issues when num_inference_step is power of 3
    timesteps = (
        (np.arange(self.config.num_train_timesteps, 0, -step_ratio)).round().copy().astype(np.float32)
    )
    timesteps -= 1

DDIM
elif self.config.timestep_spacing == "trailing":
    step_ratio = self.config.num_train_timesteps / self.num_inference_steps
    # creates integer timesteps by multiplying by ratio
    # casting to int to avoid issues when num_inference_step is power of 3
    timesteps = np.round(np.arange(self.config.num_train_timesteps, 0, -step_ratio)).astype(np.int64)
    timesteps -= 1
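A quick side-by-side sketch of the two "trailing" variants excerpted above (standalone NumPy, assuming num_train_timesteps=1000 and 10 inference steps) shows they produce the same values and differ only in dtype:

```python
import numpy as np

num_train_timesteps = 1000
num_inference_steps = 10
step_ratio = num_train_timesteps / num_inference_steps

# Euler variant: rounds, then casts to float32
euler = (np.arange(num_train_timesteps, 0, -step_ratio)).round().copy().astype(np.float32)
euler -= 1

# DDIM variant: rounds, then casts to int64
ddim = np.round(np.arange(num_train_timesteps, 0, -step_ratio)).astype(np.int64)
ddim -= 1

print(euler.dtype, ddim.dtype)      # float32 int64
print(np.array_equal(euler, ddim))  # True -- same values, different dtype
```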

leading appears not to work with > 1000 steps on other schedulers either.

@HilaManor
Author

HilaManor commented Nov 23, 2024

@hlky Thanks for the detailed reply! However, I'm still a bit confused about what to actually do.
Trying to load StableDiffusionPipeline from_pretrained with a scheduler loaded with steps_offset=0 leads to a deprecation warning, and the pipeline then overrides the value back to steps_offset=1, so the bug remains:

if hasattr(scheduler.config, "steps_offset") and scheduler.config.steps_offset != 1:
    deprecation_message = (
        f"The configuration file of this scheduler: {scheduler} is outdated. `steps_offset`"
        f" should be set to 1 instead of {scheduler.config.steps_offset}. Please make sure "
        "to update the config accordingly as leaving `steps_offset` might led to incorrect results"
        " in future versions. If you have downloaded this checkpoint from the Hugging Face Hub,"
        " it would be very nice if you could open a Pull request for the `scheduler/scheduler_config.json`"
        " file"
    )
    deprecate("steps_offset!=1", "1.0.0", deprecation_message, standard_warn=False)
    new_config = dict(scheduler.config)
    new_config["steps_offset"] = 1
    scheduler._internal_dict = FrozenDict(new_config)

@SahilCarterr Thank you for your comment; however, num_inference_steps=1000 is practical for applications other than simple text-to-image generation. Additionally, DDPMScheduler (which is designed for 1000 steps) has this bug as well.
In any case, this is still unexpected behavior that should be accounted for.

@hlky
Contributor

hlky commented Nov 24, 2024

@HilaManor Thanks for checking this! You're right, it looks like steps_offset=0 won't work. Could you test timestep_spacing="linspace" or timestep_spacing="trailing" with your use case? Both create the same tensor([999, 998, 997, ..., 3, 2, 1, 0], device='cuda:0') that you're expecting at 1000 steps. Using min(timestep, self.config.num_train_timesteps - 1) to index alphas_cumprod could work, or the check might need to be >= instead of >, at least for timestep_spacing="leading".
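A minimal sketch of the clamping idea, not an actual diffusers patch: `alphas_cumprod` and `num_train_timesteps` here are illustrative stand-ins for the scheduler's attributes, and the helper name `safe_alpha_prod` is hypothetical.

```python
import torch

num_train_timesteps = 1000
# Illustrative values only; the real scheduler computes these from its betas.
alphas_cumprod = torch.linspace(0.9999, 0.0047, num_train_timesteps)

def safe_alpha_prod(timestep: int) -> torch.Tensor:
    # Clamp so t == num_train_timesteps (produced by steps_offset=1 with
    # "leading" spacing) indexes the last entry instead of going out of bounds.
    idx = min(timestep, num_train_timesteps - 1)
    return alphas_cumprod[idx]

safe_alpha_prod(1000)  # indexes position 999 instead of raising IndexError
```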

@sayakpaul
Member

Cc: @yiyixuxu here.

@HilaManor
Author

@hlky Thanks again for your comment. timestep_spacing="trailing" does solve this, but only for t=1000; in my case I also sometimes need fewer steps, which then causes differences from leading:

leading for 10 steps: tensor([901, 801, 701, 601, 501, 401, 301, 201, 101,   1], device='cuda:0')
trailing for 10 steps: tensor([999, 899, 799, 699, 599, 499, 399, 299, 199,  99], device='cuda:0')
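The two schedules above can be recomputed with plain NumPy (assuming num_train_timesteps=1000 and steps_offset=1, as in the SD v1.4 config):

```python
import numpy as np

num_train_timesteps = 1000
steps_offset = 1
n = 10

# "leading": integer ratio, then offset
step_ratio = num_train_timesteps // n
leading = (np.arange(0, n) * step_ratio).round()[::-1].copy().astype(np.int64)
leading += steps_offset
print(leading)   # [901 801 701 601 501 401 301 201 101   1]

# "trailing": float ratio counting down from 1000, then -1
step_ratio = num_train_timesteps / n
trailing = np.round(np.arange(num_train_timesteps, 0, -step_ratio)).astype(np.int64)
trailing -= 1
print(trailing)  # [999 899 799 699 599 499 399 299 199  99]
```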

An interesting related question: is it expected for the last timestep to be 1 when setting fewer steps? I know the scheduler calculates whether noise should be added in the last step (and it doesn't in this case), but theoretically the t=1 denoiser shouldn't be expected to output x_0 directly.

@HilaManor
Author

HilaManor commented Nov 27, 2024

I think the following bug is related:
in DPMSolverMultistepScheduler with the same pretrained config, calling set_timesteps(1000) creates a sigma array in which two adjacent entries in the middle are identical, which causes NaNs from timestep 500 onwards:
self.sigmas = [..., 1.6401, 1.6346, 1.6291, 1.6237, 1.6183, 1.6183, 1.6129, 1.6075, 1.6022, 1.5968, ...]

Using "leading" leads to a constant tensor of 1000 values of 0.0413 with a trailing 0; "trailing" I think may be close to the expected behaviour.

The reason is the timesteps vector: in leading it's all 0s, and in linspace the middle timestep is repeated.
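A guess at where the repeated timestep comes from, assuming DPMSolverMultistepScheduler's "linspace" spacing rounds np.linspace(0, num_train_timesteps - 1, num_inference_steps + 1) (as in diffusers 0.31): 1001 points rounded into 1000 integer slots must collide somewhere, and the collision lands near the middle of the schedule, producing the duplicate sigma.

```python
import numpy as np

num_train_timesteps = 1000
num_inference_steps = 1000

# Sketch of the "linspace" branch: 1001 evenly spaced points rounded to
# integers in [0, 999] -- by pigeonhole at least two round to the same value.
timesteps = (
    np.linspace(0, num_train_timesteps - 1, num_inference_steps + 1)
    .round()[::-1][:-1]
    .astype(np.int64)
)

dupes = timesteps[:-1][np.diff(timesteps) == 0]
print(dupes)  # the repeated timestep, near the middle of the schedule
```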
