-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add beta
, exponential
and karras
sigmas to FlowMatchEulerDiscreteScheduler
#10001
base: main
Are you sure you want to change the base?
Conversation
634cb90
to
f41b2f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the initiative!! @hlky
I left some comment as I was going through the the PR; however, now after going through all the code changes, I actually think it becomes pretty clear to me these two things do not naturally combine: they share very little common logic together, and most of the code should be moved inside a big if use_flow_match: ... else: ...
block if they have not already in there
I think we should:
- add the karras/beta/exponential to flow matching scheduler
- in a future PR, we should think after separating the sigmas schedule as its own abstraction, like you suggested to me:)
let me know what you think!
max_shift: Optional[float] = 1.15, | ||
base_image_seq_len: Optional[int] = 256, | ||
max_image_seq_len: Optional[int] = 4096, | ||
invert_sigmas: bool = False, | ||
): | ||
if self.config.use_beta_sigmas and not is_scipy_available(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this whole section to calculation betas
-> alphas_cumprod
is not relevant to flow matching, no? since it's only used to calculate sigmas when not use_flow_match
; but it isn't clear from the code because is it outside of the if use_flow_match ... else ...
block
sigmas = shift * sigmas / (1 + (shift - 1) * sigmas) | ||
else: | ||
sigmas = (((1 - self.alphas_cumprod) / self.alphas_cumprod) ** 0.5).flip(0) | ||
|
||
# setable values | ||
self.num_inference_steps = None | ||
|
||
# TODO: Support the full EDM scalings for all prediction types and timestep types | ||
if timestep_type == "continuous" and prediction_type == "v_prediction": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is v_prediction
relevant to flow matching? can people configure use_flow_match
+ v_prediction
? based on the code, it is possible. But does this make sense?
|
||
else: | ||
if self.config.use_karras_sigmas: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have this argument sigmas
is to accept a custom sigmas
schedule from user, i.e. they should either pass a custom sigmas
or choose to use one of the pre-set sigma schedules (e.g. karras, beta, exponential) ;
so we should not have this logic here
if sigma is not None and self.config.use_flow_match:
...
if self.config.use_karras_sigmas:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this was due to how pipelines are currently set up and I wanted to test the noise schedules without modifying the pipelines.
Some of the logic is the same as existing FlowMatchEuler which applies shifting etc to the supplied sigmas.
Although I'm not sure the context of why Flux pipelines pass sigmas
and why those are calculated differently to the sigmas is None
path. Karras etc might actually work better when applied before the scaling so the future refactor will be really useful here, I'll run some tests to check that and I'll check pipelines modified to pass number of steps when using karras etc
Thanks for the review. This was mainly to see whether it could be combined, there are only a few key differences, it should be possible to refactor Euler in a way that FlowMatchEuler only overrides a few things instead, it would be great to use |
f41b2f1
to
3319bc5
Compare
beta
, exponential
and karras
sigmas to FlowMatchEulerDiscreteScheduler
3319bc5
to
39f634f
Compare
timesteps = np.linspace( | ||
self._sigma_to_t(self.sigma_max), self._sigma_to_t(self.sigma_min), num_inference_steps | ||
) | ||
|
||
sigmas = timesteps / self.config.num_train_timesteps | ||
else: | ||
num_inference_steps = len(sigmas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to note passing sigmas to Euler
vs FlowMatchEuler
is different
sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113, 0.0] |
num_inference_steps = len(timesteps) if timesteps is not None else len(sigmas) - 1 |
sigmas = np.linspace(1.0, 1 / num_inference_steps, num_inference_steps) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is a little bit different (even though I think it should be the same); that's why we created it in the pipeline for convenience (and had planned to move the logic to inside scheduler too, but just haven't done that)
The current logic of set_timesteps
(for sigma is None path) in FlowMatchEuler is implemented based on SD3( it was the first flow-matching model we have), and it is somehow a little bit different from all the flow-matching models following it; I think the main difference is it applied the shift twice, first time here
sigmas = shift * sigmas / (1 + (shift - 1) * sigmas) |
and then
self._sigma_to_t(self.sigma_max), self._sigma_to_t(self.sigma_min), num_inference_steps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, it is very clear the flux method is the "standard" way by now
We had planned to move the flux sigmas to the scheduler, too, but since we want to separate the noise scheduler logic way from the scheduler; maybe we don't have to spend time on that for now because it might be the opposite direction :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh sorry this here is just explain your question here
why Flux pipelines pass sigmas and why those are calculated differently to the sigmas is None path
I think i might put it in the wrong thread here, sorry!
@hlky @yiyixuxu @asomoza |
Yes it's ready for review import torch
from diffusers import FluxPipeline, FlowMatchEulerDiscreteScheduler
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_vae_tiling()
pipe = pipe.to("cuda")
config = pipe.scheduler.config
euler_flow_beta = FlowMatchEulerDiscreteScheduler.from_config(config, use_beta_sigmas=True)
euler_flow_exponential = FlowMatchEulerDiscreteScheduler.from_config(config, use_exponential_sigmas=True)
euler_flow_karras = FlowMatchEulerDiscreteScheduler.from_config(config, use_karras_sigmas=True)
pipe.scheduler = euler_flow_beta
generator = torch.Generator("cuda").manual_seed(0)
prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, num_inference_steps=30, guidance_scale=3.5, generator=generator).images[0]
image.save("flow_beta.png")
pipe.scheduler = euler_flow_exponential
generator = torch.Generator("cuda").manual_seed(0)
prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, num_inference_steps=30, guidance_scale=3.5, generator=generator).images[0]
image.save("flow_exponential.png")
pipe.scheduler = euler_flow_karras
generator = torch.Generator("cuda").manual_seed(0)
prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, num_inference_steps=30, guidance_scale=3.5, generator=generator).images[0]
image.save("flow_karras.png") |
What does this PR do?
Add
beta
,exponential
andkarras
sigmas toFlowMatchEulerDiscreteScheduler
.Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
cc @yiyixuxu @asomoza