New sampling method: Incremental Fine-tuning during sampling. #61
Replies: 5 comments 5 replies
-
I also wrote a Twitter thread about it: https://twitter.com/cloneofsimo/status/1604242844574552064
-
One thing to note: this wasn't impossible previously, but you would need an EXTREMELY large amount of memory to pull it off. Each "merged model" needs at least an extra 4 GB, which becomes infeasible if we are talking about something like 50 steps: that's 200 GB! But luckily, with LoRA you can dynamically merge models, so this takes near-zero overhead.
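To see why dynamic merging is near-free, here is a toy sketch (shapes and names are illustrative, not the library's API): the effective weight can be recomputed for any scale from the small LoRA factors, instead of storing a separate multi-GB merged checkpoint per step.

```python
import torch

d, rank = 64, 4
W0 = torch.randn(d, d)            # frozen base weight (stored once)
A = torch.randn(rank, d) * 0.01   # LoRA down-projection (tiny)
B = torch.randn(d, rank) * 0.01   # LoRA up-projection (tiny)

def effective_weight(scale):
    # W = W0 + scale * (B @ A): re-merging is just a scalar change,
    # so a per-step schedule costs almost nothing in memory.
    return W0 + scale * (B @ A)

# scale = 0 recovers the base model exactly; scale = 1 is the merged model
assert torch.allclose(effective_weight(0.0), W0)
```

The same idea applies per LoRA-injected layer in a real model; only the rank-`r` factors are stored alongside the base weights.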
-
Awesome work as always. Can this be done with a normal .ckpt file?
-
Super interesting! I'd be interested in comparing this to an image without scheduled adaptation, but with alpha > 1.0. Can we achieve similar results with, say, alpha = 5? I would guess that sweeping over alpha > 1 would get you an image close to that of scheduled adaptation (at least as implemented with the Heaviside schedule). I'm not grokking why turning on the adaptation at a particular point is better than just cranking up the scale of adaptation. Do you have an intuitive explanation for why scheduling can lead to superior generation?
-
For lack of a better name, let's call this scheduled adaptiveness. The basic trick goes like this: during sampling, start from the base model and approach the fine-tuned model. I.e., at sampling step $t$, use the weights

$$\theta_t = \theta_{\text{base}} + m(t)\,\bigl(\theta_{\text{fine-tuned}} - \theta_{\text{base}}\bigr)$$

where $m$ is a monotonically increasing function with $m(t) \in [0, 1]$, e.g.,

$$m(t) = H(t - t_0)$$

where $H$ is the Heaviside step function.
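As a concrete sketch (the function names here are ours, not from the post), the Heaviside schedule, plus a smooth linear alternative, can be written as:

```python
def heaviside_schedule(step, t0):
    """m(t) = H(t - t0): pure base model before step t0, fully
    fine-tuned model from step t0 onward."""
    return 1.0 if step >= t0 else 0.0

def linear_schedule(step, num_steps):
    """A smooth monotone alternative: ramp m from 0 to 1 over sampling."""
    return step / max(num_steps - 1, 1)
```

Any monotone function mapping the step index into [0, 1] would fit the formulation above.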
We can implement this quite beautifully with diffusers, using a callback.
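A minimal sketch of such a callback, assuming a helper like this repo's `tune_lora_scale` that rescales all LoRA layers of a model in place (the factory function and the stub below are ours, for illustration):

```python
T0 = 10  # hypothetical switch-on step for the Heaviside schedule

def make_heaviside_callback(unet, tune_lora_scale, t0=T0):
    def callback(step, timestep, latents):
        # diffusers invokes this each denoising step with
        # (step_index, timestep, latents); we only use the step index.
        tune_lora_scale(unet, 1.0 if step >= t0 else 0.0)
    return callback

# Verify the schedule without a real pipeline, using a recording stub:
calls = []
cb = make_heaviside_callback(unet="unet-stub",
                             tune_lora_scale=lambda model, a: calls.append(a))
for step in range(20):
    cb(step, timestep=None, latents=None)
assert calls[:T0] == [0.0] * T0      # base model for the first T0 steps
assert calls[T0:] == [1.0] * (20 - T0)
```

With a real pipeline this would be passed as `pipe(prompt, callback=cb, callback_steps=1)` in older diffusers versions (newer releases use the `callback_on_step_end` hook instead).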
The following image shows results with / without scheduled adaptiveness, with the prompt
There were no white shirts in the training data; all the clothes were black. So the model can't distinguish whether black clothing is something intrinsic to Wednesday. We trivially know that this isn't the case, but to a gradient, there is no way to tell.
But the base model (SD 1.5) certainly knows this. So in the first few steps it creates the character with a white shirt, as expected. Right after those first steps, LoRA comes in and plays its role to make the image closer to
<wday>
. With a 0-to-1 schedule, this is nearly equivalent to sampling with the base model and SDEditing with the fine-tuned one.