After running for a while, get error #1
@Dawgmastah Damn, I know exactly what this error means. You were so close to getting it finished, too. I put encoder instead of decoder on one of the layers. I've fixed it; you can re-download weight_matching and try it again. How long did it run, and what was the size of the models you were merging?
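For context, the permutation spec here follows git-re-basin's style, where axes_to_perm maps each parameter name to the permutation applied along each axis. A sketch of the kind of entry involved (names illustrative, not the repo's exact lines) shows how writing encoder where decoder was meant leaves a key out of the spec and produces the KeyError quoted at the end of this thread:

```python
# Illustrative axes_to_perm entries in git-re-basin's PermutationSpec style.
# If the decoder entry is mistakenly written with "encoder" in its key, then
# ps.axes_to_perm['first_stage_model.decoder.conv_in.weight'] raises KeyError.
axes_to_perm = {
    # Conv weights are (out, in, h, w); only the out axis is permuted here.
    "first_stage_model.encoder.conv_in.weight": ("P_enc_conv_in", None, None, None),
    "first_stage_model.decoder.conv_in.weight": ("P_dec_conv_in", None, None, None),
}
```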
At least with this, I now know for sure the permutation spec itself will run without error.
It ran for about... 4 minutes?
Really? Huh. What CPU are you running on? Can you try again and see if it works and/or takes longer than 4 minutes?
I have a Ryzen 5 5600X; running again right now, will report back.
Thanks. Also, how much RAM do you have?
32 GB, of which 31 is in use, haha.
If relevant (though I think not, since JAX doesn't support CUDA on Windows): I have a Quadro P6000 with 24 GB of memory.
New error after about 6 minutes of running: 0/P_model.diffusion_model.input_blocks.5.0_inner: 0.0
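(For readers: in the upstream git-re-basin weight-matching loop, a line like that is a progress print of roughly the shape below, i.e. iteration number, permutation name, and the improvement in the matching objective. This is a sketch from memory, not the exact source.)

```python
# Sketch of the progress print inside the weight-matching loop (variable names
# assumed): a 0.0 means the linear-assignment step found no further improvement.
iteration = 0
p = "P_model.diffusion_model.input_blocks.5.0_inner"
newL, oldL = 0.0, 0.0  # objective after/before re-permuting this group
print(f"{iteration}/{p}: {newL - oldL}")  # -> 0/P_model.diffusion_model.input_blocks.5.0_inner: 0.0
```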
OK. It's boggling my mind that your system runs through this in minutes, lol. What models are you merging, may I ask? Anyway, it finished permuting, and it seems I can't save the new parameters with that line of code. Hold on, let me think about what to change.
Two DreamBooth models based on F222; I was testing to see if the DreamBooth training is retained. I'm a super noob with Python, so I hope I installed the dependencies correctly? =S It sounds fast, but the computer bogs down HARD and the fans ramp way up.
Disclaimer: I have one of the models open (I thought it wouldn't touch the models, so hopefully it's not that; I'd hate to have to retrain).
No, this wouldn't be why, don't worry. It's on my end.
I figured it out :) and sent a change request? (Sorry, I'm new at Git.) After a quick Google search, I just needed to change to square brackets. HOWEVER, it's not saving the final ckpt, or at least not in the same folder as the models. The console outputs a list of numbers so large I can't read the beginning of them:
Instead of changing the brackets to square, try running with state_b.update(updated_params)
On it.
Update: it now outputs a file. HOWEVER, the whole text of "versions" is..... and the full contents of "data" are: €�}q X
After the output-file line in the SD_rebasin_merge file, paste model_b["state_dict"].update(updated_params), then replace the torch.save line with torch.save(model_b, output_file)
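Spelled out, the tail of SD_rebasin_merge.py would then look roughly like this (output_file and updated_params as already defined in the script):

```python
# Fold the permuted parameters back into the full checkpoint dict, then save
# the whole dict so the ckpt keeps its state_dict wrapper.
model_b["state_dict"].update(updated_params)
torch.save(model_b, output_file)
```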
On it; hopefully I understood the instructions correctly.
Basically this: https://imgur.com/a/nrOLiMd
Then I did it right. Executing.
OK, progress: the merged model is now double the size of the initial ones (8 GB). Testing if it works.
You might need to prune it to load it in Automatic's UI (if that's what you're using) and some other UIs. Well, fingers crossed, man, haha.
Indeed it won't load. How does one prune a model? =S I get a bunch of errors, including the safety checker:
Loading weights [0cdcfbbe] from X:\AIMODELS\merged.ckpt
The file may be malicious, so the program is not going to read it.
Traceback (most recent call last):
I've just uploaded a prune.py script (it wasn't written by me). Place it in the directory of the merged model and run python prune.py in a console you can interact with (cmd is fine), because it will ask some questions: where the file is, where you want it, and what to name it. Don't forget to put .ckpt at the end of the names.
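(The uploaded prune.py isn't reproduced here, but a minimal sketch of what such a script typically does, keeping only the weights and casting to fp16, is below; file names are just examples.)

```python
# Minimal pruning sketch (not the repo's prune.py): drop everything except the
# state_dict and halve precision, which shrinks an 8 GB fp32 merge considerably.
import torch

ckpt = torch.load("merged.ckpt", map_location="cpu")
sd = ckpt.get("state_dict", ckpt)
pruned = {k: (v.half() if isinstance(v, torch.Tensor) else v) for k, v in sd.items()}
torch.save({"state_dict": pruned}, "merged-pruned.ckpt")
```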
Error when pruning: (merging) X:\AIMODELS>python prune.py
Perhaps it's saving stuff it shouldn't into the ckpt? Because Automatic also complained about finding "unsecure" code, likewise related to FrozenDict.
Okay, let's go back a few steps then. I may have been overthinking some things. In the torch.save line, have it be instead
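(The exact replacement line didn't survive in this thread. Given the FrozenDict complaint above, one plausible shape for it, hypothetical and not the committed change, is to convert everything to plain Python and torch types before pickling:)

```python
# Hypothetical fix: a pickled jax/flax FrozenDict is what safety scanners flag,
# so convert the permuted params to an ordinary dict of torch tensors first.
import numpy as np
import torch

plain = {k: torch.from_numpy(np.asarray(v)) for k, v in dict(updated_params).items()}
model_b["state_dict"].update(plain)
torch.save(model_b, output_file)
```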
On it.
Latest commit fixes a divide-by-zero error.
@Dawgmastah do you happen to know how many iterations it will go through? I'm thinking of stopping and trying with the newest revision.
As many as you set in max_iter, or with the --iterations parameter.
I'm surprised it's taking so long and/or crashing. Specs: Ryzen 5 5600X.
Nooo, guess I'll wait another 3 hours. It isn't so bad; I have all these tabs open and am just using the laptop normally.
Just checked, the default is 10.
OK, well, I'm at 9, I'll wait. Update: landed on the error at FINDING PERMUTATIONS.
Using a different system: 2 minutes on a Windows machine with 24 GB RAM; it seems to be using CPU only. Regular Anything (4 GB, f32) is currently not working on commit 93b0e95, as stated above.
Test Anything f32 as both model A and model B. Let me know if it's broken for both.
Anything f32 as model A and model B is currently broken for both.
I just ran it on two 2 GB models and it took around 1-3 minutes, lol. TBH I don't know if I'm supposed to change anything, or if there are settings I can play with; I just got both models and did the run. It reached 9 iterations and was done. It didn't retain anything, though. Both are DreamBooth models of people, and neither model's training was retained.
@Maki9009 were you using
@ClashSAN no, my keywords were names. Yeah, the alpha changed. I just did 150 iterations and still got nothing even close to my DreamBooth models. I don't know.
What? The default number of iterations is 10.
@ClashSAN I set iterations to 150 and --alpha was set at 0.5; the new alpha printed during the run kept changing, and by iteration 150 it was at 0.0066225165562913135.
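(Side note: that final value is numerically 1/151, which suggests the script anneals alpha each iteration rather than keeping the --alpha flag fixed. That's an inference from the number, not from reading the code.)

```python
# 150 iterations completed -> an annealed alpha of 1/(150 + 1) matches the value quoted above.
print(1 / 151)  # 0.006622516556291391
```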
What are the base models for both DreamBooth models?
@vaguenebula 1.5, merged with the 840vae.
I don't know what 840vae is, but some newer model architectures don't seem to work for the time being.
@vaguenebula Either way, from reading this entire comment section... it still doesn't merge properly for others, it seems.
Recently, people have gotten some models to merge really well, but not every model works.
The merge should be close to a weighted sum. So if it's not close, there's probably some kind of layer that needs to be added.
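For reference, the weighted-sum baseline being compared against is just per-key interpolation of the two state dicts; a generic sketch, not this repo's code:

```python
import torch

def weighted_sum(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    # torch.lerp(a, b, alpha) == (1 - alpha) * a + alpha * b, per parameter.
    return {k: torch.lerp(sd_a[k].float(), sd_b[k].float(), alpha) for k in sd_a}
```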
@vaguenebula TBH I don't know; I only ran the re-basin merge, not the weight-matching script. I don't know if I need to do that as well.
@Dawgmastah Hey, did you ever get around to doing those layer tests?
So I got thinking about what Samuel said, and with the surprising, albeit very disappointing, release of v2.0, I thought, "Surely if any change will leave the permutation basin, this is it, right?" So I got to trying to merge a v2 and a v1.5 model. Nothing complicated, just back to basics: re-basin permutation only, removing the uncommon layers between the two models (not as many as you might think). The results were... well, why don't you see for yourself: https://imgur.com/a/v1lCNtU Also, calculating in single precision with .float() sped up the CPU-only permutation run by a lot. No GPU, 8 GB RAM, i5 11th gen; it turned hours into minutes.
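(The .float() speed-up mentioned above would be applied before matching starts, along these lines; the state-dict access is assumed to match the merge script.)

```python
# Cast checkpoint tensors to fp32 up front: the CPU matmul and linear-assignment
# steps in weight matching run far faster in single precision than in half.
state_a = {k: v.float() for k, v in model_a["state_dict"].items()}
state_b = {k: v.float() for k, v in model_b["state_dict"].items()}
```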
So you think the larger basin made the feature matching work better? How were the results?
Yes, I think re-basin alone only really works for huge changes. As for the results, not tested yet; it's still running. I'm on iteration 55. Most of the layers have been permuted down to 0, but a few still remain. Because of how the permutation works, I'm going to run a second time after this one is done, this time applying the permutation to the v1.5 model.
Hey, I was not home over the weekend. What are the tests? I can try them.
@Dawgmastah
It gave models that just produced noise. After looking into it more, it looks like the UNet actually received a major change from 1.5 to 2.0. Previously, the UNet layers with attention blocks used a fixed number of heads; 2.0 uses a fixed number of channels per head instead, and the number of heads changes depending on the layer. Since the number of parameters was unchanged, it's a change that can slip by. Anyway, I'll exclude the attention layers and try again.
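(To make the v1/v2 difference concrete: with the commonly cited UNet configs, fixed num_heads = 8 in 1.x versus a fixed 64 channels per head in 2.0, the head split diverges at every width. Treat the exact constants as an assumption.)

```python
# Same weight shapes and parameter counts either way, which is why the change
# can slip past a permutation spec that assumes one fixed head count.
for channels in (320, 640, 1280):
    v1_heads, v1_dim = 8, channels // 8    # SD 1.x: heads fixed, head dim varies
    v2_heads, v2_dim = channels // 64, 64  # SD 2.0: head dim fixed, heads vary
    print(f"{channels} ch: v1 {v1_heads}x{v1_dim}, v2 {v2_heads}x{v2_dim}")
```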
Do you have a specific commit with it to test?
@Dawgmastah
Error is:
0/P_model.diffusion_model.output_blocks.6.0_inner3: 0.0
0/P_model.diffusion_model.output_blocks.4.0_inner2: 0.0
0/P_bg371: 0.0
0/P_bg206: 0.0
0/P_model.diffusion_model.output_blocks.6.0_inner2: 0.0
Traceback (most recent call last):
  File "X:\AIMODELS\SD_rebasin_merge.py", line 27, in <module>
    updated_params = unflatten_params(apply_permutation(permutation_spec, final_permutation, flatten_params(state_b)))
  File "X:\AIMODELS\weight_matching.py", line 786, in apply_permutation
    return {k: get_permuted_param(ps, perm, k, params) for k in params.keys()}
  File "X:\AIMODELS\weight_matching.py", line 786, in <dictcomp>
    return {k: get_permuted_param(ps, perm, k, params) for k in params.keys()}
  File "X:\AIMODELS\weight_matching.py", line 773, in get_permuted_param
    for axis, p in enumerate(ps.axes_to_perm[k]):
KeyError: 'first_stage_model.decoder.conv_in.weight'