Replies: 1 comment
-
Ah, I just ended up reading the kohya-ss scripts cascade branch and got the answer to my first question, so I assume OneTrainer is doing this too: the official learning rate default is 1e-4 (0.0001), and the official settings use bf16 for training. I think I was accidentally using fp16, which is said to be unstable.
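A quick way to see why bf16 is the safer choice: fp16 has only a 5-bit exponent (max value ~65504), while bf16 keeps fp32's 8-bit exponent and just drops mantissa bits. The stdlib-only sketch below simulates both formats (the helper names `to_fp16`/`to_bf16` are mine, not from any training script); Python's `struct` supports half precision natively via the `e` format, and bf16 can be emulated by truncating a float32 to its top 16 bits:

```python
import struct

def to_bf16(x: float) -> float:
    # bf16 is the top 16 bits of an IEEE-754 float32: same 8-bit exponent
    # (so the same dynamic range as fp32), just fewer mantissa bits.
    # Truncating (not rounding) the low 2 bytes is a close-enough emulation.
    b = struct.pack(">f", x)                       # big-endian float32 bytes
    return struct.unpack(">f", b[:2] + b"\x00\x00")[0]

def to_fp16(x: float) -> float:
    # fp16 has a 5-bit exponent; CPython raises OverflowError when a value
    # exceeds the fp16 range (~65504), which hardware would map to +/-inf.
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf")

print(to_fp16(70000.0))   # inf    -> activations/grads past fp16 range blow up
print(to_bf16(70000.0))   # 69632.0 -> coarser, but still finite in bf16
```

This loss-of-range (fp16) versus loss-of-precision (bf16) trade-off is why mixed-precision training recipes that overflow in fp16 often run stably in bf16 without loss scaling.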
-
I had a couple of Cascade questions. Not sure if anyone can answer.
tl;dr: Does the default ComfyUI Cascade workflow work with the OneTrainer model output here? And does OneTrainer train the text encoder and unet for Cascade like the kohya_ss GUI does?
The kohya_ss GUI repo kind of confuses me: it creates two files, a text_model.safetensors (the trained text encoder) and the unet model.
The setup in ComfyUI with the kohya GUI output: you put the text encoder model in a Load CLIP node (connected to the positive and negative prompts), the stage C model in a UNETLoader node, stage B in another UNETLoader node, and stage A in a Load VAE node.
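For anyone trying to reproduce this, here's the wiring as a rough sketch (node class names are from stock ComfyUI; the filenames are placeholders for whatever your trainer produced):

```
CLIPLoader(clip_name="text_model.safetensors", type="stable_cascade")
    -> CLIPTextEncode (positive) and CLIPTextEncode (negative)
UNETLoader(unet_name="stage_c.safetensors")  -> stage C sampler
UNETLoader(unet_name="stage_b.safetensors")  -> stage B sampler
VAELoader(vae_name="stage_a.safetensors")    -> VAEDecode
```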
The only way I could get the default ComfyUI Cascade workflow working was to convert the checkpoints in ComfyUI. A basic unet/loadclip workflow would also sort of work, but I heard that isn't right anymore.
My goal here: I want to give OneTrainer a shot because I've had no luck with prompting flexibility using the kohya_ss GUI Cascade training. Here's a sample link of me trying to work with the dev (scroll down to my samples of Ted and a woman in a bikini): bmaltais/kohya_ss#1982. I can no longer reproduce the quality I initially got there; not sure if an update did it or what. I was just getting bad-quality results yesterday with further testing, kind of worse than DALL-E 1, and gave up, haha.
So I think I just got lucky on my first few tries with Cascade. With SDXL DreamBooth I have no issues and get really great results, btw; Cascade has been a huge pain. For specs, I have 24 GB of VRAM.
Thing I tried: I constantly had to use a text_model.safetensors (text encoder) model that was trained about 300 fewer steps than the stage C model (in ComfyUI); adjusting the text encoder learning rate in the kohya GUI didn't help. I just had a feeling something wasn't right with the training code or the parameters it was using. The dev there just uses the kohya scripts, though, and I'm unsure if he changes anything in them; I feel like it should work.
But related to my second question: it seemed to make a huge difference in quality in that repo if you trained the text encoder and the unet together. Still not a lot of prompting flexibility, though, even when both were trained.
I also tried training each of those individually, one but not the other, like the dev did, and the results were not that great (you can see from his testing examples). The text encoder training seemed to make the bigger difference in subject likeness over there. So I'm wondering whether OneTrainer trains both, and if not, whether it will in the future. Is anyone here getting good training results with Cascade?
I'm gonna give this a shot, though, and train a 1960s celebrity on 74 photos tomorrow. I'll let you know how it goes. Thanks!