Fix samples, LoRA training. Add system prompt, use_flash_attn #15

Merged
merged 6 commits into from
Feb 24, 2025


@rockerBOO rockerBOO commented Feb 23, 2025

  • Fixed samples. Samples are now working as expected. May need more functionality to support "shift" values.
  • Fixed training code. Lumina uses a reverse timestep (1000 = noise, 0 = clean).
  • Fixed batch size. Tokens are now padded to max length.
  • Updated FlowMatchEulerDiscreteScheduler code from upstream.
  • Add --use_flash_attn to use flash attention. If flash attention is not installed, an error is shown. Defaults to SDPA attention.
  • Add --system_prompt. A system prompt can also be set in dataset_config and in dataset subsets. Currently, no system prompt is set by default.
  • Add --samples_batch_size. Generates samples in batches; defaults to training_batch_size or 1. This may or may not be a good idea for general use, but it seems practical: sampling was using about 2x less memory for me, so batching samples makes better use of the GPU. Samples are batched by their common latent properties, such as seed, width, height, and CFG.
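
The sample batching described in the last bullet could look roughly like the sketch below. This is an illustration only, not the code in this PR: the `SamplePrompt` container, its field names, and `batch_sample_prompts` are hypothetical. It groups prompts that share seed, width, height, and CFG scale, then splits each group into chunks of at most `batch_size`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplePrompt:
    # Hypothetical container; the field names are illustrative, not the PR's.
    text: str
    seed: int
    width: int
    height: int
    cfg_scale: float


def batch_sample_prompts(prompts, batch_size):
    """Group prompts that share latent properties, then split into batches."""
    groups = defaultdict(list)
    for p in prompts:
        # Prompts can only share a batch if their latents have the same
        # shape and are generated with the same seed and CFG scale.
        groups[(p.seed, p.width, p.height, p.cfg_scale)].append(p)

    batches = []
    for group in groups.values():
        for start in range(0, len(group), batch_size):
            batches.append(group[start:start + batch_size])
    return batches


prompts = [
    SamplePrompt("a portrait", 42, 1024, 1024, 4.0),
    SamplePrompt("a landscape", 42, 1024, 1024, 4.0),
    SamplePrompt("a wide shot", 42, 1280, 768, 4.0),
]
batches = batch_sample_prompts(prompts, batch_size=2)
print([len(b) for b in batches])  # [2, 1]
```

Prompts with a different resolution end up in their own batch, so a sample file with mixed sizes still works, just with smaller batches.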

In my opinion these changes make it good to go. Should be all working!

Some example system prompts from Lumina 2:

if args.system_type == "align":
    system_prompt = "You are an assistant designed to generate high-quality images with the highest degree of image-text alignment based on textual prompts. <Prompt Start> "  
elif args.system_type == "base":
    system_prompt = "You are an assistant designed to generate high-quality images based on user prompts. <Prompt Start> " 
elif args.system_type == "aesthetics":
    system_prompt = "You are an assistant designed to generate high-quality images with highest degree of aesthetics based on user prompts. <Prompt Start> " 
elif args.system_type == "real":
    system_prompt = "You are an assistant designed to generate superior images with the superior degree of image-text alignment based on textual prompts or user prompts. <Prompt Start> "
elif args.system_type == "4grid":
    system_prompt = "You are an assistant designed to generate four high-quality images with highest degree of aesthetics arranged in 2x2 grids based on user prompts. <Prompt Start> "  
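
For illustration, here is a minimal sketch of how a system prompt like the ones above might be combined with a caption. It assumes the system prompt is simply prepended to the caption, which the trailing "<Prompt Start> " in the upstream strings suggests; the `SYSTEM_PROMPTS` table and `build_prompt` helper are hypothetical names, not part of this PR.

```python
# Hypothetical lookup table using two of the upstream Lumina 2 prompts.
SYSTEM_PROMPTS = {
    "base": (
        "You are an assistant designed to generate high-quality images "
        "based on user prompts. <Prompt Start> "
    ),
    "align": (
        "You are an assistant designed to generate high-quality images with "
        "the highest degree of image-text alignment based on textual prompts. "
        "<Prompt Start> "
    ),
}


def build_prompt(caption, system_prompt=""):
    """Prepend the system prompt (if any) to the caption.

    Assumption: plain string concatenation. With an empty system prompt the
    caption is returned unchanged, matching "no system prompt by default".
    """
    return f"{system_prompt}{caption}"


print(build_prompt("a woman standing in a field", SYSTEM_PROMPTS["base"]))
print(build_prompt("a woman standing in a field"))
```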

Notes:

  • We might want to support a full range of timestep shifts. During training they use "uniform", but a shifted timestep schedule might be better.
  • Image sampling could be better, but I am not sure what is not working correctly.

[Image: training sample, women-lumina-2025-02-22-220712-e05cbfa8_000064_03_20250222223415_42]

[Screenshot: Weights & Biases run "hopeful-night-2", project women-lumina-kohya-lora, 2025-02-22 23:08]

@sdbds
Owner

sdbds commented Feb 23, 2025

Thanks for your contribution. It seems that the training loss part works well.
About the shift value:
I've talked to Lumina's team, and they recommend the following:
training uses an adaptive shift (flux_shift), and inference uses a fixed shift value of 6.
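
As a rough sketch of what a resolution-dependent shift in the style of flux_shift looks like: the shift parameter `mu` is interpolated linearly from the image sequence length and then applied to a timestep in (0, 1]. The constants (256 → 0.5, 4096 → 1.15) and the function names are taken from memory of the Flux reference code and should be treated as assumptions, not as what Lumina's team or this PR uses.

```python
import math


def linear_mu(image_seq_len, x1=256, y1=0.5, x2=4096, y2=1.15):
    """Linearly interpolate the shift parameter mu from the sequence length.

    Assumption: the (x1, y1), (x2, y2) anchor points follow the Flux
    reference code as recalled; they are not confirmed for Lumina 2.
    """
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope * image_seq_len + intercept


def time_shift(mu, sigma, t):
    """Shift a timestep t in (0, 1] toward the noisy end; t = 0 is undefined."""
    return math.exp(mu) / (math.exp(mu) + (1 / t - 1) ** sigma)


# A 1024x1024 image with 8x VAE downsampling and 2x2 patches gives
# (1024 // 16) * (1024 // 16) = 4096 image tokens.
mu = linear_mu(4096)
print(mu, time_shift(mu, 1.0, 0.5))
```

With `sigma = 1.0` this is the same curve as a fixed shift `s = exp(mu)` in the form `s * t / (1 + (s - 1) * t)`, so a fixed inference shift of 6 corresponds to `mu = ln(6)`.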

@rockerBOO
Author

I will get a shift version up a little later today. I think we can use the same system as Flux, but I was having issues with the reverse timesteps. It took a while to get it even "working", so we can iterate on better approaches from here.

@rockerBOO
Author

Here they use "uniform" as the default SNR type and then apply a shift after that. As mentioned, I will try to get a "flux_shift"-like version working for Lumina 2. AI Toolkit makes that distinction as well.

@rockerBOO
Author

I am running a test now. We are using the "noise scheduler" https://github.com/sdbds/sd-scripts/pull/15/files#diff-ca13b8da6dab6243438086787e990ba4a5ed0661b56f9a382dc06c42bc54912eR225-R226 which has the discrete_flow_shift argument, so it should apply the timestep shift appropriately. I am changing the default to 6.0 there.

Will post up the updated code after the test is complete.
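
For reference, a minimal sketch of what a fixed (static) flow shift does to a sigma schedule. As I understand it, a flow-matching Euler scheduler with a static shift rescales each sigma as `shift * sigma / (1 + (shift - 1) * sigma)`; the `shift_sigmas` helper below is a hypothetical stand-in for illustration, not the scheduler's actual code path.

```python
def shift_sigmas(sigmas, shift):
    """Apply a static flow shift to sigmas in [0, 1].

    shift = 1.0 leaves the schedule unchanged; larger values push the
    schedule toward the high-noise end (sigma close to 1).
    """
    return [shift * s / (1 + (shift - 1) * s) for s in sigmas]


num_steps = 4
# Uniform schedule from pure noise (1.0) down toward clean: 1.0, 0.75, 0.5, 0.25
sigmas = [1 - i / num_steps for i in range(num_steps)]
shifted = shift_sigmas(sigmas, shift=6.0)

for before, after in zip(sigmas, shifted):
    print(f"{before:.2f} -> {after:.4f}")
```

With a shift of 6.0, the midpoint 0.5 moves to 6/7 (about 0.857), so more of the steps are spent at high noise levels.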

@rockerBOO
Author

rockerBOO commented Feb 23, 2025

Using 6.0 shift:

[Screenshot: Weights & Biases run "crimson-donkey-23", project women-lumina-kohya-lora, 2025-02-23 18:01]

| Shift 6 | Shift 3.0 | Base Model |
| --- | --- | --- |
| ComfyUI_00068_ | ComfyUI_00065_ | ComfyUI_00066_ |

There may be more to do to make it proper, but I think it works as expected with the shift. I do not know enough to figure out the other details.

@sdbds
Owner

sdbds commented Feb 24, 2025

It looks good. I'll check again whether the sampling part can be optimized.
If you have any questions, I can also ask their team.

@sdbds sdbds merged commit 653621d into sdbds:lumina Feb 24, 2025
1 check passed
@rockerBOO
Author

Thank you, I'll think about what I might ask. I am not as familiar with the timestep scheduling aspect and am mostly inferring it from other codebases. For example, a lot of it comes from https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora_lumina2.py#L1357-L1399. It relies on the scheduler to do the timestep shifting, but I'm only assuming it's doing what it should.
