Feature to improve training quality via detection of out-of-tolerance latent mean/std values #2010
This is a new feature, currently only for Flux LoRA training (though it could later be extended to full fine-tuning as well). It analyses the latents of training images and checks that their mean (average) and standard deviation are close to 0.0 and 1.0 respectively.
It was requested as a feature here: std/mean detection code.
Flux is a diffusion model that starts from Gaussian noise with a mean of 0.0 and a standard deviation of 1.0. Training images whose latents share those statistics may therefore train better, since the network does not also have to learn to adjust the mean and std — something current diffusion models don't seem to be good at.
The default tolerances for detection can be set via `--latent_threshold_warn_levels=mean,std_max`, e.g. `--latent_threshold_warn_levels=0.15,1.40`. The test can be disabled using `--latent_threshold_warn_levels=disable`.
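A minimal sketch of how a `mean,std_max`-style value for this flag could be parsed — the flag name comes from this PR, but the function name and validation rules here are assumptions, not the actual implementation:

```python
def parse_latent_threshold_warn_levels(value: str):
    """Return (mean_tol, std_max), or None when checking is disabled.

    Hypothetical helper: accepts "disable" or a "mean,std_max" pair,
    mirroring the --latent_threshold_warn_levels examples above.
    """
    if value == "disable":
        return None
    mean_tol_str, std_max_str = value.split(",")
    mean_tol, std_max = float(mean_tol_str), float(std_max_str)
    # Sanity checks (assumed): tolerance must be positive, and std_max
    # must exceed 1.0 so that the derived lower bound 1/std_max is below 1.0.
    if mean_tol <= 0 or std_max <= 1.0:
        raise ValueError("expected mean tolerance > 0 and std_max > 1.0")
    return mean_tol, std_max

print(parse_latent_threshold_warn_levels("0.15,1.40"))  # (0.15, 1.4)
print(parse_latent_threshold_warn_levels("disable"))    # None
```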
If images do not pass the threshold test, then a warning message appears like this:
The std_max value sets the upper limit for the standard deviation. A lower limit is also set to 1.0 / std_max. For example, a std_max value of 1.40 also creates a lower threshold of around 0.714.
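The symmetric upper/lower bound described above can be sketched as follows (illustrative only — function and parameter names are assumptions, not the PR's code):

```python
def latent_out_of_tolerance(mean: float, std: float,
                            mean_tol: float = 0.15, std_max: float = 1.40) -> bool:
    """Return True when a latent's statistics fall outside the tolerances.

    The lower std bound is derived from std_max, as described above:
    std_min = 1.0 / std_max, e.g. 1 / 1.40 ≈ 0.714.
    """
    std_min = 1.0 / std_max
    return abs(mean) > mean_tol or not (std_min <= std <= std_max)

assert not latent_out_of_tolerance(0.05, 1.10)  # within tolerance
assert latent_out_of_tolerance(0.30, 1.00)      # mean too far from 0
assert latent_out_of_tolerance(0.00, 0.60)      # std below 1/1.40 ≈ 0.714
```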
Sometimes it's not obvious why the mean and std values are not near 0 and 1. In that case, a parameter `--latent_threshold_visualizer` can be passed, which shows the latent average values in a window. (This has been tested on Ubuntu Linux; could someone please try it on Windows? It should probably work.)

Various changes that can move latent means/stds towards 0,1 are possible, e.g.:
Or even:
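The specific examples above aren't reproduced here, but as a generic illustration of the idea (an assumption on my part, not one of those examples): values can be standardized to mean 0 / std 1 directly. In practice one would more likely adjust the source image (e.g. brightness/contrast) and re-encode, since rescaling the latents themselves changes the decoded image.

```python
from statistics import fmean, pstdev

def standardize(values: list[float]) -> list[float]:
    """Shift and scale values so they have mean 0 and (population) std 1."""
    m = fmean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

latent = [0.9, 1.3, 2.1, 0.5, 1.7]  # toy "latent": mean 1.3, std ≈ 0.566
fixed = standardize(latent)
print(round(fmean(fixed), 6), round(pstdev(fixed), 6))  # 0.0 1.0
```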
One thing I've found in the few days I've been training with images closer to mean/std 0,1 is that I've had to raise the alpha value of my training and also reduce the LR. I think that might be because the 'gravity well' keeping the model close to base-model quality is stronger when it isn't disrupted by training images outside the 0,1 distribution.
Edit: I need to check that the test for `args.latent_threshold_warn_levels` in `train_network.py` doesn't break e.g. SDXL training, which won't have that option.