Flux.1 cannot load standard transformer in nf4 #9996

Open
vladmandic opened this issue Nov 22, 2024 · 11 comments

@vladmandic
Contributor

Describe the bug

Loading different Flux transformer models works fine, except for NF4.
It works for the ~1% of fine-tunes provided on Hugging Face, but it doesn't work for the ~99% of standard fine-tunes available on CivitAI.

Example of such a model: https://civitai.com/models/118111?modelVersionId=1009051

Note: I'm using FluxTransformer2DModel directly as it's the easiest for reproduction, plus the majority of Flux fine-tunes are provided as transformer-only, not full models. But where a full model does exist, it's exactly the same problem using FluxPipeline.

Reproduction

import torch
import bitsandbytes as bnb
import diffusers

print(f'torch=={torch.__version__} diffusers=={diffusers.__version__} bnb=={bnb.__version__}')
kwargs = { 'low_cpu_mem_usage': True, 'torch_dtype': torch.bfloat16, 'cache_dir': '/mnt/models/huggingface' }
files = [  # local single-file transformer checkpoints in different precisions
    'flux-c4pacitor_v2alpha-f1s-bf16.safetensors',     # bf16: loads fine
    'flux-iniverse_v2-f1d-fp8.safetensors',            # fp8: loads fine
    'flux-copax_timeless_xplus_mix2-nf4.safetensors',  # nf4: fails
]

for f in files:
    print(f)
    try:
        transformer = diffusers.FluxTransformer2DModel.from_single_file(f, **kwargs)
        print(transformer.__class__)
    except Exception as e:
        print(e)
    transformer = None
    torch.cuda.empty_cache()

Logs

The error is raised in `diffusers/loaders/single_file_utils.py:convert_flux_transformer_checkpoint_to_diffusers`:


q, k, v, mlp = torch.split(checkpoint.pop(f"single_blocks.{i}.linear1.weight"), split_size, dim=0)


> RuntimeError: split_with_sizes expects split_sizes to sum exactly to 33030144 (input tensor's size at dimension 0), but got split_sizes=[3072, 3072, 3072, 12288]
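
For context, the mismatch is consistent with bitsandbytes NF4 packing: the original bf16 `single_blocks.{i}.linear1.weight` has 3072+3072+3072+12288 = 21504 rows of 3072 columns, while the NF4 checkpoint stores it as a flattened uint8 tensor holding two 4-bit values per byte. A quick sanity check of the arithmetic (a sketch, not diffusers code):

# why the packed NF4 tensor's dim 0 is 33030144 instead of 21504
rows = 3072 + 3072 + 3072 + 12288   # q, k, v, mlp rows expected by the split
cols = 3072                         # hidden size
print(rows * cols)                  # 66060288 bf16 elements in the original weight
print(rows * cols // 2)             # 33030144 bytes once packed as two 4-bit values per uint8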

System Info

torch==2.5.1+cu124 diffusers==0.32.0.dev0 bnb==0.44.1

Who can help?

@yiyixuxu @sayakpaul @DN6 @asomoza

@vladmandic vladmandic added the bug Something isn't working label Nov 22, 2024
@sayakpaul
Member

sayakpaul commented Nov 22, 2024

I don't think we support loading single-file NF4 checkpoints yet. Loading pre-quantized NF4 (or, more generally, bnb) checkpoints is only supported via from_pretrained() as of now.
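
For reference, a minimal sketch of that from_pretrained() path (the repo below is the NF4 test package linked later in this thread; bitsandbytes needs to be installed):

import torch
from diffusers import FluxTransformer2DModel

# the checkpoint was saved with save_pretrained(), so its quantization config
# is read from the repo and no extra arguments are needed here
transformer = FluxTransformer2DModel.from_pretrained(
    'hf-internal-testing/flux.1-dev-nf4-pkg',  # example NF4 repo from this thread
    subfolder='transformer',
    torch_dtype=torch.bfloat16,
)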

Some reasoning as to why that is the case (from #9165 (comment)):

For the diffusion model as in keys prefixed with model.diffusion_model, we suggest following the saving and loading approach in the OP because we cannot define a clear mechanism to load the quantization stats for the attention modules from those keys and associated tensors.

But maybe there's a way now.

@DN6 would you be able to check this? If not, I will find time.

@vladmandic
Contributor Author

I don't think we support loading single-file NF4 checkpoints yet

I'm specifically loading transformer-only, although it's the same issue with full single-file checkpoints (they are just much less frequent).
If this needs to be converted from an issue to a feature request, fine by me, but it's a high-priority item nonetheless, since right now diffusers cannot work with the majority of models uploaded on CivitAI - and that is pretty much the standard nowadays.

@sayakpaul
Member

I'm specifically loading transformer-only, although it's the same issue with full single-file checkpoints (they are just much less frequent).

Clearing up some confusion. I think what you mean is the following. When trying to load a standard transformer checkpoint (in the original BFL format) using from_single_file() -- it almost always succeeds. However, when trying to load a pre-quantized (NF4) checkpoint (same BFL format but with NF4-specific keys added) using from_single_file() -- it fails. Correct?

Issue / feature request is fine and high-prio is fine too.

@vladmandic
Contributor Author

Yes, that is correct. It works for transformers in fp32/fp16/fp8 safetensors, but fails for NF4.
(Using my GGUF code, it also works for .gguf in different quants, which leaves NF4 as the only one that fails - and unfortunately, that is the highly desired one.)

@sayakpaul
Member

It works for transformers in fp32/fp16/fp8 safetensors, but fails for NF4.

Thanks for confirming! The reason it fails is that NF4 state dicts have special quantization keys and also pack the original tensors into a compressed shape, which is why you hit the error you reported. See, for example: https://huggingface.co/hf-internal-testing/flux.1-dev-nf4-pkg/tree/main/transformer?show_file_info=transformer%2Fdiffusion_pytorch_model.safetensors (quant_map, for example).
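
If you want to see those extra keys in a local single-file checkpoint, a quick sketch (the file name is taken from the reproduction above):

from safetensors import safe_open

# list bitsandbytes NF4 metadata keys (e.g. *.quant_map, *.absmax, *.quant_state.*)
with safe_open('flux-copax_timeless_xplus_mix2-nf4.safetensors', framework='pt') as f:
    nf4_keys = [k for k in f.keys() if 'quant' in k or 'absmax' in k]
    print(nf4_keys[:10])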

and unfortunately, that is the highly desired one

Yes, not denying it. This will be supported :)

@sayakpaul
Member

(Using my GGUF code, it also works for .gguf in different quants, which leaves NF4 as the only one that fails - and unfortunately, that is the highly desired one.)

@vladmandic do you have a reference for this? Would be quite helpful!

@vladmandic
Contributor Author

@vladmandic do you have a reference for this? Would be quite helpful!

GGUF? I shared it in #9487 (comment).

@bghira
Contributor

bghira commented Nov 22, 2024

The problem with CivitAI models is that they are not standard; state dicts are rather ad-hoc for most released models. It's unfortunate that this became the more prevalent/common means of distribution, but I'm working directly with CivitAI to improve the situation down the line and allow model creators to provide configuration details for a given model, and even directly support Diffusers to that effect.

@vladmandic
Contributor Author

The problem with CivitAI models is that they are not standard; state dicts are rather ad-hoc for most released models.

I totally agree with that statement - it's the Wild West!
That's one reason why I'm pushing OMI to create standards before eventually releasing a model.

@bghira
Contributor

bghira commented Nov 22, 2024

Yes, let OMI create a new standard to solve the unity issue among standards 🤣
[image: xkcd "Standards" comic]

Sorry for the noise, I'll see myself out.

@vladmandic
Contributor Author

I've used that xkcd many times myself!
For OMI, I actually don't care what the standard is as long as there is one. Right now, differences in implementations of the same model across different formats are the real killer.
